summaryrefslogtreecommitdiff
path: root/drivers/s390/cio/vfio_ccw_drv.c
AgeCommit message (Collapse)Author
2024-01-10Merge tag 's390-6.8-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Alexander Gordeev: - Add machine variable capacity information to /proc/sysinfo. - Limit the waste of page tables and always align vmalloc area size and base address on segment boundary. - Fix a memory leak when an attempt to register interruption sub class (ISC) for the adjunct-processor (AP) guest failed. - Reset response code AP_RESPONSE_INVALID_GISA to understandable by guest AP_RESPONSE_INVALID_ADDRESS in response to a failed interruption sub class (ISC) registration attempt. - Improve reaction to adjunct-processor (AP) AP_RESPONSE_OTHERWISE_CHANGED response code when enabling interrupts on behalf of a guest. - Fix incorrect sysfs 'status' attribute of adjunct-processor (AP) queue device bound to the vfio_ap device driver when the mediated device is attached to a guest, but the queue device is not passed through. - Rework struct ap_card to hold the whole adjunct-processor (AP) card hardware information. As result, all the ugly bit checks are replaced by simple evaluations of the required bit fields. - Improve handling of some weird scenarios between service element (SE) host and SE guest with adjunct-processor (AP) pass-through support. - Change local_ctl_set_bit() and local_ctl_clear_bit() so they return the previous value of the to be changed control register. This is useful if a bit is only changed temporarily and the previous content needs to be restored. - The kernel starts with machine checks disabled and is expected to enable it once trap_init() is called. However the implementation allows machine checks early. Consistently enable it in trap_init() only. - local_mcck_disable() and local_mcck_enable() assume that machine checks are always enabled. Instead implement and use local_mcck_save() and local_mcck_restore() to disable machine checks and restore the previous state. - Modification of floating point control (FPC) register of a traced process using ptrace interface may lead to corruption of the FPC register of the tracing process. Fix this. - kvm_arch_vcpu_ioctl_set_fpu() allows to set the floating point control (FPC) register in vCPU, but may lead to corruption of the FPC register of the host process. Fix this. - Use READ_ONCE() to read a vCPU floating point register value from the memory mapped area. This avoids that, depending on code generation, a different value is tested for validity than the one that is used. - Get rid of test_fp_ctl(), since it is quite subtle to use it correctly. Instead copy a new floating point control register value into its save area and test the validity of the new value when loading it. - Remove superfluous save_fpu_regs() call. - Remove s390 support for ARCH_WANTS_DYNAMIC_TASK_STRUCT. All machines provide the vector facility since many years and the need to make the task structure size dependent on the vector facility does not exist. - Remove the "novx" kernel command line option, as the vector code runs without any problems since many years. - Add the vector facility to the z13 architecture level set (ALS). All hypervisors support the vector facility since many years. This allows compile time optimizations of the kernel. - Get rid of MACHINE_HAS_VX and replace it with cpu_has_vx(). As result, the compiled code will have less runtime checks and less code. - Convert pgste_get_lock() and pgste_set_unlock() ASM inlines to C. - Convert the struct subchannel spinlock from pointer to member. * tag 's390-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (24 commits) Revert "s390: update defconfigs" s390/cio: make sch->lock spinlock pointer a member s390: update defconfigs s390/mm: convert pgste locking functions to C s390/fpu: get rid of MACHINE_HAS_VX s390/als: add vector facility to z13 architecture level set s390/fpu: remove "novx" option s390/fpu: remove ARCH_WANTS_DYNAMIC_TASK_STRUCT support KVM: s390: remove superfluous save_fpu_regs() call s390/fpu: get rid of test_fp_ctl() KVM: s390: use READ_ONCE() to read fpc register value KVM: s390: fix setting of fpc register s390/ptrace: handle setting of fpc register correctly s390/nmi: implement and use local_mcck_save() / local_mcck_restore() s390/nmi: consistently enable machine checks in trap_init() s390/ctlreg: return old register contents when changing bits s390/ap: handle outband SE bind state change s390/ap: store TAPQ hwinfo in struct ap_card s390/vfio-ap: fix sysfs status attribute for AP queue devices s390/vfio-ap: improve reaction to response code 07 from PQAP(AQIC) command ...
2023-12-12s390/cio: make sch->lock spinlock pointer a memberHalil Pasic
The lock member of struct subchannel used to be a spinlock, but became a pointer to a spinlock with commit 2ec2298412e1 ("[S390] subchannel lock conversion."). This might have been justified back then, but with the current state of affairs, there is no reason to manage a separate spinlock object. Let's simplify things and pull the spinlock back into struct subchannel. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Link: https://lore.kernel.org/r/20231101115751.2308307-1-pasic@linux.ibm.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-11-28eventfd: simplify eventfd_signal()Christian Brauner
Ever since the eventfd type was introduced back in 2007 in commit e1ad7468c77d ("signal/timer/event: eventfd core") the eventfd_signal() function only ever passed 1 as a value for @n. There's no point in keeping that additional argument. Link: https://lore.kernel.org/r/20231122-vfs-eventfd-signal-v2-2-bd549b14ce0c@kernel.org Acked-by: Xu Yilun <yilun.xu@intel.com> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl Acked-by: Eric Farman <farman@linux.ibm.com> # s390 Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-06-01vfio/ccw: use struct_size() helperGustavo A. R. Silva
Prefer struct_size() over open-coded versions. Link: https://github.com/KSPP/linux/issues/160 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/f657276073630e806e69726a40ad1cc85101448a.1684805398.git.gustavoars@kernel.org Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-01vfio/ccw: replace one-element array with flexible-array memberGustavo A. R. Silva
One-element arrays are deprecated, and we are replacing them with flexible array members instead. So, replace one-element array with flexible-array member in struct vfio_ccw_parent and refactor the rest of the code accordingly. Link: https://github.com/KSPP/linux/issues/79 Link: https://github.com/KSPP/linux/issues/297 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/3c10549ebe1564eade68a2515bde233527376971.1684805398.git.gustavoars@kernel.org Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-02-14vfio/ccw: remove WARN_ON during shutdownEric Farman
The logic in vfio_ccw_sch_shutdown() always assumed that the input subchannel would point to a vfio_ccw_private struct, without checking that one exists. The blamed commit put in a check for this scenario, to prevent the possibility of a missing private. The trouble is that check was put alongside a WARN_ON(), presuming that such a scenario would be a cause for concern. But this can be triggered by binding a subchannel to vfio-ccw, and rebooting the system before starting the mdev (via "mdevctl start" or similar) or after stopping it. In those cases, shutdown doesn't need to worry because either the private was never allocated, or it was cleaned up by vfio_ccw_mdev_remove(). Remove the WARN_ON() piece of this check, since there are plausible scenarios where private would be NULL in this path. Fixes: 9e6f07cd1eaa ("vfio/ccw: create a parent struct") Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20230210174227.2256424-1-farman@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-12-05vfio/ap/ccw/samples: Fix device_register() unwind pathAlex Williamson
We always need to call put_device() if device_register() fails. All vfio drivers calling device_register() include a similar unwind stack via gotos, therefore split device_unregister() into its device_del() and put_device() components in the unwind path, and add a goto target to handle only the put_device() requirement. Reported-by: Ruan Jinjie <ruanjinjie@huawei.com> Link: https://lore.kernel.org/all/20221118032827.3725190-1-ruanjinjie@huawei.com Fixes: d61fc96f47fd ("sample: vfio mdev display - host device") Fixes: 9d1a546c53b4 ("docs: Sample driver to demonstrate how to use Mediated device framework.") Fixes: a5e6e6505f38 ("sample: vfio bochs vbe display (host device for bochs-drm)") Fixes: 9e6f07cd1eaa ("vfio/ccw: create a parent struct") Fixes: 36360658eb5a ("s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem") Cc: Tony Krowiak <akrowiak@linux.ibm.com> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Jason Herne <jjherne@linux.ibm.com> Cc: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Link: https://lore.kernel.org/r/166999942139.645727.12439756512449846442.stgit@omen Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-11-10vfio/ccw: replace vfio_init_device with _alloc_Eric Farman
Now that we have a reasonable separation of structs that follow the subchannel and mdev lifecycles, there's no reason we can't call the official vfio_alloc_device routine for our private data, and behave like everyone else. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-7-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-11-10vfio/ccw: move private to mdev lifecycleEric Farman
Now that the mdev parent data is split out into its own struct, it is safe to move the remaining private data to follow the mdev probe/remove lifecycle. The mdev parent data will remain where it is, and follow the subchannel and the css driver interfaces. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-5-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-11-10vfio/ccw: move private initialization to callbackEric Farman
There's already a device initialization callback that is used to initialize the release completion workaround that was introduced by commit ebb72b765fb49 ("vfio/ccw: Use the new device life cycle helpers"). Move the other elements of the vfio_ccw_private struct that require distinct initialization over to that routine. With that done, the vfio_ccw_alloc_private routine only does a kzalloc, so fold it inline. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-4-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-11-10vfio/ccw: remove private->schEric Farman
These places all rely on the ability to jump from a private struct back to the subchannel struct. Rather than keeping a copy in our back pocket, let's use the relationship provided by the vfio_device embedded within the private. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-3-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-11-10vfio/ccw: create a parent structEric Farman
Move the stuff associated with the mdev parent (and thus the subchannel struct) into its own struct, and leave the rest in the existing private structure. The subchannel will point to the parent, and the parent will point to the private, for the areas where one or both are needed. Further separation of these structs will follow. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-2-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04vfio/mdev: add mdev available instance checking to the coreJason Gunthorpe
Many of the mdev drivers use a simple counter for keeping track of the available instances. Move this code to the core code and store the counter in the mdev_parent. Implement it using correct locking, fixing mdpy. Drivers just provide the value in the mdev_driver at registration time and the core code takes care of maintaining it and exposing the value in sysfs. [hch: count instances per-parent instead of per-type, use an atomic_t to avoid taking mdev_list_lock in the show method] Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Link: https://lore.kernel.org/r/20220923092652.100656-15-hch@lst.de Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04vfio/mdev: consolidate all the name sysfs into the core codeChristoph Hellwig
Every driver just emits a static string, simply add a field to the mdev_type for the driver to fill out or fall back to the sysfs name and provide a standard sysfs show function. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Link: https://lore.kernel.org/r/20220923092652.100656-12-hch@lst.de Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04vfio/mdev: simplify mdev_type handlingChristoph Hellwig
Instead of abusing struct attribute_group to control initialization of struct mdev_type, just define the actual attributes in the mdev_driver, allocate the mdev_type structures in the caller and pass them to mdev_register_parent. This allows the caller to use container_of to get at the containing structure and thus significantly simplify the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Link: https://lore.kernel.org/r/20220923092652.100656-6-hch@lst.de Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04vfio/mdev: embedd struct mdev_parent in the parent data structureChristoph Hellwig
Simplify mdev_{un}register_device by requiring the caller to pass in a structure allocate as part of the parent device structure. This removes the need for a list of parents and the separate mdev_parent refcount as we can simplify rely on the reference to the parent device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Link: https://lore.kernel.org/r/20220923092652.100656-5-hch@lst.de Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04vfio/mdev: make mdev.h standalone includableChristoph Hellwig
Include <linux/device.h> and <linux/uuid.h> so that users of this headers don't need to do that and remove those includes that aren't needed any more. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Link: https://lore.kernel.org/r/20220923092652.100656-4-hch@lst.de Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-08-01vfio/ccw: Remove FSM Close from remove handlersEric Farman
Now that neither vfio_ccw_sch_probe() nor vfio_ccw_mdev_probe() affect the FSM state, it doesn't make sense for their _remove() counterparts try to revert things in this way. Since the FSM open and close are handled alongside MDEV open/close, these are unnecessary. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220728204914.2420989-3-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Move FSM open/close to MDEV open/closeEric Farman
Part of the confusion that has existed is the FSM lifecycle of subchannels between the common CSS driver and the vfio-ccw driver. During configuration, the FSM state goes from NOT_OPER to STANDBY to IDLE, but then back to NOT_OPER. For example: vfio_ccw_sch_probe: VFIO_CCW_STATE_NOT_OPER vfio_ccw_sch_probe: VFIO_CCW_STATE_STANDBY vfio_ccw_mdev_probe: VFIO_CCW_STATE_IDLE vfio_ccw_mdev_remove: VFIO_CCW_STATE_NOT_OPER vfio_ccw_sch_remove: VFIO_CCW_STATE_NOT_OPER vfio_ccw_sch_shutdown: VFIO_CCW_STATE_NOT_OPER Rearrange the open/close events to align with the mdev open/close, to better manage the memory and state of the devices as time progresses. Specifically, make mdev_open() perform the FSM open, and mdev_close() perform the FSM close instead of reset (which is both close and open). This makes the NOT_OPER state a dead-end path, indicating the device is probably not recoverable without fully probing and re-configuring the device. This has the nice side-effect of removing a number of special-cases where the FSM state is managed outside of the FSM itself (such as the aforementioned mdev_close() routine). Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-12-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Create a CLOSE FSM eventEric Farman
Refactor the vfio_ccw_sch_quiesce() routine to extract the bit that disables the subchannel and affects the FSM state. Use this to form the basis of a CLOSE event that will mirror the OPEN event, and move the subchannel back to NOT_OPER state. A key difference with that mirroring is that while OPEN handles the transition from NOT_OPER => STANDBY, the later probing of the mdev handles the transition from STANDBY => IDLE. On the other hand, the CLOSE event will move from one of the operating states {IDLE, CP_PROCESSING, CP_PENDING} => NOT_OPER. That is, there is no stop in a STANDBY state on the deconfigure path. Add a call to cp_free() in this event, such that it is captured for the various permutations of this event. In the unlikely event that cio_disable_subchannel() returns -EBUSY, the remaining logic of vfio_ccw_sch_quiesce() can still be used. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-10-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Create an OPEN FSM EventEric Farman
Move the process of enabling a subchannel for use by vfio-ccw into the FSM, such that it can manage the sequence of lifecycle events for the device. That is, if the FSM state is NOT_OPER(erational), then do the work that would enable the subchannel and move the FSM to STANDBY state. An attempt to perform this event again from any of the other operating states (IDLE, CP_PROCESSING, CP_PENDING) will convert the device back to NOT_OPER so the configuration process can be started again. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-9-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Flatten MDEV device (un)registerEric Farman
The vfio_ccw_mdev_(un)reg routines are merely vfio-ccw routines that pass control to mdev_(un)register_device. Since there's only one caller of each, let's just call the mdev routines directly. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-7-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Do not change FSM state in subchannel eventEric Farman
The routine vfio_ccw_sch_event() is tasked with handling subchannel events, specifically machine checks, on behalf of vfio-ccw. It correctly calls cio_update_schib(), and if that fails (meaning the subchannel is gone) it makes an FSM event call to mark the subchannel Not Operational. If that worked, however, then it decides that if the FSM state was already Not Operational (implying the subchannel just came back), then it should simply change the FSM to partially- or fully-open. Remove this trickery, since a subchannel returning will require more probing than simply "oh all is well again" to ensure it works correctly. Fixes: bbe37e4cb8970 ("vfio: ccw: introduce a finite state machine") Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-4-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Fix FSM state if mdev probe failsEric Farman
The FSM is in STANDBY state when arriving in vfio_ccw_mdev_probe(), and this routine converts it to IDLE as part of its processing. The error exit sets it to IDLE (again) but clears the private->mdev pointer. The FSM should of course be managing the state itself, but the correct thing for vfio_ccw_mdev_probe() to do would be to put the state back the way it found it. The corresponding check of private->mdev in vfio_ccw_sch_io_todo() can be removed, since the distinction is unnecessary at this point. Fixes: 3bf1311f351ef ("vfio/ccw: Convert to use vfio_register_emulated_iommu_dev()") Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-3-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Remove UUID from s390 debug logMichael Kawano
As vfio-ccw devices are created/destroyed, the uuid of the associated mdevs that are recorded in $S390DBF/vfio_ccw_msg/sprintf get lost. This is because a pointer to the UUID is stored instead of the UUID itself, and that memory may have been repurposed if/when the logs are examined. The result is usually garbage UUID data in the logs, though there is an outside chance of an oops happening here. Simply remove the UUID from the traces, as the subchannel number will provide useful configuration information for problem determination, and is stored directly into the log instead of a pointer. As we were the only consumer of mdev_uuid(), remove that too. Cc: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Michael Kawano <mkawano@linux.ibm.com> Fixes: 60e05d1cf0875 ("vfio-ccw: add some logging") Fixes: b7701dfbf9832 ("vfio-ccw: Register a chp_event callback for vfio-ccw") [farman: reworded commit message, added Fixes: tags] Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Link: https://lore.kernel.org/r/20220707135737.720765-2-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-12-06s390/cio: remove uevent suppress from cio driverVineeth Vijayan
commit fa1a8c23eb7d ("s390: cio: Delay uevents for subchannels") introduced suppression of uevents for a subchannel until after it is clear that the subchannel would not be unregistered again immediately. This was done to avoid uevents being generated for I/O subchannels with no valid device, which can happen on LPAR. However, this also has some drawbacks: All subchannel drivers need to manually remove the uevent suppression and generate an ADD uevent as soon as they are sure that the subchannel will stay around. This misses out on all uevents that are not the initial ADD uevent that would be generated while uevents are suppressed; for example, all subchannels were missing the BIND uevent. As uevents being generated even for I/O subchannels without an operational device turned out to be not as bad as missing uevents and complicating the code flow, let's remove uevent suppression for subchannels. Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com> [cohuck@redhat.com: modified changelog] Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Link: https://lore.kernel.org/r/20211122103756.352463-2-vneethv@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-10-28vfio/ccw: Convert to use vfio_register_emulated_iommu_dev()Jason Gunthorpe
This is a more complicated conversion because vfio_ccw is sharing the vfio_device between both the mdev_device, its vfio_device and the css_driver. The mdev is a singleton, and the reason for this sharing is so the extra css_driver function callbacks to be delivered to the vfio_device implementation. This keeps things as they are, with the css_driver allocating the singleton, not the mdev_driver. Embed the vfio_device in the vfio_ccw_private and instantiate it as a vfio_device when the mdev probes. The drvdata of both the css_device and the mdev_device point at the private, and container_of is used to get it back from the vfio_device. Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v4-cea4f5bd2c00+b52-ccw_mdev_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-10-28vfio/ccw: Use functions for alloc/free of the vfio_ccw_privateJason Gunthorpe
Makes the code easier to understand what is memory lifecycle and what is other stuff. Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v4-cea4f5bd2c00+b52-ccw_mdev_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-10-28vfio/ccw: Remove unneeded GFP_DMAJason Gunthorpe
Since the ccw_io_region was split out of the private the allocation no longer needs the GFP_DMA. Remove it. Reported-by: Christoph Hellwig <hch@infradead.org> Fixes: c98e16b2fa12 ("s390/cio: Convert ccw_io_region to pointer") Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v4-cea4f5bd2c00+b52-ccw_mdev_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-07-21s390/cio: Make struct css_driver::remove return voidUwe Kleine-König
The driver core ignores the return value of css_remove() (because there is only little it can do when a device disappears) and all callbacks return 0 anyhow. So make it impossible for future drivers to return an unused error code by changing the remove prototype to return void. The real motivation for this change is the quest to make struct bus_type::remove return void, too. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20210713193522.1770306-3-u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-12vfio-ccw: Serialize FSM IDLE state with I/O completionEric Farman
Today, the stacked call to vfio_ccw_sch_io_todo() does three things: 1) Update a solicited IRB with CP information, and release the CP if the interrupt was the end of a START operation. 2) Copy the IRB data into the io_region, under the protection of the io_mutex 3) Reset the vfio-ccw FSM state to IDLE to acknowledge that vfio-ccw can accept more work. The trouble is that step 3 is (A) invoked for both solicited and unsolicited interrupts, and (B) sitting after the mutex for step 2. This second piece becomes a problem if it processes an interrupt for a CLEAR SUBCHANNEL while another thread initiates a START, thus allowing the CP and FSM states to get out of sync. That is: CPU 1 CPU 2 fsm_do_clear() fsm_irq() fsm_io_request() vfio_ccw_sch_io_todo() fsm_io_helper() Since the FSM state and CP should be kept in sync, let's make a note when the CP is released, and rely on that as an indication that the FSM should also be reset at the end of this routine and open up the device for more work. Signed-off-by: Eric Farman <farman@linux.ibm.com> Acked-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20210511195631.3995081-4-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-06-03vfio-ccw: Add trace for CRW eventEric Farman
Since CRW events are (should be) rare, let's put a trace in that routine too. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505122745.53208-9-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-06-03vfio-ccw: Wire up the CRW irq and CRW regionFarhan Ali
Use the IRQ to notify userspace that there is a CRW pending in the region, related to path-availability changes on the passthrough subchannel. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505122745.53208-8-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-06-03vfio-ccw: Introduce a new CRW regionFarhan Ali
This region provides a mechanism to pass a Channel Report Word that affect vfio-ccw devices, and needs to be passed to the guest for its awareness and/or processing. The base driver (see crw_collect_info()) provides space for two CRWs, as a subchannel event may have two CRWs chained together (one for the ssid, one for the subchannel). As vfio-ccw will deal with everything at the subchannel level, provide space for a single CRW to be transferred in one shot. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505122745.53208-7-farman@linux.ibm.com> [CH: added padding to ccw_crw_region] Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-06-02vfio-ccw: Introduce a new schib regionFarhan Ali
The schib region can be used by userspace to get the subchannel- information block (SCHIB) for the passthrough subchannel. This can be useful to get information such as channel path information via the SCHIB.PMCW fields. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505122745.53208-5-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-06-02vfio-ccw: Register a chp_event callback for vfio-ccwFarhan Ali
Register the chp_event callback to receive channel path related events for the subchannels managed by vfio-ccw. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505122745.53208-3-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-06-02vfio-ccw: Introduce new helper functions to free/destroy regionsFarhan Ali
Consolidate some of the cleanup code for the regions, so that as more are added we reduce code duplication. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505122745.53208-2-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-04-06s390/cio: generate delayed uevent for vfio-ccw subchannelsCornelia Huck
The common I/O layer delays the ADD uevent for subchannels and delegates generating this uevent to the individual subchannel drivers. The vfio-ccw I/O subchannel driver, however, did not do that, and will not generate an ADD uevent for subchannels that had not been bound to a different driver (or none at all, which also triggers the uevent). Generate the ADD uevent at the end of the probe function if uevents were still suppressed for the device. Message-Id: <20200327124503.9794-3-cohuck@redhat.com> Fixes: 63f1934d562d ("vfio: ccw: basic implementation for vfio_ccw driver") Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-05vfio-ccw: fix error return code in vfio_ccw_sch_init()Wei Yongjun
Fix to return negative error code -ENOMEM from the memory alloc failed error handling case instead of 0, as done elsewhere in this function. Fixes: 60e05d1cf087 ("vfio-ccw: add some logging") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link https://lore.kernel.org/kvm/20190904083315.105600-1-weiyongjun1@huawei.com/ Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-23vfio-ccw: add some loggingCornelia Huck
Usually, the common I/O layer logs various things into the s390 cio debug feature, which has been very helpful in the past when looking at crash dumps. As vfio-ccw devices unbind from the standard I/O subchannel driver, we lose some information there. Let's introduce some vfio-ccw debug features and log some things there. (Unfortunately we cannot reuse the cio debug feature from a module.) Message-Id: <20190816151505.9853-2-cohuck@redhat.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-07-15vfio-ccw: Don't call cp_free if we are processing a channel programFarhan Ali
There is a small window where it's possible that we could be working on an interrupt (queued in the workqueue) and setting up a channel program (i.e allocating memory, pinning pages, translating address). This can lead to allocating and freeing the channel program at the same time and can cause memory corruption. Let's not call cp_free if we are currently processing a channel program. The only way we know for sure that we don't have a thread setting up a channel program is when the state is set to VFIO_CCW_STATE_CP_PENDING. Fixes: d5afd5d135c8 ("vfio-ccw: add handling for async channel instructions") Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <62e87bf67b38dc8d5760586e7c96d400db854ebe.1562854091.git.alifm@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-07-08Merge tag 's390-5.3-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Improve stop_machine wait logic: replace cpu_relax_yield call in generic stop_machine function with a weak stop_machine_yield function. This is overridden on s390, which yields the current cpu to the neighbouring cpu after a couple of retries, instead of blindly giving up the cpu to the hipervisor. This significantly improves stop_machine performance on s390 in overcommitted scenarios. This includes common code changes which have been Acked by Peter Zijlstra and Thomas Gleixner. - Improve jump label transformation speed: transform jump labels without using stop_machine. - Refactoring of the vfio-ccw cp handling, simplifying the code and avoiding unneeded allocating/copying. - Various vfio-ccw fixes (ccw translation, state machine). - Add support for vfio-ap queue interrupt control in the guest. This includes s390 kvm changes which have been Acked by Christian Borntraeger. - Add protected virtualization support for virtio-ccw. - Enforce both CONFIG_SMP and CONFIG_HOTPLUG_CPU, which allows to remove some code which most likely isn't working at all, besides that s390 didn't even compile for !CONFIG_SMP. - Support for special flagged EP11 CPRBs for zcrypt. - Handle PCI devices with no support for new MIO instructions. - Avoid KASAN false positives in reworked stack unwinder. - Couple of fixes for the QDIO layer. - Convert s390 specific documentation to ReST format. - Let s390 crypto modules return -ENODEV instead of -EOPNOTSUPP if hardware is missing. This way our modules behave like most other modules and which is also what systemd's systemd-modules-load.service expects. - Replace defconfig with performance_defconfig, so there is one config file less to maintain. - Remove the SCLP call home device driver, which was never useful. - Cleanups all over the place. * tag 's390-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (83 commits) docs: s390: s390dbf: typos and formatting, update crash command docs: s390: unify and update s390dbf kdocs at debug.c docs: s390: restore important non-kdoc parts of s390dbf.rst vfio-ccw: Fix the conversion of Format-0 CCWs to Format-1 s390/pci: correctly handle MIO opt-out s390/pci: deal with devices that have no support for MIO instructions s390: ap: kvm: Enable PQAP/AQIC facility for the guest s390: ap: implement PAPQ AQIC interception in kernel vfio: ap: register IOMMU VFIO notifier s390: ap: kvm: add PQAP interception for AQIC s390/unwind: cleanup unused READ_ONCE_TASK_STACK s390/kasan: avoid false positives during stack unwind s390/qdio: don't touch the dsci in tiqdio_add_input_queues() s390/qdio: (re-)initialize tiqdio list entries s390/dasd: Fix a precision vs width bug in dasd_feature_list() s390/cio: introduce driver_override on the css bus vfio-ccw: make convert_ccw0_to_ccw1 static vfio-ccw: Remove copy_ccw_from_iova() vfio-ccw: Factor out the ccw0-to-ccw1 transition vfio-ccw: Copy CCW data outside length calculation ...
2019-06-21vfio-ccw: Move guest_cp storage into common structEric Farman
Rather than allocating/freeing a piece of memory every time we try to figure out how long a CCW chain is, let's use a piece of memory allocated for each device. The io_mutex added with commit 4f76617378ee9 ("vfio-ccw: protect the I/O region") is held for the duration of the VFIO_CCW_EVENT_IO_REQ event that accesses/uses this space, so there should be no race concerns with another CPU attempting an (unexpected) SSCH for the same device. Suggested-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20190618202352.39702-2-farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-06-13vfio-ccw: Destroy kmem cache region on module exitFarhan Ali
Free the vfio_ccw_cmd_region on module exit. Fixes: d5afd5d135c8 ("vfio-ccw: add handling for async channel instructions") Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Message-Id: <c0f39039d28af39ea2939391bf005e3495d890fd.1559576250.git.alifm@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-06-03s390/cio: Set vfio-ccw FSM state before ioeventfdEric Farman
Otherwise, the guest can believe it's okay to start another I/O and bump into the non-idle state. This results in a cc=2 (with the asynchronous CSCH/HSCH code) returned to the guest, which is unfortunate since everything is otherwise working normally. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Message-Id: <20190514234248.36203-3-farman@linux.ibm.com> Reviewed-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-04-24vfio-ccw: Prevent quiesce function going into an infinite loopFarhan Ali
The quiesce function calls cio_cancel_halt_clear() and if we get an -EBUSY we go into a loop where we: - wait for any interrupts - flush all I/O in the workqueue - retry cio_cancel_halt_clear During the period where we are waiting for interrupts or flushing all I/O, the channel subsystem could have completed a halt/clear action and turned off the corresponding activity control bits in the subchannel status word. This means the next time we call cio_cancel_halt_clear(), we will again start by calling cancel subchannel and so we can be stuck between calling cancel and halt forever. Rather than calling cio_cancel_halt_clear() immediately after waiting, let's try to disable the subchannel. If we succeed in disabling the subchannel then we know nothing else can happen with the device. Suggested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Message-Id: <4d5a4b98ab1b41ac6131b5c36de18b76c5d66898.1555449329.git.alifm@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-04-24vfio-ccw: Do not call flush_workqueue while holding the spinlockFarhan Ali
Currently we call flush_workqueue while holding the subchannel spinlock. But flush_workqueue function can go to sleep, so do not call the function while holding the spinlock. Fixes the following bug: [ 285.203430] BUG: scheduling while atomic: bash/14193/0x00000002 [ 285.203434] INFO: lockdep is turned off. .... [ 285.203485] Preemption disabled at: [ 285.203488] [<000003ff80243e5c>] vfio_ccw_sch_quiesce+0xbc/0x120 [vfio_ccw] [ 285.203496] CPU: 7 PID: 14193 Comm: bash Tainted: G W .... [ 285.203504] Call Trace: [ 285.203510] ([<0000000000113772>] show_stack+0x82/0xd0) [ 285.203514] [<0000000000b7a102>] dump_stack+0x92/0xd0 [ 285.203518] [<000000000017b8be>] __schedule_bug+0xde/0xf8 [ 285.203524] [<0000000000b95b5a>] __schedule+0x7a/0xc38 [ 285.203528] [<0000000000b9678a>] schedule+0x72/0xb0 [ 285.203533] [<0000000000b9bfbc>] schedule_timeout+0x34/0x528 [ 285.203538] [<0000000000b97608>] wait_for_common+0x118/0x1b0 [ 285.203544] [<0000000000166d6a>] flush_workqueue+0x182/0x548 [ 285.203550] [<000003ff80243e6e>] vfio_ccw_sch_quiesce+0xce/0x120 [vfio_ccw] [ 285.203556] [<000003ff80245278>] vfio_ccw_mdev_reset+0x38/0x70 [vfio_ccw] [ 285.203562] [<000003ff802458b0>] vfio_ccw_mdev_remove+0x40/0x78 [vfio_ccw] [ 285.203567] [<000003ff801a499c>] mdev_device_remove_ops+0x3c/0x80 [mdev] [ 285.203573] [<000003ff801a4d5c>] mdev_device_remove+0xc4/0x130 [mdev] [ 285.203578] [<000003ff801a5074>] remove_store+0x6c/0xa8 [mdev] [ 285.203582] [<000000000046f494>] kernfs_fop_write+0x14c/0x1f8 [ 285.203588] [<00000000003c1530>] __vfs_write+0x38/0x1a8 [ 285.203593] [<00000000003c187c>] vfs_write+0xb4/0x198 [ 285.203597] [<00000000003c1af2>] ksys_write+0x5a/0xb0 [ 285.203601] [<0000000000b9e270>] system_call+0xdc/0x2d8 Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Message-Id: <626bab8bb2958ae132452e1ddaf1b20882ad5a9d.1554756534.git.alifm@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-04-24vfio-ccw: add handling for async channel instructionsCornelia Huck
Add a region to the vfio-ccw device that can be used to submit asynchronous I/O instructions. ssch continues to be handled by the existing I/O region; the new region handles hsch and csch. Interrupt status continues to be reported through the same channels as for ssch. Acked-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-04-24vfio-ccw: protect the I/O regionCornelia Huck
Introduce a mutex to disallow concurrent reads or writes to the I/O region. This makes sure that the data the kernel or user space see is always consistent. The same mutex will be used to protect the async region as well. Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-03-11vfio: ccw: only free cp on final interruptCornelia Huck
When we get an interrupt for a channel program, it is not necessarily the final interrupt; for example, the issuing guest may request an intermediate interrupt by specifying the program-controlled-interrupt flag on a ccw. We must not switch the state to idle if the interrupt is not yet final; even more importantly, we must not free the translated channel program if the interrupt is not yet final, or the host can crash during cp rewind. Fixes: e5f84dbaea59 ("vfio: ccw: return I/O results asynchronously") Cc: stable@vger.kernel.org # v4.12+ Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>