summaryrefslogtreecommitdiff
path: root/drivers/s390
AgeCommit message (Collapse)Author
2022-07-27s390/ism: CleanupsStefan Raspl
Reworked signature of the function to retrieve the system EID: No plausible reason to use a double pointer. And neither to pass in the device as an argument, as this identifier is by definition per system, not per device. Plus some minor consistency edits. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Wenjia Zhang < wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25vfio: Replace phys_pfn with pages for vfio_pin_pages()Nicolin Chen
Most of the callers of vfio_pin_pages() want "struct page *" and the low-level mm code to pin pages returns a list of "struct page *" too. So there's no gain in converting "struct page *" to PFN in between. Replace the output parameter "phys_pfn" list with a "pages" list, to simplify callers. This also allows us to replace the vfio_iommu_type1 implementation with a more efficient one. And drop the pfn_valid check in the gvt code, as there is no need to do such a check at a page-backed struct page pointer. For now, also update vfio_iommu_type1 to fit this new parameter too. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Eric Farman <farman@linux.ibm.com> Tested-by: Terrence Xu <terrence.xu@intel.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-11-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-25vfio/ccw: Add kmap_local_page() for memcpyNicolin Chen
A PFN is not secure enough to promise that the memory is not IO. And direct access via memcpy() that only handles CPU memory will crash on S390 if the PFN is an IO PFN, as we have to use the memcpy_to/fromio() that uses the special S390 IO access instructions. On the other hand, a "struct page *" is always a CPU coherent thing that fits memcpy(). Also, casting a PFN to "void *" for memcpy() is not a proper practice, kmap_local_page() is the correct API to call here, though S390 doesn't use highmem, which means kmap_local_page() is a NOP. There's a following patch changing the vfio_pin_pages() API to return a list of "struct page *" instead of PFNs. It will block any IO memory from ever getting into this call path, for such a security purpose. In this patch, add kmap_local_page() to prepare for that. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Eric Farman <farman@linux.ibm.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-10-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-25vfio/ccw: Change pa_pfn list to pa_iova listNicolin Chen
The vfio_ccw_cp code maintains both iova and its PFN list because the vfio_pin/unpin_pages API wanted pfn list. Since vfio_pin/unpin_pages() now accept "iova", change to maintain only pa_iova list and rename all "pfn_array" strings to "page_array", so as to simplify the code. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-8-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-25vfio/ap: Change saved_pfn to saved_iovaNicolin Chen
The vfio_ap_ops code maintains both nib address and its PFN, which is redundant, merely because vfio_pin/unpin_pages API wanted pfn. Since vfio_pin/unpin_pages() now accept "iova", change "saved_pfn" to "saved_iova" and remove pfn in the vfio_ap_validate_nib(). Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-7-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-25vfio: Pass in starting IOVA to vfio_pin/unpin_pages APINicolin Chen
The vfio_pin/unpin_pages() so far accepted arrays of PFNs of user IOVA. Among all three callers, there was only one caller possibly passing in a non-contiguous PFN list, which is now ensured to have contiguous PFN inputs too. Pass in the starting address with "iova" alone to simplify things, so callers no longer need to maintain a PFN list or to pin/unpin one page at a time. This also allows VFIO to use more efficient implementations of pin/unpin_pages. For now, also update vfio_iommu_type1 to fit this new parameter too, while keeping its input intact (being user_iova) since we don't want to spend too much effort swapping its parameters and local variables at that level. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Acked-by: Eric Farman <farman@linux.ibm.com> Tested-by: Terrence Xu <terrence.xu@intel.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-6-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-25s390/qeth: Fix typo 'the the' in commentSlark Xiao
Replace 'the the' with 'the' in the comment. Signed-off-by: Slark Xiao <slark_xiao@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-23vfio/ccw: Only pass in contiguous pagesNicolin Chen
This driver is the only caller of vfio_pin/unpin_pages that might pass in a non-contiguous PFN list, but in many cases it has a contiguous PFN list to process. So letting VFIO API handle a non-contiguous PFN list is actually counterproductive. Add a pair of simple loops to pass in contiguous PFNs only, to have an efficient implementation in VFIO. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-5-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-23vfio/ap: Pass in physical address of ind to ap_aqic()Nicolin Chen
The ap_aqic() is called by vfio_ap_irq_enable() where it passes in a virt value that's casted from a physical address "h_nib". Inside the ap_aqic(), it does virt_to_phys() again. Since ap_aqic() needs a physical address, let's just pass in a pa of ind directly. So change the "ind" to "pa_ind". Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-4-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-20vfio: Replace the DMA unmapping notifier with a callbackJason Gunthorpe
Instead of having drivers register the notifier with explicit code just have them provide a dma_unmap callback op in their driver ops and rely on the core code to wire it up. Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v4-681e038e30fd+78-vfio_unmap_notif_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-20s390/crash: support multi-segment iteratorsAlexander Gordeev
Make it possible to handle not only single-, but also multi- segment iterators in copy_oldmem_iter() callback. Change the semantics of called functions to match the iterator model - instead of an error code the exact number of bytes copied is returned. The swap page used to copy data to user space is adopted for kernel space too. That does not bring any performance impact. Suggested-by: Matthew Wilcox <willy@infradead.org> Fixes: cc02e6e21aa5 ("s390/crash: add missing iterator advance in copy_oldmem_page()") Acked-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com> Link: https://lore.kernel.org/r/5af6da3a0bffe48a90b0b7139ecf6a818b2d18e8.1658206891.git.agordeev@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-20s390/zcore: fix race when reading from hardware system areaAlexander Gordeev
Memory buffer used for reading out data from hardware system area is not protected against concurrent access. Reported-by: Matthew Wilcox <willy@infradead.org> Fixes: 411ed3225733 ("[S390] zfcpdump support.") Acked-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com> Link: https://lore.kernel.org/r/e68137f0f9a0d2558f37becc20af18e2939934f6.1658206891.git.agordeev@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/uvdevice: autoload module based on CPU facilitySteffen Eiden
Make sure the uvdevice driver will be automatically loaded when facility 158 is available. Signed-off-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20220713125644.16121-4-seiden@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/cpufeature: rework to allow more than only hwcap bitsHeiko Carstens
Rework cpufeature implementation to allow for various cpu feature indications, which is not only limited to hwcap bits. This is achieved by adding a sequential list of cpu feature numbers, where each of them is mapped to an entry which indicates what this number is about. Each entry contains a type member, which indicates what feature name space to look into (e.g. hwcap, or cpu facility). If wanted this allows also to automatically load modules only in e.g. z/VM configurations. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Link: https://lore.kernel.org/r/20220713125644.16121-2-seiden@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: handle config changed and scan complete notificationTony Krowiak
This patch implements two new AP driver callbacks: void (*on_config_changed)(struct ap_config_info *new_config_info, struct ap_config_info *old_config_info); void (*on_scan_complete)(struct ap_config_info *new_config_info, struct ap_config_info *old_config_info); The on_config_changed callback is invoked at the start of the AP bus scan function when it determines that the host AP configuration information has changed since the previous scan. The vfio_ap device driver registers a callback function for this callback that performs the following operations: 1. Unplugs the adapters, domains and control domains removed from the host's AP configuration from the guests to which they are assigned in a single operation. 2. Stores bitmaps identifying the adapters, domains and control domains added to the host's AP configuration with the structure representing the mediated device. When the vfio_ap device driver's probe callback is subsequently invoked, the probe function will recognize that the queue is being probed due to a change in the host's AP configuration and the plugging of the queue into the guest will be bypassed. The on_scan_complete callback is invoked after the ap bus scan is completed if the host AP configuration data has changed. The vfio_ap device driver registers a callback function for this callback that hot plugs each queue and control domain added to the AP configuration for each guest using them in a single hot plug operation. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: sysfs attribute to display the guest's matrixTony Krowiak
The matrix of adapters and domains configured in a guest's APCB may differ from the matrix of adapters and domains assigned to the matrix mdev, so this patch introduces a sysfs attribute to display the matrix of adapters and domains that are or will be assigned to the APCB of a guest that is or will be using the matrix mdev. For a matrix mdev denoted by $uuid, the guest matrix can be displayed as follows: cat /sys/devices/vfio_ap/matrix/$uuid/guest_matrix Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: implement in-use callback for vfio_ap driverTony Krowiak
Let's implement the callback to indicate when an APQN is in use by the vfio_ap device driver. The callback is invoked whenever a change to the apmask or aqmask would result in one or more queue devices being removed from the driver. The vfio_ap device driver will indicate a resource is in use if the APQN of any of the queue devices to be removed are assigned to any of the matrix mdevs under the driver's control. There is potential for a deadlock condition between the matrix_dev->guests_lock used to lock the guest during assignment of adapters and domains and the ap_perms_mutex locked by the AP bus when changes are made to the sysfs apmask/aqmask attributes. The AP Perms lock controls access to the objects that store the adapter numbers (ap_perms) and domain numbers (aq_perms) for the sysfs /sys/bus/ap/apmask and /sys/bus/ap/aqmask attributes. These attributes identify which queues are reserved for the zcrypt default device drivers. Before allowing a bit to be removed from either mask, the AP bus must check with the vfio_ap device driver to verify that none of the queues are assigned to any of its mediated devices. The apmask/aqmask attributes can be written or read at any time from userspace, so care must be taken to prevent a deadlock with asynchronous operations that might be taking place in the vfio_ap device driver. For example, consider the following: 1. A system administrator assigns an adapter to a mediated device under the control of the vfio_ap device driver. The driver will need to first take the matrix_dev->guests_lock to potentially hot plug the adapter into the KVM guest. 2. At the same time, a system administrator sets a bit in the sysfs /sys/bus/ap/ap_mask attribute. To complete the operation, the AP bus must: a. Take the ap_perms_mutex lock to update the object storing the values for the /sys/bus/ap/ap_mask attribute. b. Call the vfio_ap device driver's in-use callback to verify that the queues now being reserved for the default zcrypt drivers are not assigned to a mediated device owned by the vfio_ap device driver. To do the verification, the in-use callback function takes the matrix_dev->guests_lock, but has to wait because it is already held by the operation in 1 above. 3. The vfio_ap device driver calls an AP bus function to verify that the new queues resulting from the assignment of the adapter in step 1 are not reserved for the default zcrypt device driver. This AP bus function tries to take the ap_perms_mutex lock but gets stuck waiting for the waiting for the lock due to step 2a above. Consequently, we have the following deadlock situation: matrix_dev->guests_lock locked (1) ap_perms_mutex lock locked (2a) Waiting for matrix_dev->gusts_lock (2b) which is currently held (1) Waiting for ap_perms_mutex lock (3) which is currently held (2a) To prevent this deadlock scenario, the function called in step 3 will no longer take the ap_perms_mutex lock and require the caller to take the lock. The lock will be the first taken by the adapter/domain assignment functions in the vfio_ap device driver to maintain the proper locking order. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: reset queues after adapter/domain unassignmentTony Krowiak
When an adapter or domain is unassigned from an mdev attached to a KVM guest, one or more of the guest's queues may get dynamically removed. Since the removed queues could get re-assigned to another mdev, they need to be reset. So, when an adapter or domain is unassigned from the mdev, the queues that are removed from the guest's AP configuration (APCB) will be reset. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: hot plug/unplug of AP devices when probed/removedTony Krowiak
When an AP queue device is probed or removed, if the mediated device is attached to a KVM guest, the mediated device's adapter, domain and control domain bitmaps must be filtered to update the guest's APCB and if any changes are detected, the guest's APCB must then be hot plugged into the guest to reflect those changes to the guest. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: allow hot plug/unplug of AP devices when assigned/unassignedTony Krowiak
Let's hot plug an adapter, domain or control domain into the guest when it is assigned to a matrix mdev that is attached to a KVM guest. Likewise, let's hot unplug an adapter, domain or control domain from the guest when it is unassigned from a matrix_mdev that is attached to a KVM guest. Whenever an assignment or unassignment of an adapter, domain or control domain is performed, the APQNs and control domains assigned to the matrix mdev will be filtered and assigned to the AP control block (APCB) that supplies the AP configuration to the guest so that no adapter, domain or control domain that is not in the host's AP configuration nor any APQN that does not reference a queue device bound to the vfio_ap device driver is assigned. After updating the APCB, if the mdev is in use by a KVM guest, it is hot plugged into the guest to dynamically provide access to the adapters, domains and control domains provided via the newly refreshed APCB. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: prepare for dynamic update of guest's APCB on queue probe/removeTony Krowiak
The callback functions for probing and removing a queue device must take and release the locks required to perform a dynamic update of a guest's APCB in the proper order. The proper order for taking the locks is: matrix_dev->guests_lock => kvm->lock => matrix_dev->mdevs_lock The proper order for releasing the locks is: matrix_dev->mdevs_lock => kvm->lock => matrix_dev->guests_lock A new helper function is introduced to be used by the probe callback to acquire the required locks. Since the probe callback only has access to a queue device when it is called, the helper function will find the ap_matrix_mdev object to which the queue device's APQN is assigned and return it so the KVM guest to which the mdev is attached can be dynamically updated. Note that in order to find the ap_matrix_mdev (matrix_mdev) object, it is necessary to search the matrix_dev->mdev_list. This presents a locking order dilemma because the matrix_dev->mdevs_lock can't be taken to protect against changes to the list while searching for the matrix_mdev to which a queue device's APQN is assigned. This is due to the fact that the proper locking order requires that the matrix_dev->mdevs_lock be taken after both the matrix_mdev->kvm->lock and the matrix_dev->mdevs_lock. Consequently, the matrix_dev->guests_lock will be used to protect against removal of a matrix_mdev object from the list while a queue device is being probed. This necessitates changes to the mdev probe/remove callback functions to take the matrix_dev->guests_lock prior to removing a matrix_mdev object from the list. A new macro is also introduced to acquire the locks required to dynamically update the guest's APCB in the proper order when a queue device is removed. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: prepare for dynamic update of guest's APCB on assign/unassignTony Krowiak
The functions backing the matrix mdev's sysfs attribute interfaces to assign/unassign adapters, domains and control domains must take and release the locks required to perform a dynamic update of a guest's APCB in the proper order. The proper order for taking the locks is: matrix_dev->guests_lock => kvm->lock => matrix_dev->mdevs_lock The proper order for releasing the locks is: matrix_dev->mdevs_lock => kvm->lock => matrix_dev->guests_lock Two new macros are introduced for this purpose: One to take the locks and the other to release the locks. These macros will be used by the assignment/unassignment functions to prepare for dynamic update of the KVM guest's APCB. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: use proper locking order when setting/clearing KVM pointerTony Krowiak
The group notifier that handles the VFIO_GROUP_NOTIFY_SET_KVM event must use the required locks in proper locking order to dynamically update the guest's APCB. The proper locking order is: 1. matrix_dev->guests_lock: required to use the KVM pointer to update a KVM guest's APCB. 2. matrix_mdev->kvm->lock: required to update a KVM guest's APCB. 3. matrix_dev->mdevs_lock: required to store or access the data stored in a struct ap_matrix_mdev instance. Two macros are introduced to acquire and release the locks in the proper order. These macros are now used by the group notifier functions. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: introduce new mutex to control access to the KVM pointerTony Krowiak
The vfio_ap device driver registers for notification when the pointer to the KVM object for a guest is set. Recall that the KVM lock (kvm->lock) mutex must be taken outside of the matrix_dev->lock mutex to prevent the reporting by lockdep of a circular locking dependency (a.k.a., a lockdep splat): * see commit 0cc00c8d4050 ("Fix circular lockdep when setting/clearing crypto masks") * see commit 86956e70761b ("replace open coded locks for VFIO_GROUP_NOTIFY_SET_KVM notification") With the introduction of support for hot plugging/unplugging AP devices passed through to a KVM guest, a new guests_lock mutex is introduced to ensure the proper locking order is maintained: struct ap_matrix_dev { ... struct mutex guests_lock; ... } The matrix_dev->guests_lock controls access to the matrix_mdev instances that hold the state for AP devices that have been passed through to a KVM guest. This lock must be held to control access to the KVM pointer (matrix_mdev->kvm) while the vfio_ap device driver is using it to plug/unplug AP devices passed through to the KVM guest. Keep in mind, the proper locking order must be maintained whenever dynamically updating a KVM guest's APCB to plug/unplug adapters, domains and control domains: 1. matrix_dev->guests_lock: required to use the KVM pointer - stored in a struct ap_matrix_mdev instance - to update a KVM guest's APCB 2. matrix_mdev->kvm->lock: required to update a guest's APCB 3. matrix_dev->mdevs_lock: required to access data stored in a struct ap_matrix_mdev instance. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: rename matrix_dev->lock mutex to matrix_dev->mdevs_lockTony Krowiak
The matrix_dev->lock mutex is being renamed to matrix_dev->mdevs_lock to better reflect its purpose, which is to control access to the state of the mediated devices under the control of the vfio_ap device driver. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: allow assignment of unavailable AP queues to mdev deviceTony Krowiak
The current implementation does not allow assignment of an AP adapter or domain to an mdev device if each APQN resulting from the assignment does not reference an AP queue device that is bound to the vfio_ap device driver. This patch allows assignment of AP resources to the matrix mdev as long as the APQNs resulting from the assignment: 1. Are not reserved by the AP BUS for use by the zcrypt device drivers. 2. Are not assigned to another matrix mdev. The rationale behind this is that the AP architecture does not preclude assignment of APQNs to an AP configuration profile that are not available to the system. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdevTony Krowiak
Refresh the guest's APCB by filtering the APQNs and control domain numbers assigned to the matrix mdev. Filtering of APQNs: ----------------- APQNs that do not reference an AP queue device bound to the vfio_ap device driver must be filtered from the APQNs assigned to the matrix mdev before they can be assigned to the guest's APCB. Given that the APQNs are configured in the guest's APCB as a matrix of APIDs (adapters) and APQIs (domains), it is not possible to filter an individual APQN. For example, suppose the matrix of APQNs is structured as follows: APIDs 3 4 5 0 (3,0) (4,0) (5,0) APQIs 1 (3,1) (4,1) (5,1) 2 (3,2) (4,2) (5,2) Now suppose APQN (4,1) does not reference a queue device bound to the vfio_ap device driver. If we filter APID 4, the APQNs (4,0), (4,1) and (4,2) will be removed. Similarly, if we filter domain 1, APQNs (3,1), (4,1) and (5,1) will be removed. To resolve this dilemma, the choice was made to filter the APID - in this case 4 - from the guest's APCB. The reason for this design decision is because the APID references an AP adapter which is a real hardware device that can be physically installed, removed, enabled or disabled; whereas, a domain is a partition within the adapter. It therefore better reflects reality to remove the APID from the guest's APCB. Filtering of control domains: ---------------------------- Any control domains that are not assigned to the host's AP configuration will be filtered from those assigned to the matrix mdev before assigning them to the guest's APCB. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: introduce shadow APCBTony Krowiak
The APCB is a field within the CRYCB that provides the AP configuration to a KVM guest. Let's introduce a shadow copy of the KVM guest's APCB and maintain it for the lifespan of the guest. The shadow APCB serves the following purposes: 1. The shadow APCB can be maintained even when the mediated device is not currently in use by a KVM guest. Since the mediated device's AP configuration is filtered to ensure that no AP queues are passed through to the KVM guest that are not bound to the vfio_ap device driver or available to the host, the mediated device's AP configuration may differ from the guest's. Having a shadow of a guest's APCB allows us to provide a sysfs interface to view the guest's APCB even if the mediated device is not currently passed through to a KVM guest. This can aid in problem determination when the guest is unexpectedly missing AP resources. 2. If filtering was done in-place for the real APCB, the guest could pick up a transient state. Doing the filtering on a shadow and transferring the AP configuration to the real APCB after the guest is started or when AP resources are assigned to or unassigned from the mediated device, or when the host configuration changes, the guest's AP configuration will never be in a transient state. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: manage link between queue struct and matrix mdevTony Krowiak
Let's create links between each queue device bound to the vfio_ap device driver and the matrix mdev to which the queue's APQN is assigned. The idea is to facilitate efficient retrieval of the objects representing the queue devices and matrix mdevs as well as to verify that a queue assigned to a matrix mdev is bound to the driver. The links will be created as follows: * When the queue device is probed, if its APQN is assigned to a matrix mdev, the structures representing the queue device and the matrix mdev will be linked. * When an adapter or domain is assigned to a matrix mdev, for each new APQN assigned that references a queue device bound to the vfio_ap device driver, the structures representing the queue device and the matrix mdev will be linked. The links will be removed as follows: * When the queue device is removed, if its APQN is assigned to a matrix mdev, the link from the structure representing the matrix mdev to the structure representing the queue will be removed. Since the storage allocated for the vfio_ap_queue will be freed, there is no need to remove the link to the matrix_mdev to which the queue's APQN is assigned. * When an adapter or domain is unassigned from a matrix mdev, for each APQN unassigned that references a queue device bound to the vfio_ap device driver, the structures representing the queue device and the matrix mdev will be unlinked. * When an mdev is removed, the link from any queues assigned to the mdev to the mdev will be removed. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: move probe and remove callbacks to vfio_ap_ops.cTony Krowiak
Let's move the probe and remove callbacks into the vfio_ap_ops.c file to keep all code related to managing queues in a single file. This way, all functions related to queue management can be removed from the vfio_ap_private.h header file defining the public interfaces for the vfio_ap device driver. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19s390/vfio-ap: use new AP bus interface to search for queue devicesTony Krowiak
This patch refactors the vfio_ap device driver to use the AP bus's ap_get_qdev() function to retrieve the vfio_ap_queue struct containing information about a queue that is bound to the vfio_ap device driver. The bus's ap_get_qdev() function retrieves the queue device from a hashtable keyed by APQN. This is much more efficient than looping over the list of devices attached to the AP bus by several orders of magnitude. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-16Merge tag 's390-5.19-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Alexander Gordeev: - Fix building of out-of-tree kernel modules without a pre-built kernel in case CONFIG_EXPOLINE_EXTERN=y. - Fix a reference counting error that could prevent unloading of zcrypt modules. * tag 's390-5.19-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/ap: fix error handling in __verify_queue_reservations() s390/nospec: remove unneeded header includes s390/nospec: build expoline.o for modules_prepare target
2022-07-15s390/ap: fix error handling in __verify_queue_reservations()Tony Krowiak
The AP bus's __verify_queue_reservations function increments the ref count for the device driver passed in as a parameter, but fails to decrement it before returning control to the caller. This will prevents any subsequent removal of the module. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reported-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Fixes: 4f8206b88286 ("s390/ap: driver callback to indicate resource in use") Link: https://lore.kernel.org/r/20220706222619.602094-1-akrowiak@linux.ibm.com Cc: stable@vger.kernel.org [agordeev@linux.ibm.com fixed description, added Fixes and Link] Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-11s390/airq: allow for airq structure that uses an input vectorMatthew Rosato
When doing device passthrough where interrupts are being forwarded from host to guest, we wish to use a pinned section of guest memory as the vector (the same memory used by the guest as the vector). To accomplish this, add a new parameter for airq_iv_create which allows passing an existing vector to be used instead of allocating a new one. The caller is responsible for ensuring the vector is pinned in memory as well as for unpinning the memory when the vector is no longer needed. A subsequent patch will use this new parameter for zPCI interpretation. Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-7-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-11s390/airq: pass more TPI info to airq handlersMatthew Rosato
A subsequent patch will introduce an airq handler that requires additional TPI information beyond directed vs floating, so pass the entire tpi_info structure via the handler. Only pci actually uses this information today, for the other airq handlers this is effectively a no-op. Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-6-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-11s390/sclp: detect the AISI facilityMatthew Rosato
Detect the Adapter Interruption Suppression Interpretation facility. Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-5-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-11s390/sclp: detect the AENI facilityMatthew Rosato
Detect the Adapter Event Notification Interpretation facility. Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-4-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-11s390/sclp: detect the AISII facilityMatthew Rosato
Detect the Adapter Interruption Source ID Interpretation facility. Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-3-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-11s390/sclp: detect the zPCI load/store interpretation facilityMatthew Rosato
Detect the zPCI Load/Store Interpretation facility. Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-2-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-07scsi: zfcp: Drop redundant "the" in the commentsJiang Jian
There is an unexpected word "the" in the comments that needs to be dropped. file: ./drivers/s390/scsi/zfcp_diag.h line: 5 * Definitions for handling diagnostics in the the zfcp device driver. changed to * Definitions for handling diagnostics in the zfcp device driver. Link: https://lore.kernel.org/r/20220621114207.106405-1-jiangjian@cdjrlc.com Link: https://lore.kernel.org/r/4c02fefa19ab46f0f163990bde3ce10bd9c7caf1.1657122360.git.bblock@linux.ibm.com Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-07scsi: zfcp: Declare zfcp_sdev_attrs as staticJulian Wiedmann
These attributes are now only accessed through the zfcp_sysfs_sdev_attr_group. Link: https://lore.kernel.org/r/0791b9149ebfa39e6b8eab093113cd2527dbf3d3.1657122360.git.bblock@linux.ibm.com Fixes: d8d7cf3f7d07 ("scsi: zfcp: Switch to attribute groups") Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-07vfio/ccw: Move FSM open/close to MDEV open/closeEric Farman
Part of the confusion that has existed is the FSM lifecycle of subchannels between the common CSS driver and the vfio-ccw driver. During configuration, the FSM state goes from NOT_OPER to STANDBY to IDLE, but then back to NOT_OPER. For example: vfio_ccw_sch_probe: VFIO_CCW_STATE_NOT_OPER vfio_ccw_sch_probe: VFIO_CCW_STATE_STANDBY vfio_ccw_mdev_probe: VFIO_CCW_STATE_IDLE vfio_ccw_mdev_remove: VFIO_CCW_STATE_NOT_OPER vfio_ccw_sch_remove: VFIO_CCW_STATE_NOT_OPER vfio_ccw_sch_shutdown: VFIO_CCW_STATE_NOT_OPER Rearrange the open/close events to align with the mdev open/close, to better manage the memory and state of the devices as time progresses. Specifically, make mdev_open() perform the FSM open, and mdev_close() perform the FSM close instead of reset (which is both close and open). This makes the NOT_OPER state a dead-end path, indicating the device is probably not recoverable without fully probing and re-configuring the device. This has the nice side-effect of removing a number of special-cases where the FSM state is managed outside of the FSM itself (such as the aforementioned mdev_close() routine). Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-12-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Refactor vfio_ccw_mdev_resetEric Farman
Use both the FSM Close and Open events when resetting an mdev, rather than making a separate call to cio_enable_subchannel(). Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-11-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Create a CLOSE FSM eventEric Farman
Refactor the vfio_ccw_sch_quiesce() routine to extract the bit that disables the subchannel and affects the FSM state. Use this to form the basis of a CLOSE event that will mirror the OPEN event, and move the subchannel back to NOT_OPER state. A key difference with that mirroring is that while OPEN handles the transition from NOT_OPER => STANDBY, the later probing of the mdev handles the transition from STANDBY => IDLE. On the other hand, the CLOSE event will move from one of the operating states {IDLE, CP_PROCESSING, CP_PENDING} => NOT_OPER. That is, there is no stop in a STANDBY state on the deconfigure path. Add a call to cp_free() in this event, such that it is captured for the various permutations of this event. In the unlikely event that cio_disable_subchannel() returns -EBUSY, the remaining logic of vfio_ccw_sch_quiesce() can still be used. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-10-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Create an OPEN FSM EventEric Farman
Move the process of enabling a subchannel for use by vfio-ccw into the FSM, such that it can manage the sequence of lifecycle events for the device. That is, if the FSM state is NOT_OPER(erational), then do the work that would enable the subchannel and move the FSM to STANDBY state. An attempt to perform this event again from any of the other operating states (IDLE, CP_PROCESSING, CP_PENDING) will convert the device back to NOT_OPER so the configuration process can be started again. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-9-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Update trace data for not operational eventEric Farman
We currently cut a very basic trace whenever the FSM directs control to the not operational routine. Convert this to a message, so it's alongside the other configuration related traces (create, remove, etc.), and record both the event that brought us here and the current state of the device. This will provide some better footprints if things go bad. Suggested-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-8-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Flatten MDEV device (un)registerEric Farman
The vfio_ccw_mdev_(un)reg routines are merely vfio-ccw routines that pass control to mdev_(un)register_device. Since there's only one caller of each, let's just call the mdev routines directly. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-7-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Pass enum to FSM event jumptableEric Farman
The FSM has an enumerated list of events defined. Use that as the argument passed to the jump table, instead of a regular int. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-6-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Remove private->mdevEric Farman
There are no remaining users of private->mdev. Remove it. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220707135737.720765-5-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07vfio/ccw: Do not change FSM state in subchannel eventEric Farman
The routine vfio_ccw_sch_event() is tasked with handling subchannel events, specifically machine checks, on behalf of vfio-ccw. It correctly calls cio_update_schib(), and if that fails (meaning the subchannel is gone) it makes an FSM event call to mark the subchannel Not Operational. If that worked, however, then it decides that if the FSM state was already Not Operational (implying the subchannel just came back), then it should simply change the FSM to partially- or fully-open. Remove this trickery, since a subchannel returning will require more probing than simply "oh all is well again" to ensure it works correctly. Fixes: bbe37e4cb8970 ("vfio: ccw: introduce a finite state machine") Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-4-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>