linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2022-08-01	vfio/ccw: Check return code from subchannel quiesce	Eric Farman
	If a subchannel is busy when a close is performed, the subchannel needs to be quiesced and left nice and tidy, so nothing unexpected (like a solicited interrupt) shows up while in the closed state. Unfortunately, the return code from this call isn't checked, so any busy subchannel is treated as a failing one. Fix that, so that the close on a busy subchannel happens normally. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220728204914.2420989-4-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-08-01	vfio/ccw: Remove FSM Close from remove handlers	Eric Farman
	Now that neither vfio_ccw_sch_probe() nor vfio_ccw_mdev_probe() affect the FSM state, it doesn't make sense for their _remove() counterparts try to revert things in this way. Since the FSM open and close are handled alongside MDEV open/close, these are unnecessary. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220728204914.2420989-3-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-08-01	vfio/ccw: Add length to DMA_UNMAP checks	Eric Farman
	As pointed out with the simplification of the VFIO_IOMMU_NOTIFY_DMA_UNMAP notifier [1], the length parameter was never used to check against the pinned pages. Let's correct that, and see if a page is within the affected range instead of simply the first page of the range. [1] https://lore.kernel.org/kvm/20220720170457.39cda0d0.alex.williamson@redhat.com/ Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220728204914.2420989-2-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-25	vfio: Replace phys_pfn with pages for vfio_pin_pages()	Nicolin Chen
	Most of the callers of vfio_pin_pages() want "struct page " and the low-level mm code to pin pages returns a list of "struct page " too. So there's no gain in converting "struct page *" to PFN in between. Replace the output parameter "phys_pfn" list with a "pages" list, to simplify callers. This also allows us to replace the vfio_iommu_type1 implementation with a more efficient one. And drop the pfn_valid check in the gvt code, as there is no need to do such a check at a page-backed struct page pointer. For now, also update vfio_iommu_type1 to fit this new parameter too. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Eric Farman <farman@linux.ibm.com> Tested-by: Terrence Xu <terrence.xu@intel.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-11-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-25	vfio/ccw: Add kmap_local_page() for memcpy	Nicolin Chen
	A PFN is not secure enough to promise that the memory is not IO. And direct access via memcpy() that only handles CPU memory will crash on S390 if the PFN is an IO PFN, as we have to use the memcpy_to/fromio() that uses the special S390 IO access instructions. On the other hand, a "struct page " is always a CPU coherent thing that fits memcpy(). Also, casting a PFN to "void " for memcpy() is not a proper practice, kmap_local_page() is the correct API to call here, though S390 doesn't use highmem, which means kmap_local_page() is a NOP. There's a following patch changing the vfio_pin_pages() API to return a list of "struct page *" instead of PFNs. It will block any IO memory from ever getting into this call path, for such a security purpose. In this patch, add kmap_local_page() to prepare for that. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Eric Farman <farman@linux.ibm.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-10-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-25	vfio: Rename user_iova of vfio_dma_rw()	Nicolin Chen
	Following the updated vfio_pin/unpin_pages(), use the simpler "iova". Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Terrence Xu <terrence.xu@intel.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-9-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-25	vfio/ccw: Change pa_pfn list to pa_iova list	Nicolin Chen
	The vfio_ccw_cp code maintains both iova and its PFN list because the vfio_pin/unpin_pages API wanted pfn list. Since vfio_pin/unpin_pages() now accept "iova", change to maintain only pa_iova list and rename all "pfn_array" strings to "page_array", so as to simplify the code. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-8-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-25	vfio/ap: Change saved_pfn to saved_iova	Nicolin Chen
	The vfio_ap_ops code maintains both nib address and its PFN, which is redundant, merely because vfio_pin/unpin_pages API wanted pfn. Since vfio_pin/unpin_pages() now accept "iova", change "saved_pfn" to "saved_iova" and remove pfn in the vfio_ap_validate_nib(). Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-7-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-25	vfio: Pass in starting IOVA to vfio_pin/unpin_pages API	Nicolin Chen
	The vfio_pin/unpin_pages() so far accepted arrays of PFNs of user IOVA. Among all three callers, there was only one caller possibly passing in a non-contiguous PFN list, which is now ensured to have contiguous PFN inputs too. Pass in the starting address with "iova" alone to simplify things, so callers no longer need to maintain a PFN list or to pin/unpin one page at a time. This also allows VFIO to use more efficient implementations of pin/unpin_pages. For now, also update vfio_iommu_type1 to fit this new parameter too, while keeping its input intact (being user_iova) since we don't want to spend too much effort swapping its parameters and local variables at that level. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Acked-by: Eric Farman <farman@linux.ibm.com> Tested-by: Terrence Xu <terrence.xu@intel.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-6-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-23	vfio/ccw: Only pass in contiguous pages	Nicolin Chen
	This driver is the only caller of vfio_pin/unpin_pages that might pass in a non-contiguous PFN list, but in many cases it has a contiguous PFN list to process. So letting VFIO API handle a non-contiguous PFN list is actually counterproductive. Add a pair of simple loops to pass in contiguous PFNs only, to have an efficient implementation in VFIO. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-5-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-23	vfio/ap: Pass in physical address of ind to ap_aqic()	Nicolin Chen
	The ap_aqic() is called by vfio_ap_irq_enable() where it passes in a virt value that's casted from a physical address "h_nib". Inside the ap_aqic(), it does virt_to_phys() again. Since ap_aqic() needs a physical address, let's just pass in a pa of ind directly. So change the "ind" to "pa_ind". Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-4-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-23	drm/i915/gvt: Replace roundup with DIV_ROUND_UP	Nicolin Chen
	It's a bit redundant for the maths here using roundup. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Terrence Xu <terrence.xu@intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-3-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-23	vfio: Make vfio_unpin_pages() return void	Nicolin Chen
	There's only one caller that checks its return value with a WARN_ON_ONCE, while all other callers don't check the return value at all. Above that, an undo function should not fail. So, simplify the API to return void by embedding similar WARN_ONs. Also for users to pinpoint which condition fails, separate WARN_ON lines, yet remove the "driver->ops->unpin_pages" check, since it's unreasonable for callers to unpin on something totally random that wasn't even pinned. And remove NULL pointer checks for they would trigger oops vs. warnings. Note that npage is already validated in the vfio core, thus drop the same check in the type1 code. Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Tested-by: Terrence Xu <terrence.xu@intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20220723020256.30081-2-nicolinc@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-22	vfio/spapr_tce: Fix the comment	Alexey Kardashevskiy
	Grepping for "iommu_ops" finds this spot and gives wrong impression that iommu_ops is used in here, fix the comment. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Link: https://lore.kernel.org/r/20220714080912.3713509-1-aik@ozlabs.ru [aw: convert to multi-line comment] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-20	vfio: Replace the iommu notifier with a device list	Jason Gunthorpe
	Instead of bouncing the function call to the driver op through a blocking notifier just have the iommu layer call it directly. Register each device that is being attached to the iommu with the lower driver which then threads them on a linked list and calls the appropriate driver op at the right time. Currently the only use is if dma_unmap() is defined. Also, fully lock all the debugging tests on the pinning path that a dma_unmap is registered. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v4-681e038e30fd+78-vfio_unmap_notif_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-20	vfio: Replace the DMA unmapping notifier with a callback	Jason Gunthorpe
	Instead of having drivers register the notifier with explicit code just have them provide a dma_unmap callback op in their driver ops and rely on the core code to wire it up. Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v4-681e038e30fd+78-vfio_unmap_notif_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07	Merge branches 'v5.20/vfio/spapr_tce-unused-arg-v1', ↵	Alex Williamson
	'v5.20/vfio/comment-typo-v1' and 'v5.20/vfio/vfio-ccw-rework-v4' into v5.20/vfio/next
2022-07-07	vfio/ccw: Move FSM open/close to MDEV open/close	Eric Farman
	Part of the confusion that has existed is the FSM lifecycle of subchannels between the common CSS driver and the vfio-ccw driver. During configuration, the FSM state goes from NOT_OPER to STANDBY to IDLE, but then back to NOT_OPER. For example: vfio_ccw_sch_probe: VFIO_CCW_STATE_NOT_OPER vfio_ccw_sch_probe: VFIO_CCW_STATE_STANDBY vfio_ccw_mdev_probe: VFIO_CCW_STATE_IDLE vfio_ccw_mdev_remove: VFIO_CCW_STATE_NOT_OPER vfio_ccw_sch_remove: VFIO_CCW_STATE_NOT_OPER vfio_ccw_sch_shutdown: VFIO_CCW_STATE_NOT_OPER Rearrange the open/close events to align with the mdev open/close, to better manage the memory and state of the devices as time progresses. Specifically, make mdev_open() perform the FSM open, and mdev_close() perform the FSM close instead of reset (which is both close and open). This makes the NOT_OPER state a dead-end path, indicating the device is probably not recoverable without fully probing and re-configuring the device. This has the nice side-effect of removing a number of special-cases where the FSM state is managed outside of the FSM itself (such as the aforementioned mdev_close() routine). Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-12-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07	vfio/ccw: Refactor vfio_ccw_mdev_reset	Eric Farman
	Use both the FSM Close and Open events when resetting an mdev, rather than making a separate call to cio_enable_subchannel(). Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-11-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07	vfio/ccw: Create a CLOSE FSM event	Eric Farman
	Refactor the vfio_ccw_sch_quiesce() routine to extract the bit that disables the subchannel and affects the FSM state. Use this to form the basis of a CLOSE event that will mirror the OPEN event, and move the subchannel back to NOT_OPER state. A key difference with that mirroring is that while OPEN handles the transition from NOT_OPER => STANDBY, the later probing of the mdev handles the transition from STANDBY => IDLE. On the other hand, the CLOSE event will move from one of the operating states {IDLE, CP_PROCESSING, CP_PENDING} => NOT_OPER. That is, there is no stop in a STANDBY state on the deconfigure path. Add a call to cp_free() in this event, such that it is captured for the various permutations of this event. In the unlikely event that cio_disable_subchannel() returns -EBUSY, the remaining logic of vfio_ccw_sch_quiesce() can still be used. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-10-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07	vfio/ccw: Create an OPEN FSM Event	Eric Farman
	Move the process of enabling a subchannel for use by vfio-ccw into the FSM, such that it can manage the sequence of lifecycle events for the device. That is, if the FSM state is NOT_OPER(erational), then do the work that would enable the subchannel and move the FSM to STANDBY state. An attempt to perform this event again from any of the other operating states (IDLE, CP_PROCESSING, CP_PENDING) will convert the device back to NOT_OPER so the configuration process can be started again. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-9-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07	vfio/ccw: Update trace data for not operational event	Eric Farman
	We currently cut a very basic trace whenever the FSM directs control to the not operational routine. Convert this to a message, so it's alongside the other configuration related traces (create, remove, etc.), and record both the event that brought us here and the current state of the device. This will provide some better footprints if things go bad. Suggested-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-8-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07	vfio/ccw: Flatten MDEV device (un)register	Eric Farman
	The vfio_ccw_mdev_(un)reg routines are merely vfio-ccw routines that pass control to mdev_(un)register_device. Since there's only one caller of each, let's just call the mdev routines directly. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-7-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07	vfio/ccw: Pass enum to FSM event jumptable	Eric Farman
	The FSM has an enumerated list of events defined. Use that as the argument passed to the jump table, instead of a regular int. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-6-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07	vfio/ccw: Remove private->mdev	Eric Farman
	There are no remaining users of private->mdev. Remove it. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220707135737.720765-5-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07	vfio/ccw: Do not change FSM state in subchannel event	Eric Farman
	The routine vfio_ccw_sch_event() is tasked with handling subchannel events, specifically machine checks, on behalf of vfio-ccw. It correctly calls cio_update_schib(), and if that fails (meaning the subchannel is gone) it makes an FSM event call to mark the subchannel Not Operational. If that worked, however, then it decides that if the FSM state was already Not Operational (implying the subchannel just came back), then it should simply change the FSM to partially- or fully-open. Remove this trickery, since a subchannel returning will require more probing than simply "oh all is well again" to ensure it works correctly. Fixes: bbe37e4cb8970 ("vfio: ccw: introduce a finite state machine") Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-4-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07	vfio/ccw: Fix FSM state if mdev probe fails	Eric Farman
	The FSM is in STANDBY state when arriving in vfio_ccw_mdev_probe(), and this routine converts it to IDLE as part of its processing. The error exit sets it to IDLE (again) but clears the private->mdev pointer. The FSM should of course be managing the state itself, but the correct thing for vfio_ccw_mdev_probe() to do would be to put the state back the way it found it. The corresponding check of private->mdev in vfio_ccw_sch_io_todo() can be removed, since the distinction is unnecessary at this point. Fixes: 3bf1311f351ef ("vfio/ccw: Convert to use vfio_register_emulated_iommu_dev()") Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-3-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07	vfio/ccw: Remove UUID from s390 debug log	Michael Kawano
	As vfio-ccw devices are created/destroyed, the uuid of the associated mdevs that are recorded in $S390DBF/vfio_ccw_msg/sprintf get lost. This is because a pointer to the UUID is stored instead of the UUID itself, and that memory may have been repurposed if/when the logs are examined. The result is usually garbage UUID data in the logs, though there is an outside chance of an oops happening here. Simply remove the UUID from the traces, as the subchannel number will provide useful configuration information for problem determination, and is stored directly into the log instead of a pointer. As we were the only consumer of mdev_uuid(), remove that too. Cc: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Michael Kawano <mkawano@linux.ibm.com> Fixes: 60e05d1cf0875 ("vfio-ccw: add some logging") Fixes: b7701dfbf9832 ("vfio-ccw: Register a chp_event callback for vfio-ccw") [farman: reworded commit message, added Fixes: tags] Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Link: https://lore.kernel.org/r/20220707135737.720765-2-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07	vfio/spapr_tce: Remove the unused parameters container	Deming Wang
	The parameter of container has been unused for tce_iommu_unuse_page. So, we should delete it. Signed-off-by: Deming Wang <wangdeming@inspur.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Link: https://lore.kernel.org/r/20220702064613.5293-1-wangdeming@inspur.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-06	vfio/pci: fix the wrong word	Bo Liu
	This patch fixes a wrong word in comment. Signed-off-by: Bo Liu <liubo03@inspur.com> Link: https://lore.kernel.org/r/20220704023649.3913-1-liubo03@inspur.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-06-30	Merge branches 'v5.20/vfio/migration-enhancements-v3', ↵	Alex Williamson
	'v5.20/vfio/simplify-bus_type-determination-v3', 'v5.20/vfio/check-vfio_register_iommu_driver-return-v2', 'v5.20/vfio/check-iommu_group_set_name_return-v1', 'v5.20/vfio/clear-caps-buf-v3', 'v5.20/vfio/remove-useless-judgement-v1' and 'v5.20/vfio/move-device_open-count-v2' into v5.20/vfio/next
2022-06-30	vfio: Move "device->open_count--" out of group_rwsem in vfio_device_open()	Yi Liu
	We do not protect the vfio_device::open_count with group_rwsem elsewhere (see vfio_device_fops_release as a comparison, where we already drop group_rwsem before open_count--). So move the group_rwsem unlock prior to open_count--. This change now also drops group_rswem before setting device->kvm = NULL, but that's also OK (again, just like vfio_device_fops_release). The setting of device->kvm before open_device is technically done while holding the group_rwsem, this is done to protect the group kvm value we are copying from, and we should not be relying on that to protect the contents of device->kvm; instead we assume this value will not change until after the device is closed and while under the dev_set->lock. Cc: Matthew Rosato <mjrosato@linux.ibm.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20220627074119.523274-1-yi.l.liu@intel.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-06-30	vfio: remove useless judgement	Li Zhe
	In function vfio_dma_do_unmap(), we currently prevent process to unmap vfio dma region whose mm_struct is different from the vfio_dma->task. In our virtual machine scenario which is using kvm and qemu, this judgement stops us from liveupgrading our qemu, which uses fork() && exec() to load the new binary but the new process cannot do the VFIO_IOMMU_UNMAP_DMA action during vm exit because of this judgement. This judgement is added in commit 8f0d5bb95f76 ("vfio iommu type1: Add task structure to vfio_dma") for the security reason. But it seems that no other task who has no family relationship with old and new process can get the same vfio_dma struct here for the reason of resource isolation. So this patch delete it. Signed-off-by: Li Zhe <lizhe.67@bytedance.com> Reviewed-by: Jason Gunthorpe <jgg@ziepe.ca> Link: https://lore.kernel.org/r/20220627035109.73745-1-lizhe.67@bytedance.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-06-30	vfio: Clear the caps->buf to NULL after free	Schspa Shi
	On buffer resize failure, vfio_info_cap_add() will free the buffer, report zero for the size, and return -ENOMEM. As additional hardening, also clear the buffer pointer to prevent any chance of a double free. Signed-off-by: Schspa Shi <schspa@gmail.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20220629022948.55608-1-schspa@gmail.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-06-30	vfio: check iommu_group_set_name() return value	Liam Ni
	As iommu_group_set_name() can fail, we should check the return value. Signed-off-by: Liam Ni <zhiguangni01@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20220625114239.9301-1-zhiguangni01@gmail.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-06-30	vfio: Split migration ops from main device ops	Yishai Hadas
	vfio core checks whether the driver sets some migration op (e.g. set_state/get_state) and accordingly calls its op. However, currently mlx5 driver sets the above ops without regards to its migration caps. This might lead to unexpected usage/Oops if user space may call to the above ops even if the driver doesn't support migration. As for example, the migration state_mutex is not initialized in that case. The cleanest way to manage that seems to split the migration ops from the main device ops, this will let the driver setting them separately from the main ops when it's applicable. As part of that, validate ops construction on registration and include a check for VFIO_MIGRATION_STOP_COPY since the uAPI claims it must be set in migration_flags. HISI driver was changed as well to match this scheme. This scheme may enable down the road to come with some extra group of ops (e.g. DMA log) that can be set without regards to the other options based on driver caps. Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5 devices") Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20220628155910.171454-3-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-06-30	vfio/mlx5: Protect mlx5vf_disable_fds() upon close device	Yishai Hadas
	Protect mlx5vf_disable_fds() upon close device to be called under the state mutex as done in all other places. This will prevent a race with any other flow which calls mlx5vf_disable_fds() as of health/recovery upon MLX5_PF_NOTIFY_DISABLE_VF event. Encapsulate this functionality in a separate function named mlx5vf_cmd_close_migratable() to consider migration caps and for further usage upon close device. Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5 devices") Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20220628155910.171454-2-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-06-27	vfio: check vfio_register_iommu_driver() return value	Bo Liu
	As vfio_register_iommu_driver() can fail, we should check the return value. Signed-off-by: Bo Liu <liubo03@inspur.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20220622045651.5416-1-liubo03@inspur.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-06-27	vfio: Use device_iommu_capable()	Robin Murphy
	Use the new interface to check the capabilities for our device specifically. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4ea5eb64246f1ee188d1a61c3e93b37756932eb7.1656092606.git.robin.murphy@arm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-06-27	vfio/type1: Simplify bus_type determination	Robin Murphy
	Since IOMMU groups are mandatory for drivers to support, it stands to reason that any device which has been successfully added to a group must be on a bus supported by that IOMMU driver, and therefore a domain viable for any device in the group must be viable for all devices in the group. This already has to be the case for the IOMMU API's internal default domain, for instance. Thus even if the group contains devices on different buses, that can only mean that the IOMMU driver actually supports such an odd topology, and so without loss of generality we can expect the bus type of any device in a group to be suitable for IOMMU API calls. Furthermore, scrutiny reveals a lack of protection for the bus being removed while vfio_iommu_type1_attach_group() is using it; the reference that VFIO holds on the iommu_group ensures that data remains valid, but does not prevent the group's membership changing underfoot. We can address both concerns by recycling vfio_bus_type() into some superficially similar logic to indirect the IOMMU API calls themselves. Each call is thus protected from races by the IOMMU group's own locking, and we no longer need to hold group-derived pointers beyond that scope. It also gives us an easy path for the IOMMU API's migration of bus-based interfaces to device-based, of which we can already take the first step with device_iommu_capable(). As with domains, any capability must in practice be consistent for devices in a given group - and after all it's still the same capability which was expected to be consistent across an entire bus! - so there's no need for any complicated validation. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/194a12d3434d7b38f84fa96503c7664451c8c395.1656092606.git.robin.murphy@arm.com [aw: add comment to vfio_iommu_device_capable()] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-06-27	vfio: de-extern-ify function prototypes	Alex Williamson
	The use of 'extern' in function prototypes has been disrecommended in the kernel coding style for several years now, remove them from all vfio related files so contributors no longer need to decide between style and consistency. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/165471414407.203056.474032786990662279.stgit@omen Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-06-26	Linux 5.19-rc4	Linus Torvalds

2022-06-26	Merge tag 'soc-fixes-5.19' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "A number of fixes have accumulated, but they are largely for harmless issues: - Several OF node leak fixes - A fix to the Exynos7885 UART clock description - DTS fixes to prevent boot failures on TI AM64 and J721s2 - Bus probe error handling fixes for Baikal-T1 - A fixup to the way STM32 SoCs use separate dts files for different firmware stacks - Multiple code fixes for Arm SCMI firmware, all dealing with robustness of the implementation - Multiple NXP i.MX devicetree fixes, addressing incorrect data in DT nodes - Three updates to the MAINTAINERS file, including Florian Fainelli taking over BCM283x/BCM2711 (Raspberry Pi) from Nicolas Saenz Julienne" * tag 'soc-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (29 commits) ARM: dts: aspeed: nuvia: rename vendor nuvia to qcom arm: mach-spear: Add missing of_node_put() in time.c ARM: cns3xxx: Fix refcount leak in cns3xxx_init MAINTAINERS: Update email address arm64: dts: ti: k3-am64-main: Remove support for HS400 speed mode arm64: dts: ti: k3-j721s2: Fix overlapping GICD memory region ARM: dts: bcm2711-rpi-400: Fix GPIO line names bus: bt1-axi: Don't print error on -EPROBE_DEFER bus: bt1-apb: Don't print error on -EPROBE_DEFER ARM: Fix refcount leak in axxia_boot_secondary ARM: dts: stm32: move SCMI related nodes in a dedicated file for stm32mp15 soc: imx: imx8m-blk-ctrl: fix display clock for LCDIF2 power domain ARM: dts: imx6qdl-colibri: Fix capacitive touch reset polarity ARM: dts: imx6qdl: correct PU regulator ramp delay firmware: arm_scmi: Fix incorrect error propagation in scmi_voltage_descriptors_get firmware: arm_scmi: Avoid using extended string-buffers sizes if not necessary firmware: arm_scmi: Fix SENSOR_AXIS_NAME_GET behaviour when unsupported ARM: dts: imx7: Move hsic_phy power domain to HSIC PHY node soc: bcm: brcmstb: pm: pm-arm: Fix refcount leak in brcmstb_pm_probe MAINTAINERS: Update BCM2711/BCM2835 maintainer ...
2022-06-26	Merge tag 'mm-hotfixes-stable-2022-06-26' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "Minor things, mainly - mailmap updates, MAINTAINERS updates, etc. Fixes for this merge window: - fix for a damon boot hang, from SeongJae - fix for a kfence warning splat, from Jason Donenfeld - fix for zero-pfn pinning, from Alex Williamson - fix for fallocate hole punch clearing, from Mike Kravetz Fixes for previous releases: - fix for a performance regression, from Marcelo - fix for a hwpoisining BUG from zhenwei pi" * tag 'mm-hotfixes-stable-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mailmap: add entry for Christian Marangi mm/memory-failure: disable unpoison once hw error happens hugetlbfs: zero partial pages during fallocate hole punch mm: memcontrol: reference to tools/cgroup/memcg_slabinfo.py mm: re-allow pinning of zero pfns mm/kfence: select random number before taking raw lock MAINTAINERS: add maillist information for LoongArch MAINTAINERS: update MM tree references MAINTAINERS: update Abel Vesa's email MAINTAINERS: add MEMORY HOT(UN)PLUG section and add David as reviewer MAINTAINERS: add Miaohe Lin as a memory-failure reviewer mailmap: add alias for jarkko@profian.com mm/damon/reclaim: schedule 'damon_reclaim_timer' only after 'system_wq' is initialized kthread: make it clear that kthread_create_on_node() might be terminated by any fatal signal mm: lru_cache_disable: use synchronize_rcu_expedited mm/page_isolation.c: fix one kernel-doc comment
2022-06-26	Merge tag 'perf-tools-fixes-for-v5.19-2022-06-26' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Enable ignore_missing_thread in 'perf stat', enabling counting with '--pid' when threads disappear during counting session setup - Adjust output data offset for backward compatibility in 'perf inject' - Fix missing free in copy_kcore_dir() in 'perf inject' - Fix caching files with a wrong build ID - Sync drm, cpufeatures, vhost and svn headers with the kernel * tag 'perf-tools-fixes-for-v5.19-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: tools headers UAPI: Synch KVM's svm.h header with the kernel tools include UAPI: Sync linux/vhost.h with the kernel sources perf stat: Enable ignore_missing_thread perf inject: Adjust output data offset for backward compatibility perf trace beauty: Fix generation of errno id->str table on ALT Linux perf build-id: Fix caching files with a wrong build ID tools headers cpufeatures: Sync with the kernel sources tools headers UAPI: Sync drm/i915_drm.h with the kernel sources perf inject: Fix missing free in copy_kcore_dir()
2022-06-26	Merge tag 'for-5.19-rc3-tag' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - zoned relocation fixes: - fix critical section end for extent writeback, this could lead to out of order write - prevent writing to previous data relocation block group if space gets low - reflink fixes: - fix race between reflinking and ordered extent completion - proper error handling when block reserve migration fails - add missing inode iversion/mtime/ctime updates on each iteration when replacing extents - fix deadlock when running fsync/fiemap/commit at the same time - fix false-positive KCSAN report regarding pid tracking for read locks and data race - minor documentation update and link to new site * tag 'for-5.19-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Documentation: update btrfs list of features and link to readthedocs.io btrfs: fix deadlock with fsync+fiemap+transaction commit btrfs: don't set lock_owner when locking extent buffer for reading btrfs: zoned: fix critical section of relocation inode writeback btrfs: zoned: prevent allocation from previous data relocation BG btrfs: do not BUG_ON() on failure to migrate space when replacing extents btrfs: add missing inode updates on each iteration when replacing extents btrfs: fix race between reflinking and ordered extent completion
2022-06-26	Merge tag 'dma-mapping-5.19-2022-06-26' of ↵	Linus Torvalds
	git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fix from Christoph Hellwig: - pass the correct size to dma_set_encrypted() when freeing memory (Dexuan Cui) * tag 'dma-mapping-5.19-2022-06-26' of git://git.infradead.org/users/hch/dma-mapping: dma-direct: use the correct size for dma_set_encrypted()
2022-06-26	Merge tag 'for-5.19/fbdev-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev Pull fbdev fixes from Helge Deller: "Two bug fixes for the pxa3xx and intelfb drivers: - pxa3xx-gcu: Fix integer overflow in pxa3xx_gcu_write - intelfb: Initialize value of stolen size The other changes are small cleanups, simplifications and documentation updates to the cirrusfb, skeletonfb, omapfb, intelfb, au1100fb and simplefb drivers" * tag 'for-5.19/fbdev-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: video: fbdev: omap: Remove duplicate 'the' in comment video: fbdev: omapfb: Align '*' in comment video: fbdev: simplefb: Check before clk_put() not needed video: fbdev: au1100fb: Drop unnecessary NULL ptr check video: fbdev: pxa3xx-gcu: Fix integer overflow in pxa3xx_gcu_write video: fbdev: skeletonfb: Convert to generic power management video: fbdev: cirrusfb: Remove useless reference to PCI power management video: fbdev: intelfb: Initialize value of stolen size video: fbdev: intelfb: Use aperture size from pci_resource_len video: fbdev: skeletonfb: Fix syntax errors in comments
2022-06-26	Merge tag 'for-5.19/parisc-3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc architecture fixes from Helge Deller: - enable ARCH_HAS_STRICT_MODULE_RWX to prevent a boot crash on c8000 machines - flush all mappings of a shared anonymous page on PA8800/8900 machines via flushing the whole data cache. This may slow down such machines but makes sure that the cache is consistent - Fix duplicate definition build error regarding fb_is_primary_device() * tag 'for-5.19/parisc-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Enable ARCH_HAS_STRICT_MODULE_RWX parisc: Fix flush_anon_page on PA8800/PA8900 parisc: align '*' in comment in math-emu code parisc/stifb: Fix fb_is_primary_device() only available with CONFIG_FB_STI
2022-06-26	Merge tag 'xtensa-20220626' of https://github.com/jcmvbkbc/linux-xtensa	Linus Torvalds
	Pull xtensa fixes from Max Filippov: - fix OF reference leaks in xtensa arch code - replace '.bss' with '.section .bss' to fix entry.S build with old assembler * tag 'xtensa-20220626' of https://github.com/jcmvbkbc/linux-xtensa: xtensa: change '.bss' to '.section .bss' xtensa: xtfpga: Fix refcount leak bug in setup xtensa: Fix refcount leak bug in time.c