path: root/drivers/iommu
Age  Commit message  Author
2024-02-16  iommu: Merge iommu_fault_event and iopf_fault  (Lu Baolu)
The iommu_fault_event and iopf_fault data structures store the same information about an iopf fault. They are also used in the same way. Merge these two data structures into a single one to make the code more concise and easier to maintain. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Longfang Liu <liulongfang@huawei.com> Link: https://lore.kernel.org/r/20240212012227.119381-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
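A sketch of what the merged structure could look like; the field names are assumptions based on the description above, not a quote of the final definition:
```
/* Sketch: one structure serving both roles -- a fault reported by the
 * IOMMU driver, and an entry queued on a partial/pending fault list. */
struct iopf_fault {
	struct iommu_fault fault;	/* the fault record itself */
	struct list_head list;		/* node in an iopf group/queue list */
};
```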
2024-02-16  iommu: Remove iommu_[un]register_device_fault_handler()  (Lu Baolu)
The individual iommu drivers report iommu page faults by calling iommu_report_device_fault(), where a pre-registered device fault handler is called to route the fault to another fault handler installed on the corresponding iommu domain. The pre-registered device fault handler is effectively static, since the fault handler is ultimately per iommu domain. Replace the call to the device fault handler with a direct call to iommu_queue_iopf(). After this replacement, the interfaces for registering and unregistering a fault handler are no longer needed anywhere; remove them and the related data structures to avoid dead code. Also convert the cookie parameter of iommu_queue_iopf() into a device pointer, which is what is actually passed. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Longfang Liu <liulongfang@huawei.com> Link: https://lore.kernel.org/r/20240212012227.119381-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16  iommu: Merge iopf_device_param into iommu_fault_param  (Lu Baolu)
The struct dev_iommu contains two pointers, fault_param and iopf_param. The fault_param pointer points to a data structure that is used to store pending faults that are awaiting responses. The iopf_param pointer points to a data structure that is used to store partial faults that are part of a Page Request Group. The fault_param and iopf_param pointers are essentially duplicates, which wastes memory. Merge the iopf_device_param pointer into the iommu_fault_param pointer to consolidate the code and save memory. The consolidated pointer is allocated on demand when the device driver enables iopf on the device, and freed after iopf is disabled. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Longfang Liu <liulongfang@huawei.com> Link: https://lore.kernel.org/r/20240212012227.119381-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16  iommu: Cleanup iopf data structure definitions  (Lu Baolu)
struct iommu_fault_page_request and struct iommu_page_response are not part of uAPI anymore. Convert them to data structures for kAPI. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Longfang Liu <liulongfang@huawei.com> Link: https://lore.kernel.org/r/20240212012227.119381-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16  iommu/arm-smmu-v3: Remove unrecoverable faults reporting  (Lu Baolu)
No device driver registers a fault handler to handle the reported unrecoverable faults. Remove this reporting to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Longfang Liu <liulongfang@huawei.com> Link: https://lore.kernel.org/r/20240212012227.119381-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16  iommu/mtk_iommu: Use devm_kcalloc() instead of devm_kzalloc()  (Erick Archer)
This is an effort to get rid of all multiplications from allocation functions in order to prevent integer overflows [1]. Here the multiplication is obviously safe because MTK_PROTECT_PA_ALIGN is defined as a literal value: 256 in "mtk_iommu.c" and 128 in "mtk_iommu_v1.c". However, using devm_kcalloc() is more appropriate [2] and improves readability. This patch has no effect on runtime behavior. Link: https://github.com/KSPP/linux/issues/162 [1] Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [2] Signed-off-by: Erick Archer <erick.archer@gmx.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20240211182250.12656-1-erick.archer@gmx.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
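A minimal sketch of the pattern being replaced, assuming the two-region protect buffer the message implies; the variable names are illustrative:
```
/* Before: open-coded multiplication in the size argument. */
protect = devm_kzalloc(dev, MTK_PROTECT_PA_ALIGN * 2, GFP_KERNEL);

/* After: devm_kcalloc() computes count * size with overflow checking. */
protect = devm_kcalloc(dev, 2, MTK_PROTECT_PA_ALIGN, GFP_KERNEL);
if (!protect)
	return -ENOMEM;
```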
2024-02-16  iommu/amd: Mark interrupt as managed  (Mario Limonciello)
On many systems that have an AMD IOMMU, the following sequence of warnings is observed during bootup.
```
pci 0000:00:00.2 can't derive routing for PCI INT A
pci 0000:00:00.2: PCI INT A: not connected
```
This happens because of the IOMMU initialization sequence order and the lack of _PRT entries for the IOMMU. During initialization, the IOMMU driver first enables the PCI device using pci_enable_device(). This calls acpi_pci_irq_enable(), which checks whether the interrupt is declared in a PCI routing table (_PRT) entry. According to the ACPI spec [1], these routing entries are only required under PCI root bridges: "The _PRT object is required under all PCI root bridges". The IOMMU is directly connected to the root complex, so there is no parent bridge to look for a _PRT entry in; the first warning is emitted since no entry could be found in the hierarchy. The second warning is then emitted because the interrupt hasn't yet been configured to any value: the pin was configured in pci_read_irq(), but the byte at PCI_INTERRUPT_LINE returned 0xff, which means "unknown". After that sequence of events, pci_enable_msi() is called and allocates an interrupt. That is, both of these warnings are totally harmless, because the IOMMU uses MSI for interrupts. To avoid even trying to probe for a _PRT entry, mark the IOMMU as IRQ managed. This avoids both warnings. Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html?highlight=_prt#prt-pci-routing-table [1] Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Fixes: cffe0a2b5a34 ("x86, irq: Keep balance of IOAPIC pin reference count") Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20240122233400.1802-1-mario.limonciello@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
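A sketch of the shape of the fix: struct pci_dev does carry an irq_managed flag that the ACPI PCI IRQ code consults, but the exact placement shown here is an assumption:
```
/* Sketch: the IOMMU uses MSI, so tell the PCI/ACPI core its legacy IRQ
 * is already managed; acpi_pci_irq_enable() then skips the _PRT probe
 * that produced the two warnings. */
iommu->dev->irq_managed = 1;

ret = pci_enable_device(iommu->dev);
if (ret)
	return ret;
```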
2024-02-15  irqchip: Convert all platform MSI users to the new API  (Thomas Gleixner)
Switch all the users of the platform MSI domain over to invoke the new interfaces, which branch to the original platform MSI functions when the irqdomain associated with the caller device does not yet provide MSI parent functionality. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Anup Patel <apatel@ventanamicro.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240127161753.114685-7-apatel@ventanamicro.com
2024-02-13  Revert "iommu/arm-smmu: Convert to domain_alloc_paging()"  (Dmitry Baryshkov)
This reverts commit 9b3febc3a3da ("iommu/arm-smmu: Convert to domain_alloc_paging()"). It breaks the Qualcomm MSM8996 platform: calling arm_smmu_write_context_bank() from the new codepath results in the platform being reset because of an unclocked hardware access. Fixes: 9b3febc3a3da ("iommu/arm-smmu: Convert to domain_alloc_paging()") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20240213-iommu-revert-domain-alloc-v1-1-325ff55dece4@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2024-02-09  iommu/amd: Introduce per-device domain ID to fix potential TLB aliasing issue  (Vasant Hegde)
With the v1 page table, the AMD IOMMU spec states that the hardware must use the domain ID to tag its internal translation caches. I/O devices with different v1 page tables must be given different domain IDs. I/O devices that share the same v1 page table __may__ be given the same domain ID. This domain ID management policy is currently implemented by the AMD IOMMU driver. In this case, only the domain ID is needed when issuing the INVALIDATE_IOMMU_PAGES command to invalidate the IOMMU translation cache (TLB). With the v2 page table, the hardware uses both the domain ID and the PASID as parameters to tag its caches and issue the INVALIDATE_IOMMU_PAGES command. Since the GCR3 table is set up per-device and there is no guarantee that a PASID is unique across multiple devices, the same PASID on different devices could point to different v2 page tables. In that case, if multiple devices share the same domain ID, the IOMMU translation cache for these devices would be polluted due to TLB aliasing. Hence, avoid the TLB aliasing issue with the v2 page table by allocating a unique domain ID for each device, even when multiple devices share the same v1 page table. Please note that this fix results in multiple INVALIDATE_IOMMU_PAGES commands (one per domain ID) when unmapping a translation. The domain ID can still be shared until a device starts using PASIDs. We will enhance this code later to allocate the per-device domain ID only when it is needed. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-18-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Remove unused GCR3 table parameters from struct protection_domain  (Suravee Suthikulpanit)
The GCR3 table parameters have been moved to struct iommu_dev_data, and the driver has been ported to use them there, so remove the now-unused copies from struct protection_domain. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-17-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Rearrange device flush code  (Vasant Hegde)
Consolidate all flush-related code in one place so that it is easy to maintain. No functional changes intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-16-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Remove unused flush pasid functions  (Vasant Hegde)
We have removed the iommu_v2 module and converted the v2 page table to use the common flush functions. We have also moved the GCR3 table to be per-device, so the PASID-related flush functions are no longer used. Remove these unused functions. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-15-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Refactor GCR3 table helper functions  (Suravee Suthikulpanit)
Refactor the GCR3 table helper functions to use the new per-device struct gcr3_tbl_info, and use the GFP_KERNEL flag instead of GFP_ATOMIC for GCR3 table allocation. Also modify set_dte_entry() to use the new per-device GCR3 table, and replace BUG_ON with WARN_ON_ONCE() in the free_gcr3_table() path. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Co-developed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-14-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Refactor protection_domain helper functions  (Suravee Suthikulpanit)
Remove the code that sets up the GCR3 table from the protection_domain helper functions, leaving them to handle only domain create / destroy, since GCR3 is no longer part of a domain. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-13-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Refactor attaching / detaching device functions  (Suravee Suthikulpanit)
If a domain is configured with a v2 page table, set up the default GCR3 entry with the domain GCR3 pointer so that all devices in the domain use the same page table for translation. Also return the page table setup status from the do_attach() function. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Co-developed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-12-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Refactor helper function for setting / clearing GCR3  (Suravee Suthikulpanit)
Refactor the GCR3 helper functions in preparation for using a per-device GCR3 table.
* Add a new function, update_gcr3, to update the per-device GCR3 table.
* Remove the per-domain default GCR3 setup during v2 page table allocation. A subsequent patch will add support to set up the default gcr3 while attaching a device to a domain.
* Remove amd_iommu_domain_update() from the v2 page table path, as the device detach path will take care of updating the domain.
* Consolidate the GCR3 table related code in one place so that it is easy to maintain.
* Rename functions to reflect their usage.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Co-developed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-11-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu: Introduce iommu_group_mutex_assert()  (Vasant Hegde)
Add a function to assert that the iommu group mutex is held, so that device drivers can rely on the group mutex instead of adding another driver-level lock before modifying driver-specific device data structures. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-10-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
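A plausible minimal implementation built on lockdep; the function name comes from the subject line, the body is a sketch:
```
/* Sketch: assert that the caller holds the device's IOMMU group mutex.
 * lockdep_assert_held() only checks on lockdep-enabled kernels and
 * compiles away otherwise. */
void iommu_group_mutex_assert(struct device *dev)
{
	struct iommu_group *group = dev->iommu_group;

	lockdep_assert_held(&group->mutex);
}
```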
2024-02-09  iommu/amd: Rearrange GCR3 table setup code  (Vasant Hegde)
Consolidate the GCR3 table related code in one place so that it is easy to maintain. Note that this patch doesn't move __set_gcr3/__clear_gcr3: we are moving the GCR3 table from per-domain to per-device, and the following series will rework these functions and move them as well. No functional changes intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-9-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Add support for device based TLB invalidation  (Vasant Hegde)
Add support to invalidate the TLB/IOTLB for a given device. These functions will be used in subsequent patches where we will introduce a per-device GCR3 table and SVA support. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-8-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Use protection_domain.flags to check page table mode  (Vasant Hegde)
Page table mode (v1, v2 or pt) is a per-domain property. Recently, protection_domain.pd_mode was introduced to track the per-domain page table mode. Use that variable to check the page table mode instead of the global 'amd_iommu_pgtable' in the {map,unmap}_pages path. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-7-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Introduce per-device GCR3 table  (Suravee Suthikulpanit)
The AMD IOMMU GCR3 table is indexed by PASID. Each entry stores a guest CR3 register value, which is the address of the root of the guest IO page table. The GCR3 table can be programmed per-device; however, the Linux AMD IOMMU driver currently manages the table on a per-domain basis. PASID is a device feature: when SVA is enabled, a PASID is bound to a device, not a domain. Hence it makes sense to have a per-device GCR3 table. Introduce struct iommu_dev_data.gcr3_tbl_info to keep track of the GCR3 table configuration. This will eventually replace the gcr3-related variables in the protection_domain structure. Suggested-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-6-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
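A sketch of what such a per-device tracking structure might carry; the field set is illustrative, inferred from this description and from the per-device domain ID patch above, not the exact definition:
```
/* Sketch: per-device GCR3 state, embedded in struct iommu_dev_data. */
struct gcr3_tbl_info {
	u64	*gcr3_tbl;	/* guest CR3 table, indexed by PASID */
	int	glx;		/* number of GCR3 table levels */
	u32	pasid_cnt;	/* PASIDs currently attached */
	u16	domid;		/* per-device domain ID used for TLB tagging */
};
```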
2024-02-09  iommu/amd: Introduce struct protection_domain.pd_mode  (Suravee Suthikulpanit)
This enum variable is used to track the type of page table used by the protection domain. It will replace protection_domain.flags in a subsequent series. Suggested-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-5-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
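A minimal sketch of such an enum; the exact enumerator names are assumptions derived from the pd_mode naming:
```
/* Sketch: explicit page table mode, replacing flag-bit checks. */
enum protection_domain_mode {
	PD_MODE_V1 = 1,		/* host (v1) page table */
	PD_MODE_V2,		/* guest (v2) page table, GCR3-based */
};
```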
2024-02-09  iommu/amd: Introduce get_amd_iommu_from_dev()  (Suravee Suthikulpanit)
Introduce get_amd_iommu_from_dev() and get_amd_iommu_from_dev_data(), and replace rlookup_amd_iommu() with the new helper functions where applicable, to avoid an unnecessary loop when looking up struct amd_iommu from struct device. Suggested-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-4-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
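A sketch of what such a helper can look like, assuming the IOMMU core already stores a struct iommu_device pointer in the per-device dev->iommu data; the container_of detour shown here is illustrative:
```
/* Sketch: derive the owning amd_iommu from the per-device core data,
 * instead of walking the rlookup table on every call. */
static inline struct amd_iommu *get_amd_iommu_from_dev(struct device *dev)
{
	return container_of(dev->iommu->iommu_dev, struct amd_iommu, iommu);
}
```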
2024-02-09  iommu/amd: Enable Guest Translation before registering devices  (Vasant Hegde)
The IOMMU Guest Translation (GT) feature needs to be enabled before invalidating guest translations (CMD_INV_IOMMU_PAGES with GN=1). Currently the GT feature is enabled after setting up the interrupt handler. So far this was fine, as we were not invalidating guest page tables before that point. An upcoming series will introduce a per-device GCR3 table and will invalidate guest pages after configuring it. Hence move GT feature enablement to early_enable_iommu(). Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-3-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Pass struct iommu_dev_data to set_dte_entry()  (Vasant Hegde)
Pass the iommu_dev_data structure instead of passing individual variables. No functional changes intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240205115615.6053-2-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Remove EXPORT_SYMBOL for perf counter related functions  (Vasant Hegde)
... as IOMMU perf counters are always built as part of the kernel. No functional change intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240118090105.5864-7-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Remove redundant error check in amd_iommu_probe_device()  (Vasant Hegde)
iommu_init_device() has not returned -ENOTSUPP since commit 61289cbaf6c8 ("iommu/amd: Remove old alias handling code"), so the check for it is redundant. No functional change intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240118090105.5864-6-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Remove duplicate function declarations from amd_iommu.h  (Vasant Hegde)
Perf counter related functions are defined in amd-iommu.h as well. Hence remove duplicate declarations. No functional change intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240118090105.5864-5-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/ipmmu-vmsa: Minor cleanups  (Robin Murphy)
Remove the of_match_ptr() which was supposed to have gone long ago, but managed to get lost in a fix-squashing mishap. On a similar theme, we may as well also modernise the PM ops to get rid of the clunky #ifdefs, and modernise the resource mapping to keep the checkpatch brigade happy. Link: https://lore.kernel.org/linux-iommu/Yxni3d6CdI3FZ5D+@8bytes.org/ Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Link: https://lore.kernel.org/r/791877b0d310dc2ab7dc616d2786ab24252b9b8e.1707151207.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/iova: use named kmem_cache for iova magazines  (Pasha Tatashin)
The magazine buffers can take gigabytes of kmem memory, dominating all other allocations. For observability purposes, create a named slab cache so the iova magazine memory overhead can be clearly observed. With this change:
```
> slabtop -o | head
 Active / Total Objects (% used)    : 869731 / 952904 (91.3%)
 Active / Total Slabs (% used)      : 103411 / 103974 (99.5%)
 Active / Total Caches (% used)     : 135 / 211 (64.0%)
 Active / Total Size (% used)       : 395389.68K / 411430.20K (96.1%)
 Minimum / Average / Maximum Object : 0.02K / 0.43K / 8.00K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
244412 244239  99%    1.00K  61103        4    244412K iommu_iova_magazine
 91636  88343  96%    0.03K    739      124      2956K kmalloc-32
 75744  74844  98%    0.12K   2367       32      9468K kernfs_node_cache
```
On this machine it is now clear that the magazines use 242M of kmem memory. Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> [ rm: adjust to rework of iova_cache_{get,put} ] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Link: https://lore.kernel.org/r/dc5c51aaba50906a92b9ba1a5137ed462484a7be.1707144953.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
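A sketch of the allocation-side change, assuming a dedicated kmem_cache simply replaces plain kmalloc for the magazines; the init helper is illustrative:
```
static struct kmem_cache *iova_magazine_cache;

/* Sketch: the cache name is what shows up in slabtop/slabinfo,
 * instead of the magazines hiding inside a generic kmalloc bucket. */
static int iova_magazine_cache_init(void)
{
	iova_magazine_cache = kmem_cache_create("iommu_iova_magazine",
						sizeof(struct iova_magazine),
						0, SLAB_HWCACHE_ALIGN, NULL);
	return iova_magazine_cache ? 0 : -ENOMEM;
}
```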
2024-02-09  iommu/iova: Reorganise some code  (Robin Murphy)
The iova_cache_{get,put}() calls really represent top-level lifecycle management for the whole IOVA library, so it's long been rather confusing to have them buried right in the middle of the allocator implementation details. Move them to a more expected position at the end of the file, where it will then also be easier to expand them. With this, we can also move the rcache hotplug handler (plus another stray function) into the rcache portion of the file. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: David Rientjes <rientjes@google.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Link: https://lore.kernel.org/r/d4753562f4faa0e6b3aeebcbf88fdb60cc22d715.1707144953.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/iova: Tidy up iova_cache_get() failure  (Robin Murphy)
Failure handling in iova_cache_get() is a little messy, and we'd like to add some more to it, so let's tidy up a bit first. By leaving the hotplug handler until last we can take advantage of kmem_cache_destroy() being NULL-safe to have a single cleanup label. We can also improve the error reporting, noting that kmem_cache_create() already screams if it fails, so that one is redundant. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: David Rientjes <rientjes@google.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Link: https://lore.kernel.org/r/ae4a3bda2d6a9b738221553c838d30473bd624e7.1707144953.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
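A sketch of the single-cleanup-label pattern described here, under illustrative names; the key property is that kmem_cache_destroy() is NULL-safe, so one error label serves every failure point once the hotplug handler is registered last:
```
static DEFINE_MUTEX(iova_cache_mutex);
static struct kmem_cache *iova_cache;
static unsigned int iova_cache_users;

int iova_cache_get(void)
{
	int err = -ENOMEM;

	mutex_lock(&iova_cache_mutex);
	if (!iova_cache_users) {
		iova_cache = kmem_cache_create("iommu_iova",
					       sizeof(struct iova), 0,
					       SLAB_HWCACHE_ALIGN, NULL);
		if (!iova_cache)	/* kmem_cache_create() already warns */
			goto out_err;

		err = iova_register_hotplug();	/* hypothetical, done last */
		if (err)
			goto out_err;
	}
	iova_cache_users++;
	mutex_unlock(&iova_cache_mutex);
	return 0;

out_err:
	kmem_cache_destroy(iova_cache);	/* NULL-safe */
	iova_cache = NULL;
	mutex_unlock(&iova_cache_mutex);
	return err;
}
```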
2024-02-09  iommu/amd: Remove unused APERTURE_* macros  (Vasant Hegde)
These macros are not used after commit 518d9b450387 ("iommu/amd: Remove special mapping code for dma_ops path"). No functional change intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240118090105.5864-4-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Remove unused IOVA_* macro  (Vasant Hegde)
These macros are not used after commit ac6d704679d343 ("iommu/dma: Pass address limit rather than size to iommu_setup_dma_ops()"). No functional change intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240118090105.5864-3-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09  iommu/amd: Remove unused PPR_* macros  (Vasant Hegde)
Commit 5a0b11a180a ("iommu/amd: Remove iommu_v2 module") missed removing the PPR_* macros. Remove these macros as they are not used anymore. No functional change intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240118090105.5864-2-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-07  iommu/amd: Fix failure return from snp_lookup_rmpentry()  (Ashish Kalra)
Commit f366a8dac1b8 ("iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown") leads to the following Smatch static checker warning: drivers/iommu/amd/init.c:3820 iommu_page_make_shared() error: uninitialized symbol 'assigned'. Fix it. [ bp: Address the other error cases too. ] Fixes: f366a8dac1b8 ("iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown") Closes: https://lore.kernel.org/linux-iommu/1be69f6a-e7e1-45f9-9a74-b2550344f3fd@moroto.mountain Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: Joerg Roedel <jroedel@suse.com> Link: https://lore.kernel.org/lkml/20240126041126.1927228-20-michael.roth@amd.com
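A sketch of the failure-handling shape the warning points at, assuming snp_lookup_rmpentry() reports assignment through a bool out-parameter that is only valid on success; the helper names are illustrative:
```
static int make_page_shared_checked(u64 pfn)
{
	bool assigned;
	int level, ret;

	/* Only consult 'assigned' if the lookup succeeded; on failure it
	 * is uninitialized, which is exactly what Smatch flagged. */
	ret = snp_lookup_rmpentry(pfn, &assigned, &level);
	if (ret)
		return ret;

	if (!assigned)
		return 0;	/* already shared, nothing to do */

	return rmp_make_shared(pfn, level);	/* assumed helper */
}
```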
2024-02-06  iommu/msm-iommu: don't limit the driver too much  (Dmitry Baryshkov)
In preparation for dropping most of the ARCH_QCOM subtypes, stop limiting the driver to just those machines. Allow it to be built for any 32-bit Qualcomm platform (ARCH_QCOM). Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Link: https://lore.kernel.org/r/20231216162700.863456-2-dmitry.baryshkov@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-02-06  iommufd/iova_bitmap: Consider page offset for the pages to be pinned  (Joao Martins)
For small bitmaps that aren't PAGE_SIZE aligned *and* that are less than 512 pages in bitmap length, use an extra page to be able to cover the entire range, e.g. [1M..3G], which can then be iterated more efficiently in a single iteration rather than two. Fixes: b058ea3ab5af ("vfio/iova_bitmap: refactor iova_bitmap_set() to better handle page boundaries") Link: https://lore.kernel.org/r/20240202133415.23819-10-joao.m.martins@oracle.com Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Tested-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-02-06  iommufd/selftest: Hugepage mock domain support  (Joao Martins)
Add support to mock iommu hugepages of 1M (for a 2K mock io page size). To avoid breaking the test suite defaults, this is done by explicitly creating an iommu mock device which has hugepage support (i.e. through MOCK_FLAGS_DEVICE_HUGE_IOVA). The same scheme of mock base page index tracking in the XArray is maintained, except that an extra bit is added to mark an entry as a hugepage; a sketch of the scheme follows below. One subpage containing the dirty bit means that the whole hugepage is dirty (similar to AMD IOMMU non-standard page sizes). The same applies to clearing: all dirty subpages must be cleared. This is in preparation for dirty tracking to mark mock hugepages as dirty, to exercise all the iova-bitmap fixes. Link: https://lore.kernel.org/r/20240202133415.23819-8-joao.m.martins@oracle.com Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
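A sketch of the tagging scheme, with the bit positions and names as assumptions; XArray value entries can carry a few spare bits alongside the stored pfn:
```
/* Illustrative spare bits ORed into the stored pfn value. */
#define MOCK_PFN_DIRTY_IOVA	BIT(30)
#define MOCK_PFN_HUGE_IOVA	BIT(31)

/* Sketch: record a mock mapping, tagging it as a hugepage subpage. */
static int mock_store_pfn(struct xarray *pfns, unsigned long index,
			  unsigned long pfn, bool huge)
{
	unsigned long val = pfn | (huge ? MOCK_PFN_HUGE_IOVA : 0);

	return xa_err(xa_store(pfns, index, xa_mk_value(val), GFP_KERNEL));
}
```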
2024-02-06  iommufd/selftest: Refactor mock_domain_read_and_clear_dirty()  (Joao Martins)
Move the clearing of the dirty bit of the mock domain into the mock_domain_test_and_clear_dirty() helper, simplifying the caller function. Additionally, rework the mock_domain_read_and_clear_dirty() loop to iterate over a potentially variable IO page size. No functional change intended with the loop refactor. This is in preparation for dirty tracking support for IOMMU hugepage mock domains. Link: https://lore.kernel.org/r/20240202133415.23819-7-joao.m.martins@oracle.com Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-02-06  iommufd/iova_bitmap: Handle recording beyond the mapped pages  (Joao Martins)
IOVA bitmap is a zero-copy scheme for recording dirty bits that iterates over the different bitmap user pages at chunks of a maximum of PAGE_SIZE/sizeof(struct page*) pages. When the iterations are split up into 64G chunks, the end of the range may be broken up in a way that is aligned with a non-base-page PTE size. This leads to only part of the huge page being recorded in the bitmap. Note that in practice this is only a problem for IOMMU dirty tracking, i.e. when the backing PTEs are in IOMMU hugepages and the bitmap is in base page granularity. So far this is not something that affects VF dirty trackers (which report and record at the same granularity). To fix that, if there is a remainder of bits left to set which the current IOVA bitmap doesn't cover, make a copy of the bitmap structure and iterate-and-set the rest of the bits remaining. Finally, when advancing the iterator, skip all the bits that were set ahead. Link: https://lore.kernel.org/r/20240202133415.23819-5-joao.m.martins@oracle.com Reported-by: Avihai Horon <avihaih@nvidia.com> Fixes: f35f22cc760e ("iommu/vt-d: Access/Dirty bit support for SS domains") Fixes: 421a511a293f ("iommu/amd: Access/Dirty bit support in IOPTEs") Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Tested-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-02-06  iommufd/iova_bitmap: Switch iova_bitmap::bitmap to an u8 array  (Joao Martins)
iova_bitmap_mapped_length() doesn't deal correctly with small bitmaps (< 2M bitmaps) when the starting address isn't u64 aligned, leading to skipping a tiny part of the IOVA range. This is materialized as not marking data dirty that should otherwise have been. Fix that by using a u8 * in the internal state of the IOVA bitmap. Most of the data structures use the type of the bitmap to compute its indexes; changing the type of the bitmap thus decreases the granularity of the bitmap indexes. Fixes: b058ea3ab5af ("vfio/iova_bitmap: refactor iova_bitmap_set() to better handle page boundaries") Link: https://lore.kernel.org/r/20240202133415.23819-3-joao.m.martins@oracle.com Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Tested-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
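A sketch of why the element type matters. With u64 elements, index arithmetic can only address the bitmap in 64-bit words, so an unaligned start loses up to 63 bits of range; with u8 elements the granularity drops to 8 bits. Names here are illustrative:
```
/* Sketch: set the dirty bit for one IO page in a byte-granular bitmap.
 * Only byte (8-page) alignment is assumed, rather than the 64-page
 * alignment that u64-based index arithmetic would impose. */
static void iova_bitmap_set_bit(u8 *bitmap, unsigned long iova,
				unsigned long mapped_base,
				unsigned long pgshift)
{
	unsigned long nbit = (iova - mapped_base) >> pgshift;

	bitmap[nbit / BITS_PER_BYTE] |= 1U << (nbit % BITS_PER_BYTE);
}
```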
2024-02-06  iommufd/iova_bitmap: Bounds check mapped::pages access  (Joao Martins)
Dirty IOMMU hugepages reported at base page-size granularity can lead to an attempt to set dirty pages in the bitmap beyond the limits that are pinned. Bounds-check that the page index of the array we are trying to access is within the limits before we kmap(), and return otherwise. While it is also a defensive check, this is also in preparation for deferring the setting of bits (outside the mapped range) to the next iteration(s) when the pages become available. Fixes: b058ea3ab5af ("vfio/iova_bitmap: refactor iova_bitmap_set() to better handle page boundaries") Link: https://lore.kernel.org/r/20240202133415.23819-2-joao.m.martins@oracle.com Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Tested-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
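A sketch of such a bounds check; the mapped-pages state and field names are assumptions for illustration:
```
/* Sketch: refuse to touch pinned-page slots beyond what was mapped;
 * a later iteration covers the remaining bits once those user pages
 * are pinned. */
static void set_dirty_bounded(struct page **pages, unsigned long npages,
			      unsigned long offset, unsigned int nbits)
{
	unsigned long page_idx = offset / PAGE_SIZE;
	void *kaddr;

	if (page_idx >= npages)
		return;

	kaddr = kmap_local_page(pages[page_idx]);
	bitmap_set(kaddr, (offset % PAGE_SIZE) * BITS_PER_BYTE, nbits);
	kunmap_local(kaddr);
}
```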
2024-02-01  iommu: Allow ops->default_domain to work when !CONFIG_IOMMU_DMA  (Jason Gunthorpe)
The ops->default_domain flow used a 0 req_type to select the default domain and this was enforced by iommu_group_alloc_default_domain(). When !CONFIG_IOMMU_DMA started forcing the old ARM32 drivers into IDENTITY, it also overrode the 0 req_type of the ops->default_domain drivers to IDENTITY, which ends up causing failures during device probe. Make iommu_group_alloc_default_domain() accept a req_type that matches the ops->default_domain and have iommu_group_alloc_default_domain() generate a req_type that matches the default_domain. This way the req_type always describes what kind of domain should be attached, and ops->default_domain overrides all other mechanisms to choose the default domain. Fixes: 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove set_platform_dma_ops") Fixes: 0f6a90436a57 ("iommu: Do not use IOMMU_DOMAIN_DMA if CONFIG_IOMMU_DMA is not enabled") Reported-by: Ovidiu Panait <ovidiu.panait@windriver.com> Closes: https://lore.kernel.org/linux-iommu/20240123165829.630276-1-ovidiu.panait@windriver.com/ Reported-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Closes: https://lore.kernel.org/linux-iommu/170618452753.3805.4425669653666211728.stgit@ltcd48-lp2.aus.stglab.ibm.com/ Tested-by: Ovidiu Panait <ovidiu.panait@windriver.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/0-v1-755bd21c4a64+525b8-iommu_def_dom_fix_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-01-29  iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown  (Ashish Kalra)
Add a new IOMMU API interface amd_iommu_snp_disable() to transition IOMMU pages to Hypervisor state from Reclaim state after SNP_SHUTDOWN_EX command. Invoke this API from the CCP driver after SNP_SHUTDOWN_EX command. Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240126041126.1927228-20-michael.roth@amd.com
2024-01-29  iommu/amd: Don't rely on external callers to enable IOMMU SNP support  (Ashish Kalra)
Currently, the expectation is that the kernel will call amd_iommu_snp_enable() to perform various checks and set the amd_iommu_snp_en flag that the IOMMU uses to adjust its setup routines to account for additional requirements on hosts where SNP is enabled. This is somewhat fragile as it relies on this call being done prior to IOMMU setup. It is more robust to just do this automatically as part of IOMMU initialization, so rework the code accordingly. There is still a need to export information about whether or not the IOMMU is configured in a manner compatible with SNP, so relocate the existing amd_iommu_snp_en flag so it can be used to convey that information in place of the return code that was previously provided by calls to amd_iommu_snp_enable(). While here, also adjust the kernel messages related to IOMMU SNP enablement for consistency/grammar/clarity. Suggested-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Co-developed-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20240126041126.1927228-4-michael.roth@amd.com
2024-01-18  Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd  (Linus Torvalds)
Pull iommufd updates from Jason Gunthorpe:
 "This brings the first of three planned user IO page table invalidation operations:
  - IOMMU_HWPT_INVALIDATE allows invalidating the IOTLB integrated into the iommu itself. The Intel implementation will also generate an ATC invalidation to flush the device IOTLB as it unambiguously knows the device, but other HW will not.
  It goes along with the prior PR to implement userspace IO page tables (aka nested translation for VMs) to allow Intel to have full functionality for simple cases. An Intel implementation of the operation is provided.
  Also fix a small bug in the selftest mock iommu driver probe"
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
  iommufd/selftest: Check the bus type during probe
  iommu/vt-d: Add iotlb flush for nested domain
  iommufd: Add data structure for Intel VT-d stage-1 cache invalidation
  iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
  iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
  iommufd/selftest: Add mock_domain_cache_invalidate_user support
  iommu: Add iommu_copy_struct_from_user_array helper
  iommufd: Add IOMMU_HWPT_INVALIDATE
  iommu: Add cache_invalidate_user op
2024-01-18  Merge tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu  (Linus Torvalds)
Pull iommu updates from Joerg Roedel:
 "Core changes:
  - Fix race conditions in device probe path
  - Retire IOMMU bus_ops
  - Support for passing custom allocators to page table drivers
  - Clean up Kconfig around IOMMU_SVA
  - Support for sharing SVA domains with all devices bound to a mm
  - Firmware data parsing cleanup
  - Tracing improvements for iommu-dma code
  - Some smaller fixes and cleanups

  ARM-SMMU drivers:
  - Device-tree binding updates:
    - Add additional compatible strings for Qualcomm SoCs
    - Document Adreno clocks for Qualcomm's SM8350 SoC
  - SMMUv2:
    - Implement support for the ->domain_alloc_paging() callback
    - Ensure Secure context is restored following suspend of Qualcomm SMMU implementation
  - SMMUv3:
    - Disable stalling mode for the "quiet" context descriptor
    - Minor refactoring and driver cleanups

  Intel VT-d driver:
  - Cleanup and refactoring

  AMD IOMMU driver:
  - Improve IO TLB invalidation logic
  - Small cleanups and improvements

  Rockchip IOMMU driver:
  - DT binding update to add Rockchip RK3588

  Apple DART driver:
  - Apple M1 USB4/Thunderbolt DART support
  - Cleanups

  Virtio IOMMU driver:
  - Add support for iotlb_sync_map
  - Enable deferred IO TLB flushes"
* tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (66 commits)
  iommu: Don't reserve 0-length IOVA region
  iommu/vt-d: Move inline helpers to header files
  iommu/vt-d: Remove unused vcmd interfaces
  iommu/vt-d: Remove unused parameter of intel_pasid_setup_pass_through()
  iommu/vt-d: Refactor device_to_iommu() to retrieve iommu directly
  iommu/sva: Fix memory leak in iommu_sva_bind_device()
  dt-bindings: iommu: rockchip: Add Rockchip RK3588
  iommu/dma: Trace bounce buffer usage when mapping buffers
  iommu/arm-smmu: Convert to domain_alloc_paging()
  iommu/arm-smmu: Pass arm_smmu_domain to internal functions
  iommu/arm-smmu: Implement IOMMU_DOMAIN_BLOCKED
  iommu/arm-smmu: Convert to a global static identity domain
  iommu/arm-smmu: Reorganize arm_smmu_domain_add_master()
  iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
  iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()
  iommu/arm-smmu-v3: Add a type for the STE
  iommu/arm-smmu-v3: disable stall for quiet_cd
  iommu/qcom: restore IOMMU state if needed
  iommu/arm-smmu-qcom: Add QCM2290 MDSS compatible
  iommu/arm-smmu-qcom: Add missing GMU entry to match table
  ...
2024-01-11  iommufd/selftest: Check the bus type during probe  (Jason Gunthorpe)
This relied on the probe function only being invoked by the bus type the mock was registered on. The removal of the bus ops broke this assumption, and the probe could be called on non-mock bus types like PCI. Check the bus type directly in probe. Fixes: 17de3f5fdd35 ("iommu: Retire bus ops") Link: https://lore.kernel.org/r/0-v1-82d59f7eab8c+40c-iommufd_mock_bus_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
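A sketch of the shape of the check; the mock bus symbol and the returned singleton are assumed names:
```
/* Sketch: with bus ops retired, probe may be invoked for devices on
 * any bus, so reject anything not on the mock bus up front. */
static struct iommu_device *mock_probe_device(struct device *dev)
{
	if (dev->bus != &iommufd_mock_bus_type.bus)	/* assumed symbol */
		return ERR_PTR(-ENODEV);

	return &mock_iommu_device;	/* assumed singleton */
}
```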