summaryrefslogtreecommitdiff
path: root/include/linux/iommu.h
AgeCommit message (Collapse)Author
2022-12-14Merge tag 'for-linus-iommufd' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd implementation from Jason Gunthorpe: "iommufd is the user API to control the IOMMU subsystem as it relates to managing IO page tables that point at user space memory. It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO container) which is the VFIO specific interface for a similar idea. We see a broad need for extended features, some being highly IOMMU device specific: - Binding iommu_domain's to PASID/SSID - Userspace IO page tables, for ARM, x86 and S390 - Kernel bypassed invalidation of user page tables - Re-use of the KVM page table in the IOMMU - Dirty page tracking in the IOMMU - Runtime Increase/Decrease of IOPTE size - PRI support with faults resolved in userspace Many of these HW features exist to support VM use cases - for instance the combination of PASID, PRI and Userspace IO Page Tables allows an implementation of DMA Shared Virtual Addressing (vSVA) within a guest. Dirty tracking enables VM live migration with SRIOV devices and PASID support allow creating "scalable IOV" devices, among other things. As these features are fundamental to a VM platform they need to be uniformly exposed to all the driver families that do DMA into VMs, which is currently VFIO and VDPA" For more background, see the extended explanations in Jason's pull request: https://lore.kernel.org/lkml/Y5dzTU8dlmXTbzoJ@nvidia.com/ * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (62 commits) iommufd: Change the order of MSI setup iommufd: Improve a few unclear bits of code iommufd: Fix comment typos vfio: Move vfio group specific code into group.c vfio: Refactor dma APIs for emulated devices vfio: Wrap vfio group module init/clean code into helpers vfio: Refactor vfio_device open and close vfio: Make vfio_device_open() truly device specific vfio: Swap order of vfio_device_container_register() and open_device() vfio: Set device->group in helper function vfio: Create wrappers for group register/unregister vfio: Move the sanity check of the group to vfio_create_group() vfio: Simplify vfio_create_group() iommufd: Allow iommufd to supply /dev/vfio/vfio vfio: Make vfio_container optionally compiled vfio: Move container related MODULE_ALIAS statements into container.c vfio-iommufd: Support iommufd for emulated VFIO devices vfio-iommufd: Support iommufd for physical VFIO devices vfio-iommufd: Allow iommufd to be used in place of a container fd vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent() ...
2022-12-07iommu/tegra: Add tegra_dev_iommu_get_stream_id() helperThierry Reding
Access to the internals of struct iommu_fwspec by non-IOMMU drivers is discouraged. Many drivers for Tegra SoCs, however, need access to their IOMMU stream IDs so that they can be programmed into various hardware registers. Formalize this access into a common helper to make it easier to audit and maintain. Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20221206165945.3551774-3-thierry.reding@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-12-07iommu: Add note about struct iommu_fwspec usageThierry Reding
This structure is to be considered private to the IOMMU API. Except for very few exceptions, IOMMU consumer drivers should treat this as opaque data. Acked-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20221206165945.3551774-2-thierry.reding@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-11-29iommu: Add device-centric DMA ownership interfacesLu Baolu
These complement the group interfaces used by VFIO and are for use by iommufd. The main difference is that multiple devices in the same group can all share the ownership by passing the same ownership pointer. Move the common code into shared functions. Link: https://lore.kernel.org/r/2-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Lixiao Yang <lixiao.yang@intel.com> Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-29iommu: Add IOMMU_CAP_ENFORCE_CACHE_COHERENCYJason Gunthorpe
This queries if a domain linked to a device should expect to support enforce_cache_coherency() so iommufd can negotiate the rules for when a domain should be shared or not. For iommufd a device that declares IOMMU_CAP_ENFORCE_CACHE_COHERENCY will not be attached to a domain that does not support it. Link: https://lore.kernel.org/r/1-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Lixiao Yang <lixiao.yang@intel.com> Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> Tested-by: Yu He <yu.he@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-03Merge tag 'for-joerg' of ↵Joerg Roedel
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd into core iommu: Define EINVAL as device/domain incompatibility This series is to replace the previous EMEDIUMTYPE patch in a VFIO series: https://lore.kernel.org/kvm/Yxnt9uQTmbqul5lf@8bytes.org/ The purpose is to regulate all existing ->attach_dev callback functions to use EINVAL exclusively for an incompatibility error between a device and a domain. This allows VFIO and IOMMUFD to detect such a soft error, and then try a different domain with the same device. Among all the patches, the first two are preparatory changes. And then one patch to update kdocs and another three patches for the enforcement effort. Link: https://lore.kernel.org/r/cover.1666042872.git.nicolinc@nvidia.com
2022-11-03iommu: Prepare IOMMU domain for IOPFLu Baolu
This adds some mechanisms around the iommu_domain so that the I/O page fault handling framework could route a page fault to the domain and call the fault handler from it. Add pointers to the page fault handler and its private data in struct iommu_domain. The fault handler will be called with the private data as a parameter once a page fault is routed to the domain. Any kernel component which owns an iommu domain could install handler and its private parameter so that the page fault could be further routed and handled. This also prepares the SVA implementation to be the first consumer of the per-domain page fault handling model. The I/O page fault handler for SVA is copied to the SVA file with mmget_not_zero() added before mmap_read_lock(). Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-12-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu: Remove SVA related callbacks from iommu opsLu Baolu
These ops'es have been deprecated. There's no need for them anymore. Remove them to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-11-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu/sva: Refactoring iommu_sva_bind/unbind_device()Lu Baolu
The existing iommu SVA interfaces are implemented by calling the SVA specific iommu ops provided by the IOMMU drivers. There's no need for any SVA specific ops in iommu_ops vector anymore as we can achieve this through the generic attach/detach_dev_pasid domain ops. This refactors the IOMMU SVA interfaces implementation by using the iommu_attach/detach_device_pasid interfaces and align them with the concept of the SVA iommu domain. Put the new SVA code in the SVA related file in order to make it self-contained. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu: Add IOMMU SVA domain supportLu Baolu
The SVA iommu_domain represents a hardware pagetable that the IOMMU hardware could use for SVA translation. This adds some infrastructures to support SVA domain in the iommu core. It includes: - Extend the iommu_domain to support a new IOMMU_DOMAIN_SVA domain type. The IOMMU drivers that support allocation of the SVA domain should provide its own SVA domain specific iommu_domain_ops. - Add a helper to allocate an SVA domain. The iommu_domain_free() is still used to free an SVA domain. The report_iommu_fault() should be replaced by the new iommu_report_device_fault(). Leave the existing fault handler with the existing users and the newly added SVA members excludes it. Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu: Add attach/detach_dev_pasid iommu interfacesLu Baolu
Attaching an IOMMU domain to a PASID of a device is a generic operation for modern IOMMU drivers which support PASID-granular DMA address translation. Currently visible usage scenarios include (but not limited): - SVA (Shared Virtual Address) - kernel DMA with PASID - hardware-assist mediated device This adds the set_dev_pasid domain ops for setting the domain onto a PASID of a device and remove_dev_pasid iommu ops for removing any setup on a PASID of device. This also adds interfaces for device drivers to attach/detach/retrieve a domain for a PASID of a device. If multiple devices share a single group, it's fine as long the fabric always routes every TLP marked with a PASID to the host bridge and only the host bridge. For example, ACS achieves this universally and has been checked when pci_enable_pasid() is called. As we can't reliably tell the source apart in a group, all the devices in a group have to be considered as the same source, and mapped to the same PASID table. The DMA ownership is about the whole device (more precisely, iommu group), including the RID and PASIDs. When the ownership is converted, the pasid array must be empty. This also adds necessary checks in the DMA ownership interfaces. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu: Remove SVM_FLAG_SUPERVISOR_MODE supportLu Baolu
The current kernel DMA with PASID support is based on the SVA with a flag SVM_FLAG_SUPERVISOR_MODE. The IOMMU driver binds the kernel memory address space to a PASID of the device. The device driver programs the device with kernel virtual address (KVA) for DMA access. There have been security and functional issues with this approach: - The lack of IOTLB synchronization upon kernel page table updates. (vmalloc, module/BPF loading, CONFIG_DEBUG_PAGEALLOC etc.) - Other than slight more protection, using kernel virtual address (KVA) has little advantage over physical address. There are also no use cases yet where DMA engines need kernel virtual addresses for in-kernel DMA. This removes SVM_FLAG_SUPERVISOR_MODE support from the IOMMU interface. The device drivers are suggested to handle kernel DMA with PASID through the kernel DMA APIs. The drvdata parameter in iommu_sva_bind_device() and all callbacks is not needed anymore. Cleanup them as well. Link: https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/ Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu: Add max_pasids field in struct dev_iommuLu Baolu
Use this field to save the number of PASIDs that a device is able to consume. It is a generic attribute of a device and lifting it into the per-device dev_iommu struct could help to avoid the boilerplate code in various IOMMU drivers. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu: Add max_pasids field in struct iommu_deviceLu Baolu
Use this field to keep the number of supported PASIDs that an IOMMU hardware is able to support. This is a generic attribute of an IOMMU and lifting it into the per-IOMMU device structure makes it possible to allocate a PASID for device without calls into the IOMMU drivers. Any iommu driver that supports PASID related features should set this field before enabling them on the devices. In the Intel IOMMU driver, intel_iommu_sm is moved to CONFIG_INTEL_IOMMU enclave so that the pasid_supported() helper could be used in dmar.c without compilation errors. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-01iommu: Add return value rules to attach_dev op and APIsNicolin Chen
Cases like VFIO wish to attach a device to an existing domain that was not allocated specifically from the device. This raises a condition where the IOMMU driver can fail the domain attach because the domain and device are incompatible with each other. This is a soft failure that can be resolved by using a different domain. Provide a dedicated errno EINVAL from the IOMMU driver during attach that the reason why the attach failed is because of domain incompatibility. VFIO can use this to know that the attach is a soft failure and it should continue searching. Otherwise, the attach will be a hard failure and VFIO will return the code to userspace. Update kdocs to add rules of return value to the attach_dev op and APIs. Link: https://lore.kernel.org/r/bd56d93c18621104a0fa1b0de31e9b760b81b769.1666042872.git.nicolinc@nvidia.com Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-21iommu: Add gfp parameter to iommu_alloc_resv_regionLu Baolu
Add gfp parameter to iommu_alloc_resv_region() for the callers to specify the memory allocation behavior. Thus iommu_alloc_resv_region() could also be available in critical contexts. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20220927053109.4053662-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07iommu/dma: Move public interfaces to linux/iommu.hRobin Murphy
The iommu-dma layer is now mostly encapsulated by iommu_dma_ops, with only a couple more public interfaces left pertaining to MSI integration. Since these depend on the main IOMMU API header anyway, move their declarations there, taking the opportunity to update the half-baked comments to proper kerneldoc along the way. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/9cd99738f52094e6bed44bfee03fa4f288d20695.1660668998.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07iommu: Clean up bus_set_iommu()Robin Murphy
Clean up the remaining trivial bus_set_iommu() callsites along with the implementation. Now drivers only have to know and care about iommu_device instances, phew! Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> # s390 Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> # s390 Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/ea383d5f4d74ffe200ab61248e5de6e95846180a.1660572783.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07iommu: Retire iommu_capable()Robin Murphy
With all callers now converted to the device-specific version, retire the old bus-based interface, and give drivers the chance to indicate accurate per-instance capabilities. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/d8bd8777d06929ad8f49df7fc80e1b9af32a41b5.1660574547.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07iommu: Remove comment of dev_has_feat in struct docYuan Can
dev_has_feat has been removed from iommu_ops in commit 309c56e84602 ("iommu: remove the unused dev_has_feat method"), remove its description in the struct doc. Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20220815013339.2552-1-yuancan@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-07-15iommu: remove the put_resv_regions methodChristoph Hellwig
All drivers that implement get_resv_regions just use generic_put_resv_regions to implement the put side. Remove the indirections and document the allocations constraints. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220708080616.238833-4-hch@lst.de Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-07-15iommu: remove iommu_dev_feature_enabledChristoph Hellwig
Remove the unused iommu_dev_feature_enabled function. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220708080616.238833-3-hch@lst.de Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-07-15iommu: remove the unused dev_has_feat methodChristoph Hellwig
This method is never actually called. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220708080616.238833-2-hch@lst.de Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-07-06ACPI/IORT: Add support to retrieve IORT RMR reserved regionsShameer Kolothum
Parse through the IORT RMR nodes and populate the reserve region list corresponding to a given IOMMU and device(optional). Also, go through the ID mappings of the RMR node and retrieve all the SIDs associated with it. Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Tested-by: Steven Price <steven.price@arm.com> Tested-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> Tested-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20220615101044.1972-5-shameerali.kolothum.thodi@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-07-06iommu: Introduce a callback to struct iommu_resv_regionShameer Kolothum
A callback is introduced to struct iommu_resv_region to free memory allocations associated with the reserved region. This will be useful when we introduce support for IORT RMR based reserved regions. Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Steven Price <steven.price@arm.com> Tested-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> Tested-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20220615101044.1972-2-shameerali.kolothum.thodi@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-20Merge branches 'apple/dart', 'arm/mediatek', 'arm/msm', 'arm/smmu', ↵Joerg Roedel
'ppc/pamu', 'x86/vt-d', 'x86/amd' and 'vfio-notifier-fix' into next
2022-04-28iommu: Redefine IOMMU_CAP_CACHE_COHERENCY as the cap flag for IOMMU_CACHEJason Gunthorpe
While the comment was correct that this flag was intended to convey the block no-snoop support in the IOMMU, it has become widely implemented and used to mean the IOMMU supports IOMMU_CACHE as a map flag. Only the Intel driver was different. Now that the Intel driver is using enforce_cache_coherency() update the comment to make it clear that IOMMU_CAP_CACHE_COHERENCY is only about IOMMU_CACHE. Fix the Intel driver to return true since IOMMU_CACHE always works. The two places that test this flag, usnic and vdpa, are both assigning userspace pages to a driver controlled iommu_domain and require IOMMU_CACHE behavior as they offer no way for userspace to synchronize caches. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v3-2cf356649677+a32-intel_no_snoop_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28iommu: Introduce the domain op enforce_cache_coherency()Jason Gunthorpe
This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU. Currently only Intel and AMD IOMMUs are known to support this feature. They both implement it as an IOPTE bit, that when set, will cause PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though the no-snoop bit was clear. The new API is triggered by calling enforce_cache_coherency() before mapping any IOVA to the domain which globally switches on no-snoop blocking. This allows other implementations that might block no-snoop globally and outside the IOPTE - AMD also documents such a HW capability. Leave AMD out of sync with Intel and have it block no-snoop even for in-kernel users. This can be trivially resolved in a follow up patch. Only VFIO needs to call this API because it does not have detailed control over the device to avoid requesting no-snoop behavior at the device level. Other places using domains with real kernel drivers should simply avoid asking their devices to set the no-snoop bit. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v3-2cf356649677+a32-intel_no_snoop_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28iommu: Remove iommu group changes notifierLu Baolu
The iommu group changes notifer is not referenced in the tree. Remove it to avoid dead code. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220418005000.897664-12-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28iommu: Add DMA ownership management interfacesLu Baolu
Multiple devices may be placed in the same IOMMU group because they cannot be isolated from each other. These devices must either be entirely under kernel control or userspace control, never a mixture. This adds dma ownership management in iommu core and exposes several interfaces for the device drivers and the device userspace assignment framework (i.e. VFIO), so that any conflict between user and kernel controlled dma could be detected at the beginning. The device driver oriented interfaces are, int iommu_device_use_default_domain(struct device *dev); void iommu_device_unuse_default_domain(struct device *dev); By calling iommu_device_use_default_domain(), the device driver tells the iommu layer that the device dma is handled through the kernel DMA APIs. The iommu layer will manage the IOVA and use the default domain for DMA address translation. The device user-space assignment framework oriented interfaces are, int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner); void iommu_group_release_dma_owner(struct iommu_group *group); bool iommu_group_dma_owner_claimed(struct iommu_group *group); The device userspace assignment must be disallowed if the DMA owner claiming interface returns failure. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20220418005000.897664-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28iommu: Add capability for pre-boot DMA protectionRobin Murphy
VT-d's dmar_platform_optin() actually represents a combination of properties fairly well standardised by Microsoft as "Pre-boot DMA Protection" and "Kernel DMA Protection"[1]. As such, we can provide interested consumers with an abstracted capability rather than driver-specific interfaces that won't scale. We name it for the former aspect since that's what external callers are most likely to be interested in; the latter is for the IOMMU layer to handle itself. [1] https://docs.microsoft.com/en-us/windows-hardware/design/device-experiences/oem-kernel-dma-protection Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/d6218dff2702472da80db6aec2c9589010684551.1650878781.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28iommu: Introduce device_iommu_capable()Robin Murphy
iommu_capable() only really works for systems where all IOMMU instances are completely homogeneous, and all devices are IOMMU-mapped. Implement the new variant which will be able to give a more accurate answer for whichever device the caller is actually interested in, and even more so once all the external users have been converted and we can reliably pass the device pointer through the internal driver interface too. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/8407eb9586677995b7a9fd70d0fd82d85929a9bb.1650878781.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28iommu: Split struct iommu_opsLu Baolu
Move the domain specific operations out of struct iommu_ops into a new structure that only has domain specific operations. This solves the problem of needing to know if the method vector for a given operation needs to be retrieved from the device or the domain. Logically the domain ops are the ones that make sense for external subsystems and endpoint drivers to use, while device ops, with the sole exception of domain_alloc, are IOMMU API internals. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220216025249.3459465-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28iommu: Remove unused argument in is_attach_deferredLu Baolu
The is_attach_deferred iommu_ops callback is a device op. The domain argument is unnecessary and never used. Remove it to make code clean. Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220216025249.3459465-9-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28iommu: Use right way to retrieve iommu_opsLu Baolu
The common iommu_ops is hooked to both device and domain. When a helper has both device and domain pointer, the way to get the iommu_ops looks messy in iommu core. This sorts out the way to get iommu_ops. The device related helpers go through device pointer, while the domain related ones go through domain pointer. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220216025249.3459465-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28iommu: Remove apply_resv_regionLu Baolu
The apply_resv_region callback in iommu_ops was introduced to reserve an IOVA range in the given DMA domain when the IOMMU driver manages the IOVA by itself. As all drivers converted to use dma-iommu in the core, there's no driver using this anymore. Remove it to avoid dead code. Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220216025249.3459465-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28iommu: Remove aux-domain related interfaces and iommu_opsLu Baolu
The aux-domain related interfaces and iommu_ops are not referenced anywhere in the tree. We've also reached a consensus to redesign it based the new iommufd framework. Remove them to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220216025249.3459465-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28iommu: Remove guest pasid related interfaces and definitionsLu Baolu
The guest pasid related uapi interfaces and definitions are not referenced anywhere in the tree. We've also reached a consensus to replace them with a new iommufd design. Remove them to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220216025249.3459465-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-20iommu/vt-d: Use put_pages_listMatthew Wilcox (Oracle)
page->freelist is for the use of slab. We already have the ability to free a list of pages in the core mm, but it requires the use of a list_head and for the pages to be chained together through page->lru. Switch the Intel IOMMU and IOVA code over to using free_pages_list(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> [rm: split from original patch, cosmetic tweaks, fix fq entries] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/2115b560d9a0ce7cd4b948bd51a2b7bde8fdfd59.1639753638.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-03Merge tag 'iommu-updates-v5.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - New DART IOMMU driver for Apple Silicon M1 chips - Optimizations for iommu_[map/unmap] performance - Selective TLB flush support for the AMD IOMMU driver to make it more efficient on emulated IOMMUs - Rework IOVA setup and default domain type setting to move more code out of IOMMU drivers and to support runtime switching between certain types of default domains - VT-d Updates from Lu Baolu: - Update the virtual command related registers - Enable Intel IOMMU scalable mode by default - Preset A/D bits for user space DMA usage - Allow devices to have more than 32 outstanding PRs - Various cleanups - ARM SMMU Updates from Will Deacon: SMMUv3: - Minor optimisation to avoid zeroing struct members on CMD submission - Increased use of batched commands to reduce submission latency - Refactoring in preparation for ECMDQ support SMMUv2: - Fix races when probing devices with identical StreamIDs - Optimise walk cache flushing for Qualcomm implementations - Allow deep sleep states for some Qualcomm SoCs with shared clocks - Various smaller optimizations, cleanups, and fixes * tag 'iommu-updates-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (85 commits) iommu/io-pgtable: Abstract iommu_iotlb_gather access iommu/arm-smmu: Fix missing unlock on error in arm_smmu_device_group() iommu/vt-d: Add present bit check in pasid entry setup helpers iommu/vt-d: Use pasid_pte_is_present() helper function iommu/vt-d: Drop the kernel doc annotation iommu/vt-d: Allow devices to have more than 32 outstanding PRs iommu/vt-d: Preset A/D bits for user space DMA usage iommu/vt-d: Enable Intel IOMMU scalable mode by default iommu/vt-d: Refactor Kconfig a bit iommu/vt-d: Remove unnecessary oom message iommu/vt-d: Update the virtual command related registers iommu: Allow enabling non-strict mode dynamically iommu: Merge strictness and domain type configs iommu: Only log strictness for DMA domains iommu: Expose DMA domain strictness via sysfs iommu: Express DMA strictness via the domain type iommu/vt-d: Prepare for multiple DMA domain types iommu/arm-smmu: Prepare for multiple DMA domain types iommu/amd: Prepare for multiple DMA domain types iommu: Introduce explicit type for non-strict DMA domains ...
2021-08-20Merge branches 'apple/dart', 'arm/smmu', 'iommu/fixes', 'x86/amd', ↵Joerg Roedel
'x86/vt-d' and 'core' into next
2021-08-20iommu/io-pgtable: Abstract iommu_iotlb_gather accessRobin Murphy
Previously io-pgtable merely passed the iommu_iotlb_gather pointer through to helpers, but now it has grown its own direct dereference. This turns out to break the build for !IOMMU_API configs where the structure only has a dummy definition. It will probably also crash drivers who don't use the gather mechanism and simply pass in NULL. Wrap this dereference in a suitable helper which can both be stubbed out for !IOMMU_API and encapsulate a NULL check otherwise. Fixes: 7a7c5badf858 ("iommu: Indicate queued flushes via gather data") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/83672ee76f6405c82845a55c148fa836f56fbbc1.1629465282.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18iommu: Express DMA strictness via the domain typeRobin Murphy
Eliminate the iommu_get_dma_strict() indirection and pipe the information through the domain type from the beginning. Besides the flow simplification this also has several nice side-effects: - Automatically implies strict mode for untrusted devices by virtue of their IOMMU_DOMAIN_DMA override. - Ensures that we only end up using flush queues for drivers which are aware of them and can actually benefit. - Allows us to handle flush queue init failure by falling back to strict mode instead of leaving it to possibly blow up later. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/47083d69155577f1367877b1594921948c366eb3.1628682049.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18iommu: Introduce explicit type for non-strict DMA domainsRobin Murphy
Promote the difference between strict and non-strict DMA domains from an internal detail to a distinct domain feature and type, to pave the road for exposing it through the sysfs default domain interface. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/08cd2afaf6b63c58ad49acec3517c9b32c2bb946.1628682049.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18iommu: Indicate queued flushes via gather dataRobin Murphy
Since iommu_iotlb_gather exists to help drivers optimise flushing for a given unmap request, it is also the logical place to indicate whether the unmap is strict or not, and thus help them further optimise for whether to expect a sync or a flush_all subsequently. As part of that, it also seems fair to make the flush queue code take responsibility for enforcing the really subtle ordering requirement it brings, so that we don't need to worry about forgetting that if new drivers want to add flush queue support, and can consolidate the existing versions. While we're adding to the kerneldoc, also fill in some info for @freelist which was overlooked previously. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/bf5f8e2ad84e48c712ccbf80fa8c610594c7595f.1628682049.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18iommu: Pull IOVA cookie management into the coreRobin Murphy
Now that everyone has converged on iommu-dma for IOMMU_DOMAIN_DMA support, we can abandon the notion of drivers being responsible for the cookie type, and consolidate all the management into the core code. CC: Yong Wu <yong.wu@mediatek.com> CC: Chunyan Zhang <chunyan.zhang@unisoc.com> CC: Maxime Ripard <mripard@kernel.org> Tested-by: Heiko Stuebner <heiko@sntech.de> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/46a2c0e7419c7d1d931762dc7b6a69fa082d199a.1628682048.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-09iommu: return full error code from iommu_map_sg[_atomic]()Logan Gunthorpe
Convert to ssize_t return code so the return code from __iommu_map() can be returned all the way down through dma_iommu_map_sg(). Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-08-02Merge remote-tracking branch 'korg/core' into x86/amdJoerg Roedel
2021-08-02iommu: Factor iommu_iotlb_gather_is_disjoint() outNadav Amit
Refactor iommu_iotlb_gather_add_page() and factor out the logic that detects whether IOTLB gather range and a new range are disjoint. To be used by the next patch that implements different gathering logic for AMD. Note that updating gather->pgsize unconditionally does not affect correctness as the function had (and has) an invariant, in which gather->pgsize always represents the flushing granularity of its range. Arguably, “size" should never be zero, but lets assume for the matter of discussion that it might. If "size" equals to "gather->pgsize", then the assignment in question has no impact. Otherwise, if "size" is non-zero, then iommu_iotlb_sync() would initialize the size and range (see iommu_iotlb_gather_init()), and the invariant is kept. Otherwise, "size" is zero, and "gather" already holds a range, so gather->pgsize is non-zero and (gather->pgsize && gather->pgsize != size) is true. Therefore, again, iommu_iotlb_sync() would be called and initialize the size. Cc: Joerg Roedel <joro@8bytes.org> Cc: Jiajun Cao <caojiajun@vmware.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: iommu@lists.linux-foundation.org Cc: linux-kernel@vger.kernel.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Nadav Amit <namit@vmware.com> Link: https://lore.kernel.org/r/20210723093209.714328-5-namit@vmware.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-02iommu: Improve iommu_iotlb_gather helpersRobin Murphy
The Mediatek driver is not the only one which might want a basic address-based gathering behaviour, so although it's arguably simple enough to open-code, let's factor it out for the sake of cleanliness. Let's also take this opportunity to document the intent of these helpers for clarity. Cc: Joerg Roedel <joro@8bytes.org> Cc: Will Deacon <will@kernel.org> Cc: Jiajun Cao <caojiajun@vmware.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: iommu@lists.linux-foundation.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Nadav Amit <namit@vmware.com> Link: https://lore.kernel.org/r/20210723093209.714328-4-namit@vmware.com Signed-off-by: Joerg Roedel <jroedel@suse.de>