summaryrefslogtreecommitdiff
path: root/include/linux/iommu.h
diff options
context:
space:
mode:
authorRobin Murphy <robin.murphy@arm.com>2023-04-13 14:40:25 +0100
committerJoerg Roedel <jroedel@suse.de>2023-07-14 16:14:17 +0200
commit791c2b17fb4023f21c3cbf5f268af01d9b8cb7cc (patch)
tree83977771637623fe2179388281538dac95a4a536 /include/linux/iommu.h
parentf188056352bcdcd4bec31c22baa00ba6568f1416 (diff)
iommu: Optimise PCI SAC address trick
Per the reasoning in commit 4bf7fda4dce2 ("iommu/dma: Add config for PCI SAC address trick") and its subsequent revert, this mechanism no longer serves its original purpose, but now only works around broken hardware/drivers in a way that is unfortunately too impactful to remove. This does not, however, prevent us from solving the performance impact which that workaround has on large-scale systems that don't need it. Once the 32-bit IOVA space fills up and a workload starts allocating and freeing on both sides of the boundary, the opportunistic SAC allocation can then end up spending significant time hunting down scattered fragments of free 32-bit space, or just reestablishing max32_alloc_size. This can easily be exacerbated by a change in allocation pattern, such as by changing the network MTU, which can increase pressure on the 32-bit space by leaving a large quantity of cached IOVAs which are now the wrong size to be recycled, but also won't be freed since the non-opportunistic allocations can still be satisfied from the whole 64-bit space without triggering the reclaim path. However, in the context of a workaround where smaller DMA addresses aren't simply a preference but a necessity, if we get to that point at all then in fact it's already the endgame. The nature of the allocator is currently such that the first IOVA we give to a device after the 32-bit space runs out will be the highest possible address for that device, ever. If that works, then great, we know we can optimise for speed by always allocating from the full range. And if it doesn't, then the worst has already happened and any brokenness is now showing, so there's little point in continuing to try to hide it. To that end, implement a flag to refine the SAC business into a per-device policy that can automatically get itself out of the way if and when it stops being useful. CC: Linus Torvalds <torvalds@linux-foundation.org> CC: Jakub Kicinski <kuba@kernel.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/b8502b115b915d2a3fabde367e099e39106686c8.1681392791.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
Diffstat (limited to 'include/linux/iommu.h')
-rw-r--r--include/linux/iommu.h2
1 files changed, 2 insertions, 0 deletions
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index d31642596675..b1dcb1b9b170 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -409,6 +409,7 @@ struct iommu_fault_param {
* @priv: IOMMU Driver private data
* @max_pasids: number of PASIDs this device can consume
* @attach_deferred: the dma domain attachment is deferred
+ * @pci_32bit_workaround: Limit DMA allocations to 32-bit IOVAs
*
* TODO: migrate other per device data pointers under iommu_dev_data, e.g.
* struct iommu_group *iommu_group;
@@ -422,6 +423,7 @@ struct dev_iommu {
void *priv;
u32 max_pasids;
u32 attach_deferred:1;
+ u32 pci_32bit_workaround:1;
};
int iommu_device_register(struct iommu_device *iommu,