summaryrefslogtreecommitdiff
path: root/drivers/iommu
AgeCommit message (Collapse)Author
2018-03-29iommu/amd: Split irq_lookup_table out of the amd_iommu_devtable_lockSebastian Andrzej Siewior
The function get_irq_table() reads/writes irq_lookup_table while holding the amd_iommu_devtable_lock. It also modifies amd_iommu_dev_table[].data[2]. set_dte_entry() is using amd_iommu_dev_table[].data[0|1] (under the domain->lock) so it should be okay. The access to the iommu is serialized with its own (iommu's) lock. So split out get_irq_table() out of amd_iommu_devtable_lock's lock. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-03-29iommu/amd: Split domain id out of amd_iommu_devtable_lockSebastian Andrzej Siewior
domain_id_alloc() and domain_id_free() is used for id management. Those two function share a bitmap (amd_iommu_pd_alloc_bitmap) and set/clear bits based on id allocation. There is no need to share this with amd_iommu_devtable_lock, it can use its own lock for this operation. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-03-29iommu/amd: Turn dev_data_list into a lock less listSebastian Andrzej Siewior
alloc_dev_data() adds new items to dev_data_list and search_dev_data() is searching for items in this list. Both protect the access to the list with a spinlock. There is no need to navigate forth and back within the list and there is also no deleting of a specific item. This qualifies the list to become a lock less list and as part of this, the spinlock can be removed. With this change the ordering of those items within the list is changed: before the change new items were added to the end of the list, now they are added to the front. I don't think it matters but wanted to mention it. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-03-29iommu/amd: Take into account that alloc_dev_data() may return NULLSebastian Andrzej Siewior
find_dev_data() does not check whether the return value alloc_dev_data() is NULL. This was okay once because the pointer was returned once as-is. Since commit df3f7a6e8e85 ("iommu/amd: Use is_attach_deferred call-back") the pointer may be used within find_dev_data() so a NULL check is required. Cc: Baoquan He <bhe@redhat.com> Fixes: df3f7a6e8e85 ("iommu/amd: Use is_attach_deferred call-back") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-03-29iommu/vt-d:Remove unused variableShaokun Zhang
Unused after commit <42e8c186b595> ("iommu/vt-d: Simplify io/tlb flushing in intel_iommu_unmap"), cleanup it. Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-03-27iommu/arm-smmu-v3: Support 52-bit virtual addressRobin Murphy
Stage 1 input addresses are effectively 64-bit in SMMUv3 anyway, so really all that's involved is letting io-pgtable know the appropriate upper bound for T0SZ. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-27iommu/arm-smmu-v3: Support 52-bit physical addressRobin Murphy
Implement SMMUv3.1 support for 52-bit physical addresses. Since a 52-bit OAS implies 64KB translation granule support, permitting level 1 block entries there is simple, and the rest is just extending address fields. Tested-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-27iommu/io-pgtable-arm: Support 52-bit physical addressRobin Murphy
Bring io-pgtable-arm in line with the ARMv8.2-LPA feature allowing 52-bit physical addresses when using the 64KB translation granule. This will be supported by SMMUv3.1. Tested-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-27iommu/arm-smmu-v3: Clean up queue definitionsRobin Murphy
As with registers and tables, use GENMASK and the bitfield accessors consistently for queue fields, to save some lines and ease maintenance a little. This now leaves everything in a nice state where all named field definitions expect to be used with bitfield accessors (although since single-bit fields can still be used directly we leave some of those uses as-is to avoid unnecessary churn), while the few remaining *_MASK definitions apply exclusively to in-place values. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-27iommu/arm-smmu-v3: Clean up table definitionsRobin Murphy
As with registers, use GENMASK and the bitfield accessors consistently for table fields, to save some lines and ease maintenance a little. This also catches a subtle off-by-one wherein bit 5 of CD.T0SZ was missing. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-27iommu/arm-smmu-v3: Clean up register definitionsRobin Murphy
The FIELD_{GET,PREP} accessors provided by linux/bitfield.h allow us to define multi-bit register fields solely in terms of their bit positions via GENMASK(), without needing explicit *_SHIFT and *_MASK definitions. As well as the immediate reduction in lines of code, this avoids the awkwardness of values sometimes being pre-shifted and sometimes not, which means we can factor out some common values like memory attributes. Furthermore, it also makes it trivial to verify the definitions against the architecture spec, on which note let's also fix up a few field names to properly match the current release (IHI0070B). Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-27iommu/arm-smmu-v3: Clean up address maskingRobin Murphy
Before trying to add the SMMUv3.1 support for 52-bit addresses, make things bearable by cleaning up the various address mask definitions to use GENMASK_ULL() consistently. The fact that doing so reveals (and fixes) a latent off-by-one in Q_BASE_ADDR_MASK only goes to show what a jolly good idea it is... Tested-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-27iommu/arm-smmu-v3: limit reporting of MSI allocation failuresNate Watterson
Currently, the arm-smmu-v3 driver expects to allocate MSIs for all SMMUs with FEAT_MSI set. This results in unwarranted "failed to allocate MSIs" warnings being printed on systems where FW was either deliberately configured to force the use of SMMU wired interrupts -or- is altogether incapable of describing SMMU MSI topology (ACPI IORT prior to rev.C). Remedy this by checking msi_domain before attempting to allocate SMMU MSIs. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-27iommu/arm-smmu-v3: Warn about missing IRQsRobin Murphy
It is annoyingly non-obvious when DMA transactions silently go missing due to undetected SMMU faults. Help skip the first few debugging steps in those situations by making it clear when we have neither wired IRQs nor MSIs with which to raise error conditions. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-21iommu/mediatek: Fix protect memory settingYong Wu
In MediaTek's IOMMU design, When a iommu translation fault occurs (HW can NOT translate the destination address to a valid physical address), the IOMMU HW output the dirty data into a special memory to avoid corrupting the main memory, this is called "protect memory". the register(0x114) for protect memory is a little different between mt8173 and mt2712. In the mt8173, bit[30:6] in the register represents [31:7] of the physical address. In the 4GB mode, the register bit[31] should be 1. While in the mt2712, the bits don't shift. bit[31:7] in the register represents [31:7] in the physical address, and bit[1:0] in the register represents bit[33:32] of the physical address if it has. Fixes: e6dec9230862 ("iommu/mediatek: Add mt2712 IOMMU support") Reported-by: Honghui Zhang <honghui.zhang@mediatek.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-03-20iommu/vt-d: Use real PASID for flush in caching modeLu Baolu
If caching mode is supported, the hardware will cache none-present or erroneous translation entries. Hence, software should explicitly invalidate the PASID cache after a PASID table entry becomes present. We should issue such invalidation with the PASID value that we have changed. PASID 0 is not reserved for this case. Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Sankaran Rajesh <rajesh.sankaran@intel.com> Suggested-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-03-20iommu/intel-iommu: Enable CONFIG_DMA_DIRECT_OPS=y and clean up ↵Christoph Hellwig
intel_{alloc,free}_coherent() Use the dma_direct_*() helpers and clean up the code flow. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-9-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20iommu/amd_iommu: Use CONFIG_DMA_DIRECT_OPS=y and dma_direct_{alloc,free}()Christoph Hellwig
This cleans up the code a lot by removing duplicate logic. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-8-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20x86/dma: Use DMA-direct (CONFIG_DMA_DIRECT_OPS=y)Christoph Hellwig
The generic DMA-direct (CONFIG_DMA_DIRECT_OPS=y) implementation is now functionally equivalent to the x86 nommu dma_map implementation, so switch over to using it. That includes switching from using x86_dma_supported in various IOMMU drivers to use dma_direct_supported instead, which provides the same functionality. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-4-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-15iommu/omap: Increase group ref in .device_group()Jeffy Chen
Increase group refcounting in omap_iommu_device_group(). Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-03-15iommu/amd: Use dev_err to send events to the system logGary R Hook
Remove printk and use a more preferable error logging function. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-03-15iommu/vt-d: Fix a potential memory leakLu Baolu
A memory block was allocated in intel_svm_bind_mm() but never freed in a failure path. This patch fixes this by free it to avoid memory leakage. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Fixes: 2f26e0a9c9860 ('iommu/vt-d: Add basic SVM PASID support') Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-03-15iommu/rockchip: Perform a reset on shutdownMarc Zyngier
Trying to do a kexec whilst the iommus are still on is proving to be a challenging exercise. It is terribly unsafe, as we're reusing the memory allocated for the page tables, leading to a likely crash. Let's implement a shutdown method that will at least try to stop DMA from going crazy behind our back. Note that we need to be extra cautious when doing so, as the IOMMU may not be clocked if controlled by a another master, as typical on Rockchip system. Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-03-15iommu/amd: Add support for fast IOTLB flushingSuravee Suthikulpanit
Since AMD IOMMU driver currently flushes all TLB entries when page size is more than one, use the same interface for both iommu_ops.flush_iotlb_all() and iommu_ops.iotlb_sync(). Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-22treewide/trivial: Remove ';;$' typo noiseIngo Molnar
On lkml suggestions were made to split up such trivial typo fixes into per subsystem patches: --- a/arch/x86/boot/compressed/eboot.c +++ b/arch/x86/boot/compressed/eboot.c @@ -439,7 +439,7 @@ setup_uga32(void **uga_handle, unsigned long size, u32 *width, u32 *height) struct efi_uga_draw_protocol *uga = NULL, *first_uga; efi_guid_t uga_proto = EFI_UGA_PROTOCOL_GUID; unsigned long nr_ugas; - u32 *handles = (u32 *)uga_handle;; + u32 *handles = (u32 *)uga_handle; efi_status_t status = EFI_INVALID_PARAMETER; int i; This patch is the result of the following script: $ sed -i 's/;;$/;/g' $(git grep -E ';;$' | grep "\.[ch]:" | grep -vwE 'for|ia64' | cut -d: -f1 | sort | uniq) ... followed by manual review to make sure it's all good. Splitting this up is just crazy talk, let's get over with this and just do it. Reported-by: Pavel Machek <pavel@ucw.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-17x86/apic: Rename variables and functions related to x86_io_apic_opsBaoquan He
The names of x86_io_apic_ops and its two member variables are misleading: The ->read() member is to read IO_APIC reg, while ->disable() which is called by native_disable_io_apic()/irq_remapping_disable_io_apic() is actually used to restore boot IRQ mode, not to disable the IO-APIC. So rename x86_io_apic_ops to 'x86_apic_ops' since it doesn't only handle the IO-APIC, but also the local APIC. Also rename its member variables and the related callbacks. Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: douly.fnst@cn.fujitsu.com Cc: joro@8bytes.org Cc: prarit@redhat.com Cc: uobergfe@redhat.com Link: http://lkml.kernel.org/r/20180214054656.3780-6-bhe@redhat.com [ Rewrote the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15iommu/amd: Avoid locking get_irq_table() from atomic contextScott Wood
get_irq_table() previously acquired amd_iommu_devtable_lock which is not a raw lock, and thus cannot be acquired from atomic context on PREEMPT_RT. Many calls to modify_irte*() come from atomic context due to the IRQ desc->lock, as does amd_iommu_update_ga() due to the preemption disabling in vcpu_load/put(). The only difference between calling get_irq_table() and reading from irq_lookup_table[] directly, other than the lock acquisition and amd_iommu_rlookup_table[] check, is if the table entry is unpopulated, which should never happen when looking up a devid that came from an irq_2_irte struct, as get_irq_table() would have already been called on that devid during irq_remapping_alloc(). The lock acquisition is not needed in these cases because entries in irq_lookup_table[] never change once non-NULL -- nor would the amd_iommu_devtable_lock usage in get_irq_table() provide meaningful protection if they did, since it's released before using the looked up table in the get_irq_table() caller. Rename the old get_irq_table() to alloc_irq_table(), and create a new lockless get_irq_table() to be used in non-allocating contexts that WARNs if it doesn't find what it's looking for. Signed-off-by: Scott Wood <swood@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-14iommu/dma: Add HW MSI(GICv3 ITS) address regions reservationShameer Kolothum
Modified iommu_dma_get_resv_regions() to include GICv3 ITS region on ACPI based ARM platfiorms which may require HW MSI reservations. Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-13iommu/io-pgtable: Use size_t return type for all foo_unmapVivek Gautam
Unmap returns a size_t all throughout the IOMMU framework. Make io-pgtable match this convention. Moreover, there isn't a need to have a signed int return type as we return 0 in case of failures. Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-13iommu: Do not return error code for APIs with size_t return typeSuravee Suthikulpanit
Currently, iommu_unmap, iommu_unmap_fast and iommu_map_sg return size_t. However, some of the return values are error codes (< 0), which can be misinterpreted as large size. Therefore, returning size 0 instead to signify failure to map/unmap. Cc: Joerg Roedel <joro@8bytes.org> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-13iommu/vt-d: Add __init for dmar_register_bus_notifier()Dmitry Safonov
It's called only from intel_iommu_init(), which is init function. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-13iommu/mediatek: Move attach_device after iommu-group is ready for M4Uv1Yong Wu
In the commit 05f80300dc8b ("iommu: Finish making iommu_group support mandatory"), the iommu framework has supposed all the iommu drivers have their owner iommu-group, it get rid of the FIXME workarounds while the group is NULL. But the flow of Mediatek M4U gen1 looks a bit trick that it will hang at this case: ========================================== Unable to handle kernel NULL pointer dereference at virtual address 00000030 pgd = c0004000 [00000030] *pgd=00000000 PC is at mutex_lock+0x28/0x54 LR is at iommu_attach_device+0xa4/0xd4 pc : [<c07632e8>] lr : [<c04736fc>] psr: 60000013 sp : df0edbb8 ip : df0edbc8 fp : df0edbc4 r10: c114da14 r9 : df2a3e40 r8 : 00000003 r7 : df27a210 r6 : df2a90c4 r5 : 00000030 r4 : 00000000 r3 : df0f8000 r2 : fffff000 r1 : df29c610 r0 : 00000030 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none xxx (mutex_lock) from [<c04736fc>] (iommu_attach_device+0xa4/0xd4) (iommu_attach_device) from [<c011b9dc>] (__arm_iommu_attach_device+0x28/0x90) (__arm_iommu_attach_device) from [<c011ba60>] (arm_iommu_attach_device+0x1c/0x30) (arm_iommu_attach_device) from [<c04759ac>] (mtk_iommu_add_device+0xfc/0x214) (mtk_iommu_add_device) from [<c0472aa4>] (add_iommu_group+0x3c/0x68) (add_iommu_group) from [<c047d044>] (bus_for_each_dev+0x78/0xac) (bus_for_each_dev) from [<c04734a4>] (bus_set_iommu+0xb0/0xec) (bus_set_iommu) from [<c0476310>] (mtk_iommu_probe+0x328/0x368) (mtk_iommu_probe) from [<c048189c>] (platform_drv_probe+0x5c/0xc0) (platform_drv_probe) from [<c047f510>] (driver_probe_device+0x2f4/0x4d8) (driver_probe_device) from [<c047f800>] (__driver_attach+0x10c/0x128) (__driver_attach) from [<c047d044>] (bus_for_each_dev+0x78/0xac) (bus_for_each_dev) from [<c047ec78>] (driver_attach+0x2c/0x30) (driver_attach) from [<c047e640>] (bus_add_driver+0x1e0/0x278) (bus_add_driver) from [<c048052c>] (driver_register+0x88/0x108) (driver_register) from [<c04817ec>] (__platform_driver_register+0x50/0x58) (__platform_driver_register) from [<c0b31380>] (m4u_init+0x24/0x28) (m4u_init) from [<c0101c38>] (do_one_initcall+0xf0/0x17c) ========================= The root cause is that the device's iommu-group is NULL while arm_iommu_attach_device is called. This patch prepare a new iommu-group for the iommu consumer devices to fix this issue. CC: Robin Murphy <robin.murphy@arm.com> CC: Honghui Zhang <honghui.zhang@mediatek.com> Fixes: 05f80300dc8b ("iommu: Finish making iommu_group support mandatory") Reported-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-13iommu/exynos: Use generic group callbackRobin Murphy
Since iommu_group_get_for_dev() already tries iommu_group_get() and will not call ops->device_group if the group is already non-NULL, the check in get_device_iommu_group() is always redundant and it reduces to a duplicate of the generic version; let's just use that one instead. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-13iommu/amd: Don't use dev_data in irte_ga_set_affinity()Scott Wood
search_dev_data() acquires a non-raw lock, which can't be done from atomic context on PREEMPT_RT. There is no need to look at dev_data because guest_mode should never be set if use_vapic is not set. Signed-off-by: Scott Wood <swood@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-13iommu/amd: Use raw locks on atomic context pathsScott Wood
Several functions in this driver are called from atomic context, and thus raw locks must be used in order to be safe on PREEMPT_RT. This includes paths that must wait for command completion, which is a potential PREEMPT_RT latency concern but not easily avoidable. Signed-off-by: Scott Wood <swood@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-08Merge tag 'iommu-updates-v4.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "This time there are not a lot of changes coming from the IOMMU side. That is partly because I returned from my parental leave late in the development process and probably partly because everyone was busy with Spectre and Meltdown mitigation work and didn't find the time for IOMMU work. So here are the few changes that queued up for this merge window: - 5-level page-table support for the Intel IOMMU. - error reporting improvements for the AMD IOMMU driver - additional DT bindings for ipmmu-vmsa (Renesas) - small fixes and cleanups" * tag 'iommu-updates-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu: Clean up of_iommu_init_fn iommu/ipmmu-vmsa: Remove redundant of_iommu_init_fn hook iommu/msm: Claim bus ops on probe iommu/vt-d: Enable 5-level paging mode in the PASID entry iommu/vt-d: Add a check for 5-level paging support iommu/vt-d: Add a check for 1GB page support iommu/vt-d: Enable upto 57 bits of domain address width iommu/vt-d: Use domain instead of cache fetching iommu/exynos: Don't unconditionally steal bus ops iommu/omap: Fix debugfs_create_*() usage iommu/vt-d: clean up pr_irq if request_threaded_irq fails iommu: Check the result of iommu_group_get() for NULL iommu/ipmmu-vmsa: Add r8a779(70|95) DT bindings iommu/ipmmu-vmsa: Add r8a7796 DT binding iommu/amd: Set the device table entry PPR bit for IOMMU V2 devices iommu/amd - Record more information about unknown events
2018-02-06Merge tag 'pci-v4.16-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: - skip AER driver error recovery callbacks for correctable errors reported via ACPI APEI, as we already do for errors reported via the native path (Tyler Baicar) - fix DPC shared interrupt handling (Alex Williamson) - print full DPC interrupt number (Keith Busch) - enable DPC only if AER is available (Keith Busch) - simplify DPC code (Bjorn Helgaas) - calculate ASPM L1 substate parameter instead of hardcoding it (Bjorn Helgaas) - enable Latency Tolerance Reporting for ASPM L1 substates (Bjorn Helgaas) - move ASPM internal interfaces out of public header (Bjorn Helgaas) - allow hot-removal of VGA devices (Mika Westerberg) - speed up unplug and shutdown by assuming Thunderbolt controllers don't support Command Completed events (Lukas Wunner) - add AtomicOps support for GPU and Infiniband drivers (Felix Kuehling, Jay Cornwall) - expose "ari_enabled" in sysfs to help NIC naming (Stuart Hayes) - clean up PCI DMA interface usage (Christoph Hellwig) - remove PCI pool API (replaced with DMA pool) (Romain Perier) - deprecate pci_get_bus_and_slot(), which assumed PCI domain 0 (Sinan Kaya) - move DT PCI code from drivers/of/ to drivers/pci/ (Rob Herring) - add PCI-specific wrappers for dev_info(), etc (Frederick Lawler) - remove warnings on sysfs mmap failure (Bjorn Helgaas) - quiet ROM validation messages (Alex Deucher) - remove redundant memory alloc failure messages (Markus Elfring) - fill in types for compile-time VGA and other I/O port resources (Bjorn Helgaas) - make "pci=pcie_scan_all" work for Root Ports as well as Downstream Ports to help AmigaOne X1000 (Bjorn Helgaas) - add SPDX tags to all PCI files (Bjorn Helgaas) - quirk Marvell 9128 DMA aliases (Alex Williamson) - quirk broken INTx disable on Ceton InfiniTV4 (Bjorn Helgaas) - fix CONFIG_PCI=n build by adding dummy pci_irqd_intx_xlate() (Niklas Cassel) - use DMA API to get MSI address for DesignWare IP (Niklas Cassel) - fix endpoint-mode DMA mask configuration (Kishon Vijay Abraham I) - fix ARTPEC-6 incorrect IS_ERR() usage (Wei Yongjun) - add support for ARTPEC-7 SoC (Niklas Cassel) - add endpoint-mode support for ARTPEC (Niklas Cassel) - add Cadence PCIe host and endpoint controller driver (Cyrille Pitchen) - handle multiple INTx status bits being set in dra7xx (Vignesh R) - translate dra7xx hwirq range to fix INTD handling (Vignesh R) - remove deprecated Exynos PHY initialization code (Jaehoon Chung) - fix MSI erratum workaround for HiSilicon Hip06/Hip07 (Dongdong Liu) - fix NULL pointer dereference in iProc BCMA driver (Ray Jui) - fix Keystone interrupt-controller-node lookup (Johan Hovold) - constify qcom driver structures (Julia Lawall) - rework Tegra config space mapping to increase space available for endpoints (Vidya Sagar) - simplify Tegra driver by using bus->sysdata (Manikanta Maddireddy) - remove PCI_REASSIGN_ALL_BUS usage on Tegra (Manikanta Maddireddy) - add support for Global Fabric Manager Server (GFMS) event to Microsemi Switchtec switch driver (Logan Gunthorpe) - add IDs for Switchtec PSX 24xG3 and PSX 48xG3 (Kelvin Cao) * tag 'pci-v4.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (140 commits) PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe endpoint controller PCI: endpoint: Fix EPF device name to support multi-function devices PCI: endpoint: Add the function number as argument to EPC ops PCI: cadence: Add host driver for Cadence PCIe controller dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe host controller PCI: Add vendor ID for Cadence PCI: Add generic function to probe PCI host controllers PCI: generic: fix missing call of pci_free_resource_list() PCI: OF: Add generic function to parse and allocate PCI resources PCI: Regroup all PCI related entries into drivers/pci/Makefile PCI/DPC: Reformat DPC register definitions PCI/DPC: Add and use DPC Status register field definitions PCI/DPC: Squash dpc_rp_pio_get_info() into dpc_process_rp_pio_error() PCI/DPC: Remove unnecessary RP PIO register structs PCI/DPC: Push dpc->rp_pio_status assignment into dpc_rp_pio_get_info() PCI/DPC: Squash dpc_rp_pio_print_error() into dpc_rp_pio_get_info() PCI/DPC: Make RP PIO log size check more generic PCI/DPC: Rename local "status" to "dpc_status" PCI/DPC: Squash dpc_rp_pio_print_tlp_header() into dpc_rp_pio_print_error() ...
2018-02-01Merge tag 'armsoc-drivers' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC driver updates from Arnd Bergmann: "A number of new drivers get added this time, along with many low-priority bugfixes. The most interesting changes by subsystem are: bus drivers: - Updates to the Broadcom bus interface driver to support newer SoC types - The TI OMAP sysc driver now supports updated DT bindings memory controllers: - A new driver for Tegra186 gets added - A new driver for the ti-emif sram, to allow relocating suspend/resume handlers there SoC specific: - A new driver for Qualcomm QMI, the interface to the modem on MSM SoCs - A new driver for power domains on the actions S700 SoC - A driver for the Xilinx Zynq VCU logicoreIP reset controllers: - A new driver for Amlogic Meson-AGX - various bug fixes tee subsystem: - A new user interface got added to enable asynchronous communication with the TEE supplicant. - A new method of using user space memory for communication with the TEE is added" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (84 commits) of: platform: fix OF node refcount leak soc: fsl: guts: Add a NULL check for devm_kasprintf() bus: ti-sysc: Fix smartreflex sysc mask psci: add CPU_IDLE dependency soc: xilinx: Fix Kconfig alignment soc: xilinx: xlnx_vcu: Use bitwise & rather than logical && on clkoutdiv soc: xilinx: xlnx_vcu: Depends on HAS_IOMEM for xlnx_vcu soc: bcm: brcmstb: Be multi-platform compatible soc: brcmstb: biuctrl: exit without warning on non brcmstb platforms Revert "soc: brcmstb: Only register SoC device on STB platforms" bus: omap: add MODULE_LICENSE tags soc: brcmstb: Only register SoC device on STB platforms tee: shm: Potential NULL dereference calling tee_shm_register() soc: xilinx: xlnx_vcu: Add Xilinx ZYNQMP VCU logicoreIP init driver dt-bindings: soc: xilinx: Add DT bindings to xlnx_vcu driver soc: xilinx: Create folder structure for soc specific drivers of: platform: populate /firmware/ node from of_platform_default_populate_init() soc: samsung: Add SPDX license identifiers soc: qcom: smp2p: Use common error handling code in qcom_smp2p_probe() tee: shm: don't put_page on null shm->pages ...
2018-01-31mm, mmu_notifier: annotate mmu notifiers with blockable invalidate callbacksDavid Rientjes
Commit 4d4bbd8526a8 ("mm, oom_reaper: skip mm structs with mmu notifiers") prevented the oom reaper from unmapping private anonymous memory with the oom reaper when the oom victim mm had mmu notifiers registered. The rationale is that doing mmu_notifier_invalidate_range_{start,end}() around the unmap_page_range(), which is needed, can block and the oom killer will stall forever waiting for the victim to exit, which may not be possible without reaping. That concern is real, but only true for mmu notifiers that have blockable invalidate_range_{start,end}() callbacks. This patch adds a "flags" field to mmu notifier ops that can set a bit to indicate that these callbacks do not block. The implementation is steered toward an expensive slowpath, such as after the oom reaper has grabbed mm->mmap_sem of a still alive oom victim. [rientjes@google.com: mmu_notifier_invalidate_range_end() can also call the invalidate_range() must not block, fix comment] Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1801091339570.240101@chino.kir.corp.google.com [akpm@linux-foundation.org: make mm_has_blockable_invalidate_notifiers() return bool, use rwsem_is_locked()] Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1712141329500.74052@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Dimitri Sivanich <sivanich@hpe.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Joerg Roedel <joro@8bytes.org> Cc: Doug Ledford <dledford@redhat.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-17Merge branches 'arm/renesas', 'arm/omap', 'arm/exynos', 'x86/amd', ↵Joerg Roedel
'x86/vt-d' and 'core' into next
2018-01-17iommu: Clean up of_iommu_init_fnRobin Murphy
Now that no more drivers rely on arbitrary early initialisation via an of_iommu_init_fn hook, let's clean up the redundant remnants. The IOMMU_OF_DECLARE() macro needs to remain for now, as the probe-deferral mechanism has no other nice way to detect built-in drivers before they have registered themselves, such that it can make the right decision. Reviewed-by: Sricharan R <sricharan@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-01-17iommu/ipmmu-vmsa: Remove redundant of_iommu_init_fn hookRobin Murphy
Having of_iommu_init() call ipmmu_init() via ipmmu_vmsa_iommu_of_setup() does nothing that the subsys_initcall wouldn't do slightly later anyway, since probe-deferral of masters means it is no longer critical to register the driver super-early. Clean it up. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-01-17iommu/msm: Claim bus ops on probeRobin Murphy
Since the MSM IOMMU driver now probes via DT exclusively rather than platform data, dependent masters should be deferred until the IOMMU itself is ready. Thus we can do away with the early initialisation hook to unconditionally claim the bus ops, and instead do that only once an IOMMU is actually probed. Furthermore, this should also make the driver safe for multiplatform kernels on non-MSM SoCs. Reviewed-by: Sricharan R <sricharan@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-01-17iommu/vt-d: Enable 5-level paging mode in the PASID entrySohil Mehta
If the CPU has support for 5-level paging enabled and the IOMMU also supports 5-level paging then enable the 5-level paging mode for first- level translations - used when SVM is enabled. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-01-17iommu/vt-d: Add a check for 5-level paging supportSohil Mehta
Add a check to verify IOMMU 5-level paging support. If the CPU supports supports 5-level paging but the IOMMU does not support it then disable SVM by not allocating PASID tables. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-01-17iommu/vt-d: Add a check for 1GB page supportSohil Mehta
Add a check to verify IOMMU 1GB page support. If the CPU supports 1GB pages but the IOMMU does not support it then disable SVM by not allocating PASID tables. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-01-17iommu/vt-d: Enable upto 57 bits of domain address widthSohil Mehta
Update the IOMMU default domain address width to 57 bits. This would enable the IOMMU to do upto 5-levels of paging for second level translations - IOVA translation requests without PASID. Even though the maximum supported address width is being increased to 57, __iommu_calculate_agaw() would set the actual supported address width to the maximum support available in IOMMU hardware. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-01-17iommu/vt-d: Use domain instead of cache fetchingPeter Xu
after commit a1ddcbe93010 ("iommu/vt-d: Pass dmar_domain directly into iommu_flush_iotlb_psi", 2015-08-12), we have domain pointer as parameter to iommu_flush_iotlb_psi(), so no need to fetch it from cache again. More importantly, a NULL reference pointer bug is reported on RHEL7 (and it can be reproduced on some old upstream kernels too, e.g., v4.13) by unplugging an 40g nic from a VM (hard to test unplug on real host, but it should be the same): https://bugzilla.redhat.com/show_bug.cgi?id=1531367 [ 24.391863] pciehp 0000:00:03.0:pcie004: Slot(0): Attention button pressed [ 24.393442] pciehp 0000:00:03.0:pcie004: Slot(0): Powering off due to button press [ 29.721068] i40evf 0000:01:00.0: Unable to send opcode 2 to PF, err I40E_ERR_QUEUE_EMPTY, aq_err OK [ 29.783557] iommu: Removing device 0000:01:00.0 from group 3 [ 29.784662] BUG: unable to handle kernel NULL pointer dereference at 0000000000000304 [ 29.785817] IP: iommu_flush_iotlb_psi+0xcf/0x120 [ 29.786486] PGD 0 [ 29.786487] P4D 0 [ 29.786812] [ 29.787390] Oops: 0000 [#1] SMP [ 29.787876] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_ng [ 29.795371] CPU: 0 PID: 156 Comm: kworker/0:2 Not tainted 4.13.0 #14 [ 29.796366] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-1.el7 04/01/2014 [ 29.797593] Workqueue: pciehp-0 pciehp_power_thread [ 29.798328] task: ffff94f5745b4a00 task.stack: ffffb326805ac000 [ 29.799178] RIP: 0010:iommu_flush_iotlb_psi+0xcf/0x120 [ 29.799919] RSP: 0018:ffffb326805afbd0 EFLAGS: 00010086 [ 29.800666] RAX: ffff94f5bc56e800 RBX: 0000000000000000 RCX: 0000000200000025 [ 29.801667] RDX: ffff94f5bc56e000 RSI: 0000000000000082 RDI: 0000000000000000 [ 29.802755] RBP: ffffb326805afbf8 R08: 0000000000000000 R09: ffff94f5bc86bbf0 [ 29.803772] R10: ffffb326805afba8 R11: 00000000000ffdc4 R12: ffff94f5bc86a400 [ 29.804789] R13: 0000000000000000 R14: 00000000ffdc4000 R15: 0000000000000000 [ 29.805792] FS: 0000000000000000(0000) GS:ffff94f5bfc00000(0000) knlGS:0000000000000000 [ 29.806923] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 29.807736] CR2: 0000000000000304 CR3: 000000003499d000 CR4: 00000000000006f0 [ 29.808747] Call Trace: [ 29.809156] flush_unmaps_timeout+0x126/0x1c0 [ 29.809800] domain_exit+0xd6/0x100 [ 29.810322] device_notifier+0x6b/0x70 [ 29.810902] notifier_call_chain+0x4a/0x70 [ 29.812822] __blocking_notifier_call_chain+0x47/0x60 [ 29.814499] blocking_notifier_call_chain+0x16/0x20 [ 29.816137] device_del+0x233/0x320 [ 29.817588] pci_remove_bus_device+0x6f/0x110 [ 29.819133] pci_stop_and_remove_bus_device+0x1a/0x20 [ 29.820817] pciehp_unconfigure_device+0x7a/0x1d0 [ 29.822434] pciehp_disable_slot+0x52/0xe0 [ 29.823931] pciehp_power_thread+0x8a/0xa0 [ 29.825411] process_one_work+0x18c/0x3a0 [ 29.826875] worker_thread+0x4e/0x3b0 [ 29.828263] kthread+0x109/0x140 [ 29.829564] ? process_one_work+0x3a0/0x3a0 [ 29.831081] ? kthread_park+0x60/0x60 [ 29.832464] ret_from_fork+0x25/0x30 [ 29.833794] Code: 85 ed 74 0b 5b 41 5c 41 5d 41 5e 41 5f 5d c3 49 8b 54 24 60 44 89 f8 0f b6 c4 48 8b 04 c2 48 85 c0 74 49 45 0f b6 ff 4a 8b 3c f8 <80> bf [ 29.838514] RIP: iommu_flush_iotlb_psi+0xcf/0x120 RSP: ffffb326805afbd0 [ 29.840362] CR2: 0000000000000304 [ 29.841716] ---[ end trace b10ec0d6900868d3 ]--- This patch fixes that problem if applied to v4.13 kernel. The bug does not exist on latest upstream kernel since it's fixed as a side effect of commit 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing", 2017-08-15). But IMHO it's still good to have this patch upstream. CC: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Fixes: a1ddcbe93010 ("iommu/vt-d: Pass dmar_domain directly into iommu_flush_iotlb_psi") Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-01-17iommu/exynos: Don't unconditionally steal bus opsRobin Murphy
Removing the early device registration hook overlooked the fact that it only ran conditionally on a compatible device being present in the DT. With exynos_iommu_init() now running as an unconditional initcall, problems arise on non-Exynos systems when other IOMMU drivers find themselves unable to install their ops on the platform bus, or at worst the Exynos ops get called with someone else's domain and all hell breaks loose. The global ops/cache setup could probably all now be triggered from the first IOMMU probe, as with dma_dev assigment, but for the time being the simplest fix is to resurrect the logic from commit a7b67cd5d9af ("iommu/exynos: Play nice in multi-platform builds") to explicitly check the DT for the presence of an Exynos IOMMU before trying anything. Fixes: 928055a01b3f ("iommu/exynos: Remove custom platform device registration code") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-01-17iommu/omap: Fix debugfs_create_*() usageGeert Uytterhoeven
When exposing data access through debugfs, the correct debugfs_create_*() functions must be used, depending on data type. Remove all casts from data pointers passed to debugfs_create_*() functions, as such casts prevent the compiler from flagging bugs. omap_iommu.nr_tlb_entries is "int", hence casting to "u8 *" exposes only a part of it. Fix this by using debugfs_create_u32() instead. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Joerg Roedel <jroedel@suse.de>