summaryrefslogtreecommitdiff
path: root/arch/powerpc/platforms/pseries/iommu.c
AgeCommit message (Collapse)Author
2019-08-30powerpc/pseries/iommu: Don't use dma_iommu_ops on secure guestsThiago Jung Bauermann
Secure guest memory is inacessible to devices so regular DMA isn't possible. In that case set devices' dma_map_ops to NULL so that the generic DMA code path will use SWIOTLB to bounce buffers for DMA. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190820021326.6884-14-bauerman@linux.ibm.com
2019-08-30Merge branch 'topic/ppc-kvm' into nextMichael Ellerman
Merge our ppc-kvm topic branch to bring in the Ultravisor support patches.
2019-08-30powerpc/pseries/iommu: Switch to xchg_no_killAlexey Kardashevskiy
This is the last implementation of iommu_table_ops::exchange() which we are about to remove. This implements xchg_no_kill() for pseries. Since it is paravirtual platform, the hypervisor does TCE invalidations and we do not have to deal with it here, hence no tce_kill() hook. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190829085252.72370-5-aik@ozlabs.ru
2019-08-19powerpc/powernv/ioda2: Create bigger default window with 64k IOMMU pagesAlexey Kardashevskiy
At the moment we create a small window only for 32bit devices, the window maps 0..2GB of the PCI space only. For other devices we either use a sketchy bypass or hardware bypass but the former can only work if the amount of RAM is no bigger than the device's DMA mask and the latter requires devices to support at least 59bit DMA. This extends the default DMA window to the maximum size possible to allow a wider DMA mask than just 32bit. The default window size is now limited by the the iommu_table::it_map allocation bitmap which is a contiguous array, 1 bit per an IOMMU page. This increases the default IOMMU page size from hard coded 4K to the system page size to allow wider DMA masks. This increases the level number to not exceed the max order allocation limit per TCE level. By the same time, this keeps minimal levels number as 2 in order to save memory. As the extended window now overlaps the 32bit MMIO region, this adds an area reservation to iommu_init_table(). After this change the default window size is 0x80000000000==1<<43 so devices limited to DMA mask smaller than the amount of system RAM can still use more than just 2GB of memory for DMA. This is an optimization and not a bug fix for DMA API usage. With the on-demand allocation of indirect TCE table levels enabled and 2 levels, the first TCE level size is just 1<<ceil((log2(0x7ffffffffff+1)-16)/2)=16384 TCEs or 2 system pages. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190718051139.74787-5-aik@ozlabs.ru
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to the free software foundation inc 59 temple place suite 330 boston ma 02111 1307 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1334 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070033.113240726@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-20powerpc/pseries/iommu: Fix set but not used valuesQian Cai
The commit b7d6bf4fdd47 ("powerpc/pseries/pci: Remove obsolete SW invalidate") left 2 variables unused. arch/powerpc/platforms/pseries/iommu.c:108:17: warning: variable 'tces' set but not used __be64 *tcep, *tces; ^~~~ arch/powerpc/platforms/pseries/iommu.c:132:17: warning: variable 'tces' set but not used __be64 *tcep, *tces; ^~~~ Also, the commit 68c0449ea16d ("powerpc/pseries/iommu: Use memory@ nodes in max RAM address calculation") set "ranges" in ddw_memory_hotplug_max() but never use it. arch/powerpc/platforms/pseries/iommu.c: In function 'ddw_memory_hotplug_max': arch/powerpc/platforms/pseries/iommu.c:948:7: warning: variable 'ranges' set but not used int ranges, n_mem_addr_cells, n_mem_size_cells, len; ^~~~~~ Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-18powerpc/dma: remove set_dma_offsetChristoph Hellwig
There is no good reason for this helper, just opencode it. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-18powerpc/pseries: use the generic iommu bypass codeChristoph Hellwig
Use the generic iommu bypass code instead of overriding set_dma_mask. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-18powerpc/pseries: unwind dma_get_required_mask_pSeriesLP a bitChristoph Hellwig
Call dma_get_required_mask_pSeriesLP directly instead of dma_iommu_ops to simply the code a bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/powernv/pseries: Rework device adding to IOMMU groupsAlexey Kardashevskiy
The powernv platform registers IOMMU groups and adds devices to them from the pci_controller_ops::setup_bridge() hook except one case when virtual functions (SRIOV VFs) are added from a bus notifier. The pseries platform registers IOMMU groups from the pci_controller_ops::dma_bus_setup() hook and adds devices from the pci_controller_ops::dma_dev_setup() hook. The very same bus notifier used for powernv does not add devices for pseries though as __of_scan_bus() adds devices first, then it does the bus/dev DMA setup. Both platforms use iommu_add_device() which takes a device and expects it to have a valid IOMMU table struct with an iommu_table_group pointer which in turn points the iommu_group struct (which represents an IOMMU group). Although the helper seems easy to use, it relies on some pre-existing device configuration and associated data structures which it does not really need. This simplifies iommu_add_device() to take the table_group pointer directly. Pseries already has a table_group pointer handy and the bus notified is not used anyway. For powernv, this copies the existing bus notifier, makes it work for powernv only which means an easy way of getting to the table_group pointer. This was tested on VFs but should also support physical PCI hotplug. Since iommu_add_device() receives the table_group pointer directly, pseries does not do TCE cache invalidation (the hypervisor does) nor allow multiple groups per a VFIO container (in other words sharing an IOMMU table between partitionable endpoints), this removes iommu_table_group_link from pseries. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/pseries: Remove IOMMU API support for non-LPAR systemsAlexey Kardashevskiy
The pci_dma_bus_setup_pSeries and pci_dma_dev_setup_pSeries hooks are registered for the pseries platform which does not have FW_FEATURE_LPAR; these would be pre-powernv platforms which we never supported PCI pass through for anyway so remove it. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/pseries/iommu: Use memory@ nodes in max RAM address calculationAlexey Kardashevskiy
We might have memory@ nodes with "linux,usable-memory" set to zero (for example, to replicate powernv's behaviour for GPU coherent memory) which means that the memory needs an extra initialization but since it can be used afterwards, the pseries platform will try mapping it for DMA so the DMA window needs to cover those memory regions too; if the window cannot cover new memory regions, the memory onlining fails. This walks through the memory nodes to find the highest RAM address to let a huge DMA window cover that too in case this memory gets onlined later. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-02Merge tag 'powerpc-4.16-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights: - Enable support for memory protection keys aka "pkeys" on Power7/8/9 when using the hash table MMU. - Extend our interrupt soft masking to support masking PMU interrupts as well as "normal" interrupts, and then use that to implement local_t for a ~4x speedup vs the current atomics-based implementation. - A new driver "ocxl" for "Open Coherent Accelerator Processor Interface (OpenCAPI)" devices. - Support for new device tree properties on PowerVM to describe hotpluggable memory and devices. - Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE to the 64-bit VDSO. - Freescale updates from Scott: fixes for CPM GPIO and an FSL PCI erratum workaround, plus a minor cleanup patch. As well as quite a lot of other changes all over the place, and small fixes and cleanups as always. Thanks to: Alan Modra, Alastair D'Silva, Alexey Kardashevskiy, Alistair Popple, Andreas Schwab, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anshuman Khandual, Anton Blanchard, Arnd Bergmann, Balbir Singh, Benjamin Herrenschmidt, Bhaktipriya Shridhar, Bryant G. Ly, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Cyril Bur, David Gibson, Desnes A. Nunes do Rosario, Dmitry Torokhov, Frederic Barrat, Geert Uytterhoeven, Guilherme G. Piccoli, Gustavo A. R. Silva, Gustavo Romero, Ivan Mikhaylov, Joakim Tjernlund, Joe Perches, Josh Poimboeuf, Juan J. Alvarez, Julia Cartwright, Kamalesh Babulal, Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu Malaterre, Michael Bringmann, Michael Hanselmann, Michael Neuling, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Philippe Bergheaud, Ram Pai, Russell Currey, Santosh Sivaraj, Scott Wood, Seth Forshee, Simon Guo, Stewart Smith, Sukadev Bhattiprolu, Thiago Jung Bauermann, Vaibhav Jain, Vasyl Gomonovych" * tag 'powerpc-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (199 commits) powerpc/mm/radix: Fix build error when RADIX_MMU=n macintosh/ams-input: Use true and false for boolean values macintosh: change some data types from int to bool powerpc/watchdog: Print the NIP in soft_nmi_interrupt() powerpc/watchdog: regs can't be null in soft_nmi_interrupt() powerpc/watchdog: Tweak watchdog printks powerpc/cell: Remove axonram driver rtc-opal: Fix handling of firmware error codes, prevent busy loops powerpc/mpc52xx_gpt: make use of raw_spinlock variants macintosh/adb: Properly mark continued kernel messages powerpc/pseries: Fix cpu hotplug crash with memoryless nodes powerpc/numa: Ensure nodes initialized for hotplug powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes powerpc/kernel: Block interrupts when updating TIDR powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn powerpc/mm/nohash: do not flush the entire mm when range is a single page powerpc/pseries: Add Initialization of VF Bars powerpc/pseries/pci: Associate PEs to VFs in configure SR-IOV powerpc/eeh: Add EEH notify resume sysfs powerpc/eeh: Add EEH operations to notify resume ...
2018-01-10powerpc: rename dma_direct_ to dma_nommu_Christoph Hellwig
We want to use the dma_direct_ namespace for a generic implementation, so rename powerpc to the second best choice: dma_nommu_. Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-12-04powerpc: Use pr_warn instead of pr_warningJoe Perches
At some point, pr_warning will be removed so all logging messages use a consistent <prefix>_warn style. Update arch/powerpc/ Miscellanea: o Coalesce formats o Realign arguments o Use %s, __func__ instead of embedded function names o Remove unnecessary line continuations Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Geoff Levand <geoff@infradead.org> [mpe: Rebase due to some %pOF changes.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-22powerpc/pseries: Cleanup error handling in iommu_pseries_alloc_group()Markus Elfring
Although kfree(NULL) is legal, it's a bit lazy to rely on that to implement the error handling. So do it the normal Linux way using labels for each failure path. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> [mpe: Squash a few patches and rewrite change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-23powerpc: Convert to using %pOF instead of full_nameRob Herring
Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Anatolij Gustschin <agust@denx.de> Cc: Scott Wood <oss@buserror.net> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linuxppc-dev@lists.ozlabs.org Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-28powerpc/pseries: Enable VFIOAlexey Kardashevskiy
This enables VFIO on pseries host in order to allow VFIO in nested guest under PR KVM or DPDK in a HV guest. This adds support of the VFIO_SPAPR_TCE_IOMMU type. This adds exchange() callback to allow TCE updates by the SPAPR TCE IOMMU driver in VFIO. This initializes DMA32 window parameters in iommu_table_group as as this does not implement VFIO_SPAPR_TCE_v2_IOMMU and VFIO_SPAPR_TCE_IOMMU just reuses the existing DMA32 window. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-30powerpc/vfio_spapr_tce: Add reference counting to iommu_tableAlexey Kardashevskiy
So far iommu_table obejcts were only used in virtual mode and had a single owner. We are going to change this by implementing in-kernel acceleration of DMA mapping requests. The proposed acceleration will handle requests in real mode and KVM will keep references to tables. This adds a kref to iommu_table and defines new helpers to update it. This replaces iommu_free_table() with iommu_tce_table_put() and makes iommu_free_table() static. iommu_tce_table_get() is not used in this patch but it will be in the following patch. Since this touches prototypes, this also removes @node_name parameter as it has never been really useful on powernv and carrying it for the pseries platform code to iommu_free_table() seems to be quite useless as well. This should cause no behavioral change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-27scripts/spelling.txt: add "partiton" pattern and fix typo instancesMasahiro Yamada
Fix typos and add the following to the scripts/spelling.txt: partiton||partition Link: http://lkml.kernel.org/r/1481573103-11329-7-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-04dma-mapping: use unsigned long for dma_attrsKrzysztof Kozlowski
The dma-mapping core and the implementations do not change the DMA attributes passed by pointer. Thus the pointer can point to const data. However the attributes do not have to be a bitfield. Instead unsigned long will do fine: 1. This is just simpler. Both in terms of reading the code and setting attributes. Instead of initializing local attributes on the stack and passing pointer to it to dma_set_attr(), just set the bits. 2. It brings safeness and checking for const correctness because the attributes are passed by value. Semantic patches for this change (at least most of them): virtual patch virtual context @r@ identifier f, attrs; @@ f(..., - struct dma_attrs *attrs + unsigned long attrs , ...) { ... } @@ identifier r.f; @@ f(..., - NULL + 0 ) and // Options: --all-includes virtual patch virtual context @r@ identifier f, attrs; type t; @@ t f(..., struct dma_attrs *attrs); @@ identifier r.f; @@ f(..., - NULL + 0 ) Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> Acked-by: Mark Salter <msalter@redhat.com> [c6x] Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris] Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm] Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp] Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core] Acked-by: David Vrabel <david.vrabel@citrix.com> [xen] Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb] Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32] Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc] Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-17powerpc/pseries/pci: Remove obsolete SW invalidateBenjamin Herrenschmidt
That was used by some old IBM internal bringup tools and is no longer relevant. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-06powerpc/pseries: Fix PCI config address for DDWGavin Shan
In commit 8445a87f7092 "powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism", the PE address was replaced with the PCI config address in order to remove dependency on EEH. According to PAPR spec, firmware (pHyp or QEMU) should accept "xxBBSSxx" format PCI config address, not "xxxxBBSS" provided by the patch. Note that "BB" is PCI bus number and "SS" is the combination of slot and function number. This fixes the PCI address passed to DDW RTAS calls. Fixes: 8445a87f7092 ("powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism") Cc: stable@vger.kernel.org # v3.4+ Reported-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-12powerpc/iommu: Remove the dependency on EEH struct in DDW mechanismGuilherme G. Piccoli
Commit 39baadbf36ce ("powerpc/eeh: Remove eeh information from pci_dn") changed the pci_dn struct by removing its EEH-related members. As part of this clean-up, DDW mechanism was modified to read the device configuration address from eeh_dev struct. As a consequence, now if we disable EEH mechanism on kernel command-line for example, the DDW mechanism will fail, generating a kernel oops by dereferencing a NULL pointer (which turns to be the eeh_dev pointer). This patch just changes the configuration address calculation on DDW functions to a manual calculation based on pci_dn members instead of using eeh_dev-based address. No functional changes were made. This was tested on pSeries, both in PHyp and qemu guest. Fixes: 39baadbf36ce ("powerpc/eeh: Remove eeh information from pci_dn") Cc: stable@vger.kernel.org # v3.4+ Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-15powerpc/pseries: Remove use of CONFIG_PCIMichael Ellerman
Now that we always have CONFIG_PCI=y for pseries, we can stop guarding code with CONFIG_PCI ifdefs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-13powerpc/iommu: Cleanup setting of DMA base/offsetBenjamin Herrenschmidt
Now that the table and the offset can co-exist, we no longer need to flip/flop, we can just establish both once at boot time. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-11powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_groupAlexey Kardashevskiy
So far one TCE table could only be used by one IOMMU group. However IODA2 hardware allows programming the same TCE table address to multiple PE allowing sharing tables. This replaces a single pointer to a group in a iommu_table struct with a linked list of groups which provides the way of invalidating TCE cache for every PE when an actual TCE table is updated. This adds pnv_pci_link_table_and_group() and pnv_pci_unlink_table_and_group() helpers to manage the list. However without VFIO, it is still going to be a single IOMMU group per iommu_table. This changes iommu_add_device() to add a device to a first group from the group list of a table as it is only called from the platform init code or PCI bus notifier and at these moments there is only one group per table. This does not change TCE invalidation code to loop through all attached groups in order to simplify this patch and because it is not really needed in most cases. IODA2 is fixed in a later patch. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [aw: for the vfio related changes] Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-11powerpc/spapr: vfio: Replace iommu_table with iommu_table_groupAlexey Kardashevskiy
Modern IBM POWERPC systems support multiple (currently two) TCE tables per IOMMU group (a.k.a. PE). This adds a iommu_table_group container for TCE tables. Right now just one table is supported. This defines iommu_table_group struct which stores pointers to iommu_group and iommu_table(s). This replaces iommu_table with iommu_table_group where iommu_table was used to identify a group: - iommu_register_group(); - iommudata of generic iommu_group; This removes @data from iommu_table as it_table_group provides same access to pnv_ioda_pe. For IODA, instead of embedding iommu_table, the new iommu_table_group keeps pointers to those. The iommu_table structs are allocated dynamically. For P5IOC2, both iommu_table_group and iommu_table are embedded into PE struct. As there is no EEH and SRIOV support for P5IOC2, iommu_free_table() should not be called on iommu_table struct pointers so we can keep it embedded in pnv_phb::p5ioc2. For pSeries, this replaces multiple calls of kzalloc_node() with a new iommu_pseries_alloc_group() helper and stores the table group struct pointer into the pci_dn struct. For release, a iommu_table_free_group() helper is added. This moves iommu_table struct allocation from SR-IOV code to the generic DMA initialization code in pnv_pci_ioda_setup_dma_pe and pnv_pci_ioda2_setup_dma_pe as this is where DMA is actually initialized. This change is here because those lines had to be changed anyway. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [aw: for the vfio related changes] Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-11powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_tableAlexey Kardashevskiy
This adds a iommu_table_ops struct and puts pointer to it into the iommu_table struct. This moves tce_build/tce_free/tce_get/tce_flush callbacks from ppc_md to the new struct where they really belong to. This adds the requirement for @it_ops to be initialized before calling iommu_init_table() to make sure that we do not leave any IOMMU table with iommu_table_ops uninitialized. This is not a parameter of iommu_init_table() though as there will be cases when iommu_init_table() will not be called on TCE tables, for example - VFIO. This does s/tce_build/set/, s/tce_free/clear/ and removes "tce_" redundant prefixes. This removes tce_xxx_rm handlers from ppc_md but does not add them to iommu_table_ops as this will be done later if we decide to support TCE hypercalls in real mode. This removes _vm callbacks as only virtual mode is supported by now so this also removes @rm parameter. For pSeries, this always uses tce_buildmulti_pSeriesLP/ tce_buildmulti_pSeriesLP. This changes multi callback to fall back to tce_build_pSeriesLP/tce_free_pSeriesLP if FW_FEATURE_MULTITCE is not present. The reason for this is we still have to support "multitce=off" boot parameter in disable_multitce() and we do not want to walk through all IOMMU tables in the system and replace "multi" callbacks with single ones. For powernv, this defines _ops per PHB type which are P5IOC2/IODA1/IODA2. This makes the callbacks for them public. Later patches will extend callbacks for IODA1/2. No change in behaviour is expected. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-11powerpc/iommu: Put IOMMU group explicitlyAlexey Kardashevskiy
So far an iommu_table lifetime was the same as PE. Dynamic DMA windows will change this and iommu_free_table() will not always require the group to be released. This moves iommu_group_put() out of iommu_free_table(). This adds a iommu_pseries_free_table() helper which does iommu_group_put() and iommu_free_table(). Later it will be changed to receive a table_group and we will have to change less lines then. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-11powerpc/iommu/powernv: Get rid of set_iommu_table_base_and_groupAlexey Kardashevskiy
The set_iommu_table_base_and_group() name suggests that the function sets table base and add a device to an IOMMU group. The actual purpose for table base setting is to put some reference into a device so later iommu_add_device() can get the IOMMU group reference and the device to the group. At the moment a group cannot be explicitly passed to iommu_add_device() as we want it to work from the bus notifier, we can fix it later and remove confusing calls of set_iommu_table_base(). This replaces set_iommu_table_base_and_group() with a couple of set_iommu_table_base() + iommu_add_device() which makes reading the code easier. This adds few comments why set_iommu_table_base() and iommu_add_device() are called where they are called. For IODA1/2, this essentially removes iommu_add_device() call from the pnv_pci_ioda_dma_dev_setup() as it will always fail at this particular place: - for physical PE, the device is already attached by iommu_add_device() in pnv_pci_ioda_setup_dma_pe(); - for virtual PE, the sysfs entries are not ready to create all symlinks so actual adding is happening in tce_iommu_bus_notifier. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11powerpc/pseries: Move controller ops from ppc_md to controller_opsDaniel Axtens
This moves the pSeries platform to use the pci_controller_ops structure, rather than ppc_md for PCI controller operations. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-04powerpc/iommu: Remove IOMMU device references via bus notifierNishanth Aravamudan
After d905c5df9aef ("PPC: POWERNV: move iommu_add_device earlier"), the refcnt on the kobject backing the IOMMU group for a PCI device is elevated by each call to pci_dma_dev_setup_pSeriesLP() (via set_iommu_table_base_and_group). When we go to dlpar a multi-function PCI device out: iommu_reconfig_notifier -> iommu_free_table -> iommu_group_put BUG_ON(tbl->it_group) We trip this BUG_ON, because there are still references on the table, so it is not freed. Fix this by moving the powernv bus notifier to common code and calling it for both powernv and pseries. Fixes: d905c5df9aef ("PPC: POWERNV: move iommu_add_device earlier") Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Tested-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-11Merge tag 'powerpc-3.19-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc updates from Michael Ellerman: "Some nice cleanups like removing bootmem, and removal of __get_cpu_var(). There is one patch to mm/gup.c. This is the generic GUP implementation, but is only used by us and arm(64). We have an ack from Steve Capper, and although we didn't get an ack from Andrew he told us to take the patch through the powerpc tree. There's one cxl patch. This is in drivers/misc, but Greg said he was happy for us to manage fixes for it. There is an infrastructure patch to support an IPMI driver for OPAL. There is also an RTC driver for OPAL. We weren't able to get any response from the RTC maintainer, Alessandro Zummo, so in the end we just merged the driver. The usual batch of Freescale updates from Scott" * tag 'powerpc-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (101 commits) powerpc/powernv: Return to cpu offline loop when finished in KVM guest powerpc/book3s: Fix partial invalidation of TLBs in MCE code. powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault powerpc/xmon: Cleanup the breakpoint flags powerpc/xmon: Enable HW instruction breakpoint on POWER8 powerpc/mm/thp: Use tlbiel if possible powerpc/mm/thp: Remove code duplication powerpc/mm/hugetlb: Sanity check gigantic hugepage count powerpc/oprofile: Disable pagefaults during user stack read powerpc/mm: Check for matching hpte without taking hpte lock powerpc: Drop useless warning in eeh_init() powerpc/powernv: Cleanup unused MCE definitions/declarations. powerpc/eeh: Dump PHB diag-data early powerpc/eeh: Recover EEH error on ownership change for BCM5719 powerpc/eeh: Set EEH_PE_RESET on PE reset powerpc/eeh: Refactor eeh_reset_pe() powerpc: Remove more traces of bootmem powerpc/pseries: Initialise nvram_pstore_info's buf_lock cxl: Name interrupts in /proc/interrupt cxl: Return error to PSL if IRQ demultiplexing fails & print clearer warning ...
2014-11-24of/reconfig: Always use the same structure for notifiersGrant Likely
The OF_RECONFIG notifier callback uses a different structure depending on whether it is a node change or a property change. This is silly, and not very safe. Rework the code to use the same data structure regardless of the type of notifier. Signed-off-by: Grant Likely <grant.likely@linaro.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com> Cc: <linuxppc-dev@lists.ozlabs.org>
2014-11-10powerpc/pseries: delete unneeded test before of_node_putJulia Lawall
Of_node_put supports NULL as its argument, so the initial test is not necessary. Suggested by Uwe Kleine-König. The semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression e; @@ -if (e) of_node_put(e); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-03powerpc: Replace __get_cpu_var usesChristoph Lameter
This still has not been merged and now powerpc is the only arch that does not have this change. Sorry about missing linuxppc-dev before. V2->V2 - Fix up to work against 3.18-rc1 __get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and less registers are used when code is generated. At the end of the patch set all uses of __get_cpu_var have been removed so the macro is removed too. The patch set includes passes over all arches as well. Once these operations are used throughout then specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by f.e. using a global register that may be set to the per cpu base. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int *x = &__get_cpu_var(y); Converts to int *x = this_cpu_ptr(&y); 2. Same as #1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int *x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, y); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(&x, this_cpu_ptr(&y), sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to __this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> Signed-off-by: Christoph Lameter <cl@linux.com> [mpe: Fix build errors caused by set/or_softirq_pending(), and rework assignment in __set_breakpoint() to use memcpy().] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15powerpc/pseries: Use dump_stack instead of show_stackAnton Blanchard
We can use the simpler dump_stack() instead of show_stack(current, __get_SP()) Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03powerpc/iommu/ddw: Fix endiannessAlexey Kardashevskiy
rtas_call() accepts and returns values in CPU endianness. The ddw_query_response and ddw_create_response structs members are defined and treated as BE but as they are passed to rtas_call() as (u32 *) and they get byteswapped automatically, the data is CPU-endian. This fixes ddw_query_response and ddw_create_response definitions and use. of_read_number() is designed to work with device tree cells - it assumes the input is big-endian and returns data in CPU-endian. However due to the ddw_create_response struct fix, create.addr_hi/lo are already CPU-endian so do not byteswap them. ddw_avail is a pointer to the "ibm,ddw-applicable" property which contains 3 cells which are big-endian as it is a device tree. rtas_call() accepts a RTAS token in CPU-endian. This makes use of of_property_read_u32_array to byte swap and avoid the need for a number of be32_to_cpu calls. Cc: stable@vger.kernel.org # v3.13+ Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> [aik: folded Anton's patch with of_property_read_u32_array] Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-08-13powerpc/pseries: Avoid deadlock on removing ddwGavin Shan
Function remove_ddw() could be called in of_reconfig_notifier and we potentially remove the dynamic DMA window property, which invokes of_reconfig_notifier again. Eventually, it leads to the deadlock as following backtrace shows. The patch fixes the above issue by deferring releasing the dynamic DMA window property while releasing the device node. ============================================= [ INFO: possible recursive locking detected ] 3.16.0+ #428 Tainted: G W --------------------------------------------- drmgr/2273 is trying to acquire lock: ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \ .__blocking_notifier_call_chain+0x40/0x78 but task is already holding lock: ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \ .__blocking_notifier_call_chain+0x40/0x78 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock((of_reconfig_chain).rwsem); lock((of_reconfig_chain).rwsem); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by drmgr/2273: #0: (sb_writers#4){.+.+.+}, at: [<c0000000001cbe70>] \ .vfs_write+0xb0/0x1f8 #1: ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \ .__blocking_notifier_call_chain+0x40/0x78 stack backtrace: CPU: 17 PID: 2273 Comm: drmgr Tainted: G W 3.16.0+ #428 Call Trace: [c0000000137e7000] [c000000000013d9c] .show_stack+0x88/0x148 (unreliable) [c0000000137e70b0] [c00000000083cd34] .dump_stack+0x7c/0x9c [c0000000137e7130] [c0000000000b8afc] .__lock_acquire+0x128c/0x1c68 [c0000000137e7280] [c0000000000b9a4c] .lock_acquire+0xe8/0x104 [c0000000137e7350] [c00000000083588c] .down_read+0x4c/0x90 [c0000000137e73e0] [c000000000091890] .__blocking_notifier_call_chain+0x40/0x78 [c0000000137e7490] [c000000000091900] .blocking_notifier_call_chain+0x38/0x48 [c0000000137e7520] [c000000000682a28] .of_reconfig_notify+0x34/0x5c [c0000000137e75b0] [c000000000682a9c] .of_property_notify+0x4c/0x54 [c0000000137e7650] [c000000000682bf0] .of_remove_property+0x30/0xd4 [c0000000137e76f0] [c000000000052a44] .remove_ddw+0x144/0x168 [c0000000137e7790] [c000000000053204] .iommu_reconfig_notifier+0x30/0xe0 [c0000000137e7820] [c00000000009137c] .notifier_call_chain+0x6c/0xb4 [c0000000137e78c0] [c0000000000918ac] .__blocking_notifier_call_chain+0x5c/0x78 [c0000000137e7970] [c000000000091900] .blocking_notifier_call_chain+0x38/0x48 [c0000000137e7a00] [c000000000682a28] .of_reconfig_notify+0x34/0x5c [c0000000137e7a90] [c000000000682e14] .of_detach_node+0x44/0x1fc [c0000000137e7b40] [c0000000000518e4] .ofdt_write+0x3ac/0x688 [c0000000137e7c20] [c000000000238430] .proc_reg_write+0xb8/0xd4 [c0000000137e7cd0] [c0000000001cbeac] .vfs_write+0xec/0x1f8 [c0000000137e7d70] [c0000000001cc3b0] .SyS_write+0x58/0xa0 [c0000000137e7e30] [c00000000000a064] syscall_exit+0x0/0x98 Cc: stable@vger.kernel.org Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-15Revert "pseries/iommu: Remove DDW on kexec"Nishanth Aravamudan
After reverting 25ebc45b93452d0bc60271f178237123c4b26808 ("powerpc/pseries/iommu: remove default window before attempting DDW manipulation"), we no longer remove the base window in enable_ddw. Therefore, we no longer need to reset the DMA window state in find_existing_ddw_windows(). We can instead go back to what was done before, which simply reuses the previous configuration, if any. Further, this removes the final caller of the reset-pe-dma-windows call, so remove those functions. This fixes an EEH on kdump with the ipr driver. The EEH occurs, because the initcall removes the DDW configuration (64-bit DMA window), but doesn't ensure the ops are via the IOMMU -- a DMA operation occurs during probe (still investigating this) and we EEH. This reverts commit 14b6f00f8a4fdec5ccd45a0710284de301a61628. Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-15Revert "powerpc/pseries/iommu: remove default window before attempting DDW ↵Nishanth Aravamudan
manipulation" Ben rightfully pointed out that there is a race in the "newer" DDW code. Presuming we are running on recent enough firmware that supports the "reset" DDW manipulation call, we currently always remove the base 32-bit DMA window in order to maximize the resources for Phyp when creating the 64-bit window. However, this can be problematic for the case where multiple functions are in the same PE (partitionable endpoint), where some funtions might be 32-bit DMA only. All of a sudden, the only functional DMA window for such functions is gone. We will have serious errors in such situations. The best solution is simply to revert the extension to the DDW code where we ever remove the base DMA window. This reverts commit 25ebc45b93452d0bc60271f178237123c4b26808. Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30powerpc/iommu: Add it_page_shift field to determine iommu page sizeAlistair Popple
This patch adds a it_page_shift field to struct iommu_table and initiliases it to 4K for all platforms. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30powerpc/iommu: Update constant names to reflect their hardcoded page sizeAlistair Popple
The powerpc iommu uses a hardcoded page size of 4K. This patch changes the name of the IOMMU_PAGE_* macros to reflect the hardcoded values. A future patch will use the existing names to support dynamic page sizes. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-05PPC: POWERNV: move iommu_add_device earlierAlexey Kardashevskiy
The current implementation of IOMMU on sPAPR does not use iommu_ops and therefore does not call IOMMU API's bus_set_iommu() which 1) sets iommu_ops for a bus 2) registers a bus notifier Instead, PCI devices are added to IOMMU groups from subsys_initcall_sync(tce_iommu_init) which does basically the same thing without using iommu_ops callbacks. However Freescale PAMU driver (https://lkml.org/lkml/2013/7/1/158) implements iommu_ops and when tce_iommu_init is called, every PCI device is already added to some group so there is a conflict. This patch does 2 things: 1. removes the loop in which PCI devices were added to groups and adds explicit iommu_add_device() calls to add devices as soon as they get the iommu_table pointer assigned to them. 2. moves a bus notifier to powernv code in order to avoid conflict with the notifier from Freescale driver. iommu_add_device() and iommu_del_device() are public now. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-10-30powerpc/pseries: Fix endian issues in pseries iommu codeAnton Blanchard
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-27pseries: Move plpar_wrapper.h to powerpc common include/asm location.Deepthi Dharwar
As a part of pseries_idle backend driver cleanup to make the code common to both pseries and powernv platforms, it is necessary to move the backend-driver code to drivers/cpuidle. As a pre-requisite for that, it is essential to move plpar_wrapper.h to include/asm. Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14powerpc: of_parse_dma_window should take a __be32 *dma_windowAnton Blanchard
We pass dma_window to of_parse_dma_window as a void * and then run through hoops to cast it back to a u32 array. In the process we lose endian annotation. Simplify it by just passing a __be32 * down. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/vfio: Enable on pSeries platformAlexey Kardashevskiy
The enables VFIO on the pSeries platform, enabling user space programs to access PCI devices directly. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-18powerpc/pseries: close DDW race between functions of adapterNishanth Aravamudan
Given a PCI device with multiple functions in a DDW capable slot, the following situation can be encountered: When the first function sets a 64-bit DMA mask, enable_ddw() will be called and we can fail to properly configure DDW (the most common reason being the new DMA window's size is not large enough to map all of an LPAR's memory). With the recent changes to DDW, we remove the base window in order to determine if the new window is of sufficient size to cover an LPAR's memory. We correctly replace the base window if we find that not to be the case. However, once we go through and re-configured 32-bit DMA via the IOMMU, the next function of the adapter will go through the same process. And since DDW is a characteristic of the slot itself, we are most likely going to fail again. But to determine we are going to fail the second slot, we again remove the base window -- but that is now in-use by the first function/driver, which might be issuing I/O already. To close this window, keep a list of all the failed struct device_nodes that have failed to configure DDW. If the current device_node is in that list, just fail out immediately and fall back to 32-bit DMA without doing any DDW manipulation. Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>