summaryrefslogtreecommitdiff
path: root/arch/arm/mm/dma-mapping.c
AgeCommit message (Collapse)Author
2017-05-04Merge branches 'arm/exynos', 'arm/omap', 'arm/rockchip', 'arm/mediatek', ↵Joerg Roedel
'arm/smmu', 'arm/core', 'x86/vt-d', 'x86/amd' and 'core' into next
2017-05-02xen/arm,arm64: fix xen_dma_ops after 815dd18 "Consolidate get_dma_ops..."Stefano Stabellini
The following commit: commit 815dd18788fe0d41899f51b91d0560279cf16b0d Author: Bart Van Assche <bart.vanassche@sandisk.com> Date: Fri Jan 20 13:04:04 2017 -0800 treewide: Consolidate get_dma_ops() implementations rearranges get_dma_ops in a way that xen_dma_ops are not returned when running on Xen anymore, dev->dma_ops is returned instead (see arch/arm/include/asm/dma-mapping.h:get_arch_dma_ops and include/linux/dma-mapping.h:get_dma_ops). Fix the problem by storing dev->dma_ops in dev_archdata, and setting dev->dma_ops to xen_dma_ops. This way, xen_dma_ops is returned naturally by get_dma_ops. The Xen code can retrieve the original dev->dma_ops from dev_archdata when needed. It also allows us to remove __generic_dma_ops from common headers. Signed-off-by: Stefano Stabellini <sstabellini@kernel.org> Tested-by: Julien Grall <julien.grall@arm.com> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: <stable@vger.kernel.org> [4.11+] CC: linux@armlinux.org.uk CC: catalin.marinas@arm.com CC: will.deacon@arm.com CC: boris.ostrovsky@oracle.com CC: jgross@suse.com CC: Julien Grall <julien.grall@arm.com>
2017-04-29arm: dma-mapping: Don't override dma_ops in arch_setup_dma_ops()Laurent Pinchart
The arch_setup_dma_ops() function is in charge of setting dma_ops with a call to set_dma_ops(). set_dma_ops() is also called from - highbank and mvebu bus notifiers - dmabounce (to be replaced with swiotlb) - arm_iommu_attach_device (arm_iommu_attach_device is itself called from IOMMU and bus master device drivers) To allow the arch_setup_dma_ops() call to be moved from device add time to device probe time we must ensure that dma_ops already setup by any of the above callers will not be overriden. Aftering replacing dmabounce with swiotlb, converting IOMMU drivers to of_xlate and taking care of highbank and mvebu, the workaround should be removed. Signed-off-by: Sricharan R <sricharan@codeaurora.org> Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Tested-by: Ralph Sennhauser <ralph.sennhauser@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-03-29ARM: dma-mapping: disallow dma_get_sgtable() for non-kernel managed memoryRussell King
dma_get_sgtable() tries to create a scatterlist table containing valid struct page pointers for the coherent memory allocation passed in to it. However, memory can be declared via dma_declare_coherent_memory(), or via other reservation schemes which means that coherent memory is not guaranteed to be backed by struct pages. In such cases, the resulting scatterlist table contains pointers to invalid pages, which causes kernel oops later. This patch adds detection of such memory, and refuses to create a scatterlist table for such memory. Reported-by: Shuah Khan <shuahkhan@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-02-28Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: - nommu updates from Afzal Mohammed cleaning up the vectors support - allow DMA memory "mapping" for nommu Benjamin Gaignard - fixing a correctness issue with R_ARM_PREL31 relocations in the module linker - add strlen() prototype for the decompressor - support for DEBUG_VIRTUAL from Florian Fainelli - adjusting memory bounds after memory reservations have been registered - unipher cache handling updates from Masahiro Yamada - initrd and Thumb Kconfig cleanups * 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (23 commits) ARM: mm: round the initrd reservation to page boundaries ARM: mm: clean up initrd initialisation ARM: mm: move initrd init code out of arm_memblock_init() ARM: 8655/1: improve NOMMU definition of pgprot_*() ARM: 8654/1: decompressor: add strlen prototype ARM: 8652/1: cache-uniphier: clean up active way setup code ARM: 8651/1: cache-uniphier: include <linux/errno.h> instead of <linux/types.h> ARM: 8650/1: module: handle negative R_ARM_PREL31 addends correctly ARM: 8649/2: nommu: remove Hivecs configuration is asm ARM: 8648/2: nommu: display vectors base ARM: 8647/2: nommu: dynamic exception base address setting ARM: 8646/1: mmu: decouple VECTORS_BASE from Kconfig ARM: 8644/1: Reduce "CPU: shutdown" message to debug level ARM: 8641/1: treewide: Replace uses of virt_to_phys with __pa_symbol ARM: 8640/1: Add support for CONFIG_DEBUG_VIRTUAL ARM: 8639/1: Define KERNEL_START and KERNEL_END ARM: 8638/1: mtd: lart: Rename partition defines to be prefixed with PART_ ARM: 8637/1: Adjust memory boundaries after reservations ARM: 8636/1: Cleanup sanity_check_meminfo ARM: add CPU_THUMB_CAPABLE to indicate possible Thumb support ...
2017-02-25Merge tag 'for-next-dma_ops' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma DMA mapping updates from Doug Ledford: "Drop IB DMA mapping code and use core DMA code instead. Bart Van Assche noted that the ib DMA mapping code was significantly similar enough to the core DMA mapping code that with a few changes it was possible to remove the IB DMA mapping code entirely and switch the RDMA stack to use the core DMA mapping code. This resulted in a nice set of cleanups, but touched the entire tree and has been kept separate for that reason." * tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits) IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it IB/core: Remove ib_device.dma_device nvme-rdma: Switch from dma_device to dev.parent RDS: net: Switch from dma_device to dev.parent IB/srpt: Modify a debug statement IB/srp: Switch from dma_device to dev.parent IB/iser: Switch from dma_device to dev.parent IB/IPoIB: Switch from dma_device to dev.parent IB/rxe: Switch from dma_device to dev.parent IB/vmw_pvrdma: Switch from dma_device to dev.parent IB/usnic: Switch from dma_device to dev.parent IB/qib: Switch from dma_device to dev.parent IB/qedr: Switch from dma_device to dev.parent IB/ocrdma: Switch from dma_device to dev.parent IB/nes: Remove a superfluous assignment statement IB/mthca: Switch from dma_device to dev.parent IB/mlx5: Switch from dma_device to dev.parent IB/mlx4: Switch from dma_device to dev.parent IB/i40iw: Remove a superfluous assignment statement IB/hns: Switch from dma_device to dev.parent ...
2017-02-24mm: wire up GFP flag passing in dma_alloc_from_contiguousLucas Stach
The callers of the DMA alloc functions already provide the proper context GFP flags. Make sure to pass them through to the CMA allocator, to make the CMA compaction context aware. Link: http://lkml.kernel.org/r/20170127172328.18574-3-l.stach@pengutronix.de Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Radim Krcmar <rkrcmar@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Alexander Graf <agraf@suse.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-24treewide: Constify most dma_map_ops structuresBart Van Assche
Most dma_map_ops structures are never modified. Constify these structures such that these can be write-protected. This patch has been generated as follows: git grep -l 'struct dma_map_ops' | xargs -d\\n sed -i \ -e 's/struct dma_map_ops/const struct dma_map_ops/g' \ -e 's/const struct dma_map_ops {/struct dma_map_ops {/g' \ -e 's/^const struct dma_map_ops;$/struct dma_map_ops;/' \ -e 's/const const struct dma_map_ops /const struct dma_map_ops /g'; sed -i -e 's/const \(struct dma_map_ops intel_dma_ops\)/\1/' \ $(git grep -l 'struct dma_map_ops intel_dma_ops'); sed -i -e 's/const \(struct dma_map_ops dma_iommu_ops\)/\1/' \ $(git grep -l 'struct dma_map_ops' | grep ^arch/powerpc); sed -i -e '/^struct vmd_dev {$/,/^};$/ s/const \(struct dma_map_ops[[:blank:]]dma_ops;\)/\1/' \ -e '/^static void vmd_setup_dma_ops/,/^}$/ s/const \(struct dma_map_ops \*dest\)/\1/' \ -e 's/const \(struct dma_map_ops \*dest = \&vmd->dma_ops\)/\1/' \ drivers/pci/host/*.c sed -i -e '/^void __init pci_iommu_alloc(void)$/,/^}$/ s/dma_ops->/intel_dma_ops./' arch/ia64/kernel/pci-dma.c sed -i -e 's/static const struct dma_map_ops sn_dma_ops/static struct dma_map_ops sn_dma_ops/' arch/ia64/sn/pci/pci_dma.c sed -i -e 's/(const struct dma_map_ops \*)//' drivers/misc/mic/bus/vop_bus.c Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Russell King <linux@armlinux.org.uk> Cc: x86@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-19arm/dma-mapping: Implement DMA_ATTR_PRIVILEGEDSricharan R
The newly added DMA_ATTR_PRIVILEGED is useful for creating mappings that are only accessible to privileged DMA engines. Adding it to the arm dma-mapping.c so that the ARM32 DMA IOMMU mapper can make use of it. Signed-off-by: Sricharan R <sricharan@codeaurora.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-01-10ARM: 8633/1: nommu: allow mmap when !CONFIG_MMUBenjamin Gaignard
commit ab6494f0c96f ("nommu: Add noMMU support to the DMA API") have add CONFIG_MMU compilation flag but that prohibit to use dma_mmap_wc() when the platform doesn't have MMU. This patch call vm_iomap_memory() in noMMU case to test if addresses are correct and set vma->vm_flags rather than all return an error. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-11-15ARM: 8628/1: dma-mapping: preallocate DMA-debug hash tables in core_initcallMarek Szyprowski
fs_initcall is definitely too late to initialize DMA-debug hash tables, because some drivers might get probed and use DMA mapping framework already in core_initcall. Late initialization of DMA-debug results in false warning about accessing memory, that was not allocated, like this one: ------------[ cut here ]------------ WARNING: CPU: 5 PID: 1 at lib/dma-debug.c:1104 check_unmap+0xa1c/0xe50 exynos-sysmmu 10a60000.sysmmu: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x000000006ebd0000] [size=16384 bytes] Modules linked in: CPU: 5 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc5-00028-g39dde3d-dirty #44 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [<c0119dd4>] (unwind_backtrace) from [<c01122bc>] (show_stack+0x20/0x24) [<c01122bc>] (show_stack) from [<c062714c>] (dump_stack+0x84/0xa0) [<c062714c>] (dump_stack) from [<c0132560>] (__warn+0x14c/0x180) [<c0132560>] (__warn) from [<c01325dc>] (warn_slowpath_fmt+0x48/0x50) [<c01325dc>] (warn_slowpath_fmt) from [<c06814f8>] (check_unmap+0xa1c/0xe50) [<c06814f8>] (check_unmap) from [<c06819c4>] (debug_dma_unmap_page+0x98/0xc8) [<c06819c4>] (debug_dma_unmap_page) from [<c076c3e8>] (exynos_iommu_domain_free+0x158/0x380) [<c076c3e8>] (exynos_iommu_domain_free) from [<c0764a30>] (iommu_domain_free+0x34/0x60) [<c0764a30>] (iommu_domain_free) from [<c011f168>] (release_iommu_mapping+0x30/0xb8) [<c011f168>] (release_iommu_mapping) from [<c011f23c>] (arm_iommu_release_mapping+0x4c/0x50) [<c011f23c>] (arm_iommu_release_mapping) from [<c0b061ac>] (s5p_mfc_probe+0x640/0x80c) [<c0b061ac>] (s5p_mfc_probe) from [<c07e6750>] (platform_drv_probe+0x70/0x148) [<c07e6750>] (platform_drv_probe) from [<c07e25c0>] (driver_probe_device+0x12c/0x6b0) [<c07e25c0>] (driver_probe_device) from [<c07e2c6c>] (__driver_attach+0x128/0x17c) [<c07e2c6c>] (__driver_attach) from [<c07df74c>] (bus_for_each_dev+0x88/0xc8) [<c07df74c>] (bus_for_each_dev) from [<c07e1b6c>] (driver_attach+0x34/0x58) [<c07e1b6c>] (driver_attach) from [<c07e1350>] (bus_add_driver+0x18c/0x32c) [<c07e1350>] (bus_add_driver) from [<c07e4198>] (driver_register+0x98/0x148) [<c07e4198>] (driver_register) from [<c07e5cb0>] (__platform_driver_register+0x58/0x74) [<c07e5cb0>] (__platform_driver_register) from [<c174cb30>] (s5p_mfc_driver_init+0x1c/0x20) [<c174cb30>] (s5p_mfc_driver_init) from [<c0102690>] (do_one_initcall+0x64/0x258) [<c0102690>] (do_one_initcall) from [<c17014c0>] (kernel_init_freeable+0x3d0/0x4d0) [<c17014c0>] (kernel_init_freeable) from [<c116eeb4>] (kernel_init+0x18/0x134) [<c116eeb4>] (kernel_init) from [<c010bbd8>] (ret_from_fork+0x14/0x3c) ---[ end trace dc54c54bd3581296 ]--- This patch moves initialization of DMA-debug to core_initcall. This is safe from the initialization perspective. dma_debug_do_init() internally calls debugfs functions and debugfs also gets initialised at core_initcall(), and that is earlier than arch code in the link order, so it will get initialized just before the DMA-debug. Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-10-06Merge tag 'dmaengine-4.9-rc1' of git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds
Pull dmaengine updates from Vinod Koul: "This is bit large pile of code which bring in some nice additions: - Error reporting: we have added a new mechanism for users of dmaenegine to register a callback_result which tells them the result of the dma transaction. Right now only one user (ntb) is using it. - As we discussed on KS mailing list and pointed out NO_IRQ has no place in kernel, this also remove NO_IRQ from dmaengine subsystem (both arm and ppc users) - Support for IOMMU slave transfers and its implementation for arm. - To get better build coverage, enable COMPILE_TEST for bunch of driver, and fix the warning and sparse complaints on these. - Apart from above, usual updates spread across drivers" * tag 'dmaengine-4.9-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (169 commits) async_pq_val: fix DMA memory leak dmaengine: virt-dma: move function declarations dmaengine: omap-dma: Enable burst and data pack for SG DT: dmaengine: rcar-dmac: document R8A7743/5 support dmaengine: fsldma: Unmap region obtained by of_iomap dmaengine: jz4780: fix resource leaks on error exit return dma-debug: fix ia64 build, use PHYS_PFN dmaengine: coh901318: fix integer overflow when shifting more than 32 places dmaengine: edma: avoid uninitialized variable use dma-mapping: fix m32r build warning dma-mapping: fix ia64 build, use PHYS_PFN dmaengine: ti-dma-crossbar: enable COMPILE_TEST dmaengine: omap-dma: enable COMPILE_TEST dmaengine: edma: enable COMPILE_TEST dmaengine: ti-dma-crossbar: Fix of_device_id data parameter usage dmaengine: ti-dma-crossbar: Correct type for of_find_property() third parameter dmaengine/ARM: omap-dma: Fix the DMAengine compile test on non OMAP configs dmaengine: edma: Rename set_bits and remove unused clear_bits helper dmaengine: edma: Use correct type for of_find_property() third parameter dmaengine: edma: Fix of_device_id data parameter usage (legacy vs TPCC) ...
2016-09-26arm: dma-mapping: add {map,unmap}_resource for iommu opsNiklas Söderlund
Add methods to map/unmap device resources addresses for dma_map_ops that are IOMMU aware. This is needed to map a device MMIO register from a physical address. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-08-12ARM: 8587/1: dma-mapping: Use %zu for printing a size_t variableFabio Estevam
According to Documentation/printk-formats.txt when printing a size_t variable we should use %zu or %zx format specifiers. As we are printing a memory size value, we should better use %zu in this case. Reported-by: Frank Mori Hess <fmh6jj@gmail.com> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-04dma-mapping: use unsigned long for dma_attrsKrzysztof Kozlowski
The dma-mapping core and the implementations do not change the DMA attributes passed by pointer. Thus the pointer can point to const data. However the attributes do not have to be a bitfield. Instead unsigned long will do fine: 1. This is just simpler. Both in terms of reading the code and setting attributes. Instead of initializing local attributes on the stack and passing pointer to it to dma_set_attr(), just set the bits. 2. It brings safeness and checking for const correctness because the attributes are passed by value. Semantic patches for this change (at least most of them): virtual patch virtual context @r@ identifier f, attrs; @@ f(..., - struct dma_attrs *attrs + unsigned long attrs , ...) { ... } @@ identifier r.f; @@ f(..., - NULL + 0 ) and // Options: --all-includes virtual patch virtual context @r@ identifier f, attrs; type t; @@ t f(..., struct dma_attrs *attrs); @@ identifier r.f; @@ f(..., - NULL + 0 ) Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> Acked-by: Mark Salter <msalter@redhat.com> [c6x] Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris] Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm] Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp] Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core] Acked-by: David Vrabel <david.vrabel@citrix.com> [xen] Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb] Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32] Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc] Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-14ARM: 8561/4: dma-mapping: Fix the coherent case when iommu is usedGregory CLEMENT
When doing dma allocation with IOMMU the __iommu_alloc_atomic() was used even when the system was coherent. However, this function allocates from a non-cacheable pool, which is fine when the device is not cache coherent but won't work as expected in the device is cache coherent. Indeed, the CPU and device must access the memory using the same cacheability attributes. Moreover when the devices are coherent, the mmap call must not change the pg_prot flags in the vma struct. The arm_coherent_iommu_mmap_attrs has been updated in the same way that it was done for the arm_dma_mmap in commit 55af8a91640d ("ARM: 8387/1: arm/mm/dma-mapping.c: Add arm_coherent_dma_mmap"). Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-07-14ARM: 8561/3: dma-mapping: Don't use outer_flush_range when the L2C is coherentGregory CLEMENT
When a L2 cache controller is used in a system that provides hardware coherency, the entire outer cache operations are useless, and can be skipped. Moreover, on some systems, it is harmful as it causes deadlocks between the Marvell coherency mechanism, the Marvell PCIe controller and the Cortex-A9. In the current kernel implementation, the outer cache flush range operation is triggered by the dma_alloc function. This operation can be take place during runtime and in some circumstances may lead to the PCIe/PL310 deadlock on Armada 375/38x SoCs. This patch extends the __dma_clear_buffer() function to receive a boolean argument related to the coherency of the system. The same things is done for the calling functions. Reported-by: Nadav Haklai <nadavh@marvell.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Cc: <stable@vger.kernel.org> # v3.16+ Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-05-20Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: "Changes included in this pull request: - revert pxa2xx-flash back to using ioremap_cached() and switch memremap() to use arch_memremap_wb() - remove pci=firmware command line argument handling - remove unnecessary arm_dma_set_mask() implementation, the generic implementation will do for ARM - removal of the ARM kallsyms "hack" to work around mode switching veneers and vectors located below PAGE_OFFSET - tidy up build system output a little - add L2 cache power management DT bindings - remove duplicated local_irq_disable() in reboot paths - handle AMBA primecell devices better at registration time with PM domains (needed for Samsung SoCs) - ARM specific preparation to support Keystone II kexec" * 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8567/1: cache-uniphier: activate ways for secondary CPUs ARM: 8570/2: Documentation: devicetree: Add PL310 PM bindings ARM: 8569/1: pl2x0: Add OF control of cache power management ARM: 8568/1: reboot: remove duplicated local_irq_disable() ARM: 8566/1: drivers: amba: properly handle devices with power domains ARM: provide arm_has_idmap_alias() helper ARM: kexec: remove 512MB restriction on kexec crashdump ARM: provide improved virt_to_idmap() functionality ARM: kexec: fix crashkernel= handling ARM: 8557/1: specify install, zinstall, and uinstall as PHONY targets ARM: 8562/1: suppress "include/generated/mach-types.h is up to date." ARM: 8553/1: kallsyms: remove --page-offset command line option ARM: 8552/1: kallsyms: remove special lower address limit for CONFIG_ARM ARM: 8555/1: kallsyms: ignore ARM mode switching veneers ARM: 8548/1: dma-mapping: remove arm_dma_set_mask() ARM: 8554/1: kernel: pci: remove pci=firmware command line parameter handling ARM: memremap: implement arch_memremap_wb() memremap: add arch specific hook for MEMREMAP_WB mappings mtd: pxa2xx-flash: switch back from memremap to ioremap_cached ARM: reintroduce ioremap_cached() for creating cached I/O mappings
2016-05-09iommu: of: enforce const-ness of struct iommu_opsRobin Murphy
As a set of driver-provided callbacks and static data, there is no compelling reason for struct iommu_ops to be mutable in core code, so enforce const-ness throughout. Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-04-15ARM: 8551/2: DMA: Fix kzalloc flags in __dma_allocAlexandre Courbot
Commit 19e6e5e5392b ("ARM: 8547/1: dma-mapping: store buffer information") allocates a structure meant for internal buffer management with the GFP flags of the buffer itself. This can trigger the following safeguard in the slab/slub allocator: if (unlikely(flags & GFP_SLAB_BUG_MASK)) { pr_emerg("gfp: %un", flags & GFP_SLAB_BUG_MASK); BUG(); } Fix this by filtering the flags that make the slab allocator unhappy. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Acked-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-04-07ARM: 8548/1: dma-mapping: remove arm_dma_set_mask()Alexandre Courbot
arm_dma_set_mask() implements exactly the same behavior as the fallback that dma_set_mask() takes if the set_dma_mask op is not set. Remove it and use that fallback instead like what is already done for dma_get_mask(). Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-03-04ARM: 8546/1: dma-mapping: refactor to fix coherent+cma+gfp=0Rabin Vincent
Given a device which uses arm_coherent_dma_ops and on which dev_get_cma_area(dev) returns non-NULL, the following usage of the DMA API with gfp=0 results in memory corruption and a memory leak. p = dma_alloc_coherent(dev, sz, &dma, 0); if (p) dma_free_coherent(dev, sz, p, dma); The memory leak is because the alloc allocates using __alloc_simple_buffer() but the free attempts dma_release_from_contiguous() which does not do free anything since the page is not in the CMA area. The memory corruption is because the free calls __dma_remap() on a page which is backed by only first level page tables. The apply_to_page_range() + __dma_update_pte() loop ends up interpreting the section mapping as an addresses to a second level page table and writing the new PTE to memory which is not used by page tables. We don't have access to the GFP flags used for allocation in the free function. Fix this by adding allocator backends and using this information in the free function so that we always use the correct release routine. Fixes: 21caf3a7 ("ARM: 8398/1: arm DMA: Fix allocation from CMA for coherent DMA") Signed-off-by: Rabin Vincent <rabin.vincent@axis.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-03-04ARM: 8547/1: dma-mapping: store buffer informationRabin Vincent
Keep a list of allocated DMA buffers so that we can store metadata in alloc() which we later need in free(). Signed-off-by: Rabin Vincent <rabin.vincent@axis.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-02-11ARM: 8507/1: dma-mapping: Use DMA_ATTR_ALLOC_SINGLE_PAGES hint to optimize allocDoug Anderson
If we know that TLB efficiency will not be an issue when memory is accessed then it's not terribly important to allocate big chunks of memory. The whole point of allocating the big chunks was that it would make TLB usage efficient. As Marek Szyprowski indicated: Please note that mapping memory with larger pages significantly improves performance, especially when IOMMU has a little TLB cache. This can be easily observed when multimedia devices do processing of RGB data with 90/270 degree rotation Image rotation is distinctly an operation that needs to bounce around through memory, so it makes sense that TLB efficiency is important there. Video decoding, on the other hand, is a fairly sequential operation. During video decoding it's not expected that we'll be jumping all over memory. Decoding video is also pretty heavy and the TLB misses aren't a huge deal. Presumably most HW video acceleration users of dma-mapping will not care about huge pages and will set DMA_ATTR_ALLOC_SINGLE_PAGES. Allocating big chunks of memory is quite expensive, especially if we're doing it repeadly and memory is full. In one (out of tree) usage model it is common that arm_iommu_alloc_attrs() is called 16 times in a row, each one trying to allocate 4 MB of memory. This is called whenever the system encounters a new video, which could easily happen while the memory system is stressed out. In fact, on certain social media websites that auto-play video and have infinite scrolling, it's quite common to see not just one of these 16x4MB allocations but 2 or 3 right after another. Asking the system even to do a small amount of extra work to give us big chunks in this case is just not a good use of time. Allocating big chunks of memory is also expensive indirectly. Even if we ask the system not to do ANY extra work to allocate _our_ memory, we're still potentially eating up all big chunks in the system. Presumably there are other users in the system that aren't quite as flexible and that actually need these big chunks. By eating all the big chunks we're causing extra work for the rest of the system. We also may start making other memory allocations fail. While the system may be robust to such failures (as is the case with dwc2 USB trying to allocate buffers for Ethernet data and with WiFi trying to allocate buffers for WiFi data), it is yet another big performance hit. Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Javier Martinez Canillas <javier@osg.samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-02-11ARM: 8505/1: dma-mapping: Optimize allocationDoug Anderson
The __iommu_alloc_buffer() is expected to be called to allocate pretty sizeable buffers. Upon simple tests of video I saw it trying to allocate 4,194,304 bytes. The function tries to allocate large chunks in order to optimize IOMMU TLB usage. The current function is very, very slow. One problem is the way it keeps trying and trying to allocate big chunks. Imagine a very fragmented memory that has 4M free but no contiguous pages at all. Further imagine allocating 4M (1024 pages). We'll do the following memory allocations: - For page 1: - Try to allocate order 10 (no retry) - Try to allocate order 9 (no retry) - ... - Try to allocate order 0 (with retry, but not needed) - For page 2: - Try to allocate order 9 (no retry) - Try to allocate order 8 (no retry) - ... - Try to allocate order 0 (with retry, but not needed) - ... - ... Total number of calls to alloc() calls for this case is: sum(int(math.log(i, 2)) + 1 for i in range(1, 1025)) => 9228 The above is obviously worse case, but given how slow alloc can be we really want to try to avoid even somewhat bad cases. I timed the old code with a device under memory pressure and it wasn't hard to see it take more than 120 seconds to allocate 4 megs of memory! (NOTE: testing was done on kernel 3.14, so possibly mainline would behave differently). A second problem is that allocating big chunks under memory pressure when we don't need them is just not a great idea anyway unless we really need them. We can make due pretty well with smaller chunks so it's probably wise to leave bigger chunks for other users once memory pressure is on. Let's adjust the allocation like this: 1. If a big chunk fails, stop trying to hard and bump down to lower order allocations. 2. Don't try useless orders. The whole point of big chunks is to optimize the TLB and it can really only make use of 2M, 1M, 64K and 4K sizes. We'll still tend to eat up a bunch of big chunks, but that might be the right answer for some users. A future patch could possibly add a new DMA_ATTR that would let the caller decide that TLB optimization isn't important and that we should use smaller chunks. Presumably this would be a sane strategy for some callers. Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Tested-by: Javier Martinez Canillas <javier@osg.samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-01-22tree wide: use kvfree() than conditional kfree()/vfree()Tetsuo Handa
There are many locations that do if (memory_was_allocated_by_vmalloc) vfree(ptr); else kfree(ptr); but kvfree() can handle both kmalloc()ed memory and vmalloc()ed memory using is_vmalloc_addr(). Unless callers have special reasons, we can replace this branch with kvfree(). Please check and reply if you found problems. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Jan Kara <jack@suse.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net> Acked-by: David Rientjes <rientjes@google.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: Boris Petkov <bp@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-15Revert "scatterlist: use sg_phys()"Dan Williams
commit db0fa0cb0157 "scatterlist: use sg_phys()" did replacements of the form: phys_addr_t phys = page_to_phys(sg_page(s)); phys_addr_t phys = sg_phys(s) & PAGE_MASK; However, this breaks platforms where sizeof(phys_addr_t) > sizeof(unsigned long). Revert for 4.3 and 4.4 to make room for a combined helper in 4.5. Cc: <stable@vger.kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: db0fa0cb0157 ("scatterlist: use sg_phys()") Suggested-by: Joerg Roedel <joro@8bytes.org> Reported-by: Vitaly Lavrov <vel21ripn@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-11-06mm, page_alloc: distinguish between being unable to sleep, unwilling to ↵Mel Gorman
sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim. This patch then converts a number of sites o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag. o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category where kswapd will still be woken but atomic reserves are not used as there is a one-entry mempool to guarantee progress. o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically now can trigger false positives. Some exceptions like dm-crypt.c exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations. o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM. The first key hazard to watch out for is callers that removed __GFP_WAIT and was depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH. The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases as other activity will wake kswapd. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-03ARM: 8427/1: dma-mapping: add support for offset parameter in dma_mmap()Marek Szyprowski
IOMMU-based dma_mmap() implementation lacked proper support for offset parameter used in mmap call (it always assumed that mapping starts from offset zero). This patch adds support for offset parameter to IOMMU-based implementation. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: stable@vger.kernel.org # v3.6+ Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-03ARM: 8426/1: dma-mapping: add missing range check in dma_mmap()Marek Szyprowski
dma_mmap() function in IOMMU-based dma-mapping implementation lacked a check for valid range of mmap parameters (offset and buffer size), what might have caused access beyond the allocated buffer. This patch fixes this issue. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: stable@vger.kernel.org # v3.6+ Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-09-19Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "Three fixes and a resulting cleanup for -rc2: - Andre Przywara reported that he was seeing a warning with the new cast inside DMA_ERROR_CODE's definition, and fixed the incorrect use. - Doug Anderson noticed that kgdb causes a "scheduling while atomic" bug. - OMAP5 folk noticed that their Thumb-2 compiled X servers crashed when enabling support to cover ARMv6 CPUs due to a kernel bug leaking some conditional context into the signal handler" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 8425/1: kgdb: Don't try to stop the machine when setting breakpoints ARM: 8437/1: dma-mapping: fix build warning with new DMA_ERROR_CODE definition ARM: get rid of needless #if in signal handling code ARM: fix Thumb2 signal handling when ARMv6 is enabled
2015-09-16ARM: 8437/1: dma-mapping: fix build warning with new DMA_ERROR_CODE definitionAndre Przywara
Commit 96231b2686b5: ("ARM: 8419/1: dma-mapping: harmonize definition of DMA_ERROR_CODE") changed the definition of DMA_ERROR_CODE to use dma_addr_t, which makes the compiler barf on assigning this to an "int" variable on ARM with LPAE enabled: ************* In file included from /src/linux/include/linux/dma-mapping.h:86:0, from /src/linux/arch/arm/mm/dma-mapping.c:21: /src/linux/arch/arm/mm/dma-mapping.c: In function '__iommu_create_mapping': /src/linux/arch/arm/include/asm/dma-mapping.h:16:24: warning: overflow in implicit constant conversion [-Woverflow] #define DMA_ERROR_CODE (~(dma_addr_t)0x0) ^ /src/linux/arch/arm/mm/dma-mapping.c:1252:15: note: in expansion of macro DMA_ERROR_CODE' int i, ret = DMA_ERROR_CODE; ^ ************* Remove the actually unneeded initialization of "ret" in __iommu_create_mapping() and move the variable declaration inside the for-loop to make the scope of this variable more clear. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-09-10dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent}Christoph Hellwig
Since 2009 we have a nice asm-generic header implementing lots of DMA API functions for architectures using struct dma_map_ops, but unfortunately it's still missing a lot of APIs that all architectures still have to duplicate. This series consolidates the remaining functions, although we still need arch opt outs for two of them as a few architectures have very non-standard implementations. This patch (of 5): The coherent DMA allocator works the same over all architectures supporting dma_map operations. This patch consolidates them and converges the minor differences: - the debug_dma helpers are now called from all architectures, including those that were previously missing them - dma_alloc_from_coherent and dma_release_from_coherent are now always called from the generic alloc/free routines instead of the ops dma-mapping-common.h always includes dma-coherent.h to get the defintions for them, or the stubs if the architecture doesn't support this feature - checks for ->alloc / ->free presence are removed. There is only one magic instead of dma_map_ops without them (mic_dma_ops) and that one is x86 only anyway. Besides that only x86 needs special treatment to replace a default devices if none is passed and tweak the gfp_flags. An optional arch hook is provided for that. [linux@roeck-us.net: fix build] [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-03Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM development updates from Russell King: "Included in this update: - moving PSCI code from ARM64/ARM to drivers/ - removal of some architecture internals from global kernel view - addition of software based "privileged no access" support using the old domains register to turn off the ability for kernel loads/stores to access userspace. Only the proper accessors will be usable. - addition of early fixup support for early console - re-addition (and reimplementation) of OMAP special interconnect barrier - removal of finish_arch_switch() - only expose cpuX/online in sysfs if hotpluggable - a number of code cleanups" * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (41 commits) ARM: software-based priviledged-no-access support ARM: entry: provide uaccess assembly macro hooks ARM: entry: get rid of multiple macro definitions ARM: 8421/1: smp: Collapse arch_cpu_idle_dead() into cpu_die() ARM: uaccess: provide uaccess_save_and_enable() and uaccess_restore() ARM: mm: improve do_ldrd_abort macro ARM: entry: ensure that IRQs are enabled when calling syscall_trace_exit() ARM: entry: efficiency cleanups ARM: entry: get rid of asm_trace_hardirqs_on_cond ARM: uaccess: simplify user access assembly ARM: domains: remove DOMAIN_TABLE ARM: domains: keep vectors in separate domain ARM: domains: get rid of manager mode for user domain ARM: domains: move initial domain setting value to asm/domains.h ARM: domains: provide domain_mask() ARM: domains: switch to keeping domain value in register ARM: 8419/1: dma-mapping: harmonize definition of DMA_ERROR_CODE ARM: 8417/1: refactor bitops functions with BIT_MASK() and BIT_WORD() ARM: 8416/1: Feroceon: use of_iomap() to map register base ARM: 8415/1: early fixmap support for earlycon ...
2015-09-03Merge branches 'cleanup', 'fixes', 'misc', 'omap-barrier' and 'uaccess' into ↵Russell King
for-linus
2015-09-02Merge branch 'for-4.3/sg' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull SG updates from Jens Axboe: "This contains a set of scatter-gather related changes/fixes for 4.3: - Add support for limited chaining of sg tables even for architectures that do not set ARCH_HAS_SG_CHAIN. From Christoph. - Add sg chain support to target_rd. From Christoph. - Fixup open coded sg->page_link in crypto/omap-sham. From Christoph. - Fixup open coded crypto ->page_link manipulation. From Dan. - Also from Dan, automated fixup of manual sg_unmark_end() manipulations. - Also from Dan, automated fixup of open coded sg_phys() implementations. - From Robert Jarzmik, addition of an sg table splitting helper that drivers can use" * 'for-4.3/sg' of git://git.kernel.dk/linux-block: lib: scatterlist: add sg splitting function scatterlist: use sg_phys() crypto/omap-sham: remove an open coded access to ->page_link scatterlist: remove open coded sg_unmark_end instances crypto: replace scatterwalk_sg_chain with sg_chain target/rd: always chain S/G list scatterlist: allow limited chaining without ARCH_HAS_SG_CHAIN
2015-08-17scatterlist: use sg_phys()Dan Williams
Coccinelle cleanup to replace open coded sg to physical address translations. This is in preparation for introducing scatterlists that reference __pfn_t. // sg_phys.cocci: convert usage page_to_phys(sg_page(sg)) to sg_phys(sg) // usage: make coccicheck COCCI=sg_phys.cocci MODE=patch virtual patch @@ struct scatterlist *sg; @@ - page_to_phys(sg_page(sg)) + sg->offset + sg_phys(sg) @@ struct scatterlist *sg; @@ - page_to_phys(sg_page(sg)) + sg_phys(sg) & PAGE_MASK Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-08-04ARM: 8398/1: arm DMA: Fix allocation from CMA for coherent DMALorenzo Nava
This patch allows the use of CMA for DMA coherent memory allocation. At the moment if the input parameter "is_coherent" is set to true the allocation is not made using the CMA, which I think is not the desired behaviour. The patch covers the allocation and free of memory for coherent DMA. Signed-off-by: Lorenzo Nava <lorenx4@gmail.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-08-01ARM: reduce visibility of dmac_* functionsRussell King
The dmac_* functions are private to the ARM DMA API implementation, and should not be used by drivers. In order to discourage their use, remove their prototypes and macros from asm/*.h. We have to leave dmac_flush_range() behind as Exynos and MSM IOMMU code use these; once these sites are fixed, this can be moved also. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-07-17ARM: 8404/1: dma-mapping: fix off-by-one error in bitmap size checkMarek Szyprowski
nr_bitmaps member of mapping structure stores the number of already allocated bitmaps and it is interpreted as loop iterator (it starts from 0 not from 1), so a comparison against number of possible bitmap extensions should include this fact. This patch fixes this by changing the extension failure condition. This issue has been introduced by commit 4d852ef8c2544ce21ae41414099a7504c61164a0 ("arm: dma-mapping: Add support to extend DMA IOMMU mappings"). Reported-by: Hyungwon Hwang <human.hwang@samsung.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Hyungwon Hwang <human.hwang@samsung.com> Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-06-12Merge branches 'arnd-fixes', 'clk', 'misc', 'v7' and 'fixes' into for-nextRussell King
2015-06-06ARM: 8387/1: arm/mm/dma-mapping.c: Add arm_coherent_dma_mmapMike Looijmans
When dma-coherent transfers are enabled, the mmap call must not change the pg_prot flags in the vma struct. Split the arm_dma_mmap into a common and specific parts, and add a "arm_coherent_dma_mmap" implementation that does not alter the page protection flags. Tested on a topic-miami board (Zynq) using the ACP port to transfer data between FPGA and CPU using the Dyplo framework. Without this patch, byte-wise access to mmapped coherent DMA memory was about 20x slower because of the memory being marked as non-cacheable, and transfer speeds would not exceed 240MB/s. After this patch, the mapped memory is cacheable and the transfer speed is again 600MB/s (limited by the FPGA) when the data is in the L2 cache, while data integrity is being maintained. The patch has no effect on non-coherent DMA. Signed-off-by: Mike Looijmans <mike.looijmans@topic.nl> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-05-03ARM: 8347/1: dma-mapping: fix off-by-one check in arm_setup_iommu_dma_opsMarek Szyprowski
Patch 22b3c181c6c324a46f71aae806d8ddbe61d25761 ("arm: dma-mapping: limit IOMMU mapping size") added a check for IO address space size. However this patch broke IOMMU initialization for typical platforms initialized from device tree, which get the default IO address space size of 4GiB. This value doesn't fit into size_t and fails a check introduced by that commit resulting in failed dma-mapping/iommu initialization. This patch fixes this issue by adding proper support for full 4GiB address space size. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-04-14Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: "Included in this update are both some long term fixes and some new features. Fixes: - An integer overflow in the calculation of ELF_ET_DYN_BASE. - Avoiding OOMs for high-order IOMMU allocations - SMP requires the data cache to be enabled for synchronisation primitives to work, so prevent the CPU_DCACHE_DISABLE option being visible on SMP builds. - A bug going back 10+ years in the noMMU ARM94* CPU support code, where it corrupts registers. Found by folk getting Linux running on their cameras. - Versatile Express needs an errata workaround enabled for CPU hot-unplug to work. Features: - Clean up module linker by handling out of range relocations separately from relocation cases we don't handle. - Fix a long term bug in the pci_mmap_page_range() code, which we hope won't impact userspace (we hope there's no users of the existing broken interface.) - Don't map DMA coherent allocations when we don't have a MMU. - Drop experimental status for SMP_ON_UP. - Warn when DT doesn't specify ePAPR mandatory cache properties. - Add documentation concerning how we find the start of physical memory for AUTO_ZRELADDR kernels, detailing why we have chosen the mask and the implications of changing it. - Updates from Ard Biesheuvel to address some issues with large kernels (such as allyesconfig) failing to link. - Allow hibernation to work on modern (ARMv7) CPUs - this appears to have never worked in the past on these CPUs. - Enable IRQ_SHOW_LEVEL, which changes the /proc/interrupts output format (hopefully without userspace breaking... let's hope that if it causes someone a problem, they tell us.) - Fix tegra-ahb DT offsets. - Rework ARM errata 643719 code (and ARMv7 flush_cache_louis()/ flush_dcache_all()) code to be more efficient, and enable this errata workaround by default for ARMv7+SMP CPUs. This complements the Versatile Express fix above. - Rework ARMv7 context code for errata 430973, so that only Cortex A8 CPUs are impacted by the branch target buffer flush when this errata is enabled. Also update the help text to indicate that all r1p* A8 CPUs are impacted. - Switch ARM to the generic show_mem() implementation, it conveys all the information which we were already reporting. - Prevent slow timer sources being used for udelay() - timers running at less than 1MHz are not useful for this, and can cause udelay() to return immediately, without any wait. Using such a slow timer is silly. - VDSO support for 32-bit ARM, mainly for gettimeofday() using the ARM architected timer. - Perf support for Scorpion performance monitoring units" vdso semantic conflict fixed up as per linux-next. * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (52 commits) ARM: update errata 430973 documentation to cover Cortex A8 r1p* ARM: ensure delay timer has sufficient accuracy for delays ARM: switch to use the generic show_mem() implementation ARM: proc-v7: avoid errata 430973 workaround for non-Cortex A8 CPUs ARM: enable ARM errata 643719 workaround by default ARM: cache-v7: optimise test for Cortex A9 r0pX devices ARM: cache-v7: optimise branches in v7_flush_cache_louis ARM: cache-v7: consolidate initialisation of cache level index ARM: cache-v7: shift CLIDR to extract appropriate field before masking ARM: cache-v7: use movw/movt instructions ARM: allow 16-bit instructions in ALT_UP() ARM: proc-arm94*.S: fix setup function ARM: vexpress: fix CPU hotplug with CT9x4 tile. ARM: 8276/1: Make CPU_DCACHE_DISABLE depend on !SMP ARM: 8335/1: Documentation: DT bindings: Tegra AHB: document the legacy base address ARM: 8334/1: amba: tegra-ahb: detect and correct bogus base address ARM: 8333/1: amba: tegra-ahb: fix register offsets in the macros ARM: 8339/1: Enable CONFIG_GENERIC_IRQ_SHOW_LEVEL ARM: 8338/1: kexec: Relax SMP validation to improve DT compatibility ARM: 8337/1: mm: Do not invoke OOM for higher order IOMMU DMA allocations ...
2015-04-14Merge branches 'misc', 'vdso' and 'fixes' into for-nextRussell King
Conflicts: arch/arm/mm/proc-macros.S
2015-04-13Merge tag 'pci-v4.1-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI changes from Bjorn Helgaas: "Enumeration - Read capability list as dwords, not bytes (Sean O. Stalley) Resource management - Don't check for PNP overlaps with unassigned PCI BARs (Bjorn Helgaas) - Mark invalid BARs as unassigned (Bjorn Helgaas) - Show driver, BAR#, and resource on pci_ioremap_bar() failure (Bjorn Helgaas) - Fail pci_ioremap_bar() on unassigned resources (Bjorn Helgaas) - Assign resources before drivers claim devices (Yijing Wang) - Claim bus resources before pci_bus_add_devices() (Yijing Wang) Power management - Optimize device state transition delays (Aaron Lu) - Don't clear ASPM bits when the FADT declares it's unsupported (Matthew Garrett) Virtualization - Add ACS quirks for Intel 1G NICs (Alex Williamson) IOMMU - Add ptr to OF node arg to of_iommu_configure() (Murali Karicheri) - Move of_dma_configure() to device.c to help re-use (Murali Karicheri) - Fix size when dma-range is not used (Murali Karicheri) - Add helper functions pci_get[put]_host_bridge_device() (Murali Karicheri) - Add of_pci_dma_configure() to update DMA configuration (Murali Karicheri) - Update DMA configuration from DT (Murali Karicheri) - dma-mapping: limit IOMMU mapping size (Murali Karicheri) - Calculate device DMA masks based on DT dma-range size (Murali Karicheri) ARM Versatile host bridge driver - Check for devm_ioremap_resource() failures (Jisheng Zhang) Broadcom iProc host bridge driver - Add Broadcom iProc PCIe driver (Ray Jui) Marvell MVEBU host bridge driver - Add suspend/resume support (Thomas Petazzoni) Renesas R-Car host bridge driver - Fix position of MSI enable bit (Nobuhiro Iwamatsu) - Write zeroes to reserved PCIEPARL bits (Nobuhiro Iwamatsu) - Change PCIEPARL and PCIEPARH to PCIEPALR and PCIEPAUR (Nobuhiro Iwamatsu) - Verify that mem_res is 64K-aligned (Nobuhiro Iwamatsu) Samsung Exynos host bridge driver - Fix INTx enablement statement termination error (Jaehoon Chung) Miscellaneous - Make a shareable UUID for PCI firmware ACPI _DSM (Aaron Lu) - Clarify policy for vendor IDs in pci.txt (Michael S. Tsirkin)" * tag 'pci-v4.1-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (36 commits) PCI: Read capability list as dwords, not bytes PCI: layerscape: Simplify platform_get_resource_byname() failure checking PCI: keystone: Don't dereference possible NULL pointer PCI: versatile: Check for devm_ioremap_resource() failures PCI: Don't clear ASPM bits when the FADT declares it's unsupported PCI: Clarify policy for vendor IDs in pci.txt PCI/ACPI: Optimize device state transition delays PCI: Export pci_find_host_bridge() for use inside PCI core PCI: Make a shareable UUID for PCI firmware ACPI _DSM PCI: Fix typo in Thunderbolt kernel message PCI: exynos: Fix INTx enablement statement termination error PCI: iproc: Add Broadcom iProc PCIe support PCI: iproc: Add DT docs for Broadcom iProc PCIe driver PCI: Export symbols required for loadable host driver modules PCI: Add ACS quirks for Intel 1G NICs PCI: mvebu: Add suspend/resume support PCI: Cleanup control flow sparc/PCI: Claim bus resources before pci_bus_add_devices() PCI: Assign resources before drivers claim devices (pci_scan_root_bus()) PCI: Fail pci_ioremap_bar() on unassigned resources ...
2015-04-02ARM: 8337/1: mm: Do not invoke OOM for higher order IOMMU DMA allocationsTomasz Figa
IOMMU should be able to use single pages as well as bigger blocks, so if higher order allocations fail, we should not affect state of the system, with events such as OOM killer, but rather fall back to order 0 allocations. This patch changes the behavior of ARM IOMMU DMA allocator to use __GFP_NORETRY, which bypasses OOM invocation, for orders higher than zero and, only if that fails, fall back to normal order 0 allocation which might invoke OOM killer. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Doug Anderson <dianders@chromium.org> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-03-18ARM: 8289/1: dma-mapping: use to_dma_iommu_mapping instead of accessing archdataWill Deacon
When using the IOMMU-backed DMA ops for a device, we store a pointer to the dma_iommu_mapping structure (used to keep track of the address space) in the archdata.mapping field of the struct device. Rather than access this field directly, use the to_dma_iommu_mapping helper in dma-mapping, so that we don't really care where the mapping information is held. Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-03-12arm: dma-mapping: limit IOMMU mapping sizeMurali Karicheri
arm_iommu_create_mapping() has size parameter of size_t and arm_setup_iommu_dma_ops() can take a value higher than that when this is called from the OF code. So limit the size to SIZE_MAX. Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> (AMD Seattle) Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> CC: Joerg Roedel <joro@8bytes.org> CC: Grant Likely <grant.likely@linaro.org> CC: Rob Herring <robh+dt@kernel.org> CC: Russell King <linux@arm.linux.org.uk> CC: Arnd Bergmann <arnd@arndb.de>
2015-03-10ARM: dma-api: fix off-by-one error in __dma_supported()Russell King
When validating the mask against the amount of memory we have available (so that we can trap 32-bit DMA addresses with >32-bits memory), we had not taken account of the fact that max_pfn is the maximum PFN number plus one that would be in the system. There are several references in the code which bear this out: mm/page_owner.c: for (; pfn < max_pfn; pfn++) { } arch/x86/kernel/setup.c: high_memory = (void *)__va(max_pfn * PAGE_SIZE - 1) Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>