summaryrefslogtreecommitdiff
path: root/arch/powerpc/include
AgeCommit message (Collapse)Author
2018-12-04powerpc/8xx: reintroduce 16K pages with HW assistanceChristophe Leroy
Using this HW assistance implies some constraints on the page table structure: - Regardless of the main page size used (4k or 16k), the level 1 table (PGD) contains 1024 entries and each PGD entry covers a 4Mbytes area which is managed by a level 2 table (PTE) containing also 1024 entries each describing a 4k page. - 16k pages require 4 identifical entries in the L2 table - 512k pages PTE have to be spread every 128 bytes in the L2 table - 8M pages PTE are at the address pointed by the L1 entry and each 8M page require 2 identical entries in the PGD. In order to use hardware assistance with 16K pages, this patch does the following modifications: - Make PGD size independent of the main page size - In 16k pages mode, redefine pte_t as a struct with 4 elements, and populate those 4 elements in __set_pte_at() and pte_update() - Adapt the size of the hugepage tables. - Define a PTE_FRAGMENT_NB so that a 16k page contains 4 page tables. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/8xx: Enable 512k hugepage support with HW assistanceChristophe Leroy
For using 512k pages with hardware assistance, the PTEs have to be spread every 128 bytes in the L2 table. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/mm: fix a warning when a cache is common to PGD and hugepagesChristophe Leroy
While implementing TLB miss HW assistance on the 8xx, the following warning was encountered: [ 423.732965] WARNING: CPU: 0 PID: 345 at mm/slub.c:2412 ___slab_alloc.constprop.30+0x26c/0x46c [ 423.733033] CPU: 0 PID: 345 Comm: mmap Not tainted 4.18.0-rc8-00664-g2dfff9121c55 #671 [ 423.733075] NIP: c0108f90 LR: c0109ad0 CTR: 00000004 [ 423.733121] REGS: c455bba0 TRAP: 0700 Not tainted (4.18.0-rc8-00664-g2dfff9121c55) [ 423.733147] MSR: 00021032 <ME,IR,DR,RI> CR: 24224848 XER: 20000000 [ 423.733319] [ 423.733319] GPR00: c0109ad0 c455bc50 c4521910 c60053c0 007080c0 c0011b34 c7fa41e0 c455be30 [ 423.733319] GPR08: 00000001 c00103a0 c7fa41e0 c49afcc4 24282842 10018840 c079b37c 00000040 [ 423.733319] GPR16: 73f00000 00210d00 00000000 00000001 c455a000 00000100 00000200 c455a000 [ 423.733319] GPR24: c60053c0 c0011b34 007080c0 c455a000 c455a000 c7fa41e0 00000000 00009032 [ 423.734190] NIP [c0108f90] ___slab_alloc.constprop.30+0x26c/0x46c [ 423.734257] LR [c0109ad0] kmem_cache_alloc+0x210/0x23c [ 423.734283] Call Trace: [ 423.734326] [c455bc50] [00000100] 0x100 (unreliable) [ 423.734430] [c455bcc0] [c0109ad0] kmem_cache_alloc+0x210/0x23c [ 423.734543] [c455bcf0] [c0011b34] huge_pte_alloc+0xc0/0x1dc [ 423.734633] [c455bd20] [c01044dc] hugetlb_fault+0x408/0x48c [ 423.734720] [c455bdb0] [c0104b20] follow_hugetlb_page+0x14c/0x44c [ 423.734826] [c455be10] [c00e8e54] __get_user_pages+0x1c4/0x3dc [ 423.734919] [c455be80] [c00e9924] __mm_populate+0xac/0x140 [ 423.735020] [c455bec0] [c00db14c] vm_mmap_pgoff+0xb4/0xb8 [ 423.735127] [c455bf00] [c00f27c0] ksys_mmap_pgoff+0xcc/0x1fc [ 423.735222] [c455bf40] [c000e0f8] ret_from_syscall+0x0/0x38 [ 423.735271] Instruction dump: [ 423.735321] 7cbf482e 38fd0008 7fa6eb78 7fc4f378 4bfff5dd 7fe3fb78 4bfffe24 81370010 [ 423.735536] 71280004 41a2ff88 4840c571 4bffff80 <0fe00000> 4bfffeb8 81340010 712a0004 [ 423.735757] ---[ end trace e9b222919a470790 ]--- This warning occurs when calling kmem_cache_zalloc() on a cache having a constructor. In this case it happens because PGD cache and 512k hugepte cache are the same size (4k). While a cache with constructor is created for the PGD, hugepages create cache without constructor and uses kmem_cache_zalloc(). As both expect a cache with the same size, the hugepages reuse the cache created for PGD, hence the conflict. In order to avoid this conflict, this patch: - modifies pgtable_cache_add() so that a zeroising constructor is added for any cache size. - replaces calls to kmem_cache_zalloc() by kmem_cache_alloc() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/mm: replace hugetlb_cache by PGT_CACHE(PTE_T_ORDER)Christophe Leroy
Instead of opencoding cache handling for the special case of hugepage tables having a single pte_t element, this patch makes use of the common pgtable_cache helpers Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/mm: enable the use of page table cache of order 0Christophe Leroy
hugepages uses a cache of order 0. Lets allow page tables of order 0 in the common part in order to avoid open coding in hugetlb Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/mm: Extend pte_fragment functionality to PPC32Christophe Leroy
In order to allow the 8xx to handle pte_fragments, this patch extends the use of pte_fragments to PPC32 platforms. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/mm: add helpers to get/set mm.context->pte_fragChristophe Leroy
In order to handle pte_fragment functions with single fragment without adding pte_frag in all mm_context_t, this patch creates two helpers which do nothing on platforms using a single fragment. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/mm: Move pgtable_t into platform headersChristophe Leroy
This patch move pgtable_t into platform headers. It gets rid of the CONFIG_PPC_64K_PAGES case for PPC64 as nohash/64 doesn't support CONFIG_PPC_64K_PAGES. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/mm: move platform specific mmu-xxx.h in platform directoriesChristophe Leroy
The purpose of this patch is to move platform specific mmu-xxx.h files in platform directories like pte-xxx.h files. In the meantime this patch creates common nohash and nohash/32 + nohash/64 mmu.h files for future common parts. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/mm: Move pte_fragment_alloc() to a common locationChristophe Leroy
In preparation of next patch which generalises the use of pte_fragment_alloc() for all, this patch moves the related functions in a place that is common to all subarches. The 8xx will need that for supporting 16k pages, as in that mode page tables still have a size of 4k. Since pte_fragment with only once fragment is not different from what is done in the general case, we can easily migrate all subarchs to pte fragments. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/8xx: Remove PTE_ATOMIC_UPDATESChristophe Leroy
commit 1bc54c03117b9 ("powerpc: rework 4xx PTE access and TLB miss") introduced non atomic PTE updates and started the work of removing PTE updates in TLB miss handlers, but kept PTE_ATOMIC_UPDATES for the 8xx with the following comment: /* Until my rework is finished, 8xx still needs atomic PTE updates */ commit fe11dc3f9628e ("powerpc/8xx: Update TLB asm so it behaves as linux mm expects") removed all PTE updates done in TLB miss handlers Therefore, atomic PTE updates are not needed anymore for the 8xx Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/book3s32: Remove CONFIG_BOOKE dependent codeChristophe Leroy
BOOK3S/32 cannot be BOOKE, so remove useless code Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/mm: remove unused function prototypeBreno Leitao
Commit f384796c40dc ("powerpc/mm: Add support for handling > 512TB address in SLB miss") removed function slb_miss_bad_addr(struct pt_regs *regs), but kept its declaration in the prototype file. This patch simply removes the function definition. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-26powerpc: change CONFIG_PPC_STD_MMU_32 to CONFIG_PPC_BOOK3S_32Christophe Leroy
Today we have: config PPC_BOOK3S_32 bool "512x/52xx/6xx/7xx/74xx/82xx/83xx/86xx" [depends on PPC32 within a choice] config PPC_BOOK3S def_bool y depends on PPC_BOOK3S_32 || PPC_BOOK3S_64 config PPC_STD_MMU def_bool y depends on PPC_BOOK3S config PPC_STD_MMU_32 def_bool y depends on PPC_STD_MMU && PPC32 PPC_STD_MMU_32 is therefore redundant with PPC_BOOK3S_32. In order to make the code clearer, lets use preferably PPC_BOOK3S_32. This will allow to remove CONFIG_PPC_STD_MMU_32 in a later patch. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-26powerpc/32: Remove #ifdef CONFIG_PPC_STD_MMU_32 in asm/book3s/32/pgtable.hChristophe Leroy
asm/book3s/32/pgtable.h is only included when CONFIG_PPC_BOOK3S_32 is set. Whenever CONFIG_PPC_BOOK3S_32 is set, CONFIG_PPC_STD_MMU_32 is set as well. This patch removes useless CONFIG_PPC_STD_MMU_32 #ifdefs Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-26powerpc: change CONFIG_6xx to CONFIG_PPC_BOOK3S_32Christophe Leroy
Today we have: config PPC_BOOK3S_32 bool "512x/52xx/6xx/7xx/74xx/82xx/83xx/86xx" [depends on PPC32 within a choice] config PPC_BOOK3S def_bool y depends on PPC_BOOK3S_32 || PPC_BOOK3S_64 config 6xx def_bool y depends on PPC32 && PPC_BOOK3S 6xx is therefore redundant with PPC_BOOK3S_32. In order to make the code clearer, lets use preferably PPC_BOOK3S_32. This will allow to remove CONFIG_6xx in a later patch. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-25powerpc: Typo s/use use/use/Geert Uytterhoeven
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-25powerpc: mark 64-bit PD_HUGE constant as unsigned longDaniel Axtens
When compiled for 64-bit, the PD_HUGE constant is a 64-bit integer. Mark it as an unsigned long. This squashes over a thousand sparse warnings on my minimal T4240RDB (e6500, ppc64be) config, of the following 2 forms: arch/powerpc/include/asm/hugetlb.h:52:49: warning: constant 0x8000000000000000 is so big it is unsigned long arch/powerpc/include/asm/nohash/pgtable.h:269:49: warning: constant 0x8000000000000000 is so big it is unsigned long Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-25powerpc/mm: remove const type qualifier from function ‘pud_pfn’Mathieu Malaterre
Type qualifier on return type is ignored. Remove warning in W=1: arch/powerpc/include/asm/book3s/64/pgtable.h:1268:25: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers] Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-25powerpc/mm: Remove extern from function definitionBreno Leitao
Function huge_ptep_set_access_flags() has the 'extern' keyword in the function definition and also in the function declaration. This causes a warning in 'sparse' since the 'extern' storage class should not be used in the function definition. arch/powerpc/mm/pgtable.c:232:12: warning: function 'huge_ptep_set_access_flags' with external linkage has definition This patch removes the keyword from the definition part. It also removes the extern keyword from the declaration part, since checkpatch --strict complains about it. Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-15powerpc/64: Fix kernel stack 16-byte alignmentNicholas Piggin
Commit 4c2de74cc869 ("powerpc/64: Interrupts save PPR on stack rather than thread_struct") changed sizeof(struct pt_regs) % 16 from 0 to 8, which causes the interrupt frame allocation on kernel entry to put the kernel stack out of alignment. Quadword (16-byte) alignment for the stack is required by both the 64-bit v1 ABI (v1.9 § 3.2.2) and the 64-bit v2 ABI (v1.1 § 2.2.2.1). Add a pad field to fix alignment, and add a BUILD_BUG_ON to catch this in future. Fixes: 4c2de74cc869 ("powerpc/64: Interrupts save PPR on stack rather than thread_struct") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-12powerpc/io: Fix the IO workarounds code to work with RadixMichael Ellerman
Back in 2006 Ben added some workarounds for a misbehaviour in the Spider IO bridge used on early Cell machines, see commit 014da7ff47b5 ("[POWERPC] Cell "Spider" MMIO workarounds"). Later these were made to be generic, ie. not tied specifically to Spider. The code stashes a token in the high bits (59-48) of virtual addresses used for IO (eg. returned from ioremap()). This works fine when using the Hash MMU, but when we're using the Radix MMU the bits used for the token overlap with some of the bits of the virtual address. This is because the maximum virtual address is larger with Radix, up to c00fffffffffffff, and in fact we use that high part of the address range for ioremap(), see RADIX_KERN_IO_START. As it happens the bits that are used overlap with the bits that differentiate an IO address vs a linear map address. If the resulting address lies outside the linear mapping we will crash (see below), if not we just corrupt memory. virtio-pci 0000:00:00.0: Using 64-bit direct DMA at offset 800000000000000 Unable to handle kernel paging request for data at address 0xc000000080000014 ... CFAR: c000000000626b98 DAR: c000000080000014 DSISR: 42000000 IRQMASK: 0 GPR00: c0000000006c54fc c00000003e523378 c0000000016de600 0000000000000000 GPR04: c00c000080000014 0000000000000007 0fffffff000affff 0000000000000030 ^^^^ ... NIP [c000000000626c5c] .iowrite8+0xec/0x100 LR [c0000000006c992c] .vp_reset+0x2c/0x90 Call Trace: .pci_bus_read_config_dword+0xc4/0x120 (unreliable) .register_virtio_device+0x13c/0x1c0 .virtio_pci_probe+0x148/0x1f0 .local_pci_probe+0x68/0x140 .pci_device_probe+0x164/0x220 .really_probe+0x274/0x3b0 .driver_probe_device+0x80/0x170 .__driver_attach+0x14c/0x150 .bus_for_each_dev+0xb8/0x130 .driver_attach+0x34/0x50 .bus_add_driver+0x178/0x2f0 .driver_register+0x90/0x1a0 .__pci_register_driver+0x6c/0x90 .virtio_pci_driver_init+0x2c/0x40 .do_one_initcall+0x64/0x280 .kernel_init_freeable+0x36c/0x474 .kernel_init+0x24/0x160 .ret_from_kernel_thread+0x58/0x7c This hasn't been a problem because CONFIG_PPC_IO_WORKAROUNDS which enables this code is usually not enabled. It is only enabled when it's selected by PPC_CELL_NATIVE which is only selected by PPC_IBM_CELL_BLADE and that in turn depends on BIG_ENDIAN. So in order to hit the bug you need to build a big endian kernel, with IBM Cell Blade support enabled, as well as Radix MMU support, and then boot that on Power9 using Radix MMU. Still we can fix the bug, so let's do that. We simply use fewer bits for the token, taking the union of the restrictions on the address from both Hash and Radix, we end up with 8 bits we can use for the token. The only user of the token is iowa_mem_find_bus() which only supports 8 token values, so 8 bits is plenty for that. Fixes: 566ca99af026 ("powerpc/mm/radix: Add dummy radix_enabled()") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-06powerpc/mm/64s: Use PPC_SLBFEE macroMichael Ellerman
Old toolchains don't know about slbfee and break the build, eg: {standard input}:37: Error: Unrecognized opcode: `slbfee.' Fix it by using the macro version. We need to add an underscore version that takes raw register numbers from the inline asm, rather than our Rx macros. Fixes: e15a4fea4dee ("powerpc/64s/hash: Add some SLB debugging tests") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-02Merge tag 'powerpc-4.20-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Some things that I missed due to travel, or that came in late. Two fixes also going to stable: - A revert of a buggy change to the 8xx TLB miss handlers. - Our flushing of SPE (Signal Processing Engine) registers on fork was broken. Other changes: - A change to the KVM decrementer emulation to use proper APIs. - Some cleanups to the way we do code patching in the 8xx code. - Expose the maximum possible memory for the system in /proc/powerpc/lparcfg. - Merge some updates from Scott: "a couple device tree updates, and a fix for a missing prototype warning" A few other minor fixes and a handful of fixes for our selftests. Thanks to: Aravinda Prasad, Breno Leitao, Camelia Groza, Christophe Leroy, Felipe Rechia, Joel Stanley, Naveen N. Rao, Paul Mackerras, Scott Wood, Tyrel Datwyler" * tag 'powerpc-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (21 commits) selftests/powerpc: Fix compilation issue due to asm label selftests/powerpc/cache_shape: Fix out-of-tree build selftests/powerpc/switch_endian: Fix out-of-tree build selftests/powerpc/pmu: Link ebb tests with -no-pie selftests/powerpc/signal: Fix out-of-tree build selftests/powerpc/ptrace: Fix out-of-tree build powerpc/xmon: Relax frame size for clang selftests: powerpc: Fix warning for security subdir selftests/powerpc: Relax L1d miss targets for rfi_flush test powerpc/process: Fix flush_all_to_thread for SPE powerpc/pseries: add missing cpumask.h include file selftests/powerpc: Fix ptrace tm failure KVM: PPC: Use exported tb_to_ns() function in decrementer emulation powerpc/pseries: Export maximum memory value powerpc/8xx: Use patch_site for perf counters setup powerpc/8xx: Use patch_site for memory setup patching powerpc/code-patching: Add a helper to get the address of a patch_site Revert "powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for CONFIG_SWAP" powerpc/8xx: add missing header in 8xx_mmu.c powerpc/8xx: Add DT node for using the SEC engine of the MPC885 ...
2018-10-31treewide: remove current_text_addrNick Desaulniers
Prefer _THIS_IP_ defined in linux/kernel.h. Most definitions of current_text_addr were the same as _THIS_IP_, but a few archs had inline assembly instead. This patch removes the final call site of current_text_addr, making all of the definitions dead code. [akpm@linux-foundation.org: fix arch/csky/include/asm/processor.h] Link: http://lkml.kernel.org/r/20180911182413.180715-1-ndesaulniers@google.com Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-29Merge tag 'tty-4.20-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial updates from Greg KH: "Here is the big tty and serial pull request for 4.20-rc1 Lots of little things here, including a merge from the SPI tree in order to keep things simpler for everyone to sync around for one platform. Major stuff is: - tty buffer clearing after use - atmel_serial fixes and additions - xilinx uart driver updates and of course, lots of tiny fixes and additions to individual serial drivers. All of these have been in linux-next with no reported issues for a while" * tag 'tty-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (66 commits) of: base: Change logic in of_alias_get_alias_list() of: base: Fix english spelling in of_alias_get_alias_list() serial: sh-sci: do not warn if DMA transfers are not supported serial: uartps: Do not allow use aliases >= MAX_UART_INSTANCES tty: check name length in tty_find_polling_driver() serial: sh-sci: Add r8a77990 support tty: wipe buffer if not echoing data tty: wipe buffer. serial: fsl_lpuart: Remove the alias node dependence TTY: sn_console: Replace spin_is_locked() with spin_trylock() Revert "serial:serial_core: Allow use of CTS for PPS line discipline" serial: 8250_uniphier: add auto-flow-control support serial: 8250_uniphier: flatten probe function serial: 8250_uniphier: remove unused "fifo-size" property dt-bindings: serial: sh-sci: Document r8a7744 bindings serial: uartps: Fix missing unlock on error in cdns_get_id() tty/serial: atmel: add ISO7816 support tty/serial_core: add ISO7816 infrastructure serial:serial_core: Allow use of CTS for PPS line discipline serial: docs: Fix filename for serial reference implementation ...
2018-10-28Merge branch 'xarray' of git://git.infradead.org/users/willy/linux-daxLinus Torvalds
Pull XArray conversion from Matthew Wilcox: "The XArray provides an improved interface to the radix tree data structure, providing locking as part of the API, specifying GFP flags at allocation time, eliminating preloading, less re-walking the tree, more efficient iterations and not exposing RCU-protected pointers to its users. This patch set 1. Introduces the XArray implementation 2. Converts the pagecache to use it 3. Converts memremap to use it The page cache is the most complex and important user of the radix tree, so converting it was most important. Converting the memremap code removes the only other user of the multiorder code, which allows us to remove the radix tree code that supported it. I have 40+ followup patches to convert many other users of the radix tree over to the XArray, but I'd like to get this part in first. The other conversions haven't been in linux-next and aren't suitable for applying yet, but you can see them in the xarray-conv branch if you're interested" * 'xarray' of git://git.infradead.org/users/willy/linux-dax: (90 commits) radix tree: Remove multiorder support radix tree test: Convert multiorder tests to XArray radix tree tests: Convert item_delete_rcu to XArray radix tree tests: Convert item_kill_tree to XArray radix tree tests: Move item_insert_order radix tree test suite: Remove multiorder benchmarking radix tree test suite: Remove __item_insert memremap: Convert to XArray xarray: Add range store functionality xarray: Move multiorder_check to in-kernel tests xarray: Move multiorder_shrink to kernel tests xarray: Move multiorder account test in-kernel radix tree test suite: Convert iteration test to XArray radix tree test suite: Convert tag_tagged_items to XArray radix tree: Remove radix_tree_clear_tags radix tree: Remove radix_tree_maybe_preload_order radix tree: Remove split/join code radix tree: Remove radix_tree_update_node_t page cache: Finish XArray conversion dax: Convert page fault handlers to XArray ...
2018-10-26Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge updates from Andrew Morton: - a few misc things - ocfs2 updates - most of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (132 commits) hugetlbfs: dirty pages as they are added to pagecache mm: export add_swap_extent() mm: split SWP_FILE into SWP_ACTIVATED and SWP_FS tools/testing/selftests/vm/map_fixed_noreplace.c: add test for MAP_FIXED_NOREPLACE mm: thp: relocate flush_cache_range() in migrate_misplaced_transhuge_page() mm: thp: fix mmu_notifier in migrate_misplaced_transhuge_page() mm: thp: fix MADV_DONTNEED vs migrate_misplaced_transhuge_page race condition mm/kasan/quarantine.c: make quarantine_lock a raw_spinlock_t mm/gup: cache dev_pagemap while pinning pages Revert "x86/e820: put !E820_TYPE_RAM regions into memblock.reserved" mm: return zero_resv_unavail optimization mm: zero remaining unavailable struct pages tools/testing/selftests/vm/gup_benchmark.c: add MAP_HUGETLB option tools/testing/selftests/vm/gup_benchmark.c: add MAP_SHARED option tools/testing/selftests/vm/gup_benchmark.c: allow user specified file tools/testing/selftests/vm/gup_benchmark.c: fix 'write' flag usage mm/gup_benchmark.c: add additional pinning methods mm/gup_benchmark.c: time put_page() mm: don't raise MEMCG_OOM event due to failed high-order allocation mm/page-writeback.c: fix range_cyclic writeback vs writepages deadlock ...
2018-10-26hugetlb: introduce generic version of huge_ptep_getAlexandre Ghiti
ia64, mips, parisc, powerpc, sh, sparc, x86 architectures use the same version of huge_ptep_get, so move this generic implementation into asm-generic/hugetlb.h. [arnd@arndb.de: fix ARM 3level page tables] Link: http://lkml.kernel.org/r/20181005161722.904274-1-arnd@arndb.de Link: http://lkml.kernel.org/r/20180920060358.16606-12-alex@ghiti.fr Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Tested-by: Helge Deller <deller@gmx.de> [parisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Paul Burton <paul.burton@mips.com> [MIPS] Acked-by: Ingo Molnar <mingo@kernel.org> [x86] Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <jhogan@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26hugetlb: introduce generic version of huge_ptep_set_access_flags()Alexandre Ghiti
arm, ia64, sh, x86 architectures use the same version of huge_ptep_set_access_flags, so move this generic implementation into asm-generic/hugetlb.h. Link: http://lkml.kernel.org/r/20180920060358.16606-11-alex@ghiti.fr Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Tested-by: Helge Deller <deller@gmx.de> [parisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Paul Burton <paul.burton@mips.com> [MIPS] Acked-by: Ingo Molnar <mingo@kernel.org> [x86] Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <jhogan@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26hugetlb: introduce generic version of huge_ptep_set_wrprotect()Alexandre Ghiti
arm, ia64, mips, powerpc, sh, x86 architectures use the same version of huge_ptep_set_wrprotect, so move this generic implementation into asm-generic/hugetlb.h. Link: http://lkml.kernel.org/r/20180920060358.16606-10-alex@ghiti.fr Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Tested-by: Helge Deller <deller@gmx.de> [parisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Paul Burton <paul.burton@mips.com> [MIPS] Acked-by: Ingo Molnar <mingo@kernel.org> [x86] Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <jhogan@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26hugetlb: introduce generic version of prepare_hugepage_rangeAlexandre Ghiti
arm, arm64, powerpc, sparc, x86 architectures use the same version of prepare_hugepage_range, so move this generic implementation into asm-generic/hugetlb.h. Link: http://lkml.kernel.org/r/20180920060358.16606-9-alex@ghiti.fr Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Tested-by: Helge Deller <deller@gmx.de> [parisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Paul Burton <paul.burton@mips.com> [MIPS] Acked-by: Ingo Molnar <mingo@kernel.org> [x86] Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <jhogan@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26hugetlb: introduce generic version of huge_pte_wrprotectAlexandre Ghiti
arm, arm64, ia64, mips, parisc, powerpc, sh, sparc, x86 architectures use the same version of huge_pte_wrprotect, so move this generic implementation into asm-generic/hugetlb.h. Link: http://lkml.kernel.org/r/20180920060358.16606-8-alex@ghiti.fr Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Tested-by: Helge Deller <deller@gmx.de> [parisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Paul Burton <paul.burton@mips.com> [MIPS] Acked-by: Ingo Molnar <mingo@kernel.org> [x86] Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <jhogan@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26hugetlb: introduce generic version of huge_pte_none()Alexandre Ghiti
arm, arm64, ia64, mips, parisc, powerpc, sh, sparc, x86 architectures use the same version of huge_pte_none, so move this generic implementation into asm-generic/hugetlb.h. Link: http://lkml.kernel.org/r/20180920060358.16606-7-alex@ghiti.fr Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Tested-by: Helge Deller <deller@gmx.de> [parisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Paul Burton <paul.burton@mips.com> [MIPS] Acked-by: Ingo Molnar <mingo@kernel.org> [x86] Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <jhogan@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26hugetlb: introduce generic version of huge_ptep_clear_flushAlexandre Ghiti
arm, x86 architectures use the same version of huge_ptep_clear_flush, so move this generic implementation into asm-generic/hugetlb.h. Link: http://lkml.kernel.org/r/20180920060358.16606-6-alex@ghiti.fr Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Tested-by: Helge Deller <deller@gmx.de> [parisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Paul Burton <paul.burton@mips.com> [MIPS] Acked-by: Ingo Molnar <mingo@kernel.org> [x86] Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <jhogan@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26hugetlb: introduce generic version of huge_ptep_get_and_clear()Alexandre Ghiti
arm, ia64, sh, x86 architectures use the same version of huge_ptep_get_and_clear, so move this generic implementation into asm-generic/hugetlb.h. Link: http://lkml.kernel.org/r/20180920060358.16606-5-alex@ghiti.fr Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Tested-by: Helge Deller <deller@gmx.de> [parisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Paul Burton <paul.burton@mips.com> [MIPS] Acked-by: Ingo Molnar <mingo@kernel.org> [x86] Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <jhogan@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26hugetlb: introduce generic version of set_huge_pte_at()Alexandre Ghiti
arm, ia64, mips, powerpc, sh, x86 architectures use the same version of set_huge_pte_at, so move this generic implementation into asm-generic/hugetlb.h. Link: http://lkml.kernel.org/r/20180920060358.16606-4-alex@ghiti.fr Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Tested-by: Helge Deller <deller@gmx.de> [parisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Paul Burton <paul.burton@mips.com> [MIPS] Acked-by: Ingo Molnar <mingo@kernel.org> [x86] Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <jhogan@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26hugetlb: introduce generic version of hugetlb_free_pgd_rangeAlexandre Ghiti
arm, arm64, mips, parisc, sh, x86 architectures use the same version of hugetlb_free_pgd_range, so move this generic implementation into asm-generic/hugetlb.h. Link: http://lkml.kernel.org/r/20180920060358.16606-3-alex@ghiti.fr Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Tested-by: Helge Deller <deller@gmx.de> [parisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Paul Burton <paul.burton@mips.com> [MIPS] Acked-by: Ingo Molnar <mingo@kernel.org> [x86] Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <jhogan@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26Merge tag 'powerpc-4.20-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Notable changes: - A large series to rewrite our SLB miss handling, replacing a lot of fairly complicated asm with much fewer lines of C. - Following on from that, we now maintain a cache of SLB entries for each process and preload them on context switch. Leading to a 27% speedup for our context switch benchmark on Power9. - Improvements to our handling of SLB multi-hit errors. We now print more debug information when they occur, and try to continue running by flushing the SLB and reloading, rather than treating them as fatal. - Enable THP migration on 64-bit Book3S machines (eg. Power7/8/9). - Add support for physical memory up to 2PB in the linear mapping on 64-bit Book3S. We only support up to 512TB as regular system memory, otherwise the percpu allocator runs out of vmalloc space. - Add stack protector support for 32 and 64-bit, with a per-task canary. - Add support for PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP. - Support recognising "big cores" on Power9, where two SMT4 cores are presented to us as a single SMT8 core. - A large series to cleanup some of our ioremap handling and PTE flags. - Add a driver for the PAPR SCM (storage class memory) interface, allowing guests to operate on SCM devices (acked by Dan). - Changes to our ftrace code to handle very large kernels, where we need to use a trampoline to get to ftrace_caller(). And many other smaller enhancements and cleanups. Thanks to: Alan Modra, Alistair Popple, Aneesh Kumar K.V, Anton Blanchard, Aravinda Prasad, Bartlomiej Zolnierkiewicz, Benjamin Herrenschmidt, Breno Leitao, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Dan Carpenter, Daniel Axtens, Finn Thain, Gautham R. Shenoy, Gustavo Romero, Haren Myneni, Hari Bathini, Jia Hongtao, Joel Stanley, John Allen, Laurent Dufour, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Hairgrove, Masahiro Yamada, Michael Bringmann, Michael Neuling, Michal Suchanek, Murilo Opsfelder Araujo, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Paul Mackerras, Petr Vorel, Rashmica Gupta, Reza Arbab, Rob Herring, Sam Bobroff, Samuel Mendoza-Jonas, Scott Wood, Stan Johnson, Stephen Rothwell, Stewart Smith, Suraj Jitindar Singh, Tyrel Datwyler, Vaibhav Jain, Vasant Hegde, YueHaibing, zhong jiang" * tag 'powerpc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (221 commits) Revert "selftests/powerpc: Fix out-of-tree build errors" powerpc/msi: Fix compile error on mpc83xx powerpc: Fix stack protector crashes on CPU hotplug powerpc/traps: restore recoverability of machine_check interrupts powerpc/64/module: REL32 relocation range check powerpc/64s/radix: Fix radix__flush_tlb_collapsed_pmd double flushing pmd selftests/powerpc: Add a test of wild bctr powerpc/mm: Fix page table dump to work on Radix powerpc/mm/radix: Display if mappings are exec or not powerpc/mm/radix: Simplify split mapping logic powerpc/mm/radix: Remove the retry in the split mapping logic powerpc/mm/radix: Fix small page at boundary when splitting powerpc/mm/radix: Fix overuse of small pages in splitting logic powerpc/mm/radix: Fix off-by-one in split mapping logic powerpc/ftrace: Handle large kernel configs powerpc/mm: Fix WARN_ON with THP NUMA migration selftests/powerpc: Fix out-of-tree build errors powerpc/time: no steal_time when CONFIG_PPC_SPLPAR is not selected powerpc/time: Only set CONFIG_ARCH_HAS_SCALED_CPUTIME on PPC64 powerpc/time: isolate scaled cputime accounting in dedicated functions. ...
2018-10-26powerpc/pseries: add missing cpumask.h include fileTyrel Datwyler
Build error is encountered when inlcuding <asm/rtas.h> if no explicit or implicit include of cpumask.h exists in the including file. In file included from arch/powerpc/platforms/pseries/hotplug-pci.c:3:0: ./arch/powerpc/include/asm/rtas.h:360:34: error: unknown type name 'cpumask_var_t' extern int rtas_online_cpus_mask(cpumask_var_t cpus); ^ ./arch/powerpc/include/asm/rtas.h:361:35: error: unknown type name 'cpumask_var_t' extern int rtas_offline_cpus_mask(cpumask_var_t cpus); Fixes: 120496ac2d2d ("powerpc: Bring all threads online prior to migration/hibernation") Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-26powerpc/8xx: Use patch_site for perf counters setupChristophe Leroy
The 8xx TLB miss routines are patched when (de)activating perf counters. This patch uses the new patch_site functionality in order to get a better code readability and avoid a label mess when dumping the code with 'objdump -d' Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-26powerpc/8xx: Use patch_site for memory setup patchingChristophe Leroy
The 8xx TLB miss routines are patched at startup at several places. This patch uses the new patch_site functionality in order to get a better code readability and avoid a label mess when dumping the code with 'objdump -d' Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-26powerpc/code-patching: Add a helper to get the address of a patch_siteChristophe Leroy
This patch adds a helper to get the address of a patch_site. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Call it "patch site" addr] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-26Revert "powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for CONFIG_SWAP"Christophe Leroy
This reverts commit 4f94b2c7462d9720b2afa7e8e8d4c19446bb31ce. That commit was buggy, as it used rlwinm instead of rlwimi. Instead of fixing that bug, we revert the previous commit in order to reduce the dependency between L1 entries and L2 entries Fixes: 4f94b2c7462d9 ("powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for CONFIG_SWAP") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-25Merge tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Radim Krčmář: "ARM: - Improved guest IPA space support (32 to 52 bits) - RAS event delivery for 32bit - PMU fixes - Guest entry hardening - Various cleanups - Port of dirty_log_test selftest PPC: - Nested HV KVM support for radix guests on POWER9. The performance is much better than with PR KVM. Migration and arbitrary level of nesting is supported. - Disable nested HV-KVM on early POWER9 chips that need a particular hardware bug workaround - One VM per core mode to prevent potential data leaks - PCI pass-through optimization - merge ppc-kvm topic branch and kvm-ppc-fixes to get a better base s390: - Initial version of AP crypto virtualization via vfio-mdev - Improvement for vfio-ap - Set the host program identifier - Optimize page table locking x86: - Enable nested virtualization by default - Implement Hyper-V IPI hypercalls - Improve #PF and #DB handling - Allow guests to use Enlightened VMCS - Add migration selftests for VMCS and Enlightened VMCS - Allow coalesced PIO accesses - Add an option to perform nested VMCS host state consistency check through hardware - Automatic tuning of lapic_timer_advance_ns - Many fixes, minor improvements, and cleanups" * tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits) KVM/nVMX: Do not validate that posted_intr_desc_addr is page aligned Revert "kvm: x86: optimize dr6 restore" KVM: PPC: Optimize clearing TCEs for sparse tables x86/kvm/nVMX: tweak shadow fields selftests/kvm: add missing executables to .gitignore KVM: arm64: Safety check PSTATE when entering guest and handle IL KVM: PPC: Book3S HV: Don't use streamlined entry path on early POWER9 chips arm/arm64: KVM: Enable 32 bits kvm vcpu events support arm/arm64: KVM: Rename function kvm_arch_dev_ioctl_check_extension() KVM: arm64: Fix caching of host MDCR_EL2 value KVM: VMX: enable nested virtualization by default KVM/x86: Use 32bit xor to clear registers in svm.c kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD kvm: vmx: Defer setting of DR6 until #DB delivery kvm: x86: Defer setting of CR2 until #PF delivery kvm: x86: Add payload operands to kvm_multiple_exception kvm: x86: Add exception payload fields to kvm_vcpu_events kvm: x86: Add has_payload and payload to kvm_queued_exception KVM: Documentation: Fix omission in struct kvm_vcpu_events KVM: selftests: add Enlightened VMCS test ...
2018-10-25Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timekeeping updates from Thomas Gleixner: "The timers and timekeeping departement provides: - Another large y2038 update with further preparations for providing the y2038 safe timespecs closer to the syscalls. - An overhaul of the SHCMT clocksource driver - SPDX license identifier updates - Small cleanups and fixes all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) tick/sched : Remove redundant cpu_online() check clocksource/drivers/dw_apb: Add reset control clocksource: Remove obsolete CLOCKSOURCE_OF_DECLARE clocksource/drivers: Unify the names to timer-* format clocksource/drivers/sh_cmt: Add R-Car gen3 support dt-bindings: timer: renesas: cmt: document R-Car gen3 support clocksource/drivers/sh_cmt: Properly line-wrap sh_cmt_of_table[] initializer clocksource/drivers/sh_cmt: Fix clocksource width for 32-bit machines clocksource/drivers/sh_cmt: Fixup for 64-bit machines clocksource/drivers/sh_tmu: Convert to SPDX identifiers clocksource/drivers/sh_mtu2: Convert to SPDX identifiers clocksource/drivers/sh_cmt: Convert to SPDX identifiers clocksource/drivers/renesas-ostm: Convert to SPDX identifiers clocksource: Convert to using %pOFn instead of device_node.name tick/broadcast: Remove redundant check RISC-V: Request newstat syscalls y2038: signal: Change rt_sigtimedwait to use __kernel_timespec y2038: socket: Change recvmmsg to use __kernel_timespec y2038: sched: Change sched_rr_get_interval to use __kernel_timespec y2038: utimes: Rework #ifdef guards for compat syscalls ...
2018-10-25Merge tag 'pci-v4.20-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: - Fix ASPM link_state teardown on removal (Lukas Wunner) - Fix misleading _OSC ASPM message (Sinan Kaya) - Make _OSC optional for PCI (Sinan Kaya) - Don't initialize ASPM link state when ACPI_FADT_NO_ASPM is set (Patrick Talbert) - Remove x86 and arm64 node-local allocation for host bridge structures (Punit Agrawal) - Pay attention to device-specific _PXM node values (Jonathan Cameron) - Support new Immediate Readiness bit (Felipe Balbi) - Differentiate between pciehp surprise and safe removal (Lukas Wunner) - Remove unnecessary pciehp includes (Lukas Wunner) - Drop pciehp hotplug_slot_ops wrappers (Lukas Wunner) - Tolerate PCIe Slot Presence Detect being hardwired to zero to workaround broken hardware, e.g., the Wilocity switch/wireless device (Lukas Wunner) - Unify pciehp controller & slot structs (Lukas Wunner) - Constify hotplug_slot_ops (Lukas Wunner) - Drop hotplug_slot_info (Lukas Wunner) - Embed hotplug_slot struct into users instead of allocating it separately (Lukas Wunner) - Initialize PCIe port service drivers directly instead of relying on initcall ordering (Keith Busch) - Restore PCI config state after a slot reset (Keith Busch) - Save/restore DPC config state along with other PCI config state (Keith Busch) - Reference count devices during AER handling to avoid race issue with concurrent hot removal (Keith Busch) - If an Upstream Port reports ERR_FATAL, don't try to read the Port's config space because it is probably unreachable (Keith Busch) - During error handling, use slot-specific reset instead of secondary bus reset to avoid link up/down issues on hotplug ports (Keith Busch) - Restore previous AER/DPC handling that does not remove and re-enumerate devices on ERR_FATAL (Keith Busch) - Notify all drivers that may be affected by error recovery resets (Keith Busch) - Always generate error recovery uevents, even if a driver doesn't have error callbacks (Keith Busch) - Make PCIe link active reporting detection generic (Keith Busch) - Support D3cold in PCIe hierarchies during system sleep and runtime, including hotplug and Thunderbolt ports (Mika Westerberg) - Handle hpmemsize/hpiosize kernel parameters uniformly, whether slots are empty or occupied (Jon Derrick) - Remove duplicated include from pci/pcie/err.c and unused variable from cpqphp (YueHaibing) - Remove driver pci_cleanup_aer_uncorrect_error_status() calls (Oza Pawandeep) - Uninline PCI bus accessors for better ftracing (Keith Busch) - Remove unused AER Root Port .error_resume method (Keith Busch) - Use kfifo in AER instead of a local version (Keith Busch) - Use threaded IRQ in AER bottom half (Keith Busch) - Use managed resources in AER core (Keith Busch) - Reuse pcie_port_find_device() for AER injection (Keith Busch) - Abstract AER interrupt handling to disconnect error injection (Keith Busch) - Refactor AER injection callbacks to simplify future improvments (Keith Busch) - Remove unused Netronome NFP32xx Device IDs (Jakub Kicinski) - Use bitmap_zalloc() for dma_alias_mask (Andy Shevchenko) - Add switch fall-through annotations (Gustavo A. R. Silva) - Remove unused Switchtec quirk variable (Joshua Abraham) - Fix pci.c kernel-doc warning (Randy Dunlap) - Remove trivial PCI wrappers for DMA APIs (Christoph Hellwig) - Add Intel GPU device IDs to spurious interrupt quirk (Bin Meng) - Run Switchtec DMA aliasing quirk only on NTB endpoints to avoid useless dmesg errors (Logan Gunthorpe) - Update Switchtec NTB documentation (Wesley Yung) - Remove redundant "default n" from Kconfig (Bartlomiej Zolnierkiewicz) - Avoid panic when drivers enable MSI/MSI-X twice (Tonghao Zhang) - Add PCI support for peer-to-peer DMA (Logan Gunthorpe) - Add sysfs group for PCI peer-to-peer memory statistics (Logan Gunthorpe) - Add PCI peer-to-peer DMA scatterlist mapping interface (Logan Gunthorpe) - Add PCI configfs/sysfs helpers for use by peer-to-peer users (Logan Gunthorpe) - Add PCI peer-to-peer DMA driver writer's documentation (Logan Gunthorpe) - Add block layer flag to indicate driver support for PCI peer-to-peer DMA (Logan Gunthorpe) - Map Infiniband scatterlists for peer-to-peer DMA if they contain P2P memory (Logan Gunthorpe) - Register nvme-pci CMB buffer as PCI peer-to-peer memory (Logan Gunthorpe) - Add nvme-pci support for PCI peer-to-peer memory in requests (Logan Gunthorpe) - Use PCI peer-to-peer memory in nvme (Stephen Bates, Steve Wise, Christoph Hellwig, Logan Gunthorpe) - Cache VF config space size to optimize enumeration of many VFs (KarimAllah Ahmed) - Remove unnecessary <linux/pci-ats.h> include (Bjorn Helgaas) - Fix VMD AERSID quirk Device ID matching (Jon Derrick) - Fix Cadence PHY handling during probe (Alan Douglas) - Signal Cadence Endpoint interrupts via AXI region 0 instead of last region (Alan Douglas) - Write Cadence Endpoint MSI interrupts with 32 bits of data (Alan Douglas) - Remove redundant controller tests for "device_type == pci" (Rob Herring) - Document R-Car E3 (R8A77990) bindings (Tho Vu) - Add device tree support for R-Car r8a7744 (Biju Das) - Drop unused mvebu PCIe capability code (Thomas Petazzoni) - Add shared PCI bridge emulation code (Thomas Petazzoni) - Convert mvebu to use shared PCI bridge emulation (Thomas Petazzoni) - Add aardvark Root Port emulation (Thomas Petazzoni) - Support 100MHz/200MHz refclocks for i.MX6 (Lucas Stach) - Add initial power management for i.MX7 (Leonard Crestez) - Add PME_Turn_Off support for i.MX7 (Leonard Crestez) - Fix qcom runtime power management error handling (Bjorn Andersson) - Update TI dra7xx unaligned access errata workaround for host mode as well as endpoint mode (Vignesh R) - Fix kirin section mismatch warning (Nathan Chancellor) - Remove iproc PAXC slot check to allow VF support (Jitendra Bhivare) - Quirk Keystone K2G to limit MRRS to 256 (Kishon Vijay Abraham I) - Update Keystone to use MRRS quirk for host bridge instead of open coding (Kishon Vijay Abraham I) - Refactor Keystone link establishment (Kishon Vijay Abraham I) - Simplify and speed up Keystone link training (Kishon Vijay Abraham I) - Remove unused Keystone host_init argument (Kishon Vijay Abraham I) - Merge Keystone driver files into one (Kishon Vijay Abraham I) - Remove redundant Keystone platform_set_drvdata() (Kishon Vijay Abraham I) - Rename Keystone functions for uniformity (Kishon Vijay Abraham I) - Add Keystone device control module DT binding (Kishon Vijay Abraham I) - Use SYSCON API to get Keystone control module device IDs (Kishon Vijay Abraham I) - Clean up Keystone PHY handling (Kishon Vijay Abraham I) - Use runtime PM APIs to enable Keystone clock (Kishon Vijay Abraham I) - Clean up Keystone config space access checks (Kishon Vijay Abraham I) - Get Keystone outbound window count from DT (Kishon Vijay Abraham I) - Clean up Keystone outbound window configuration (Kishon Vijay Abraham I) - Clean up Keystone DBI setup (Kishon Vijay Abraham I) - Clean up Keystone ks_pcie_link_up() (Kishon Vijay Abraham I) - Fix Keystone IRQ status checking (Kishon Vijay Abraham I) - Add debug messages for all Keystone errors (Kishon Vijay Abraham I) - Clean up Keystone includes and macros (Kishon Vijay Abraham I) - Fix Mediatek unchecked return value from devm_pci_remap_iospace() (Gustavo A. R. Silva) - Fix Mediatek endpoint/port matching logic (Honghui Zhang) - Change Mediatek Root Port Class Code to PCI_CLASS_BRIDGE_PCI (Honghui Zhang) - Remove redundant Mediatek PM domain check (Honghui Zhang) - Convert Mediatek to pci_host_probe() (Honghui Zhang) - Fix Mediatek MSI enablement (Honghui Zhang) - Add Mediatek system PM support for MT2712 and MT7622 (Honghui Zhang) - Add Mediatek loadable module support (Honghui Zhang) - Detach VMD resources after stopping root bus to prevent orphan resources (Jon Derrick) - Convert pcitest build process to that used by other tools (iio, perf, etc) (Gustavo Pimentel) * tag 'pci-v4.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (140 commits) PCI/AER: Refactor error injection fallbacks PCI/AER: Abstract AER interrupt handling PCI/AER: Reuse existing pcie_port_find_device() interface PCI/AER: Use managed resource allocations PCI: pcie: Remove redundant 'default n' from Kconfig PCI: aardvark: Implement emulated root PCI bridge config space PCI: mvebu: Convert to PCI emulated bridge config space PCI: mvebu: Drop unused PCI express capability code PCI: Introduce PCI bridge emulated config space common logic PCI: vmd: Detach resources after stopping root bus nvmet: Optionally use PCI P2P memory nvmet: Introduce helper functions to allocate and free request SGLs nvme-pci: Add support for P2P memory in requests nvme-pci: Use PCI p2pmem subsystem to manage the CMB IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init|destroy]() block: Add PCI P2P flag for request queue PCI/P2PDMA: Add P2P DMA driver writer's documentation docs-rst: Add a new directory for PCI documentation PCI/P2PDMA: Introduce configfs/sysfs enable attribute helpers PCI/P2PDMA: Add PCI p2pmem DMA mappings to adjust the bus offset ...
2018-10-24Merge branch 'siginfo-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull siginfo updates from Eric Biederman: "I have been slowly sorting out siginfo and this is the culmination of that work. The primary result is in several ways the signal infrastructure has been made less error prone. The code has been updated so that manually specifying SEND_SIG_FORCED is never necessary. The conversion to the new siginfo sending functions is now complete, which makes it difficult to send a signal without filling in the proper siginfo fields. At the tail end of the patchset comes the optimization of decreasing the size of struct siginfo in the kernel from 128 bytes to about 48 bytes on 64bit. The fundamental observation that enables this is by definition none of the known ways to use struct siginfo uses the extra bytes. This comes at the cost of a small user space observable difference. For the rare case of siginfo being injected into the kernel only what can be copied into kernel_siginfo is delivered to the destination, the rest of the bytes are set to 0. For cases where the signal and the si_code are known this is safe, because we know those bytes are not used. For cases where the signal and si_code combination is unknown the bits that won't fit into struct kernel_siginfo are tested to verify they are zero, and the send fails if they are not. I made an extensive search through userspace code and I could not find anything that would break because of the above change. If it turns out I did break something it will take just the revert of a single change to restore kernel_siginfo to the same size as userspace siginfo. Testing did reveal dependencies on preferring the signo passed to sigqueueinfo over si->signo, so bit the bullet and added the complexity necessary to handle that case. Testing also revealed bad things can happen if a negative signal number is passed into the system calls. Something no sane application will do but something a malicious program or a fuzzer might do. So I have fixed the code that performs the bounds checks to ensure negative signal numbers are handled" * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (80 commits) signal: Guard against negative signal numbers in copy_siginfo_from_user32 signal: Guard against negative signal numbers in copy_siginfo_from_user signal: In sigqueueinfo prefer sig not si_signo signal: Use a smaller struct siginfo in the kernel signal: Distinguish between kernel_siginfo and siginfo signal: Introduce copy_siginfo_from_user and use it's return value signal: Remove the need for __ARCH_SI_PREABLE_SIZE and SI_PAD_SIZE signal: Fail sigqueueinfo if si_signo != sig signal/sparc: Move EMT_TAGOVF into the generic siginfo.h signal/unicore32: Use force_sig_fault where appropriate signal/unicore32: Generate siginfo in ucs32_notify_die signal/unicore32: Use send_sig_fault where appropriate signal/arc: Use force_sig_fault where appropriate signal/arc: Push siginfo generation into unhandled_exception signal/ia64: Use force_sig_fault where appropriate signal/ia64: Use the force_sig(SIGSEGV,...) in ia64_rt_sigreturn signal/ia64: Use the generic force_sigsegv in setup_frame signal/arm/kvm: Use send_sig_mceerr signal/arm: Use send_sig_fault where appropriate signal/arm: Use force_sig_fault where appropriate ...
2018-10-21powerpc/msi: Fix compile error on mpc83xxChristophe Leroy
mpic_get_primary_version() is not defined when not using MPIC. The compile error log like: arch/powerpc/sysdev/built-in.o: In function `fsl_of_msi_probe': fsl_msi.c:(.text+0x150c): undefined reference to `fsl_mpic_primary_get_version' Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Reported-by: Radu Rendec <radu.rendec@gmail.com> Fixes: 807d38b73b6 ("powerpc/mpic: Add get_version API both for internal and external use") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20KVM: PPC: Optimize clearing TCEs for sparse tablesAlexey Kardashevskiy
The powernv platform maintains 2 TCE tables for VFIO - a hardware TCE table and a table with userspace addresses. These tables are radix trees, we allocate indirect levels when they are written to. Since the memory allocation is problematic in real mode, we have 2 accessors to the entries: - for virtual mode: it allocates the memory and it is always expected to return non-NULL; - fr real mode: it does not allocate and can return NULL. Also, DMA windows can span to up to 55 bits of the address space and since we never have this much RAM, such windows are sparse. However currently the SPAPR TCE IOMMU driver walks through all TCEs to unpin DMA memory. Since we maintain a userspace addresses table for VFIO which is a mirror of the hardware table, we can use it to know which parts of the DMA window have not been mapped and skip these so does this patch. The bare metal systems do not have this problem as they use a bypass mode of a PHB which maps RAM directly. This helps a lot with sparse DMA windows, reducing the shutdown time from about 3 minutes per 1 billion TCEs to a few seconds for 32GB sparse guest. Just skipping the last level seems to be good enough. As non-allocating accessor is used now in virtual mode as well, rename it from IOMMU_TABLE_USERSPACE_ENTRY_RM (real mode) to _RO (read only). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>