summaryrefslogtreecommitdiff
path: root/arch/powerpc/include/asm
AgeCommit message (Collapse)Author
2017-04-11powerpc: Create asm/debugfs.h and move powerpc_debugfs_root thereMichael Ellerman
powerpc_debugfs_root is the dentry representing the root of the "powerpc" directory tree in debugfs. Currently it sits in asm/debug.h, a long with some other things that have "debug" in the name, but are otherwise unrelated. Pull it out into a separate header, which also includes linux/debugfs.h, and convert all the users to include debugfs.h instead of debug.h. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-04powerpc/powernv: Introduce address translation services for Nvlink2Alistair Popple
Nvlink2 supports address translation services (ATS) allowing devices to request address translations from an mmu known as the nest MMU which is setup to walk the CPU page tables. To access this functionality certain firmware calls are required to setup and manage hardware context tables in the nvlink processing unit (NPU). The NPU also manages forwarding of TLB invalidates (known as address translation shootdowns/ATSDs) to attached devices. This patch exports several methods to allow device drivers to register a process id (PASID/PID) in the hardware tables and to receive notification of when a device should stop issuing address translation requests (ATRs). It also adds a fault handler to allow device drivers to demand fault pages in. Signed-off-by: Alistair Popple <alistair@popple.id.au> [mpe: Fix up comment formatting, use flush_tlb_mm()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-03powerpc/book3s: Print task info if we take a machine check in user modeMichael Ellerman
For an MCE (Machine Check Exception) that hits while in user mode MSR(PR=1), print the task info to the console MCE error log. This may help to identify an application that triggered the MCE. After this patch the MCE console looks like: Severe Machine check interrupt [Recovered] NIP: [0000000010039778] PID: 762 Comm: ebizzy Initiator: CPU Error type: SLB [Multihit] Effective address: 0000000010039778 Severe Machine check interrupt [Not recovered] NIP: [0000000010039778] PID: 763 Comm: ebizzy Initiator: CPU Error type: UE [Page table walk ifetch] Effective address: 0000000010039778 ebizzy[763]: unhandled signal 7 at 0000000010039778 nip 0000000010039778 lr 0000000010001b44 code 30004 Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-01powerpc/mm: Enable mappings above 128TBAneesh Kumar K.V
Not all user space application is ready to handle wide addresses. It's known that at least some JIT compilers use higher bits in pointers to encode their information. It collides with valid pointers with 512TB addresses and leads to crashes. To mitigate this, we are not going to allocate virtual address space above 128TB by default. But userspace can ask for allocation from full address space by specifying hint address (with or without MAP_FIXED) above 128TB. If hint address set above 128TB, but MAP_FIXED is not specified, we try to look for unmapped area by specified address. If it's already occupied, we look for unmapped area in *full* address space, rather than from 128TB window. This approach helps to easily make application's memory allocator aware about large address space without manually tracking allocated virtual address space. This is going to be a per mmap decision. ie, we can have some mmaps with larger addresses and other that do not. A sample memory layout looks like: 10000000-10010000 r-xp 00000000 fc:00 9057045 /home/max_addr_512TB 10010000-10020000 r--p 00000000 fc:00 9057045 /home/max_addr_512TB 10020000-10030000 rw-p 00010000 fc:00 9057045 /home/max_addr_512TB 10029630000-10029660000 rw-p 00000000 00:00 0 [heap] 7fff834a0000-7fff834b0000 rw-p 00000000 00:00 0 7fff834b0000-7fff83670000 r-xp 00000000 fc:00 9177190 /lib/powerpc64le-linux-gnu/libc-2.23.so 7fff83670000-7fff83680000 r--p 001b0000 fc:00 9177190 /lib/powerpc64le-linux-gnu/libc-2.23.so 7fff83680000-7fff83690000 rw-p 001c0000 fc:00 9177190 /lib/powerpc64le-linux-gnu/libc-2.23.so 7fff83690000-7fff836a0000 rw-p 00000000 00:00 0 7fff836a0000-7fff836c0000 r-xp 00000000 00:00 0 [vdso] 7fff836c0000-7fff83700000 r-xp 00000000 fc:00 9177193 /lib/powerpc64le-linux-gnu/ld-2.23.so 7fff83700000-7fff83710000 r--p 00030000 fc:00 9177193 /lib/powerpc64le-linux-gnu/ld-2.23.so 7fff83710000-7fff83720000 rw-p 00040000 fc:00 9177193 /lib/powerpc64le-linux-gnu/ld-2.23.so 7fffdccf0000-7fffdcd20000 rw-p 00000000 00:00 0 [stack] 1000000000000-1000000010000 rw-p 00000000 00:00 0 1ffff83710000-1ffff83720000 rw-p 00000000 00:00 0 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-01powerpc/pseries: Skip using reserved virtual address rangeAneesh Kumar K.V
Now that we use all the available virtual address range, we need to make sure we don't generate VSID such that it overlaps with the reserved vsid range. Reserved vsid range include the virtual address range used by the adjunct partition and also the VRMA virtual segment. We find the context value that can result in generating such a VSID and reserve it early in boot. We don't look at the adjunct range, because for now we disable the adjunct usage in a Linux LPAR via CAS interface. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Rewrite hash__reserve_context_id(), move the rest into pseries] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-01powerpc/mm/hash: Store addr_limit in PACAAneesh Kumar K.V
We optmize the slice page size array copy to paca by copying only the range based on addr_limit. This will require us to not look at page size array beyond addr_limit in PACA on slb fault. To enable that copy task size to paca which will be used during slb fault. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Rename from task_size to addr_limit, consolidate #ifdefs] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-01powerpc/mm: Add addr_limit to mm_context and use it to derive max slice indexAneesh Kumar K.V
In the followup patch, we will increase the slice array size to handle 512TB range, but will limit the max addr to 128TB. Avoid doing unnecessary computation and avoid doing slice mask related operation above address limit. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm/hash: Increase VA range to 128TBAneesh Kumar K.V
We update the hash linux page table layout such that we can support 512TB. But we limit the TASK_SIZE to 128TB. We can switch to 128TB by default without conditional because that is the max virtual address supported by other architectures. We will later add a mechanism to on-demand increase the application's effective address range to 512TB. Having the page table layout changed to accommodate 512TB makes testing large memory configuration easier with less code changes to kernel Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm/hash: Convert mask to unsigned longAneesh Kumar K.V
This doesn't have any functional change. But helps in avoiding mistakes in case the shift bit changes Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm/hash: Support 68 bit VAAneesh Kumar K.V
Inorder to support large effective address range (512TB), we want to increase the virtual address bits to 68. But we do have platforms like p4 and p5 that can only do 65 bit VA. We support those platforms by limiting context bits on them to 16. The protovsid -> vsid conversion is verified to work with both 65 and 68 bit va values. I also documented the restrictions in a table format as part of code comments. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm/hash: Check for non-kernel address in get_kernel_vsid()Michael Ellerman
get_kernel_vsid() has a very stern comment saying that it's only valid for kernel addresses, but there's nothing in the code to enforce that. Rather than hoping our callers are well behaved, add a check and return a VSID of 0 (invalid). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm/hash: Use context ids 1-4 for the kernelAneesh Kumar K.V
Currently we use the top 4 context ids (0x7fffc-0x7ffff) for the kernel. Kernel VSIDs are built using these top context values and effective the segement ID. In subsequent patches we want to increase the max effective address to 512TB. We will achieve that by increasing the effective segment IDs there by increasing virtual address range. We will be switching to a 68bit virtual address in the following patch. But platforms like Power4 and Power5 only support a 65 bit virtual address. We will handle that by limiting the context bits to 16 instead of 19 on those platforms. That means the max context id will have a different value on different platforms. So that we don't have to deal with the kernel context ids changing between different platforms, move the kernel context ids down to use context ids 1-4. We can't use segment 0 of context-id 0, because that maps to VSID 0, which we want to keep as invalid, so we avoid context-id 0 entirely. Similarly we can't use the last segment of the maximum context, so we avoid it too. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Switch from 0-3 to 1-4 so VSID=0 remains invalid] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm: Split radix vs hash mm context initialisationMichael Ellerman
Complete the split of the radix vs hash mm context initialisation. This is mostly code movement, with the exception that we now limit the context allocation to PRTB_ENTRIES - 1 on radix. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm/hash: Abstract context id allocation for KVMMichael Ellerman
KVM wants to be able to allocate an MMU context id, which it does currently by calling __init_new_context(). We're about to rework that code, so provide a wrapper for KVM so it can not worry about the details. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm/slice: Move slice_mask struct definition to slice.cAneesh Kumar K.V
This structure definition need not be in a header since this is used only by slice.c file. So move it to slice.c. This also allow us to use SLICE_NUM_HIGH instead of 64. I also switch the low_slices type to u64 from u16. This doesn't have an impact on size of struct due to padding added with u16 type. This helps in using bitmap printing function for printing slice mask. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm: Move copy_mm_to_paca to paca.cAneesh Kumar K.V
We also update the function arg to struct mm_struct. Move this so that function finds the definition of struct mm_struct. No functional change in this patch. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm/slice: Convert slice_mask high slice to a bitmapAneesh Kumar K.V
In followup patch we want to increase the va range which will result in us requiring high_slices to have more than 64 bits. To enable this convert high_slices to bitmap. We keep the number bits same in this patch and later change that to higher value Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Fold in fix to use bitmap_empty()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm: Move hash specific pte bits to be top bits of RPNAneesh Kumar K.V
We don't support the full 57 bits of physical address and hence can overload the top bits of RPN as hash specific pte bits. Add a BUILD_BUG_ON() to enforce the relationship between H_PAGE_F_SECOND and H_PAGE_F_GIX. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> [mpe: Move the BUILD_BUG_ON() into hash_utils_64.c and comment it] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm: Lower the max real address to 53 bitsAneesh Kumar K.V
Max value supported by hardware is 51 bits address. Radix page table define a slot of 57 bits for future expansion. We restrict the value supported in linux kernel 53 bits, so that we can use the bits between 57-53 for storing hash linux page table bits. This is done in the next patch. This will free up the software page table bits to be used for features that are needed for both hash and radix. The current hash linux page table format doesn't have any free software bits. Moving hash linux page table specific bits to top of RPN field free up the software bits for other purpose. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm: Define all PTE bits based on radix definitions.Aneesh Kumar K.V
Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm: Define _PAGE_SOFT_DIRTY unconditionallyAneesh Kumar K.V
Conditional PTE bit definition is confusing and results in coding error. Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm/radix: rename _PAGE_LARGE to R_PAGE_LARGEAneesh Kumar K.V
This bit is only used by radix and it is nice to follow the naming style of having bit name start with H_/R_ depending on which translation mode they are used. No functional change in this patch. Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm: Cleanup bits definition between hash and radix.Aneesh Kumar K.V
Define everything based on bits present in pgtable.h. This will help in easily identifying overlapping bits between hash/radix. No functional change with this patch. Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm/nohash: MM_SLICE is only used by book3s 64Aneesh Kumar K.V
BOOKE code is dead code as per the Kconfig details. So make it simpler by enabling MM_SLICE only for book3s_64. The changes w.r.t nohash is just removing deadcode. W.r.t ppc64, 4k without hugetlb will now enable MM_SLICE. But that is good, because we reduce one extra variant which probably is not getting tested much. Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-21powerpc/64s: Move POWER machine check defines into mce_power.cNicholas Piggin
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-21powerpc: Fix missing CRCs, add more asm-prototypes.h declarationsBen Hutchings
Add declarations for: - __mfdcr, __mtdcr (if CONFIG_PPC_DCR_NATIVE=y; through <asm/dcr.h>) - switch_mmu_context (if CONFIG_PPC_BOOK3S_64=n; through <asm/mmu_context.h>) Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-20powerpc/ftrace: Add prototype for prepare_ftrace_return()Tobin C. Harding
Sparse emits a warning: symbol 'prepare_ftrace_return' was not declared. Should it be static? prepare_ftrace_return() is called from assembler and should not be static. Add a prototype for it to asm-prototypes.h and include that in ftrace.c. Signed-off-by: Tobin C. Harding <me@tobin.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-20powerpc/pseries: Move struct hcall_stats to hvCall_inst.cTobin C. Harding
struct hcall_stats is only used in hvCall_inst.c, so move it there. Signed-off-by: Tobin C. Harding <me@tobin.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-20powerpc: Move THREAD_SHIFT config to KconfigHamish Martin
Shift the logic for defining THREAD_SHIFT logic to Kconfig in order to allow override by users. Signed-off-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz> Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-16powerpc: Wire up statx() syscallChandan Rajendra
Test runs on a ppc64 BE guest succeeded. linux/samples/statx/test-statx program was executed on the following file types, 1. Regular file 2. Directory 3. device file 4. symlink 5. Named pipe The test run also included invoking test-statx with the runtime options provided in the main() function of test-statx.c Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-13Merge tag 'powerpc-4.11-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull some more powerpc fixes from Michael Ellerman: "The main item is the addition of the Power9 Machine Check handler. This was delayed to make sure some details were correct, and is as minimal as possible. The rest is small fixes, two for the Power9 PMU, two dealing with obscure toolchain problems, two for the PowerNV IOMMU code (used by VFIO), and one to fix a crash on 32-bit machines with macio devices due to missing dma_ops. Thanks to: Alexey Kardashevskiy, Cyril Bur, Larry Finger, Madhavan Srinivasan, Nicholas Piggin" * tag 'powerpc-4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: POWER9 machine check handler powerpc/64s: allow machine check handler to set severity and initiator powerpc/64s: fix handling of non-synchronous machine checks powerpc/pmac: Fix crash in dma-mapping.h with NULL dma_ops powerpc/powernv/ioda2: Update iommu table base on ownership change powerpc/powernv/ioda2: Gracefully fail if too many TCE levels requested selftests/powerpc: Replace stxvx and lxvx with stxvd2x/lxvd2x powerpc/perf: Handle sdar_mode for marked event in power9 powerpc/perf: Fix perf_get_data_addr() for power9 DD1 powerpc/boot: Fix zImage TOC alignment
2017-03-10Merge branch 'prep-for-5level'Linus Torvalds
Merge 5-level page table prep from Kirill Shutemov: "Here's relatively low-risk part of 5-level paging patchset. Merging it now will make x86 5-level paging enabling in v4.12 easier. The first patch is actually x86-specific: detect 5-level paging support. It boils down to single define. The rest of patchset converts Linux MMU abstraction from 4- to 5-level paging. Enabling of new abstraction in most cases requires adding single line of code in arch-specific code. The rest is taken care by asm-generic/. Changes to mm/ code are mostly mechanical: add support for new page table level -- p4d_t -- where we deal with pud_t now. v2: - fix build on microblaze (Michal); - comment for __ARCH_HAS_5LEVEL_HACK in kasan_populate_zero_shadow(); - acks from Michal" * emailed patches from Kirill A Shutemov <kirill.shutemov@linux.intel.com>: mm: introduce __p4d_alloc() mm: convert generic code to 5-level paging asm-generic: introduce <asm-generic/pgtable-nop4d.h> arch, mm: convert all architectures to use 5level-fixup.h asm-generic: introduce __ARCH_USE_5LEVEL_HACK asm-generic: introduce 5level-fixup.h x86/cpufeature: Add 5-level paging detection
2017-03-10powerpc/64s: POWER9 machine check handlerNicholas Piggin
Add POWER9 machine check handler. There are several new types of errors added, so logging messages for those are also added. This doesn't attempt to reuse any of the P7/8 defines or functions, because that becomes too complex. The better option in future is to use a table driven approach. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-10powerpc/64s: allow machine check handler to set severity and initiatorNicholas Piggin
Currently severity and initiator are always set to MCE_SEV_ERROR_SYNC and MCE_INITIATOR_CPU in the core mce code. Allow them to be set by the machine specific mce handlers. No functional change for existing handlers. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-09power/mm: update pte_write and pte_wrprotect to handle savedwriteAneesh Kumar K.V
We use pte_write() to check whethwer the pte entry is writable. This is mostly used to later mark the pte read only if it is writable. The other use of pte_write() is to check whether the pte_entry is writable so that hardware page table entry can be marked accordingly. This is used in kvm where we look at qemu page table entry and update hardware hash page table for the guest with correct write enable bit. With the above, for the first usage we should also check the savedwrite bit so that we can correctly clear the savedwite bit. For the later, we add a new variant __pte_write(). With this we can revert write_protect_page part of 595cd8f256d2 ("mm/ksm: handle protnone saved writes when making page write protect"). But I left it as it is as an example code for savedwrite check. Fixes: c137a2757b886 ("powerpc/mm/autonuma: switch ppc64 to its own implementation of saved write") Link: http://lkml.kernel.org/r/1488203787-17849-2-git-send-email-aneesh.kumar@linux.vnet.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Rik van Riel <riel@surriel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-09powerpc/mm: handle protnone ptes on forkAneesh Kumar K.V
We need to mark pages of parent process read only on fork. Numa fault pte needs a protnone ptes variant with saved write flag set. On fork we need to make sure we remove the saved write bit. Instead of adding the protnone check in the caller update ptep_set_wrprotect variants to clear savedwrite bit. Without this we see random segfaults in application on fork. Fixes: c137a2757b886 ("powerpc/mm/autonuma: switch ppc64 to its own implementation of saved write") Link: http://lkml.kernel.org/r/1488203787-17849-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Rik van Riel <riel@surriel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-09arch, mm: convert all architectures to use 5level-fixup.hKirill A. Shutemov
If an architecture uses 4level-fixup.h we don't need to do anything as it includes 5level-fixup.h. If an architecture uses pgtable-nop*d.h, define __ARCH_USE_5LEVEL_HACK before inclusion of the header. It makes asm-generic code to use 5level-fixup.h. If an architecture has 4-level paging or folds levels on its own, include 5level-fixup.h directly. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-07Merge tag 'powerpc-4.11-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Five fairly small fixes for things that went in this cycle. A fairly large patch to rework the CAS logic on Power9, necessitated by a late change to the firmware API, and we can't boot without it. Three fixes going to stable, allowing more instructions to be emulated on LE, fixing a boot crash on 32-bit Freescale BookE machines, and the OPAL XICS workaround. And a patch from me to sort the selects under CONFIG PPC. Annoying churn, but worth it in the long run, and best for it to go in now to avoid conflicts. Thanks to: Alexey Kardashevskiy, Anton Blanchard, Balbir Singh, Gautham R. Shenoy, Laurentiu Tudor, Nicholas Piggin, Paul Mackerras, Ravi Bangoria, Sachin Sant, Shile Zhang, Suraj Jitindar Singh" * tag 'powerpc-4.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc: Sort the selects under CONFIG_PPC powerpc/64: Fix L1D cache shape vector reporting L1I values powerpc/64: Avoid panic during boot due to divide by zero in init_cache_info() powerpc: Update to new option-vector-5 format for CAS powerpc: Parse the command line before calling CAS powerpc/xics: Work around limitations of OPAL XICS priority handling powerpc/64: Fix checksum folding in csum_add() powerpc/powernv: Fix opal tracepoints with JUMP_LABEL=n powerpc/booke: Fix boot crash due to null hugepd powerpc: Fix compiling a BE kernel with a powerpc64le toolchain selftest/powerpc: Fix false failures for skipped tests powerpc/powernv: Fix bug due to labeling ambiguity in power_enter_stop powerpc/64: Invalidate process table caching after setting process table powerpc: emulate_step() tests for load/store instructions powerpc: Emulation support for load/store instructions on LE
2017-03-06powerpc/64: Fix L1D cache shape vector reporting L1I valuesMichael Ellerman
It seems we didn't pay quite enough attention when testing the new cache shape vectors, which means we didn't notice the bug where the vector for the L1D was using the L1I values. Fix it, resulting in eg: L1I cache size: 0x8000 32768B 32K L1I line size: 0x80 8-way associative L1D cache size: 0x10000 65536B 64K L1D line size: 0x80 8-way associative Fixes: 98a5f361b862 ("powerpc: Add new cache geometry aux vectors") Cut-and-paste-bug-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Badly-reviewed-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-06powerpc: Update to new option-vector-5 format for CASSuraj Jitindar Singh
On POWER9 the ibm,client-architecture-support (CAS) negotiation process has been updated to change how the host to guest negotiation is done for the new hash/radix mmu as well as the nest mmu, process tables and guest translation shootdown (GTSE). This is documented in the unreleased PAPR ACR "CAS option vector additions for P9". The host tells the guest which options it supports in ibm,arch-vec-5-platform-support. The guest then chooses a subset of these to request in the CAS call and these are agreed to in the ibm,architecture-vec-5 property of the chosen node. Thus we read ibm,arch-vec-5-platform-support and make our selection before calling CAS. We then parse the ibm,architecture-vec-5 property of the chosen node to check whether we should run as hash or radix. ibm,arch-vec-5-platform-support format: index value pairs: <index, val> ... <index, val> index: Option vector 5 byte number val: Some representation of supported values Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> [mpe: Don't print about unknown options, be consistent with OV5_FEAT] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-04Merge tag 'kvm-4.11-2' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull more KVM updates from Radim Krčmář: "Second batch of KVM changes for the 4.11 merge window: PPC: - correct assumption about ASDR on POWER9 - fix MMIO emulation on POWER9 x86: - add a simple test for ioperm - cleanup TSS (going through KVM tree as the whole undertaking was caused by VMX's use of TSS) - fix nVMX interrupt delivery - fix some performance counters in the guest ... and two cleanup patches" * tag 'kvm-4.11-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: nVMX: Fix pending events injection x86/kvm/vmx: remove unused variable in segment_base() selftests/x86: Add a basic selftest for ioperm x86/asm: Tidy up TSS limit code kvm: convert kvm.users_count from atomic_t to refcount_t KVM: x86: never specify a sample period for virtualized in_tx_cp counters KVM: PPC: Book3S HV: Don't use ASDR for real-mode HPT faults on POWER9 KVM: PPC: Book3S HV: Fix software walk of guest process page tables
2017-03-04powerpc/64: Fix checksum folding in csum_add()Shile Zhang
Paul's patch to fix checksum folding, commit b492f7e4e07a ("powerpc/64: Fix checksum folding in csum_tcpudp_nofold and ip_fast_csum_nofold") missed a case in csum_add(). Fix it. Signed-off-by: Shile Zhang <shile.zhang@nokia.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-03powerpc/booke: Fix boot crash due to null hugepdLaurentiu Tudor
On 32-bit book-e machines, hugepd_ok() no longer takes into account null hugepd values, causing this crash at boot: Unable to handle kernel paging request for data at address 0x80000000 ... NIP [c0018378] follow_huge_addr+0x38/0xf0 LR [c001836c] follow_huge_addr+0x2c/0xf0 Call Trace: follow_huge_addr+0x2c/0xf0 (unreliable) follow_page_mask+0x40/0x3e0 __get_user_pages+0xc8/0x450 get_user_pages_remote+0x8c/0x250 copy_strings+0x110/0x390 copy_strings_kernel+0x2c/0x50 do_execveat_common+0x478/0x630 do_execve+0x2c/0x40 try_to_run_init_process+0x18/0x60 kernel_init+0xbc/0x110 ret_from_kernel_thread+0x5c/0x64 This impacts all nxp (ex-freescale) 32-bit booke platforms. This was caused by the change of hugepd_t.pd from signed to unsigned, and the update to the nohash version of hugepd_ok(). Previously hugepd_ok() could exclude all non-huge and NULL pgds using > 0, whereas now we need to explicitly check that the value is not zero and also that PD_HUGE is *clear*. This isn't protected by the pgd_none() check in __find_linux_pte_or_hugepte() because on 32-bit we use pgtable-nopud.h, which causes the pgd_none() check to be always false. Fixes: 20717e1ff526 ("powerpc/mm: Fix little-endian 4K hugetlb") Cc: stable@vger.kernel.org # v4.7+ Reported-by: Madalin-Cristian Bucur <madalin.bucur@nxp.com> Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> [mpe: Flesh out change log details.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-03powerpc/powernv: Fix bug due to labeling ambiguity in power_enter_stopGautham R. Shenoy
Commit 09206b600c76 ("powernv: Pass PSSCR value and mask to power9_idle_stop") added additional code in power_enter_stop() to distinguish between stop requests whose PSSCR had ESL=EC=1 from those which did not. When ESL=EC=1, we do a forward-jump to a location labelled by "1", which had the code to handle the ESL=EC=1 case. Unfortunately just a couple of instructions before this label, is the macro IDLE_STATE_ENTER_SEQ() which also has a label "1" in its expansion. As a result, the current code can result in directly executing stop instruction for deep stop requests with PSSCR ESL=EC=1, without saving the hypervisor state. Fix this BUG by labeling the location that handles ESL=EC=1 case with a more descriptive label ".Lhandle_esl_ec_set" (local label suggestion a la .Lxx from Anton Blanchard). While at it, rename the label "2" labelling the location of the code handling entry into deep stop states with ".Lhandle_deep_stop". For a good measure, change the label in IDLE_STATE_ENTER_SEQ() macro to an not-so commonly used value in order to avoid similar mishaps in the future. Fixes: 09206b600c76 ("powernv: Pass PSSCR value and mask to power9_idle_stop") Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-03powerpc: emulate_step() tests for load/store instructionsRavi Bangoria
Add new selftest that test emulate_step for Normal, Floating Point, Vector and Vector Scalar - load/store instructions. Test should run at boot time if CONFIG_KPROBES_SANITY_TEST and CONFIG_PPC64 is set. Sample log: emulate_step_test: ld : PASS emulate_step_test: lwz : PASS emulate_step_test: lwzx : PASS emulate_step_test: std : PASS emulate_step_test: ldarx / stdcx. : PASS emulate_step_test: lfsx : PASS emulate_step_test: stfsx : PASS emulate_step_test: lfdx : PASS emulate_step_test: stfdx : PASS emulate_step_test: lvx : PASS emulate_step_test: stvx : PASS emulate_step_test: lxvd2x : PASS emulate_step_test: stxvd2x : PASS Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> [mpe: Drop start/complete lines, make it all __init] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-01Merge tag 'powerpc-4.11-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull more powerpc updates from Michael Ellerman: "Highlights include: - an update of the disassembly code used by xmon to the latest versions in binutils. We've received permission from all the authors of the relevant binutils changes to relicense their changes to the relevant files from GPLv3 to GPLv2, for inclusion in Linux. Thanks to Peter Bergner for doing the leg work to get permission from everyone. - addition of the "architected" Power9 CPU table entry, allowing us to boot in Power9 architected mode under a hypervisor. - updates to the Power9 PMU code. - implementation of clear_bit_unlock_is_negative_byte() to optimise unlock_page(). - Freescale updates from Scott: "Highlights include 8xx breakpoints and perf, t1042rdb display support, and board updates." Thanks to: Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Balbir Singh, Douglas Miller, Frédéric Weisbecker, Gavin Shan, Madhavan Srinivasan, Michael Roth, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Peter Bergner, Paul E. McKenney, Rashmica Gupta, Russell Currey, Sahil Mehta, Stewart Smith" * tag 'powerpc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (48 commits) powerpc: Remove leftover cputime_to_nsecs call causing build error powerpc/mm/hash: Always clear UPRT and Host Radix bits when setting up CPU powerpc/optprobes: Fix TOC handling in optprobes trampoline powerpc/pseries: Advertise Hot Plug Event support to firmware cxl: fix nested locking hang during EEH hotplug powerpc/xmon: Dump memory in CPU endian format powerpc/pseries: Revert 'Auto-online hotplugged memory' powerpc/powernv: Make PCI non-optional powerpc/64: Implement clear_bit_unlock_is_negative_byte() powerpc/powernv: Remove unused variable in pnv_pci_sriov_disable() powerpc/kernel: Remove error message in pcibios_setup_phb_resources() powerpc/mm: Fix typo in set_pte_at() pci/hotplug/pnv-php: Disable MSI and PCI device properly pci/hotplug/pnv-php: Disable surprise hotplug capability on conflicts pci/hotplug/pnv-php: Remove WARN_ON() in pnv_php_put_slot() powerpc: Add POWER9 architected mode to cputable powerpc/perf: use is_kernel_addr macro in perf_get_misc_flags() powerpc/perf: Avoid FAB_*_MATCH checks for power9 powerpc/perf: Add restrictions to PMC5 in power9 DD1 powerpc/perf: Use Instruction Counter value ...
2017-03-01KVM: PPC: Book3S HV: Fix software walk of guest process page tablesPaul Mackerras
This fixes some bugs in the code that walks the guest's page tables. These bugs cause MMIO emulation to fail whenever the guest is in virtial mode (MMU on), leading to the guest hanging if it tried to access a virtio device. The first bug was that when reading the guest's process table, we were using the whole of arch->process_table, not just the field that contains the process table base address. The second bug was that the mask used when reading the process table entry to get the radix tree base address, RPDB_MASK, had the wrong value. Fixes: 9e04ba69beec ("KVM: PPC: Book3S HV: Add basic infrastructure for radix guests") Fixes: e99833448c5f ("powerpc/mm/radix: Add partition table format & callback") Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-02-27scripts/spelling.txt: add "partiton" pattern and fix typo instancesMasahiro Yamada
Fix typos and add the following to the scripts/spelling.txt: partiton||partition Link: http://lkml.kernel.org/r/1481573103-11329-7-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27kprobes: move kprobe declarations to asm-generic/kprobes.hLuis R. Rodriguez
Often all is needed is these small helpers, instead of compiler.h or a full kprobes.h. This is important for asm helpers, in fact even some asm/kprobes.h make use of these helpers... instead just keep a generic asm file with helpers useful for asm code with the least amount of clutter as possible. Likewise we need now to also address what to do about this file for both when architectures have CONFIG_HAVE_KPROBES, and when they do not. Then for when architectures have CONFIG_HAVE_KPROBES but have disabled CONFIG_KPROBES. Right now most asm/kprobes.h do not have guards against CONFIG_KPROBES, this means most architecture code cannot include asm/kprobes.h safely. Correct this and add guards for architectures missing them. Additionally provide architectures that not have kprobes support with the default asm-generic solution. This lets us force asm/kprobes.h on the header include/linux/kprobes.h always, but most importantly we can now safely include just asm/kprobes.h on architecture code without bringing the full kitchen sink of header files. Two architectures already provided a guard against CONFIG_KPROBES on its kprobes.h: sh, arch. The rest of the architectures needed gaurds added. We avoid including any not-needed headers on asm/kprobes.h unless kprobes have been enabled. In a subsequent atomic change we can try now to remove compiler.h from include/linux/kprobes.h. During this sweep I've also identified a few architectures defining a common macro needed for both kprobes and ftrace, that of the definition of the breakput instruction up. Some refer to this as BREAKPOINT_INSTRUCTION. This must be kept outside of the #ifdef CONFIG_KPROBES guard. [mcgrof@kernel.org: fix arm64 build] Link: http://lkml.kernel.org/r/CAB=NE6X1WMByuARS4mZ1g9+W=LuVBnMDnh_5zyN0CLADaVh=Jw@mail.gmail.com [sfr@canb.auug.org.au: fixup for kprobes declarations moving] Link: http://lkml.kernel.org/r/20170214165933.13ebd4f4@canb.auug.org.au Link: http://lkml.kernel.org/r/20170203233139.32682-1-mcgrof@kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-25Merge tag 'for-next-dma_ops' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma DMA mapping updates from Doug Ledford: "Drop IB DMA mapping code and use core DMA code instead. Bart Van Assche noted that the ib DMA mapping code was significantly similar enough to the core DMA mapping code that with a few changes it was possible to remove the IB DMA mapping code entirely and switch the RDMA stack to use the core DMA mapping code. This resulted in a nice set of cleanups, but touched the entire tree and has been kept separate for that reason." * tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits) IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it IB/core: Remove ib_device.dma_device nvme-rdma: Switch from dma_device to dev.parent RDS: net: Switch from dma_device to dev.parent IB/srpt: Modify a debug statement IB/srp: Switch from dma_device to dev.parent IB/iser: Switch from dma_device to dev.parent IB/IPoIB: Switch from dma_device to dev.parent IB/rxe: Switch from dma_device to dev.parent IB/vmw_pvrdma: Switch from dma_device to dev.parent IB/usnic: Switch from dma_device to dev.parent IB/qib: Switch from dma_device to dev.parent IB/qedr: Switch from dma_device to dev.parent IB/ocrdma: Switch from dma_device to dev.parent IB/nes: Remove a superfluous assignment statement IB/mthca: Switch from dma_device to dev.parent IB/mlx5: Switch from dma_device to dev.parent IB/mlx4: Switch from dma_device to dev.parent IB/i40iw: Remove a superfluous assignment statement IB/hns: Switch from dma_device to dev.parent ...