summaryrefslogtreecommitdiff
path: root/arch/powerpc/include/asm
AgeCommit message (Collapse)Author
2021-06-30powerpc/64: enable MSR[EE] in irq replay pt_regsNicholas Piggin
Similar to commit 2b48e96be2f9f ("powerpc/64: fix irq replay pt_regs->softe value"), enable MSR_EE in pt_regs->msr. This makes the regs look more normal. It also allows some extra debug checks to be added to interrupt handler entry. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210630074621.2109197-7-npiggin@gmail.com
2021-06-30powerpc/64s/interrupt: preserve regs->softe for NMI interruptsNicholas Piggin
If an NMI interrupt hits in an implicit soft-masked region, regs->softe is modified to reflect that. This may not be necessary for correctness at the moment, but it is less surprising and it's unhelpful when debugging or adding checks. Make sure this is changed back to how it was found before returning. Fixes: 4ec5feec1ad0 ("powerpc/64s: Make NMI record implicitly soft-masked code as irqs disabled") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210630074621.2109197-6-npiggin@gmail.com
2021-06-30powerpc/64s: add a table of implicit soft-masked addressesNicholas Piggin
Commit 9d1988ca87dd ("powerpc/64: treat low kernel text as irqs soft-masked") ends up catching too much code, including ret_from_fork, and parts of interrupt and syscall return that do not expect to be interrupts to be soft-masked. If an interrupt gets marked pending, and then the code proceeds out of the implicit soft-masked region it will fail to deal with the pending interrupt. Fix this by adding a new table of addresses which explicitly marks the regions of code that are soft masked. This table is only checked for interrupts that below __end_soft_masked, so most kernel interrupts will not have the overhead of the table search. Fixes: 9d1988ca87dd ("powerpc/64: treat low kernel text as irqs soft-masked") Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210630074621.2109197-5-npiggin@gmail.com
2021-06-30powerpc/64e: remove implicit soft-masking and interrupt exit restart logicNicholas Piggin
The implicit soft-masking to speed up interrupt return was going to be used by 64e as well, but it has not been extensively tested on that platform and is not considered ready. It was intended to be disabled before merge. Disable it for now. Most of the restart code is common with 64s, so with more correctness and performance testing this could be re-enabled again by adding the extra soft-mask checks to interrupt handlers and flipping exit_must_hard_disable(). Fixes: 9d1988ca87dd ("powerpc/64: treat low kernel text as irqs soft-masked") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210630074621.2109197-4-npiggin@gmail.com
2021-06-29Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc updates from Andrew Morton: "191 patches. Subsystems affected by this patch series: kthread, ia64, scripts, ntfs, squashfs, ocfs2, kernel/watchdog, and mm (gup, pagealloc, slab, slub, kmemleak, dax, debug, pagecache, gup, swap, memcg, pagemap, mprotect, bootmem, dma, tracing, vmalloc, kasan, initialization, pagealloc, and memory-failure)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (191 commits) mm,hwpoison: make get_hwpoison_page() call get_any_page() mm,hwpoison: send SIGBUS with error virutal address mm/page_alloc: split pcp->high across all online CPUs for cpuless nodes mm/page_alloc: allow high-order pages to be stored on the per-cpu lists mm: replace CONFIG_FLAT_NODE_MEM_MAP with CONFIG_FLATMEM mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA docs: remove description of DISCONTIGMEM arch, mm: remove stale mentions of DISCONIGMEM mm: remove CONFIG_DISCONTIGMEM m68k: remove support for DISCONTIGMEM arc: remove support for DISCONTIGMEM arc: update comment about HIGHMEM implementation alpha: remove DISCONTIGMEM and NUMA mm/page_alloc: move free_the_page mm/page_alloc: fix counting of managed_pages mm/page_alloc: improve memmap_pages dbg msg mm: drop SECTION_SHIFT in code comments mm/page_alloc: introduce vm.percpu_pagelist_high_fraction mm/page_alloc: limit the number of pages on PCP lists when reclaim is active mm/page_alloc: scale the number of pages that are batch freed ...
2021-06-29Merge tag 'irq-core-2021-06-29' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Updates for the interrupt subsystem: Core changes: - Cleanup and simplification of common code to invoke the low level interrupt flow handlers when this invocation requires irqdomain resolution. Add the necessary core infrastructure. - Provide a proper interface for modular PMU drivers to set the interrupt affinity. - Add a request flag which allows to exclude interrupts from spurious interrupt detection. Useful especially for IPI handlers which always return IRQ_HANDLED which turns the spurious interrupt detection into a pointless waste of CPU cycles. Driver changes: - Bulk convert interrupt chip drivers to the new irqdomain low level flow handler invocation mechanism. - Add device tree bindings for the Renesas R-Car M3-W+ SoC - Enable modular build of the Qualcomm PDC driver - The usual small fixes and improvements" * tag 'irq-core-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits) dt-bindings: interrupt-controller: arm,gic-v3: Describe GICv3 optional properties irqchip: gic-pm: Remove redundant error log of clock bulk irqchip/sun4i: Remove unnecessary oom message irqchip/irq-imx-gpcv2: Remove unnecessary oom message irqchip/imgpdc: Remove unnecessary oom message irqchip/gic-v3-its: Remove unnecessary oom message irqchip/gic-v2m: Remove unnecessary oom message irqchip/exynos-combiner: Remove unnecessary oom message irqchip: Bulk conversion to generic_handle_domain_irq() genirq: Move non-irqdomain handle_domain_irq() handling into ARM's handle_IRQ() genirq: Add generic_handle_domain_irq() helper irqchip/nvic: Convert from handle_IRQ() to handle_domain_irq() irqdesc: Fix __handle_domain_irq() comment genirq: Use irq_resolve_mapping() to implement __handle_domain_irq() and co irqdomain: Introduce irq_resolve_mapping() irqdomain: Protect the linear revmap with RCU irqdomain: Cache irq_data instead of a virq number in the revmap irqdomain: Use struct_size() helper when allocating irqdomain irqdomain: Make normal and nomap irqdomains exclusive powerpc: Move the use of irq_domain_add_nomap() behind a config option ...
2021-06-29mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMAMike Rapoport
After removal of DISCINTIGMEM the NEED_MULTIPLE_NODES and NUMA configuration options are equivalent. Drop CONFIG_NEED_MULTIPLE_NODES and use CONFIG_NUMA instead. Done with $ sed -i 's/CONFIG_NEED_MULTIPLE_NODES/CONFIG_NUMA/' \ $(git grep -wl CONFIG_NEED_MULTIPLE_NODES) $ sed -i 's/NEED_MULTIPLE_NODES/NUMA/' \ $(git grep -wl NEED_MULTIPLE_NODES) with manual tweaks afterwards. [rppt@linux.ibm.com: fix arm boot crash] Link: https://lkml.kernel.org/r/YMj9vHhHOiCVN4BF@linux.ibm.com Link: https://lkml.kernel.org/r/20210608091316.3622-9-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: David Hildenbrand <david@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matt Turner <mattst88@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-28Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "This covers all architectures (except MIPS) so I don't expect any other feature pull requests this merge window. ARM: - Add MTE support in guests, complete with tag save/restore interface - Reduce the impact of CMOs by moving them in the page-table code - Allow device block mappings at stage-2 - Reduce the footprint of the vmemmap in protected mode - Support the vGIC on dumb systems such as the Apple M1 - Add selftest infrastructure to support multiple configuration and apply that to PMU/non-PMU setups - Add selftests for the debug architecture - The usual crop of PMU fixes PPC: - Support for the H_RPT_INVALIDATE hypercall - Conversion of Book3S entry/exit to C - Bug fixes S390: - new HW facilities for guests - make inline assembly more robust with KASAN and co x86: - Allow userspace to handle emulation errors (unknown instructions) - Lazy allocation of the rmap (host physical -> guest physical address) - Support for virtualizing TSC scaling on VMX machines - Optimizations to avoid shattering huge pages at the beginning of live migration - Support for initializing the PDPTRs without loading them from memory - Many TLB flushing cleanups - Refuse to load if two-stage paging is available but NX is not (this has been a requirement in practice for over a year) - A large series that separates the MMU mode (WP/SMAP/SMEP etc.) from CR0/CR4/EFER, using the MMU mode everywhere once it is computed from the CPU registers - Use PM notifier to notify the guest about host suspend or hibernate - Support for passing arguments to Hyper-V hypercalls using XMM registers - Support for Hyper-V TLB flush hypercalls and enlightened MSR bitmap on AMD processors - Hide Hyper-V hypercalls that are not included in the guest CPUID - Fixes for live migration of virtual machines that use the Hyper-V "enlightened VMCS" optimization of nested virtualization - Bugfixes (not many) Generic: - Support for retrieving statistics without debugfs - Cleanups for the KVM selftests API" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (314 commits) KVM: x86: rename apic_access_page_done to apic_access_memslot_enabled kvm: x86: disable the narrow guest module parameter on unload selftests: kvm: Allows userspace to handle emulation errors. kvm: x86: Allow userspace to handle emulation errors KVM: x86/mmu: Let guest use GBPAGES if supported in hardware and TDP is on KVM: x86/mmu: Get CR4.SMEP from MMU, not vCPU, in shadow page fault KVM: x86/mmu: Get CR0.WP from MMU, not vCPU, in shadow page fault KVM: x86/mmu: Drop redundant rsvd bits reset for nested NPT KVM: x86/mmu: Optimize and clean up so called "last nonleaf level" logic KVM: x86: Enhance comments for MMU roles and nested transition trickiness KVM: x86/mmu: WARN on any reserved SPTE value when making a valid SPTE KVM: x86/mmu: Add helpers to do full reserved SPTE checks w/ generic MMU KVM: x86/mmu: Use MMU's role to determine PTTYPE KVM: x86/mmu: Collapse 32-bit PAE and 64-bit statements for helpers KVM: x86/mmu: Add a helper to calculate root from role_regs KVM: x86/mmu: Add helper to update paging metadata KVM: x86/mmu: Don't update nested guest's paging bitmasks if CR0.PG=0 KVM: x86/mmu: Consolidate reset_rsvds_bits_mask() calls KVM: x86/mmu: Use MMU role_regs to get LA57, and drop vCPU LA57 helper KVM: x86/mmu: Get nested MMU's root level from the MMU's role ...
2021-06-28Merge tag 'locking-core-2021-06-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - Core locking & atomics: - Convert all architectures to ARCH_ATOMIC: move every architecture to ARCH_ATOMIC, then get rid of ARCH_ATOMIC and all the transitory facilities and #ifdefs. Much reduction in complexity from that series: 63 files changed, 756 insertions(+), 4094 deletions(-) - Self-test enhancements - Futexes: - Add the new FUTEX_LOCK_PI2 ABI, which is a variant that doesn't set FLAGS_CLOCKRT (.e. uses CLOCK_MONOTONIC). [ The temptation to repurpose FUTEX_LOCK_PI's implicit setting of FLAGS_CLOCKRT & invert the flag's meaning to avoid having to introduce a new variant was resisted successfully. ] - Enhance futex self-tests - Lockdep: - Fix dependency path printouts - Optimize trace saving - Broaden & fix wait-context checks - Misc cleanups and fixes. * tag 'locking-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) locking/lockdep: Correct the description error for check_redundant() futex: Provide FUTEX_LOCK_PI2 to support clock selection futex: Prepare futex_lock_pi() for runtime clock selection lockdep/selftest: Remove wait-type RCU_CALLBACK tests lockdep/selftests: Fix selftests vs PROVE_RAW_LOCK_NESTING lockdep: Fix wait-type for empty stack locking/selftests: Add a selftest for check_irq_usage() lockding/lockdep: Avoid to find wrong lock dep path in check_irq_usage() locking/lockdep: Remove the unnecessary trace saving locking/lockdep: Fix the dep path printing for backwards BFS selftests: futex: Add futex compare requeue test selftests: futex: Add futex wait test seqlock: Remove trailing semicolon in macros locking/lockdep: Reduce LOCKDEP dependency list locking/lockdep,doc: Improve readability of the block matrix locking/atomics: atomic-instrumented: simplify ifdeffery locking/atomic: delete !ARCH_ATOMIC remnants locking/atomic: xtensa: move to ARCH_ATOMIC locking/atomic: sparc: move to ARCH_ATOMIC locking/atomic: sh: move to ARCH_ATOMIC ...
2021-06-25Merge tag 'kvmarm-5.14' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for v5.14. - Add MTE support in guests, complete with tag save/restore interface - Reduce the impact of CMOs by moving them in the page-table code - Allow device block mappings at stage-2 - Reduce the footprint of the vmemmap in protected mode - Support the vGIC on dumb systems such as the Apple M1 - Add selftest infrastructure to support multiple configuration and apply that to PMU/non-PMU setups - Add selftests for the debug architecture - The usual crop of PMU fixes
2021-06-26powerpc/ptrace: Refactor regs_set_return_{msr/ip}Christophe Leroy
regs_set_return_msr() and regs_set_return_ip() have a copy of the code of set_return_regs_changed(). Call the later instead. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/baf64a91557d3811c155616a6aa23ed7b3b21da4.1624619582.git.christophe.leroy@csgroup.eu
2021-06-26powerpc/ptrace: Move set_return_regs_changed() before regs_set_return_{msr/ip}Christophe Leroy
regs_set_return_msr() and regs_set_return_ip() have a copy of the code of set_return_regs_changed(). Move up set_return_regs_changed() so it can be reused by regs_set_return_{msr/ip} Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/49f4fb051a3e1cb69f7305d5b6768aec14727c32.1624619582.git.christophe.leroy@csgroup.eu
2021-06-25powerpc: Fix is_kvm_guest() / kvm_para_available()Michael Ellerman
Commit a21d1becaa3f ("powerpc: Reintroduce is_kvm_guest() as a fast-path check") added is_kvm_guest() and changed kvm_para_available() to use it. is_kvm_guest() checks a static key, kvm_guest, and that static key is set in check_kvm_guest(). The problem is check_kvm_guest() is only called on pseries, and even then only in some configurations. That means is_kvm_guest() always returns false on all non-pseries and some pseries depending on configuration. That's a bug. For PR KVM guests this is noticable because they no longer do live patching of themselves, which can be detected by the omission of a message in dmesg such as: KVM: Live patching for a fast VM worked To fix it make check_kvm_guest() an initcall, to ensure it's always called at boot. It needs to be core so that it runs before kvm_guest_init() which is postcore. To be an initcall it needs to return int, where 0 means success, so update that. We still call it manually in pSeries_smp_probe(), because that runs before init calls are run. Fixes: a21d1becaa3f ("powerpc: Reintroduce is_kvm_guest() as a fast-path check") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210623130514.2543232-1-mpe@ellerman.id.au
2021-06-25powerpc/kprobes: Roll IS_RFI() macro into IS_RFID()Naveen N. Rao
In kprobes and xmon, we should exclude both 32-bit and 64-bit variants of mtmsr and rfi instructions from being stepped. Have IS_RFID() also detect a rfi instruction similar to IS_MTMSRD(). Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/eee32e1b75dae85d471c89b4c0a123ad4b0aabf8.1621416666.git.naveen.n.rao@linux.vnet.ibm.com
2021-06-24KVM: stats: Separate generic stats from architecture specific onesJing Zhang
Generic KVM stats are those collected in architecture independent code or those supported by all architectures; put all generic statistics in a separate structure. This ensures that they are defined the same way in the statistics API which is being added, removing duplication among different architectures in the declaration of the descriptors. No functional change intended. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-2-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25powerpc: Remove klimitChristophe Leroy
klimit is a global variable initialised at build time with the value of _end. This variable is never modified, so _end symbol can be used directly. Remove klimit. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9fa9ba6807c17f93f35a582c199c646c4a8bfd9c.1622800638.git.christophe.leroy@csgroup.eu
2021-06-25powerpc/64: use interrupt restart table to speed up return from interruptNicholas Piggin
Use the restart table facility to return from interrupt or system calls without disabling MSR[EE] or MSR[RI]. Interrupt return asm is put into the low soft-masked region, to prevent interrupts being processed here, although they are still taken as masked interrupts which causes SRRs to be clobbered, and a pending soft-masked interrupt to require replaying. The return code uses restart table regions to redirct to a fixup handler rather than continue with the exit, if such an interrupt happens. In this case the interrupt return is redirected to a fixup handler which reloads r1 for the interrupt stack and reloads registers and sets state up to replay the soft-masked interrupt and try the exit again. Some types of security exit fallback flushes and barriers are currently unable to cope with reentrant interrupts, e.g., because they store some state in the scratch SPR which would be clobbered even by masked interrupts. For now the interrupts-enabled exits are disabled when these flushes are used. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Guard unused exit_must_hard_disable() as reported by lkp] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210617155116.2167984-13-npiggin@gmail.com
2021-06-25powerpc/64: treat low kernel text as irqs soft-maskedNicholas Piggin
Treat code below __end_soft_masked as soft-masked for the purpose of alternate return. 64s already mostly does this for scv entry. This will be used to exit from interrupts without disabling MSR[EE]. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210617155116.2167984-12-npiggin@gmail.com
2021-06-25powerpc/64: allow alternate return locations for soft-masked interruptsNicholas Piggin
The exception table fixup adjusts a failed page fault's interrupt return location if it was taken at an address specified in the exception table, to a corresponding fixup handler address. Introduce a variation of that idea which adds a fixup table for NMIs and soft-masked asynchronous interrupts. This will be used to protect certain critical sections that are sensitive to being clobbered by interrupts coming in (due to using the same SPRs and/or irq soft-mask state). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210617155116.2167984-10-npiggin@gmail.com
2021-06-25powerpc/64: move interrupt return asm to interrupt_64.SNicholas Piggin
The next patch would like to move interrupt return assembly code to a low location before general text, so move it into its own file and include via head_64.S Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210617155116.2167984-7-npiggin@gmail.com
2021-06-25powerpc/64s: avoid reloading (H)SRR registers if they are still validNicholas Piggin
When an interrupt is taken, the SRR registers are set to return to where it left off. Unless they are modified in the meantime, or the return address or MSR are modified, there is no need to reload these registers when returning from interrupt. Introduce per-CPU flags that track the validity of SRR and HSRR registers. These are cleared when returning from interrupt, when using the registers for something else (e.g., OPAL calls), when adjusting the return address or MSR of a context, and when context switching (which changes the return address and MSR). This improves the performance of interrupt returns. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fold in fixup patch from Nick] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210617155116.2167984-5-npiggin@gmail.com
2021-06-25powerpc: remove interrupt exit helpers unused argumentNicholas Piggin
The msr argument is not used, remove it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210617155116.2167984-3-npiggin@gmail.com
2021-06-23Merge branch 'topic/ppc-kvm' of ↵Paolo Bonzini
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux into HEAD - Support for the H_RPT_INVALIDATE hypercall - Conversion of Book3S entry/exit to C - Bug fixes
2021-06-23Merge branch 'topic/ppc-kvm' into nextMichael Ellerman
Pull in some more ppc KVM patches we are keeping in our topic branch. In particular this brings in the series to add H_RPT_INVALIDATE.
2021-06-22KVM: PPC: Book3S HV: Nested support in H_RPT_INVALIDATEBharata B Rao
Enable support for process-scoped invalidations from nested guests and partition-scoped invalidations for nested guests. Process-scoped invalidations for any level of nested guests are handled by implementing H_RPT_INVALIDATE handler in the nested guest exit path in L0. Partition-scoped invalidation requests are forwarded to the right nested guest, handled there and passed down to L0 for eventual handling. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> [aneesh: Nested guest partition-scoped invalidation changes] Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [mpe: Squash in fixup patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210621085003.904767-5-bharata@linux.ibm.com
2021-06-21KVM: PPC: Book3S HV: Add support for H_RPT_INVALIDATEBharata B Rao
H_RPT_INVALIDATE does two types of TLB invalidations: 1. Process-scoped invalidations for guests when LPCR[GTSE]=0. This is currently not used in KVM as GTSE is not usually disabled in KVM. 2. Partition-scoped invalidations that an L1 hypervisor does on behalf of an L2 guest. This is currently handled by H_TLB_INVALIDATE hcall and this new replaces the old that. This commit enables process-scoped invalidations for L1 guests. Support for process-scoped and partition-scoped invalidations from/for nested guests will be added separately. Process scoped tlbie invalidations from L1 and nested guests need RS register for TLBIE instruction to contain both PID and LPID. This patch introduces primitives that execute tlbie instruction with both PID and LPID set in prepartion for H_RPT_INVALIDATE hcall. A description of H_RPT_INVALIDATE follows: int64   /* H_Success: Return code on successful completion */         /* H_Busy - repeat the call with the same */         /* H_Parameter, H_P2, H_P3, H_P4, H_P5 : Invalid parameters */ hcall(const uint64 H_RPT_INVALIDATE, /* Invalidate RPT translation lookaside information */       uint64 id,        /* PID/LPID to invalidate */       uint64 target,    /* Invalidation target */       uint64 type,      /* Type of lookaside information */       uint64 pg_sizes, /* Page sizes */       uint64 start,     /* Start of Effective Address (EA) range (inclusive) */       uint64 end)       /* End of EA range (exclusive) */ Invalidation targets (target) ----------------------------- Core MMU        0x01 /* All virtual processors in the partition */ Core local MMU  0x02 /* Current virtual processor */ Nest MMU        0x04 /* All nest/accelerator agents in use by the partition */ A combination of the above can be specified, except core and core local. Type of translation to invalidate (type) --------------------------------------- NESTED       0x0001  /* invalidate nested guest partition-scope */ TLB          0x0002  /* Invalidate TLB */ PWC          0x0004  /* Invalidate Page Walk Cache */ PRT          0x0008  /* Invalidate caching of Process Table Entries if NESTED is clear */ PAT          0x0008  /* Invalidate caching of Partition Table Entries if NESTED is set */ A combination of the above can be specified. Page size mask (pages) ---------------------- 4K              0x01 64K             0x02 2M              0x04 1G              0x08 All sizes       (-1UL) A combination of the above can be specified. All page sizes can be selected with -1. Semantics: Invalidate radix tree lookaside information            matching the parameters given. * Return H_P2, H_P3 or H_P4 if target, type, or pageSizes parameters are different from the defined values. * Return H_PARAMETER if NESTED is set and pid is not a valid nested LPID allocated to this partition * Return H_P5 if (start, end) doesn't form a valid range. Start and end should be a valid Quadrant address and  end > start. * Return H_NotSupported if the partition is not in running in radix translation mode. * May invalidate more translation information than requested. * If start = 0 and end = -1, set the range to cover all valid addresses. Else start and end should be aligned to 4kB (lower 11 bits clear). * If NESTED is clear, then invalidate process scoped lookaside information. Else pid specifies a nested LPID, and the invalidation is performed   on nested guest partition table and nested guest partition scope real addresses. * If pid = 0 and NESTED is clear, then valid addresses are quadrant 3 and quadrant 0 spaces, Else valid addresses are quadrant 0. * Pages which are fully covered by the range are to be invalidated.   Those which are partially covered are considered outside invalidation range, which allows a caller to optimally invalidate ranges that may   contain mixed page sizes. * Return H_SUCCESS on success. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210621085003.904767-4-bharata@linux.ibm.com
2021-06-21powerpc/book3s64/radix: Add H_RPT_INVALIDATE pgsize encodings to mmu_psize_defBharata B Rao
Add a field to mmu_psize_def to store the page size encodings of H_RPT_INVALIDATE hcall. Initialize this while scanning the radix AP encodings. This will be used when invalidating with required page size encoding in the hcall. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210621085003.904767-3-bharata@linux.ibm.com
2021-06-21KVM: PPC: Book3S HV: Fix comments of H_RPT_INVALIDATE argumentsAneesh Kumar K.V
The type values H_RPTI_TYPE_PRT and H_RPTI_TYPE_PAT indicate invalidating the caching of process and partition scoped entries respectively. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210621085003.904767-2-bharata@linux.ibm.com
2021-06-21powerpc/xics: Add a native ICS backend for microwattBenjamin Herrenschmidt
This is a simple native ICS backend that matches the layout of the Microwatt implementation of ICS. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> [mpe: Add empty ics_native_init() to unbreak non-microwatt builds] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> fixup-ics Link: https://lore.kernel.org/r/YMwW8cxrwB2W5EUN@thinks.paulus.ozlabs.org
2021-06-21powerpc/mm: implement set_memory_attr()Christophe Leroy
In addition to the set_memory_xx() functions which allows to change the memory attributes of not (yet) used memory regions, implement a set_memory_attr() function to: - set the final memory protection after init on currently used kernel regions. - enable/disable kernel memory regions in the scope of DEBUG_PAGEALLOC. Unlike the set_memory_xx() which can act in three step as the regions are unused, this function must modify 'on the fly' as the kernel is executing from them. At the moment only PPC32 will use it and changing page attributes on the fly is not an issue. Reported-by: kbuild test robot <lkp@intel.com> [ruscur: cast "data" to unsigned long instead of int] Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210609013431.9805-9-jniethe5@gmail.com
2021-06-21powerpc/modules: Make module_alloc() Strict Module RWX awareJordan Niethe
Make module_alloc() use PAGE_KERNEL protections instead of PAGE_KERNEL_EXEX if Strict Module RWX is enabled. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210609013431.9805-4-jniethe5@gmail.com
2021-06-21powerpc/lib/code-patching: Set up Strict RWX patching earlierJordan Niethe
setup_text_poke_area() is a late init call so it runs before mark_rodata_ro() and after the init calls. This lets all the init code patching simply write to their locations. In the future, kprobes is going to allocate its instruction pages RO which means they will need setup_text__poke_area() to have been already called for their code patching. However, init_kprobes() (which allocates and patches some instruction pages) is an early init call so it happens before setup_text__poke_area(). start_kernel() calls poking_init() before any of the init calls. On powerpc, poking_init() is currently a nop. setup_text_poke_area() relies on kernel virtual memory, cpu hotplug and per_cpu_areas being setup. setup_per_cpu_areas(), boot_cpu_hotplug_init() and mm_init() are called before poking_init(). Turn setup_text_poke_area() into poking_init(). Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Russell Currey <ruscur@russell.cc> [mpe: Fold in missing prototype for poking_init() from lkp] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210609013431.9805-3-jniethe5@gmail.com
2021-06-21powerpc/mm: Implement set_memory() routinesRussell Currey
The set_memory_{ro/rw/nx/x}() functions are required for STRICT_MODULE_RWX, and are generally useful primitives to have. This implementation is designed to be generic across powerpc's many MMUs. It's possible that this could be optimised to be faster for specific MMUs. This implementation does not handle cases where the caller is attempting to change the mapping of the page it is executing from, or if another CPU is concurrently using the page being altered. These cases likely shouldn't happen, but a more complex implementation with MMU-specific code could safely handle them. On hash, the linear mapping is not kept in the linux pagetable, so this will not change the protection if used on that range. Currently these functions are not used on the linear map so just WARN for now. apply_to_existing_page_range() does not work on huge pages so for now disallow changing the protection of huge pages. [jpn: - Allow set memory functions to be used without Strict RWX - Hash: Disallow certain regions - Have change_page_attr() take function pointers to manipulate ptes - Radix: Add ptesync after set_pte_at()] Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210609013431.9805-2-jniethe5@gmail.com
2021-06-21powerpc/pesries: Get STF barrier requirement from H_GET_CPU_CHARACTERISTICSNicholas Piggin
This allows the hypervisor / firmware to describe this workarounds to the guest. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210503130243.891868-4-npiggin@gmail.com
2021-06-21powerpc/security: Add a security feature for STF barrierNicholas Piggin
Rather than tying this mitigation to RFI L1D flush requirement, add a new bit for it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210503130243.891868-3-npiggin@gmail.com
2021-06-21powerpc/pseries: Get entry and uaccess flush required bits from ↵Nicholas Piggin
H_GET_CPU_CHARACTERISTICS This allows the hypervisor / firmware to describe these workarounds to the guest. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210503130243.891868-2-npiggin@gmail.com
2021-06-21KVM: PPC: Book3S HV: Fix TLB management on SMT8 POWER9 and POWER10 processorsSuraj Jitindar Singh
The POWER9 vCPU TLB management code assumes all threads in a core share a TLB, and that TLBIEL execued by one thread will invalidate TLBs for all threads. This is not the case for SMT8 capable POWER9 and POWER10 (big core) processors, where the TLB is split between groups of threads. This results in TLB multi-hits, random data corruption, etc. Fix this by introducing cpu_first_tlb_thread_sibling etc., to determine which siblings share TLBs, and use that in the guest TLB flushing code. [npiggin@gmail.com: add changelog and comment] Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210602040441.3984352-1-npiggin@gmail.com
2021-06-20powerpc/pseries/vas: Integrate API with open/close windowsHaren Myneni
This patch adds VAS window allocatioa/close with the corresponding hcalls. Also changes to integrate with the existing user space VAS API and provide register/unregister functions to NX pseries driver. The driver register function is used to create the user space interface (/dev/crypto/nx-gzip) and unregister to remove this entry. The user space process opens this device node and makes an ioctl to allocate VAS window. The close interface is used to deallocate window. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e8d956bace3f182c4d2e66e343ff37cb0391d1fd.camel@linux.ibm.com
2021-06-20powerpc/pseries/vas: Define VAS/NXGZIP hcalls and structsHaren Myneni
This patch adds hcalls and other definitions. Also define structs that are used in VAS implementation on PowerVM. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b4b8c594c27ee4aa6be9dc6dc4ee7331571cbbe8.camel@linux.ibm.com
2021-06-20powerpc/vas: Define and use common vas_window structHaren Myneni
Many elements in vas_struct are used on PowerNV and PowerVM platforms. vas_window is used for both TX and RX windows on PowerNV and for TX windows on PowerVM. So some elements are specific to these platforms. So this patch defines common vas_window and platform specific window structs (pnv_vas_window on PowerNV). Also adds the corresponding changes in PowerNV vas code. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1698c35c158dfe52c6d2166667823d3d4a463353.camel@linux.ibm.com
2021-06-20powerpc/vas: Move update_csb/dump_crb to common book3s platformHaren Myneni
If a coprocessor encounters an error translating an address, the VAS will cause an interrupt in the host. The kernel processes the fault by updating CSB. This functionality is same for both powerNV and pseries. So this patch moves these functions to common vas-api.c and the actual functionality is not changed. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bf8d5b0770fa1ef5cba88c96580caa08d999d3b5.camel@linux.ibm.com
2021-06-20powerpc/vas: Create take/drop pid and mm reference functionsHaren Myneni
Take pid and mm references when each window opens and drops during close. This functionality is needed for powerNV and pseries. So this patch defines the existing code as functions in common book3s platform vas-api.c Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2fa40df962250a737c804e58202924717b39e381.camel@linux.ibm.com
2021-06-20powerpc/vas: Add platform specific user window operationsHaren Myneni
PowerNV uses registers to open/close VAS windows, and getting the paste address. Whereas the hypervisor calls are used on PowerVM. This patch adds the platform specific user space window operations and register with the common VAS user space interface. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f85091f4ace67f951ac04d60394d67b21e2f5d3c.camel@linux.ibm.com
2021-06-20powerpc/powernv/vas: Rename register/unregister functionsHaren Myneni
powerNV and pseries drivers register / unregister to the corresponding platform specific VAS separately. Then these VAS functions call the common API with the specific window operations. So rename powerNV VAS API register/unregister functions. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9db00d58dbdcb7cfc07a1df95f3d2a9e3e5d746a.camel@linux.ibm.com
2021-06-20powerpc/vas: Move VAS API to book3s common platformHaren Myneni
The pseries platform will share vas and nx code and interfaces with the PowerNV platform, so create the arch/powerpc/platforms/book3s/ directory and move VAS API code there. Functionality is not changed. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e05c8db17b9eabe3545b902d034238e4c6c08180.camel@linux.ibm.com
2021-06-19Merge tag 'powerpc-5.13-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Fix initrd corruption caused by our recent change to use relative jump labels. Fix a crash using perf record on systems without a hardware PMU backend. Rework our 64-bit signal handling slighty to make it more closely match the old behaviour, after the recent change to use unsafe user accessors. Thanks to Anastasia Kovaleva, Athira Rajeev, Christophe Leroy, Daniel Axtens, Greg Kurz, and Roman Bolshakov" * tag 'powerpc-5.13-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/perf: Fix crash in perf_instruction_pointer() when ppmu is not set powerpc: Fix initrd corruption with relative jump labels powerpc/signal64: Copy siginfo before changing regs->nip powerpc/mem: Add back missing header to fix 'no previous prototype' error
2021-06-17KVM: switch per-VM stats to u64Paolo Bonzini
Make them the same type as vCPU stats. There is no reason to limit the counters to unsigned long. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-17Merge branch 'topic/ppc-kvm' into nextMichael Ellerman
Merge some powerpc KVM patches from our topic branch. In particular this brings in Nick's big series rewriting parts of the guest entry/exit path in C. Conflicts: arch/powerpc/kernel/security.c arch/powerpc/kvm/book3s_hv_rmhandlers.S
2021-06-17powerpc/64: drop redundant defination of spin_until_condSudeep Holla
linux/processor.h has exactly same defination for spin_until_cond. Drop the redundant defination in asm/processor.h Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1fff2054e5dfc00329804dbd3f2a91667c9a8aff.1623438544.git.christophe.leroy@csgroup.eu
2021-06-17powerpc: Move update_power8_hid0() into its only userChristophe Leroy
update_power8_hid0() is used only by powernv platform subcore.c Move it there. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/37f41d74faa0c66f90b373e243e8b1ee37a1f6fa.1623219019.git.christophe.leroy@csgroup.eu