summaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)Author
2018-02-01Merge branch 'x86/hyperv' of ↵Radim Krčmář
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Topic branch for stable KVM clockource under Hyper-V. Thanks to Christoffer Dall for resolving the ARM conflict.
2018-01-29Merge branch 'irq-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "A rather small set of irq updates this time: - removal of the old and now obsolete irq domain debugging code - the new Goldfish PIC driver - the usual pile of small fixes and updates" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqdomain: Kill CONFIG_IRQ_DOMAIN_DEBUG irq/work: Improve the flag definitions irqchip/gic-v3: Fix the driver probe() fail due to disabled GICC entry irqchip/irq-goldfish-pic: Add Goldfish PIC driver dt-bindings/goldfish-pic: Add device tree binding for Goldfish PIC driver irqchip/ompic: fix return value check in ompic_of_init() dt-bindings/bcm283x: Define polarity of per-cpu interrupts irqchip/irq-bcm2836: Add support for DT interrupt polarity dt-bindings/bcm2836-l1-intc: Add interrupt polarity support
2018-01-29Merge tag 'init_task-20180117' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull init_task initializer cleanups from David Howells: "It doesn't seem useful to have the init_task in a header file rather than in a normal source file. We could consolidate init_task handling instead and expand out various macros. Here's a series of patches that consolidate init_task handling: (1) Make THREAD_SIZE available to vmlinux.lds for cris, hexagon and openrisc. (2) Alter the INIT_TASK_DATA linker script macro to set init_thread_union and init_stack rather than defining these in C. Insert init_task and init_thread_into into the init_stack area in the linker script as appropriate to the configuration, with different section markers so that they end up correctly ordered. We can then get merge ia64's init_task.c into the main one. We then have a bunch of single-use INIT_*() macros that seem only to be macros because they used to be used per-arch. We can then expand these in place of the user and get rid of a few lines and a lot of backslashes. (3) Expand INIT_TASK() in place. (4) Expand in place various small INIT_*() macros that are defined conditionally. Expand them and surround them by #if[n]def/#endif in the .c file as it takes fewer lines. (5) Expand INIT_SIGNALS() and INIT_SIGHAND() in place. (6) Expand INIT_STRUCT_PID in place. These macros can then be discarded" * tag 'init_task-20180117' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: Expand INIT_STRUCT_PID and remove Expand the INIT_SIGNALS and INIT_SIGHAND macros and remove Expand various INIT_* macros and remove Expand INIT_TASK() in init/init_task.c and remove Construct init thread stack in the linker script rather than by union openrisc: Make THREAD_SIZE available to vmlinux.lds hexagon: Make THREAD_SIZE available to vmlinux.lds cris: Make THREAD_SIZE available to vmlinux.lds
2018-01-24irqdomain: Kill CONFIG_IRQ_DOMAIN_DEBUGMarc Zyngier
CONFIG_IRQ_DOMAIN_DEBUG is similar to CONFIG_GENERIC_IRQ_DEBUGFS, just with less information. Spring cleanup time. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Yang Shunyong <shunyong.yang@hxt-semitech.com> Link: https://lkml.kernel.org/r/20180117142647.23622-1-marc.zyngier@arm.com
2018-01-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Radim Krčmář: "ARM: - fix incorrect huge page mappings on systems using the contiguous hint for hugetlbfs - support alternative GICv4 init sequence - correctly implement the ARM SMCC for HVC and SMC handling PPC: - add KVM IOCTL for reporting vulnerability and workaround status s390: - provide userspace interface for branch prediction changes in firmware x86: - use correct macros for bits" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: s390: wire up bpb feature KVM: PPC: Book3S: Provide information about hardware/firmware CVE workarounds KVM/x86: Fix wrong macro references of X86_CR0_PG_BIT and X86_CR4_PAE_BIT in kvm_valid_sregs() arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls KVM: arm64: Fix GICv4 init when called from vgic_its_create KVM: arm/arm64: Check pagesize when allocating a hugepage at Stage 2
2018-01-19Merge tag 'powerpc-4.15-8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "More than we'd like after rc8, but nothing very alarming either, just tying up loose ends before the release: Since we changed powernv to use cpufreq_get() from show_cpuinfo(), we see warnings with PREEMPT enabled. But the preempt_disable() in show_cpuinfo() doesn't actually prevent CPU hotplug as it suggests, so remove it. Two updates to the recently merged RFI flush code. Wire up the generic sysfs file to report the status, and add a debugfs file to allow enabling/disabling it at runtime. Two updates to xmon, one to add the RFI flush related fields to the paca dump, and another to not use hashed pointers in the paca dump. And one minor fix to add a missing include of linux/types.h in asm/hvcall.h, not seen to break the build in upstream, but correct anyway. Thanks to: Benjamin Herrenschmidt, Michal Suchanek, Nicholas Piggin" * tag 'powerpc-4.15-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries: include linux/types.h in asm/hvcall.h powerpc/64s: Allow control of RFI flush via debugfs powerpc/64s: Wire up cpu_show_meltdown() powerpc: Don't preempt_disable() in show_cpuinfo() powerpc/xmon: Don't print hashed pointers in paca dump powerpc/xmon: Add RFI flush related fields to paca dump
2018-01-19KVM: PPC: Book3S: Provide information about hardware/firmware CVE workaroundsPaul Mackerras
This adds a new ioctl, KVM_PPC_GET_CPU_CHAR, that gives userspace information about the underlying machine's level of vulnerability to the recently announced vulnerabilities CVE-2017-5715, CVE-2017-5753 and CVE-2017-5754, and whether the machine provides instructions to assist software to work around the vulnerabilities. The ioctl returns two u64 words describing characteristics of the CPU and required software behaviour respectively, plus two mask words which indicate which bits have been filled in by the kernel, for extensibility. The bit definitions are the same as for the new H_GET_CPU_CHARACTERISTICS hypercall. There is also a new capability, KVM_CAP_PPC_GET_CPU_CHAR, which indicates whether the new ioctl is available. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-01-17powerpc/pseries: include linux/types.h in asm/hvcall.hMichal Suchanek
Commit 6e032b350cd1 ("powerpc/powernv: Check device-tree for RFI flush settings") uses u64 in asm/hvcall.h without including linux/types.h This breaks hvcall.h users that do not include the header themselves. Fixes: 6e032b350cd1 ("powerpc/powernv: Check device-tree for RFI flush settings") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-17powerpc/64s: Allow control of RFI flush via debugfsMichael Ellerman
Expose the state of the RFI flush (enabled/disabled) via debugfs, and allow it to be enabled/disabled at runtime. eg: $ cat /sys/kernel/debug/powerpc/rfi_flush 1 $ echo 0 > /sys/kernel/debug/powerpc/rfi_flush $ cat /sys/kernel/debug/powerpc/rfi_flush 0 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
2018-01-17powerpc/64s: Wire up cpu_show_meltdown()Michael Ellerman
The recent commit 87590ce6e373 ("sysfs/cpu: Add vulnerability folder") added a generic folder and set of files for reporting information on CPU vulnerabilities. One of those was for meltdown: /sys/devices/system/cpu/vulnerabilities/meltdown This commit wires up that file for 64-bit Book3S powerpc. For now we default to "Vulnerable" unless the RFI flush is enabled. That may not actually be true on all hardware, further patches will refine the reporting based on the CPU/platform etc. But for now we default to being pessimists. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-14Merge tag 'powerpc-4.15-7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "One fix for an oops at boot if we take a hotplug interrupt before we are ready to handle it. The bulk is patches to implement mitigation for Meltdown, see the change logs for more details. Thanks to: Nicholas Piggin, Michael Neuling, Oliver O'Halloran, Jon Masters, Jose Ricardo Ziviani, David Gibson" * tag 'powerpc-4.15-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/powernv: Check device-tree for RFI flush settings powerpc/pseries: Query hypervisor for RFI flush settings powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti powerpc/64s: Add support for RFI flush of L1-D cache powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL powerpc/64s: Simple RFI macro conversions powerpc/64: Add macros for annotating the destination of rfid/hrfid powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper powerpc/pseries: Make RAS IRQ explicitly dependent on DLPAR WQ
2018-01-11Merge tag 'kvm-ppc-fixes-4.15-3' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master PPC KVM fixes for 4.15 Four commits here, including two that were tagged but never merged. Three of them are for the HPT resizing code; two of those fix a user-triggerable use-after-free in the host, and one that fixes stale TLB entries in the guest. The remaining commit fixes a bug causing PR KVM guests under PowerVM to fail to start.
2018-01-11powerpc: Don't preempt_disable() in show_cpuinfo()Benjamin Herrenschmidt
This causes warnings from cpufreq mutex code. This is also rather unnecessary and ineffective. If we really want to prevent concurrent unplug, we could take the unplug read lock but I don't see this being critical. Fixes: cd77b5ce208c ("powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-11powerpc/xmon: Don't print hashed pointers in paca dumpMichael Ellerman
Remember when the biggest problem we had to worry about was hashed pointers, those were the days. These were missed in my earlier patch because they don't match "%p", but the macro is hiding a "%p", so these all end up being hashed, which is not what we want in xmon. Convert them to "%px". Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-11powerpc/xmon: Add RFI flush related fields to paca dumpMichael Ellerman
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/powernv: Check device-tree for RFI flush settingsOliver O'Halloran
New device-tree properties are available which tell the hypervisor settings related to the RFI flush. Use them to determine the appropriate flush instruction to use, and whether the flush is required. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/pseries: Query hypervisor for RFI flush settingsMichael Neuling
A new hypervisor call is available which tells the guest settings related to the RFI flush. Use it to query the appropriate flush instruction(s), and whether the flush is required. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/64s: Support disabling RFI flush with no_rfi_flush and noptiMichael Ellerman
Because there may be some performance overhead of the RFI flush, add kernel command line options to disable it. We add a sensibly named 'no_rfi_flush' option, but we also hijack the x86 option 'nopti'. The RFI flush is not the same as KPTI, but if we see 'nopti' we can guess that the user is trying to avoid any overhead of Meltdown mitigations, and it means we don't have to educate every one about a different command line option. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/64s: Add support for RFI flush of L1-D cacheMichael Ellerman
On some CPUs we can prevent the Meltdown vulnerability by flushing the L1-D cache on exit from kernel to user mode, and from hypervisor to guest. This is known to be the case on at least Power7, Power8 and Power9. At this time we do not know the status of the vulnerability on other CPUs such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale CPUs. As more information comes to light we can enable this, or other mechanisms on those CPUs. The vulnerability occurs when the load of an architecturally inaccessible memory region (eg. userspace load of kernel memory) is speculatively executed to the point where its result can influence the address of a subsequent speculatively executed load. In order for that to happen, the first load must hit in the L1, because before the load is sent to the L2 the permission check is performed. Therefore if no kernel addresses hit in the L1 the vulnerability can not occur. We can ensure that is the case by flushing the L1 whenever we return to userspace. Similarly for hypervisor vs guest. In order to flush the L1-D cache on exit, we add a section of nops at each (h)rfi location that returns to a lower privileged context, and patch that with some sequence. Newer firmwares are able to advertise to us that there is a special nop instruction that flushes the L1-D. If we do not see that advertised, we fall back to doing a displacement flush in software. For guest kernels we support migration between some CPU versions, and different CPUs may use different flush instructions. So that we are prepared to migrate to a machine with a different flush instruction activated, we may have to patch more than one flush instruction at boot if the hypervisor tells us to. In the end this patch is mostly the work of Nicholas Piggin and Michael Ellerman. However a cast of thousands contributed to analysis of the issue, earlier versions of the patch, back ports testing etc. Many thanks to all of them. Tested-by: Jon Masters <jcm@redhat.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10KVM: PPC: Book3S HV: Always flush TLB in kvmppc_alloc_reset_hpt()David Gibson
The KVM_PPC_ALLOCATE_HTAB ioctl(), implemented by kvmppc_alloc_reset_hpt() is supposed to completely clear and reset a guest's Hashed Page Table (HPT) allocating or re-allocating it if necessary. In the case where an HPT of the right size already exists and it just zeroes it, it forces a TLB flush on all guest CPUs, to remove any stale TLB entries loaded from the old HPT. However, that situation can arise when the HPT is resizing as well - or even when switching from an RPT to HPT - so those cases need a TLB flush as well. So, move the TLB flush to trigger in all cases except for errors. Cc: stable@vger.kernel.org # v4.10+ Fixes: f98a8bf9ee20 ("KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB ioctl() to change HPT size") Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-01-10KVM: PPC: Book3S PR: Fix WIMG handling under pHypAlexey Kardashevskiy
Commit 96df226 ("KVM: PPC: Book3S PR: Preserve storage control bits") added code to preserve WIMG bits but it missed 2 special cases: - a magic page in kvmppc_mmu_book3s_64_xlate() and - guest real mode in kvmppc_handle_pagefault(). For these ptes, WIMG was 0 and pHyp failed on these causing a guest to stop in the very beginning at NIP=0x100 (due to bd9166ffe "KVM: PPC: Book3S PR: Exit KVM on failed mapping"). According to LoPAPR v1.1 14.5.4.1.2 H_ENTER: The hypervisor checks that the WIMG bits within the PTE are appropriate for the physical page number else H_Parameter return. (For System Memory pages WIMG=0010, or, 1110 if the SAO option is enabled, and for IO pages WIMG=01**.) This hence initializes WIMG to non-zero value HPTE_R_M (0x10), as expected by pHyp. [paulus@ozlabs.org - fix compile for 32-bit] Cc: stable@vger.kernel.org # v4.11+ Fixes: 96df226 "KVM: PPC: Book3S PR: Preserve storage control bits" Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Ruediger Oertel <ro@suse.de> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-01-09Construct init thread stack in the linker script rather than by unionDavid Howells
Construct the init thread stack in the linker script rather than doing it by means of a union so that ia64's init_task.c can be got rid of. The following symbols are then made available from INIT_TASK_DATA() linker script macro: init_thread_union init_stack INIT_TASK_DATA() also expands the region to THREAD_SIZE to accommodate the size of the init stack. init_thread_union is given its own section so that it can be placed into the stack space in the right order. I'm assuming that the ia64 ordering is correct and that the task_struct is first and the thread_info second. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Tony Luck <tony.luck@intel.com> Tested-by: Will Deacon <will.deacon@arm.com> (arm64) Tested-by: Palmer Dabbelt <palmer@sifive.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2018-01-10powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNELNicholas Piggin
In the SLB miss handler we may be returning to user or kernel. We need to add a check early on and save the result in the cr4 register, and then we bifurcate the return path based on that. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNELNicholas Piggin
Similar to the syscall return path, in fast_exception_return we may be returning to user or kernel context. We already have a test for that, because we conditionally restore r13. So use that existing test and branch, and bifurcate the return based on that. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNELNicholas Piggin
In the syscall exit path we may be returning to user or kernel context. We already have a test for that, because we conditionally restore r13. So use that existing test and branch, and bifurcate the return based on that. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/64s: Simple RFI macro conversionsNicholas Piggin
This commit does simple conversions of rfi/rfid to the new macros that include the expected destination context. By simple we mean cases where there is a single well known destination context, and it's simply a matter of substituting the instruction for the appropriate macro. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/64: Add macros for annotating the destination of rfid/hrfidNicholas Piggin
The rfid/hrfid ((Hypervisor) Return From Interrupt) instruction is used for switching from the kernel to userspace, and from the hypervisor to the guest kernel. However it can and is also used for other transitions, eg. from real mode kernel code to virtual mode kernel code, and it's not always clear from the code what the destination context is. To make it clearer when reading the code, add macros which encode the expected destination context. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10Merge branch 'topic/ppc-kvm' into fixesMichael Ellerman
Merge the topic branch with share with the kvm-ppc tree. In this case we need to share the definition of a new hypervisor call and associated flags.
2018-01-10powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapperMichael Neuling
A new hypervisor call has been defined to communicate various characteristics of the CPU to guests. Add definitions for the hcall number, flags and a wrapper function. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-08powerpc/pseries: Make RAS IRQ explicitly dependent on DLPAR WQMichael Ellerman
The hotplug code uses its own workqueue to handle IRQ requests (pseries_hp_wq), however that workqueue is initialized after init_ras_IRQ(). That can lead to a kernel panic if any hotplug interrupts fire after init_ras_IRQ() but before pseries_hp_wq is initialised. eg: UDP-Lite hash table entries: 2048 (order: 0, 65536 bytes) NET: Registered protocol family 1 Unpacking initramfs... (qemu) object_add memory-backend-ram,id=mem1,size=10G (qemu) device_add pc-dimm,id=dimm1,memdev=mem1 Unable to handle kernel paging request for data at address 0xf94d03007c421378 Faulting instruction address: 0xc00000000012d744 Oops: Kernel access of bad area, sig: 11 [#1] LE SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc2-ziviani+ #26 task: (ptrval) task.stack: (ptrval) NIP: c00000000012d744 LR: c00000000012d744 CTR: 0000000000000000 REGS: (ptrval) TRAP: 0380 Not tainted (4.15.0-rc2-ziviani+) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28088042 XER: 20040000 CFAR: c00000000012d3c4 SOFTE: 0 ... NIP [c00000000012d744] __queue_work+0xd4/0x5c0 LR [c00000000012d744] __queue_work+0xd4/0x5c0 Call Trace: [c0000000fffefb90] [c00000000012d744] __queue_work+0xd4/0x5c0 (unreliable) [c0000000fffefc70] [c00000000012dce4] queue_work_on+0xb4/0xf0 This commit makes the RAS IRQ registration explicitly dependent on the creation of the pseries_hp_wq. Reported-by: Min Deng <mdeng@redhat.com> Reported-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> Tested-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-06Merge tag 'powerpc-4.15-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: "Just one fix to correctly return SEGV_ACCERR when we take a SEGV on a mapped region. The bug was introduced in the refactoring of the page fault handler we did in the previous release. Thanks to John Sperbeck" * tag 'powerpc-4.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERR
2018-01-02powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERRJohn Sperbeck
The recent refactoring of the powerpc page fault handler in commit c3350602e876 ("powerpc/mm: Make bad_area* helper functions") caused access to protected memory regions to indicate SEGV_MAPERR instead of the traditional SEGV_ACCERR in the si_code field of a user-space signal handler. This can confuse debug libraries that temporarily change the protection of memory regions, and expect to use SEGV_ACCERR as an indication to restore access to a region. This commit restores the previous behavior. The following program exhibits the issue: $ ./repro read || echo "FAILED" $ ./repro write || echo "FAILED" $ ./repro exec || echo "FAILED" #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <signal.h> #include <sys/mman.h> #include <assert.h> static void segv_handler(int n, siginfo_t *info, void *arg) { _exit(info->si_code == SEGV_ACCERR ? 0 : 1); } int main(int argc, char **argv) { void *p = NULL; struct sigaction act = { .sa_sigaction = segv_handler, .sa_flags = SA_SIGINFO, }; assert(argc == 2); p = mmap(NULL, getpagesize(), (strcmp(argv[1], "write") == 0) ? PROT_READ : 0, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); assert(p != MAP_FAILED); assert(sigaction(SIGSEGV, &act, NULL) == 0); if (strcmp(argv[1], "read") == 0) printf("%c", *(unsigned char *)p); else if (strcmp(argv[1], "write") == 0) *(unsigned char *)p = 0; else if (strcmp(argv[1], "exec") == 0) ((void (*)(void))p)(); return 1; /* failed to generate SEGV */ } Fixes: c3350602e876 ("powerpc/mm: Make bad_area* helper functions") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: John Sperbeck <jsperbeck@google.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Add commit references in change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-28kernel/irq: Extend lockdep class for request mutexAndrew Lunn
The IRQ code already has support for lockdep class for the lock mutex in an interrupt descriptor. Extend this to add a second class for the request mutex in the descriptor. Not having a class is resulting in false positive splats in some code paths. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: linus.walleij@linaro.org Cc: grygorii.strashko@ti.com Cc: f.fainelli@gmail.com Link: https://lkml.kernel.org/r/1512234664-21555-1-git-send-email-andrew@lunn.ch
2017-12-23Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 PTI preparatory patches from Thomas Gleixner: "Todays Advent calendar window contains twentyfour easy to digest patches. The original plan was to have twenty three matching the date, but a late fixup made that moot. - Move the cpu_entry_area mapping out of the fixmap into a separate address space. That's necessary because the fixmap becomes too big with NRCPUS=8192 and this caused already subtle and hard to diagnose failures. The top most patch is fresh from today and cures a brain slip of that tall grumpy german greybeard, who ignored the intricacies of 32bit wraparounds. - Limit the number of CPUs on 32bit to 64. That's insane big already, but at least it's small enough to prevent address space issues with the cpu_entry_area map, which have been observed and debugged with the fixmap code - A few TLB flush fixes in various places plus documentation which of the TLB functions should be used for what. - Rename the SYSENTER stack to CPU_ENTRY_AREA stack as it is used for more than sysenter now and keeping the name makes backtraces confusing. - Prevent LDT inheritance on exec() by moving it to arch_dup_mmap(), which is only invoked on fork(). - Make vysycall more robust. - A few fixes and cleanups of the debug_pagetables code. Check PAGE_PRESENT instead of checking the PTE for 0 and a cleanup of the C89 initialization of the address hint array which already was out of sync with the index enums. - Move the ESPFIX init to a different place to prepare for PTI. - Several code moves with no functional change to make PTI integration simpler and header files less convoluted. - Documentation fixes and clarifications" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/cpu_entry_area: Prevent wraparound in setup_cpu_entry_area_ptes() on 32bit init: Invoke init_espfix_bsp() from mm_init() x86/cpu_entry_area: Move it out of the fixmap x86/cpu_entry_area: Move it to a separate unit x86/mm: Create asm/invpcid.h x86/mm: Put MMU to hardware ASID translation in one place x86/mm: Remove hard-coded ASID limit checks x86/mm: Move the CR3 construction functions to tlbflush.h x86/mm: Add comments to clarify which TLB-flush functions are supposed to flush what x86/mm: Remove superfluous barriers x86/mm: Use __flush_tlb_one() for kernel memory x86/microcode: Dont abuse the TLB-flush interface x86/uv: Use the right TLB-flush API x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack x86/doc: Remove obvious weirdnesses from the x86 MM layout documentation x86/mm/64: Improve the memory map documentation x86/ldt: Prevent LDT inheritance on exec x86/ldt: Rework locking arch, mm: Allow arch_dup_mmap() to fail x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode ...
2017-12-22Merge tag 'powerpc-4.15-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "This is all fairly boring, except that there's two KVM fixes that you'd normally get via Paul's kvm-ppc tree. He's away so I picked them up. I was waiting to see if he would apply them, which is why they have only been in my tree since today. But they were on the list for a while and have been tested on the relevant hardware. Of note is two fixes for KVM XIVE (Power9 interrupt controller). These would normally go via the KVM tree but Paul is away so I've picked them up. Other than that, two fixes for error handling in the IMC driver, and one for a potential oops in the BHRB code if the hardware records a branch address that has subsequently been unmapped, and finally a s/%p/%px/ in our oops code. Thanks to: Anju T Sudhakar, Cédric Le Goater, Laurent Vivier, Madhavan Srinivasan, Naveen N. Rao, Ravi Bangoria" * tag 'powerpc-4.15-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: KVM: PPC: Book3S HV: Fix pending_pri value in kvmppc_xive_get_icp() KVM: PPC: Book3S: fix XIVE migration of pending interrupts powerpc/kernel: Print actual address of regs when oopsing powerpc/perf: Fix kfree memory allocated for nest pmus powerpc/perf/imc: Fix nest-imc cpuhotplug callback failure powerpc/perf: Dereference BHRB entries safely
2017-12-22arch, mm: Allow arch_dup_mmap() to failThomas Gleixner
In order to sanitize the LDT initialization on x86 arch_dup_mmap() must be allowed to fail. Fix up all instances. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirsky <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: dan.j.williams@intel.com Cc: hughd@google.com Cc: keescook@google.com Cc: kirill.shutemov@linux.intel.com Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-22KVM: PPC: Book3S HV: Fix pending_pri value in kvmppc_xive_get_icp()Laurent Vivier
When we migrate a VM from a POWER8 host (XICS) to a POWER9 host (XICS-on-XIVE), we have an error: qemu-kvm: Unable to restore KVM interrupt controller state \ (0xff000000) for CPU 0: Invalid argument This is because kvmppc_xics_set_icp() checks the new state is internaly consistent, and especially: ... 1129 if (xisr == 0) { 1130 if (pending_pri != 0xff) 1131 return -EINVAL; ... On the other side, kvmppc_xive_get_icp() doesn't set neither the pending_pri value, nor the xisr value (set to 0) (and kvmppc_xive_set_icp() ignores the pending_pri value) As xisr is 0, pending_pri must be set to 0xff. Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Laurent Vivier <lvivier@redhat.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-22KVM: PPC: Book3S: fix XIVE migration of pending interruptsCédric Le Goater
When restoring a pending interrupt, we are setting the Q bit to force a retrigger in xive_finish_unmask(). But we also need to force an EOI in this case to reach the same initial state : P=1, Q=0. This can be done by not setting 'old_p' for pending interrupts which will inform xive_finish_unmask() that an EOI needs to be sent. Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.12+ Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Tested-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-19powerpc/kernel: Print actual address of regs when oopsingMichael Ellerman
When we oops or otherwise call show_regs() we print the address of the regs structure. Being able to see the address is fairly useful, firstly to verify that the regs pointer is not completely bogus, and secondly it allows you to dump the regs and surrounding memory with a debugger if you have one. In the normal case the regs will be located somewhere on the stack, so printing their location discloses no further information than printing the stack pointer does already. So switch to %px and print the actual address, not the hashed value. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-15bpf, ppc64: do not reload skb pointers in non-skb contextDaniel Borkmann
The assumption of unconditionally reloading skb pointers on BPF helper calls where bpf_helper_changes_pkt_data() holds true is wrong. There can be different contexts where the helper would enforce a reload such as in case of XDP. Here, we do have a struct xdp_buff instead of struct sk_buff as context, thus this will access garbage. JITs only ever need to deal with cached skb pointer reload when ld_abs/ind was seen, therefore guard the reload behind SEEN_SKB. Fixes: 156d0e290e96 ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-14KVM: introduce kvm_arch_vcpu_async_ioctlPaolo Bonzini
After the vcpu_load/vcpu_put pushdown, the handling of asynchronous VCPU ioctl is already much clearer in that it is obvious that they bypass vcpu_load and vcpu_put. However, it is still not perfect in that the different state of the VCPU mutex is still hidden in the caller. Separate those ioctls into a new function kvm_arch_vcpu_async_ioctl that returns -ENOIOCTLCMD for more "traditional" synchronous ioctls. Cc: James Hogan <jhogan@kernel.org> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Suggested-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctlChristoffer Dall
Move the calls to vcpu_load() and vcpu_put() in to the architecture specific implementations of kvm_arch_vcpu_ioctl() which dispatches further architecture-specific ioctls on to other functions. Some architectures support asynchronous vcpu ioctls which cannot call vcpu_load() or take the vcpu->mutex, because that would prevent concurrent execution with a running VCPU, which is the intended purpose of these ioctls, for example because they inject interrupts. We repeat the separate checks for these specifics in the architecture code for MIPS, S390 and PPC, and avoid taking the vcpu->mutex and calling vcpu_load for these ioctls. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_set_guest_debugChristoffer Dall
Move vcpu_load() and vcpu_put() into the architecture specific implementations of kvm_arch_vcpu_ioctl_set_guest_debug(). Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_translateChristoffer Dall
Move vcpu_load() and vcpu_put() into the architecture specific implementations of kvm_arch_vcpu_ioctl_translate(). Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_set_sregsChristoffer Dall
Move vcpu_load() and vcpu_put() into the architecture specific implementations of kvm_arch_vcpu_ioctl_set_sregs(). Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_get_sregsChristoffer Dall
Move vcpu_load() and vcpu_put() into the architecture specific implementations of kvm_arch_vcpu_ioctl_get_sregs(). Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_set_regsChristoffer Dall
Move vcpu_load() and vcpu_put() into the architecture specific implementations of kvm_arch_vcpu_ioctl_set_regs(). Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_get_regsChristoffer Dall
Move vcpu_load() and vcpu_put() into the architecture specific implementations of kvm_arch_vcpu_ioctl_get_regs(). Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_runChristoffer Dall
Move vcpu_load() and vcpu_put() into the architecture specific implementations of kvm_arch_vcpu_ioctl_run(). Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> # s390 parts Reviewed-by: Cornelia Huck <cohuck@redhat.com> [Rebased. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-13powerpc/perf: Fix kfree memory allocated for nest pmusAnju T Sudhakar
imc_common_cpuhp_mem_free() is the common function for all IMC (In-memory Collection counters) domains to unregister cpuhotplug callback and free memory. Since kfree of memory allocated for nest-imc (per_nest_pmu_arr) is in the common code, all domains (core/nest/thread) can do the kfree in the failure case. This could potentially create a call trace as shown below, where core(/thread/nest) imc pmu initialization fails and in the failure path imc_common_cpuhp_mem_free() free the memory(per_nest_pmu_arr), which is allocated by successfully registered nest units. The call trace is generated in a scenario where core-imc initialization is made to fail and a cpuhotplug is performed in a p9 system. During cpuhotplug ppc_nest_imc_cpu_offline() tries to access per_nest_pmu_arr, which is already freed by core-imc. NIP [c000000000cb6a94] mutex_lock+0x34/0x90 LR [c000000000cb6a88] mutex_lock+0x28/0x90 Call Trace: mutex_lock+0x28/0x90 (unreliable) perf_pmu_migrate_context+0x90/0x3a0 ppc_nest_imc_cpu_offline+0x190/0x1f0 cpuhp_invoke_callback+0x160/0x820 cpuhp_thread_fun+0x1bc/0x270 smpboot_thread_fn+0x250/0x290 kthread+0x1a8/0x1b0 ret_from_kernel_thread+0x5c/0x74 To address this scenario do the kfree(per_nest_pmu_arr) only in case of nest-imc initialization failure, and when there is no other nest units registered. Fixes: 73ce9aec65b1 ("powerpc/perf: Fix IMC_MAX_PMU macro") Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>