summaryrefslogtreecommitdiff
path: root/arch/powerpc/platforms/powernv
AgeCommit message (Collapse)Author
2017-11-07Merge branch 'linus' into locking/core, to resolve conflictsIngo Molnar
Conflicts: include/linux/compiler-clang.h include/linux/compiler-gcc.h include/linux/compiler-intel.h include/uapi/linux/stddef.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-25locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns ↵Mark Rutland
to READ_ONCE()/WRITE_ONCE() Please do not apply this to mainline directly, instead please re-run the coccinelle script shown below and apply its output. For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't harmful, and changing them results in churn. However, for some features, the read/write distinction is critical to correct operation. To distinguish these cases, separate read/write accessors must be used. This patch migrates (most) remaining ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following coccinelle script: ---- // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and // WRITE_ONCE() // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch virtual patch @ depends on patch @ expression E1, E2; @@ - ACCESS_ONCE(E1) = E2 + WRITE_ONCE(E1, E2) @ depends on patch @ expression E; @@ - ACCESS_ONCE(E) + READ_ONCE(E) ---- Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: viro@zeniv.linux.org.uk Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-06powerpc/powernv: Increase memory block size to 1GB on radixAnton Blanchard
Memory hot unplug on PowerNV radix hosts is broken. Our memory block size is 256MB but since we map the linear region with very large pages, each pte we tear down maps 1GB. A hot unplug of one 256MB memory block results in 768MB of memory getting unintentionally unmapped. At this point we are likely to oops. Fix this by increasing our memory block size to 1GB on PowerNV radix hosts. Fixes: 4b5d62ca17a1 ("powerpc/mm: add radix__remove_section_mapping()") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-20powerpc/powernv: Clear LPCR[PECE1] via stop-api only for deep state offlineGautham R. Shenoy
Commit 24be85a23d1f ("powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug") clears the PECE1 bit of the LPCR via stop-api during CPU-Hotplug to prevent wakeup due to a decrementer on an offlined CPU which is in a deep stop state. In the case where the stop-api support is found to be lacking, the commit 785a12afdb4a ("powerpc/powernv/idle: Disable LOSE_FULL_CONTEXT states when stop-api fails") disables deep states that lose hypervisor context. Thus in this case, the offlined CPU will be put to some shallow idle state. However, we currently unconditionally clear the PECE1 in LPCR via stop-api during CPU-Hotplug even when deep states are disabled due to stop-api failure. Fix this by clearing PECE1 of LPCR via stop-api during CPU-Hotplug *only* when the offlined CPU will be put to a deep state that loses hypervisor context. Fixes: 24be85a23d1f ("powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug") Reported-by: Pavithra Prakash <pavirampu@linux.vnet.ibm.com> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Pavithra Prakash <pavrampu@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-07Merge tag 'powerpc-4.14-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Nothing really major this release, despite quite a lot of activity. Just lots of things all over the place. Some things of note include: - Access via perf to a new type of PMU (IMC) on Power9, which can count both core events as well as nest unit events (Memory controller etc). - Optimisations to the radix MMU TLB flushing, mostly to avoid unnecessary Page Walk Cache (PWC) flushes when the structure of the tree is not changing. - Reworks/cleanups of do_page_fault() to modernise it and bring it closer to other architectures where possible. - Rework of our page table walking so that THP updates only need to send IPIs to CPUs where the affected mm has run, rather than all CPUs. - The size of our vmalloc area is increased to 56T on 64-bit hash MMU systems. This avoids problems with the percpu allocator on systems with very sparse NUMA layouts. - STRICT_KERNEL_RWX support on PPC32. - A new sched domain topology for Power9, to capture the fact that pairs of cores may share an L2 cache. - Power9 support for VAS, which is a new mechanism for accessing coprocessors, and initial support for using it with the NX compression accelerator. - Major work on the instruction emulation support, adding support for many new instructions, and reworking it so it can be used to implement the emulation needed to fixup alignment faults. - Support for guests under PowerVM to use the Power9 XIVE interrupt controller. And probably that many things again that are almost as interesting, but I had to keep the list short. Plus the usual fixes and cleanups as always. Thanks to: Alexey Kardashevskiy, Alistair Popple, Andreas Schwab, Aneesh Kumar K.V, Anju T Sudhakar, Arvind Yadav, Balbir Singh, Benjamin Herrenschmidt, Bhumika Goyal, Breno Leitao, Bryant G. Ly, Christophe Leroy, Cédric Le Goater, Dan Carpenter, Dou Liyang, Frederic Barrat, Gautham R. Shenoy, Geliang Tang, Geoff Levand, Hannes Reinecke, Haren Myneni, Ivan Mikhaylov, John Allen, Julia Lawall, LABBE Corentin, Laurentiu Tudor, Madhavan Srinivasan, Markus Elfring, Masahiro Yamada, Matt Brown, Michael Neuling, Murilo Opsfelder Araujo, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Rashmica Gupta, Rob Herring, Rui Teng, Sam Bobroff, Santosh Sivaraj, Scott Wood, Shilpasri G Bhat, Sukadev Bhattiprolu, Suraj Jitindar Singh, Tobin C. Harding, Victor Aoqui" * tag 'powerpc-4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (321 commits) powerpc/xive: Fix section __init warning powerpc: Fix kernel crash in emulation of vector loads and stores powerpc/xive: improve debugging macros powerpc/xive: add XIVE Exploitation Mode to CAS powerpc/xive: introduce H_INT_ESB hcall powerpc/xive: add the HW IRQ number under xive_irq_data powerpc/xive: introduce xive_esb_write() powerpc/xive: rename xive_poke_esb() in xive_esb_read() powerpc/xive: guest exploitation of the XIVE interrupt controller powerpc/xive: introduce a common routine xive_queue_page_alloc() powerpc/sstep: Avoid used uninitialized error axonram: Return directly after a failed kzalloc() in axon_ram_probe() axonram: Improve a size determination in axon_ram_probe() axonram: Delete an error message for a failed memory allocation in axon_ram_probe() powerpc/powernv/npu: Move tlb flush before launching ATSD powerpc/macintosh: constify wf_sensor_ops structures powerpc/iommu: Use permission-specific DEVICE_ATTR variants powerpc/eeh: Delete an error out of memory message at init time powerpc/mm: Use seq_putc() in two functions macintosh: Convert to using %pOF instead of full_name ...
2017-09-01powerpc/powernv/npu: Move tlb flush before launching ATSDAlistair Popple
The nest MMU tlb flush needs to happen before the GPU translation shootdown is launched to avoid the GPU refilling its tlb with stale nmmu translations prior to the nmmu flush completing. Fixes: 1ab66d1fbada ("powerpc/powernv: Introduce address translation services for Nvlink2") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/powernv: update to new mmu_notifier semanticJérôme Glisse
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and now are bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Alistair Popple <alistair@popple.id.au> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31powerpc/asm: Convert .llong directives to .8byteTobin C. Harding
.llong is an undocumented PPC specific directive. The generic equivalent is .quad, but even better (because it's self describing) is .8byte. Convert all .llong directives to .8byte. Signed-off-by: Tobin C. Harding <me@tobin.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/powernv/vas: Define copy/paste interfacesSukadev Bhattiprolu
Define interfaces (wrappers) to the 'copy' and 'paste' instructions (which are new in PowerISA 3.0). These are intended to be used to by NX driver(s) to submit Coprocessor Request Blocks (CRBs) to the NX hardware engines. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/powernv/vas: Define vas_tx_win_open()Sukadev Bhattiprolu
Define an interface to open a VAS send window. This interface is intended to be used the Nest Accelerator (NX) driver(s) to open a send window and use it to submit compression/encryption requests to a VAS receive window. The receive window, identified by the [vasid, cop] parameters, must already be open in VAS (i.e connected to an NX engine). Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/powernv/vas: Define vas_win_close() interfaceSukadev Bhattiprolu
Define the vas_win_close() interface which should be used to close a send or receive windows. While the hardware configurations required to open send and receive windows differ, the configuration to close a window is the same for both. So we use a single interface to close the window. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/powernv/vas: Define vas_rx_win_open() interfaceSukadev Bhattiprolu
Define the vas_rx_win_open() interface. This interface is intended to be used by the Nest Accelerator (NX) driver(s) to setup receive windows for one or more NX engines (which implement compression & encryption algorithms in the hardware). Follow-on patches will provide an interface to close the window and to open a send window that kernel subsystems can use to access the NX engines. The interface to open a receive window is expected to be invoked for each instance of VAS in the system. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/powernv/vas: Define helpers to alloc/free windowsSukadev Bhattiprolu
Define helpers to allocate/free VAS window objects. These will be used in follow-on patches when opening/closing windows. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/powernv/vas: Define helpers to init window contextSukadev Bhattiprolu
Define helpers to initialize window context registers of the VAS hardware. These will be used in follow-on patches when opening/closing VAS windows. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/powernv/vas: Define helpers to access MMIO regionsSukadev Bhattiprolu
Define some helper functions to access the MMIO regions. We use these in follow-on patches to read/write VAS hardware registers. They are also used to later issue 'paste' instructions to submit requests to the NX hardware engines. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/powernv/vas: Define vas_init() and vas_exit()Sukadev Bhattiprolu
Implement vas_init() and vas_exit() functions for a new VAS module. This VAS module is essentially a library for other device drivers and kernel users of the NX coprocessors like NX-842 and NX-GZIP. In the future this will be extended to add support for user space to access the NX coprocessors. VAS is currently only supported with 64K page size. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/powernv/vas: Define macros, register fields and structuresSukadev Bhattiprolu
Define macros for the VAS hardware registers and bit-fields as well as couple of data structures needed by the VAS driver. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> [mpe: Fixup include guard to use _ASM_POWERPC_VAS_H] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/eeh: Remove unnecessary config_addr from eeh_devAlexey Kardashevskiy
The eeh_dev struct hold a config space address of an associated node and the very same address is also stored in the pci_dn struct which is always present during the eeh_dev lifetime. This uses bus:devfn directly from pci_dn instead of cached and packed config_addr. Since config_addr is made from device's bus:dev.fn, there is no point in keeping it in the debugfs either so remove that too. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/eeh: Remove unnecessary pointer to phb from eeh_devAlexey Kardashevskiy
The eeh_dev struct already holds a pointer to pci_dn which it does not exist without and pci_dn itself holds the very same pointer so just use it. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/eeh: Reduce to one the number of places where edev is allocatedAlexey Kardashevskiy
arch/powerpc/kernel/eeh_dev.c:57 is the only legit place where edev is allocated; other 2 places allocate it on stack and in the heap for a very short period of time to use eeh_pe_get() as takes edev. This changes eeh_pe_get() to receive required parameters explicitly. This removes unnecessary temporary allocation of edev. This uses the "pe_no" name instead of the "pe_config_addr" name as it actually is a PE number and not a config space address as it seemed. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/powernv: Use kernel crash path for machine checksNicholas Piggin
There are quite a few machine check exceptions that can be caused by kernel bugs. To make debugging easier, use the kernel crash path in cases of synchronous machine checks that occur in kernel mode, if that would not result in the machine going straight to panic or crash dump. There is a downside here that die()ing the process in kernel mode can still leave the system unstable. panic_on_oops will always force the system to fail-stop, so systems where that behaviour is important will still do the right thing. As a test, when triggering an i-side 0111b error (ifetch from foreign address) in kernel mode process context on POWER9, the kernel currently dies quickly like this: Severe Machine check interrupt [Not recovered] NIP [ffff000000000000]: 0xffff000000000000 Initiator: CPU Error type: Real address [Instruction fetch (foreign)] [ 127.426651616,0] OPAL: Reboot requested due to Platform error. Effective[ 127.426693712,3] OPAL: Reboot requested due to Platform error. address: ffff000000000000 opal: Reboot type 1 not supported Kernel panic - not syncing: PowerNV Unrecovered Machine Check CPU: 56 PID: 4425 Comm: syscall Tainted: G M 4.12.0-rc1-13857-ga4700a261072-dirty #35 Call Trace: [ 128.017988928,4] IPMI: BUG: Dropping ESEL on the floor due to buggy/mising code in OPAL for this BMC Rebooting in 10 seconds.. Trying to free IRQ 496 from IRQ context! After this patch, the process is killed and the kernel continues with this message, which gives enough information to identify the offending branch (i.e., with CFAR): Severe Machine check interrupt [Not recovered] NIP [ffff000000000000]: 0xffff000000000000 Initiator: CPU Error type: Real address [Instruction fetch (foreign)] Effective address: ffff000000000000 Oops: Machine check, sig: 7 [#1] SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 ... CPU: 22 PID: 4436 Comm: syscall Tainted: G M 4.12.0-rc1-13857-ga4700a261072-dirty #36 task: c000000932300000 task.stack: c000000932380000 NIP: ffff000000000000 LR: 00000000217706a4 CTR: ffff000000000000 REGS: c00000000fc8fd80 TRAP: 0200 Tainted: G M (4.12.0-rc1-13857-ga4700a261072-dirty) MSR: 90000000001c1003 <SF,HV,ME,RI,LE> CR: 24000484 XER: 20000000 CFAR: c000000000004c80 DAR: 0000000021770a90 DSISR: 0a000000 SOFTE: 1 GPR00: 0000000000001ebe 00007fffce4818b0 0000000021797f00 0000000000000000 GPR04: 00007fff8007ac24 0000000044000484 0000000000004000 00007fff801405e8 GPR08: 900000000280f033 0000000024000484 0000000000000000 0000000000000030 GPR12: 9000000000001003 00007fff801bc370 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR28: 00007fff801b0000 0000000000000000 00000000217707a0 00007fffce481918 NIP [ffff000000000000] 0xffff000000000000 LR [00000000217706a4] 0x217706a4 Call Trace: Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/powernv: Flush console before platform error rebootNicholas Piggin
Unrecovered MCE and HMI errors are sent through a special restart OPAL call to log the platform error. The downside is that they don't go through normal Linux crash paths, so they don't give much information to the Linux console. Change this by providing a special crash function which does some of the console flushing from the panic() path before calling firmware to reboot. The downside of this is a little more code to execute before reaching the firmware reboot. However in practice, it's critical to get the Linux console messages output in order to debug a problem. So this is a desirable tradeoff. Note on the implementation: It is difficult to plumb a custom reboot handler into the panic path, because panic does a little bit too much work. For example, it will try to delay with the timebase, but that may be corrupted in some cases resulting in a hang without reaching the platform reboot. Another problem is that panic can invoke the crash dump code which is not what we want in the case of a hardware platform error. Long-term the best solution will be to rework the panic path so it can be suitable for this kind of panic, but for now we just duplicate a bit of the code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/powernv: powernv platform is not constrained by RMANicholas Piggin
Remove incorrect comment about real mode address restrictions on powernv (bare metal), and unnecessary clamping to ppc64_rma_size. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-28powerpc/powernv: Fix build error in opal-imc.c when NUMA=nLABBE Corentin
When building a random powerpc kernel I hit this build error: arch/powerpc/platforms/powernv/opal-imc.c:130:13: error : assignment discards « const » qualifier from pointer target type [-Werror=discarded-qualifiers] l_cpumask = cpumask_of_node(nid); ^ This happens because when CONFIG_NUMA=n cpumask_of_node() returns a const pointer. This patch simply adds const to l_cpumask to fix this issue. Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> [mpe: Flesh out change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-24powerpc/powernv: Enable removal of memory for in memory tracingRashmica Gupta
The hardware trace macro feature requires access to a chunk of real memory. This patch provides a debugfs interface to do this. By writing an integer containing the size of memory to be unplugged into /sys/kernel/debug/powerpc/memtrace/enable, the code will attempt to remove that much memory from the end of each NUMA node. This patch also adds additional debugsfs files for each node that allows the tracer to interact with the removed memory, as well as a trace file that allows userspace to read the generated trace. Note that this patch does not invoke the hardware trace macro, it only allows memory to be removed during runtime for the trace macro to utilise. Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com> [mpe: Minor formatting etc fixups] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-23powerpc: Convert to using %pOF instead of full_nameRob Herring
Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Anatolij Gustschin <agust@denx.de> Cc: Scott Wood <oss@buserror.net> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linuxppc-dev@lists.ozlabs.org Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-23Merge branch 'fixes' into nextMichael Ellerman
There's a non-trivial dependency between some commits we want to put in next and the KVM prefetch work around that went into fixes. So merge fixes into next.
2017-08-17powerpc: Add const to bin_attribute structuresBhumika Goyal
Declare bin_attribute structures as const as they are only passed as an argument to the function sysfs_create_bin_file. This argument is of type const, so declare the structure as const. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-10powerpc/powernv: Add support to clear sensor groups dataShilpasri G Bhat
Adds support for clearing different sensor groups. OCC inband sensor groups like CSM, Profiler, Job Scheduler can be cleared using this driver. The min/max of all sensors belonging to these sensor groups will be cleared. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-10powerpc/powernv: Add support to set power-shifting-ratioShilpasri G Bhat
This patch adds support to set power-shifting-ratio which hints the firmware how to distribute/throttle power between different entities in a system (e.g CPU v/s GPU). This ratio is used by OCC for power capping algorithm. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-10powerpc/powernv: Add support for powercap frameworkShilpasri G Bhat
Adds a generic powercap framework to change the system powercap inband through OPAL-OCC command/response interface. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-08powerpc/powernv/idle: Disable LOSE_FULL_CONTEXT states when stop-api failsGautham R. Shenoy
Currently, we use the opal call opal_slw_set_reg() to inform the Sleep-Winkle Engine (SLW) to restore the contents of some of the Hypervisor state on wakeup from deep idle states that lose full hypervisor context (characterized by the flag OPAL_PM_LOSE_FULL_CONTEXT). However, the current code has a bug in that if opal_slw_set_reg() fails, we don't disable the use of these deep states (winkle on POWER8, stop4 onwards on POWER9). This patch fixes this bug by ensuring that if programing the sleep-winkle engine to restore the hypervisor states in pnv_save_sprs_for_deep_states() fails, then we exclude such states by clearing the OPAL_PM_LOSE_FULL_CONTEXT flag from supported_cpuidle_states. As a result POWER8 will be prevented from using winkle for CPU-Hotplug, and POWER9 will put the offlined CPUs to the default stop state when available. Further, we ensure in the initialization of the cpuidle-powernv driver to only include those states whose flags are present in supported_cpuidle_states, thereby skipping OPAL_PM_LOSE_FULL_CONTEXT states when they have been disabled due to stop-api failure. Fixes: 1e1601b38e6 ("powerpc/powernv/idle: Restore SPRs for deep idle states via stop API.") Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-08powerpc/powernv: Use darn instruction for get_random_seed() on Power9Matt Brown
This adds powernv_get_random_darn() which utilises the darn instruction, introduced in ISA v3.0/POWER9. The darn instruction can potentially return an error, which is supported by the get_random_seed() API, in normal usage if we see an error we just return that to the caller. However when detecting whether darn is functional at boot we try up to 10 times, before deciding that darn doesn't work and failing the registration of get_random_seed(). That way an intermittent failure at boot doesn't deprive the system of randomness until the next reboot. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> [mpe: Move init into a function, tweak change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-08powerpc/powernv: Enable PCI peer-to-peerFrederic Barrat
P9 has support for PCI peer-to-peer, enabling a device to write in the MMIO space of another device directly, without interrupting the CPU. This patch adds support for it on powernv, by adding a new API to be called by drivers. The pnv_pci_set_p2p(...) call configures an 'initiator', i.e the device which will issue the MMIO operation, and a 'target', i.e. the device on the receiving side. P9 really only supports MMIO stores for the time being but that's expected to change in the future, so the API allows to define both load and store operations. /* PCI p2p descriptor */ #define OPAL_PCI_P2P_ENABLE 0x1 #define OPAL_PCI_P2P_LOAD 0x2 #define OPAL_PCI_P2P_STORE 0x4 int pnv_pci_set_p2p(struct pci_dev *initiator, struct pci_dev *target, u64 desc) It uses a new OPAL call, as the configuration magic is done on the PHBs by skiboot. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> [mpe: Drop unrelated OPAL calls, s/uint64_t/u64/, minor formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-01powerpc/powernv: Clear PECE1 in LPCR via stop-api only on HotplugGautham R. Shenoy
Currently we use the stop-api provided by the firmware to program the SLW engine to restore the values of hypervisor resources that get lost on deeper idle states (such as winkle). Since the deep states were only used for CPU-Hotplug on POWER8 systems, we would program the LPCR to have the PECE1 bit since Hotplugged CPUs shouldn't be spuriously woken up by decrementer. On POWER9, some of the deep platform idle states such as stop4 can be used in cpuidle as well. In this case, we want the CPU in stop4 to be woken up by the decrementer when some timer on the CPU expires. In this patch, we program the stop-api for LPCR with PECE1 bit cleared only when we are offlining the CPU and set it back once the CPU is online. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-28powerpc/powernv/pci: Return failure for some uses of dma_set_mask()Alistair Popple
Commit 8e3f1b1d8255 ("powerpc/powernv/pci: Enable 64-bit devices to access >4GB DMA space") introduced the ability for PCI device drivers to request a DMA mask between 64 and 32 bits and actually get a mask greater than 32-bits. However currently if certain machine configuration dependent conditions are not meet the code silently falls back to a 32-bit mask. This makes it hard for device drivers to detect which mask they actually got. Instead we should return an error when the request could not be fulfilled which allows drivers to either fallback or implement other workarounds as documented in DMA-API-HOWTO.txt. Signed-off-by: Alistair Popple <alistair@popple.id.au> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-25powerpc/perf: Add nest IMC PMU supportAnju T Sudhakar
Add support to register Nest In-Memory Collection PMU counters. Patch adds a new device file called "imc-pmu.c" under powerpc/perf folder to contain all the device PMU functions. Device tree parser code added to parse the PMU events information and create sysfs event attributes for the PMU. Cpumask attribute added along with Cpu hotplug online/offline functions specific for nest PMU. A new state "CPUHP_AP_PERF_POWERPC_NEST_IMC_ONLINE" added for the cpu hotplug callbacks. Error handle path frees the memory and unregisters the CPU hotplug callbacks. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-25powerpc/powernv: Detect and create IMC deviceMadhavan Srinivasan
Code to create platform device for the In-Memory Collection (IMC) counters. Platform devices are created based on the IMC compatibility. New header file created to contain the data structures and macros needed for In-Memory Collection (IMC) counter pmu devices. The device tree for IMC counters starts at the node "imc-counters". This node contains all the IMC PMU nodes and event nodes for these IMC PMUs. Device probe() parses the device to locate three possible IMC device types (Nest/Core/Thread). Function then branch to parse each unit nodes to populate vital information such as device memory sizes, event nodes information, base address for reserve memory access (if any) and so on. Simple bare-minimum shutdown function added which only "stops" the engines. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> [mpe: Fix build with CONFIG_PERF_EVENTS=n] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-24powerpc/powernv: Add IMC OPAL APIsMadhavan Srinivasan
In-Memory Collection (IMC) counters are performance monitoring infrastructure. These counters need special sequence of SCOMs to init/start/stop which is handled by OPAL. And OPAL provides three APIs to init and control these IMC engines. OPAL API documentation: https://github.com/open-power/skiboot/blob/master/doc/opal-api/opal-imc-counters.rst Patch updates the kernel side powernv platform code to support the new OPAL APIs Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-24powerpc/powernv: use memdup_userGeliang Tang
Use memdup_user() helper instead of open-coding to simplify the code. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-24powerpc/powernv: Get cpu only after validity checkSantosh Sivaraj
Check for validity of cpu before calling get_hard_smp_processor_id(). Found with coverity. Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-17powerpc/powernv: Fix boot on Power8 bare metal due to opal_configure_cores()Michael Ellerman
In commit 1c0eaf0f56d6 ("powerpc/powernv: Tell OPAL about our MMU mode on POWER9"), we added additional flags to the OPAL call to configure CPUs at boot. These flags only work on Power9 firmwares, and worse can cause boot failures on Power8 machines, so we check for CPU_FTR_ARCH_300 (aka POWER9) before adding the extra flags. Unfortunately we forgot that opal_configure_cores() is called before the CPU feature checks are dynamically patched, meaning the check always returns true. We definitely need to do something to make the CPU feature checks less prone to bugs like this, but for now the minimal fix is to use early_cpu_has_feature(). Reported-and-tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Fixes: 1c0eaf0f56d6 ("powerpc/powernv: Tell OPAL about our MMU mode on POWER9") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-10powerpc/powernv: Tell OPAL about our MMU mode on POWER9Benjamin Herrenschmidt
That will allow OPAL to configure the CPU in an optimal way. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-07Merge tag 'powerpc-4.13-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights include: - Support for STRICT_KERNEL_RWX on 64-bit server CPUs. - Platform support for FSP2 (476fpe) board - Enable ZONE_DEVICE on 64-bit server CPUs. - Generic & powerpc spin loop primitives to optimise busy waiting - Convert VDSO update function to use new update_vsyscall() interface - Optimisations to hypercall/syscall/context-switch paths - Improvements to the CPU idle code on Power8 and Power9. As well as many other fixes and improvements. Thanks to: Akshay Adiga, Andrew Donnellan, Andrew Jeffery, Anshuman Khandual, Anton Blanchard, Balbir Singh, Benjamin Herrenschmidt, Christophe Leroy, Christophe Lombard, Colin Ian King, Dan Carpenter, Gautham R. Shenoy, Hari Bathini, Ian Munsie, Ivan Mikhaylov, Javier Martinez Canillas, Madhavan Srinivasan, Masahiro Yamada, Matt Brown, Michael Neuling, Michal Suchanek, Murilo Opsfelder Araujo, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pavel Machek, Russell Currey, Santosh Sivaraj, Stephen Rothwell, Thiago Jung Bauermann, Yang Li" * tag 'powerpc-4.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (158 commits) powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs powerpc/mm/radix: Implement STRICT_RWX/mark_rodata_ro() for Radix powerpc/mm/hash: Implement mark_rodata_ro() for hash powerpc/vmlinux.lds: Align __init_begin to 16M powerpc/lib/code-patching: Use alternate map for patch_instruction() powerpc/xmon: Add patch_instruction() support for xmon powerpc/kprobes/optprobes: Use patch_instruction() powerpc/kprobes: Move kprobes over to patch_instruction() powerpc/mm/radix: Fix execute permissions for interrupt_vectors powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() powerpc/64s: Blacklist rtas entry/exit from kprobes powerpc/64s: Blacklist functions invoked on a trap powerpc/64s: Un-blacklist system_call() from kprobes powerpc/64s: Move system_call() symbol to just after setting MSR_EE powerpc/64s: Blacklist system_call() and system_call_common() from kprobes powerpc/64s: Convert .L__replay_interrupt_return to a local label powerpc64/elfv1: Only dereference function descriptor for non-text symbols cxl: Export library to support IBM XSL powerpc/dts: Use #include "..." to include local DT powerpc/perf/hv-24x7: Aggregate result elements on POWER9 SMT8 ...
2017-07-03Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull SMP hotplug updates from Thomas Gleixner: "This update is primarily a cleanup of the CPU hotplug locking code. The hotplug locking mechanism is an open coded RWSEM, which allows recursive locking. The main problem with that is the recursive nature as it evades the full lockdep coverage and hides potential deadlocks. The rework replaces the open coded RWSEM with a percpu RWSEM and establishes full lockdep coverage that way. The bulk of the changes fix up recursive locking issues and address the now fully reported potential deadlocks all over the place. Some of these deadlocks have been observed in the RT tree, but on mainline the probability was low enough to hide them away." * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) cpu/hotplug: Constify attribute_group structures powerpc: Only obtain cpu_hotplug_lock if called by rtasd ARM/hw_breakpoint: Fix possible recursive locking for arch_hw_breakpoint_init cpu/hotplug: Remove unused check_for_tasks() function perf/core: Don't release cred_guard_mutex if not taken cpuhotplug: Link lock stacks for hotplug callbacks acpi/processor: Prevent cpu hotplug deadlock sched: Provide is_percpu_thread() helper cpu/hotplug: Convert hotplug locking to percpu rwsem s390: Prevent hotplug rwsem recursion arm: Prevent hotplug rwsem recursion arm64: Prevent cpu hotplug rwsem recursion kprobes: Cure hotplug lock ordering issues jump_label: Reorder hotplug lock and jump_label_lock perf/tracing/cpuhotplug: Fix locking order ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus() PCI: Replace the racy recursion prevention PCI: Use cpu_hotplug_disable() instead of get_online_cpus() perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode() x86/perf: Drop EXPORT of perf_check_microcode ...
2017-07-03Merge branch 'fixes' into nextMichael Ellerman
Merge our fixes branch, a few of them are tripping people up while working on top of next, and we also have a dependency between the CXL fixes and new CXL code we want to merge into next.
2017-06-28powerpc/smp: Convert NR_CPUS to nr_cpu_idsSantosh Sivaraj
nr_cpu_ids can be limited by nr_cpus boot parameter, whereas NR_CPUS is a compile time constant, which shouldn't be compared against during cpu kick. Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-28powerpc/smp: Do not BUG_ON if invalid CPU during kickSantosh Sivaraj
During secondary start, we do not need to BUG_ON if an invalid CPU number is passed. We already print an error if secondary cannot be started, so just return an error instead. Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-27powerpc/powernv/pci: Enable 64-bit devices to access >4GB DMA spaceRussell Currey
On PHB3/POWER8 systems, devices can select between two different sections of address space, TVE#0 and TVE#1. TVE#0 is intended for 32bit devices that aren't capable of addressing more than 4GB. Selecting TVE#1 instead, with the capability of addressing over 4GB, is performed by setting bit 59 of a PCI address. However, some devices aren't capable of addressing at least 59 bits, but still want more than 4GB of DMA space. In order to enable this, reconfigure TVE#0 to be suitable for 64-bit devices by allocating memory past the initial 4GB that is inaccessible by 64-bit DMAs. This bypass mode is only enabled if a device requests 4GB or more of DMA address space, if the system has PHB3 (POWER8 systems), and if the device does not share a PE with any devices from different vendors. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>