summaryrefslogtreecommitdiff
path: root/arch/powerpc/kernel
AgeCommit message (Collapse)Author
2020-10-06powerpc/eeh: Rework EEH initialisationOliver O'Halloran
Drop the EEH register / unregister ops thing and have the platform pass the ops structure into eeh_init() directly. This takes one initcall out of the EEH setup path and it means we're only doing EEH setup on the platforms which actually support it. It's also less code and generally easier to follow. No functional changes. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200918093050.37344-1-oohall@gmail.com
2020-10-06powerpc/64: make restore_interrupts 64e onlyNicholas Piggin
This is not used by 64s. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200915114650.3980244-5-npiggin@gmail.com
2020-10-06powerpc/64e: remove 64s specific interrupt soft-mask codeNicholas Piggin
Since the assembly soft-masking code was moved to 64e specific, there are some 64s specific interrupt types still there. Remove them. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200915114650.3980244-4-npiggin@gmail.com
2020-10-06powerpc/64e: remove PACA_IRQ_EE_EDGENicholas Piggin
This is not used anywhere. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200915114650.3980244-3-npiggin@gmail.com
2020-10-06powerpc/64: fix irq replay pt_regs->softe valueNicholas Piggin
Replayed interrupts get an "artificial" struct pt_regs constructed to pass to interrupt handler functions. This did not get the softe field set correctly, it's as though the interrupt has hit while irqs are disabled. It should be IRQS_ENABLED. This is possibly harmless, asynchronous handlers should not be testing if irqs were disabled, but it might be possible for example some code is shared with synchronous or NMI handlers, and it makes more sense if debug output looks at this. Fixes: 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in C") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200915114650.3980244-2-npiggin@gmail.com
2020-10-06powerpc/64: fix irq replay missing preemptNicholas Piggin
Prior to commit 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in C"), replayed interrupts returned by the regular interrupt exit code, which performs preemption in case an interrupt had set need_resched. This logic was missed by the conversion. Adding preempt_disable/enable around the interrupt replay and final irq enable will reschedule if needed. Fixes: 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in C") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200915114650.3980244-1-npiggin@gmail.com
2020-10-06powerpc: untangle cputable mce includeNicholas Piggin
Having cputable.h include mce.h means it pulls in a bunch of low level headers (e.g., synch.h) which then can't use CPU_FTR_ definitions. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200916030234.4110379-1-npiggin@gmail.com
2020-10-03mm: remove compat_process_vm_{readv,writev}Christoph Hellwig
Now that import_iovec handles compat iovecs, the native syscalls can be used for the compat case as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03fs: remove compat_sys_vmspliceChristoph Hellwig
Now that import_iovec handles compat iovecs, the native vmsplice syscall can be used for the compat case as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03fs: remove the compat readv/writev syscallsChristoph Hellwig
Now that import_iovec handles compat iovecs, the native readv and writev syscalls can be used for the compat case as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-25dma-mapping: add a new dma_alloc_pages APIChristoph Hellwig
This API is the equivalent of alloc_pages, except that the returned memory is guaranteed to be DMA addressable by the passed in device. The implementation will also be used to provide a more sensible replacement for DMA_ATTR_NON_CONSISTENT flag. Additionally dma_alloc_noncoherent is switched over to use dma_alloc_pages as its backend. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> (MIPS part)
2020-09-25Merge branch 'master' of ↵Christoph Hellwig
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into dma-mapping-for-next Pull in the latest 5.9 tree for the commit to revert the V4L2_FLAG_MEMORY_NON_CONSISTENT uapi addition.
2020-09-25kbuild: preprocess module linker scriptMasahiro Yamada
There was a request to preprocess the module linker script like we do for the vmlinux one. (https://lkml.org/lkml/2020/8/21/512) The difference between vmlinux.lds and module.lds is that the latter is needed for external module builds, thus must be cleaned up by 'make mrproper' instead of 'make clean'. Also, it must be created by 'make modules_prepare'. You cannot put it in arch/$(SRCARCH)/kernel/, which is cleaned up by 'make clean'. I moved arch/$(SRCARCH)/kernel/module.lds to arch/$(SRCARCH)/include/asm/module.lds.h, which is included from scripts/module.lds.S. scripts/module.lds is fine because 'make clean' keeps all the build artifacts under scripts/. You can add arch-specific sections in <asm/module.lds.h>. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Jessica Yu <jeyu@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Jessica Yu <jeyu@kernel.org>
2020-09-22fs: remove compat_sys_mountChristoph Hellwig
compat_sys_mount is identical to the regular sys_mount now, so remove it and use the native version everywhere. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-18powerpc/sysfs: Remove unused 'err' variable in sysfs_create_dscr_default()Cédric Le Goater
This fixes a compile error with W=1. arch/powerpc/kernel/sysfs.c: In function ‘sysfs_create_dscr_default’: arch/powerpc/kernel/sysfs.c:228:7: error: variable ‘err’ set but not used [-Werror=unused-but-set-variable] int err = 0; ^~~ cc1: all warnings being treated as errors Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200914211007.2285999-2-clg@kaod.org
2020-09-18powerpc/prom_init: Check display props exist before enabling btextMichael Ellerman
It's possible to enable CONFIG_PPC_EARLY_DEBUG_BOOTX for a pseries kernel (maybe it shouldn't be), which is then booted with qemu/slof. But if you do that the kernel crashes in draw_byte(), with a DAR pointing somewhere near INT_MAX. Adding some debug to prom_init we see that we're not able to read the "address" property from OF, so we're just using whatever junk value was on the stack. So check the properties can be read properly from OF, if not we bail out before initialising btext, which avoids the crash. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Link: https://lore.kernel.org/r/20200821103407.3362149-1-mpe@ellerman.id.au
2020-09-18powerpc/smp: Move ppc_md.cpu_die() to smp_ops.cpu_offline_self()Michael Ellerman
We have smp_ops->cpu_die() and ppc_md.cpu_die(). One of them offlines the current CPU and one offlines another CPU, can you guess which is which? Also one is in smp_ops and one is in ppc_md? So rename ppc_md.cpu_die(), to cpu_offline_self(), because that's what it does. And move it into smp_ops where it belongs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200819015634.1974478-3-mpe@ellerman.id.au
2020-09-18powerpc/smp: Fold cpu_die() into its only callerMichael Ellerman
Avoid the eternal confusion between cpu_die() and __cpu_die() by removing the former, folding it into its only caller. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200819015634.1974478-2-mpe@ellerman.id.au
2020-09-18powerpc: Move arch_cpu_idle_dead() into smp.cMichael Ellerman
arch_cpu_idle_dead() is in idle.c, which makes sense, but it's inside a CONFIG_HOTPLUG_CPU block. It would be more at home in smp.c, inside the existing CONFIG_HOTPLUG_CPU block. Note that CONFIG_HOTPLUG_CPU depends on CONFIG_SMP so even though smp.c is not built for SMP=n builds, that's fine. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200819015634.1974478-1-mpe@ellerman.id.au
2020-09-18powerpc/process: Fix uninitialised variable errorMichael Ellerman
Clang, and GCC with -Wmaybe-uninitialized, can't see that val is unused in get_fpexec_mode(): arch/powerpc/kernel/process.c:1940:7: error: variable 'val' is used uninitialized whenever 'if' condition is true if (cpu_has_feature(CPU_FTR_SPE)) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ We know that CPU_FTR_SPE will only be true iff CONFIG_SPE is also true, but the compiler doesn't. Avoid it by initialising val to zero. Reported-by: kernel test robot <lkp@intel.com> Fixes: 532ed1900d37 ("powerpc/process: Remove useless #ifdef CONFIG_SPE") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20200917024509.3253837-1-mpe@ellerman.id.au
2020-09-16powerpc/smp: Create coregroup domainSrikar Dronamraju
Add percpu coregroup maps and masks to create coregroup domain. If a coregroup doesn't exist, the coregroup domain will be degenerated in favour of SMT/CACHE domain. Do note this patch is only creating stubs for cpu_to_coregroup_id. The actual cpu_to_coregroup_id implementation would be in a subsequent patch. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200810071834.92514-10-srikar@linux.vnet.ibm.com
2020-09-16powerpc/smp: Allocate cpumask only after searching thread groupSrikar Dronamraju
If allocated earlier and the search fails, then cpu_l1_cache_map cpumask is unnecessarily cleared. However cpu_l1_cache_map can be allocated / cleared after we search thread group. Please note CONFIG_CPUMASK_OFFSTACK is not set on Powerpc. Hence cpumask allocated by zalloc_cpumask_var_node is never freed. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200810071834.92514-9-srikar@linux.vnet.ibm.com
2020-09-16powerpc/numa: Detect support for coregroupSrikar Dronamraju
Add support for grouping cores based on the device-tree classification. - The last domain in the associativity domains always refers to the core. - If primary reference domain happens to be the penultimate domain in the associativity domains device-tree property, then there are no coregroups. However if its not a penultimate domain, then there are coregroups. There can be more than one coregroup. For now we would be interested in the last or the smallest coregroups, i.e one sub-group per DIE. Currently there are no firmwares that are exposing this grouping. Hence allow the basis for grouping to be abstract. Once the firmware starts using this grouping, code would be added to detect the type of grouping and adjust the sd domain flags accordingly. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200810071834.92514-8-srikar@linux.vnet.ibm.com
2020-09-16powerpc/smp: Optimize start_secondarySrikar Dronamraju
In start_secondary, even if shared_cache was already set, system does a redundant match for cpumask. This redundant check can be removed by checking if shared_cache is already set. While here, localize the sibling_mask variable to within the if condition. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200810071834.92514-7-srikar@linux.vnet.ibm.com
2020-09-16powerpc/smp: Dont assume l2-cache to be superset of siblingSrikar Dronamraju
Current code assumes that cpumask of cpus sharing a l2-cache mask will always be a superset of cpu_sibling_mask. Lets stop that assumption. cpu_l2_cache_mask is a superset of cpu_sibling_mask if and only if shared_caches is set. Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200913171038.GB11808@linux.vnet.ibm.com
2020-09-16powerpc/smp: Move topology fixups into a new functionSrikar Dronamraju
Move topology fixup based on the platform attributes into its own function which is called just before set_sched_topology. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200810071834.92514-5-srikar@linux.vnet.ibm.com
2020-09-16powerpc/smp: Move powerpc_topology aboveSrikar Dronamraju
Just moving the powerpc_topology description above. This will help in using functions in this file and avoid declarations. No other functional changes Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200810071834.92514-4-srikar@linux.vnet.ibm.com
2020-09-16powerpc/smp: Merge Power9 topology with Power topologySrikar Dronamraju
A new sched_domain_topology_level was added just for Power9. However the same can be achieved by merging powerpc_topology with power9_topology and makes the code more simpler especially when adding a new sched domain. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200810071834.92514-3-srikar@linux.vnet.ibm.com
2020-09-16powerpc/smp: Fix a warning under !NEED_MULTIPLE_NODESSrikar Dronamraju
Fix a build warning in a non CONFIG_NEED_MULTIPLE_NODES "error: _numa_cpu_lookup_table_ undeclared" Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200810071834.92514-2-srikar@linux.vnet.ibm.com
2020-09-15powerpc/pci: unmap legacy INTx interrupts when a PHB is removedCédric Le Goater
When a passthrough IO adapter is removed from a pseries machine using hash MMU and the XIVE interrupt mode, the POWER hypervisor expects the guest OS to clear all page table entries related to the adapter. If some are still present, the RTAS call which isolates the PCI slot returns error 9001 "valid outstanding translations" and the removal of the IO adapter fails. This is because when the PHBs are scanned, Linux maps automatically the INTx interrupts in the Linux interrupt number space but these are never removed. To solve this problem, we introduce a PPC platform specific pcibios_remove_bus() routine which clears all interrupt mappings when the bus is removed. This also clears the associated page table entries of the ESB pages when using XIVE. For this purpose, we record the logical interrupt numbers of the mapped interrupt under the PHB structure and let pcibios_remove_bus() do the clean up. Since some PCI adapters, like GPUs, use the "interrupt-map" property to describe interrupt mappings other than the legacy INTx interrupts, we can not restrict the size of the mapping array to PCI_NUM_INTX. The number of interrupt mappings is computed from the "interrupt-map" property and the mapping array is allocated accordingly. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200807101854.844619-1-clg@kaod.org
2020-09-15powerpc/powermac: Fix low_sleep_handler with KUAP and KUEPChristophe Leroy
low_sleep_handler() has an hardcoded restore of segment registers that doesn't take KUAP and KUEP into account. Use head_32's load_segment_registers() routine instead. Fixes: a68c31fc01ef ("powerpc/32s: Implement Kernel Userspace Access Protection") Fixes: 31ed2b13c48d ("powerpc/32s: Implement Kernel Userspace Execution Prevention.") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/21b05f7298c1b18f73e6e5b4cd5005aafa24b6da.1599820109.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Remove useless #ifdef CONFIG_PPC_FPUChristophe Leroy
Add a stub for __giveup_fpu() when CONFIG_PPC_FPU is not selected, as done for CONFIG_SPE and CONFIG_ALTIVEC. This allows to remove some #ifdef CONFIG_PPC_FPU. Also change one to IS_ENABLED(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/69c8b7954ceeccc6b849e52e1fa41b3a0f10f6c1.1597643221.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Remove useless #ifdef CONFIG_SPEChristophe Leroy
cpu_has_feature(CPU_FTR_SPE) returns false when CONFIG_SPE is not set. There is no need to enclose the test in an #ifdef CONFIG_SPE. Remove it. CPU_FTR_SPE only exists on 32 bits. Define it as 0 on 64 bits. We have a couple of places like: #ifdef CONFIG_SPE if (cpu_has_feature(CPU_FTR_SPE)) { do_something_that_requires_CONFIG_SPE } else { return -EINVAL; } #else return -EINVAL; #endif Replace them by a cleaner version: if (cpu_has_feature(CPU_FTR_SPE)) { #ifdef CONFIG_SPE do_something_that_requires_CONFIG_SPE #endif } else { return -EINVAL; } When CONFIG_SPE is not set, this resolves to an unconditional return of -EINVAL Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/698df8387555765b70ea42e4a7fa48141c309c1f.1597643221.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Remove useless #ifdef CONFIG_ALTIVECChristophe Leroy
cpu_has_feature(CPU_FTR_ALTIVEC) returns false when CONFIG_ALTIVEC is not set. There is no need to enclose the test in an #ifdef CONFIG_ALTIVEC. Remove it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/03ba6b52344ca7c336df2bc6e3d31d736c804ae2.1597643221.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Remove useless #ifdef CONFIG_VSXChristophe Leroy
cpu_has_feature(CPU_FTR_VSX) returns false when CONFIG_VSX is not set. There is no need to enclose the test in an #ifdef CONFIG_VSX. Remove it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0eb61cf0dc66d781d47deb2228498cd61d03a754.1597643221.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Tag an #endif to help locate the matching #ifdef.Christophe Leroy
That #endif is more than 100 lines after the matching #ifdef, and there are several #ifdef/#else/#endif inbetween. Tag it as /* CONFIG_PPC_BOOK3S_64 */ to help locate the matching #ifdef. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3612a8f8aaca16de3fc414a7e66293319d6e213c.1597643147.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Replace #ifdef CONFIG_KALLSYMS by IS_ENABLED()Christophe Leroy
The #ifdef CONFIG_KALLSYMS encloses some printk which can compile in all cases. Replace by IS_ENABLED(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2d89732a9062b2cf2651728804e4b8f6c9b9358e.1597643164.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Replace an #if defined(CONFIG_4xx) || defined(CONFIG_BOOKE) ↵Christophe Leroy
by IS_ENABLED() The #if defined(CONFIG_4xx) || defined(CONFIG_BOOKE) encloses some printk which can be compiled in all cases. Replace by IS_ENABLED(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a1b6ef3d657c8f249193442f56868fc358ea5b6c.1597643160.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Replace an #ifdef CONFIG_PPC_BOOK3S_64 by IS_ENABLED()Christophe Leroy
This #ifdef CONFIG_PPC_BOOK3S_64 calls preload_new_slb_context() when radix is not enabled. radix_enabled() is always defined, and the prototype for preload_new_slb_context() is always present, so the #ifdef is unneeded. Replace it by IS_ENABLED(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d31506ca9bac9def68cf7424eded63fdc4fb6660.1597643167.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Replace an #ifdef CONFIG_PPC_47x by IS_ENABLED()Christophe Leroy
isync() is always defined, no need for an #ifdef. Replace it by IS_ENABLED(CONFIG_PPC_47x). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ac8da0e3baa91dda805e1e492fd65aecd90c1fb5.1597643156.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/32: Fix vmap stack - Properly set r1 before activating MMUChristophe Leroy
We need r1 to be properly set before activating MMU, otherwise any new exception taken while saving registers into the stack in exception prologs will use the user stack, which is wrong and will even lockup or crash when KUAP is selected. Do that by switching the meaning of r11 and r1 until we have saved r1 to the stack: copy r1 into r11 and setup the new stack pointer in r1. To avoid complicating and impacting all generic and specific prolog code (and more), copy back r1 into r11 once r11 is save onto the stack. We could get rid of copying r1 back and forth at the cost of rewriting everything to use r1 instead of r11 all the way when CONFIG_VMAP_STACK is set, but the effort is probably not worth it. Fixes: 028474876f47 ("powerpc/32: prepare for CONFIG_VMAP_STACK") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8f85e8752ac5af602db7237ef53d634f4f3d3892.1599486108.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/32: Fix vmap stack - Do not activate MMU before reading task structChristophe Leroy
We need r1 to be properly set before activating MMU, so reading task_struct->stack must be done with MMU off. This means we need an additional register to play with MSR bits while r11 now points to the stack. For that, move r10 back to CR (As is already done for hash MMU) and use r10. We still don't have r1 correct yet when we activate MMU. It is done in following patch. Fixes: 028474876f47 ("powerpc/32: prepare for CONFIG_VMAP_STACK") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a027d447022a006c9c4958ac734128e577a3c5c1.1599486108.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/tau: Disable TAU between measurementsFinn Thain
Enabling CONFIG_TAU_INT causes random crashes: Unrecoverable exception 1700 at c0009414 (msr=1000) Oops: Unrecoverable exception, sig: 6 [#1] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-pmac-00043-gd5f545e1a8593 #5 NIP: c0009414 LR: c0009414 CTR: c00116fc REGS: c0799eb8 TRAP: 1700 Not tainted (5.7.0-pmac-00043-gd5f545e1a8593) MSR: 00001000 <ME> CR: 22000228 XER: 00000100 GPR00: 00000000 c0799f70 c076e300 00800000 0291c0ac 00e00000 c076e300 00049032 GPR08: 00000001 c00116fc 00000000 dfbd3200 ffffffff 007f80a8 00000000 00000000 GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 c075ce04 GPR24: c075ce04 dfff8880 c07b0000 c075ce04 00080000 00000001 c079ef98 c079ef5c NIP [c0009414] arch_cpu_idle+0x24/0x6c LR [c0009414] arch_cpu_idle+0x24/0x6c Call Trace: [c0799f70] [00000001] 0x1 (unreliable) [c0799f80] [c0060990] do_idle+0xd8/0x17c [c0799fa0] [c0060ba4] cpu_startup_entry+0x20/0x28 [c0799fb0] [c072d220] start_kernel+0x434/0x44c [c0799ff0] [00003860] 0x3860 Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX 3d20c07b XXXXXXXX XXXXXXXX XXXXXXXX 7c0802a6 XXXXXXXX XXXXXXXX XXXXXXXX 4e800421 XXXXXXXX XXXXXXXX XXXXXXXX 7d2000a6 ---[ end trace 3a0c9b5cb216db6b ]--- Resolve this problem by disabling each THRMn comparator when handling the associated THRMn interrupt and by disabling the TAU entirely when updating THRMn thresholds. Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5a0ba3dc5612c7aac596727331284a3676c08472.1599260540.git.fthain@telegraphics.com.au
2020-09-15powerpc/tau: Check processor type before enabling TAU interruptFinn Thain
According to Freescale's documentation, MPC74XX processors have an erratum that prevents the TAU interrupt from working, so don't try to use it when running on those processors. Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c281611544768e758bd58fe812cf702a5bd2d042.1599260540.git.fthain@telegraphics.com.au
2020-09-15powerpc/tau: Remove duplicated set_thresholds() callFinn Thain
The commentary at the call site seems to disagree with the code. The conditional prevents calling set_thresholds() via the exception handler, which appears to crash. Perhaps that's because it immediately triggers another TAU exception. Anyway, calling set_thresholds() from TAUupdate() is redundant because tau_timeout() does so. Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d7c7ee33232cf72a6a6bbb6ef05838b2e2b113c0.1599260540.git.fthain@telegraphics.com.au
2020-09-15powerpc/tau: Convert from timer to workqueueFinn Thain
Since commit 19dbdcb8039cf ("smp: Warn on function calls from softirq context") the Thermal Assist Unit driver causes a warning like the following when CONFIG_SMP is enabled. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/smp.c:428 smp_call_function_many_cond+0xf4/0x38c Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-pmac #3 NIP: c00b37a8 LR: c00b3abc CTR: c001218c REGS: c0799c60 TRAP: 0700 Not tainted (5.7.0-pmac) MSR: 00029032 <EE,ME,IR,DR,RI> CR: 42000224 XER: 00000000 GPR00: c00b3abc c0799d18 c076e300 c079ef5c c0011fec 00000000 00000000 00000000 GPR08: 00000100 00000100 00008000 ffffffff 42000224 00000000 c079d040 c079d044 GPR16: 00000001 00000000 00000004 c0799da0 c079f054 c07a0000 c07a0000 00000000 GPR24: c0011fec 00000000 c079ef5c c079ef5c 00000000 00000000 00000000 00000000 NIP [c00b37a8] smp_call_function_many_cond+0xf4/0x38c LR [c00b3abc] on_each_cpu+0x38/0x68 Call Trace: [c0799d18] [ffffffff] 0xffffffff (unreliable) [c0799d68] [c00b3abc] on_each_cpu+0x38/0x68 [c0799d88] [c0096704] call_timer_fn.isra.26+0x20/0x7c [c0799d98] [c0096b40] run_timer_softirq+0x1d4/0x3fc [c0799df8] [c05b4368] __do_softirq+0x118/0x240 [c0799e58] [c0039c44] irq_exit+0xc4/0xcc [c0799e68] [c000ade8] timer_interrupt+0x1b0/0x230 [c0799ea8] [c0013520] ret_from_except+0x0/0x14 --- interrupt: 901 at arch_cpu_idle+0x24/0x6c LR = arch_cpu_idle+0x24/0x6c [c0799f70] [00000001] 0x1 (unreliable) [c0799f80] [c0060990] do_idle+0xd8/0x17c [c0799fa0] [c0060ba8] cpu_startup_entry+0x24/0x28 [c0799fb0] [c072d220] start_kernel+0x434/0x44c [c0799ff0] [00003860] 0x3860 Instruction dump: 8129f204 2f890000 40beff98 3d20c07a 8929eec4 2f890000 40beff88 0fe00000 81220000 552805de 550802ef 4182ff84 <0fe00000> 3860ffff 7f65db78 7f44d378 ---[ end trace 34a886e47819c2eb ]--- Don't call on_each_cpu() from a timer callback, call it from a worker thread instead. Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bb61650bea4f4c91fb8e24b9a6f130a1438651a7.1599260540.git.fthain@telegraphics.com.au
2020-09-15powerpc/tau: Use appropriate temperature sample intervalFinn Thain
According to the MPC750 Users Manual, the SITV value in Thermal Management Register 3 is 13 bits long. The present code calculates the SITV value as 60 * 500 cycles. This would overflow to give 10 us on a 500 MHz CPU rather than the intended 60 us. (But according to the Microprocessor Datasheet, there is also a factor of 266 that has to be applied to this value on certain parts i.e. speed sort above 266 MHz.) Always use the maximum cycle count, as recommended by the Datasheet. Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/896f542e5f0f1d6cf8218524c2b67d79f3d69b3c.1599260540.git.fthain@telegraphics.com.au
2020-09-15powerpc/mm/book3s: Split radix and hash MAX_PHYSMEM limitAneesh Kumar K.V
MAX_PHYSMEM #define is used along with sparsemem to determine the SECTION_SHIFT value. Powerpc also uses the same value to limit the max memory enabled on the system. With 4K PAGE_SIZE and hash translation mode, we want to limit the max memory enabled to 64TB due to page table size restrictions. However, with radix translation, we don't have these restrictions. Hence split the radix and hash MA_PHYSMEM limit and use different limit for each of them. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200608070904.387440-4-aneesh.kumar@linux.ibm.com
2020-09-15powerpc/64/mm: implement page mapping percpu first chunk allocatorAneesh Kumar K.V
Implement page mapping percpu first chunk allocator as a fallback to the embedding allocator. With 4K hash translation we limit our page table range to 64TB and commit: 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range") moved all kernel mapping to that 64TB range. In-order to support sparse memory layout we need to increase our linear mapping space and reduce other mappings. With such a layout percpu embedded first chunk allocator will fail because of small vmalloc range. Add a fallback to page mapping percpu first chunk allocator for such failures. The below dmesg output can be observed in such case. percpu: max_distance=0x1ffffef00000 too large for vmalloc space 0x10000000000 PERCPU: auto allocator failed (-22), falling back to page size percpu: 40 4K pages/cpu s148816 r0 d15024 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200608070904.387440-2-aneesh.kumar@linux.ibm.com
2020-09-15powerpc/percpu: Update percpu bootmem allocatorAneesh Kumar K.V
This update the ppc64 version to be closer to x86/sparc. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200608070904.387440-1-aneesh.kumar@linux.ibm.com