linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-10-30	parisc: Use FRAME_SIZE and FRAME_ALIGN from assembly.h	Helge Deller
	Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30	parisc: Allocate task struct with stack frame alignment	Helge Deller
	We will put the stack directly behind the task struct, so make sure that we allocate it with an alignment of 64 bytes. Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30	parisc: Define FRAME_ALIGN and PRIV_USER/PRIV_KERNEL in assembly.h	Helge Deller
	Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30	parisc: fix warning in flush_tlb_all	Sven Schnelle
	I've got the following splat after enabling preemption: [ 3.724721] BUG: using __this_cpu_add() in preemptible [00000000] code: swapper/0/1 [ 3.734630] caller is __this_cpu_preempt_check+0x38/0x50 [ 3.740635] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc4-64bit+ #324 [ 3.744605] Hardware name: 9000/785/C8000 [ 3.744605] Backtrace: [ 3.744605] [<00000000401d9d58>] show_stack+0x74/0xb0 [ 3.744605] [<0000000040c27bd4>] dump_stack_lvl+0x10c/0x188 [ 3.744605] [<0000000040c27c84>] dump_stack+0x34/0x48 [ 3.744605] [<0000000040c33438>] check_preemption_disabled+0x178/0x1b0 [ 3.744605] [<0000000040c334f8>] __this_cpu_preempt_check+0x38/0x50 [ 3.744605] [<00000000401d632c>] flush_tlb_all+0x58/0x2e0 [ 3.744605] [<00000000401075c0>] 0x401075c0 [ 3.744605] [<000000004010b8fc>] 0x4010b8fc [ 3.744605] [<00000000401080fc>] 0x401080fc [ 3.744605] [<00000000401d5224>] do_one_initcall+0x128/0x378 [ 3.744605] [<0000000040102de8>] 0x40102de8 [ 3.744605] [<0000000040c33864>] kernel_init+0x60/0x3a8 [ 3.744605] [<00000000401d1020>] ret_from_kernel_thread+0x20/0x28 [ 3.744605] Fix this by moving the __inc_irq_stat() into the locked section. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30	parisc: disable preemption in send_IPI_allbutself()	Sven Schnelle
	Otherwise we might not stop all other CPUs. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30	parisc: fix preempt_count() check in entry.S	Sven Schnelle
	preempt_count in struct thread_info is unsigned int, but the entry.S code used LDREG, which generates a 64 bit load when compiled for 64 bit. Fix this to use an ldw and also change the condition in the compare one line below to only compares 32 bits, although ldw zero extends, and that should work with a 64 bit compare. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30	parisc: deduplicate code in flush_cache_mm() and flush_cache_range()	Sven Schnelle
	Parts of both functions are the same, so deduplicate them. No functional change. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30	parisc: disable preemption during local tlb flush	Sven Schnelle
	flush_cache_mm() and flush_cache_range() fetch %sr3 via mfsp(). If it matches mm->context, they flush caches and the TLB. However, the TLB is cpu-local, so if the code gets preempted shortly after the mfsp(), and later resumed on another CPU, the wrong TLB is flushed. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30	parisc: Add KFENCE support	Helge Deller
	Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30	parisc: Switch to ARCH_STACKWALK implementation	Helge Deller
	It's shorter and kfence currently depends on this stack unwinding implementation. Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30	parisc: make parisc_acctyp() available outside of faults.c	Helge Deller
	When adding kfence support, we need to tell kfence_handle_page_fault() if the interrupted assembler statement is a read or write operation. Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30	parisc/unwind: use copy_from_kernel_nofault()	Sven Schnelle
	I have no idea why get_user() is used there, but we're unwinding the kernel stack, so we should use copy_from_kernel_nofault(). Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30	Merge tag 'riscv-for-linus-5.15-rc8' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: "These are pretty late, but they do fix concrete issues. - ensure the trap vector's address is aligned. - avoid re-populating the KASAN shadow memory. - allow kasan to build without warnings, which have recently become errors" * tag 'riscv-for-linus-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Fix asan-stack clang build riscv: Do not re-populate shadow memory with kasan_populate_early_shadow riscv: fix misalgned trap vector base address
2021-10-30	locking: Remove spin_lock_flags() etc	Arnd Bergmann
	parisc, ia64 and powerpc32 are the only remaining architectures that provide custom arch_{spin,read,write}_lock_flags() functions, which are meant to re-enable interrupts while waiting for a spinlock. However, none of these can actually run into this codepath, because it is only called on architectures without CONFIG_GENERIC_LOCKBREAK, or when CONFIG_DEBUG_LOCK_ALLOC is set without CONFIG_LOCKDEP, and none of those combinations are possible on the three architectures. Going back in the git history, it appears that arch/mn10300 may have been able to run into this code path, but there is a good chance that it never worked. On the architectures that still exist, it was already impossible to hit back in 2008 after the introduction of CONFIG_GENERIC_LOCKBREAK, and possibly earlier. As this is all dead code, just remove it and the helper functions built around it. For arch/ia64, the inline asm could be cleaned up, but it seems safer to leave it untouched. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Helge Deller <deller@gmx.de> # parisc Link: https://lore.kernel.org/r/20211022120058.1031690-1-arnd@kernel.org
2021-10-30	perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings	Stephane Eranian
	This patch fixes the encoding for INST_RETIRED.PREC_DIST as published by Intel (download.01.org/perfmon/) for Icelake. The official encoding is event code 0x00 umask 0x1, a change from Skylake where it was code 0xc0 umask 0x1. With this patch applied it is possible to run: $ perf record -a -e cpu/event=0x00,umask=0x1/pp ..... Whereas before this would fail. To avoid problems with tools which may use the old code, we maintain the old encoding for Icelake. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20211014001214.2680534-1-eranian@google.com
2021-10-30	ARM: 9151/1: Thumb2: avoid __builtin_thread_pointer() on Clang	Ard Biesheuvel
	If available, we use the __builtin_thread_pointer() helper to get the value of the TLS register, to help the compiler understand that it doesn't need to reload it every time we access 'current'. Unfortunately, Clang fails to emit the MRC system register read directly when building for Thumb2, and instead, it issues a call to the __aeabi_read_tp helper, which the kernel does not provide, and so this result in link failures at build time. So create a special case for this, and emit the MRC directly using an asm() block, just like we do when the helper is not available to begin with. Link: https://github.com/ClangBuiltLinux/linux/issues/1485 Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-30	ARM: 9150/1: Fix PID_IN_CONTEXTIDR regression when THREAD_INFO_IN_TASK=y	Ard Biesheuvel
	The code that implements the rarely used PID_IN_CONTEXTIDR feature dereferences the 'task' field of struct thread_info directly, and this is no longer possible when THREAD_INFO_IN_TASK=y, as the 'task' field is omitted from the struct definition in that case. Instead, we should just cast the thread_info pointer to a task_struct pointer, given that the former is now the first member of the latter. So use a helper that abstracts this, and provide implementations for both cases. Reported by: Arnd Bergmann <arnd@arndb.de> Fixes: 18ed1c01a7dd ("ARM: smp: Enable THREAD_INFO_IN_TASK") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-30	Merge tag 'coresight-next-v5.16.v3' of ↵	Greg Kroah-Hartman
	gitolite.kernel.org:pub/scm/linux/kernel/git/coresight/linux into char-misc-next Mathieu writes: Coresight changes for v5.16 - A new option to make coresight cpu-debug capabilities available as early as possible in the kernel boot process. - Make trace sessions more enduring by coping with scenarios where events are scheduled on CPUs that can't reach the selected sink. - A set of improvement to make the TMC-ETR driver more efficient. - Enhancements to the TRBE driver to correct several errata. - An enhancement to make the AXI burts size configurable for TMC devices that can't work with the default value. - A fix in the CTI module to use the correct device when calling pm_runtime_put() - The addition of the Kryo-5xx device to the list of support ETMs. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> * tag 'coresight-next-v5.16.v3' of gitolite.kernel.org:pub/scm/linux/kernel/git/coresight/linux: (39 commits) arm64: errata: Enable TRBE workaround for write to out-of-range address arm64: errata: Enable workaround for TRBE overwrite in FILL mode coresight: trbe: Work around write to out of range coresight: trbe: Make sure we have enough space coresight: trbe: Add a helper to determine the minimum buffer size coresight: trbe: Workaround TRBE errata overwrite in FILL mode coresight: trbe: Add infrastructure for Errata handling coresight: trbe: Allow driver to choose a different alignment coresight: trbe: Decouple buffer base from the hardware base coresight: trbe: Add a helper to pad a given buffer area coresight: trbe: Add a helper to calculate the trace generated coresight: trbe: Defer the probe on offline CPUs coresight: trbe: Fix incorrect access of the sink specific data coresight: etm4x: Add ETM PID for Kryo-5XX coresight: trbe: Prohibit trace before disabling TRBE coresight: trbe: End the AUX handle on truncation coresight: trbe: Do not truncate buffer on IRQ coresight: trbe: Fix handling of spurious interrupts coresight: trbe: irq handler: Do not disable TRBE if no action is needed coresight: trbe: Unify the enabling sequence ...
2021-10-29	Merge tag 'powerpc-5.15-6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Three commits fixing some issues introduced with the recent IOMMU changes we merged. Thanks to Alexey Kardashevskiy" * tag 'powerpc-5.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries/iommu: Create huge DMA window if no MMIO32 is present powerpc/pseries/iommu: Check if the default window in use before removing it powerpc/pseries/iommu: Use correct vfree for it_map
2021-10-29	signal: Replace force_sigsegv(SIGSEGV) with force_fatal_sig(SIGSEGV)	Eric W. Biederman
	Now that force_fatal_sig exists it is unnecessary and a bit confusing to use force_sigsegv in cases where the simpler force_fatal_sig is wanted. So change every instance we can to make the code clearer. Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Link: https://lkml.kernel.org/r/877de7jrev.fsf@disp2133 Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-29	signal/x86: In emulate_vsyscall force a signal instead of calling do_exit	Eric W. Biederman
	Directly calling do_exit with a signal number has the problem that all of the side effects of the signal don't happen, such as killing all of the threads of a process instead of just the calling thread. So replace do_exit(SIGSYS) with force_fatal_sig(SIGSYS) which causes the signal handling to take it's normal path and work as expected. Cc: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20211020174406.17889-17-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-29	signal/sparc32: In setup_rt_frame and setup_fram use force_fatal_sig	Eric W. Biederman
	Modify the 32bit version of setup_rt_frame and setup_frame to act similar to the 64bit version of setup_rt_frame and fail with a signal instead of calling do_exit. Replacing do_exit(SIGILL) with force_fatal_signal(SIGILL) ensures that the process will be terminated cleanly when the stack frame is invalid, instead of just killing off a single thread and leaving the process is a weird state. Cc: David Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Link: https://lkml.kernel.org/r/20211020174406.17889-16-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-29	signal/sparc32: Exit with a fatal signal when try_to_clear_window_buffer fails	Eric W. Biederman
	The function try_to_clear_window_buffer is only called from rtrap_32.c. After it is called the signal pending state is retested, and signals are handled if TIF_SIGPENDING is set. This allows try_to_clear_window_buffer to call force_fatal_signal and then rely on the signal being delivered to kill the process, without any danger of returning to userspace, or otherwise using possible corrupt state on failure. The functional difference between force_fatal_sig and do_exit is that do_exit will only terminate a single thread, and will never trigger a core-dump. A multi-threaded program for which a single thread terminates unexpectedly is hard to reason about. Calling force_fatal_sig does not give userspace a chance to catch the signal, but otherwise is an ordinary fatal signal exit, and it will trigger a coredump of the offending process if core dumps are enabled. Cc: David Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Link: https://lkml.kernel.org/r/20211020174406.17889-15-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-29	signal/s390: Use force_sigsegv in default_trap_handler	Eric W. Biederman
	Reading the history it is unclear why default_trap_handler calls do_exit. It is not even menthioned in the commit where the change happened. My best guess is that because it is unknown why the exception happened it was desired to guarantee the process never returned to userspace. Using do_exit(SIGSEGV) has the problem that it will only terminate one thread of a process, leaving the process in an undefined state. Use force_sigsegv(SIGSEGV) instead which effectively has the same behavior except that is uses the ordinary signal mechanism and terminates all threads of a process and is generally well defined. Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: linux-s390@vger.kernel.org Fixes: ca2ab03237ec ("[PATCH] s390: core changes") History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Link: https://lkml.kernel.org/r/20211020174406.17889-11-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-29	ocxl: Use pci core's DVSEC functionality	Ben Widawsky
	Reduce maintenance burden of DVSEC query implementation by using the centralized PCI core implementation. There are two obvious places to simply drop in the new core implementation. There remains find_dvsec_from_pos() which would benefit from using a core implementation. As that change is less trivial it is reserved for later. Cc: linuxppc-dev@lists.ozlabs.org Cc: Andrew Donnellan <ajd@linux.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> (v1) Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Link: https://lore.kernel.org/r/163379789065.692348.7117946955275586530.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-10-29	Merge branch 'linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "Fix a build-time warning in x86/sm4" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: x86/sm4 - Fix invalid section entry size
2021-10-29	riscv: Fix asan-stack clang build	Alexandre Ghiti
	Nathan reported that because KASAN_SHADOW_OFFSET was not defined in Kconfig, it prevents asan-stack from getting disabled with clang even when CONFIG_KASAN_STACK is disabled: fix this by defining the corresponding config. Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Fixes: 8ad8b72721d0 ("riscv: Add KASAN support") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-29	riscv: Do not re-populate shadow memory with kasan_populate_early_shadow	Alexandre Ghiti
	When calling this function, all the shadow memory is already populated with kasan_early_shadow_pte which has PAGE_KERNEL protection. kasan_populate_early_shadow write-protects the mapping of the range of addresses passed in argument in zero_pte_populate, which actually write-protects all the shadow memory mapping since kasan_early_shadow_pte is used for all the shadow memory at this point. And then when using memblock API to populate the shadow memory, the first write access to the kernel stack triggers a trap. This becomes visible with the next commit that contains a fix for asan-stack. We already manually populate all the shadow memory in kasan_early_init and we write-protect kasan_early_shadow_pte at the end of kasan_init which makes the calls to kasan_populate_early_shadow superfluous so we can remove them. Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Fixes: e178d670f251 ("riscv/kasan: add KASAN_VMALLOC support") Fixes: 8ad8b72721d0 ("riscv: Add KASAN support") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-29	net: xtensa: use eth_hw_addr_set()	Jakub Kicinski
	Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29	net: um: use eth_hw_addr_set()	Jakub Kicinski
	Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29	powerpc/32e: Ignore ESR in instruction storage interrupt handler	Nicholas Piggin
	A e5500 machine running a 32-bit kernel sometimes hangs at boot, seemingly going into an infinite loop of instruction storage interrupts. The ESR (Exception Syndrome Register) has a value of 0x800000 (store) when this happens, which is likely set by a previous store. An instruction TLB miss interrupt would then leave ESR unchanged, and if no PTE exists it calls directly to the instruction storage interrupt handler without changing ESR. access_error() does not cause a segfault due to a store to a read-only vma because is_exec is true. Most subsequent fault handling does not check for a write fault on a read-only vma, and might do strange things like create a writeable PTE or call page_mkwrite on a read only vma or file. It's not clear what happens here to cause the infinite faulting in this case, a fault handler failure or low level PTE or TLB handling. In any case this can be fixed by having the instruction storage interrupt zero regs->dsisr rather than storing the ESR value to it. Fixes: a01a3f2ddbcd ("powerpc: remove arguments from fault handler functions") Cc: stable@vger.kernel.org # v5.12+ Reported-by: Jacques de Laval <jacques.delaval@protonmail.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Jacques de Laval <jacques.delaval@protonmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211028133043.4159501-1-npiggin@gmail.com
2021-10-29	Merge branch 'for-next/fixes' into for-next/core	Will Deacon
	Merge for-next/fixes to resolve conflicts in arm64_hugetlb_cma_reserve(). * for-next/fixes: acpi/arm64: fix next_platform_timer() section mismatch error arm64/hugetlb: fix CMA gigantic page order for non-4K PAGE_SIZE
2021-10-29	Merge branch 'for-next/vdso' into for-next/core	Will Deacon
	* for-next/vdso: arm64: vdso32: require CROSS_COMPILE_COMPAT for gcc+bfd arm64: vdso32: suppress error message for 'make mrproper' arm64: vdso32: drop test for -march=armv8-a arm64: vdso32: drop the test for dmb ishld
2021-10-29	Merge branch 'for-next/trbe-errata' into for-next/core	Will Deacon
	* for-next/trbe-errata: arm64: errata: Add detection for TRBE write to out-of-range arm64: errata: Add workaround for TSB flush failures arm64: errata: Add detection for TRBE overwrite in FILL mode arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
2021-10-29	Merge branch 'for-next/sve' into for-next/core	Will Deacon
	* for-next/sve: arm64/sve: Fix warnings when SVE is disabled arm64/sve: Add stub for sve_max_virtualisable_vl() arm64/sve: Track vector lengths for tasks in an array arm64/sve: Explicitly load vector length when restoring SVE state arm64/sve: Put system wide vector length information into structs arm64/sve: Use accessor functions for vector lengths in thread_struct arm64/sve: Rename find_supported_vector_length() arm64/sve: Make access to FFR optional arm64/sve: Make sve_state_size() static arm64/sve: Remove sve_load_from_fpsimd_state() arm64/fp: Reindent fpsimd_save()
2021-10-29	Merge branch 'for-next/pfn-valid' into for-next/core	Will Deacon
	* for-next/pfn-valid: arm64/mm: drop HAVE_ARCH_PFN_VALID dma-mapping: remove bogus test for pfn_valid from dma_map_resource
2021-10-29	Merge branch 'for-next/mte' into for-next/core	Will Deacon
	* for-next/mte: kasan: Extend KASAN mode kernel parameter arm64: mte: Add asymmetric mode support arm64: mte: CPU feature detection for Asymm MTE arm64: mte: Bitfield definitions for Asymm MTE kasan: Remove duplicate of kasan_flag_async arm64: kasan: mte: move GCR_EL1 switch to task switch when KASAN disabled
2021-10-29	Merge branch 'for-next/mm' into for-next/core	Will Deacon
	* for-next/mm: arm64: mm: update max_pfn after memory hotplug arm64/mm: Add pud_sect_supported() arm64: mm: Drop pointless call to set_max_mapnr()
2021-10-29	Merge branch 'for-next/misc' into for-next/core	Will Deacon
	* for-next/misc: arm64: Select POSIX_CPU_TIMERS_TASK_WORK arm64: Document boot requirements for FEAT_SME_FA64 arm64: ftrace: use function_nocfi for _mcount as well arm64: asm: setup.h: export common variables arm64/traps: Avoid unnecessary kernel/user pointer conversion
2021-10-29	Merge branch 'for-next/kexec' into for-next/core	Will Deacon
	* for-next/kexec: arm64: trans_pgd: remove trans_pgd_map_page() arm64: kexec: remove cpu-reset.h arm64: kexec: remove the pre-kexec PoC maintenance arm64: kexec: keep MMU enabled during kexec relocation arm64: kexec: install a copy of the linear-map arm64: kexec: use ld script for relocation function arm64: kexec: relocate in EL1 mode arm64: kexec: configure EL2 vectors for kexec arm64: kexec: pass kimage as the only argument to relocation function arm64: kexec: Use dcache ops macros instead of open-coding arm64: kexec: skip relocation code for inplace kexec arm64: kexec: flush image and lists during kexec load time arm64: hibernate: abstract ttrb0 setup function arm64: trans_pgd: hibernate: Add trans_pgd_copy_el2_vectors arm64: kernel: add helper for booted at EL2 and not VHE
2021-10-29	Merge branch 'for-next/extable' into for-next/core	Will Deacon
	* for-next/extable: arm64: vmlinux.lds.S: remove `.fixup` section arm64: extable: add load_unaligned_zeropad() handler arm64: extable: add a dedicated uaccess handler arm64: extable: add `type` and `data` fields arm64: extable: use `ex` for `exception_table_entry` arm64: extable: make fixup_exception() return bool arm64: extable: consolidate definitions arm64: gpr-num: support W registers arm64: factor out GPR numbering helpers arm64: kvm: use kvm_exception_table_entry arm64: lib: __arch_copy_to_user(): fold fixups into body arm64: lib: __arch_copy_from_user(): fold fixups into body arm64: lib: __arch_clear_user(): fold fixups into body
2021-10-29	Merge tag 'irqchip-5.16' into irq/core	Borislav Petkov
	Merge irqchip updates for Linux 5.16 from Marc Zyngier: - A large cross-arch rework to move irq_enter()/irq_exit() into the arch code, and removing it from the generic irq code. Thanks to Mark Rutland for the huge effort! - A few irqchip drivers are made modular (broadcom, meson), because that's apparently a thing... - A new driver for the Microchip External Interrupt Controller - The irq_cpu_offline()/irq_cpu_online() API is now deprecated and can only be selected on the Cavium Octeon platform. Once this platform is removed, the API will be removed at the same time. - A sprinkle of devm_* helper, as people seem to love that. - The usual spattering of small fixes and minor improvements. * tag 'irqchip-5.16': (912 commits) h8300: Fix linux/irqchip.h include mess dt-bindings: irqchip: renesas-irqc: Document r8a774e1 bindings MIPS: irq: Avoid an unused-variable error genirq: Hide irq_cpu_{on,off}line() behind a deprecated option irqchip/mips-gic: Get rid of the reliance on irq_cpu_online() MIPS: loongson64: Drop call to irq_cpu_offline() irq: remove handle_domain_{irq,nmi}() irq: remove CONFIG_HANDLE_DOMAIN_IRQ_IRQENTRY irq: riscv: perform irqentry in entry code irq: openrisc: perform irqentry in entry code irq: csky: perform irqentry in entry code irq: arm64: perform irqentry in entry code irq: arm: perform irqentry in entry code irq: add a (temporary) CONFIG_HANDLE_DOMAIN_IRQ_IRQENTRY irq: nds32: avoid CONFIG_HANDLE_DOMAIN_IRQ irq: arc: avoid CONFIG_HANDLE_DOMAIN_IRQ irq: add generic_handle_arch_irq() irq: unexport handle_irq_desc() irq: simplify handle_domain_{irq,nmi}() irq: mips: simplify do_domain_IRQ() ... Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211029083332.3680101-1-maz@kernel.org
2021-10-29	x86/apic: Reduce cache line misses in __x2apic_send_IPI_mask()	Eric Dumazet
	Using per-cpu storage for @x86_cpu_to_logical_apicid is not optimal. Broadcast IPI will need at least one cache line per cpu to access this field. __x2apic_send_IPI_mask() is using standard bitmask operators. By converting x86_cpu_to_logical_apicid to an array, we divide by 16x number of needed cache lines, because we find 16 values per cache line. CPU prefetcher can kick nicely. Also move @cluster_masks to READ_MOSTLY section to avoid false sharing. Tested on a dual socket host with 256 cpus, cost for a full broadcast is now 11 usec instead of 33 usec. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20211007143556.574911-1-eric.dumazet@gmail.com
2021-10-29	powerpc/powernv/prd: Unregister OPAL_MSG_PRD2 notifier during module unload	Vasant Hegde
	Commit 587164cd, introduced new opal message type (OPAL_MSG_PRD2) and added opal notifier. But I missed to unregister the notifier during module unload path. This results in below call trace if you try to unload and load opal_prd module. Also add new notifier_block for OPAL_MSG_PRD2 message. Sample calltrace (modprobe -r opal_prd; modprobe opal_prd) BUG: Unable to handle kernel data access on read at 0xc0080000192200e0 Faulting instruction address: 0xc00000000018d1cc Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV CPU: 66 PID: 7446 Comm: modprobe Kdump: loaded Tainted: G E 5.14.0prd #759 NIP: c00000000018d1cc LR: c00000000018d2a8 CTR: c0000000000cde10 REGS: c0000003c4c0f0a0 TRAP: 0300 Tainted: G E (5.14.0prd) MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 24224824 XER: 20040000 CFAR: c00000000018d2a4 DAR: c0080000192200e0 DSISR: 40000000 IRQMASK: 1 ... NIP notifier_chain_register+0x2c/0xc0 LR atomic_notifier_chain_register+0x48/0x80 Call Trace: 0xc000000002090610 (unreliable) atomic_notifier_chain_register+0x58/0x80 opal_message_notifier_register+0x7c/0x1e0 opal_prd_probe+0x84/0x150 [opal_prd] platform_probe+0x78/0x130 really_probe+0x110/0x5d0 __driver_probe_device+0x17c/0x230 driver_probe_device+0x60/0x130 __driver_attach+0xfc/0x220 bus_for_each_dev+0xa8/0x130 driver_attach+0x34/0x50 bus_add_driver+0x1b0/0x300 driver_register+0x98/0x1a0 __platform_driver_register+0x38/0x50 opal_prd_driver_init+0x34/0x50 [opal_prd] do_one_initcall+0x60/0x2d0 do_init_module+0x7c/0x320 load_module+0x3394/0x3650 __do_sys_finit_module+0xd4/0x160 system_call_exception+0x140/0x290 system_call_common+0xf4/0x258 Fixes: 587164cd593c ("powerpc/powernv: Add new opal message type") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211028165716.41300-1-hegdevasant@linux.vnet.ibm.com
2021-10-29	powerpc: Don't provide __kernel_map_pages() without ↵	Christophe Leroy
	ARCH_SUPPORTS_DEBUG_PAGEALLOC When ARCH_SUPPORTS_DEBUG_PAGEALLOC is not selected, the user can still select CONFIG_DEBUG_PAGEALLOC in which case __kernel_map_pages() is provided by mm/page_poison.c So only define __kernel_map_pages() when both CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC and CONFIG_DEBUG_PAGEALLOC are defined. Fixes: 68b44f94d637 ("powerpc/booke: Disable STRICT_KERNEL_RWX, DEBUG_PAGEALLOC and KFENCE") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/971b69739ff4746252e711a9845210465c023a9e.1635425947.git.christophe.leroy@csgroup.eu
2021-10-28	bpf,x86: Respect X86_FEATURE_RETPOLINE*	Peter Zijlstra
	Current BPF codegen doesn't respect X86_FEATURE_RETPOLINE* flags and unconditionally emits a thunk call, this is sub-optimal and doesn't match the regular, compiler generated, code. Update the i386 JIT to emit code equal to what the compiler emits for the regular kernel text (IOW. a plain THUNK call). Update the x86_64 JIT to emit code similar to the result of compiler and kernel rewrites as according to X86_FEATURE_RETPOLINE* flags. Inlining RETPOLINE_AMD (lfence; jmp %reg) and !RETPOLINE (jmp %reg), while doing a THUNK call for RETPOLINE. This removes the hard-coded retpoline thunks and shrinks the generated code. Leaving a single retpoline thunk definition in the kernel. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.614772675@infradead.org
2021-10-28	bpf,x86: Simplify computing label offsets	Peter Zijlstra
	Take an idea from the 32bit JIT, which uses the multi-pass nature of the JIT to compute the instruction offsets on a prior pass in order to compute the relative jump offsets on a later pass. Application to the x86_64 JIT is slightly more involved because the offsets depend on program variables (such as callee_regs_used and stack_depth) and hence the computed offsets need to be kept in the context of the JIT. This removes, IMO quite fragile, code that hard-codes the offsets and tries to compute the length of variable parts of it. Convert both emit_bpf_tail_call_*() functions which have an out: label at the end. Additionally emit_bpt_tail_call_direct() also has a poke table entry, for which it computes the offset from the end (and thus already relies on the previous pass to have computed addrs[i]), also convert this to be a forward based offset. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.552304864@infradead.org
2021-10-28	x86,bugs: Unconditionally allow spectre_v2=retpoline,amd	Peter Zijlstra
	Currently Linux prevents usage of retpoline,amd on !AMD hardware, this is unfriendly and gets in the way of testing. Remove this restriction. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.487348118@infradead.org
2021-10-28	x86/alternative: Add debug prints to apply_retpolines()	Peter Zijlstra
	Make sure we can see the text changes when booting with 'debug-alternative'. Example output: [ ] SMP alternatives: retpoline at: __traceiter_initcall_level+0x1f/0x30 (ffffffff8100066f) len: 5 to: __x86_indirect_thunk_rax+0x0/0x20 [ ] SMP alternatives: ffffffff82603e58: [2:5) optimized NOPs: ff d0 0f 1f 00 [ ] SMP alternatives: ffffffff8100066f: orig: e8 cc 30 00 01 [ ] SMP alternatives: ffffffff8100066f: repl: ff d0 0f 1f 00 Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.422273830@infradead.org
2021-10-28	x86/alternative: Try inline spectre_v2=retpoline,amd	Peter Zijlstra
	Try and replace retpoline thunk calls with: LFENCE CALL %\reg for spectre_v2=retpoline,amd. Specifically, the sequence above is 5 bytes for the low 8 registers, but 6 bytes for the high 8 registers. This means that unless the compilers prefix stuff the call with higher registers this replacement will fail. Luckily GCC strongly favours RAX for the indirect calls and most (95%+ for defconfig-x86_64) will be converted. OTOH clang strongly favours R11 and almost nothing gets converted. Note: it will also generate a correct replacement for the Jcc.d32 case, except unless the compilers start to prefix stuff that, it'll never fit. Specifically: Jncc.d8 1f LFENCE JMP %\reg 1: is 7-8 bytes long, where the original instruction in unpadded form is only 6 bytes. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.359986601@infradead.org