linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-10-22	powerpc: Activate CONFIG_STRICT_KERNEL_RWX by default	Christophe Leroy
	CONFIG_STRICT_KERNEL_RWX should be set by default on every architectures (See https://github.com/KSPP/linux/issues/4) On PPC32 we have to find a compromise between performance and/or memory wasting and selection of strict_kernel_rwx, because it implies either smaller memory chunks or larger alignment between RO memory and RW memory. For instance the 8xx maps memory with 8M pages. So either the limit between RO and RW must be 8M aligned or it falls back or 512k pages which implies more pressure on the TLB. book3s/32 maps memory with BATs as much as possible. BATS can have any power-of-two size between 128k and 256M but we have only 4 to 8 BATs so the alignment must be good enough to allow efficient use of the BATs and avoid falling back on standard page mapping which would kill performance. So let's go one step forward and make it the default but still allow users to unset it when wanted. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/057c40164084bfc7d77c0b2ff78d95dbf6a2a21b.1632503622.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/8xx: Simplify TLB handling	Christophe Leroy
	In the old days, TLB handling for 8xx was using tlbie and tlbia instructions directly as much as possible. But commit f048aace29e0 ("powerpc/mm: Add SMP support to no-hash TLB handling") broke that by introducing out-of-line unnecessary complex functions for booke/smp which don't have tlbie/tlbia instructions and require more complex handling. Restore direct use of tlbie and tlbia for 8xx which is never SMP. With this patch we now get c00ecc68 <ptep_clear_flush>: c00ecc68: 39 00 00 00 li r8,0 c00ecc6c: 81 46 00 00 lwz r10,0(r6) c00ecc70: 91 06 00 00 stw r8,0(r6) c00ecc74: 7c 00 2a 64 tlbie r5,r0 c00ecc78: 7c 00 04 ac hwsync c00ecc7c: 91 43 00 00 stw r10,0(r3) c00ecc80: 4e 80 00 20 blr Before it was c0012880 <local_flush_tlb_page>: c0012880: 2c 03 00 00 cmpwi r3,0 c0012884: 41 82 00 54 beq c00128d8 <local_flush_tlb_page+0x58> c0012888: 81 22 00 00 lwz r9,0(r2) c001288c: 81 43 00 20 lwz r10,32(r3) c0012890: 39 29 00 01 addi r9,r9,1 c0012894: 91 22 00 00 stw r9,0(r2) c0012898: 2c 0a 00 00 cmpwi r10,0 c001289c: 41 82 00 10 beq c00128ac <local_flush_tlb_page+0x2c> c00128a0: 81 2a 01 dc lwz r9,476(r10) c00128a4: 2c 09 ff ff cmpwi r9,-1 c00128a8: 41 82 00 0c beq c00128b4 <local_flush_tlb_page+0x34> c00128ac: 7c 00 22 64 tlbie r4,r0 c00128b0: 7c 00 04 ac hwsync c00128b4: 81 22 00 00 lwz r9,0(r2) c00128b8: 39 29 ff ff addi r9,r9,-1 c00128bc: 2c 09 00 00 cmpwi r9,0 c00128c0: 91 22 00 00 stw r9,0(r2) c00128c4: 4c a2 00 20 bclr+ 4,eq c00128c8: 81 22 00 70 lwz r9,112(r2) c00128cc: 71 29 00 04 andi. r9,r9,4 c00128d0: 4d 82 00 20 beqlr c00128d4: 48 65 76 74 b c0669f48 <preempt_schedule> c00128d8: 81 22 00 00 lwz r9,0(r2) c00128dc: 39 29 00 01 addi r9,r9,1 c00128e0: 91 22 00 00 stw r9,0(r2) c00128e4: 4b ff ff c8 b c00128ac <local_flush_tlb_page+0x2c> ... c00ecdc8 <ptep_clear_flush>: c00ecdc8: 94 21 ff f0 stwu r1,-16(r1) c00ecdcc: 39 20 00 00 li r9,0 c00ecdd0: 93 c1 00 08 stw r30,8(r1) c00ecdd4: 83 c6 00 00 lwz r30,0(r6) c00ecdd8: 91 26 00 00 stw r9,0(r6) c00ecddc: 93 e1 00 0c stw r31,12(r1) c00ecde0: 7c 08 02 a6 mflr r0 c00ecde4: 7c 7f 1b 78 mr r31,r3 c00ecde8: 7c 83 23 78 mr r3,r4 c00ecdec: 7c a4 2b 78 mr r4,r5 c00ecdf0: 90 01 00 14 stw r0,20(r1) c00ecdf4: 4b f2 5a 8d bl c0012880 <local_flush_tlb_page> c00ecdf8: 93 df 00 00 stw r30,0(r31) c00ecdfc: 7f e3 fb 78 mr r3,r31 c00ece00: 80 01 00 14 lwz r0,20(r1) c00ece04: 83 c1 00 08 lwz r30,8(r1) c00ece08: 83 e1 00 0c lwz r31,12(r1) c00ece0c: 7c 08 03 a6 mtlr r0 c00ece10: 38 21 00 10 addi r1,r1,16 c00ece14: 4e 80 00 20 blr Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/fb324f1c8f2ddb57cf6aad1cea26329558f1c1c0.1631887021.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/lib/sstep: Don't use __{get/put}_user() on kernel addresses	Christophe Leroy
	In the old days, when we didn't have kernel userspace access protection and had set_fs(), it was wise to use __get_user() and friends to read kernel memory. Nowadays, get_user() and put_user() are granting userspace access and are exclusively for userspace access. Convert single step emulation functions to user_access_begin() and friends and use unsafe_get_user() and unsafe_put_user(). When addressing kernel addresses, there is no need to open userspace access. And for book3s/32 it is particularly important to no try and open userspace access on kernel address, because that would break the content of kernel space segment registers. No guard has been put against that risk in order to avoid degrading performance. copy_from_kernel_nofault() and copy_to_kernel_nofault() should be used but they are out-of-line functions which would degrade performance. Those two functions are making use of __get_kernel_nofault() and __put_kernel_nofault() macros. Those two macros are just wrappers behind __get_user_size_goto() and __put_user_size_goto(). unsafe_get_user() and unsafe_put_user() are also wrappers of __get_user_size_goto() and __put_user_size_goto(). Use them to access kernel space. That allows refactoring userspace and kernelspace access. Reported-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Depends-on: 4fe5cda9f89d ("powerpc/uaccess: Implement user_read_access_begin and user_write_access_begin") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/22831c9d17f948680a12c5292e7627288b15f713.1631817805.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc: warn on emulation of dcbz instruction in kernel mode	Christophe Leroy
	dcbz instruction shouldn't be used on non-cached memory. Using it on non-cached memory can result in alignment exception and implies a heavy handling. Instead of silentely emulating the instruction and resulting in high performance degradation, warn whenever an alignment exception is taken in kernel mode due to dcbz, so that the user is made aware that dcbz instruction has been used unexpectedly by the kernel. Reported-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2e3acfe63d289c6fba366e16973c9ab8369e8b75.1631803922.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/32: Add support for out-of-line static calls	Christophe Leroy
	Add support for out-of-line static calls on PPC32. This change improve performance of calls to global function pointers by using direct calls instead of indirect calls. The trampoline is initialy populated with a 'blr' or branch to target, followed by an unreachable long jump sequence. In order to cater with parallele execution, the trampoline needs to be updated in a way that ensures it remains consistent at all time. This means we can't use the traditional lis/addi to load r12 with the target address, otherwise there would be a window during which the first instruction contains the upper part of the new target address while the second instruction still contains the lower part of the old target address. To avoid that the target address is stored just after the 'bctr' and loaded from there with a single instruction. Then, depending on the target distance, arch_static_call_transform() will either replace the first instruction by a direct 'bl <target>' or 'nop' in order to have the trampoline fall through the long jump sequence. For the special case of __static_call_return0(), to avoid the risk of a far branch, a version of it is inlined at the end of the trampoline. Performancewise the long jump sequence is probably not better than the indirect calls set by GCC when we don't use static calls, but such calls are unlikely to be required on powerpc32: With most configurations the kernel size is far below 32 Mbytes so only modules may happen to be too far. And even modules are likely to be close enough as they are allocated below the kernel core and as close as possible of the kernel text. static_call selftest is running successfully with this change. With this patch, __do_irq() has the following sequence to trace irq entries: c0004a00 <__SCT__tp_func_irq_entry>: c0004a00: 48 00 00 e0 b c0004ae0 <__traceiter_irq_entry> c0004a04: 3d 80 c0 00 lis r12,-16384 c0004a08: 81 8c 4a 1c lwz r12,18972(r12) c0004a0c: 7d 89 03 a6 mtctr r12 c0004a10: 4e 80 04 20 bctr c0004a14: 38 60 00 00 li r3,0 c0004a18: 4e 80 00 20 blr c0004a1c: 00 00 00 00 .long 0x0 ... c0005654 <__do_irq>: ... c0005664: 7c 7f 1b 78 mr r31,r3 ... c00056a0: 81 22 00 00 lwz r9,0(r2) c00056a4: 39 29 00 01 addi r9,r9,1 c00056a8: 91 22 00 00 stw r9,0(r2) c00056ac: 3d 20 c0 af lis r9,-16209 c00056b0: 81 29 74 cc lwz r9,29900(r9) c00056b4: 2c 09 00 00 cmpwi r9,0 c00056b8: 41 82 00 10 beq c00056c8 <__do_irq+0x74> c00056bc: 80 69 00 04 lwz r3,4(r9) c00056c0: 7f e4 fb 78 mr r4,r31 c00056c4: 4b ff f3 3d bl c0004a00 <__SCT__tp_func_irq_entry> Before this patch, __do_irq() was doing the following to trace irq entries: c0005700 <__do_irq>: ... c0005710: 7c 7e 1b 78 mr r30,r3 ... c000574c: 93 e1 00 0c stw r31,12(r1) c0005750: 81 22 00 00 lwz r9,0(r2) c0005754: 39 29 00 01 addi r9,r9,1 c0005758: 91 22 00 00 stw r9,0(r2) c000575c: 3d 20 c0 af lis r9,-16209 c0005760: 83 e9 f4 cc lwz r31,-2868(r9) c0005764: 2c 1f 00 00 cmpwi r31,0 c0005768: 41 82 00 24 beq c000578c <__do_irq+0x8c> c000576c: 81 3f 00 00 lwz r9,0(r31) c0005770: 80 7f 00 04 lwz r3,4(r31) c0005774: 7d 29 03 a6 mtctr r9 c0005778: 7f c4 f3 78 mr r4,r30 c000577c: 4e 80 04 21 bctrl c0005780: 85 3f 00 0c lwzu r9,12(r31) c0005784: 2c 09 00 00 cmpwi r9,0 c0005788: 40 82 ff e4 bne c000576c <__do_irq+0x6c> Behind the fact of now using a direct 'bl' instead of a 'load/mtctr/bctr' sequence, we can also see that we get one less register on the stack. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/6ec2a7865ed6a5ec54ab46d026785bafe1d837ea.1630484892.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/machdep: Remove stale functions from ppc_md structure	Christophe Leroy
	ppc_md.iommu_save() is not set anymore by any platform after commit c40785ad305b ("powerpc/dart: Use a cachable DART"). So iommu_save() has become a nop and can be removed. ppc_md.show_percpuinfo() is not set anymore by any platform after commit 4350147a816b ("[PATCH] ppc64: SMU based macs cpufreq support"). Last users of ppc_md.rtc_read_val() and ppc_md.rtc_write_val() were removed by commit 0f03a43b8f0f ("[POWERPC] Remove todc code from ARCH=powerpc") Last user of kgdb_map_scc() was removed by commit 17ce452f7ea3 ("kgdb, powerpc: arch specific powerpc kgdb support"). ppc.machine_kexec_prepare() has not been used since commit 8ee3e0d69623 ("powerpc: Remove the main legacy iSerie platform code"). This allows the removal of machine_kexec_prepare() and the rename of default_machine_kexec_prepare() into machine_kexec_prepare() Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Daniel Axtens <dja@axtens.net> [mpe: Drop prototype for default_machine_kexec_prepare() as noted by dja] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/24d4ca0ada683c9436a5f812a7aeb0a1362afa2b.1630398606.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/time: Remove generic_suspend_{dis/en}able_irqs()	Christophe Leroy
	Commit d75d68cfef49 ("powerpc: Clean up obsolete code relating to decrementer and timebase") made generic_suspend_enable_irqs() and generic_suspend_disable_irqs() static. Fold them into their only caller. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c3f9ec9950394ef939014f7934268e6ee30ca04f.1630398566.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/audit: Convert powerpc to AUDIT_ARCH_COMPAT_GENERIC	Christophe Leroy
	Commit e65e1fc2d24b ("[PATCH] syscall class hookup for all normal targets") added generic support for AUDIT but that didn't include support for bi-arch like powerpc. Commit 4b58841149dc ("audit: Add generic compat syscall support") added generic support for bi-arch. Convert powerpc to that bi-arch generic audit support. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a4b3951d1191d4183d92a07a6097566bde60d00a.1629812058.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/32: Don't use lmw/stmw for saving/restoring non volatile regs	Christophe Leroy
	Instructions lmw/stmw are interesting for functions that are rarely used and not in the cache, because only one instruction is to be copied into the instruction cache instead of 19. However those instruction are less performant than 19x raw lwz/stw as they require synchronisation plus one additional cycle. SAVE_NVGPRS / REST_NVGPRS are used in only a few places which are mostly in interrupts entries/exits and in task switch so they are likely already in the cache. Using standard lwz improves null_syscall selftest by: - 10 cycles on mpc832x. - 2 cycles on mpc8xx. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/316c543b8906712c108985c8463eec09c8db577b.1629732542.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/5200: dts: fix memory node unit name	Anatolij Gustschin
	Fixes build warnings: Warning (unit_address_vs_reg): /memory: node has a reg or ranges property, but no unit name Signed-off-by: Anatolij Gustschin <agust@denx.de> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211013220532.24759-4-agust@denx.de
2021-10-22	powerpc/5200: dts: fix pci ranges warnings	Anatolij Gustschin
	Fix ranges property warnings: pci@f0000d00:ranges: 'oneOf' conditional failed, one must be fixed: Signed-off-by: Anatolij Gustschin <agust@denx.de> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211013220532.24759-3-agust@denx.de
2021-10-22	powerpc/5200: dts: add missing pci ranges	Anatolij Gustschin
	Add ranges property to fix build warnings: Warning (pci_bridge): /pci@f0000d00: missing ranges for PCI bridge (or not a bridge) Signed-off-by: Anatolij Gustschin <agust@denx.de> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211013220532.24759-2-agust@denx.de
2021-10-22	powerpc/vas: Fix potential NULL pointer dereference	Gustavo A. R. Silva
	(!ptr && !ptr->foo) strikes again. :) The expression (!ptr && !ptr->foo) is bogus and in case ptr is NULL, it leads to a NULL pointer dereference: ptr->foo. Fix this by converting && to \|\| This issue was detected with the help of Coccinelle, and audited and fixed manually. Fixes: 1a0d0d5ed5e3 ("powerpc/vas: Add platform specific user window operations") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211015050345.GA1161918@embeddedor
2021-10-22	powerpc/fsl_booke: Enable STRICT_KERNEL_RWX	Christophe Leroy
	Enable STRICT_KERNEL_RWX on fsl_booke. For that, we need additional TLBCAMs dedicated to linear mapping, based on the alignment of _sinittext. By default, up to 768 Mbytes of memory are mapped. It uses 3 TLBCAMs of size 256 Mbytes. With a data alignment of 16, we need up to 9 TLBCAMs: 16/16/16/16/64/64/64/256/256 With a data alignment of 4, we need up to 12 TLBCAMs: 4/4/4/4/16/16/16/64/64/64/256/256 With a data alignment of 1, we need up to 15 TLBCAMs: 1/1/1/1/4/4/4/16/16/16/64/64/64/256/256 By default, set a 16 Mbytes alignment as a compromise between memory usage and number of TLBCAMs. This can be adjusted manually when needed. For the time being, it doens't work when the base is randomised. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/29f9e5d2bbbc83ae9ca879265426a6278bf4d5bb.1634292136.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/fsl_booke: Update of TLBCAMs after init	Christophe Leroy
	After init, set readonly memory as ROX and set readwrite memory as RWX, if STRICT_KERNEL_RWX is enabled. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/66bef0b9c273e1121706883f3cf5ad0a053d863f.1634292136.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/fsl_booke: Allocate separate TLBCAMs for readonly memory	Christophe Leroy
	Reorganise TLBCAM allocation so that when STRICT_KERNEL_RWX is enabled, TLBCAMs are allocated such that readonly memory uses different TLBCAMs. This results in an allocation looking like: Memory CAM mapping: 4/4/4/1/1/1/1/16/16/16/64/64/64/256/256 Mb, residual: 256Mb Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8ca169bc288261a0e0558712f979023c3a960ebb.1634292136.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/fsl_booke: Tell map_mem_in_cams() if init is done	Christophe Leroy
	In order to be able to call map_mem_in_cams() once more after init for STRICT_KERNEL_RWX, add an argument. For now, map_mem_in_cams() is always called only during init. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3b69a7e0b393b16984ade882a5eae5d727717459.1634292136.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/fsl_booke: Enable reloading of TLBCAM without switching to AS1	Christophe Leroy
	Avoid switching to AS1 when reloading TLBCAM after init for STRICT_KERNEL_RWX. When we setup AS1 we expect the entire accessible memory to be mapped through one entry, this is not the case anymore at the end of init. We are not changing the size of TLBCAMs, only flags, so no need to switch to AS1. So change loadcam_multi() to not switch to AS1 when the given temporary tlb entry in 0. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a9d517fbfbc940f56103c46b323f6eb8f4485571.1634292136.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/fsl_booke: Take exec flag into account when setting TLBCAMs	Christophe Leroy
	Don't force MAS3_SX and MAS3_UX at all time. Take into account the exec flag. While at it, fix a couple of closeby style problems (indent with space and unnecessary parenthesis), it keeps more readability. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5467044e59f27f9fcf709b9661779e3ce5f784f6.1634292136.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/fsl_booke: Rename fsl_booke.c to fsl_book3e.c	Christophe Leroy
	We have a myriad of CONFIG symbols around different variants of BOOKEs, which would be worth tidying up one day. But at least, make file names and CONFIG option match: We have CONFIG_FSL_BOOKE and CONFIG_PPC_FSL_BOOK3E. fsl_booke.c is selected by and only by CONFIG_PPC_FSL_BOOK3E. So rename it fsl_book3e to reduce confusion. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5dc871db1f67739319bec11f049ca450da1c13a2.1634292136.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/booke: Disable STRICT_KERNEL_RWX, DEBUG_PAGEALLOC and KFENCE	Christophe Leroy
	fsl_booke and 44x are not able to map kernel linear memory with pages, so they can't support DEBUG_PAGEALLOC and KFENCE, and STRICT_KERNEL_RWX is also a problem for now. Enable those only on book3s (both 32 and 64 except KFENCE), 8xx and 40x. Fixes: 88df6e90fa97 ("[POWERPC] DEBUG_PAGEALLOC for 32-bit") Fixes: 95902e6c8864 ("powerpc/mm: Implement STRICT_KERNEL_RWX on PPC32") Fixes: 90cbac0e995d ("powerpc: Enable KFENCE for PPC32") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d1ad9fdd9b27da3fdfa16510bb542ed51fa6e134.1634292136.git.christophe.leroy@csgroup.eu
2021-10-22	powerpc/kexec_file: Add of_node_put() before goto	Wan Jiabing
	Fix following coccicheck warning: ./arch/powerpc/kexec/file_load_64.c:698:1-22: WARNING: Function for_each_node_by_type should have of_node_put() before goto Early exits from for_each_node_by_type should decrement the node reference counter. Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Acked-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211018015418.10182-1-wanjiabing@vivo.com
2021-10-22	powerpc/pseries/iommu: Add of_node_put() before break	Wan Jiabing
	Fix following coccicheck warning: ./arch/powerpc/platforms/pseries/iommu.c:924:1-28: WARNING: Function for_each_node_with_property should have of_node_put() before break Early exits from for_each_node_with_property should decrement the node reference counter. Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Reviewed-by: Leonardo Bras <leobras.c@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211014075624.16344-1-wanjiabing@vivo.com
2021-10-22	powerpc/s64: Clarify that radix lacks DEBUG_PAGEALLOC	Joel Stanley
	The page_alloc.c code will call into __kernel_map_pages() when DEBUG_PAGEALLOC is configured and enabled. As the implementation assumes hash, this should crash spectacularly if not for a bit of luck in __kernel_map_pages(). In this function linear_map_hash_count is always zero, the for loop exits without doing any damage. There are no other platforms that determine if they support debug_pagealloc at runtime. Instead of adding code to mm/page_alloc.c to do that, this change turns the map/unmap into a noop when in radix mode and prints a warning once. Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Reformat if per Christophe's suggestion] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211013213438.675095-1-joel@jms.id.au
2021-10-21	Merge tag 'powerpc-5.15-5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix a bug exposed by a previous fix, where running guests with certain SMT topologies could crash the host on Power8. - Fix atomic sleep warnings when re-onlining CPUs, when PREEMPT is enabled. Thanks to Nathan Lynch, Srikar Dronamraju, and Valentin Schneider. * tag 'powerpc-5.15-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/smp: do not decrement idle task preempt count in CPU offline powerpc/idle: Don't corrupt back chain when going idle
2021-10-21	Merge tag 'aspeed-5.16-defconfig' of ↵	Arnd Bergmann
	git://git.kernel.org/pub/scm/linux/kernel/git/joel/bmc into arm/defconfigs ASPEED defconfig updates for 5.16 - Add options that are enabled in the common OpenBMC kernel - Re-enable DRM_FBDEV_EMULATION - Turn on the various sensor drivers that are used in BMC systems, so we can boot test where they are modelled in Qemu * tag 'aspeed-5.16-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/joel/bmc: ARM: configs: aspeed: Remove unused USB gadget devices ARM: config: aspeed: Enable Network Block Device ARM: configs: aspeed: Enable pstore and lockup detectors ARM: configs: aspeed: Enable commonly used drivers ARM: configs: aspeed: Disable IPV6 SIT device ARM: configs: aspeed_g5: Reneable DRM_FBDEV_EMULATION Link: https://lore.kernel.org/r/CACPK8Xd0mVn2Cy7d=VBTDMpU=WHrftsiihwH224ekFSDGKAbyA@mail.gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-10-21	x86/cpu: Fix migration safety with X86_BUG_NULL_SEL	Jane Malalane
	Currently, Linux probes for X86_BUG_NULL_SEL unconditionally which makes it unsafe to migrate in a virtualised environment as the properties across the migration pool might differ. To be specific, the case which goes wrong is: 1. Zen1 (or earlier) and Zen2 (or later) in a migration pool 2. Linux boots on Zen2, probes and finds the absence of X86_BUG_NULL_SEL 3. Linux is then migrated to Zen1 Linux is now running on a X86_BUG_NULL_SEL-impacted CPU while believing that the bug is fixed. The only way to address the problem is to fully trust the "no longer affected" CPUID bit when virtualised, because in the above case it would be clear deliberately to indicate the fact "you might migrate to somewhere which has this behaviour". Zen3 adds the NullSelectorClearsBase CPUID bit to indicate that loading a NULL segment selector zeroes the base and limit fields, as well as just attributes. Zen2 also has this behaviour but doesn't have the NSCB bit. [ bp: Minor touchups. ] Signed-off-by: Jane Malalane <jane.malalane@citrix.com> Signed-off-by: Borislav Petkov <bp@suse.de> CC: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20211021104744.24126-1-jane.malalane@citrix.com
2021-10-21	x86/fpu: Move xstate feature masks to fpu_*_cfg	Thomas Gleixner
	Move the feature mask storage to the kernel and user config structs. Default and maximum feature set are the same for now. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.352041752@linutronix.de
2021-10-21	x86/fpu: Move xstate size to fpu_*_cfg	Thomas Gleixner
	Use the new kernel and user space config storage to store and retrieve the XSTATE buffer sizes. The default and the maximum size are the same for now, but will change when support for dynamically enabled features is added. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.296830097@linutronix.de
2021-10-21	x86/fpu/xstate: Cleanup size calculations	Thomas Gleixner
	The size calculations are partially unreadable gunk. Clean them up. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.241223689@linutronix.de
2021-10-21	x86/fpu: Cleanup fpu__init_system_xstate_size_legacy()	Thomas Gleixner
	Clean the function up before making changes. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.184014242@linutronix.de
2021-10-21	x86/fpu: Provide struct fpu_config	Thomas Gleixner
	Provide a struct to store information about the maximum supported and the default feature set and buffer sizes for both user and kernel space. This allows quick retrieval of this information for the upcoming support for dynamically enabled features. [ bp: Add vertical spacing between the struct members. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.126107370@linutronix.de
2021-10-21	KVM: SEV: Flush cache on non-coherent systems before RECEIVE_UPDATE_DATA	Masahiro Kozuka
	Flush the destination page before invoking RECEIVE_UPDATE_DATA, as the PSP encrypts the data with the guest's key when writing to guest memory. If the target memory was not previously encrypted, the cache may contain dirty, unecrypted data that will persist on non-coherent systems. Fixes: 15fb7de1a7f5 ("KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command") Cc: stable@vger.kernel.org Cc: Peter Gonda <pgonda@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Masahiro Kozuka <masa.koz@kozuka.jp> [sean: converted bug report to changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210914210951.2994260-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-21	arm64: errata: Add detection for TRBE write to out-of-range	Suzuki K Poulose
	Arm Neoverse-N2 and Cortex-A710 cores are affected by an erratum where the trbe, under some circumstances, might write upto 64bytes to an address after the Limit as programmed by the TRBLIMITR_EL1.LIMIT. This might - - Corrupt a page in the ring buffer, which may corrupt trace from a previous session, consumed by userspace. - Hit the guard page at the end of the vmalloc area and raise a fault. To keep the handling simpler, we always leave the last page from the range, which TRBE is allowed to write. This can be achieved by ensuring that we always have more than a PAGE worth space in the range, while calculating the LIMIT for TRBE. And then the LIMIT pointer can be adjusted to leave the PAGE (TRBLIMITR.LIMIT -= PAGE_SIZE), out of the TRBE range while enabling it. This makes sure that the TRBE will only write to an area within its allowed limit (i.e, [head-head+size]) and we do not have to handle address faults within the driver. Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-5-suzuki.poulose@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: errata: Add workaround for TSB flush failures	Suzuki K Poulose
	Arm Neoverse-N2 (#2067961) and Cortex-A710 (#2054223) suffers from errata, where a TSB (trace synchronization barrier) fails to flush the trace data completely, when executed from a trace prohibited region. In Linux we always execute it after we have moved the PE to trace prohibited region. So, we can apply the workaround every time a TSB is executed. The work around is to issue two TSB consecutively. NOTE: This errata is defined as LOCAL_CPU_ERRATUM, implying that a late CPU could be blocked from booting if it is the first CPU that requires the workaround. This is because we do not allow setting a cpu_hwcaps after the SMP boot. The other alternative is to use "this_cpu_has_cap()" instead of the faster system wide check, which may be a bit of an overhead, given we may have to do this in nvhe KVM host before a guest entry. Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-4-suzuki.poulose@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: errata: Add detection for TRBE overwrite in FILL mode	Suzuki K Poulose
	Arm Neoverse-N2 and the Cortex-A710 cores are affected by a CPU erratum where the TRBE will overwrite the trace buffer in FILL mode. The TRBE doesn't stop (as expected in FILL mode) when it reaches the limit and wraps to the base to continue writing upto 3 cache lines. This will overwrite any trace that was written previously. Add the Neoverse-N2 erratum(#2139208) and Cortex-A710 erratum (#2119858) to the detection logic. This will be used by the TRBE driver in later patches to work around the issue. The detection has been kept with the core arm64 errata framework list to make sure : - We don't duplicate the framework in TRBE driver - The errata detection is advertised like the rest of the CPU errata. Note that the Kconfig entries are not fully active until the TRBE driver implements the work around. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> cc: Leo Yan <leo.yan@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-3-suzuki.poulose@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: Add Neoverse-N2, Cortex-A710 CPU part definition	Suzuki K Poulose
	Add the CPU Partnumbers for the new Arm designs. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-2-suzuki.poulose@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	x86/sev: Allow #VC exceptions on the VC2 stack	Joerg Roedel
	When code running on the VC2 stack causes a nested VC exception, the handler will not handle it as expected but goes again into the error path. The result is that the panic() call happening when the VC exception was raised in an invalid context is called recursively. Fix this by checking the interrupted stack too and only call panic if it is not the VC2 stack. [ bp: Fixup comment. ] Fixes: 0786138c78e79 ("x86/sev-es: Add a Runtime #VC Exception Handler") Reported-by: Xinyang Ge <xing@microsoft.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021080833.30875-3-joro@8bytes.org
2021-10-21	x86/sev: Fix stack type check in vc_switch_off_ist()	Joerg Roedel
	The value of STACK_TYPE_EXCEPTION_LAST points to the last _valid_ exception stack. Reflect that in the check done in the vc_switch_off_ist() function. Fixes: a13644f3a53de ("x86/entry/64: Add entry code for #VC handler") Reported-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021080833.30875-2-joro@8bytes.org
2021-10-21	KVM: MMU: Reset mmu->pkru_mask to avoid stale data	Chenyi Qiang
	When updating mmu->pkru_mask, the value can only be added but it isn't reset in advance. This will make mmu->pkru_mask keep the stale data. Fix this issue. Fixes: 2d344105f57c ("KVM, pkeys: introduce pkru_mask to cache conditions") Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20211021071022.1140-1-chenyi.qiang@intel.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-21	um/drivers/ubd_kern: add error handling support for add_disk()	Luis Chamberlain
	We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. ubd_disk_register() never returned an error, so just fix that now and let the caller handle the error condition. Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20211015233028.2167651-8-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-21	m68k/emu/nfblock: add error handling support for add_disk()	Luis Chamberlain
	We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20211015233028.2167651-7-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-21	arm64: defconfig: Enable Qualcomm LMH driver	Bjorn Andersson
	With the introduction of LMH in '36c6581214c4 ("arm64: dts: qcom: sdm845: Add support for LMh node")' the CPUfreq gained a reference for the two interrupts from the LMh. This means that if the LMh driver isn't enabled CPUfreq will not probe and there's no frequency scaling. Enable LMh to make CPUfreq functional again on SDM845. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Tested-By: Steev Klimaszewski <steev@kali.org> Reviewed-and-Tested-by: Bhupesh Sharma <bhupesh.sharma@linaro.org> Link: https://lore.kernel.org/r/20211017204036.2761122-1-bjorn.andersson@linaro.org
2021-10-21	arm64: defconfig: Enable Qualcomm prima/pronto drivers	Bjorn Andersson
	The Qualcomm prima/pronto drivers consists of a remoteproc driver boot the combined WiFi and Bluetooth firmware on the related co-processor, a control driver to load calibration and settings firmware, the wcn36xx WiFi driver and the SMD based Bluetooth driver. Enable these in the defconfig to provide Bluetooth and WiFi on devices such as the Qualcomm Dragonboard 410c. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Vinod Koul <vkoul@kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20211008192840.3593362-1-bjorn.andersson@linaro.org
2021-10-21	x86/CPU: Add support for Vortex CPUs	Marcos Del Sol Vives
	DM&P devices were not being properly identified, which resulted in unneeded Spectre/Meltdown mitigations being applied. The manufacturer states that these devices execute always in-order and don't support either speculative execution or branch prediction, so they are not vulnerable to this class of attack. [1] This is something I've personally tested by a simple timing analysis on my Vortex86MX CPU, and can confirm it is true. Add identification for some devices that lack the CPUID product name call, so they appear properly on /proc/cpuinfo. ¹https://www.ssv-embedded.de/doks/infos/DMP_Ann_180108_Meltdown.pdf [ bp: Massage commit message. ] Signed-off-by: Marcos Del Sol Vives <marcos@orca.pet> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211017094408.1512158-1-marcos@orca.pet
2021-10-21	x86/fpu/signal: Use fpstate for size and features	Thomas Gleixner
	For dynamically enabled features it's required to get the features which are enabled for that context when restoring from sigframe. The same applies for all signal frame size calculations. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/87ilxz5iew.ffs@tglx
2021-10-21	x86/fpu/xstate: Use fpstate for copy_uabi_to_xstate()	Thomas Gleixner
	Prepare for dynamically enabled states per task. The function needs to retrieve the features and sizes which are valid in a fpstate context. Retrieve them from fpstate. Move the function declarations to the core header as they are not required anywhere else. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145323.233529986@linutronix.de
2021-10-21	x86/fpu: Use fpstate in __copy_xstate_to_uabi_buf()	Thomas Gleixner
	With dynamically enabled features the copy function must know the features and the size which is valid for the task. Retrieve them from fpstate. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145323.181495492@linutronix.de
2021-10-21	x86/fpu: Use fpstate in fpu_copy_kvm_uabi_to_fpstate()	Thomas Gleixner
	Straight forward conversion. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145323.129699950@linutronix.de
2021-10-21	x86/fpu/xstate: Use fpstate for xsave_to_user_sigframe()	Thomas Gleixner
	With dynamically enabled features the sigframe code must know the features which are enabled for the task. Get them from fpstate. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145323.077781448@linutronix.de