linux.git - Linus' kernel tree

Age	Commit message	Author
2023-02-01	arm64: dts: mediatek: mt8186: Fix watchdog compatible	AngeloGioacchino Del Regno
	MT8186's watchdog embeds a reset controller and needs only the mediatek,mt8186-wdt compatible string as the MT6589 one is there for watchdogs that don't have any reset controller capability. Fixes: 2e78620b1350 ("arm64: dts: Add MediaTek MT8186 dts and evaluation board and Makefile") Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Co-developed-by: Allen-KH Cheng <allen-kh.cheng@mediatek.com> Signed-off-by: Allen-KH Cheng <allen-kh.cheng@mediatek.com> Reviewed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Link: https://lore.kernel.org/r/20221108033209.22751-2-allen-kh.cheng@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2023-02-01	arm64: dts: mt8173-elm: Switch to SMC watchdog	Pin-yen Lin
	Switch to SMC watchdog because we need direct control of HW watchdog registers from kernel. The corresponding firmware was uploaded in https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/3405. Signed-off-by: Pin-yen Lin <treapking@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20220729053254.220585-1-treapking@chromium.org Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2023-02-01	arm64: dts: mediatek: mt7622: Add missing pwm-cells to pwm node	AngeloGioacchino Del Regno
	Specify #pwm-cells on pwm@11006000 to make it actually usable. Fixes: ae457b7679c4 ("arm64: dts: mt7622: add SoC and peripheral related device nodes") Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20221128112028.58021-2-angelogioacchino.delregno@collabora.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2023-02-01	arm64: dts: marvell: Fix compatible strings for Armada 3720 boards	Pali Rohár
	All Armada 3720 boards have an Armada 3720 processor, which belongs to the Armada 3700 family, and none has an Armada 3710 processor. So none of them should have the compatible string for the Armada 3710 processor. Fix the compatible strings for all these boards by removing the wrong processor string "marvell,armada3710" and adding the family string "marvell,armada3700" as the last one. (Note that the Armada 3710 DTS files are defined the same way.) [gclement: fix conflict for v6.2] Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2023-01-31	riscv: remove riscv_isa_ext_keys[] array and related usage	Jisheng Zhang
	All users have switched to riscv_has_extension_*, remove unused definitions, vars and related setting code. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20230128172856.3814-14-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	riscv: KVM: Switch has_svinval() to riscv_has_extension_unlikely()	Andrew Jones
	Switch has_svinval() from static branch to the new helper riscv_has_extension_unlikely(). Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Guo Ren <guoren@kernel.org> Acked-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20230128172856.3814-13-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	riscv: cpu_relax: switch to riscv_has_extension_likely()	Jisheng Zhang
	Switch cpu_relax() from static branch to the new helper riscv_has_extension_likely(). Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230128172856.3814-12-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	riscv: alternative: patch alternatives in the vDSO	Jisheng Zhang
	Make it possible to use alternatives in the vDSO, so that better implementations can be used if possible. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230128172856.3814-11-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	riscv: switch to relative alternative entries	Jisheng Zhang
	Instead of using absolute addresses for both the old instructions and the alternative instructions, use offsets relative to the alt_entry values. This not only cuts the size of the alternative entry, but also meets the prerequisite for patching alternatives in the vDSO, since absolute alternative entries are subject to dynamic relocation, which is incompatible with the vDSO building. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230128172856.3814-10-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
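The idea of relative alternative entries can be illustrated with a small user-space sketch. It assumes a hypothetical entry layout (`struct rel_alt_entry`) and hypothetical helpers (`ptr_to_rel()`, `rel_to_ptr()`); none of these names come from the kernel sources, and the real entry structure carries more fields. Each offset is stored relative to the address of the field that holds it, so the entry needs no relocation when the image is loaded at a different address.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical entry layout (not the kernel's): each field holds a signed
 * 32-bit offset relative to the field's own address. On a 64-bit target
 * this takes 8 bytes instead of the 16 needed for two absolute pointers.
 */
struct rel_alt_entry {
	int32_t old_offset;	/* where the original instructions live */
	int32_t alt_offset;	/* where the replacement instructions live */
};

/* Encode: distance from the field to the target. */
static inline int32_t ptr_to_rel(const int32_t *field, const void *target)
{
	return (int32_t)((intptr_t)target - (intptr_t)field);
}

/* Decode: field address plus stored offset gives the target back. */
static inline void *rel_to_ptr(const int32_t *field)
{
	return (void *)((intptr_t)field + (intptr_t)*field);
}

/* Stand-ins for instruction bytes; static objects sit close together. */
static unsigned char demo_old_insn[4];
static unsigned char demo_alt_insn[4];
static struct rel_alt_entry demo_entry;

/* Returns 1 if both offsets round-trip to the original addresses. */
int rel_alt_demo(void)
{
	demo_entry.old_offset = ptr_to_rel(&demo_entry.old_offset, demo_old_insn);
	demo_entry.alt_offset = ptr_to_rel(&demo_entry.alt_offset, demo_alt_insn);

	return rel_to_ptr(&demo_entry.old_offset) == (void *)demo_old_insn &&
	       rel_to_ptr(&demo_entry.alt_offset) == (void *)demo_alt_insn;
}
```

Because only distances are stored, the encoded values stay valid wherever the image is mapped, which is the property the commit message relies on for the vDSO.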
2023-01-31	riscv: module: Add ADD16 and SUB16 rela types	Andrew Jones
	To prepare for 16-bit relocation types to be emitted in alternatives add support for ADD16 and SUB16. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230128172856.3814-9-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	riscv: module: move find_section to module.h	Jisheng Zhang
	Move find_section() to module.h so that the implementation can be shared by the alternatives code. This will allow us to use alternatives in the vdso. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20230128172856.3814-8-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	riscv: fpu: switch has_fpu() to riscv_has_extension_likely()	Jisheng Zhang
	Switch has_fpu() from static branch to the new helper riscv_has_extension_likely(). Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230128172856.3814-7-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	riscv: introduce riscv_has_extension_[un]likely()	Jisheng Zhang
	Generally, riscv ISA extensions are fixed for any specific hardware platform, so a hart's features won't change after booting. This characteristic makes it straightforward to use a static branch to check if a specific ISA extension is supported or not to optimize performance. However, some ISA extensions such as SVPBMT and ZICBOM are handled via the alternative sequences. Basically, for ease of maintenance, we prefer to use static branches in C code, but recently, Samuel found that the static branch usage in cpu_relax() breaks building with CONFIG_CC_OPTIMIZE_FOR_SIZE[1]. As Samuel pointed out, "Having a static branch in cpu_relax() is problematic because that function is widely inlined, including in some quite complex functions like in the VDSO. A quick measurement shows this static branch is responsible by itself for around 40% of the jump table." Samuel's findings pointed out one of a few downsides of static branch usage in C code to handle ISA extensions detected at boot time: static branch's metadata in the __jump_table section, which is not discarded after ISA extensions are finalized, wastes some space. I want to try to solve the issue for all possible dynamic handling of ISA extensions at boot time. Inspired by Mark[2], this patch introduces riscv_has_extension_*() helpers, which work like static branches but are patched using alternatives, thus the metadata can be freed after patching. Link: https://lore.kernel.org/linux-riscv/20220922060958.44203-1-samuel@sholland.org/ [1] Link: https://lore.kernel.org/linux-arm-kernel/20220912162210.3626215-8-mark.rutland@arm.com/ [2] Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230128172856.3814-6-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions	Jisheng Zhang
	riscv_cpufeature_patch_func() currently only scans a limited set of cpufeatures, explicitly defined with macros. Extend it to probe for all ISA extensions. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230128172856.3814-5-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	riscv: hwcap: make ISA extension ids can be used in asm	Jisheng Zhang
	So that ISA extensions can be used in assembly files, convert the multi-letter RISC-V ISA extension IDs enums to macros. In order to make them visible, move the #ifndef __ASSEMBLY__ guard to a later point in the header. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230128172856.3814-4-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	riscv: cpufeature: detect RISCV_ALTERNATIVES_EARLY_BOOT earlier	Jisheng Zhang
	Currently riscv_cpufeature_patch_func() does nothing at the RISCV_ALTERNATIVES_EARLY_BOOT stage. Add a check to detect whether we are in this stage and exit early. This will allow us to use riscv_cpufeature_patch_func() for scanning of all ISA extensions. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230128172856.3814-3-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	riscv: move riscv_noncoherent_supported() out of ZICBOM probe	Jisheng Zhang
	It's a bit weird to call riscv_noncoherent_supported() each time when insmoding a module. Move the calling out of feature patch func. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230128172856.3814-2-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	Merge patch "riscv: Fix build with CONFIG_CC_OPTIMIZE_FOR_SIZE=y"	Palmer Dabbelt
	This is a single fix, but it conflicts with some recent features. I'm merging it on top of the commit it fixes to ease backporting. * b4-shazam-merge: riscv: Fix build with CONFIG_CC_OPTIMIZE_FOR_SIZE=y Link: https://lore.kernel.org/r/20220922060958.44203-1-samuel@sholland.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	riscv: Fix build with CONFIG_CC_OPTIMIZE_FOR_SIZE=y	Samuel Holland
	commit 8eb060e10185 ("arch/riscv: add Zihintpause support") broke building with CONFIG_CC_OPTIMIZE_FOR_SIZE enabled (gcc 11.1.0): CC arch/riscv/kernel/vdso/vgettimeofday.o In file included from <command-line>: ./arch/riscv/include/asm/jump_label.h: In function 'cpu_relax': ././include/linux/compiler_types.h:285:33: warning: 'asm' operand 0 probably does not match constraints 285 \| #define asm_volatile_goto(x...) asm goto(x) \| ^~~ ./arch/riscv/include/asm/jump_label.h:41:9: note: in expansion of macro 'asm_volatile_goto' 41 \| asm_volatile_goto( \| ^~~~~~~~~~~~~~~~~ ././include/linux/compiler_types.h:285:33: error: impossible constraint in 'asm' 285 \| #define asm_volatile_goto(x...) asm goto(x) \| ^~~ ./arch/riscv/include/asm/jump_label.h:41:9: note: in expansion of macro 'asm_volatile_goto' 41 \| asm_volatile_goto( \| ^~~~~~~~~~~~~~~~~ make[1]: * [scripts/Makefile.build:249: arch/riscv/kernel/vdso/vgettimeofday.o] Error 1 make: * [arch/riscv/Makefile:128: vdso_prepare] Error 2 Having a static branch in cpu_relax() is problematic because that function is widely inlined, including in some quite complex functions like in the VDSO. A quick measurement shows this static branch is responsible by itself for around 40% of the jump table. Drop the static branch, which ends up being the same number of instructions anyway. If Zihintpause is supported, we trade the nop from the static branch for a div. If Zihintpause is unsupported, we trade the jump from the static branch for (what gets interpreted as) a nop. Fixes: 8eb060e10185 ("arch/riscv: add Zihintpause support") Signed-off-by: Samuel Holland <samuel@sholland.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-02-01	ARM: dts: aspeed: p10bmc: Enable UART2	Eddie James
	The APSS can be accessed over the second uart on these systems. Signed-off-by: Eddie James <eajames@linux.ibm.com> Link: https://lore.kernel.org/r/20230126220842.885965-1-eajames@linux.ibm.com Signed-off-by: Joel Stanley <joel@jms.id.au>
2023-02-01	powerpc/kexec_file: Count hot-pluggable memory in FDT estimate	Sourabh Jain
	On systems where online memory is less than max memory, the kexec_file_load system call may fail to load the kdump kernel with the below errors: "Failed to update fdt with linux,drconf-usable-memory property" "Error setting up usable-memory property for kdump kernel" This happens because the size estimation for usable memory properties for the kdump kernel's FDT is based on the online memory whereas the usable memory properties include max memory. In short, the hot-pluggable memory is not accounted for while estimating the size of the usable memory properties. The issue is addressed by calculating usable memory property size using max hotplug address instead of the last online memory address. Fixes: 2377c92e37fe ("powerpc/kexec_file: fix FDT size estimation for kdump kernel") Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230131030615.729894-1-sourabhjain@linux.ibm.com
2023-01-31	KVM: x86: Use emulator callbacks instead of duplicating "host flags"	Maxim Levitsky
	Instead of re-defining the "host flags" bits, just expose dedicated helpers for each of the two remaining flags that are consumed by the emulator. The emulator never consumes both "is guest" and "is SMM" in close proximity, so there is no motivation to avoid additional indirect branches. Also while at it, garbage collect the recently removed host flags. No functional change is intended. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com> Link: https://lore.kernel.org/r/20221129193717.513824-6-mlevitsk@redhat.com [sean: fix CONFIG_KVM_SMM=n builds, tweak names of wrappers] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-31	Sync mm-stable with mm-hotfixes-stable to pick up dependent patches	Andrew Morton
	Merge branch 'mm-hotfixes-stable' into mm-stable
2023-01-31	sh: define RUNTIME_DISCARD_EXIT	Tom Saeger
	sh vmlinux fails to link with GNU ld < 2.40 (likely < 2.36) since commit 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv"). This is similar to fixes for powerpc and s390: commit 4b9880dbf3bd ("powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT"). commit a494398bde27 ("s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36"). $ sh4-linux-gnu-ld --version \| head -n1 GNU ld (GNU Binutils for Debian) 2.35.2 $ make ARCH=sh CROSS_COMPILE=sh4-linux-gnu- microdev_defconfig $ make ARCH=sh CROSS_COMPILE=sh4-linux-gnu- `.exit.text' referenced in section `__bug_table' of crypto/algboss.o: defined in discarded section `.exit.text' of crypto/algboss.o `.exit.text' referenced in section `__bug_table' of drivers/char/hw_random/core.o: defined in discarded section `.exit.text' of drivers/char/hw_random/core.o make[2]: * [scripts/Makefile.vmlinux:34: vmlinux] Error 1 make[1]: * [Makefile:1252: vmlinux] Error 2 arch/sh/kernel/vmlinux.lds.S keeps EXIT_TEXT: /* * .exit.text is discarded at runtime, not link time, to deal with * references from __bug_table */ .exit.text : AT(ADDR(.exit.text)) { EXIT_TEXT } However, EXIT_TEXT is thrown away by DISCARD(include/asm-generic/vmlinux.lds.h) because sh does not define RUNTIME_DISCARD_EXIT. GNU ld 2.40 does not have this issue and builds fine. This corresponds with Masahiro's comments in a494398bde27: "Nathan [Chancellor] also found that binutils commit 21401fc7bf67 ("Duplicate output sections in scripts") cured this issue, so we cannot reproduce it with binutils 2.36+, but it is better to not rely on it." 
Link: https://lkml.kernel.org/r/9166a8abdc0f979e50377e61780a4bba1dfa2f52.1674518464.git.tom.saeger@oracle.com Fixes: 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") Link: https://lore.kernel.org/all/Y7Jal56f6UBh1abE@dev-arch.thelio-3990X/ Link: https://lore.kernel.org/all/20230123194218.47ssfzhrpnv3xfez@oracle.com/ Signed-off-by: Tom Saeger <tom.saeger@oracle.com> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Dennis Gilmore <dennis@ausil.us> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Palmer Dabbelt <palmer@rivosinc.com> Cc: Rich Felker <dalias@libc.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31	ia64: fix build error due to switch case label appearing next to declaration	James Morse
	Since commit aa06a9bd8533 ("ia64: fix clock_getres(CLOCK_MONOTONIC) to report ITC frequency"), gcc 10.1.0 fails to build ia64 with the gnomic: \| ../arch/ia64/kernel/sys_ia64.c: In function 'ia64_clock_getres': \| ../arch/ia64/kernel/sys_ia64.c:189:3: error: a label can only be part of a statement and a declaration is not a statement \| 189 \| s64 tick_ns = DIV_ROUND_UP(NSEC_PER_SEC, local_cpu_data->itc_freq); This line appears immediately after a case label in a switch. Move the declarations out of the case, to the top of the function. Link: https://lkml.kernel.org/r/20230117151632.393836-1-james.morse@arm.com Fixes: aa06a9bd8533 ("ia64: fix clock_getres(CLOCK_MONOTONIC) to report ITC frequency") Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Sergei Trofimovich <slyich@gmail.com> Cc: Émeric Maschino <emeric.maschino@gmail.com> Cc: matoro <matoro_mailinglist_kernel@matoro.tk> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
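The build error above comes from a C rule that predates C23: a label, including a case label, must be followed by a statement, and a declaration is not a statement. The following is a minimal stand-alone sketch of the fix pattern described in the commit (hoisting the declaration to the top of the function). The function and enum names are hypothetical stand-ins, not the ia64 code, and the division mirrors what DIV_ROUND_UP computes.

```c
#include <assert.h>

/* Hypothetical stand-ins for clock IDs; not the kernel's definitions. */
enum demo_clock { DEMO_CLOCK_MONOTONIC, DEMO_CLOCK_OTHER };

#define DEMO_NSEC_PER_SEC 1000000000L

/*
 * Writing "long tick_ns = ...;" directly after "case DEMO_CLOCK_MONOTONIC:"
 * is rejected by older compilers, because the label would be attached to a
 * declaration. Declaring the variable at function scope avoids the problem.
 */
long demo_clock_getres_ns(enum demo_clock id, long freq_hz)
{
	long tick_ns;	/* hoisted out of the case */

	switch (id) {
	case DEMO_CLOCK_MONOTONIC:
		/* Round up, as DIV_ROUND_UP(NSEC_PER_SEC, freq) would. */
		tick_ns = (DEMO_NSEC_PER_SEC + freq_hz - 1) / freq_hz;
		return tick_ns;
	default:
		return -1;
	}
}
```

An alternative with the same effect is to wrap the case body in braces, which turns the code after the label into a compound statement; the commit chose hoisting instead.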
2023-01-31	arm64: dts: qcom: sm8350: use qcom,sm8350-dsi-ctrl compatibles	Dmitry Baryshkov
	Add the per-SoC (qcom,sm8350-dsi-ctrl) compatible strings to DSI nodes to follow the pending DSI bindings changes. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230118032024.1715857-1-dmitry.baryshkov@linaro.org
2023-01-31	arm64: dts: qcom: sc8280xp: add p1 register blocks to DP nodes	Dmitry Baryshkov
	Per DT bindings add p1 register blocks to all DP controllers on SC8280XP platform. Fixes: 6f299ae7f96d ("arm64: dts: qcom: sc8280xp: add p1 register blocks to DP nodes") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230118031718.1714861-4-dmitry.baryshkov@linaro.org
2023-01-31	arm64: dts: qcom: sc8280xp-crd: drop #sound-dai-cells from eDP node	Dmitry Baryshkov
	The eDP device doesn't provide sound DAI. Drop corresponding property from the eDP node. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230118031718.1714861-3-dmitry.baryshkov@linaro.org
2023-01-31	KVM: x86: Move HF_NMI_MASK and HF_IRET_MASK into "struct vcpu_svm"	Maxim Levitsky
	Move HF_NMI_MASK and HF_IRET_MASK (a.k.a. "waiting for IRET") out of the common "hflags" and into dedicated flags in "struct vcpu_svm". The flags are used only for SVM and thus should not be in hflags. Tracking NMI masking in software isn't SVM specific, e.g. VMX has a similar flag (soft_vnmi_blocked), but that's much more of a hack as VMX can't intercept IRET, is useful only for ancient CPUs, i.e. will hopefully be removed at some point, and again the exact behavior is vendor specific and shouldn't ever be referenced in common code. No functional change is intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com> Link: https://lore.kernel.org/r/20221129193717.513824-5-mlevitsk@redhat.com [sean: split from HF_GIF_MASK patch] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-31	KVM: x86: Move HF_GIF_MASK into "struct vcpu_svm" as "guest_gif"	Maxim Levitsky
	Move HF_GIF_MASK out of the common "hflags" and into vcpu_svm.guest_gif. GIF is an SVM-only concept and should never be consulted outside of SVM-specific code. No functional change is intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com> Link: https://lore.kernel.org/r/20221129193717.513824-5-mlevitsk@redhat.com [sean: split to separate patch] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-31	KVM: nSVM: Don't sync tlb_ctl back to vmcb12 on nested VM-Exit	Maxim Levitsky
	Don't sync the TLB control field from vmcb02 to vmcb12 on nested VM-Exit. Per AMD's APM, the field is not modified by hardware: The VMRUN instruction reads, but does not change, the value of the TLB_CONTROL field Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com> Link: https://lore.kernel.org/r/20221129193717.513824-2-mlevitsk@redhat.com [sean: massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-31	arm64/sysreg: clean up some inconsistent indenting	Jiapeng Chong
	No functional modification involved. ./arch/arm64/kvm/sys_regs.c:80:2-9: code aligned with following code on line 82. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3897 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/20230131082703.118101-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-01-31	Merge patch series "Zbb string optimizations"	Palmer Dabbelt
	Heiko Stuebner <heiko@sntech.de> says: From: Heiko Stuebner <heiko.stuebner@vrull.eu> This series still tries to allow optimized string functions for specific extensions. The last approach of using an inline base function to hold the alternative calls did cause some issues in a number of places. So instead of that we're now just using an alternative j at the beginning of the generic function to jump to a separate place inside the function itself. * b4-shazam-merge: RISC-V: add zbb support to string functions RISC-V: add infrastructure to allow different str* implementations Link: https://lore.kernel.org/r/20230113212301.3534711-1-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	RISC-V: add zbb support to string functions	Heiko Stuebner
	Add handling for ZBB extension and add support for using it as a variant for optimized string functions. Support for the Zbb-str-variants is limited to the GNU-assembler for now, as LLVM has not yet acquired the functionality to selectively change the arch option in assembler code. This is still under review at https://reviews.llvm.org/D123515 Co-developed-by: Christoph Muellner <christoph.muellner@vrull.eu> Signed-off-by: Christoph Muellner <christoph.muellner@vrull.eu> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230113212301.3534711-3-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	RISC-V: add infrastructure to allow different str* implementations	Heiko Stuebner
	Depending on supported extensions on specific RISC-V cores, optimized str* functions might make sense. This adds basic infrastructure to allow patching the function calls via alternatives later on. The Linux kernel provides standard implementations for string functions but when architectures want to extend them, they need to provide their own. The added generic string functions are done in assembler (taken from disassembling the main-kernel functions for now) to allow us to control the used registers and extend them with optimized variants. This doesn't override the compiler's use of builtin replacements. The compiler still decides first whether a builtin is more suitable, e.g. for known strings. For all regular cases we will want to later select possible optimized variants and in the worst case fall back to the generic implementation added with this change. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230113212301.3534711-2-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31	x86/amd: Cache debug register values in percpu variables	Alexey Kardashevskiy
	Reading DR[0-3]_ADDR_MASK MSRs takes about 250 cycles which is going to be noticeable with the AMD KVM SEV-ES DebugSwap feature enabled. KVM is going to store host's DR[0-3] and DR[0-3]_ADDR_MASK before switching to a guest; the hardware is going to swap these on VMRUN and VMEXIT. Store MSR values passed to set_dr_addr_mask() in percpu variables (when changed) and return them via new amd_get_dr_addr_mask(). The gain here is about 10x. As set_dr_addr_mask() uses the array too, change the @dr type to unsigned to avoid checking for <0. And give it the amd_ prefix to match the new helper as the whole DR_ADDR_MASK feature is AMD-specific anyway. While at it, replace deprecated boot_cpu_has() with cpu_feature_enabled() in set_dr_addr_mask(). Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230120031047.628097-2-aik@amd.com
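The caching scheme described above (write the MSR only when the value changes, and answer reads from the cached copy) can be sketched in plain C. This is a simplified single-CPU model under stated assumptions: the kernel keeps the cache in percpu variables, whereas this sketch uses an ordinary array, and every name prefixed with `demo_` is hypothetical, including the counter that stands in for the expensive MSR write.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NR_DR 4	/* DR0..DR3 */

/* Plain array standing in for the kernel's percpu cache (assumption). */
static uint64_t demo_cached_mask[DEMO_NR_DR];
static unsigned int demo_msr_writes;	/* counts simulated MSR writes */

/* Stand-in for the real MSR write; it only counts invocations. */
static void demo_wrmsr(unsigned int dr, uint64_t val)
{
	(void)dr;
	(void)val;
	demo_msr_writes++;
}

/* An unsigned index means only the upper bound needs checking. */
void demo_set_dr_addr_mask(uint64_t mask, unsigned int dr)
{
	if (dr >= DEMO_NR_DR)
		return;
	if (demo_cached_mask[dr] == mask)
		return;			/* unchanged: skip the slow write */
	demo_wrmsr(dr, mask);
	demo_cached_mask[dr] = mask;
}

/* Reads never touch the MSR; they return the cached value. */
uint64_t demo_get_dr_addr_mask(unsigned int dr)
{
	return dr < DEMO_NR_DR ? demo_cached_mask[dr] : 0;
}

/* Sets the same mask twice and reports how many writes were issued. */
unsigned int demo_run(void)
{
	demo_set_dr_addr_mask(0xff, 1);
	demo_set_dr_addr_mask(0xff, 1);
	return demo_msr_writes;
}
```

The second call to the setter in `demo_run()` hits the cache, so only one simulated write is issued.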
2023-01-31	s390/mem_detect: do not update output parameters on failure	Alexander Gordeev
	Function __get_mem_detect_block() resets start and end output parameters in case an invalid mem_detect array index is provided. That violates the rule of sparing the output on the fail path and leads, e.g., to the anomaly below: for_each_mem_detect_block(i, &start, &end) continue; One would expect start and end to contain addresses of the last memory block (if available), but in fact the two will be reset to zeroes. That is not how an iterator is expected to work. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
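The iterator contract described above can be shown with a small sketch: on failure the getter returns an error and leaves its output parameters untouched, so after an empty-bodied loop the outputs still describe the last valid block. The block table and all `demo_` names are hypothetical and do not reflect the s390 mem_detect data structures.

```c
#include <assert.h>

struct demo_block {
	unsigned long start, end;
};

/* Hypothetical memory block table. */
static const struct demo_block demo_blocks[] = {
	{ 0x0000, 0x1000 },
	{ 0x4000, 0x8000 },
};
#define DEMO_NR_BLOCKS (sizeof(demo_blocks) / sizeof(demo_blocks[0]))

/*
 * Returns 0 on success, -1 for an invalid index. On failure *start and
 * *end are deliberately left as they were.
 */
int demo_get_block(unsigned int n, unsigned long *start, unsigned long *end)
{
	if (n >= DEMO_NR_BLOCKS)
		return -1;
	*start = demo_blocks[n].start;
	*end = demo_blocks[n].end;
	return 0;
}

/* Walks all blocks with an empty body and returns the last block's end. */
unsigned long demo_last_end(void)
{
	unsigned long start = 0, end = 0;
	unsigned int i;

	for (i = 0; demo_get_block(i, &start, &end) == 0; i++)
		continue;
	return end;
}
```

Had the getter zeroed its outputs on the failing final call, `demo_last_end()` would return 0 instead of the end of the last block.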
2023-01-31	s390/cio: introduce locking for register/unregister functions	Vineeth Vijayan
	Unbinding an I/O subchannel with a child-CCW device in disconnected state sometimes causes a kernel-panic. The race condition was seen mostly during testing, when setting all the CHPIDs of a device to offline and, at the same time, unbinding the I/O subchannel driver. The kernel-panic occurs because of a double delete: the I/O subchannel driver calls device_del on the CCW device while another device_del invocation for the same device is in-flight. For instance, disabling all the CHPIDs will trigger the ccw_device_remove function, which will call ccw_device_unregister(), which ends up calling device_del() asynchronously via cdev's todo workqueue. And unbinding the I/O subchannel driver calls the io_subchannel_remove() function, which calls ccw_device_unregister() and device_del(). This double delete can be prevented by serializing all CCW device registration/unregistration calls into the driver core. This patch introduces a mutex which will be used for this purpose. Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com> Reported-by: Boris Fiuczynski <fiuczy@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-31	s390/mm,ptdump: avoid Kasan vs Memcpy Real markers swapping	Vasily Gorbik
	---[ Real Memory Copy Area Start ]--- 0x001bfffffffff000-0x001c000000000000 4K PTE I ---[ Kasan Shadow Start ]--- ---[ Real Memory Copy Area End ]--- 0x001c000000000000-0x001c000200000000 8G PMD RW NX ... ---[ Kasan Shadow End ]--- ptdump does a stable sort of markers. Move kasan markers after memcpy real to avoid swapping. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-31	s390/boot: remove pgtable_populate_end	Vasily Gorbik
	setup_vmem() already calls populate for all online memory regions. pgtable_populate_end() could be removed. Also rename pgtable_populate_begin() to pgtable_populate_init(). Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-31	s390/boot: avoid mapping standby memory	Vasily Gorbik
	Commit bb1520d581a3 ("s390/mm: start kernel with DAT enabled") doesn't consider online memory holes due to potential memory offlining and erroneously creates pgtables for stand-by memory, which bear RW+X attribute and trigger a warning: RANGE SIZE STATE REMOVABLE BLOCK 0x0000000000000000-0x0000000c3fffffff 49G online yes 0-48 0x0000000c40000000-0x0000000c7fffffff 1G offline 49 0x0000000c80000000-0x0000000fffffffff 14G online yes 50-63 0x0000001000000000-0x00000013ffffffff 16G offline 64-79 s390/mm: Found insecure W+X mapping at address 0xc40000000 WARNING: CPU: 14 PID: 1 at arch/s390/mm/dump_pagetables.c:142 note_page+0x2cc/0x2d8 Map only online memory ranges which fit within identity mapping limit. Fixes: bb1520d581a3 ("s390/mm: start kernel with DAT enabled") Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-31	s390/decompressor: specify __decompress() buf len to avoid overflow	Vasily Gorbik
	Historically calls to __decompress() didn't specify "out_len" parameter on many architectures including s390, expecting that no writes beyond uncompressed kernel image are performed. This has changed since commit 2aa14b1ab2c4 ("zstd: import usptream v1.5.2") which includes zstd library commit 6a7ede3dfccb ("Reduce size of dctx by reutilizing dst buffer (#2751)"). Now zstd decompression code might store literal buffer in the unwritten portion of the destination buffer. Since "out_len" is not set, it is considered to be unlimited and hence free to use for optimization needs. On s390 this might corrupt initrd or ipl report which are often placed right after the decompressor buffer. Luckily the size of uncompressed kernel image is already known to the decompressor, so to avoid the problem simply specify it in the "out_len" parameter. Link: https://github.com/facebook/zstd/commit/6a7ede3dfccb Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com> Link: https://lore.kernel.org/r/patch-1.thread-41c676.git-41c676c2d153.your-ad-here.call-01675030179-ext-9637@work.hours Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-31	arm64: irqflags: use alternative branches for pseudo-NMI logic	Mark Rutland
	Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: \| [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig \| grep icc_pmr_el1 \| wc -l \| 885 \| [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig \| grep icc_pmr_el1 \| wc -l \| 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_() functions to use alternative branches. 
This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affecting the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: \| [mark@lakrids:~/src/linux]% ls -al vmlinux-* \| -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig \| -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi \| -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig \| -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi \| [mark@lakrids:~/src/linux]% ls -al Image-* \| -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig \| -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi \| -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig \| -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
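The "explicit barrier() before and after the enable/disable" pattern described above can be sketched in userspace. This is only a simulation: `sim_irqs_masked` is a hypothetical variable standing in for the real DAIF/PMR state, the `sim_*` helpers are invented for illustration, and the `barrier()` definition is the usual GCC/Clang compiler-barrier idiom rather than a copy of the arm64 code.

```c
#include <stdbool.h>

/* Compiler-only barrier: forbids the compiler from moving memory
 * accesses across this point (GCC/Clang inline asm idiom). */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Simulated interrupt mask; hypothetical stand-in for DAIF/PMR. */
static volatile bool sim_irqs_masked;

/* Mask "interrupts" and return the previous mask state. */
static bool sim_local_irq_save(void)
{
	bool flags = sim_irqs_masked;

	barrier();              /* preceding accesses stay before the change */
	sim_irqs_masked = true;
	barrier();              /* following accesses stay after the change */
	return flags;
}

/* Put the mask back to the state returned by sim_local_irq_save(). */
static void sim_local_irq_restore(bool flags)
{
	barrier();
	sim_irqs_masked = flags;
	barrier();
}
```

Because the ordering is enforced by the barriers rather than by a memory clobber on one combined asm block, the mask update itself can be an ordinary accessor, which is the property the commit message relies on.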
2023-01-31	arm64: add ARM64_HAS_GIC_PRIO_RELAXED_SYNC cpucap	Mark Rutland
	When Priority Mask Hint Enable (PMHE) == 0b1, the GIC may use the PMR value to determine whether to signal an IRQ to a PE, and consequently after a change to the PMR value, a DSB SY may be required to ensure that interrupts are signalled to a CPU in finite time. When PMHE == 0b0, interrupts are always signalled to the relevant PE, and all masking occurs locally, without requiring a DSB SY. Since commit: f226650494c6aa87 ("arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clear") ... we handle this dynamically: in most cases a static key is used to determine whether to issue a DSB SY, but the entry code must read from ICC_CTLR_EL1 as static keys aren't accessible from plain assembly. It would be much nicer to use an alternative instruction sequence for the DSB, as this would avoid the need to read from ICC_CTLR_EL1 in the entry code, and for most other code this will result in simpler code generation with fewer instructions and fewer branches. This patch adds a new ARM64_HAS_GIC_PRIO_RELAXED_SYNC cpucap which is only set when ICC_CTLR_EL1.PMHE == 0b0 (and GIC priority masking is in use). This allows us to replace the existing users of the `gic_pmr_sync` static key with alternative sequences which default to a DSB SY and are relaxed to a NOP when PMHE is not in use. The entry assembly management of the PMR is slightly restructured to use a branch (rather than multiple NOPs) when priority masking is not in use. This is more in keeping with other alternatives in the entry assembly, and permits the use of a separate alternative for the PMHE-dependent DSB SY (and removal of the conditional branch this currently requires). For consistency I've adjusted both the save and restore paths. 
According to bloat-o-meter, when building defconfig + CONFIG_ARM64_PSEUDO_NMI=y this shrinks the kernel text by ~4KiB: \| add/remove: 4/2 grow/shrink: 42/310 up/down: 332/-5032 (-4700) The resulting vmlinux is ~66KiB smaller, though the resulting Image size is unchanged due to padding and alignment: \| [mark@lakrids:~/src/linux]% ls -al vmlinux-* \| -rwxr-xr-x 1 mark mark 137508344 Jan 17 14:11 vmlinux-after \| -rwxr-xr-x 1 mark mark 137575440 Jan 17 13:49 vmlinux-before \| [mark@lakrids:~/src/linux]% ls -al Image-* \| -rw-r--r-- 1 mark mark 38777344 Jan 17 14:11 Image-after \| -rw-r--r-- 1 mark mark 38777344 Jan 17 13:49 Image-before Prior to this patch we did not verify the state of ICC_CTLR_EL1.PMHE on secondary CPUs. As of this patch this is verified by the cpufeature code when using GIC priority masking (i.e. when using pseudo-NMIs). Note that since commit: 7e3a57fa6ca831fa ("arm64: Document ICC_CTLR_EL3.PMHE setting requirements") ... Documentation/arm64/booting.rst specifies: \| - ICC_CTLR_EL3.PMHE (bit 6) must be set to the same value across \| all CPUs the kernel is executing on, and must stay constant \| for the lifetime of the kernel. ... so that should not adversely affect any compliant systems, and as we'll only check for the absence of PMHE when using pseudo-NMIs, this will only fire when such mismatch will adversely affect the system. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
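The condition under which the new capability is set, and what it implies for PMR writes, can be written out as a truth-table style sketch. `struct gic_state` and both helper functions are hypothetical names for illustration; the real kernel expresses this through cpucaps and alternative sequences, not runtime booleans.

```c
#include <stdbool.h>

/* Hypothetical description of the boot-time state; not a kernel type. */
struct gic_state {
	bool prio_masking;  /* GIC priority masking (pseudo-NMI) in use */
	bool pmhe;          /* value of ICC_CTLR_EL1.PMHE */
};

/* The capability is set only when masking is in use and PMHE == 0b0. */
static bool has_gic_prio_relaxed_sync(struct gic_state s)
{
	return s.prio_masking && !s.pmhe;
}

/*
 * A PMR write is followed by a DSB SY by default; the DSB is relaxed
 * (to a NOP in the real code) when the capability above is set.
 * Without priority masking the PMR is not managed at all.
 */
static bool pmr_write_needs_dsb(struct gic_state s)
{
	return s.prio_masking && !has_gic_prio_relaxed_sync(s);
}
```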
2023-01-31	arm64: make ARM64_HAS_GIC_PRIO_MASKING depend on ARM64_HAS_GIC_CPUIF_SYSREGS	Mark Rutland
	Currently the arm64_cpu_capabilities structure for ARM64_HAS_GIC_PRIO_MASKING open-codes the same CPU field definitions as the arm64_cpu_capabilities structure for ARM64_HAS_GIC_CPUIF_SYSREGS, so that can_use_gic_priorities() can use has_useable_gicv3_cpuif(). This duplication isn't ideal for the legibility of the code, and sets a bad example for any ARM64_HAS_GIC_* definitions added by subsequent patches. Instead, have ARM64_HAS_GIC_PRIO_MASKING check for the ARM64_HAS_GIC_CPUIF_SYSREGS cpucap, and add a comment explaining why this is safe. Subsequent patches will use the same pattern where one cpucap depends upon another. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
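The "one cpucap depends upon another" pattern can be sketched as follows. Everything here is a simplified, hypothetical model: the enum values, the `cap_detected[]` array, and `pseudo_nmi_requested` are invented for illustration and do not reproduce the kernel's `arm64_cpu_capabilities` machinery. The sketch assumes, as the commit message implies, that the dependency is evaluated before the capability that relies on it.

```c
#include <stdbool.h>

/* Hypothetical capability ids, listed in detection order. */
enum {
	CAP_GIC_CPUIF_SYSREGS,
	CAP_GIC_PRIO_MASKING,
	NR_CAPS
};

static bool cap_detected[NR_CAPS];
static bool pseudo_nmi_requested;   /* assumed user/config request */

/* Dependent check: reuse the earlier result instead of re-probing
 * the same CPU ID fields a second time. */
static bool can_use_gic_priorities_sketch(void)
{
	if (!cap_detected[CAP_GIC_CPUIF_SYSREGS])
		return false;
	return pseudo_nmi_requested;
}

/* Detect in order, so the dependency is already settled when the
 * dependent capability is evaluated. */
static void detect_caps_sketch(bool cpuif_usable)
{
	cap_detected[CAP_GIC_CPUIF_SYSREGS] = cpuif_usable;
	cap_detected[CAP_GIC_PRIO_MASKING] = can_use_gic_priorities_sketch();
}
```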
2023-01-31	arm64: rename ARM64_HAS_IRQ_PRIO_MASKING to ARM64_HAS_GIC_PRIO_MASKING	Mark Rutland
	Subsequent patches will add more GIC-related cpucaps. When we do so, it would be nice to give them a consistent HAS_GIC_* prefix. In preparation for doing so, this patch renames the existing ARM64_HAS_IRQ_PRIO_MASKING cap to ARM64_HAS_GIC_PRIO_MASKING. The cpucaps file was hand-modified; all other changes were scripted with: find . -type f -name '*.[chS]' -print0 \| \ xargs -0 sed -i 's/ARM64_HAS_IRQ_PRIO_MASKING/ARM64_HAS_GIC_PRIO_MASKING/' There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-31	arm64: rename ARM64_HAS_SYSREG_GIC_CPUIF to ARM64_HAS_GIC_CPUIF_SYSREGS	Mark Rutland
	Subsequent patches will add more GIC-related cpucaps. When we do so, it would be nice to give them a consistent HAS_GIC_* prefix. In preparation for doing so, this patch renames the existing ARM64_HAS_SYSREG_GIC_CPUIF cap to ARM64_HAS_GIC_CPUIF_SYSREGS. The 'CPUIF_SYSREGS' suffix is chosen so that this will be ordered ahead of other ARM64_HAS_GIC_* definitions in subsequent patches. The cpucaps file was hand-modified; all other changes were scripted with: find . -type f -name '*.[chS]' -print0 \| \ xargs -0 sed -i 's/ARM64_HAS_SYSREG_GIC_CPUIF/ARM64_HAS_GIC_CPUIF_SYSREGS/' There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-31	arm64: pauth: don't sign leaf functions	Mark Rutland
	Currently, when CONFIG_ARM64_PTR_AUTH_KERNEL=y (and CONFIG_UNWIND_PATCH_PAC_INTO_SCS=n), we enable pointer authentication for all functions, including leaf functions. This isn't necessary, and is unfortunate for a few reasons: * Any PACIASP instruction is implicitly a `BTI C` landing pad, and forcing the addition of a PACIASP in every function introduces a larger set of BTI gadgets than is necessary. * The PACIASP and AUTIASP instructions make leaf functions larger than necessary, bloating the kernel Image. For a defconfig v6.2-rc3 kernel, this appears to add ~64KiB relative to not signing leaf functions, which is unfortunate but not entirely onerous. * The PACIASP and AUTIASP instructions potentially make leaf functions more expensive in terms of performance and/or power. For many trivial leaf functions, this is clearly unnecessary, e.g. \| <arch_local_save_flags>: \| d503233f paciasp \| d53b4220 mrs x0, daif \| d50323bf autiasp \| d65f03c0 ret \| <calibration_delay_done>: \| d503233f paciasp \| d50323bf autiasp \| d65f03c0 ret \| d503201f nop * When CONFIG_UNWIND_PATCH_PAC_INTO_SCS=y we disable pointer authentication for leaf functions, so clearly this is not functionally necessary, indicates we have an inconsistent threat model, and convolutes the Makefile logic. We've used pointer authentication in leaf functions since the introduction of in-kernel pointer authentication in commit: 74afda4016a7437e ("arm64: compile the kernel with ptrauth return address signing") ... but at the time we had no rationale for signing leaf functions. Subsequently, we considered avoiding signing leaf functions: https://lore.kernel.org/linux-arm-kernel/1586856741-26839-1-git-send-email-amit.kachhap@arm.com/ https://lore.kernel.org/linux-arm-kernel/1588149371-20310-1-git-send-email-amit.kachhap@arm.com/ ... however at the time we didn't have an abundance of reasons to avoid signing leaf functions as above (e.g. 
the BTI case), we had no hardware to make performance measurements, and it was reasoned that this gave some level of protection against a limited set of code-reuse gadgets which would fall through to a RET. We documented this in commit: 717b938e22f8dbf0 ("arm64: Document why we enable PAC support for leaf functions") Notably, this was before we supported any forward-edge CFI scheme (e.g. Arm BTI, or Clang CFI/kCFI), which would prevent jumping into the middle of a function. In addition, even with signing forced for leaf functions, AUTIASP may be placed before a number of instructions which might constitute such a gadget, e.g. \| <user_regs_reset_single_step>: \| f9400022 ldr x2, [x1] \| d503233f paciasp \| d50323bf autiasp \| f9408401 ldr x1, [x0, #264] \| 720b005f tst w2, #0x200000 \| b26b0022 orr x2, x1, #0x200000 \| 926af821 and x1, x1, #0xffffffffffdfffff \| 9a820021 csel x1, x1, x2, eq // eq = none \| f9008401 str x1, [x0, #264] \| d65f03c0 ret \| <fpsimd_cpu_dead>: \| 2a0003e3 mov w3, w0 \| 9000ff42 adrp x2, ffff800009ffd000 <xen_dynamic_chip+0x48> \| 9120e042 add x2, x2, #0x838 \| 52800000 mov w0, #0x0 // #0 \| d503233f paciasp \| f000d041 adrp x1, ffff800009a20000 <this_cpu_vector> \| d50323bf autiasp \| 9102c021 add x1, x1, #0xb0 \| f8635842 ldr x2, [x2, w3, uxtw #3] \| f821685f str xzr, [x2, x1] \| d65f03c0 ret \| d503201f nop So generally, trying to use AUTIASP to detect such gadgetization is not robust, and this is dealt with far better by forward-edge CFI (which is designed to prevent such cases). We should bite the bullet and stop pretending that AUTIASP is a mitigation for such forward-edge gadgetization. For the above reasons, this patch has the kernel consistently sign non-leaf functions and avoid signing leaf functions. 
Considering a defconfig v6.2-rc3 kernel built with LLVM 15.0.6: * The vmlinux is ~43KiB smaller: \| [mark@lakrids:~/src/linux]% ls -al vmlinux-* \| -rwxr-xr-x 1 mark mark 338547808 Jan 25 17:17 vmlinux-after \| -rwxr-xr-x 1 mark mark 338591472 Jan 25 17:22 vmlinux-before * The resulting Image is 64KiB smaller: \| [mark@lakrids:~/src/linux]% ls -al Image-* \| -rwxr-xr-x 1 mark mark 32702976 Jan 25 17:17 Image-after \| -rwxr-xr-x 1 mark mark 32768512 Jan 25 17:22 Image-before * There are ~400 fewer BTI gadgets: \| [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before 2> /dev/null \| grep -ow 'paciasp\\|bti\sc\?' \| sort \| uniq -c \| 1219 bti c \| 61982 paciasp \| [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after 2> /dev/null \| grep -ow 'paciasp\\|bti\sc\?' \| sort \| uniq -c \| 10099 bti c \| 52699 paciasp Which is +8880 BTIs, and -9283 PACIASPs, for -403 unnecessary BTI gadgets. While this is small relative to the total, distinguishing the two cases will make it easier to analyse and reduce this set further in future. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230131105809.991288-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
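The landing-pad arithmetic quoted above can be checked with a few lines of C. The counts are copied from the objdump output in the commit message; `struct insn_counts` and `landing_pads()` are hypothetical helpers, and the sum relies on the statement that any PACIASP is implicitly a `BTI C` landing pad.

```c
/* Instruction counts taken from the objdump output quoted above. */
struct insn_counts {
	long bti_c;
	long paciasp;
};

static const struct insn_counts counts_before = { 1219, 61982 };
static const struct insn_counts counts_after  = { 10099, 52699 };

/* Every PACIASP also acts as a BTI C landing pad, so the number of
 * call-target landing pads is the sum of both instruction counts. */
static long landing_pads(struct insn_counts c)
{
	return c.bti_c + c.paciasp;
}
```

The differences come out as +8880 `bti c`, -9283 `paciasp`, and a net change of -403 landing pads, matching the figures in the message.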
2023-01-31	arm64: unify asm-arch manipulation	Mark Rutland
	Assemblers will reject instructions not supported by a target architecture version, and so we must explicitly tell the assembler the latest architecture version for which we want to assemble instructions. We've added a few AS_HAS_ARMV8_<N> definitions for this, in addition to an inconsistently named AS_HAS_PAC definition, from which arm64's top-level Makefile determines the architecture version that we intend to target, and generates the `asm-arch` variable. To make this a bit clearer and easier to maintain, this patch reworks the Makefile to determine asm-arch in a single if-else-endif chain. AS_HAS_PAC, which is defined when the assembler supports `-march=armv8.3-a`, is renamed to AS_HAS_ARMV8_3. As the logic for armv8.3-a is lifted out of the block handling pointer authentication, `asm-arch` may now be set to armv8.3-a regardless of whether support for pointer authentication is selected. This means that it will be possible to assemble armv8.3-a instructions even if we didn't intend to, but this is consistent with our handling of other architecture versions, and the compiler won't generate armv8.3-a instructions regardless. For the moment there's no need for a CONFIG_AS_HAS_ARMV8_1, as the code for LSE atomics and LDAPR use individual `.arch_extension` entries and do not require the baseline asm arch to be bumped to armv8.1-a. The other armv8.1-a features (e.g. PAN) do not require assembler support. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230131105809.991288-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
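The "single if-else-endif chain" lives in a Makefile, but its shape can be sketched in C: test the newest supported version first and fall through to older ones. The set of versions shown (armv8.2-a through armv8.5-a), `struct as_support`, and `pick_asm_arch()` are assumptions made for illustration; only armv8.3-a (AS_HAS_ARMV8_3) is named explicitly in the message above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the AS_HAS_ARMV8_<N> results. */
struct as_support {
	bool v8_2;
	bool v8_3;   /* corresponds to the renamed AS_HAS_ARMV8_3 */
	bool v8_4;
	bool v8_5;
};

/* One chain, newest first: the first supported version wins.
 * Returns NULL when none applies, i.e. asm-arch is left unset. */
static const char *pick_asm_arch(struct as_support s)
{
	if (s.v8_5)
		return "armv8.5-a";
	else if (s.v8_4)
		return "armv8.4-a";
	else if (s.v8_3)
		return "armv8.3-a";
	else if (s.v8_2)
		return "armv8.2-a";
	return NULL;
}
```

Note that the armv8.3-a branch does not consult any pointer-authentication setting, which mirrors the point that `asm-arch` may be armv8.3-a regardless of whether pointer authentication is selected.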
2023-01-31	arm64: dts: qcom: Add base QDU1000/QRU1000 IDP DTs	Melody Olvera
	Add DTs for Qualcomm IDP platforms using the QDU1000 and QRU1000 SoCs. Signed-off-by: Melody Olvera <quic_molvera@quicinc.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230112210722.6234-3-quic_molvera@quicinc.com