linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-06-29	LoongArch: Add jump-label implementation	Youling Tang
	Add support for jump labels based on the ARM64 version. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Select HAVE_DEBUG_KMEMLEAK to support kmemleak	Tiezhu Yang
	We can see that DEBUG_KMEMLEAK depends on HAVE_DEBUG_KMEMLEAK after commit b69ec42b1b19 ("Kconfig: clean up the long arch list for the DEBUG_KMEMLEAK config option"), just select HAVE_DEBUG_KMEMLEAK to support kmemleak on LoongArch. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Export some arch-specific pm interfaces	Yinbo Zhu
	Some PMC (Power Management Controllers) need to support DTS and will use the suspend interfaces thus this patch was to export such interfaces for their use. Signed-off-by: Yinbo Zhu <zhuyinbo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Introduce hardware page table walker	Huacai Chen
	Loongson-3A6000 and newer processors have hardware page table walker (PTW) support. PTW can handle all fastpaths of TLBI/TLBL/TLBS/TLBM exceptions by hardware, software only need to handle slowpaths (page faults). BTW, PTW doesn't append _PAGE_MODIFIED for page table entries, so we change pmd_dirty() and pte_dirty() to also check _PAGE_DIRTY for the "dirty" attribute. Signed-off-by: Liang Gao <gaoliang@loongson.cn> Signed-off-by: Jun Yi <yijun@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Support dbar with different hints	Huacai Chen
	Traditionally, LoongArch uses "dbar 0" (full completion barrier) for everything. But the full completion barrier is a performance killer, so Loongson-3A6000 and newer processors have made finer granularity hints available: Bit4: ordering or completion (0: completion, 1: ordering) Bit3: barrier for previous read (0: true, 1: false) Bit2: barrier for previous write (0: true, 1: false) Bit1: barrier for succeeding read (0: true, 1: false) Bit0: barrier for succeeding write (0: true, 1: false) Hint 0x700: barrier for "read after read" from the same address, which is needed by LL-SC loops on old models (dbar 0x700 behaves the same as nop if such reordering is disabled on new models). This patch makes use of the various new hints for different kinds of memory barriers. It brings performance improvements on Loongson-3A6000 series, while not affecting the existing models because all variants are treated as 'dbar 0' there. Why override queued_spin_unlock()? After commit 01e3b958efe85a26d9b ("drivers: Remove explicit invocations of mmiowb()") we need a completion barrier in queued_spin_unlock(), but the generic implementation use smp_store_release() which only provide an ordering barrier. Signed-off-by: Jun Yi <yijun@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Add SMT (Simultaneous Multi-Threading) support	Huacai Chen
	Loongson-3A6000 has SMT (Simultaneous Multi-Threading) support, each physical core has two logical cores (threads). This patch add SMT probe and scheduler support via ACPI PPTT. If SCHED_SMT enabled, Loongson-3A6000 is treated as 4 cores, 8 threads; If SCHED_SMT disabled, Loongson-3A6000 is treated as 8 cores, 8 threads. Remove smp_num_siblings to support HMP (Heterogeneous Multi-Processing). Signed-off-by: Liupu Wang <wangliupu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Add vector extensions support	Huacai Chen
	Add LoongArch's vector extensions support, which including 128bit LSX (i.e., Loongson SIMD eXtension) and 256bit LASX (i.e., Loongson Advanced SIMD eXtension). Linux kernel doesn't use vector itself, it only handle exceptions and context save/restore. So it only needs a subset of these instructions: * Vector load/store: vld vst vldx vstx xvld xvst xvldx xvstx * 8bit-elements move: vpickve2gr.b xvpickve2gr.b vinsgr2vr.b xvinsgr2vr.b * 16bit-elements move: vpickve2gr.h xvpickve2gr.h vinsgr2vr.h xvinsgr2vr.h * 32bit-elements move: vpickve2gr.w xvpickve2gr.w vinsgr2vr.w xvinsgr2vr.w * 64bit-elements move: vpickve2gr.d xvpickve2gr.d vinsgr2vr.d xvinsgr2vr.d * Elements permute: vpermi.w vpermi.d xvpermi.w xvpermi.d xvpermi.q Introduce AS_HAS_LSX_EXTENSION and AS_HAS_LASX_EXTENSION to avoid non- vector toolchains complains unsupported instructions. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Add support to clone a time namespace	Tiezhu Yang
	We can see that "Time namespaces are not supported" on LoongArch: (1) clone3 test # cd tools/testing/selftests/clone3 && make && ./clone3 ... # Time namespaces are not supported ok 18 # SKIP Skipping clone3() with CLONE_NEWTIME # Totals: pass:17 fail:0 xfail:0 xpass:0 skip:1 error:0 (2) timens test # cd tools/testing/selftests/timens && make && ./timens ... 1..0 # SKIP Time namespaces are not supported On LoongArch the current kernel does not support CONFIG_TIME_NS which depends on GENERIC_VDSO_TIME_NS, select GENERIC_VDSO_TIME_NS to enable CONFIG_TIME_NS to build kernel/time/namespace.c. Additionally, it needs to define some arch-dependent functions for the timens, such as __arch_get_timens_vdso_data(), arch_get_vdso_data() and vdso_join_timens(). At the same time, modify the layout of vvar to use one page size for generic vdso data, expand another page size for timens vdso data and assign LOONGARCH_VDSO_DATA_SIZE (maybe exceeds a page size if expand in the future) for loongarch vdso data, at last add the callback function vvar_fault() and modify stack_top(). With this patch under CONFIG_TIME_NS: (1) clone3 test # cd tools/testing/selftests/clone3 && make && ./clone3 ... ok 18 [739] Result (0) matches expectation (0) # Totals: pass:18 fail:0 xfail:0 xpass:0 skip:0 error:0 (2) timens test # cd tools/testing/selftests/timens && make && ./timens ... # Totals: pass:10 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	Makefile: Add loongarch target flag for Clang compilation	WANG Xuerui
	The LoongArch kernel is 64-bit and built with the soft-float ABI, hence the loongarch64-linux-gnusf target. (The "libc" part can affect the codegen of libcalls: other arches do not use a bare-metal target, and currently the only fully supported libc on LoongArch is glibc anyway.) See: https://lore.kernel.org/loongarch/CAKwvOdnimxv8oJ4mVY74zqtt1x7KTMrWvn2_T9x22SFDbU6rHQ@mail.gmail.com/ Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Mark Clang LTO as working	WANG Xuerui
	Confirmed working with QEMU system emulation. Acked-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Include KBUILD_CPPFLAGS in CHECKFLAGS invocation	WANG Xuerui
	This is a port of commit 08f6554ff90e ("mips: Include KBUILD_CPPFLAGS in CHECKFLAGS invocation") to arch/loongarch, for fixing cross-compilation of Linux/LoongArch with Clang, where previously the `--target` flag would no longer be present for the CHECKFLAGS cc invocation leading to build failure. Reported-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://github.com/ClangBuiltLinux/linux/issues/1787#issuecomment-1608306002 Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: vDSO: Use CLANG_FLAGS instead of filtering out '--target='	WANG Xuerui
	This is a port of commit 76d7fff22be3e ("MIPS: VDSO: Use CLANG_FLAGS instead of filtering out '--target='") to arch/loongarch, for fixing cross-compilation with Clang. Reported-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://github.com/ClangBuiltLinux/linux/issues/1787#issuecomment-1608306002 Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Tweak CFLAGS for Clang compatibility	WANG Xuerui
	Now the arch code is mostly ready for LLVM/Clang consumption, it is time to re-organize the CFLAGS a little to actually enable the LLVM build. Namely, all -G0 switches from CFLAGS are removed, and -mexplicit-relocs and -mdirect-extern-access are now wrapped with cc-option (with the related asm/percpu.h definition guarded against toolchain combos that are known to not work). A build with !RELOCATABLE && !MODULE is confirmed working within a QEMU environment; support for the two features are currently blocked on LLVM/Clang, and will come later. Why -G0 can be removed: In GCC, -G stands for "small data threshold", that instructs the compiler to put data smaller than the specified threshold in a dedicated "small data" section (called .sdata on LoongArch and several other arches). However, benefiting from this would require ABI cooperation, which is not the case for LoongArch; and current GCC behave the same whether -G0 (equal to disabling this optimization) is given or not. So, remove -G0 from CFLAGS altogether for one less thing to care about. This also benefits LLVM/Clang compatibility where the -G switch is not supported. Why -mexplicit-relocs can now be conditionally applied without regressions: Originally -mexplicit-relocs is unconditionally added to CFLAGS in case of CONFIG_AS_HAS_EXPLICIT_RELOCS, because not having it (i.e. old GCC + new binutils) would not work: modules will have R_LARCH_ABS_* relocs inside, but given the rarity of such toolchain combo in the wild, it may not be worthwhile to support it, so support for such relocs in modules were not added back when explicit relocs support was upstreamed, and -mexplicit-relocs is unconditionally added to fail the build early. Now that Clang compatibility is desired, given Clang is behaving like -mexplicit-relocs from day one but without support for the CLI flag, we must ensure the flag is not passed in case of Clang. However, explicit compiler flavor checks can be more brittle than feature detection: in this case what actually matters is support for __attribute__((model)) when building modules. Given neither older GCC nor current Clang support this attribute, probing for the attribute support and #error'ing out would allow proper UX without checking for Clang, and also automatically work when Clang support for the attribute is to be added in the future. Why -mdirect-extern-access is now conditionally applied: This is actually a nice-to-have optimization that can reduce GOT accesses, but not having it is harmless either. Because Clang does not support the option currently, but might do so in the future, conditional application via cc-option ensures compatibility with both current and future Clang versions. Suggested-by: Xi Ruoyao <xry111@xry111.site> # cc-option changes Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Simplify the invtlb wrappers	WANG Xuerui
	The invtlb instruction has been supported by upstream LoongArch toolchains from day one, so ditch the raw opcode trickery and just use plain inline asm for it. While at it, also make the invtlb asm statements barriers, for proper modeling of the side effects. The functions are also marked as __always_inline instead of just "inline", because they cannot work at all if not inlined: the op argument will not be compile-time const in that case, thus failing to satisfy the "i" constraint. The signature of the other more specific invtlb wrappers contain unused arguments right now, but these are not removed right away in order for the patch to be focused. In the meantime, assertions are added to ensure no accidental misuse happens before the refactor. (The more specific wrappers cannot re-use the generic invtlb wrapper, because the ISA manual says $zero shall be used in case a particular op does not take the respective argument: re-using the generic wrapper would mean losing control over the register usage.) Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Make the CPUCFG&CSR ops simple aliases of compiler built-ins	WANG Xuerui
	In addition to less visual clutter, this also makes Clang happy regarding the const-ness of arguments. In the original approach, all Clang gets to see is the incoming arguments whose const-ness cannot be proven without first being inlined; so Clang errors out here while GCC is fine. While at it, tweak several printk format strings because the return type of csr_read64 becomes effectively unsigned long, instead of unsigned long long. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Prepare for assemblers with proper FCSR class support	WANG Xuerui
	The GNU assembler (as of 2.40) mis-treats FCSR operands as GPRs, but the LLVM IAS does not. Probe for this and refer to FCSRs as "$fcsrNN" if support is present. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: extable: Also recognize ABI names of registers	WANG Rui
	When the kernel is compiled with LLVM, the register names being handled during exception fixup building are ABI names instead of bare $rNN style. Add mapping for the ABI names for LLVM compatibility. Signed-off-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Calculate various sizes in the linker script	WANG Rui
	Taking the address delta between symbols in different sections is not supported by the LLVM IAS. Instead, do this in the linker script, so the same data can be properly referenced in assembly. Signed-off-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: WANG Xuerui <git@xen0n.name> [chenhuacai: Fix build with !CONFIG_EFI_STUB] Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Add guard for the larch_insn_gen_xxx functions	WANG Rui
	Add guard for the larch_insn_gen_xxx functions to verify whether the immediate operand is within the acceptable range. Signed-off-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Delete unnecessary debugfs checking	Dan Carpenter
	Debugfs functions are not supposed to be checked for errors. This is sort of unusual but it is described in the comments for the debugfs_create_dir() function. Also debugfs_create_dir() can never return NULL. Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Set CPU#0 as the io master for FDT	Huacai Chen
	ACPI systems set io masters by parsing ACPI MADT, FDT systems have no MADT so we explicitly set CPU#0 as the io master. Otherwise CPU#0 will be considered as hotpluggable. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	Merge branch 'fix-ptp-received-on-wrong-port-with-bridged-sja1105-dsa'	Paolo Abeni
	Vladimir Oltean says: ==================== Fix PTP received on wrong port with bridged SJA1105 DSA Since the changes were made to tag_8021q to support imprecise RX for bridged ports, the tag_sja1105 driver still prefers the source port information deduced from the VLAN headers for link-local traffic, even though the switch can theoretically do better and report the precise source port. The problem is that the tagger doesn't know when to trust one source of information over another, because the INCL_SRCPT option (to "tag" link local frames) is sometimes enabled and sometimes it isn't. The first patch makes the switch provide the hardware tag for link local traffic under all circumstances, and the second patch makes the tagger always use that hardware tag as primary source of information for link local packets. ==================== Link: https://lore.kernel.org/r/20230627094207.3385231-1-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-29	net: dsa: tag_sja1105: always prefer source port information from INCL_SRCPT	Vladimir Oltean
	Currently the sja1105 tagging protocol prefers using the source port information from the VLAN header if that is available, falling back to the INCL_SRCPT option if it isn't. The VLAN header is available for all frames except for META frames initiated by the switch (containing RX timestamps), and thus, the "if (is_link_local)" branch is practically dead. The tag_8021q source port identification has become more loose ("imprecise") and will report a plausible rather than exact bridge port, when under a bridge (be it VLAN-aware or VLAN-unaware). But link-local traffic always needs to know the precise source port. With incorrect source port reporting, for example PTP traffic over 2 bridged ports will all be seen on sockets opened on the first such port, which is incorrect. Now that the tagging protocol has been changed to make link-local frames always contain source port information, we can reverse the order of the checks so that we always give precedence to that information (which is always precise) in lieu of the tag_8021q VID which is only precise for a standalone port. Fixes: d7f9787a763f ("net: dsa: tag_8021q: add support for imprecise RX based on the VBID") Fixes: 91495f21fcec ("net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridging") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-29	net: dsa: sja1105: always enable the INCL_SRCPT option	Vladimir Oltean
	Link-local traffic on bridged SJA1105 ports is sometimes tagged by the hardware with source port information (when the port is under a VLAN aware bridge). The tag_8021q source port identification has become more loose ("imprecise") and will report a plausible rather than exact bridge port, when under a bridge (be it VLAN-aware or VLAN-unaware). But link-local traffic always needs to know the precise source port. Modify the driver logic (and therefore: the tagging protocol itself) to always include the source port information with link-local packets, regardless of whether the port is standalone, under a VLAN-aware or VLAN-unaware bridge. This makes it possible for the tagging driver to give priority to that information over the tag_8021q VLAN header. The big drawback with INCL_SRCPT is that it makes it impossible to distinguish between an original MAC DA of 01:80:C2:XX:YY:ZZ and 01:80:C2:AA:BB:ZZ, because the tagger just patches MAC DA bytes 3 and 4 with zeroes. Only if PTP RX timestamping is enabled, the switch will generate a META follow-up frame containing the RX timestamp and the original bytes 3 and 4 of the MAC DA. Those will be used to patch up the original packet. Nonetheless, in the absence of PTP RX timestamping, we have to live with this limitation, since it is more important to have the more precise source port information for link-local traffic. Fixes: d7f9787a763f ("net: dsa: tag_8021q: add support for imprecise RX based on the VBID") Fixes: 91495f21fcec ("net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridging") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-29	arch/sparc: Add module license and description for fbdev helpers	Thomas Zimmermann
	Add MODULE_LICENSE() and MODULE_DESCRIPTION() for fbdev helpers on sparc. Fixes the following error: ERROR: modpost: missing MODULE_LICENSE() in arch/sparc/video/fbdev.o Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/dri-devel/c525adc9-6623-4660-8718-e0c9311563b8@roeck-us.net/ Suggested-by: Arnd Bergmann <arnd@arndb.de> Fixes: 4eec0b3048fc ("arch/sparc: Implement fb_is_primary_device() in source file") Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: sparclinux@vger.kernel.org Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20230627145843.31794-1-tzimmermann@suse.de
2023-06-29	Merge branch 'fix-ptp-packet-drops-with-ocelot-8021q-dsa-tag-protocol'	Paolo Abeni
	Vladimir Oltean says: ==================== Fix PTP packet drops with ocelot-8021q DSA tag protocol Changes in v2: - Distinguish between L2 and L4 PTP packets v1 at: https://lore.kernel.org/netdev/20230626154003.3153076-1-vladimir.oltean@nxp.com/ Patch 3/3 fixes an issue with the ocelot/felix driver, where it would drop PTP traffic on RX unless hardware timestamping for that packet type was enabled. Fixing that requires the driver to know whether it had previously configured the hardware to timestamp PTP packets on that port. But it cannot correctly determine that today using the existing code structure, so patches 1/3 and 2/3 fix the control path of the code such that ocelot->ports[port]->trap_proto faithfully reflects whether that configuration took place. ==================== Link: https://lore.kernel.org/r/20230627163114.3561597-1-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-29	net: dsa: felix: don't drop PTP frames with tag_8021q when RX timestamping ↵	Vladimir Oltean
	is disabled The driver implements a workaround for the fact that it doesn't have an IRQ source to tell it whether PTP frames are available through the extraction registers, for those frames to be processed and passed towards the network stack. That workaround is to configure the switch, through felix_hwtstamp_set() -> felix_update_trapping_destinations(), to create two copies of PTP packets: one sent over Ethernet to the DSA master, and one to be consumed through the aforementioned CPU extraction queue registers. The reason why we want PTP packets to be consumed through the CPU extraction registers in the first place is because we want to see their hardware RX timestamp. With tag_8021q, that is only visible that way, and it isn't visible with the copy of the packet that's transmitted over Ethernet. The problem with the workaround implementation is that it drops the packet received over Ethernet, in expectation of its copy being present in the CPU extraction registers. However, if felix_hwtstamp_set() hasn't run (aka PTP RX timestamping is disabled), the driver will drop the original PTP frame and there will be no copy of it in the CPU extraction registers. So, the network stack will simply not see any PTP frame. Look at the port's trapping configuration to see whether the driver has previously enabled the CPU extraction registers. If it hasn't, just don't RX timestamp the frame and let it be passed up the stack by DSA, which is perfectly fine. Fixes: 0a6f17c6ae21 ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-29	net: mscc: ocelot: don't keep PTP configuration of all ports in single structure	Vladimir Oltean
	In a future change, the driver will need to determine whether PTP RX timestamping is enabled on a port (including whether traps were set up on that port in particular) and that is currently not possible. The driver supports different RX filters (L2, L4) and kinds of TX timestamping (one-step, two-step) on its ports, but it saves all configuration in a single struct hwtstamp_config that is global to the switch. So, the latest timestamping configuration on one port (including a request to disable timestamping) affects what gets reported for all ports, even though the configuration itself is still individual to each port. The port timestamping configurations are only coupled because of the common structure, so replace the hwtstamp_config with a mask of trapped protocols saved per port. We also have the ptp_cmd to distinguish between one-step and two-step PTP timestamping, so with those 2 bits of information we can fully reconstruct a descriptive struct hwtstamp_config for each port, during the SIOCGHWTSTAMP ioctl. Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support") Fixes: 96ca08c05838 ("net: mscc: ocelot: set up traps for PTP packets") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-29	net: mscc: ocelot: don't report that RX timestamping is enabled by default	Vladimir Oltean
	PTP RX timestamping should be enabled when the user requests it, not by default. If it is enabled by default, it can be problematic when the ocelot driver is a DSA master, and it sidesteps what DSA tries to avoid through __dsa_master_hwtstamp_validate(). Additionally, after the change which made ocelot trap PTP packets only to the CPU at ocelot_hwtstamp_set() time, it is no longer even true that RX timestamping is enabled by default, because until ocelot_hwtstamp_set() is called, the PTP traps are actually not set up. So the rx_filter field of ocelot->hwtstamp_config reflects an incorrect reality. Fixes: 96ca08c05838 ("net: mscc: ocelot: set up traps for PTP packets") Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-29	spi: spi-geni-qcom: enable SPI_CONTROLLER_MUST_TX for GPI DMA mode	Dmitry Baryshkov
	The GPI DMA mode requires for TX DMA to be prepared. Force SPI core to provide TX buffer even if the caller didn't provide one by setting the SPI_CONTROLLER_MUST_TX flag. Fixes: b59c122484ec ("spi: spi-geni-qcom: Add support for GPI dma") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20230629095847.3648597-1-dmitry.baryshkov@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-29	arm64: sme: Use STR P to clear FFR context field in streaming SVE mode	Will Deacon
	The FFR is a predicate register which can vary between 16 and 256 bits in size depending upon the configured vector length. When saving the SVE state in streaming SVE mode, the FFR register is inaccessible and so commit 9f5848665788 ("arm64/sve: Make access to FFR optional") simply clears the FFR field of the in-memory context structure. Unfortunately, it achieves this using an unconditional 8-byte store and so if the SME vector length is anything other than 64 bytes in size we will either fail to clear the entire field or, worse, we will corrupt memory immediately following the structure. This has led to intermittent kfence splats in CI [1] and can trigger kmalloc Redzone corruption messages when running the 'fp-stress' kselftest: \| ============================================================================= \| BUG kmalloc-1k (Not tainted): kmalloc Redzone overwritten \| ----------------------------------------------------------------------------- \| \| 0xffff000809bf1e22-0xffff000809bf1e27 @offset=7714. First byte 0x0 instead of 0xcc \| Allocated in do_sme_acc+0x9c/0x220 age=2613 cpu=1 pid=531 \| __kmalloc+0x8c/0xcc \| do_sme_acc+0x9c/0x220 \| ... Replace the 8-byte store with a store of a predicate register which has been zero-initialised with PFALSE, ensuring that the entire field is cleared in memory. [1] https://lore.kernel.org/r/CA+G9fYtU7HsV0R0dp4XEH5xXHSJFw8KyDf5VQrLLfMxWfxQkag@mail.gmail.com Cc: Mark Brown <broonie@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Fixes: 9f5848665788 ("arm64/sve: Make access to FFR optional") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Tested-by: Anders Roxell <anders.roxell@linaro.org> Link: https://lore.kernel.org/r/20230628155605.22296-1-will@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-29	Merge branch 'net-sched-act_ipt-bug-fixes'	Paolo Abeni
	Florian Westphal says: ==================== net/sched: act_ipt bug fixes v3: prefer skb_header() helper in patch 2. No other changes. I've retained Acks and RvB-Tags of v2. While checking if netfilter could be updated to replace selected instances of NF_DROP with kfree_skb_reason+NF_STOLEN to improve debugging info via drop monitor I found that act_ipt is incompatible with such an approach. Moreover, it lacks multiple sanity checks to avoid certain code paths that make assumptions that the tc layer doesn't meet, such as header sanity checks, availability of skb_dst, skb_nfct() and so on. act_ipt test in the tc selftest still pass with this applied. I think that we should consider removal of this module, while this should take care of all problems, its ipv4 only and I don't think there are any netfilter targets that lack a native tc equivalent, even when ignoring bpf. ==================== Link: https://lore.kernel.org/r/20230627123813.3036-1-fw@strlen.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-29	net/sched: act_ipt: zero skb->cb before calling target	Florian Westphal
	xtables relies on skb being owned by ip stack, i.e. with ipv4 check in place skb->cb is supposed to be IPCB. I don't see an immediate problem (REJECT target cannot be used anymore now that PRE/POSTROUTING hook validation has been fixed), but better be safe than sorry. A much better patch would be to either mark act_ipt as "depends on BROKEN" or remove it altogether. I plan to do this for -next in the near future. This tc extension is broken in the sense that tc lacks an equivalent of NF_STOLEN verdict. With NF_STOLEN, target function takes complete ownership of skb, caller cannot dereference it anymore. ACT_STOLEN cannot be used for this: it has a different meaning, caller is allowed to dereference the skb. At this time NF_STOLEN won't be returned by any targets as far as I can see, but this may change in the future. It might be possible to work around this via list of allowed target extensions known to only return DROP or ACCEPT verdicts, but this is error prone/fragile. Existing selftest only validates xt_LOG and act_ipt is restricted to ipv4 so I don't think this action is used widely. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-29	net/sched: act_ipt: add sanity checks on skb before calling target	Florian Westphal
	Netfilter targets make assumptions on the skb state, for example iphdr is supposed to be in the linear area. This is normally done by IP stack, but in act_ipt case no such checks are made. Some targets can even assume that skb_dst will be valid. Make a minimum effort to check for this: - Don't call the targets eval function for non-ipv4 skbs. - Don't call the targets eval function for POSTROUTING emulation when the skb has no dst set. v3: use skb_protocol helper (Davide Caratti) Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-29	net/sched: act_ipt: add sanity checks on table name and hook locations	Florian Westphal
	Looks like "tc" hard-codes "mangle" as the only supported table name, but on kernel side there are no checks. This is wrong. Not all xtables targets are safe to call from tc. E.g. "nat" targets assume skb has a conntrack object assigned to it. Normally those get called from netfilter nat core which consults the nat table to obtain the address mapping. "tc" userspace either sets PRE or POSTROUTING as hook number, but there is no validation of this on kernel side, so update netlink policy to reject bogus numbers. Some targets may assume skb_dst is set for input/forward hooks, so prevent those from being used. act_ipt uses the hook number in two places: 1. the state hook number, this is fine as-is 2. to set par.hook_mask The latter is a bit mask, so update the assignment to make xt_check_target() to the right thing. Followup patch adds required checks for the skb/packet headers before calling the targets evaluation function. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-29	sctp: fix potential deadlock on &net->sctp.addr_wq_lock	Chengfeng Ye
	As &net->sctp.addr_wq_lock is also acquired by the timer sctp_addr_wq_timeout_handler() in protocal.c, the same lock acquisition at sctp_auto_asconf_init() seems should disable irq since it is called from sctp_accept() under process context. Possible deadlock scenario: sctp_accept() -> sctp_sock_migrate() -> sctp_auto_asconf_init() -> spin_lock(&net->sctp.addr_wq_lock) <timer interrupt> -> sctp_addr_wq_timeout_handler() -> spin_lock_bh(&net->sctp.addr_wq_lock); (deadlock here) This flaw was found using an experimental static analysis tool we are developing for irq-related deadlock. The tentative patch fix the potential deadlock by spin_lock_bh(). Signed-off-by: Chengfeng Ye <dg573847474@gmail.com> Fixes: 34e5b0118685 ("sctp: delay auto_asconf init until binding the first addr") Acked-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/20230627120340.19432-1-dg573847474@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-29	ACPI: bus: Constify acpi_companion_match() returned value	Andy Shevchenko
	acpi_companion_match() doesn't alter the contents of the passed parameter, so we don't expect that returned value can be altered either. So constify it. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-29	net: lan743x: Don't sleep in atomic context	Moritz Fischer
	dev_set_rx_mode() grabs a spin_lock, and the lan743x implementation proceeds subsequently to go to sleep using readx_poll_timeout(). Introduce a helper wrapping the readx_poll_timeout_atomic() function and use it to replace the calls to readx_polL_timeout(). Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Cc: stable@vger.kernel.org Cc: Bryan Whitehead <bryan.whitehead@microchip.com> Cc: UNGLinuxDriver@microchip.com Signed-off-by: Moritz Fischer <moritzf@google.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20230627035000.1295254-1-moritzf@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-29	ACPI: scan: Move acpi_root to internal header	Andy Shevchenko
	Compiler is not happy about handling of acpi_root variable: ...drivers/acpi/bus.c:37:20: warning: symbol 'acpi_root' was not declared. Should it be static? Move it's definition to the internal header. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-29	media: wl128x: fix a clang warning	Mauro Carvalho Chehab
	Clang-16 produces this warning, which is fatal with CONFIG_WERROR: ../drivers/media/radio/wl128x/fmdrv_common.c:1237:19: error: variable 'cmd_cnt' set but not used [-Werror,-Wunused-but-set-variable] int ret, fw_len, cmd_cnt; ^ 1 error generated. What happens is that cmd_cnt tracks the amount of firmware data packets were transfered, which is printed only when debug is used. Switch to use the firmware count, as the message is all about reporting a partial firmware transfer. Link: https://lore.kernel.org/linux-media/6badd27ebfa718d5737f517f18b29a3e0f6e43f8.1687981726.git.mchehab@kernel.org Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2023-06-29	ALSA: hda/realtek: Add quirk for Clevo NPx0SNx	Werner Sembach
	This applies a SND_PCI_QUIRK(...) to the Clevo NPx0SNx barebones fixing the microphone not being detected on the headset combo port. Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20230628155434.584159-1-wse@tuxedocomputers.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-06-29	swiotlb: reduce the number of areas to match actual memory pool size	Petr Tesarik
	Although the desired size of the SWIOTLB memory pool is increased in swiotlb_adjust_nareas() to match the number of areas, the actual allocation may be smaller, which may require reducing the number of areas. For example, Xen uses swiotlb_init_late(), which in turn uses the page allocator. On x86, page size is 4 KiB and MAX_ORDER is 10 (1024 pages), resulting in a maximum memory pool size of 4 MiB. This corresponds to 2048 slots of 2 KiB each. The minimum area size is 128 (IO_TLB_SEGSIZE), allowing at most 2048 / 128 = 16 areas. If num_possible_cpus() is greater than the maximum number of areas, areas are smaller than IO_TLB_SEGSIZE and contiguous groups of free slots will span multiple areas. When allocating and freeing slots, only one area will be properly locked, causing race conditions on the unlocked slots and ultimately data corruption, kernel hangs and crashes. Fixes: 20347fca71a3 ("swiotlb: split up the global swiotlb lock") Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-06-29	swiotlb: always set the number of areas before allocating the pool	Petr Tesarik
	The number of areas defaults to the number of possible CPUs. However, the total number of slots may have to be increased after adjusting the number of areas. Consequently, the number of areas must be determined before allocating the memory pool. This is even explained with a comment in swiotlb_init_remap(), but swiotlb_init_late() adjusts the number of areas after slots are already allocated. The areas may end up being smaller than IO_TLB_SEGSIZE, which breaks per-area locking. While fixing swiotlb_init_late(), move all relevant comments before the definition of swiotlb_adjust_nareas() and convert them to kernel-doc. Fixes: 20347fca71a3 ("swiotlb: split up the global swiotlb lock") Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-06-28	ksmbd: avoid field overflow warning	Arnd Bergmann
	clang warns about a possible field overflow in a memcpy: In file included from fs/smb/server/smb_common.c:7: include/linux/fortify-string.h:583:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning] __write_overflow_field(p_size_field, size); It appears to interpret the "&out[baselen + 4]" as referring to a single byte of the character array, while the equivalen "out + baselen + 4" is seen as an offset into the array. I don't see that kind of warning elsewhere, so just go with the simple rework. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-28	Merge branch 'expand-stack'	Linus Torvalds
	This modifies our user mode stack expansion code to always take the mmap_lock for writing before modifying the VM layout. It's actually something we always technically should have done, but because we didn't strictly need it, we were being lazy ("opportunistic" sounds so much better, doesn't it?) about things, and had this hack in place where we would extend the stack vma in-place without doing the proper locking. And it worked fine. We just needed to change vm_start (or, in the case of grow-up stacks, vm_end) and together with some special ad-hoc locking using the anon_vma lock and the mm->page_table_lock, it all was fairly straightforward. That is, it was all fine until Ruihan Li pointed out that now that the vma layout uses the maple tree code, we really don't just change vm_start and vm_end any more, and the locking really is broken. Oops. It's not actually all _that_ horrible to fix this once and for all, and do proper locking, but it's a bit painful. We have basically three different cases of stack expansion, and they all work just a bit differently: - the common and obvious case is the page fault handling. It's actually fairly simple and straightforward, except for the fact that we have something like 24 different versions of it, and you end up in a maze of twisty little passages, all alike. - the simplest case is the execve() code that creates a new stack. There are no real locking concerns because it's all in a private new VM that hasn't been exposed to anybody, but lockdep still can end up unhappy if you get it wrong. - and finally, we have GUP and page pinning, which shouldn't really be expanding the stack in the first place, but in addition to execve() we also use it for ptrace(). And debuggers do want to possibly access memory under the stack pointer and thus need to be able to expand the stack as a special case. None of these cases are exactly complicated, but the page fault case in particular is just repeated slightly differently many many times. And ia64 in particular has a fairly complicated situation where you can have both a regular grow-down stack _and_ a special grow-up stack for the register backing store. So to make this slightly more manageable, the bulk of this series is to first create a helper function for the most common page fault case, and convert all the straightforward architectures to it. Thus the new 'lock_mm_and_find_vma()' helper function, which ends up being used by x86, arm, powerpc, mips, riscv, alpha, arc, csky, hexagon, loongarch, nios2, sh, sparc32, and xtensa. So we not only convert more than half the architectures, we now have more shared code and avoid some of those twisty little passages. And largely due to this common helper function, the full diffstat of this series ends up deleting more lines than it adds. That still leaves eight architectures (ia64, m68k, microblaze, openrisc, parisc, s390, sparc64 and um) that end up doing 'expand_stack()' manually because they are doing something slightly different from the normal pattern. Along with the couple of special cases in execve() and GUP. So there's a couple of patches that first create 'locked' helper versions of the stack expansion functions, so that there's a obvious path forward in the conversion. The execve() case is then actually pretty simple, and is a nice cleanup from our old "grow-up stackls are special, because at execve time even they grow down". The #ifdef CONFIG_STACK_GROWSUP in that code just goes away, because it's just more straightforward to write out the stack expansion there manually, instead od having get_user_pages_remote() do it for us in some situations but not others and have to worry about locking rules for GUP. And the final step is then to just convert the remaining odd cases to a new world order where 'expand_stack()' is called with the mmap_lock held for reading, but where it might drop it and upgrade it to a write, only to return with it held for reading (in the success case) or with it completely dropped (in the failure case). In the process, we remove all the stack expansion from GUP (where dropping the lock wouldn't be ok without special rules anyway), and add it in manually to __access_remote_vm() for ptrace(). Thanks to Adrian Glaubitz and Frank Scheiner who tested the ia64 cases. Everything else here felt pretty straightforward, but the ia64 rules for stack expansion are really quite odd and very different from everything else. Also thanks to Vegard Nossum who caught me getting one of those odd conditions entirely the wrong way around. Anyway, I think I want to actually move all the stack expansion code to a whole new file of its own, rather than have it split up between mm/mmap.c and mm/memory.c, but since this will have to be backported to the initial maple tree vma introduction anyway, I tried to keep the patches _fairly_ minimal. Also, while I don't think it's valid to expand the stack from GUP, the final patch in here is a "warn if some crazy GUP user wants to try to expand the stack" patch. That one will be reverted before the final release, but it's left to catch any odd cases during the merge window and release candidates. Reported-by: Ruihan Li <lrh2000@pku.edu.cn> * branch 'expand-stack': gup: add warning if some caller would seem to want stack expansion mm: always expand the stack with the mmap write lock held execve: expand new process stack manually ahead of time mm: make find_extend_vma() fail if write lock not held powerpc/mm: convert coprocessor fault to lock_mm_and_find_vma() mm/fault: convert remaining simple cases to lock_mm_and_find_vma() arm/mm: Convert to using lock_mm_and_find_vma() riscv/mm: Convert to using lock_mm_and_find_vma() mips/mm: Convert to using lock_mm_and_find_vma() powerpc/mm: Convert to using lock_mm_and_find_vma() arm64/mm: Convert to using lock_mm_and_find_vma() mm: make the page fault mmap locking killable mm: introduce new 'lock_mm_and_find_vma()' page fault helper
2023-06-28	scsi: ufs: core: Remove unused function declaration	Keoseong Park
	Commit 2468da61ea09 ("scsi: ufs: core: mcq: Configure operation and runtime interface") added ufshcd_mcq_select_mcq_mode(), but it's not used anywhere. So remove it. Signed-off-by: Keoseong Park <keosung.park@samsung.com> Link: https://lore.kernel.org/r/20230627012931epcms2p76f458e0b2ce8a591b56bbcc6a2f1a3bb@epcms2p7 Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-28	scsi: target: docs: Remove tcm_mod_builder.py	Rong Tao
	This script is not used and requires additional development to sync with the SCSI target code. Signed-off-by: Rong Tao <rongtao@cestc.cn> Link: https://lore.kernel.org/r/tencent_58D7935159C421036421B42CD04B0A959207@qq.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-28	scsi: target: iblock: Quiet bool conversion warning with pr_preempt use	Mike Christie
	We want to pass in true for pr_preempt's argument if we are doing a PRO_PREEMPT_AND_ABORT, so just test sa against PRO_PREEMPT_AND_ABORT, and pass the result directly to pr_preempt. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202306221655.Kwtqi1gI-lkp@intel.com/ Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230623161136.6270-1-michael.christie@oracle.com Reviewed-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-28	scsi: dt-bindings: ufs: qcom: Fix ICE phandle	Abel Vesa
	The check for 'qcom,ice' property is wrong. Fix it by checking using if-required clause and expand the clocks minItems and maxItems for platforms where 'qcom,ice' is not required so that it includes platforms with single reg entry and clocks that do not provide an ICE one. Fixes: 29a6d1215b7c ("scsi: ufs: dt-bindings: qcom: Add ICE phandle") Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/20230623113009.2512206-2-abel.vesa@linaro.org Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-28	scsi: core: Simplify scsi_cdl_check_cmd()	Damien Le Moal
	Reading the 800+ pages of SPC often leads to a brain shutdown and to less than ideal code... This resulted in the checks of the rwcdlp and cdlp fields in scsi_cdl_check_cmd() to have identical if-else branches. Replace this with a comment describing the cases we are interested in and replace the if-else code block with a simple test of the cdlp field that is used as the function return value. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@inria.fr> Closes: https://lore.kernel.org/r/202306221657.BJHEADkz-lkp@intel.com/ Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20230623073057.816199-1-dlemoal@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>