linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-05-08	Merge tag '6.9-rc7-ksmbd-fixes' of git://git.samba.org/ksmbd	Linus Torvalds
	Pull smb server fixes from Steve French: "Five ksmbd server fixes, all also for stable - Three fixes related to SMB3 leases (fixes two xfstests, and a locking issue) - Unitialized variable fix - Socket creation fix when bindv6only is set" * tag '6.9-rc7-ksmbd-fixes' of git://git.samba.org/ksmbd: ksmbd: do not grant v2 lease if parent lease key and epoch are not set ksmbd: use rwsem instead of rwlock for lease break ksmbd: avoid to send duplicate lease break notifications ksmbd: off ipv6only for both ipv4/ipv6 binding ksmbd: fix uninitialized symbol 'share' in smb2_tree_connect()
2024-05-08	Merge tag 'fuse-fixes-6.9-final' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: "Two one-liner fixes for issues introduced in -rc1" * tag 'fuse-fixes-6.9-final' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: virtiofs: include a newline in sysfs tag fuse: verify zero padding in fuse_backing_map
2024-05-08	Merge tag 'exfat-for-6.9-rc8' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat fixes from Namjae Jeon: - Fix xfstests generic/013 test failure with dirsync mount option - Initialize the reserved fields of deleted file and stream extension dentries to zero * tag 'exfat-for-6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: zero the reserved fields of file and stream extension dentries exfat: fix timing of synchronizing bitmap and inode
2024-05-08	fscrypt: try to avoid refing parent dentry in fscrypt_file_open	Mateusz Guzik
	Merely checking if the directory is encrypted happens for every open when using ext4, at the moment refing and unrefing the parent, costing 2 atomics and serializing opens of different files. The most common case of encryption not being used can be checked for with RCU instead. Sample result from open1_processes -t 20 ("Separate file open/close") from will-it-scale on Sapphire Rapids (ops/s): before: 12539898 after: 25575494 (+103%) v2: - add a comment justifying rcu usage, submitted by Eric Biggers - whack spurious IS_ENCRYPTED check from the refed case Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20240508081400.422212-1-mjguzik@gmail.com Signed-off-by: Eric Biggers <ebiggers@google.com>
2024-05-08	Merge tag 'bcachefs-2024-05-07.2' of https://evilpiepirate.org/git/bcachefs	Linus Torvalds
	Pull bcachefs fixes from Kent Overstreet: - Various syzbot fixes; mainly small gaps in validation - Fix an integer overflow in fiemap() which was preventing filefrag from returning the full list of extents - Fix a refcounting bug on the device refcount, turned up by new assertions in the development branch - Fix a device removal/readd bug; write_super() was repeatedly dropping and retaking bch_dev->io_ref references * tag 'bcachefs-2024-05-07.2' of https://evilpiepirate.org/git/bcachefs: bcachefs: Add missing sched_annotate_sleep() in bch2_journal_flush_seq_async() bcachefs: Fix race in bch2_write_super() bcachefs: BCH_SB_LAYOUT_SIZE_BITS_MAX bcachefs: Add missing skcipher_request_set_callback() call bcachefs: Fix snapshot_t() usage in bch2_fs_quota_read_inode() bcachefs: Fix shift-by-64 in bformat_needs_redo() bcachefs: Guard against unknown k.k->type in __bkey_invalid() bcachefs: Add missing validation for superblock section clean bcachefs: Fix assert in bch2_alloc_v4_invalid() bcachefs: fix overflow in fiemap bcachefs: Add a better limit for maximum number of buckets bcachefs: Fix lifetime issue in device iterator helpers bcachefs: Fix bch2_dev_lookup() refcounting bcachefs: Initialize bch_write_op->failed in inline data path bcachefs: Fix refcount put in sb_field_resize error path bcachefs: Inodes need extra padding for varint_decode_fast() bcachefs: Fix early error path in bch2_fs_btree_key_cache_exit() bcachefs: bucket_pos_to_bp_noerror() bcachefs: don't free error pointers bcachefs: Fix a scheduler splat in __bch2_next_write_buffer_flush_journal_buf()
2024-05-08	Merge tag 'soc-fixes-6.9-3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "These are a couple of last minute fixes that came in over the previous week, addressing: - A pin configuration bug on a qualcomm board that caused issues with ethernet and mmc - Two minor code fixes for misleading console output in the microchip firmware driver - A build warning in the sifive cache driver" * tag 'soc-fixes-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: firmware: microchip: clarify that sizes and addresses are in hex firmware: microchip: don't unconditionally print validation success arm64: dts: qcom: sa8155p-adp: fix SDHC2 CD pin configuration cache: sifive_ccache: Silence unused variable warning
2024-05-08	bpf: guard BPF_NO_PRESERVE_ACCESS_INDEX in skb_pkt_end.c	Jose E. Marchesi
	This little patch is a follow-up to: https://lore.kernel.org/bpf/20240507095011.15867-1-jose.marchesi@oracle.com/T/#u The temporary workaround of passing -DBPF_NO_PRESERVE_ACCESS_INDEX when building with GCC triggers a redefinition preprocessor error when building progs/skb_pkt_end.c. This patch adds a guard to avoid redefinition. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Cc: david.faust@oracle.com Cc: cupertino.miranda@oracle.com Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Yonghong Song <yonghong.song@linux.dev> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240508110332.17332-1-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-08	bpf: avoid UB in usages of the __imm_insn macro	Jose E. Marchesi
	[Changes from V2: - no-strict-aliasing is only applied when building with GCC. - cpumask_failure.c is excluded, as it doesn't use __imm_insn.] The __imm_insn macro is defined in bpf_misc.h as: #define __imm_insn(name, expr) [name]"i"((long )&(expr)) This may lead to type-punning and strict aliasing rules violations in it's typical usage where the address of a struct bpf_insn is passed as expr, like in: __imm_insn(st_mem, BPF_ST_MEM(BPF_W, BPF_REG_1, offsetof(struct __sk_buff, mark), 42)) Where: #define BPF_ST_MEM(SIZE, DST, OFF, IMM) \ ((struct bpf_insn) { \ .code = BPF_ST \| BPF_SIZE(SIZE) \| BPF_MEM, \ .dst_reg = DST, \ .src_reg = 0, \ .off = OFF, \ .imm = IMM }) In all the actual instances of this in the BPF selftests the value is fed to a volatile asm statement as soon as it gets read from memory, and thus it is unlikely anti-aliasing rules breakage may lead to misguided optimizations. However, GCC detects the potential problem (indirectly) by issuing a warning stating that a temporary <Uxxxxxx> is used uninitialized, where the temporary corresponds to the memory read by (long ). This patch adds -fno-strict-aliasing to the compilation flags of the particular selftests that do type punning via __imm_insn, only for GCC. Tested in master bpf-next. No regressions. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Cc: david.faust@oracle.com Cc: cupertino.miranda@oracle.com Cc: Yonghong Song <yonghong.song@linux.dev> Cc: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240508103551.14955-1-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-08	bpf: avoid uninitialized warnings in verifier_global_subprogs.c	Jose E. Marchesi
	[Changes from V1: - The warning to disable is -Wmaybe-uninitialized, not -Wuninitialized. - This warning is only supported in GCC.] The BPF selftest verifier_global_subprogs.c contains code that purposedly performs out of bounds access to memory, to check whether the kernel verifier is able to catch them. For example: __noinline int global_unsupp(const int mem) { if (!mem) return 0; return mem[100]; / BOOM */ } With -O1 and higher and no inlining, GCC notices this fact and emits a "maybe uninitialized" warning. This is by design. Note that the emission of these warnings is highly dependent on the precise optimizations that are performed. This patch adds a compiler pragma to verifier_global_subprogs.c to ignore these warnings. Tested in bpf-next master. No regressions. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Cc: david.faust@oracle.com Cc: cupertino.miranda@oracle.com Cc: Yonghong Song <yonghong.song@linux.dev> Cc: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240507184756.1772-1-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-08	fs/coredump: Enable dynamic configuration of max file note size	Allen Pais
	Introduce the capability to dynamically configure the maximum file note size for ELF core dumps via sysctl. Why is this being done? We have observed that during a crash when there are more than 65k mmaps in memory, the existing fixed limit on the size of the ELF notes section becomes a bottleneck. The notes section quickly reaches its capacity, leading to incomplete memory segment information in the resulting coredump. This truncation compromises the utility of the coredumps, as crucial information about the memory state at the time of the crash might be omitted. This enhancement removes the previous static limit of 4MB, allowing system administrators to adjust the size based on system-specific requirements or constraints. Eg: $ sysctl -a \| grep core_file_note_size_limit kernel.core_file_note_size_limit = 4194304 $ sysctl -n kernel.core_file_note_size_limit 4194304 $echo 519304 > /proc/sys/kernel/core_file_note_size_limit $sysctl -n kernel.core_file_note_size_limit 519304 Attempting to write beyond the ceiling value of 16MB $echo 17194304 > /proc/sys/kernel/core_file_note_size_limit bash: echo: write error: Invalid argument Signed-off-by: Vijay Nag <nagvijay@microsoft.com> Signed-off-by: Allen Pais <apais@linux.microsoft.com> Link: https://lore.kernel.org/r/20240506193700.7884-1-apais@linux.microsoft.com Signed-off-by: Kees Cook <keescook@chromium.org>
2024-05-08	Merge tag 'pci-v6.9-fixes-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Update kernel-parameters doc to describe "pcie_aspm=off" more accurately (Bjorn Helgaas) - Restore the parent's (not the child's) ASPM state to the parent during resume, which fixes a reboot during resume (Kai-Heng Feng) * tag 'pci-v6.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI/ASPM: Restore parent state to parent, child state to child PCI/ASPM: Clarify that pcie_aspm=off means leave ASPM untouched
2024-05-08	net: e1000e & ixgbe: Remove PCI_HEADER_TYPE_MFD duplicates	Ilpo Järvinen
	PCI_HEADER_TYPE_MULTIFUNC is define by e1000e and ixgbe and both are unused. There is already PCI_HEADER_TYPE_MFD in pci_regs.h anyway which should be used instead so remove the duplicated defines of it. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-05-08	igc: fix a log entry using uninitialized netdev	Corinna Vinschen
	During successful probe, igc logs this: [ 5.133667] igc 0000:01:00.0 (unnamed net_device) (uninitialized): PHC added ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The reason is that igc_ptp_init() is called very early, even before register_netdev() has been called. So the netdev_info() call works on a partially uninitialized netdev. Fix this by calling igc_ptp_init() after register_netdev(), right after the media autosense check, just as in igb. Add a comment, just as in igb. Now the log message is fine: [ 5.200987] igc 0000:01:00.0 eth0: PHC added Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-05-08	ice: remove correct filters during eswitch release	Michal Swiatkowski
	ice_clear_dflt_vsi() is only removing default rule. Both default RX and TX rule should be removed during release. If it isn't switching to switchdev, second time results in error, because TX filter is already there. Fix it by removing the correct set of rules. Fixes: 50d62022f455 ("ice: default Tx rule instead of to queue") Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-05-08	igb: flower: validate control flags	Asbjørn Sloth Tønnesen
	This driver currently doesn't support any control flags. Use flow_rule_match_has_control_flags() to check for control flags, such as can be set through `tc flower ... ip_flags frag`. In case any control flags are masked, flow_rule_match_has_control_flags() sets a NL extended error message, and we return -EOPNOTSUPP. Only compile-tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-05-08	ice: flower: validate control flags	Asbjørn Sloth Tønnesen
	This driver currently doesn't support any control flags. Use flow_rule_has_control_flags() to check for control flags, such as can be set through `tc flower ... ip_flags frag`. In case any control flags are masked, flow_rule_has_control_flags() sets a NL extended error message, and we return -EOPNOTSUPP. Only compile-tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-05-08	iavf: flower: validate control flags	Asbjørn Sloth Tønnesen
	This driver currently doesn't support any control flags. Use flow_rule_has_control_flags() to check for control flags, such as can be set through `tc flower ... ip_flags frag`. In case any control flags are masked, flow_rule_has_control_flags() sets a NL extended error message, and we return -EOPNOTSUPP. Only compile-tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-05-08	Merge tag 'devfreq-next-for-6.10' of ↵	Rafael J. Wysocki
	git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux Merge devfreq updates for v6.10 from Chanwoo Choi: - Convert the platfrom remove callback to .remove_new ops for following drivers: exyno-nocp.c/exynos-ppmu.c/mtk-cci-devfreq.c/ sun8i-a33-mbus.c/rk3399_dmc.c - Use DEFINE_SIMPLE_PM_OPS for exyno-bus.c driver * tag 'devfreq-next-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux: PM / devfreq: exynos: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions PM / devfreq: rk3399_dmc: Convert to platform remove callback returning void PM / devfreq: sun8i-a33-mbus: Convert to platform remove callback returning void PM / devfreq: mtk-cci: Convert to platform remove callback returning void PM / devfreq: exynos-ppmu: Convert to platform remove callback returning void PM / devfreq: exynos-nocp: Convert to platform remove callback returning void
2024-05-08	i40e: flower: validate control flags	Asbjørn Sloth Tønnesen
	This driver currently doesn't support any control flags. Use flow_rule_has_control_flags() to check for control flags, such as can be set through `tc flower ... ip_flags frag`. In case any control flags are masked, flow_rule_has_control_flags() sets a NL extended error message, and we return -EOPNOTSUPP. Only compile-tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-05-08	m68k: defconfig: Update defconfigs for v6.9-rc1	Geert Uytterhoeven
	- Enable trimming of unused exported kernel symbols, - Drop CONFIG_IP_NF_ARPTABLES=m (auto-enabled since commit 4654467dc7e111e8 ("netfilter: arptables: allow xtables-nft only builds")), - Drop CONFIG_STRING_SELFTEST=m (replaced by auto-modular CONFIG_STRING_KUNIT_TEST in commit 29d8568849fe5937 ("string: Convert selftest to KUnit")), - Drop CONFIG_TEST_STRING_HELPERS=m (replaced by auto-modular CONFIG_STRING_HELPERS_KUNIT_TEST in commit fb57550fcbd86839 ("string: Convert helpers selftest to KUnit")). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/e17b3ac60832a3ff92d25d1a05bf814e8f15d0c5.1711475325.git.geert@linux-m68k.org
2024-05-08	m68k: Move ARCH_HAS_CPU_CACHE_ALIASING	Geert Uytterhoeven
	Move the recently added ARCH_HAS_CPU_CACHE_ALIASING to restore alphabetical sort order. Fixes: 8690bbcf3b7010b3 ("Introduce cpu_dcache_is_aliasing() across all architectures") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/r/4574ad6cc1117e4b5d29812c165bf7f6e5b60773.1714978406.git.geert@linux-m68k.org
2024-05-08	m68k: mac: Fix reboot hang on Mac IIci	Finn Thain
	Calling mac_reset() on a Mac IIci does reset the system, but what follows is a POST failure that requires a manual reset to resolve. Avoid that by using the 68030 asm implementation instead of the C implementation. Apparently the SE/30 has a similar problem as it has used the asm implementation since before git. This patch extends that solution to other systems with a similar ROM. After this patch, the only systems still using the C implementation are 68040 systems where adb_type is either MAC_ADB_IOP or MAC_ADB_II. This implies a 1 MiB Quadra ROM. This now includes the Quadra 900/950, which previously fell through to the "should never get here" catch-all. Reported-and-tested-by: Stan Johnson <userm57@yahoo.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Finn Thain <fthain@linux-m68k.org> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/480ebd1249d229c6dc1f3f1c6d599b8505483fd8.1714797072.git.fthain@linux-m68k.org Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2024-05-08	m68k: Fix spinlock race in kernel thread creation	Michael Schmitz
	Context switching does take care to retain the correct lock owner across the switch from 'prev' to 'next' tasks. This does rely on interrupts remaining disabled for the entire duration of the switch. This condition is guaranteed for normal process creation and context switching between already running processes, because both 'prev' and 'next' already have interrupts disabled in their saved copies of the status register. The situation is different for newly created kernel threads. The status register is set to PS_S in copy_thread(), which does leave the IPL at 0. Upon restoring the 'next' thread's status register in switch_to() aka resume(), interrupts then become enabled prematurely. resume() then returns via ret_from_kernel_thread() and schedule_tail() where run queue lock is released (see finish_task_switch() and finish_lock_switch()). A timer interrupt calling scheduler_tick() before the lock is released in finish_task_switch() will find the lock already taken, with the current task as lock owner. This causes a spinlock recursion warning as reported by Guenter Roeck. As far as I can ascertain, this race has been opened in commit 533e6903bea0 ("m68k: split ret_from_fork(), simplify kernel_thread()") but I haven't done a detailed study of kernel history so it may well predate that commit. Interrupts cannot be disabled in the saved status register copy for kernel threads (init will complain about interrupts disabled when finally starting user space). Disable interrupts temporarily when switching the tasks' register sets in resume(). Note that a simple oriw 0x700,%sr after restoring sr is not enough here - this leaves enough of a race for the 'spinlock recursion' warning to still be observed. Tested on ARAnyM and qemu (Quadra 800 emulation). Fixes: 533e6903bea0 ("m68k: split ret_from_fork(), simplify kernel_thread()") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/all/07811b26-677c-4d05-aeb4-996cd880b789@roeck-us.net Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20240411033631.16335-1-schmitzmic@gmail.com Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2024-05-08	m68k: Let GENERIC_IOMAP depend on HAS_IOPORT	Niklas Schnelle
	In a future patch HAS_IOPORT=n will disable inb()/outb() and friends at compile time. With that choosing dynamically between I/O port and MMIO access via GNERIC_IOMAP will not work. So only select GENERIC_IOMAP when HAS_IOPORT is selected. Co-developed-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20240403122851.38808-2-schnelle@linux.ibm.com Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2024-05-09	PM / devfreq: exynos: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions	Anand Moon
	This macro has the advantage over SET_SYSTEM_SLEEP_PM_OPS that we don't have to care about when the functions are actually used. Also make use of pm_sleep_ptr() to discard all PM_SLEEP related stuff if CONFIG_PM_SLEEP isn't enabled. Link: https://lore.kernel.org/lkml/20240417044459.1908-2-linux.amoon@gmail.com/ Signed-off-by: Anand Moon <linux.amoon@gmail.com> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2024-05-08	Merge branch 'rxrpc-miscellaneous-fixes'	Jakub Kicinski
	David Howells says: ==================== rxrpc: Miscellaneous fixes (part) Here some miscellaneous fixes for AF_RXRPC: (1) Fix the congestion control algorithm to start cwnd at 4 and to not cut ssthresh when the peer cuts its rwind size. (2) Only transmit a single ACK for all the DATA packets glued together into a jumbo packet to reduce the number of ACKs being generated. ==================== Link: https://lore.kernel.org/r/20240503150749.1001323-1-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-08	rxrpc: Only transmit one ACK per jumbo packet received	David Howells
	Only generate one ACK packet for all the subpackets in a jumbo packet. If we would like to generate more than one ACK, we prioritise them base on their reason code, in the order, highest first: OutOfSeq > NoSpace > ExceedsWin > Duplicate > Requested > Delay > Idle For the first four, we reference the lowest offending subpacket; for the last three, the highest. This reduces the number of ACKs we end up transmitting to one per UDP packet transmitted to reduce network loading and packet parsing. Fixes: 5d7edbc9231e ("rxrpc: Get rid of the Rx ring") Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Reviewed-by: Jeffrey Altman <jaltman@auristor.com <mailto:jaltman@auristor.com>> Link: https://lore.kernel.org/r/20240503150749.1001323-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-08	rxrpc: Fix congestion control algorithm	David Howells
	Make the following fixes to the congestion control algorithm: (1) Don't vary the cwnd starting value by the size of RXRPC_TX_SMSS since that's currently held constant - set to the size of a jumbo subpacket payload so that we can create jumbo packets on the fly. The current code invariably picks 3 as the starting value. Further, the starting cwnd needs to be an even number because we ack every other packet, so set it to 4. (2) Don't cut ssthresh when we see an ACK come from the peer with a receive window (rwind) less than ssthresh. ssthresh keeps track of characteristics of the connection whereas rwind may be reduced by the peer for any reason - and may be reduced to 0. Fixes: 1fc4fa2ac93d ("rxrpc: Fix congestion management") Fixes: 0851115090a3 ("rxrpc: Reduce ssthresh to peer's receive window") Signed-off-by: David Howells <dhowells@redhat.com> Suggested-by: Simon Wilkinson <sxw@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Reviewed-by: Jeffrey Altman <jaltman@auristor.com <mailto:jaltman@auristor.com>> Link: https://lore.kernel.org/r/20240503150749.1001323-2-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-08	PM / devfreq: rk3399_dmc: Convert to platform remove callback returning void	Uwe Kleine-König
	The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2024-05-08	PM / devfreq: sun8i-a33-mbus: Convert to platform remove callback returning void	Uwe Kleine-König
	The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2024-05-08	PM / devfreq: mtk-cci: Convert to platform remove callback returning void	Uwe Kleine-König
	The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2024-05-08	PM / devfreq: exynos-ppmu: Convert to platform remove callback returning void	Uwe Kleine-König
	The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2024-05-08	PM / devfreq: exynos-nocp: Convert to platform remove callback returning void	Uwe Kleine-König
	The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2024-05-08	bpf, arm64: Add support for lse atomics in bpf_arena	Puranjay Mohan
	When LSE atomics are available, BPF atomic instructions are implemented as single ARM64 atomic instructions, therefore it is easy to enable these in bpf_arena using the currently available exception handling setup. LL_SC atomics use loops and therefore would need more work to enable in bpf_arena. Enable LSE atomics based instructions in bpf_arena and use the bpf_jit_supports_insn() callback to reject atomics in bpf_arena if LSE atomics are not available. All atomics and arena_atomics selftests are passing: [root@ip-172-31-2-216 bpf]# ./test_progs -a atomics,arena_atomics #3/1 arena_atomics/add:OK #3/2 arena_atomics/sub:OK #3/3 arena_atomics/and:OK #3/4 arena_atomics/or:OK #3/5 arena_atomics/xor:OK #3/6 arena_atomics/cmpxchg:OK #3/7 arena_atomics/xchg:OK #3 arena_atomics:OK #10/1 atomics/add:OK #10/2 atomics/sub:OK #10/3 atomics/and:OK #10/4 atomics/or:OK #10/5 atomics/xor:OK #10/6 atomics/cmpxchg:OK #10/7 atomics/xchg:OK #10 atomics:OK Summary: 2/14 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20240426161116.441-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-08	io_uring/filetable: don't unnecessarily clear/reset bitmap	Jens Axboe
	If we're updating an existing slot, we clear the slot bitmap only to set it again right after. Just leave the bit set rather than toggle it off and on, and move the unused slot setting into the branch of not already having a file occupy this slot. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-05-08	selftests: test_bridge_neigh_suppress.sh: Fix failures due to duplicate MAC	Ido Schimmel
	When creating the topology for the test, three veth pairs are created in the initial network namespace before being moved to one of the network namespaces created by the test. On systems where systemd-udev uses MACAddressPolicy=persistent (default since systemd version 242), this will result in some net devices having the same MAC address since they were created with the same name in the initial network namespace. In turn, this leads to arping / ndisc6 failing since packets are dropped by the bridge's loopback filter. Fix by creating each net device in the correct network namespace instead of moving it there from the initial network namespace. Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20240426074015.251854d4@kernel.org/ Fixes: 7648ac72dcd7 ("selftests: net: Add bridge neighbor suppression test") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20240507113033.1732534-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-08	nvmet-rdma: fix possible bad dereference when freeing rsps	Sagi Grimberg
	It is possible that the host connected and saw a cm established event and started sending nvme capsules on the qp, however the ctrl did not yet see an established event. This is why the rsp_wait_list exists (for async handling of these cmds, we move them to a pending list). Furthermore, it is possible that the ctrl cm times out, resulting in a connect-error cm event. in this case we hit a bad deref [1] because in nvmet_rdma_free_rsps we assume that all the responses are in the free list. We are freeing the cmds array anyways, so don't even bother to remove the rsp from the free_list. It is also guaranteed that we are not racing anything when we are releasing the queue so no other context accessing this array should be running. [1]: -- Workqueue: nvmet-free-wq nvmet_rdma_free_queue_work [nvmet_rdma] [...] pc : nvmet_rdma_free_rsps+0x78/0xb8 [nvmet_rdma] lr : nvmet_rdma_free_queue_work+0x88/0x120 [nvmet_rdma] Call trace: nvmet_rdma_free_rsps+0x78/0xb8 [nvmet_rdma] nvmet_rdma_free_queue_work+0x88/0x120 [nvmet_rdma] process_one_work+0x1ec/0x4a0 worker_thread+0x48/0x490 kthread+0x158/0x160 ret_from_fork+0x10/0x18 -- Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-05-08	x86/irq: Use existing helper for pending vector check	Jacob Pan
	lapic_vector_set_in_irr() is already available, use it for checking pending vectors at the local APIC. No functional change. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Imran Khan <imran.f.khan@oracle.com> Link: https://lore.kernel.org/r/20240506175612.1141095-1-jacob.jun.pan@linux.intel.com
2024-05-08	nvmet: prevent sprintf() overflow in nvmet_subsys_nsid_exists()	Dan Carpenter
	The nsid value is a u32 that comes from nvmet_req_find_ns(). It's endian data and we're on an error path and both of those raise red flags. So let's make this safer. 1) Make the buffer large enough for any u32. 2) Remove the unnecessary initialization. 3) Use snprintf() instead of sprintf() for even more safety. 4) The sprintf() function returns the number of bytes printed, not counting the NUL terminator. It is impossible for the return value to be <= 0 so delete that. Fixes: 505363957fad ("nvmet: fix nvme status code when namespace is disabled") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-05-08	erofs: clean up z_erofs_load_full_lcluster()	Gao Xiang
	Only four lcluster types here, remove redundant code. No real logic changes. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20240508123357.3266173-1-hsiangkao@linux.alibaba.com
2024-05-08	thermal: intel: hfi: Increase the number of CPU capabilities per netlink event	Ricardo Neri
	The number of updated CPU capabilities per netlink event is hard-coded to 16. On systems with more than 16 CPUs (a common case), it takes more than one thermal netlink event to relay all the new capabilities after an HFI interrupt. This adds unnecessary overhead to both the kernel and user space entities. Increase the number of CPU capabilities updated per event to 64. Any system with 64 CPUs or less can now update all the capabilities in a single thermal netlink event. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08	thermal: intel: hfi: Rename HFI_MAX_THERM_NOTIFY_COUNT	Ricardo Neri
	When processing a hardware update, HFI generates as many thermal netlink events as needed to relay all the updated CPU capabilities to user space. The constant HFI_MAX_THERM_NOTIFY_COUNT is the number of CPU capabilities updated per each of those events. Give this constant a more descriptive name. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08	thermal: intel: hfi: Shorten the thermal netlink event delay to 100ms	Ricardo Neri
	The delay between an HFI interrupt and its corresponding thermal netlink event has so far been hard-coded to CONFIG_HZ jiffies (1 second). This delay is too long for hardware that generates updates every tens of milliseconds. The HFI driver uses a delayed workqueue to send thermal netlink events. No subsequent events will be sent if there is pending work. As a result, much of the information of consecutive hardware updates will be lost if the workqueue delay is too long. User space entities may act on obsolete data. If the delay is too short, multiple events may overwhelm listeners. Set the delay to 100ms to strike a balance between too many and too few events. Use milliseconds instead of jiffies to improve readability. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08	thermal: intel: hfi: Rename HFI_UPDATE_INTERVAL	Ricardo Neri
	The name of the constant HFI_UPDATE_INTERVAL is misleading. It is not a periodic interval at which HFI updates are processed. It is the delay in the processing of an HFI update after the arrival of an HFI interrupt. Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08	cpufreq: amd-pstate: fix the highest frequency issue which limits performance	Perry Yuan
	To address the performance drop issue, an optimization has been implemented. The incorrect highest performance value previously set by the low-level power firmware for AMD CPUs with Family ID 0x19 and Model ID ranging from 0x70 to 0x7F series has been identified as the cause. To resolve this, a check has been implemented to accurately determine the CPU family and model ID. The correct highest performance value is now set and the performance drop caused by the incorrect highest performance value are eliminated. Before the fix, the highest frequency was set to 4200MHz, now it is set to 4971MHz which is correct. CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ MINMHZ MHZ 0 0 0 0 0:0:0:0 yes 4971.0000 400.0000 400.0000 1 0 0 0 0:0:0:0 yes 4971.0000 400.0000 400.0000 2 0 0 1 1:1:1:0 yes 4971.0000 400.0000 4865.8140 3 0 0 1 1:1:1:0 yes 4971.0000 400.0000 400.0000 Fixes: f3a052391822 ("cpufreq: amd-pstate: Enable amd-pstate preferred core support") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218759 Signed-off-by: Perry Yuan <perry.yuan@amd.com> Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Gaha Bana <gahabana@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08	test: hsr: Call cleanup_all_ns when hsr_redbox.sh script exits	Lukasz Majewski
	Without this change the created netns instances are not cleared after this script execution. To fix this problem the cleanup_all_ns function from ../lib.sh is called. Signed-off-by: Lukasz Majewski <lukma@denx.de> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08	ax25: Remove superfuous "return" from ax25_ds_set_timer	Joel Granados
	Remove the explicit call to "return" in the void ax25_ds_set_timer function that was introduced in 78a7b5dbc060 ("ax.25: x.25: Remove the now superfluous sentinel elements from ctl_table array"). Signed-off-by: Joel Granados <j.granados@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08	ipvs: allow some sysctls in non-init user namespaces	Alexander Mikhalitsyn
	Let's make all IPVS sysctls writtable even when network namespace is owned by non-initial user namespace. Let's make a few sysctls to be read-only for non-privileged users: - sync_qlen_max - sync_sock_size - run_estimation - est_cpulist - est_nice I'm trying to be conservative with this to prevent introducing any security issues in there. Maybe, we can allow more sysctls to be writable, but let's do this on-demand and when we see real use-case. This patch is motivated by user request in the LXC project [1]. Having this can help with running some Kubernetes [2] or Docker Swarm [3] workloads inside the system containers. Link: https://github.com/lxc/lxc/issues/4278 [1] Link: https://github.com/kubernetes/kubernetes/blob/b722d017a34b300a2284b890448e5a605f21d01e/pkg/proxy/ipvs/proxier.go#L103 [2] Link: https://github.com/moby/libnetwork/blob/3797618f9a38372e8107d8c06f6ae199e1133ae8/osl/namespace_linux.go#L682 [3] Cc: Julian Anastasov <ja@ssi.bg> Cc: Simon Horman <horms@verge.net.au> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Jozsef Kadlecsik <kadlec@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08	ipvs: add READ_ONCE barrier for ipvs->sysctl_amemthresh	Alexander Mikhalitsyn
	Cc: Julian Anastasov <ja@ssi.bg> Cc: Simon Horman <horms@verge.net.au> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Jozsef Kadlecsik <kadlec@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Suggested-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08	ipv6: Fix potential uninit-value access in __ip6_make_skb()	Shigeru Yoshida
	As it was done in commit fc1092f51567 ("ipv4: Fix uninit-value access in __ip_make_skb()") for IPv4, check FLOWI_FLAG_KNOWN_NH on fl6->flowi6_flags instead of testing HDRINCL on the socket to avoid a race condition which causes uninit-value access. Fixes: ea30388baebc ("ipv6: Fix an uninit variable access bug in __ip6_make_skb()") Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>