git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message	Author
2025-02-20	selftests: drv-net: add missing new line in xdp_helper	Jakub Kicinski
	Kurt and Joe report missing new line at the end of Usage. Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-20	selftests: drv-net: use cfg.rpath() in netlink xsk attr test	Jakub Kicinski
	The cfg.rpath() helper was recently added to make formatting paths for helper binaries easier. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-20	selftests: drv-net: add a warning for bkg + shell + terminate	Jakub Kicinski
	Joe Damato reports that some shells will fork before running the command when python does "sh -c $cmd", while bash on my machine does an exec of $cmd directly. This will have implications for our ability to terminate the child process on various configurations of bash and other shells. Warn about using bkg(... shell=True, terminate=True); most background commands can hopefully exit cleanly (exit_wait). Link: https://lore.kernel.org/Z7Yld21sv_Ip3gQx@LQ3V64L9R2 Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
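The warning described above can be sketched in a few lines. This is a minimal illustration, not the selftest library's code: `start_background` and its parameters are hypothetical names, and only the "warn when shell and terminate are combined" idea comes from the commit message.

```python
import subprocess
import warnings


def start_background(cmd, shell=False, terminate=False):
    """Start cmd in the background (hypothetical helper, not the real bkg class)."""
    if shell and terminate:
        # With shell=True Python runs "sh -c cmd". Some shells fork before
        # running cmd, so terminating the Popen object may only stop the shell
        # and leave the real command running.
        warnings.warn(
            "shell=True with terminate=True may not stop the child; "
            "prefer a command that exits cleanly",
            stacklevel=2,
        )
    return subprocess.Popen(cmd, shell=shell)


# A command that exits on its own needs no termination and triggers no warning.
proc = start_background("exit 0", shell=True, terminate=False)
proc.wait()
```

Passing the command as an argument list with `shell=False` avoids the intermediate shell entirely, which is why the combination is only warned about rather than forbidden in this sketch.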
2025-02-20	Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	Linus Torvalds
	Pull BPF fixes from Daniel Borkmann: - Fix a soft-lockup in BPF arena_map_free on 64k page size kernels (Alan Maguire) - Fix a missing allocation failure check in BPF verifier's acquire_lock_state (Kumar Kartikeya Dwivedi) - Fix a NULL-pointer dereference in trace_kfree_skb by adding kfree_skb to the raw_tp_null_args set (Kuniyuki Iwashima) - Fix a deadlock when freeing BPF cgroup storage (Abel Wu) - Fix a syzbot-reported deadlock when holding BPF map's freeze_mutex (Andrii Nakryiko) - Fix a use-after-free issue in bpf_test_init when eth_skb_pkt_type is accessing skb data not containing an Ethernet header (Shigeru Yoshida) - Fix skipping non-existing keys in generic_map_lookup_batch (Yan Zhai) - Several BPF sockmap fixes to address incorrect TCP copied_seq calculations, which prevented correct data reads from recv(2) in user space (Jiayuan Chen) - Two fixes for BPF map lookup nullness elision (Daniel Xu) - Fix a NULL-pointer dereference from vmlinux BTF lookup in bpf_sk_storage_tracing_allowed (Jared Kangas) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests: bpf: test batch lookup on array of maps with holes bpf: skip non exist keys in generic_map_lookup_batch bpf: Handle allocation failure in acquire_lock_state bpf: verifier: Disambiguate get_constant_map_key() errors bpf: selftests: Test constant key extraction on irrelevant maps bpf: verifier: Do not extract constant map keys for irrelevant maps bpf: Fix softlockup in arena_map_free on 64k page kernel net: Add rx_skb of kfree_skb to raw_tp_null_args[]. 
bpf: Fix deadlock when freeing cgroup storage selftests/bpf: Add strparser test for bpf selftests/bpf: Fix invalid flag of recv() bpf: Disable non stream socket for strparser bpf: Fix wrong copied_seq calculation strparser: Add read_sock callback bpf: avoid holding freeze_mutex during mmap operation bpf: unify VM_WRITE vs VM_MAYWRITE use in BPF map mmaping logic selftests/bpf: Adjust data size to have ETH_HLEN bpf, test_run: Fix use-after-free issue in eth_skb_pkt_type() bpf: Remove unnecessary BTF lookups in bpf_sk_storage_tracing_allowed
2025-02-20	selftests/bpf: Add launch time request to xdp_hw_metadata	Song Yoong Siang
	Add launch time hardware offload request to xdp_hw_metadata. Users can configure the delta of launch time relative to HW RX-time using the "-l" argument. By default, the delta is set to 0 ns, which means the launch time is disabled. By setting the delta to a non-zero value, the launch time hardware offload feature will be enabled and requested. Additionally, users can configure the Tx Queue to be enabled with the launch time hardware offload using the "-L" argument. By default, Tx Queue 0 will be used. Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250216093430.957880-3-yoong.siang.song@intel.com
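The "-l" semantics described above can be modelled as follows. This is a rough sketch under the assumption that the requested launch time is simply the hardware RX time plus the configured delta; `launch_time_ns` is a hypothetical name, not a function from xdp_hw_metadata.

```python
def launch_time_ns(hw_rx_time_ns, delta_ns):
    """Return the launch time to request, or None when the feature is disabled.

    A delta of 0 ns (the default) means no launch time is requested.
    """
    if delta_ns == 0:
        return None
    return hw_rx_time_ns + delta_ns


# Example: a packet received at t=1000 ns with a 500 ns delta.
requested = launch_time_ns(1000, 500)
```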
2025-02-20	xsk: Add launch time hardware offload support to XDP Tx metadata	Song Yoong Siang
	Extend the XDP Tx metadata framework so that users can request launch time hardware offload, where the Ethernet device will schedule the packet for transmission at a pre-determined time called launch time. The value of launch time is communicated from user space to the Ethernet driver via the launch_time field of struct xsk_tx_metadata. Suggested-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250216093430.957880-2-yoong.siang.song@intel.com
2025-02-20	selftests/bpf: Add simple bpf tests in the tx path for timestamping feature	Jason Xing
	The BPF program calculates a couple of latency deltas between the tx timestamping callbacks. It can be used in the real world to diagnose the kernel behaviour in the tx path. Check the safety issues by accessing a few bpf calls in bpf_test_access_bpf_calls(), which are implemented in patches 3 and 4. Check if the bpf timestamping can co-exist with socket timestamping. There remain a few realistic things[1][2] to highlight: 1. in general a packet may pass through multiple qdiscs. For instance with bonding or tunnel virtual devices in the egress path. 2. packets may be resent, in which case an ACK might precede a repeat SCHED and SND. 3. erroneous or malicious peers may also just never send an ACK. [1]: https://lore.kernel.org/all/67a389af981b0_14e0832949d@willemb.c.googlers.com.notmuch/ [2]: https://lore.kernel.org/all/c329a0c1-239b-4ca1-91f2-cb30b8dd2f6a@linux.dev/ Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-13-kerneljasonxing@gmail.com
2025-02-20	bpf: Add BPF_SOCK_OPS_TSTAMP_SENDMSG_CB callback	Jason Xing
	This patch introduces a new callback in tcp_tx_timestamp() to correlate the tcp_sendmsg timestamp with timestamps from other tx timestamping callbacks (e.g., SND/SW/ACK). Without this patch, a BPF program wouldn't know which timestamps belong to which flow because there is no socket lock protection. The new callback is inserted in tcp_tx_timestamp() to address this issue because tcp_tx_timestamp() still holds the same socket lock as tcp_sendmsg_locked(), and tcp_tx_timestamp() also initializes the timestamping related fields for the skb, especially tskey. The tskey is the bridge to do the correlation. For TCP, the BPF program hooks the beginning of tcp_sendmsg_locked() and then stores the sendmsg timestamp in the bpf_sk_storage, correlating this timestamp with its tskey, which is later used in other sending timestamping callbacks. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-11-kerneljasonxing@gmail.com
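The tskey-based correlation can be pictured with a small bookkeeping model. This is a plain Python illustration, not BPF code: the dictionary stands in for bpf_sk_storage, and the function names `on_sendmsg` and `on_tx_callback` are hypothetical.

```python
# Maps a tskey to the timestamp recorded when sendmsg was entered.
sendmsg_ts_by_tskey = {}


def on_sendmsg(tskey, now_ns):
    """Remember when the send for this tskey started."""
    sendmsg_ts_by_tskey[tskey] = now_ns


def on_tx_callback(tskey, now_ns):
    """Return the latency since sendmsg for this tskey, or None if unknown."""
    start_ns = sendmsg_ts_by_tskey.get(tskey)
    if start_ns is None:
        return None
    return now_ns - start_ns


# A later callback (for example SCHED, SND or ACK) looks up the same tskey.
on_sendmsg(tskey=7, now_ns=100)
delta_ns = on_tx_callback(tskey=7, now_ns=250)
```

Without a shared key such as tskey, the later callbacks would have no reliable way to tell which sendmsg a timestamp belongs to.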
2025-02-20	bpf: Add BPF_SOCK_OPS_TSTAMP_ACK_CB callback	Jason Xing
	Support the ACK case for bpf timestamping. Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_ACK_CB. This callback will occur at the same timestamping point as the user space's SCM_TSTAMP_ACK. The BPF program can use it to get the same SCM_TSTAMP_ACK timestamp without modifying the user-space application. This patch extends txstamp_ack to two bits: 1 stands for SO_TIMESTAMPING mode, 2 bpf extension. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-10-kerneljasonxing@gmail.com
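The two-bit encoding of txstamp_ack can be illustrated as a pair of flags. The values 1 and 2 come from the commit message; the constant and function names below are hypothetical and chosen only for this sketch.

```python
# Bit values from the commit message; the names are illustrative only.
TXSTAMP_ACK_SOCKET = 1  # SO_TIMESTAMPING mode
TXSTAMP_ACK_BPF = 2     # BPF extension


def ack_consumers(txstamp_ack):
    """List which timestamping users asked for an ACK timestamp."""
    consumers = []
    if txstamp_ack & TXSTAMP_ACK_SOCKET:
        consumers.append("socket")
    if txstamp_ack & TXSTAMP_ACK_BPF:
        consumers.append("bpf")
    return consumers


# Both users can be active on the same skb at once.
both = ack_consumers(TXSTAMP_ACK_SOCKET | TXSTAMP_ACK_BPF)
```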
2025-02-20	bpf: Add BPF_SOCK_OPS_TSTAMP_SND_HW_CB callback	Jason Xing
	Support hw SCM_TSTAMP_SND case for bpf timestamping. Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_SND_HW_CB. This callback will occur at the same timestamping point as the user space's hardware SCM_TSTAMP_SND. The BPF program can use it to get the same SCM_TSTAMP_SND timestamp without modifying the user-space application. To avoid increasing the code complexity, replace SKBTX_HW_TSTAMP with SKBTX_HW_TSTAMP_NOBPF instead of changing numerous callers from driver side using SKBTX_HW_TSTAMP. The new definition of SKBTX_HW_TSTAMP means the combination tests of socket timestamping and bpf timestamping. After this patch, drivers can work under the bpf timestamping. Considering some drivers don't assign the skb with hardware timestamp, this patch does the assignment and then BPF program can acquire the hwstamp from skb directly. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-9-kerneljasonxing@gmail.com
2025-02-20	bpf: Add BPF_SOCK_OPS_TSTAMP_SND_SW_CB callback	Jason Xing
	Support sw SCM_TSTAMP_SND case for bpf timestamping. Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_SND_SW_CB. This callback will occur at the same timestamping point as the user space's software SCM_TSTAMP_SND. The BPF program can use it to get the same SCM_TSTAMP_SND timestamp without modifying the user-space application. Based on this patch, the BPF program will get the software timestamp when the driver is ready to send the skb. In the subsequent patch, the hardware timestamp will be supported. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-8-kerneljasonxing@gmail.com
2025-02-20	bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback	Jason Xing
	Support SCM_TSTAMP_SCHED case for bpf timestamping. Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_SCHED_CB. This callback will occur at the same timestamping point as the user space's SCM_TSTAMP_SCHED. The BPF program can use it to get the same SCM_TSTAMP_SCHED timestamp without modifying the user-space application. A new SKBTX_BPF flag is added to mark skb_shinfo(skb)->tx_flags, ensuring that the new BPF timestamping and the current user space's SO_TIMESTAMPING do not interfere with each other. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-7-kerneljasonxing@gmail.com
2025-02-20	bpf: Add networking timestamping support to bpf_get/setsockopt()	Jason Xing
	The new SK_BPF_CB_FLAGS and new SK_BPF_CB_TX_TIMESTAMPING are added to bpf_get/setsockopt. The later patches will implement the BPF networking timestamping. The BPF program will use bpf_setsockopt(SK_BPF_CB_FLAGS, SK_BPF_CB_TX_TIMESTAMPING) to enable the BPF networking timestamping on a socket. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-2-kerneljasonxing@gmail.com
2025-02-20	tools/nolibc: add support for 32-bit s390	Thomas Weißschuh
	32-bit s390 is very close to the existing 64-bit implementation. Some special handling is necessary as there is neither LLVM nor QEMU support. Also the kernel itself can not build natively for 32-bit s390, so instead the test program is executed with a 64-bit kernel. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250206-nolibc-s390-v2-2-991ad97e3d58@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-02-20	selftests/nolibc: rename s390 to s390x	Thomas Weißschuh
	Support for 32-bit s390 is about to be added. As "s39032" would look horrible, use another naming scheme. 32-bit s390 is "s390" and 64-bit s390 is "s390x", similar to how it is handled in various toolchain components. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250206-nolibc-s390-v2-1-991ad97e3d58@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-02-20	selftests/nolibc: only run constructor tests on nolibc	Thomas Weißschuh
	The nolibc testsuite can be run against other libcs to test for interoperability. Some aspects of constructor execution are not standardized, and musl does not provide all tested features; for one, it does not provide arguments to the constructors (anymore?). Skip the constructor tests on non-nolibc configurations. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250212-nolibc-test-constructor-v1-1-c963875b3da4@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-02-20	selftests/tracing: Allow some more tests to run in instances	Steven Rostedt
	The tests: trigger-action-hist-xfail.tc trigger-onchange-action-hist.tc trigger-snapshot-action-hist.tc trigger-hist-expressions.tc can all run in an instance. Test them in an instance as well. Link: https://lore.kernel.org/r/20250220185846.451234966@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-02-20	selftests/ftrace: Clean up triggers after setting them	Steven Rostedt
	The triggers set in trigger-onchange-action-hist.tc and trigger-snapshot-action-hist.tc are not cleaned up at the end. These tests can also be done in instances and without cleaning up the triggers, the instances can not be removed as they are still "busy". Link: https://lore.kernel.org/r/20250220185846.291817731@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-02-20	selftests/tracing: Test only toplevel README file not the instances	Steven Rostedt
	For the tests that have both a README attribute as well as the instance flag to run the tests as an instance, the instance version will always exit with UNSUPPORTED. That's because the instance directory does not contain a README file. Currently, the tests check for a README file in the directory that the test runs in and if there's a requirement for something to be present in the README file, it will not find it, as the instance directory doesn't have it. Have the tests check if the current directory is an instance directory, and if it is, check two directories above the current directory for the README file: /sys/kernel/tracing/README /sys/kernel/tracing/instances/foo/../../README Link: https://lore.kernel.org/r/20250220185846.130216270@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
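The README lookup described above can be sketched as a small path helper. How the real test scripts detect an instance directory is not stated in the commit message, so this sketch assumes an instance is any directory whose parent is named `instances`; `readme_path` is a hypothetical name.

```python
from pathlib import Path


def readme_path(tracing_dir):
    """Return where the README should be looked up for a tracing directory."""
    directory = Path(tracing_dir)
    # Assumption: instance directories live directly under ".../instances/".
    if directory.parent.name == "instances":
        # Instances have no README of their own; use the top-level one.
        return directory / ".." / ".." / "README"
    return directory / "README"


top_level = readme_path("/sys/kernel/tracing")
instance = readme_path("/sys/kernel/tracing/instances/foo")
```

Both results refer to the same file once the kernel resolves the `..` components.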
2025-02-20	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.14-rc4). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-20	Merge tag 'net-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Linus Torvalds
	Pull networking fixes from Paolo Abeni: "Smaller than usual with no fixes from any subtree. Current release - regressions: - core: fix race of rtnl_net_lock(dev_net(dev)) Previous releases - regressions: - core: remove the single page frag cache for good - flow_dissector: fix handling of mixed port and port-range keys - sched: cls_api: fix error handling causing NULL dereference - tcp: - adjust rcvq_space after updating scaling ratio - drop secpath at the same time as we currently drop dst - eth: gtp: suppress list corruption splat in gtp_net_exit_batch_rtnl(). Previous releases - always broken: - vsock: - fix variables initialization during resuming - for connectible sockets allow only connected - eth: - geneve: fix use-after-free in geneve_find_dev() - ibmvnic: don't reference skb after sending to VIOS" * tag 'net-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits) Revert "net: skb: introduce and use a single page frag cache" net: allow small head cache usage with large MAX_SKB_FRAGS values nfp: bpf: Add check for nfp_app_ctrl_msg_alloc() tcp: drop secpath at the same time as we currently drop dst net: axienet: Set mac_managed_pm arp: switch to dev_getbyhwaddr() in arp_req_set_public() net: Add non-RCU dev_getbyhwaddr() helper sctp: Fix undefined behavior in left shift operation selftests/bpf: Add a specific dst port matching flow_dissector: Fix port range key handling in BPF conversion selftests/net/forwarding: Add a test case for tc-flower of mixed port and port-range flow_dissector: Fix handling of mixed port and port-range keys geneve: Suppress list corruption splat in geneve_destroy_tunnels(). gtp: Suppress list corruption splat in gtp_net_exit_batch_rtnl(). dev: Use rtnl_net_dev_lock() in unregister_netdev(). net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). net: Add net_passive_inc() and net_passive_dec().
net: pse-pd: pd692x0: Fix power limit retrieval MAINTAINERS: trim the GVE entry gve: set xdp redirect target only when it is available ...
2025-02-20	cpupower: monitor: Exit with error status if execvp() fail	Yiwei Lin
	If we give an invalid command to idle_monitor for monitoring, execvp() will fail and execution will continue on the next line. As a result, we'll see two different monitoring outputs. For example, running `cpupower monitor -i 5 invalidcmd` where `invalidcmd` is not executable. Link: https://lore.kernel.org/r/20250220163846.2765-1-s921975628@gmail.com Signed-off-by: Yiwei Lin <s921975628@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
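The failure mode is the classic fork/exec pitfall: if exec fails and the child does not exit, the child keeps running the parent's code. A minimal Python analogue is shown below; it is not the cpupower C code, `run_child` is a hypothetical name, and the exit status 1 is chosen only for illustration.

```python
import os


def run_child(argv):
    """Fork, exec argv in the child, and return the child's exit status."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # exec failed: exit immediately with an error status so the child
            # never falls through into the parent's code path.
            os._exit(1)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

Calling `run_child` with a command that cannot be executed then yields a non-zero status instead of a second copy of the caller's output.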
2025-02-20	tools/memory-model: Define effect of Mb tags on RMWs in tools/...	Jonas Oberhauser
	Herd7 transforms successful RMW with Mb tags by inserting smp_mb() fences around them. We emulate this by considering imaginary po-edges before the RMW read and before the RMW write, and extending the smp_mb() ordering rule, which currently only applies to real po edges that would be found around a really inserted smp_mb(), also to cases of the only imagined po edges. Reported-by: Viktor Vafeiadis <viktor@mpi-sws.org> Suggested-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Tested-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-20	tools/memory-model: Define applicable tags on operation in tools/...	Jonas Oberhauser
	Herd7 transforms reads, writes, and read-modify-writes by eliminating 'acquire tags from writes, 'release tags from reads, and 'acquire, 'release, and 'mb tags from failed read-modify-writes. We emulate this behavior by redefining Acquire, Release, and Mb sets in linux-kernel.bell to explicitly exclude those combinations. Herd7 furthermore adds 'noreturn tag to certain reads. Currently herd7 does not allow specifying the 'noreturn tag manually, but such manual declaration (e.g., through a syntax __atomic_op{noreturn}) would add invalid 'noreturn tags to writes; in preparation, we already also exclude this combination. Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Tested-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-20	tools/memory-model: Legitimize current use of tags in LKMM macros	Jonas Oberhauser
	The current macros in linux-kernel.def reference instructions such as __xchg{mb} or __cmpxchg{acquire}, which are invalid combinations of tags and instructions according to the declarations in linux-kernel.bell. This works with current herd7 because herd7 removes these tags anyways and does not actually enforce validity of combinations at all. If a future herd7 version no longer applies these hardcoded transformations, then all currently invalid combinations will actually appear on some instruction. We therefore adjust the declarations to make the resulting combinations valid, by adding the 'mb tag to the set of Accesses and allowing all Accesses to appear on all read, write, and RMW instructions. Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Tested-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-20	tools/memory-model: Add atomic_andnot() with its variants	Puranjay Mohan
	Pull-855[1] added the support of atomic_andnot() to the herd tool. Use this to add the implementation in the LKMM. All of the ordering variants are also added. Here is a small litmus-test that uses this operation: C andnot { atomic_t u = ATOMIC_INIT(7); } P0(atomic_t u) { r0 = atomic_fetch_andnot(3, u); r1 = READ_ONCE(u); } exists (0:r0=7 /\ 0:r1=4) Test andnot Allowed States 1 0:r0=7; 0:r1=4; Ok Witnesses Positive: 1 Negative: 0 Condition exists (0:r0=7 /\ 0:r1=4) Observation andnot Always 1 0 Time andnot 0.00 Hash=78f011a0b5a0c65fa1cf106fcd62c845 [1] https://github.com/herd/herdtools7/pull/855 Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Andrea Parri <parri.andrea@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Will Deacon <will@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jade Alglave <j.alglave@ucl.ac.uk> Cc: Luc Maranget <luc.maranget@inria.fr> Cc: Akira Yokosawa <akiyks@gmail.com> Cc: Daniel Lustig <dlustig@nvidia.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: <linux-arch@vger.kernel.org>
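The arithmetic behind the litmus test's single allowed state can be checked by hand: andnot clears the bits set in the mask, so 7 & ~3 is 4, and the fetch variant returns the old value 7. The following Python model shows only that arithmetic, without any atomicity or ordering; the function below is a stand-in written for this sketch, not the kernel primitive.

```python
def atomic_fetch_andnot(mask, value):
    """Model of fetch-andnot: return (old value, new value = old & ~mask)."""
    return value, value & ~mask


# Mirrors the litmus test: u starts at 7 and the mask is 3.
r0, r1 = atomic_fetch_andnot(3, 7)
```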
2025-02-20	tools/memory-model: Add atomic_and()/or()/xor() and add_negative	Puranjay Mohan
	Pull-849[1] added the support of '&', '\|', and '^' to the herd7 tool's atomics operations. Use these in linux-kernel.def to implement atomic_and()/or()/xor() with all their ordering variants. atomic_add_negative() is already available so add its acquire, release, and relaxed ordering variants. [1] https://github.com/herd/herdtools7/pull/849 Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Andrea Parri <parri.andrea@gmail.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Will Deacon <will@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jade Alglave <j.alglave@ucl.ac.uk> Cc: Luc Maranget <luc.maranget@inria.fr> Cc: Akira Yokosawa <akiyks@gmail.com> Cc: Daniel Lustig <dlustig@nvidia.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: <linux-arch@vger.kernel.org>
2025-02-20	selftests/nsfs: add ioctl validation tests	Christian Brauner
	Add simple tests to validate that non-nsfs ioctls are rejected. Link: https://lore.kernel.org/r/20250219-work-nsfs-v1-2-21128d73c5e8@kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-20	tools/power turbostat: Add idle governor statistics reporting	Artem Bityutskiy
	The idle governor provides the following per-idle state sysfs files: * above - Indicates overshoots, where a more shallow state should have been requested (if available and enabled). * below - Indicates undershoots, where a deeper state should have been requested (if available and enabled). These files offer valuable insights into how effectively the Linux kernel idle governor selects idle states for a given workload. This commit adds support for these files in turbostat. Expose the contents of these files with the following naming convention: * C1: The number of times the C1 state was requested (existing counter). * C1+: The number of times the idle governor selected C1, but a deeper idle state should have been selected instead. * C1-: The number of times the idle governor selected C1, but a shallower idle state should have been selected instead. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
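The naming convention maps the two sysfs counters onto column suffixes: "below" (a deeper state was wanted) becomes the "+" column and "above" (a shallower state was wanted) becomes the "-" column. A minimal sketch of that mapping follows; `idle_columns` is a hypothetical helper and the counter values are made up for the example.

```python
def idle_columns(state_name, usage, above, below):
    """Map per-state idle counters to turbostat-style column names."""
    return {
        state_name: usage,        # times the state was requested
        state_name + "+": below,  # undershoots: deeper state should have been used
        state_name + "-": above,  # overshoots: shallower state should have been used
    }


# Illustrative numbers only.
columns = idle_columns("C1", usage=100, above=7, below=3)
```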
2025-02-19	selftests: drv-net: add a simple TSO test	Jakub Kicinski
	Add a simple test for TSO. Send a few MB of data and check device stats to verify that the device was performing segmentation. Do the same thing over a few tunnel types. Injecting GSO packets directly would give us more ability to test corner cases, but perhaps starting simple is good enough? # ./ksft-net-drv/drivers/net/hw/tso.py # Detected qstat for LSO wire-packets KTAP version 1 1..14 ok 1 tso.ipv4 # SKIP Test requires IPv4 connectivity ok 2 tso.vxlan4_ipv4 # SKIP Test requires IPv4 connectivity ok 3 tso.vxlan6_ipv4 # SKIP Test requires IPv4 connectivity ok 4 tso.vxlan_csum4_ipv4 # SKIP Test requires IPv4 connectivity ok 5 tso.vxlan_csum6_ipv4 # SKIP Test requires IPv4 connectivity ok 6 tso.gre4_ipv4 # SKIP Test requires IPv4 connectivity ok 7 tso.gre6_ipv4 # SKIP Test requires IPv4 connectivity ok 8 tso.ipv6 ok 9 tso.vxlan4_ipv6 ok 10 tso.vxlan6_ipv6 ok 11 tso.vxlan_csum4_ipv6 ok 12 tso.vxlan_csum6_ipv6 # Testing with mangleid enabled ok 13 tso.gre4_ipv6 ok 14 tso.gre6_ipv6 # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:7 error:0 Note that the test currently depends on the driver reporting the LSO count via qstat, which appears to be relatively rare (virtio, cisco/enic, sfc/efc; but virtio needs host support). Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250218225426.77726-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19	selftests: drv-net: store addresses in dict indexed by ipver	Jakub Kicinski
	Looks like more and more tests want to iterate over IP version, run the same test over ipv4 and ipv6. The current naming of members in the env class makes it a bit awkward, we have separate members for ipv4 and ipv6 parameters. Store the parameters inside dicts, so that tests can easily index them with ip version. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250218225426.77726-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
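The benefit of indexing by IP version is that a test can loop over versions instead of duplicating code per address family. The sketch below assumes string keys "4" and "6" and uses a hypothetical `Env` class with an `addr_v` member; the real env class may name or shape things differently.

```python
class Env:
    """Hypothetical stand-in for the test environment class."""

    def __init__(self, v4=None, v6=None):
        # One dict indexed by IP version instead of separate v4/v6 members.
        self.addr_v = {"4": v4, "6": v6}


def configured_versions(env):
    """Return the IP versions for which an address is configured."""
    return [ipver for ipver in ("4", "6") if env.addr_v[ipver]]


# Documentation-range address used purely as an example.
env = Env(v6="2001:db8::1")
versions = configured_versions(env)
```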
2025-02-19	selftests: drv-net: get detailed interface info	Jakub Kicinski
	We already record output of ip link for NETIF in env for easy access. Record the detailed version. TSO test will want to know the max tso size. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250218225426.77726-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19	selftests: drv-net: resolve remote interface name	Jakub Kicinski
	Find out and record in env the name of the interface which remote host will use for the IP address provided via config. Interface name is useful for mausezahn and for setting up tunnels. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250218225426.77726-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19	selftests: net: Fix minor typos in MPTCP and psock tests	Suchit
	Fixes minor spelling errors: - `simult_flows.sh`: "al testcases" -> "all testcases" - `psock_tpacket.c`: "accross" -> "across" Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250218165923.20740-1-suchitkarunakaran@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19	selftests/bpf: Add a specific dst port matching	Cong Wang
	After this patch: #102/1 flow_dissector_classification/ipv4:OK #102/2 flow_dissector_classification/ipv4_continue_dissect:OK #102/3 flow_dissector_classification/ipip:OK #102/4 flow_dissector_classification/gre:OK #102/5 flow_dissector_classification/port_range:OK #102/6 flow_dissector_classification/ipv6:OK #102 flow_dissector_classification:OK Summary: 1/6 PASSED, 0 SKIPPED, 0 FAILED Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Link: https://patch.msgid.link/20250218043210.732959-5-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19	selftests/net/forwarding: Add a test case for tc-flower of mixed port and port-range	Cong Wang
	After this patch: # ./tc_flower_port_range.sh TEST: Port range matching - IPv4 UDP [ OK ] TEST: Port range matching - IPv4 TCP [ OK ] TEST: Port range matching - IPv6 UDP [ OK ] TEST: Port range matching - IPv6 TCP [ OK ] TEST: Port range matching - IPv4 UDP Drop [ OK ] Cc: Qiang Zhang <dtzq01@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250218043210.732959-3-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19	selftests: fib_rule_tests: Add port mask match tests	Ido Schimmel
	Add tests for FIB rules that match on source and destination ports with a mask. Test both good and bad flows. # ./fib_rule_tests.sh IPv6 FIB rule tests [...] TEST: rule6 check: sport and dport redirect to table [ OK ] TEST: rule6 check: sport and dport no redirect to table [ OK ] TEST: rule6 del by pref: sport and dport redirect to table [ OK ] TEST: rule6 check: sport and dport range redirect to table [ OK ] TEST: rule6 check: sport and dport range no redirect to table [ OK ] TEST: rule6 del by pref: sport and dport range redirect to table [ OK ] TEST: rule6 check: sport and dport masked redirect to table [ OK ] TEST: rule6 check: sport and dport masked no redirect to table [ OK ] TEST: rule6 del by pref: sport and dport masked redirect to table [ OK ] [...] Tests passed: 292 Tests failed: 0 Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250217134109.311176-9-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19	selftests: fib_rule_tests: Add port range match tests	Ido Schimmel
	Currently, only matching on specific ports is tested. Add port range testing to make sure this use case does not regress. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250217134109.311176-8-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19	selftests/bpf: Add tests for bpf_copy_from_user_task_str	Jordan Rome
	This adds tests for both the happy path and the error path (with and without the BPF_F_PAD_ZEROS flag). Signed-off-by: Jordan Rome <linux@jordanrome.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250213152125.1837400-3-linux@jordanrome.com
2025-02-19	selftests/bpf: Enable kprobe_multi tests for ARM64	Alexis Lothoré (eBPF Foundation)
	The kprobe_multi feature was disabled on ARM64 due to the lack of fprobe support. The fprobe rewrite on function_graph has been recently merged and thus brought support for fprobes on arm64. This then enables kprobe_multi support on arm64, and so the corresponding tests can now be run on this architecture. Remove the tests depending on kprobe_multi from DENYLIST.aarch64 to allow those to run in CI. CONFIG_FPROBE is already correctly set in tools/testing/selftests/bpf/config Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250219-enable_kprobe_multi_tests-v1-1-faeec99240c8@bootlin.com
2025-02-19	libbpf: Wrap libbpf API direct err with libbpf_err	Tao Chen
	Wrap the direct err with libbpf_err to keep consistency with other APIs. Signed-off-by: Tao Chen <chen.dylane@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20250219153711.29651-1-chen.dylane@linux.dev
2025-02-19	perf tools: Improve startup time by reducing unnecessary stat() calls	Krzysztof Łopatowski
	When testing perf trace on NixOS, I noticed significant startup delays: - `ls`: ~2ms - `strace ls`: ~10ms - `perf trace ls`: ~550ms Profiling showed that 51% of the time is spent reading files, 26% in loading BPF programs, and 11% in `newfstatat`. This patch optimizes module path exploration by avoiding `stat()` calls unless necessary. For filesystems that do not implement `d_type` (DT_UNKNOWN), it falls back to the old behavior. See `readdir(3)` for details. This reduces `perf trace ls` time to ~500ms. A more thorough startup optimization based on command parameters would be ideal, but that is a larger effort. Signed-off-by: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com> Acked-by: Howard Chu <howardchu95@gmail.com> Link: https://lore.kernel.org/r/20250206113314.335376-2-krzysztof.m.lopatowski@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
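	The d_type idea in the commit above can be illustrated outside of perf. The following is a minimal Python sketch, not perf's C code: os.scandir() exposes the entry type that the directory read already returned, so DirEntry.is_dir() normally needs no extra system call and only falls back to a stat() call when the filesystem reports an unknown type. The function name list_subdirs is hypothetical.

```python
import os


def list_subdirs(path):
    """Return the sorted names of the directories directly under path."""
    names = []
    with os.scandir(path) as entries:
        for entry in entries:
            # is_dir() reuses the entry type from the directory read when
            # the filesystem provides it; a stat() call is only made as a
            # fallback when the type is unknown.
            if entry.is_dir(follow_symlinks=False):
                names.append(entry.name)
    return sorted(names)
```

	Calling list_subdirs() on a module tree would return only its subdirectories, which is roughly the decision the module path exploration has to make for each entry.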
2025-02-19	perf report: Fix input reload/switch with symbol sort key	Dmitry Vyukov
	Currently the code checks that there is no "ipc" in the sort order and adds an ipc string. This will always error out on the second pass after input reload/switch, since the sort order already contains "ipc". Do the ipc check/fixup only on the first pass. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Link: https://lore.kernel.org/r/20250108063628.215577-1-dvyukov@google.com Fixes: ec6ae74fe8f0 ("perf report: Display average IPC and IPC coverage per symbol") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19	perf report: Support switching data w/ and w/o callchains	Namhyung Kim
	The symbol_conf.use_callchain should be reset when switching to new data file, otherwise report__setup_sample_type() will show an error message that it enabled callchains but no callchain data. The function also will turn on the callchains if the data has PERF_SAMPLE_CALLCHAIN so I think it's ok to reset symbol_conf.use_callchain here. Link: https://lore.kernel.org/r/20250211060745.294289-2-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19	perf report: Switch data file correctly in TUI	Namhyung Kim
	The 's' key switches to a new data file and loads the data in the same window. switch_data_file() shows a popup menu to select which data file the user wants and updates the 'input_name' global variable. But cmd_report() didn't update data.path using the new 'input_name' and kept using the old file. This is a fairly old bug and I assume people don't use this feature much. :) Link: https://lore.kernel.org/r/20250211060745.294289-1-namhyung@kernel.org Closes: https://lore.kernel.org/linux-perf-users/89e678bc-f0af-4929-a8a6-a2666f1294a4@linaro.org Fixes: f5fc14124c5cefdd ("perf tools: Add data object to handle perf data file") Reported-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19	perf tools: Fix up some comments and code to properly use the event_source bus	Greg Kroah-Hartman
	In sysfs, the perf events are all located in /sys/bus/event_source/devices/ but some places ended up hard-coding the location to be at the root of /sys/devices/ which could be very risky as you do not exactly know what type of device you are accessing in sysfs at that location. So fix this all up by properly pointing everything at the bus device list instead of the root of the sysfs devices/ tree. Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/2025021955-implant-excavator-179d@gregkh Signed-off-by: Namhyung Kim <namhyung@kernel.org>
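	As an illustration of the path convention the commit above describes, here is a small Python sketch that builds a PMU's sysfs directory under the event_source bus and not under the root of /sys/devices/. The helper name pmu_sysfs_dir and the sysfs_root parameter are hypothetical and are not taken from the perf sources.

```python
import os

# Bus-relative location of perf event sources, as named in the commit message.
EVENT_SOURCE_DEVICES = "bus/event_source/devices"


def pmu_sysfs_dir(pmu_name, sysfs_root="/sys"):
    """Build the sysfs directory of a PMU under the event_source bus."""
    return os.path.join(sysfs_root, EVENT_SOURCE_DEVICES, pmu_name)
```

	Going through the bus directory means every entry found there is known to be a perf event source, which is the guarantee the commit message says is missing at the root of the devices/ tree.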
2025-02-19	perf list: Also append PMU name in verbose mode	James Clark
	When listing in verbose mode, the long description is used but the PMU name isn't appended. There doesn't seem to be a reason to exclude it when asking for more information, so use the same print block for both long and short descriptions. Before: $ perf list -v ... inst_retired [Instruction architecturally executed] After: $ perf list -v ... inst_retired [Instruction architecturally executed. Unit: armv8_cortex_a57] Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250219151622.1097289-1-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19	perf vendor events arm64: Fix incorrect CPU_CYCLE in metrics expr	Yangyu Chen
	Some existing metrics for Neoverse N3 and V3 expressions use CPU_CYCLE to represent the number of cycles, but this is incorrect. The correct event to use is CPU_CYCLES. I encountered this issue while working on a patch to add pmu events for Cortex A720 and A520 by reusing the existing patch for Neoverse N3 and V3 by James Clark [1] and my check script [2] reported this issue. [1] https://lore.kernel.org/lkml/20250122163504.2061472-1-james.clark@linaro.org/ [2] https://github.com/cyyself/arm-pmu-check Signed-off-by: Yangyu Chen <cyy@cyyself.name> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/tencent_D4ED18476ADCE818E31084C60E3E72C14907@qq.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19	pm: cpupower: bench: Prevent NULL dereference on malloc failure	Zhongqiu Han
	If malloc returns NULL due to low memory, 'config' pointer can be NULL. Add a check to prevent NULL dereference. Link: https://lore.kernel.org/r/20250219122715.3892223-1-quic_zhonhan@quicinc.com Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-02-19	selftests/bpf: Add rto max for bpf_setsockopt test	Jason Xing
	Test the TCP_RTO_MAX_MS optname in the existing setget_sockopt test. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250219081333.56378-3-kerneljasonxing@gmail.com