summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2025-02-22selftests/lam: Test get_user() LAM pointer handlingMaciej Wieczor-Retman
Recent change in how get_user() handles pointers: https://lore.kernel.org/all/20241024013214.129639-1-torvalds@linux-foundation.org/ has a specific case for LAM. It assigns a different bitmask that's later used to check whether a pointer comes from userland in get_user(). Add test case to LAM that utilizes a ioctl (FIOASYNC) syscall which uses get_user() in its implementation. Execute the syscall with differently tagged pointers to verify that valid user pointers are passing through and invalid kernel/non-canonical pointers are not. Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/1624d9d1b9502517053a056652d50dc5d26884ac.1737990375.git.maciej.wieczor-retman@intel.com
2025-02-22selftests/lam: Skip test if LAM is disabledMaciej Wieczor-Retman
Until LASS is merged into the kernel: https://lore.kernel.org/all/20241028160917.1380714-1-alexander.shishkin@linux.intel.com/ LAM is left disabled in the config file. Running the LAM selftest with disabled LAM only results in unhelpful output. Use one of LAM syscalls() to determine whether the kernel was compiled with LAM support (CONFIG_ADDRESS_MASKING) or not. Skip running the tests in the latter case. Merge CPUID checking function with the one mentioned above to achieve a single function that shows LAM's availability from both CPU and the kernel. Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/251d0f45f6a768030115e8d04bc85458910cb0dc.1737990375.git.maciej.wieczor-retman@intel.com
2025-02-22selftests/lam: Move cpu_has_la57() to use cpuinfo flagMaciej Wieczor-Retman
In current form cpu_has_la57() reports platform's support for LA57 through reading the output of cpuid. A much more useful information is whether 5-level paging is actually enabled on the running system. Check whether 5-level paging is enabled by trying to map a page in the high linear address space. Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/8b1ca51b13e6d94b5a42b6930d81b692cbb0bcbb.1737990375.git.maciej.wieczor-retman@intel.com
2025-02-21selftests: fib_nexthops: do not mark skipped tests as failedHangbin Liu
The current test marks all unexpected return values as failed and sets ret to 1. If a test is skipped, the entire test also returns 1, incorrectly indicating failure. To fix this, add a skipped variable and set ret to 4 if it was previously 0. Otherwise, keep ret set to 1. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20250220085326.1512814-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21selftests: fib_rule_tests: Add DSCP mask match testsIdo Schimmel
Add tests for FIB rules that match on DSCP with a mask. Test both good and bad flows and both the input and output paths. # ./fib_rule_tests.sh IPv6 FIB rule tests [...] TEST: rule6 check: dscp redirect to table [ OK ] TEST: rule6 check: dscp no redirect to table [ OK ] TEST: rule6 del by pref: dscp redirect to table [ OK ] TEST: rule6 check: iif dscp redirect to table [ OK ] TEST: rule6 check: iif dscp no redirect to table [ OK ] TEST: rule6 del by pref: iif dscp redirect to table [ OK ] TEST: rule6 check: dscp masked redirect to table [ OK ] TEST: rule6 check: dscp masked no redirect to table [ OK ] TEST: rule6 del by pref: dscp masked redirect to table [ OK ] TEST: rule6 check: iif dscp masked redirect to table [ OK ] TEST: rule6 check: iif dscp masked no redirect to table [ OK ] TEST: rule6 del by pref: iif dscp masked redirect to table [ OK ] [...] Tests passed: 316 Tests failed: 0 Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/20250220080525.831924-7-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-02-20 We've added 19 non-merge commits during the last 8 day(s) which contain a total of 35 files changed, 1126 insertions(+), 53 deletions(-). The main changes are: 1) Add TCP_RTO_MAX_MS support to bpf_set/getsockopt, from Jason Xing 2) Add network TX timestamping support to BPF sock_ops, from Jason Xing 3) Add TX metadata Launch Time support, from Song Yoong Siang * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: igc: Add launch time support to XDP ZC igc: Refactor empty frame insertion for launch time support net: stmmac: Add launch time support to XDP ZC selftests/bpf: Add launch time request to xdp_hw_metadata xsk: Add launch time hardware offload support to XDP Tx metadata selftests/bpf: Add simple bpf tests in the tx path for timestamping feature bpf: Support selective sampling for bpf timestamping bpf: Add BPF_SOCK_OPS_TSTAMP_SENDMSG_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_ACK_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SND_HW_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SND_SW_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback net-timestamp: Prepare for isolating two modes of SO_TIMESTAMPING bpf: Disable unsafe helpers in TX timestamping callbacks bpf: Prevent unsafe access to the sock fields in the BPF timestamping callback bpf: Prepare the sock_ops ctx and call bpf prog for TX timestamping bpf: Add networking timestamping support to bpf_get/setsockopt() selftests/bpf: Add rto max for bpf_setsockopt test bpf: Support TCP_RTO_MAX_MS for bpf_setsockopt ==================== Link: https://patch.msgid.link/20250221022104.386462-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21selftests: net: Add test cases for link and peer netnsXiao Liang
- Add test for creating link in another netns when a link of the same name and ifindex exists in current netns. - Add test to verify that link is created in target netns directly - no link new/del events should be generated in link netns or current netns. - Add test cases to verify that link-netns is set as expected for various drivers and combination of namespace-related parameters. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Link: https://patch.msgid.link/20250219125039.18024-14-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21selftests: net: Add python context manager for netns enteringXiao Liang
Change netns of current thread and switch back on context exit. For example: with NetNSEnter("ns1"): ip("link add dummy0 type dummy") The command be executed in netns "ns1". Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Link: https://patch.msgid.link/20250219125039.18024-13-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21selftests/ftrace: Update fprobe test to check enabled_functions fileSteven Rostedt
A few bugs were found in the fprobe accounting logic along with it using the function graph infrastructure. Update the fprobe selftest to catch those bugs in case they or something similar shows up in the future. The test now checks the enabled_functions file which shows all the functions attached to ftrace or fgraph. When enabling a fprobe, make sure that its corresponding function is also added to that file. Also add two more fprobes to enable to make sure that the fprobe logic works properly with multiple probes. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Link: https://lore.kernel.org/20250220202055.733001756@goodmis.org Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Tested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-02-21Merge branch 'perf/urgent' into perf/core, to pick up fixes before merging ↵Ingo Molnar
new patches Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-02-21selftests: fix spelling/grammar errors in sysctl/sysctl.shChandra Pratap
Fix the grammatical/spelling errors in sysctl/sysctl.sh. This fixes all errors pointed out by codespell in the file. Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-20selftests: drv-net: rename queues check_xdp to check_xskJakub Kicinski
The test is for AF_XDP, we refer to AF_XDP as XSK. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-20selftests: drv-net: improve the use of ksft helpers in XSK queue testJakub Kicinski
Avoid exceptions when xsk attr is not present, and add a proper ksft helper for "not in" condition. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-20selftests: drv-net: add a way to wait for a local processJakub Kicinski
We use wait_port_listen() extensively to wait for a process we spawned to be ready. Not all processes will open listening sockets. Add a method of explicitly waiting for a child to be ready. Pass a FD to the spawned process and wait for it to write a message to us. FD number is passed via KSFT_READY_FD env variable. Similarly use KSFT_WAIT_FD to let the child process for a sign that we are done and child should exit. Sending a signal to a child with shell=True can get tricky. Make use of this method in the queues test to make it less flaky. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-20selftests: drv-net: probe for AF_XDP sockets more explicitlyJakub Kicinski
Separate the support check from socket binding for easier refactoring. Use: ./helper - - just to probe if we can open the socket. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-20selftests: drv-net: add missing new line in xdp_helperJakub Kicinski
Kurt and Joe report missing new line at the end of Usage. Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-20selftests: drv-net: use cfg.rpath() in netlink xsk attr testJakub Kicinski
The cfg.rpath() helper was been recently added to make formatting paths for helper binaries easier. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-20selftests: drv-net: add a warning for bkg + shell + terminateJakub Kicinski
Joe Damato reports that some shells will fork before running the command when python does "sh -c $cmd", while bash on my machine does an exec of $cmd directly. This will have implications for our ability to terminate the child process on various configurations of bash and other shells. Warn about using bkg(... shell=True, termininate=True) most background commands can hopefully exit cleanly (exit_wait). Link: https://lore.kernel.org/Z7Yld21sv_Ip3gQx@LQ3V64L9R2 Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-20Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds
Pull BPF fixes from Daniel Borkmann: - Fix a soft-lockup in BPF arena_map_free on 64k page size kernels (Alan Maguire) - Fix a missing allocation failure check in BPF verifier's acquire_lock_state (Kumar Kartikeya Dwivedi) - Fix a NULL-pointer dereference in trace_kfree_skb by adding kfree_skb to the raw_tp_null_args set (Kuniyuki Iwashima) - Fix a deadlock when freeing BPF cgroup storage (Abel Wu) - Fix a syzbot-reported deadlock when holding BPF map's freeze_mutex (Andrii Nakryiko) - Fix a use-after-free issue in bpf_test_init when eth_skb_pkt_type is accessing skb data not containing an Ethernet header (Shigeru Yoshida) - Fix skipping non-existing keys in generic_map_lookup_batch (Yan Zhai) - Several BPF sockmap fixes to address incorrect TCP copied_seq calculations, which prevented correct data reads from recv(2) in user space (Jiayuan Chen) - Two fixes for BPF map lookup nullness elision (Daniel Xu) - Fix a NULL-pointer dereference from vmlinux BTF lookup in bpf_sk_storage_tracing_allowed (Jared Kangas) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests: bpf: test batch lookup on array of maps with holes bpf: skip non exist keys in generic_map_lookup_batch bpf: Handle allocation failure in acquire_lock_state bpf: verifier: Disambiguate get_constant_map_key() errors bpf: selftests: Test constant key extraction on irrelevant maps bpf: verifier: Do not extract constant map keys for irrelevant maps bpf: Fix softlockup in arena_map_free on 64k page kernel net: Add rx_skb of kfree_skb to raw_tp_null_args[]. bpf: Fix deadlock when freeing cgroup storage selftests/bpf: Add strparser test for bpf selftests/bpf: Fix invalid flag of recv() bpf: Disable non stream socket for strparser bpf: Fix wrong copied_seq calculation strparser: Add read_sock callback bpf: avoid holding freeze_mutex during mmap operation bpf: unify VM_WRITE vs VM_MAYWRITE use in BPF map mmaping logic selftests/bpf: Adjust data size to have ETH_HLEN bpf, test_run: Fix use-after-free issue in eth_skb_pkt_type() bpf: Remove unnecessary BTF lookups in bpf_sk_storage_tracing_allowed
2025-02-20selftests/bpf: Add launch time request to xdp_hw_metadataSong Yoong Siang
Add launch time hardware offload request to xdp_hw_metadata. Users can configure the delta of launch time relative to HW RX-time using the "-l" argument. By default, the delta is set to 0 ns, which means the launch time is disabled. By setting the delta to a non-zero value, the launch time hardware offload feature will be enabled and requested. Additionally, users can configure the Tx Queue to be enabled with the launch time hardware offload using the "-L" argument. By default, Tx Queue 0 will be used. Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250216093430.957880-3-yoong.siang.song@intel.com
2025-02-20xsk: Add launch time hardware offload support to XDP Tx metadataSong Yoong Siang
Extend the XDP Tx metadata framework so that user can requests launch time hardware offload, where the Ethernet device will schedule the packet for transmission at a pre-determined time called launch time. The value of launch time is communicated from user space to Ethernet driver via launch_time field of struct xsk_tx_metadata. Suggested-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250216093430.957880-2-yoong.siang.song@intel.com
2025-02-20selftests/bpf: Add simple bpf tests in the tx path for timestamping featureJason Xing
BPF program calculates a couple of latency deltas between each tx timestamping callbacks. It can be used in the real world to diagnose the kernel behaviour in the tx path. Check the safety issues by accessing a few bpf calls in bpf_test_access_bpf_calls() which are implemented in the patch 3 and 4. Check if the bpf timestamping can co-exist with socket timestamping. There remains a few realistic things[1][2] to highlight: 1. in general a packet may pass through multiple qdiscs. For instance with bonding or tunnel virtual devices in the egress path. 2. packets may be resent, in which case an ACK might precede a repeat SCHED and SND. 3. erroneous or malicious peers may also just never send an ACK. [1]: https://lore.kernel.org/all/67a389af981b0_14e0832949d@willemb.c.googlers.com.notmuch/ [2]: https://lore.kernel.org/all/c329a0c1-239b-4ca1-91f2-cb30b8dd2f6a@linux.dev/ Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-13-kerneljasonxing@gmail.com
2025-02-20bpf: Add BPF_SOCK_OPS_TSTAMP_SENDMSG_CB callbackJason Xing
This patch introduces a new callback in tcp_tx_timestamp() to correlate tcp_sendmsg timestamp with timestamps from other tx timestamping callbacks (e.g., SND/SW/ACK). Without this patch, BPF program wouldn't know which timestamps belong to which flow because of no socket lock protection. This new callback is inserted in tcp_tx_timestamp() to address this issue because tcp_tx_timestamp() still owns the same socket lock with tcp_sendmsg_locked() in the meanwhile tcp_tx_timestamp() initializes the timestamping related fields for the skb, especially tskey. The tskey is the bridge to do the correlation. For TCP, BPF program hooks the beginning of tcp_sendmsg_locked() and then stores the sendmsg timestamp at the bpf_sk_storage, correlating this timestamp with its tskey that are later used in other sending timestamping callbacks. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-11-kerneljasonxing@gmail.com
2025-02-20bpf: Add BPF_SOCK_OPS_TSTAMP_ACK_CB callbackJason Xing
Support the ACK case for bpf timestamping. Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_ACK_CB. This callback will occur at the same timestamping point as the user space's SCM_TSTAMP_ACK. The BPF program can use it to get the same SCM_TSTAMP_ACK timestamp without modifying the user-space application. This patch extends txstamp_ack to two bits: 1 stands for SO_TIMESTAMPING mode, 2 bpf extension. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-10-kerneljasonxing@gmail.com
2025-02-20bpf: Add BPF_SOCK_OPS_TSTAMP_SND_HW_CB callbackJason Xing
Support hw SCM_TSTAMP_SND case for bpf timestamping. Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_SND_HW_CB. This callback will occur at the same timestamping point as the user space's hardware SCM_TSTAMP_SND. The BPF program can use it to get the same SCM_TSTAMP_SND timestamp without modifying the user-space application. To avoid increasing the code complexity, replace SKBTX_HW_TSTAMP with SKBTX_HW_TSTAMP_NOBPF instead of changing numerous callers from driver side using SKBTX_HW_TSTAMP. The new definition of SKBTX_HW_TSTAMP means the combination tests of socket timestamping and bpf timestamping. After this patch, drivers can work under the bpf timestamping. Considering some drivers don't assign the skb with hardware timestamp, this patch does the assignment and then BPF program can acquire the hwstamp from skb directly. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-9-kerneljasonxing@gmail.com
2025-02-20bpf: Add BPF_SOCK_OPS_TSTAMP_SND_SW_CB callbackJason Xing
Support sw SCM_TSTAMP_SND case for bpf timestamping. Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_SND_SW_CB. This callback will occur at the same timestamping point as the user space's software SCM_TSTAMP_SND. The BPF program can use it to get the same SCM_TSTAMP_SND timestamp without modifying the user-space application. Based on this patch, BPF program will get the software timestamp when the driver is ready to send the skb. In the sebsequent patch, the hardware timestamp will be supported. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-8-kerneljasonxing@gmail.com
2025-02-20bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callbackJason Xing
Support SCM_TSTAMP_SCHED case for bpf timestamping. Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_SCHED_CB. This callback will occur at the same timestamping point as the user space's SCM_TSTAMP_SCHED. The BPF program can use it to get the same SCM_TSTAMP_SCHED timestamp without modifying the user-space application. A new SKBTX_BPF flag is added to mark skb_shinfo(skb)->tx_flags, ensuring that the new BPF timestamping and the current user space's SO_TIMESTAMPING do not interfere with each other. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-7-kerneljasonxing@gmail.com
2025-02-20bpf: Add networking timestamping support to bpf_get/setsockopt()Jason Xing
The new SK_BPF_CB_FLAGS and new SK_BPF_CB_TX_TIMESTAMPING are added to bpf_get/setsockopt. The later patches will implement the BPF networking timestamping. The BPF program will use bpf_setsockopt(SK_BPF_CB_FLAGS, SK_BPF_CB_TX_TIMESTAMPING) to enable the BPF networking timestamping on a socket. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-2-kerneljasonxing@gmail.com
2025-02-20tools/nolibc: add support for 32-bit s390Thomas Weißschuh
32-bit s390 is very close to the existing 64-bit implementation. Some special handling is necessary as there is neither LLVM nor QEMU support. Also the kernel itself can not build natively for 32-bit s390, so instead the test program is executed with a 64-bit kernel. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250206-nolibc-s390-v2-2-991ad97e3d58@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-02-20selftests/nolibc: rename s390 to s390xThomas Weißschuh
Support for 32-bit s390 is about to be added. As "s39032" would look horrible, use the another naming scheme. 32-bit s390 is "s390" and 64-bit s390 is "s390x", similar to how it is handled in various toolchain components. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250206-nolibc-s390-v2-1-991ad97e3d58@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-02-20selftests/nolibc: only run constructor tests on nolibcThomas Weißschuh
The nolibc testsuite can be run against other libcs to test for interoperability. Some aspects of the constructor execution are not standardized and musl does not provide all tested feature, for one it does not provide arguments to the constructors, anymore? Skip the constructor tests on non-nolibc configurations. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250212-nolibc-test-constructor-v1-1-c963875b3da4@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-02-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.14-rc4). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-20Merge tag 'net-6.14-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Smaller than usual with no fixes from any subtree. Current release - regressions: - core: fix race of rtnl_net_lock(dev_net(dev)) Previous releases - regressions: - core: remove the single page frag cache for good - flow_dissector: fix handling of mixed port and port-range keys - sched: cls_api: fix error handling causing NULL dereference - tcp: - adjust rcvq_space after updating scaling ratio - drop secpath at the same time as we currently drop dst - eth: gtp: suppress list corruption splat in gtp_net_exit_batch_rtnl(). Previous releases - always broken: - vsock: - fix variables initialization during resuming - for connectible sockets allow only connected - eth: - geneve: fix use-after-free in geneve_find_dev() - ibmvnic: don't reference skb after sending to VIOS" * tag 'net-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits) Revert "net: skb: introduce and use a single page frag cache" net: allow small head cache usage with large MAX_SKB_FRAGS values nfp: bpf: Add check for nfp_app_ctrl_msg_alloc() tcp: drop secpath at the same time as we currently drop dst net: axienet: Set mac_managed_pm arp: switch to dev_getbyhwaddr() in arp_req_set_public() net: Add non-RCU dev_getbyhwaddr() helper sctp: Fix undefined behavior in left shift operation selftests/bpf: Add a specific dst port matching flow_dissector: Fix port range key handling in BPF conversion selftests/net/forwarding: Add a test case for tc-flower of mixed port and port-range flow_dissector: Fix handling of mixed port and port-range keys geneve: Suppress list corruption splat in geneve_destroy_tunnels(). gtp: Suppress list corruption splat in gtp_net_exit_batch_rtnl(). dev: Use rtnl_net_dev_lock() in unregister_netdev(). net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). net: Add net_passive_inc() and net_passive_dec(). net: pse-pd: pd692x0: Fix power limit retrieval MAINTAINERS: trim the GVE entry gve: set xdp redirect target only when it is available ...
2025-02-20cpupower: monitor: Exit with error status if execvp() failYiwei Lin
In the case that we give a invalid command to idle_monitor for monitoring, the execvp() will fail and thus go to the next line. As a result, we'll see two differnt monitoring output. For example, running `cpupower monitor -i 5 invalidcmd` which `invalidcmd` is not executable. Link: https://lore.kernel.org/r/20250220163846.2765-1-s921975628@gmail.com Signed-off-by: Yiwei Lin <s921975628@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-02-20tools/memory-model: Define effect of Mb tags on RMWs in tools/...Jonas Oberhauser
Herd7 transforms successful RMW with Mb tags by inserting smp_mb() fences around them. We emulate this by considering imaginary po-edges before the RMW read and before the RMW write, and extending the smp_mb() ordering rule, which currently only applies to real po edges that would be found around a really inserted smp_mb(), also to cases of the only imagined po edges. Reported-by: Viktor Vafeiadis <viktor@mpi-sws.org> Suggested-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Tested-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-20tools/memory-model: Define applicable tags on operation in tools/...Jonas Oberhauser
Herd7 transforms reads, writes, and read-modify-writes by eliminating 'acquire tags from writes, 'release tags from reads, and 'acquire, 'release, and 'mb tags from failed read-modify-writes. We emulate this behavior by redefining Acquire, Release, and Mb sets in linux-kernel.bell to explicitly exclude those combinations. Herd7 furthermore adds 'noreturn tag to certain reads. Currently herd7 does not allow specifying the 'noreturn tag manually, but such manual declaration (e.g., through a syntax __atomic_op{noreturn}) would add invalid 'noreturn tags to writes; in preparation, we already also exclude this combination. Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Tested-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-20tools/memory-model: Legitimize current use of tags in LKMM macrosJonas Oberhauser
The current macros in linux-kernel.def reference instructions such as __xchg{mb} or __cmpxchg{acquire}, which are invalid combinations of tags and instructions according to the declarations in linux-kernel.bell. This works with current herd7 because herd7 removes these tags anyways and does not actually enforce validity of combinations at all. If a future herd7 version no longer applies these hardcoded transformations, then all currently invalid combinations will actually appear on some instruction. We therefore adjust the declarations to make the resulting combinations valid, by adding the 'mb tag to the set of Accesses and allowing all Accesses to appear on all read, write, and RMW instructions. Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Tested-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-20tools/memory-model: Add atomic_andnot() with its variantsPuranjay Mohan
Pull-855[1] added the support of atomic_andnot() to the herd tool. Use this to add the implementation in the LKMM. All of the ordering variants are also added. Here is a small litmus-test that uses this operation: C andnot { atomic_t u = ATOMIC_INIT(7); } P0(atomic_t *u) { r0 = atomic_fetch_andnot(3, u); r1 = READ_ONCE(*u); } exists (0:r0=7 /\ 0:r1=4) Test andnot Allowed States 1 0:r0=7; 0:r1=4; Ok Witnesses Positive: 1 Negative: 0 Condition exists (0:r0=7 /\ 0:r1=4) Observation andnot Always 1 0 Time andnot 0.00 Hash=78f011a0b5a0c65fa1cf106fcd62c845 [1] https://github.com/herd/herdtools7/pull/855 Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Andrea Parri <parri.andrea@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Will Deacon <will@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jade Alglave <j.alglave@ucl.ac.uk> Cc: Luc Maranget <luc.maranget@inria.fr> Cc: Akira Yokosawa <akiyks@gmail.com> Cc: Daniel Lustig <dlustig@nvidia.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: <linux-arch@vger.kernel.org>
2025-02-20tools/memory-model: Add atomic_and()/or()/xor() and add_negativePuranjay Mohan
Pull-849[1] added the support of '&', '|', and '^' to the herd7 tool's atomics operations. Use these in linux-kernel.def to implement atomic_and()/or()/xor() with all their ordering variants. atomic_add_negative() is already available so add its acquire, release, and relaxed ordering variants. [1] https://github.com/herd/herdtools7/pull/849 Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Andrea Parri <parri.andrea@gmail.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Will Deacon <will@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jade Alglave <j.alglave@ucl.ac.uk> Cc: Luc Maranget <luc.maranget@inria.fr> Cc: Akira Yokosawa <akiyks@gmail.com> Cc: Daniel Lustig <dlustig@nvidia.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: <linux-arch@vger.kernel.org>
2025-02-20selftests/nsfs: add ioctl validation testsChristian Brauner
Add simple tests to validate that non-nsfs ioctls are rejected. Link: https://lore.kernel.org/r/20250219-work-nsfs-v1-2-21128d73c5e8@kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-19selftests: drv-net: add a simple TSO testJakub Kicinski
Add a simple test for TSO. Send a few MB of data and check device stats to verify that the device was performing segmentation. Do the same thing over a few tunnel types. Injecting GSO packets directly would give us more ability to test corner cases, but perhaps starting simple is good enough? # ./ksft-net-drv/drivers/net/hw/tso.py # Detected qstat for LSO wire-packets KTAP version 1 1..14 ok 1 tso.ipv4 # SKIP Test requires IPv4 connectivity ok 2 tso.vxlan4_ipv4 # SKIP Test requires IPv4 connectivity ok 3 tso.vxlan6_ipv4 # SKIP Test requires IPv4 connectivity ok 4 tso.vxlan_csum4_ipv4 # SKIP Test requires IPv4 connectivity ok 5 tso.vxlan_csum6_ipv4 # SKIP Test requires IPv4 connectivity ok 6 tso.gre4_ipv4 # SKIP Test requires IPv4 connectivity ok 7 tso.gre6_ipv4 # SKIP Test requires IPv4 connectivity ok 8 tso.ipv6 ok 9 tso.vxlan4_ipv6 ok 10 tso.vxlan6_ipv6 ok 11 tso.vxlan_csum4_ipv6 ok 12 tso.vxlan_csum6_ipv6 # Testing with mangleid enabled ok 13 tso.gre4_ipv6 ok 14 tso.gre6_ipv6 # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:7 error:0 Note that the test currently depends on the driver reporting the LSO count via qstat, which appears to be relatively rare (virtio, cisco/enic, sfc/efc; but virtio needs host support). Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250218225426.77726-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19selftests: drv-net: store addresses in dict indexed by ipverJakub Kicinski
Looks like more and more tests want to iterate over IP version, run the same test over ipv4 and ipv6. The current naming of members in the env class makes it a bit awkward, we have separate members for ipv4 and ipv6 parameters. Store the parameters inside dicts, so that tests can easily index them with ip version. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250218225426.77726-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19selftests: drv-net: get detailed interface infoJakub Kicinski
We already record output of ip link for NETIF in env for easy access. Record the detailed version. TSO test will want to know the max tso size. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250218225426.77726-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19selftests: drv-net: resolve remote interface nameJakub Kicinski
Find out and record in env the name of the interface which remote host will use for the IP address provided via config. Interface name is useful for mausezahn and for setting up tunnels. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250218225426.77726-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19selftests: net: Fix minor typos in MPTCP and psock testsSuchit
Fixes minor spelling errors: - `simult_flows.sh`: "al testcases" -> "all testcases" - `psock_tpacket.c`: "accross" -> "across" Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250218165923.20740-1-suchitkarunakaran@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19selftests/bpf: Add a specific dst port matchingCong Wang
After this patch: #102/1 flow_dissector_classification/ipv4:OK #102/2 flow_dissector_classification/ipv4_continue_dissect:OK #102/3 flow_dissector_classification/ipip:OK #102/4 flow_dissector_classification/gre:OK #102/5 flow_dissector_classification/port_range:OK #102/6 flow_dissector_classification/ipv6:OK #102 flow_dissector_classification:OK Summary: 1/6 PASSED, 0 SKIPPED, 0 FAILED Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Link: https://patch.msgid.link/20250218043210.732959-5-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19selftests/net/forwarding: Add a test case for tc-flower of mixed port and ↵Cong Wang
port-range After this patch: # ./tc_flower_port_range.sh TEST: Port range matching - IPv4 UDP [ OK ] TEST: Port range matching - IPv4 TCP [ OK ] TEST: Port range matching - IPv6 UDP [ OK ] TEST: Port range matching - IPv6 TCP [ OK ] TEST: Port range matching - IPv4 UDP Drop [ OK ] Cc: Qiang Zhang <dtzq01@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250218043210.732959-3-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19selftests: fib_rule_tests: Add port mask match testsIdo Schimmel
Add tests for FIB rules that match on source and destination ports with a mask. Test both good and bad flows. # ./fib_rule_tests.sh IPv6 FIB rule tests [...] TEST: rule6 check: sport and dport redirect to table [ OK ] TEST: rule6 check: sport and dport no redirect to table [ OK ] TEST: rule6 del by pref: sport and dport redirect to table [ OK ] TEST: rule6 check: sport and dport range redirect to table [ OK ] TEST: rule6 check: sport and dport range no redirect to table [ OK ] TEST: rule6 del by pref: sport and dport range redirect to table [ OK ] TEST: rule6 check: sport and dport masked redirect to table [ OK ] TEST: rule6 check: sport and dport masked no redirect to table [ OK ] TEST: rule6 del by pref: sport and dport masked redirect to table [ OK ] [...] Tests passed: 292 Tests failed: 0 Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250217134109.311176-9-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19selftests: fib_rule_tests: Add port range match testsIdo Schimmel
Currently, only matching on specific ports is tested. Add port range testing to make sure this use case does not regress. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250217134109.311176-8-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19pm: cpupower: bench: Prevent NULL dereference on malloc failureZhongqiu Han
If malloc returns NULL due to low memory, 'config' pointer can be NULL. Add a check to prevent NULL dereference. Link: https://lore.kernel.org/r/20250219122715.3892223-1-quic_zhonhan@quicinc.com Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>