linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-05-30	selftests/bpf: test struct_ops with epoll	Kui-Feng Lee
	Verify whether a user space program is informed through epoll with EPOLLHUP when a struct_ops object is detached. The BPF code in selftests/bpf/progs/struct_ops_module.c has become complex. Therefore, struct_ops_detach.c has been added to segregate the BPF code for detachment tests from the BPF code for other tests based on the recommendation of Andrii Nakryiko. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240530065946.979330-6-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-30	bpf: pass bpf_struct_ops_link to callbacks in bpf_struct_ops.	Kui-Feng Lee
	Pass an additional pointer of bpf_struct_ops_link to callback function reg, unreg, and update provided by subsystems defined in bpf_struct_ops. A bpf_struct_ops_map can be registered for multiple links. Passing a pointer of bpf_struct_ops_link helps subsystems to distinguish them. This pointer will be used in the later patches to let the subsystem initiate a detachment on a link that was registered to it previously. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240530065946.979330-2-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-30	kcsan: Add example to data_race() kerneldoc header	Paul E. McKenney
	Although the data_race() kerneldoc header accurately states what it does, some of the implications and usage patterns are non-obvious. Therefore, add a brief locking example and also state how to have KCSAN ignore accesses while also preventing the compiler from folding, spindling, or otherwise mutilating the access. [ paulmck: Apply Bart Van Assche feedback. ] [ paulmck: Apply feedback from Marco Elver. ] Reported-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Marco Elver <elver@google.com> Cc: Breno Leitao <leitao@debian.org> Cc: Jens Axboe <axboe@kernel.dk>
2024-05-30	selftests/bpf: use section names understood by libbpf in test_sockmap	Jakub Sitnicki
	libbpf can deduce program type and attach type from the ELF section name. We don't need to pass it out-of-band if we switch to libbpf convention [1]. [1] https://docs.kernel.org/bpf/libbpf/program_types.html Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240522080936.2475833-1-jakub@cloudflare.com
2024-05-30	selftest: run vector prctl test for ZVE32X	Andy Chiu
	The minimal requirement for running Vector subextension on Linux is ZVE32X. So change the test accordingly to run prctl as long as it find it. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Link: https://lore.kernel.org/r/20240510-zve-detection-v5-8-0711bdd26c12@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30	selftests/futex: pass _GNU_SOURCE without a value to the compiler	John Hubbard
	It's slightly better to set _GNU_SOURCE in the source code, but if one must do it via the compiler invocation, then the best way to do so is this: $(CC) -D_GNU_SOURCE= ...because otherwise, if this form is used: $(CC) -D_GNU_SOURCE ...then that leads the compiler to set a value, as if you had passed in: $(CC) -D_GNU_SOURCE=1 That, in turn, leads to warnings under both gcc and clang, like this: futex_requeue_pi.c:20: warning: "_GNU_SOURCE" redefined Fix this by using the "-D_GNU_SOURCE=" form. Reviewed-by: Edward Liaw <edliaw@google.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-30	perf top: Allow filters on events	Ian Rogers
	Allow filters to be added to perf top events. One use is to workaround issues with: ``` $ perf top --uid="$(id -u)" ``` which tries to scan /proc find processes belonging to the uid and can fail in such a pid terminates between the scan and the perf_event_open reporting: ``` Error: The sys_perf_event_open() syscall returned with 3 (No such process) for event (cycles:P). /bin/dmesg \| grep -i perf may provide additional information. ``` A similar filter: ``` $ perf top -e cycles:P --filter "uid == $(id -u)" ``` doesn't fail this way. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: bpf@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240524205227.244375-4-irogers@google.com
2024-05-30	perf bpf filter: Add uid and gid terms	Ian Rogers
	Allow the BPF filter to use the uid and gid terms determined by the bpf_get_current_uid_gid BPF helper. For example, the following will record the cpu-clock event system wide discarding samples that don't belong to the current user. $ perf record -e cpu-clock --filter "uid == $(id -u)" -a sleep 0.1 Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: bpf@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240524205227.244375-3-irogers@google.com
2024-05-30	perf bpf filter: Give terms their own enum	Ian Rogers
	Give the term types their own enum so that additional terms can be added that don't correspond to a PERF_SAMPLE_xx flag. The term values are numerically ascending rather than bit field positions, this means they need translating to a PERF_SAMPLE_xx bit field in certain places using a shift. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: bpf@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240524205227.244375-2-irogers@google.com
2024-05-30	tools api io: Move filling the io buffer to its own function	Ian Rogers
	In general a read fills 4kb so filling the buffer is a 1 in 4096 operation, move it out of the io__get_char function to avoid some checking overhead and to better hint the function is good to inline. For perf's IO intensive internal (non-rigorous) benchmarks there's a small improvement to kallsyms-parsing with a default build. Before: ``` $ perf bench internals all Computing performance of single threaded perf event synthesis by synthesizing events on the perf process itself: Average synthesis took: 146.322 usec (+- 0.305 usec) Average num. events: 61.000 (+- 0.000) Average time per event 2.399 usec Average data synthesis took: 145.056 usec (+- 0.155 usec) Average num. events: 329.000 (+- 0.000) Average time per event 0.441 usec Average kallsyms__parse took: 162.313 ms (+- 0.599 ms) ... Computing performance of sysfs PMU event scan for 100 times Average core PMU scanning took: 53.720 usec (+- 7.823 usec) Average PMU scanning took: 375.145 usec (+- 23.974 usec) ``` After: ``` $ perf bench internals all Computing performance of single threaded perf event synthesis by synthesizing events on the perf process itself: Average synthesis took: 127.829 usec (+- 0.079 usec) Average num. events: 61.000 (+- 0.000) Average time per event 2.096 usec Average data synthesis took: 133.652 usec (+- 0.101 usec) Average num. events: 327.000 (+- 0.000) Average time per event 0.409 usec Average kallsyms__parse took: 150.415 ms (+- 0.313 ms) ... Computing performance of sysfs PMU event scan for 100 times Average core PMU scanning took: 47.790 usec (+- 1.178 usec) Average PMU scanning took: 376.945 usec (+- 23.683 usec) ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240519181716.4088459-1-irogers@google.com
2024-05-30	Merge tag 'net-6.10-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf and netfilter. Current release - regressions: - gro: initialize network_offset in network layer - tcp: reduce accepted window in NEW_SYN_RECV state Current release - new code bugs: - eth: mlx5e: do not use ptp structure for tx ts stats when not initialized - eth: ice: check for unregistering correct number of devlink params Previous releases - regressions: - bpf: Allow delete from sockmap/sockhash only if update is allowed - sched: taprio: extend minimum interval restriction to entire cycle too - netfilter: ipset: add list flush to cancel_gc - ipv4: fix address dump when IPv4 is disabled on an interface - sock_map: avoid race between sock_map_close and sk_psock_put - eth: mlx5: use mlx5_ipsec_rx_status_destroy to correctly delete status rules Previous releases - always broken: - core: fix __dst_negative_advice() race - bpf: - fix multi-uprobe PID filtering logic - fix pkt_type override upon netkit pass verdict - netfilter: tproxy: bail out if IP has been disabled on the device - af_unix: annotate data-race around unix_sk(sk)->addr - eth: mlx5e: fix UDP GSO for encapsulated packets - eth: idpf: don't enable NAPI and interrupts prior to allocating Rx buffers - eth: i40e: fully suspend and resume IO operations in EEH case - eth: octeontx2-pf: free send queue buffers incase of leaf to inner - eth: ipvlan: dont Use skb->sk in ipvlan_process_v{4,6}_outbound" * tag 'net-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits) netdev: add qstat for csum complete ipvlan: Dont Use skb->sk in ipvlan_process_v{4,6}_outbound net: ena: Fix redundant device NUMA node override ice: check for unregistering correct number of devlink params ice: fix 200G PHY types to link speed mapping i40e: Fully suspend and resume IO operations in EEH case i40e: factoring out i40e_suspend/i40e_resume e1000e: move force SMBUS near the end of enable_ulp function net: dsa: microchip: fix RGMII error in KSZ DSA driver ipv4: correctly iterate over the target netns in inet_dump_ifaddr() net: fix __dst_negative_advice() race nfc/nci: Add the inconsistency check between the input data length and count MAINTAINERS: dwmac: starfive: update Maintainer net/sched: taprio: extend minimum interval restriction to entire cycle too net/sched: taprio: make q->picos_per_byte available to fill_sched_entry() netfilter: nft_fib: allow from forward/input without iif selector netfilter: tproxy: bail out if IP has been disabled on the device netfilter: nft_payload: skbuff vlan metadata mangle support net: ti: icssg-prueth: Fix start counter for ft1 filter sock_map: avoid race between sock_map_close and sk_psock_put ...
2024-05-30	netdev: add qstat for csum complete	Jakub Kicinski
	Recent commit 0cfe71f45f42 ("netdev: add queue stats") added a lot of useful stats, but only those immediately needed by virtio. Presumably virtio does not support CHECKSUM_COMPLETE, so statistic for that form of checksumming wasn't included. Other drivers will definitely need it, in fact we expect it to be needed in net-next soon (mlx5). So let's add the definition of the counter for CHECKSUM_COMPLETE to uAPI in net already, so that the counters are in a more natural order (all subsequent counters have not been present in any released kernel, yet). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Joe Damato <jdamato@fastly.com> Fixes: 0cfe71f45f42 ("netdev: add queue stats") Link: https://lore.kernel.org/r/20240529163547.3693194-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-29	perf trace beauty: Always show mmap prot even though PROT_NONE	Changbin Du
	PROT_NONE is also useful information, so do not omit the mmap prot even though it is 0. syscall_arg__scnprintf_mmap_prot() could print PROT_NONE for prot 0. Before: PROT_NONE is not shown. $ sudo perf trace -e syscalls:sys_enter_mmap --filter prot==0 -- ls 0.000 ls/2979231 syscalls:sys_enter_mmap(len: 4220888, flags: PRIVATE\|ANONYMOUS) After: PROT_NONE is displayed. $ sudo perf trace -e syscalls:sys_enter_mmap --filter prot==0 -- ls 0.000 ls/2975708 syscalls:sys_enter_mmap(len: 4220888, prot: NONE, flags: PRIVATE\|ANONYMOUS) Signed-off-by: Changbin Du <changbin.du@huawei.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240522033542.1359421-3-changbin.du@huawei.com
2024-05-29	perf trace beauty: Always show param if show_zero is set	Changbin Du
	For some parameters, it is best to also display them when they are 0, e.g. flags. Here we only check the show_zero property and let arg printer handle special cases. Signed-off-by: Changbin Du <changbin.du@huawei.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240522033542.1359421-2-changbin.du@huawei.com
2024-05-29	doc: netlink: Fix op pre and post fields in generated .rst	Donald Hunter
	The generated .rst has pre and post headings without any values, e.g. here: https://docs.kernel.org/6.9/networking/netlink_spec/dpll.html#device-id-get Emit keys and values in the generated .rst Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20240528140652.9445-5-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29	doc: netlink: Fix formatting of op flags in generated .rst	Donald Hunter
	Generate op flags as an inline list instead of a stringified python value. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20240528140652.9445-4-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29	doc: netlink: Don't 'sanitize' op docstrings in generated .rst	Donald Hunter
	The doc strings for do/dump ops are emitted as toplevel .rst constructs so they can be multi-line. Pass multi-line text straight through to the .rst to retain any simple formatting from the .yaml This fixes e.g. list formatting for the pin-get docs in dpll.yaml: https://docs.kernel.org/6.9/networking/netlink_spec/dpll.html#pin-get Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20240528140652.9445-3-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29	doc: netlink: Fix generated .rst for multi-line docs	Donald Hunter
	Fix the newline replacement in ynl-gen-rst.py to put spaces between concatenated lines. This fixes the broken doc string formatting. See the dpll docs for an example of broken concatenation: https://docs.kernel.org/6.9/networking/netlink_spec/dpll.html#lock-status Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20240528140652.9445-2-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29	selftests/bpf: Add selftest for bits iter	Yafang Shao
	Add test cases for the bits iter: - Positive cases - Bit mask representing a single word (8-byte unit) - Bit mask representing data spanning more than one word - The index of the set bit - Nagative cases - bpf_iter_bits_destroy() is required after calling bpf_iter_bits_new() - bpf_iter_bits_destroy() can only destroy an initialized iter - bpf_iter_bits_next() must use an initialized iter - Bit mask representing zero words - Bit mask representing fewer words than expected - Case for ENOMEM - Case for NULL pointer Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240517023034.48138-3-laoar.shao@gmail.com
2024-05-29	selftests/overlayfs: Fix build error on ppc64	Michael Ellerman
	Fix build error on ppc64: dev_in_maps.c: In function ‘get_file_dev_and_inode’: dev_in_maps.c:60:59: error: format ‘%llu’ expects argument of type ‘long long unsigned int ’, but argument 7 has type ‘__u64 ’ {aka ‘long unsigned int *’} [-Werror=format=] By switching to unsigned long long for u64 for ppc64 builds. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-29	selftests/openat2: Fix build warnings on ppc64	Michael Ellerman
	Fix warnings like: openat2_test.c: In function ‘test_openat2_flags’: openat2_test.c:303:73: warning: format ‘%llX’ expects argument of type ‘long long unsigned int’, but argument 5 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=] By switching to unsigned long long for u64 for ppc64 builds. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-29	selftests: cachestat: Fix build warnings on ppc64	Michael Ellerman
	Fix warnings like: test_cachestat.c: In function ‘print_cachestat’: test_cachestat.c:30:38: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 2 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=] By switching to unsigned long long for u64 for ppc64 builds. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-29	tracing/selftests: Fix kprobe event name test for .isra. functions	Steven Rostedt (Google)
	The kprobe_eventname.tc test checks if a function with .isra. can have a kprobe attached to it. It loops through the kallsyms file for all the functions that have the .isra. name, and checks if it exists in the available_filter_functions file, and if it does, it uses it to attach a kprobe to it. The issue is that kprobes can not attach to functions that are listed more than once in available_filter_functions. With the latest kernel, the function that is found is: rapl_event_update.isra.0 # grep rapl_event_update.isra.0 /sys/kernel/tracing/available_filter_functions rapl_event_update.isra.0 rapl_event_update.isra.0 It is listed twice. This causes the attached kprobe to it to fail which in turn fails the test. Instead of just picking the function function that is found in available_filter_functions, pick the first one that is listed only once in available_filter_functions. Cc: stable@vger.kernel.org Fixes: 604e3548236d ("selftests/ftrace: Select an existing function in kprobe_eventname test") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-29	selftests/ftrace: Update required config	Masami Hiramatsu (Google)
	Update required config options for running all tests. This also sorts the config entries alphabetically. Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-29	selftests/ftrace: Fix to check required event file	Masami Hiramatsu (Google)
	The dynevent/test_duplicates.tc test case uses `syscalls/sys_enter_openat` event for defining eprobe on it. Since this `syscalls` events depend on CONFIG_FTRACE_SYSCALLS=y, if it is not set, the test will fail. Add the event file to `required` line so that the test will return `unsupported` result. Fixes: 297e1dcdca3d ("selftests/ftrace: Add selftest for testing duplicate eprobes and kprobes") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-29	kselftest/alsa: Ensure _GNU_SOURCE is defined	Mark Brown
	The pcmtest driver tests use the kselftest harness which requires that _GNU_SOURCE is defined but nothing causes it to be defined. Since the KHDR_INCLUDES Makefile variable has had the required define added let's use that, this should provide some futureproofing. Fixes: daef47b89efd ("selftests: Compile kselftest headers with -D_GNU_SOURCE") Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-28	perf docs: Fix typos	Ian Rogers
	Assorted typo fixes. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@arm.com> Cc: Changbin Du <changbin.du@huawei.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240521223555.858859-1-irogers@google.com
2024-05-28	net/sched: taprio: extend minimum interval restriction to entire cycle too	Vladimir Oltean
	It is possible for syzbot to side-step the restriction imposed by the blamed commit in the Fixes: tag, because the taprio UAPI permits a cycle-time different from (and potentially shorter than) the sum of entry intervals. We need one more restriction, which is that the cycle time itself must be larger than N * ETH_ZLEN bit times, where N is the number of schedule entries. This restriction needs to apply regardless of whether the cycle time came from the user or was the implicit, auto-calculated value, so we move the existing "cycle == 0" check outside the "if "(!new->cycle_time)" branch. This way covers both conditions and scenarios. Add a selftest which illustrates the issue triggered by syzbot. Fixes: b5b73b26b3ca ("taprio: Fix allowing too small intervals") Reported-by: syzbot+a7d2b1d5d1af83035567@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/0000000000007d66bc06196e7c66@google.com/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20240527153955.553333-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28	net/sched: taprio: make q->picos_per_byte available to fill_sched_entry()	Vladimir Oltean
	In commit b5b73b26b3ca ("taprio: Fix allowing too small intervals"), a comparison of user input against length_to_duration(q, ETH_ZLEN) was introduced, to avoid RCU stalls due to frequent hrtimers. The implementation of length_to_duration() depends on q->picos_per_byte being set for the link speed. The blamed commit in the Fixes: tag has moved this too late, so the checks introduced above are ineffective. The q->picos_per_byte is zero at parse_taprio_schedule() -> parse_sched_list() -> parse_sched_entry() -> fill_sched_entry() time. Move the taprio_set_picos_per_byte() call as one of the first things in taprio_change(), before the bulk of the netlink attribute parsing is done. That's because it is needed there. Add a selftest to make sure the issue doesn't get reintroduced. Fixes: 09dbdf28f9f9 ("net/sched: taprio: fix calculation of maximum gate durations") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240527153955.553333-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28	selftests/bpf: Use start_server_str in do_test in bpf_tcp_ca	Geliang Tang
	This patch uses new helper start_server_str() in do_test() in bpf_tcp_ca.c to accept a struct network_helper_opts argument instead of using start_server() and settcpca(). Then change the type of the first paramenter of do_test() into a struct network_helper_opts one. Define its own cb_opts and opts for each test, set its own cc name into cb_opts.cc, and cc_cb() into post_socket_cb callback, then pass it to do_test(). Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/6e1b6555e3284e77c8aa60668c61a66c5f99aa37.1716638248.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-28	selftests/bpf: Use post_socket_cb in start_server_str	Geliang Tang
	This patch uses start_server_str() helper in test_dctcp_fallback() in bpf_tcp_ca.c, instead of using start_server() and settcpca(). For support opts in start_server_str() helper, opts->cb_opts needs to be passed to post_socket_cb() in __start_server(). Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/414c749321fa150435f7fe8e12c80fec8b447c78.1716638248.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-28	selftests/bpf: Use post_socket_cb in connect_to_fd_opts	Geliang Tang
	Since the post_socket_cb() callback is added in struct network_helper_opts, it's make sense to use it not only in __start_server(), but also in connect_to_fd_opts(). Then it can be used to set TCP_CONGESTION sockopt. Add a "void *" type member cb_opts into struct network_helper_opts, and add a new struct named cb_opts in prog_tests/bpf_tcp_ca.c, then cc can be moved into struct cb_opts from network_helper_opts. Define a new callback cc_cb() to set TCP_CONGESTION sockopt, and set it to post_socket_cb pointer of opts. Define a new cb_opts cubic, set it to cb_opts of opts. Pass this opts to connect_to_fd_opts() in test_dctcp_fallback(). Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/b512bb8d8f6854c9ea5c409b69d1bf37c6f272c6.1716638248.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-28	selftests/bpf: Add start_server_str helper	Geliang Tang
	It's a tech debt that start_server() does not take the "opts" argument. It's pretty handy to have start_server() as a helper that takes string address. So this patch creates a new helper start_server_str(). Then start_server() can be a wrapper of it. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/606e6cfd7e1aff8bc51ede49862eed0802e52170.1716638248.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-28	selftests/bpf: Drop struct post_socket_opts	Geliang Tang
	It's not possible to have one generic/common "struct post_socket_opts" for all tests. It's better to have the individual test define its own callback opts struct. So this patch drops struct post_socket_opts, and changes the second parameter of post_socket_cb as "void *" type. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/f8bda41c7cb9cb6979b2779f89fb3a684234304f.1716638248.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-28	libbpf: Configure log verbosity with env variable	Mykyta Yatsenko
	Configure logging verbosity by setting LIBBPF_LOG_LEVEL environment variable, which is applied only to default logger. Once user set their custom logging callback, it is up to them to handle filtering. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240524131840.114289-1-yatsenko@meta.com
2024-05-28	cxl/test: Add missing vmalloc.h for tools/testing/cxl/test/mem.c	Dave Jiang
	tools/testing/cxl/test/mem.c uses vmalloc() and vfree() but does not include linux/vmalloc.h. Kernel v6.10 made changes that causes the currently included headers not depend on vmalloc.h and therefore mem.c can no longer compile. Add linux/vmalloc.h to fix compile issue. CC [M] tools/testing/cxl/test/mem.o tools/testing/cxl/test/mem.c: In function ‘label_area_release’: tools/testing/cxl/test/mem.c:1428:9: error: implicit declaration of function ‘vfree’; did you mean ‘kvfree’? [-Werror=implicit-function-declaration] 1428 \| vfree(lsa); \| ^~~~~ \| kvfree tools/testing/cxl/test/mem.c: In function ‘cxl_mock_mem_probe’: tools/testing/cxl/test/mem.c:1466:22: error: implicit declaration of function ‘vmalloc’; did you mean ‘kmalloc’? [-Werror=implicit-function-declaration] 1466 \| mdata->lsa = vmalloc(LSA_SIZE); \| ^~~~~~~ \| kmalloc Fixes: 7d3eb23c4ccf ("tools/testing/cxl: Introduce a mock memory device + driver") Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/20240528225551.1025977-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-05-28	tools headers UAPI: Update i915_drm.h with the kernel sources	Arnaldo Carvalho de Melo
	Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-28	tools headers UAPI: Sync kvm headers with the kernel sources	Arnaldo Carvalho de Melo
	To pick the changes in: 4af663c2f64a8d25 ("KVM: SEV: Allow per-guest configuration of GHCB protocol version") 4f5defae708992dd ("KVM: SEV: introduce KVM_SEV_INIT2 operation") 26c44aa9e076ed83 ("KVM: SEV: define VM types for SEV and SEV-ES") ac5c48027bacb1b5 ("KVM: SEV: publish supported VMSA features") 651d61bc8b7d8bb6 ("KVM: PPC: Fix documentation for ppc mmu caps") That don't change functionality in tools/perf, as no new ioctl is added for the 'perf trace' scripts to harvest. This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joel Stanley <joel@jms.id.au> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Roth <michael.roth@amd.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/lkml/ZlYxAdHjyAkvGtMW@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-28	perf list: Fix the --no-desc option	Breno Leitao
	Currently, the --no-desc option in perf list isn't functioning as intended. This issue arises from the overwriting of struct option->desc with the opposite value of struct option->long_desc. Consequently, whatever parse_options() returns at struct option->desc gets overridden later, rendering the --desc or --no-desc arguments ineffective. To resolve this, set ->desc as true by default and allow parse_options() to adjust it accordingly. This adjustment will fix the --no-desc option while preserving the functionality of the other parameters. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Ian Rogers <irogers@google.com> Cc: leit@meta.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240517141427.1905691-1-leitao@debian.org
2024-05-28	perf arm-spe: Unaligned pointer work around	Ian Rogers
	Use get_unaligned_leXX instead of leXX_to_cpu to handle unaligned pointers. Such pointers occur with libFuzzer testing. A similar change for intel-pt was done in: https://lore.kernel.org/r/20231005190451.175568-6-adrian.hunter@intel.com Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240514052402.3031871-1-irogers@google.com
2024-05-28	perf tests: Add some pmu core functionality tests	Ian Rogers
	Test behavior of PMU names and comparisons wrt suffixes using Intel uncore_cha, marvell mrvl_ddr_pmu and S390's cpum_cf as examples. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Bharat Bhushan <bbhushan2@marvell.com> Cc: Bhaskara Budiredla <bbudiredla@marvell.com> Cc: Tuan Phan <tuanphan@os.amperecomputing.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240515060114.3268149-3-irogers@google.com
2024-05-28	perf pmus: Sort/merge/aggregate PMUs like mrvl_ddr_pmu	Ian Rogers
	The mrvl_ddr_pmu is uncore and has a hexadecimal address suffix while the previous PMU sorting/merging code assumes uncore PMU names start with uncore_ and have a decimal suffix. Because of the previous assumption it isn't possible to wildcard the mrvl_ddr_pmu. Modify pmu_name_len_no_suffix but also remove the suffix number out argument, this is because we don't know if a suffix number of say 100 is in hexadecimal or decimal. As the only use of the suffix number is in comparisons, it is safe there to compare the values as hexadecimal. Modify perf_pmu__match_ignoring_suffix so that hexadecimal suffixes are ignored. Only allow hexadecimal suffixes to be greater than length 2 (ie 3 or more) so that S390's cpum_cf PMU doesn't lose its suffix. Change the return type of pmu_name_len_no_suffix to size_t to workaround GCC incorrectly determining the result could be negative. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Bharat Bhushan <bbhushan2@marvell.com> Cc: Bhaskara Budiredla <bbudiredla@marvell.com> Cc: Tuan Phan <tuanphan@os.amperecomputing.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240515060114.3268149-2-irogers@google.com
2024-05-28	tools arch x86: Sync the msr-index.h copy with the kernel sources	Arnaldo Carvalho de Melo
	To pick up the changes from these csets: 53bc516ade85a764 ("x86/msr: Move ARCH_CAP_XAPIC_DISABLE bit definition to its rightful place") That patch just move definitions around, so this just silences this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Link: https://lore.kernel.org/lkml/ZlYe8jOzd1_DyA7X@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-28	tools/power/cpupower: Fix Pstate frequency reporting on AMD Family 1Ah CPUs	Dhananjay Ugwekar
	Update cpupower's P-State frequency calculation and reporting with AMD Family 1Ah+ processors, when using the acpi-cpufreq driver. This is due to a change in the PStateDef MSR layout in AMD Family 1Ah+. Tested on 4th and 5th Gen AMD EPYC system Signed-off-by: Ananth Narayan <Ananth.Narayan@amd.com> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-28	Merge tag 'for-netdev' of ↵	Jakub Kicinski
	https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-05-28 We've added 23 non-merge commits during the last 11 day(s) which contain a total of 45 files changed, 696 insertions(+), 277 deletions(-). The main changes are: 1) Rename skb's mono_delivery_time to tstamp_type for extensibility and add SKB_CLOCK_TAI type support to bpf_skb_set_tstamp(), from Abhishek Chauhan. 2) Add netfilter CT zone ID and direction to bpf_ct_opts so that arbitrary CT zones can be used from XDP/tc BPF netfilter CT helper functions, from Brad Cowie. 3) Several tweaks to the instruction-set.rst IETF doc to address the Last Call review comments, from Dave Thaler. 4) Small batch of riscv64 BPF JIT optimizations in order to emit more compressed instructions to the JITed image for better icache efficiency, from Xiao Wang. 5) Sort bpftool C dump output from BTF, aiming to simplify vmlinux.h diffing and forcing more natural type definitions ordering, from Mykyta Yatsenko. 6) Use DEV_STATS_INC() macro in BPF redirect helpers to silence a syzbot/KCSAN race report for the tx_errors counter, from Jiang Yunshui. 7) Un-constify bpf_func_info in bpftool to fix compilation with LLVM 17+ which started treating const structs as constants and thus breaking full BTF program name resolution, from Ivan Babrou. 8) Fix up BPF program numbers in test_sockmap selftest in order to reduce some of the test-internal array sizes, from Geliang Tang. 9) Small cleanup in Makefile.btf script to use test-ge check for v1.25-only pahole, from Alan Maguire. 10) Fix bpftool's make dependencies for vmlinux.h in order to avoid needless rebuilds in some corner cases, from Artem Savkov. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (23 commits) bpf, net: Use DEV_STAT_INC() bpf, docs: Fix instruction.rst indentation bpf, docs: Clarify call local offset bpf, docs: Add table captions bpf, docs: clarify sign extension of 64-bit use of 32-bit imm bpf, docs: Use RFC 2119 language for ISA requirements bpf, docs: Move sentence about returning R0 to abi.rst bpf: constify member bpf_sysctl_kern:: Table riscv, bpf: Try RVC for reg move within BPF_CMPXCHG JIT riscv, bpf: Use STACK_ALIGN macro for size rounding up riscv, bpf: Optimize zextw insn with Zba extension selftests/bpf: Handle forwarding of UDP CLOCK_TAI packets net: Add additional bit to support clockid_t timestamp type net: Rename mono_delivery_time to tstamp_type for scalabilty selftests/bpf: Update tests for new ct zone opts for nf_conntrack kfuncs net: netfilter: Make ct zone opts configurable for bpf ct helpers selftests/bpf: Fix prog numbers in test_sockmap bpf: Remove unused variable "prev_state" bpftool: Un-const bpf_func_info to fix it for llvm 17 and newer bpf: Fix order of args in call to bpf_map_kvcalloc ... ==================== Link: https://lore.kernel.org/r/20240528105924.30905-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28	tools headers: Update the syscall tables and unistd.h, mostly to support the ↵	Arnaldo Carvalho de Melo
	new 'mseal' syscall But also to wire up shadow stacks on 32-bit x86, picking up those changes from these csets: ff388fe5c481d39c ("mseal: wire up mseal syscall") 2883f01ec37dd866 ("x86/shstk: Enable shadow stacks for x32") This makes 'perf trace' support it, now its possible, for instance to do: # perf trace -e mseal --max-stack=16 Here is an example with the 'sendmmsg' syscall: root@x1:~# perf trace -e sendmmsg --max-stack 16 --max-events=1 0.000 ( 0.062 ms): dbus-broker/1012 sendmmsg(fd: 150, mmsg: 0x7ffef57cca50, vlen: 1, flags: DONTWAIT\|NOSIGNAL) = 1 syscall_exit_to_user_mode_prepare ([kernel.kallsyms]) syscall_exit_to_user_mode_prepare ([kernel.kallsyms]) syscall_exit_to_user_mode ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) [0x117ce7] (/usr/lib64/libc.so.6 (deleted)) root@x1:~# To do a system wide tracing of the new 'mseal' syscall with a backtrace of at most 16 entries. This addresses these perf tools build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H J Lu <hjl.tools@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jeff Xu <jeffxu@chromium.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZlXlo4TNcba4wnVZ@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-28	tools: hv: suppress the invalid warning for packed member alignment	Saurabh Sengar
	Packed struct vmbus_bufring is 4096 byte aligned and the reporting warning is for the first member of that struct which shouldn't add any offset to create alignment issue. Suppress the warning by adding -Wno-address-of-packed-member flag to gcc. Fixes: 45bab4d74651 ("tools: hv: Add vmbus_bufring") Reported-by: kernel test robot <yujie.liu@intel.com> Closes: https://lore.kernel.org/all/202404121913.GhtSoKbW-lkp@intel.com/ Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Link: https://lore.kernel.org/r/1714973938-4063-1-git-send-email-ssengar@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1714973938-4063-1-git-send-email-ssengar@linux.microsoft.com>
2024-05-27	selftests: mptcp: join: mark 'fail' tests as flaky	Matthieu Baerts (NGI0)
	These tests are rarely unstable. It depends on the CI running the tests, especially if it is also busy doing other tasks in parallel, and if a debug kernel config is being used. It looks like this issue is sometimes present with the NetDev CI. While this is being investigated, the tests are marked as flaky not to create noises on such CIs. Fixes: b6e074e171bc ("selftests: mptcp: add infinite map testcase") Link: https://github.com/multipath-tcp/mptcp_net-next/issues/491 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240524-upstream-net-20240524-selftests-mptcp-flaky-v1-4-a352362f3f8e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-27	selftests: mptcp: join: mark 'fastclose' tests as flaky	Matthieu Baerts (NGI0)
	These tests are flaky since their introduction. This might be less or not visible depending on the CI running the tests, especially if it is also busy doing other tasks in parallel, and if a debug kernel config is being used. It looks like this issue is often present with the NetDev CI. While this is being investigated, the tests are marked as flaky not to create noises on such CIs. Fixes: 01542c9bf9ab ("selftests: mptcp: add fastclose testcase") Link: https://github.com/multipath-tcp/mptcp_net-next/issues/324 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240524-upstream-net-20240524-selftests-mptcp-flaky-v1-3-a352362f3f8e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-27	selftests: mptcp: simult flows: mark 'unbalanced' tests as flaky	Matthieu Baerts (NGI0)
	These tests are flaky since their introduction. This might be less or not visible depending on the CI running the tests, especially if it is also busy doing other tasks in parallel. A first analysis shown that the transfer can be slowed down when there are some re-injections at the MPTCP level. Such re-injections can of course happen, and disturb the transfer, but it looks strange to have them in this lab. That could be caused by the kernel having access to less CPU cycles -- e.g. when other activities are executed in parallel -- or by a misinterpretation on the MPTCP packet scheduler side. While this is being investigated, the tests are marked as flaky not to create noises in other CIs. Fixes: 219d04992b68 ("mptcp: push pending frames when subflow has free space") Link: https://github.com/multipath-tcp/mptcp_net-next/issues/475 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240524-upstream-net-20240524-selftests-mptcp-flaky-v1-2-a352362f3f8e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>