git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message	Author
2024-01-23	bpf: Add cookie to perf_event bpf_link_info records	Jiri Olsa
	At the moment we don't store cookie for perf_event probes, while we do that for the rest of the probes. Adding cookie fields to struct bpf_link_info perf event probe records: perf_event.uprobe perf_event.kprobe perf_event.tracepoint perf_event.perf_event And the code to store that in bpf_link_info struct. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <song@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20240119110505.400573-2-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	bpf: Use r constraint instead of p constraint in selftests	Jose E. Marchesi
	Some of the BPF selftests use the "p" constraint in inline assembly snippets, for input operands for MOV (rN = rM) instructions. This is mainly done via the __imm_ptr macro defined in tools/testing/selftests/bpf/progs/bpf_misc.h:

	    #define __imm_ptr(name) [name]"p"(&name)

	Example:

	    int consume_first_item_only(void *ctx)
	    {
	        struct bpf_iter_num iter;
	        asm volatile (
	            /* create iterator */
	            "r1 = %[iter];"
	            [...]
	            :
	            : __imm_ptr(iter)
	            : CLOBBERS);
	        [...]
	    }

	The "p" constraint is a tricky one. It is documented in the GCC manual section "Simple Constraints":

	    An operand that is a valid memory address is allowed. This is for ``load address'' and ``push address'' instructions. p in the constraint must be accompanied by address_operand as the predicate in the match_operand. This predicate interprets the mode specified in the match_operand as the mode of the memory reference for which the address would be valid.

	There are two problems:

	1. It is questionable whether that constraint was ever intended to be used in inline assembly templates, because its behavior really depends on compiler internals. A "memory address" is not the same as a "memory operand" or a "memory reference" (constraint "m"), and in fact its usage in the template above results in an error in both x86_64-linux-gnu and bpf-unknown-none:

	    foo.c: In function ‘bar’:
	    foo.c:6:3: error: invalid 'asm': invalid expression as operand
	        6 | asm volatile ("r1 = %[jorl]" : : [jorl]"p"(&jorl));
	          | ^~~

	I would assume the same happens with aarch64, riscv, and most/all other targets in GCC, that do not accept operands of the form A + B that are not wrapped either in a const or in a memory reference. To avoid that error, the usage of the "p" constraint in internal GCC instruction templates is supposed to be complemented by the 'a' modifier, like in:

	    asm volatile ("r1 = %a[jorl]" : : [jorl]"p"(&jorl));

	Internally documented (in GCC's final.cc) as:

	    %aN means expect operand N to be a memory address (not a memory reference!) and print a reference to that address.

	That works because when the modifier 'a' is found, GCC prints an "operand address", which is not the same as an "operand". But...

	2. Even if we used the internal 'a' modifier (we shouldn't) the 'rN = rM' instruction really requires a register argument. In cases involving automatics, like in the examples above, we easily end with:

	    bar:
	        #APP
	        r1 = r10-4
	        #NO_APP

	In other cases we could conceivably also end with a 64-bit label that may overflow the 32-bit immediate operand of `rN = imm32' instructions:

	    r1 = foo

	All of which is clearly wrong. clang happens to do "the right thing" in the current usage of __imm_ptr in the BPF tests, because even with -O2 it seems to "reload" the fp-relative address of the automatic to a register like in:

	    bar:
	        r1 = r10
	        r1 += -4
	        #APP
	        r1 = r1
	        #NO_APP

	Which is what GCC would generate with -O0. Whether this is by chance or by design, the compiler shouldn't be expected to do that reload driven by the "p" constraint. This patch changes the usage of the "p" constraint in the BPF selftests macros to use the "r" constraint instead. If a register is what is required, we should let the compiler know.

	Previous discussion in bpf@vger: https://lore.kernel.org/bpf/87h6p5ebpb.fsf@oracle.com/T/#ef0df83d6975c34dff20bf0dd52e078f5b8ca2767

	Tested in bpf-next master. No regressions.

	Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
	Cc: Yonghong Song <yonghong.song@linux.dev>
	Cc: Eduard Zingerman <eddyz87@gmail.com>
	Acked-by: Yonghong Song <yonghong.song@linux.dev>
	Link: https://lore.kernel.org/r/20240123181309.19853-1-jose.marchesi@oracle.com
	Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	bpf: fix constraint in test_tcpbpf_kern.c	Jose E. Marchesi
	GCC emits a warning: progs/test_tcpbpf_kern.c:60:9: error: ‘op’ is used uninitialized [-Werror=uninitialized] when an uninitialized op is used with a "+r" constraint. The + modifier means a read-write operand, but that operand in the selftest is just written to. This patch changes the selftest to use a "=r" constraint. This pacifies GCC. Tested in bpf-next master. No regressions. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Cc: Yonghong Song <yhs@meta.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: david.faust@oracle.com Cc: cupertino.miranda@oracle.com Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240123205624.14746-1-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	bpf: avoid VLAs in progs/test_xdp_dynptr.c	Jose E. Marchesi
	VLAs are not supported by the BPF port of either clang or GCC. The selftest test_xdp_dynptr.c contains the following code:

	    const size_t tcphdr_sz = sizeof(struct tcphdr);
	    const size_t udphdr_sz = sizeof(struct udphdr);
	    const size_t ethhdr_sz = sizeof(struct ethhdr);
	    const size_t iphdr_sz = sizeof(struct iphdr);
	    const size_t ipv6hdr_sz = sizeof(struct ipv6hdr);
	    [...]
	    static __always_inline int handle_ipv4(struct xdp_md *xdp, struct bpf_dynptr *xdp_ptr)
	    {
	        __u8 eth_buffer[ethhdr_sz + iphdr_sz + ethhdr_sz];
	        __u8 iph_buffer_tcp[iphdr_sz + tcphdr_sz];
	        __u8 iph_buffer_udp[iphdr_sz + udphdr_sz];
	        [...]
	    }

	The eth_buffer, iph_buffer_tcp and other automatics are fixed size only if the compiler optimizes away the constant global variables. clang does this, but GCC does not, turning these automatics into variable length arrays. This patch removes the global variables and turns these values into preprocessor constants. This makes the selftest build properly with GCC.

	Tested in bpf-next master. No regressions.

	Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
	Cc: Yonghong Song <yhs@meta.com>
	Cc: Eduard Zingerman <eddyz87@gmail.com>
	Cc: david.faust@oracle.com
	Cc: cupertino.miranda@oracle.com
	Acked-by: Yonghong Song <yonghong.song@linux.dev>
	Link: https://lore.kernel.org/r/20240123201729.16173-1-jose.marchesi@oracle.com
	Signed-off-by: Alexei Starovoitov <ast@kernel.org>
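The difference between the two forms can be sketched in plain C. This is only an illustration, not the selftest code: `struct demo_hdr`, `DEMO_HDR_SZ` and `fixed_buffer_size()` are hypothetical names standing in for the real header structs and buffers.

```c
#include <stddef.h>

/* Hypothetical stand-in for a protocol header such as struct iphdr. */
struct demo_hdr {
	unsigned char bytes[20];
};

/*
 * In C (unlike C++), a const-qualified object such as
 *     const size_t demo_hdr_sz = sizeof(struct demo_hdr);
 * is not an integer constant expression, so an automatic array sized by it
 * is formally a variable length array unless the optimizer folds it away.
 * A macro that expands to sizeof(...) is a constant expression, so the
 * array below always has a fixed size.
 */
#define DEMO_HDR_SZ sizeof(struct demo_hdr)

static size_t fixed_buffer_size(void)
{
	unsigned char buf[DEMO_HDR_SZ + DEMO_HDR_SZ]; /* fixed-size array */

	return sizeof(buf);
}
```

Calling `fixed_buffer_size()` returns the combined size of two headers, and the array bound is known at compile time regardless of optimization level.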
2024-01-23	libbpf: call dup2() syscall directly	Andrii Nakryiko
	We've run into issues with using dup2() API in production setting, where libbpf is linked into large production environment and ends up calling unintended custom implementations of dup2(). These custom implementations don't provide atomic FD replacement guarantees of dup2() syscall, leading to subtle and hard to debug issues. To prevent this in the future and guarantee that no libc implementation will do their own custom non-atomic dup2() implementation, call dup2() syscall directly with syscall(SYS_dup2). Note that some architectures don't seem to provide dup2 and have dup3 instead. Try to detect and pick best syscall. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <song@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240119210201.1295511-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
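A minimal sketch of the approach described above, assuming a Linux host. The names `raw_dup2()` and `demo_roundtrip()` are hypothetical and this is not the libbpf code; it only shows how the raw syscall can be selected at compile time.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: issue the syscall directly so that no user-space
 * dup2() override can be picked up at link time.
 */
static int raw_dup2(int oldfd, int newfd)
{
#ifdef SYS_dup2
	return (int)syscall(SYS_dup2, oldfd, newfd);
#else
	/*
	 * Some architectures (arm64, for example) only provide dup3.
	 * Unlike dup2, dup3 fails with EINVAL when oldfd == newfd.
	 */
	return (int)syscall(SYS_dup3, oldfd, newfd, 0);
#endif
}

/*
 * Duplicate the write end of a pipe onto another fd number and check that
 * data written through the duplicate arrives at the read end.
 * Returns 0 on success, -1 on any failure.
 */
static int demo_roundtrip(void)
{
	int p[2];
	int target;
	char buf[3] = { 0 };

	if (pipe(p) < 0)
		return -1;
	target = p[1] + 10; /* arbitrary fd number chosen for the demo */
	if (raw_dup2(p[1], target) != target)
		return -1;
	if (write(target, "ok", 2) != 2)
		return -1;
	if (read(p[0], buf, 2) != 2)
		return -1;
	close(target);
	close(p[0]);
	close(p[1]);
	return strcmp(buf, "ok") == 0 ? 0 : -1;
}
```

The compile-time `#ifdef` is one way to "detect and pick" the syscall; whether the real patch uses the same test is not stated in the message.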
2024-01-23	selftests/bpf: Enable kptr_xchg_inline test for arm64	Hou Tao
	Now arm64 bpf jit has enabled bpf_jit_supports_ptr_xchg(), so enable the test for arm64 as well. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20240119102529.99581-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	selftest/bpf: Add map_in_maps with BPF_MAP_TYPE_PERF_EVENT_ARRAY values	Andrey Grafin
	Check that bpf_object__load() successfully creates map_in_maps with BPF_MAP_TYPE_PERF_EVENT_ARRAY values. These changes cover fix in the previous patch "libbpf: Apply map_set_def_max_entries() for inner_maps on creation".

	A command line output is:

	- w/o fix
	    $ sudo ./test_maps
	    libbpf: map 'mim_array_pe': failed to create inner map: -22
	    libbpf: map 'mim_array_pe': failed to create: Invalid argument(-22)
	    libbpf: failed to load object './test_map_in_map.bpf.o'
	    Failed to load test prog

	- with fix
	    $ sudo ./test_maps
	    ...
	    test_maps: OK, 0 SKIPPED

	Fixes: 646f02ffdd49 ("libbpf: Add BTF-defined map-in-map support")
	Signed-off-by: Andrey Grafin <conquistador@yandex-team.ru>
	Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
	Acked-by: Yonghong Song <yonghong.song@linux.dev>
	Acked-by: Hou Tao <houtao1@huawei.com>
	Link: https://lore.kernel.org/bpf/20240117130619.9403-2-conquistador@yandex-team.ru
	Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	libbpf: Apply map_set_def_max_entries() for inner_maps on creation	Andrey Grafin
	This patch allows bpf_object__load() to auto-create BPF_MAP_TYPE_ARRAY_OF_MAPS and BPF_MAP_TYPE_HASH_OF_MAPS with values of BPF_MAP_TYPE_PERF_EVENT_ARRAY. The previous behaviour created a zero-filled btf_map_def for inner maps and tried to use it for map creation, but the Linux kernel forbids creating a BPF_MAP_TYPE_PERF_EVENT_ARRAY map with max_entries=0. Fixes: 646f02ffdd49 ("libbpf: Add BTF-defined map-in-map support") Signed-off-by: Andrey Grafin <conquistador@yandex-team.ru> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20240117130619.9403-1-conquistador@yandex-team.ru Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	bpf: Sync uapi bpf.h header for the tooling infra	Daniel Borkmann
	Both commit 91051f003948 ("tcp: Dump bound-only sockets in inet_diag.") and commit 985b8ea9ec7e ("bpf, docs: Fix bpf_redirect_peer header doc") missed the tooling header sync. Fix it. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	selftest: bpf: Test bpf_sk_assign_tcp_reqsk().	Kuniyuki Iwashima
	This commit adds a sample selftest to demonstrate how we can use bpf_sk_assign_tcp_reqsk() as the backend of SYN Proxy. The test creates IPv4/IPv6 x TCP connections and transfers messages over them on lo with BPF tc prog attached. The tc prog will process SYN and return SYN+ACK with the following ISN and TS. In a real use case, this part will be done by other hosts.

	         MSB                                  LSB
	    ISN: | 31 ... 8 | 7 6 |  5  |  4   | 3 2 1 0 |
	         |  Hash_1  | MSS | ECN | SACK | WScale  |

	    TS:  | 31 ... 8 | 7 ... 0 |
	         |  Random  | Hash_2  |

	WScale in SYN is reused in SYN+ACK. The client returns ACK, and tc prog will recalculate ISN and TS from ACK and validate SYN Cookie. If it's valid, the prog calls kfunc to allocate a reqsk for skb and configure the reqsk based on the argument created from SYN Cookie. Later, the reqsk will be processed in cookie_v[46]_check() to create a connection.

	Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
	Link: https://lore.kernel.org/r/20240115205514.68364-7-kuniyu@amazon.com
	Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
	Signed-off-by: Alexei Starovoitov <ast@kernel.org>
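The ISN bit layout given in this message can be sketched as a pair of pack/unpack helpers. This is an illustration under assumptions: the struct and function names are hypothetical, and how Hash_1 is computed or what the 2-bit MSS field indexes is not covered here.

```c
#include <stdint.h>

/* Hypothetical container for the fields named in the ISN layout. */
struct cookie_fields {
	uint32_t hash1;  /* bits 31..8, 24 bits */
	uint8_t mss;     /* bits 7..6,  2 bits  */
	uint8_t ecn;     /* bit 5               */
	uint8_t sack;    /* bit 4               */
	uint8_t wscale;  /* bits 3..0,  4 bits  */
};

/* Pack the fields into a 32-bit ISN following the layout above. */
static uint32_t isn_pack(struct cookie_fields f)
{
	return ((f.hash1 & 0xffffffu) << 8) |
	       ((uint32_t)(f.mss & 0x3u) << 6) |
	       ((uint32_t)(f.ecn & 0x1u) << 5) |
	       ((uint32_t)(f.sack & 0x1u) << 4) |
	       (uint32_t)(f.wscale & 0xfu);
}

/* Recover the fields from an ISN, as would be done when the ACK arrives. */
static struct cookie_fields isn_unpack(uint32_t isn)
{
	struct cookie_fields f;

	f.hash1 = isn >> 8;
	f.mss = (isn >> 6) & 0x3u;
	f.ecn = (isn >> 5) & 0x1u;
	f.sack = (isn >> 4) & 0x1u;
	f.wscale = isn & 0xfu;
	return f;
}
```

Since the ACK acknowledges ISN + 1, a validator built on such helpers would presumably subtract one from the acknowledgment number before unpacking.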
2024-01-23	selftests/bpf: Fix potential premature unload in bpf_testmod	Artem Savkov
	It is possible for bpf_kfunc_call_test_release() to be called from bpf_map_free_deferred() when bpf_testmod is already unloaded and perf_test_stuct.cnt which it tries to decrease is no longer in memory. This patch tries to fix the issue by waiting for all references to be dropped in bpf_testmod_exit(). The issue can be triggered by running 'test_progs -t map_kptr' in 6.5, but is obscured in 6.6 by d119357d07435 ("rcu-tasks: Treat only synchronous grace periods urgently"). Fixes: 65eb006d85a2 ("bpf: Move kernel test kfuncs to bpf_testmod") Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/82f55c0e-0ec8-4fe1-8d8c-b1de07558ad9@linux.dev Link: https://lore.kernel.org/bpf/20240110085737.8895-1-asavkov@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	bpftool: Silence build warning about calloc()	Tiezhu Yang
	There exists the following warning when building bpftool:

	    CC prog.o
	    prog.c: In function ‘profile_open_perf_events’:
	    prog.c:2301:24: warning: ‘calloc’ sizes specified with ‘sizeof’ in the earlier argument and not in the later argument [-Wcalloc-transposed-args]
	     2301 | sizeof(int), obj->rodata->num_cpu * obj->rodata->num_metric);
	          | ^~~
	    prog.c:2301:24: note: earlier argument should specify number of elements, later size of each element

	Tested with the latest upstream GCC which contains a new warning option -Wcalloc-transposed-args. The first argument to calloc is documented to be the number of elements in the array, while the second argument is the size of each element, so just switch the first and second arguments of calloc() to silence the build warning. Compile tested only.

	Fixes: 47c09d6a9f67 ("bpftool: Introduce "prog profile" command")
	Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
	Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
	Reviewed-by: Quentin Monnet <quentin@isovalent.com>
	Link: https://lore.kernel.org/bpf/20240116061920.31172-1-yangtiezhu@loongson.cn
	Signed-off-by: Alexei Starovoitov <ast@kernel.org>
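For reference, the argument order that the warning asks for looks like this. The helper name `alloc_fd_table()` is hypothetical; the parameter names merely echo the fields quoted in the warning.

```c
#include <stdlib.h>

/*
 * calloc(nmemb, size): element count first, element size second.
 * Returns a zero-initialized table of num_cpu * num_metric ints,
 * or NULL on allocation failure.
 */
static int *alloc_fd_table(size_t num_cpu, size_t num_metric)
{
	return calloc(num_cpu * num_metric, sizeof(int));
}
```

Both orders allocate the same number of bytes, which is why the swapped form only triggers a warning and not a functional bug.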
2024-01-23	bpf: Minor improvements for bpf_cmp.	Alexei Starovoitov
	Few minor improvements for bpf_cmp() macro:
	. reduce number of args in __bpf_cmp()
	. rename NOFLIP to UNLIKELY
	. add a comment about 64-bit truncation in "i" constraint
	. use "ri" constraint for sizeof(rhs) <= 4
	. improve error message for bpf_cmp_likely()

	Before:

	    progs/iters_task_vma.c:31:7: error: variable 'ret' is uninitialized when used here [-Werror,-Wuninitialized]
	       31 | if (bpf_cmp_likely(seen, <==, 1000))
	          | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	    ../bpf/bpf_experimental.h:325:3: note: expanded from macro 'bpf_cmp_likely'
	      325 | ret;
	          | ^~~
	    progs/iters_task_vma.c:31:7: note: variable 'ret' is declared here
	    ../bpf/bpf_experimental.h:310:3: note: expanded from macro 'bpf_cmp_likely'
	      310 | bool ret;
	          | ^

	After:

	    progs/iters_task_vma.c:31:7: error: invalid operand for instruction
	       31 | if (bpf_cmp_likely(seen, <==, 1000))
	          | ^
	    ../bpf/bpf_experimental.h:324:17: note: expanded from macro 'bpf_cmp_likely'
	      324 | asm volatile("r0 " #OP " invalid compare");
	          | ^
	    <inline asm>:1:5: note: instantiated into assembly here
	        1 | r0 <== invalid compare
	          | ^

	Signed-off-by: Alexei Starovoitov <ast@kernel.org>
	Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
	Acked-by: Yonghong Song <yonghong.song@linux.dev>
	Link: https://lore.kernel.org/bpf/20240112220134.71209-1-alexei.starovoitov@gmail.com
	Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	selftests/bpf: Add a selftest with not-8-byte aligned BPF_ST	Yonghong Song
	Add a selftest with a 4 bytes BPF_ST of 0 where the store is not 8-byte aligned. The goal is to ensure that STACK_ZERO is properly marked in stack slots and the STACK_ZERO value can propagate properly during the load. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240110051355.2737232-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	bpf: Track aligned st store as imprecise spilled registers	Yonghong Song
	With patch set [1], precision backtracking supports register spill/fill to/from the stack. The patch [2] allows initial imprecise register spill with content 0. This is a common case for cpuv3 and lower for initializing the stack variables with pattern

	    r1 = 0
	    *(u64 *)(r10 - 8) = r1

	and the [2] has demonstrated good verification improvement. For cpuv4, the initialization could be

	    *(u64 *)(r10 - 8) = 0

	The current verifier marks the r10-8 contents with STACK_ZERO. Similar to [2], let us permit the above insn to behave like imprecise register spill which can reduce number of verified states. The change is in function check_stack_write_fixed_off().

	Before this patch, spilled zero will be marked as STACK_ZERO which can provide precise values. In check_stack_write_var_off(), STACK_ZERO will be maintained if writing a const zero so later it can provide precise values if needed.

	The above handling of '*(u64 *)(r10 - 8) = 0' as a spill will have issues in check_stack_write_var_off() as the spill will be converted to STACK_MISC and the precise value 0 is lost. To fix this issue, if the spill slots hold const zero and the BPF_ST write is also a const zero, the spill slots are preserved, which can later provide precise values if needed. Without the change in check_stack_write_var_off(), the test_verifier subtest 'BPF_ST_MEM stack imm zero, variable offset' will fail.

	I checked cpuv3 and cpuv4 with and without this patch with veristat. There is no state change for cpuv3 since '*(u64 *)(r10 - 8) = 0' is only generated with cpuv4.

	For cpuv4:

	    $ ../veristat -C old.cpuv4.csv new.cpuv4.csv -e file,prog,insns,states -f 'insns_diff!=0'
	    File                                       Program              Insns (A)  Insns (B)     Insns (DIFF)  States (A)  States (B)  States (DIFF)
	    ------------------------------------------ -------------------  ---------  ---------  ---------------  ----------  ----------  -------------
	    local_storage_bench.bpf.linked3.o          get_local                  228        168    -60 (-26.32%)          17          14   -3 (-17.65%)
	    pyperf600_bpf_loop.bpf.linked3.o           on_event                  6066       4889  -1177 (-19.40%)         403         321  -82 (-20.35%)
	    test_cls_redirect.bpf.linked3.o            cls_redirect             35483      35387     -96 (-0.27%)        2179        2177    -2 (-0.09%)
	    test_l4lb_noinline.bpf.linked3.o           balancer_ingress          4494       4522     +28 (+0.62%)         217         219    +2 (+0.92%)
	    test_l4lb_noinline_dynptr.bpf.linked3.o    balancer_ingress          1432       1455     +23 (+1.61%)          92          94    +2 (+2.17%)
	    test_xdp_noinline.bpf.linked3.o            balancer_ingress_v6       3462       3458      -4 (-0.12%)         216         216    +0 (+0.00%)
	    verifier_iterating_callbacks.bpf.linked3.o widening                    52         41    -11 (-21.15%)           4           3   -1 (-25.00%)
	    xdp_synproxy_kern.bpf.linked3.o            syncookie_tc             12412      11719    -693 (-5.58%)         345         330   -15 (-4.35%)
	    xdp_synproxy_kern.bpf.linked3.o            syncookie_xdp            12478      11794    -684 (-5.48%)         346         331   -15 (-4.34%)

	test_l4lb_noinline and test_l4lb_noinline_dynptr have a minor regression, but pyperf600_bpf_loop and local_storage_bench get a pretty good improvement.

	[1] https://lore.kernel.org/all/20231205184248.1502704-1-andrii@kernel.org/
	[2] https://lore.kernel.org/all/20231205184248.1502704-9-andrii@kernel.org/

	Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
	Cc: Martin KaFai Lau <kafai@fb.com>
	Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
	Tested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
	Acked-by: Andrii Nakryiko <andrii@kernel.org>
	Link: https://lore.kernel.org/r/20240110051348.2737007-1-yonghong.song@linux.dev
	Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	selftests/bpf: Test assigning ID to scalars on spill	Maxim Mikityanskiy
	The previous commit implemented assigning IDs to registers holding scalars before spill. Add the test cases to check the new functionality. Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240108205209.838365-10-maxtram95@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	bpf: Assign ID to scalars on spill	Maxim Mikityanskiy
	Currently, when a scalar bounded register is spilled to the stack, its ID is preserved, but only if it was already assigned, i.e. if this register was MOVed before. Assign an ID on spill if none is set, so that equal scalars could be tracked if a register is spilled to the stack and filled into another register. One test is adjusted to reflect the change in register IDs. Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240108205209.838365-9-maxtram95@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	selftests/bpf: Add a test case for 32-bit spill tracking	Maxim Mikityanskiy
	When a range check is performed on a register that was 32-bit spilled to the stack, the IDs of the two instances of the register are the same, so the range should also be the same. Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240108205209.838365-6-maxtram95@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	selftests/bpf: check if imprecise stack spills confuse infinite loop detection	Eduard Zingerman
	Verify that infinite loop detection logic separates states with identical register states but different imprecise scalars spilled to stack. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240108205209.838365-4-maxtram95@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	selftests/bpf: Fix the u64_offset_to_skb_data test	Maxim Mikityanskiy
	The u64_offset_to_skb_data test is supposed to make a 64-bit fill, but instead makes a 16-bit one. Fix the test according to its intention and update the comments accordingly (umax is no longer 0xffff). The 16-bit fill is covered by u16_offset_to_skb_data. Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240108205209.838365-2-maxtram95@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	selftests/bpf: Update LLVM Phabricator links	Nathan Chancellor
	reviews.llvm.org was LLVM's Phabricator instance for code review. It has been abandoned in favor of GitHub pull requests. While the majority of links in the kernel sources still work because of the work Fangrui has done turning the dynamic Phabricator instance into a static archive, there are some issues with that work, so preemptively convert all the links in the kernel sources to point to the commit on GitHub. Most of the commits have the corresponding differential review link in the commit message itself so there should not be any loss of fidelity in the relevant information. Additionally, fix a typo in the xdpwall.c print ("LLMV" -> "LLVM") while in the area. Link: https://discourse.llvm.org/t/update-on-github-pull-requests/71540/172 Acked-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20240111-bpf-update-llvm-phabricator-links-v2-1-9a7ae976bd64@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	selftests/bpf: detect testing prog flags support	Andrii Nakryiko
	Various tests specify extra testing prog_flags when loading BPF programs, like BPF_F_TEST_RND_HI32, and more recently also BPF_F_TEST_REG_INVARIANTS. While BPF_F_TEST_RND_HI32 is old enough to not cause much problem on older kernels, BPF_F_TEST_REG_INVARIANTS is very fresh and unconditionally specifying it causes selftests to fail on even slightly outdated kernels. This breaks libbpf CI test against 4.9 and 5.15 kernels, it can break some local development (done outside of VM), etc. To prevent this, and guard against similar problems in the future, do runtime detection of supported "testing flags", and only provide those that host kernel recognizes. Acked-by: Song Liu <song@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240109231738.575844-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	selftests/bpf: fix test_loader check message	Andrii Nakryiko
	Seeing: process_subtest:PASS:Can't alloc specs array 0 nsec ... in verbose successful test log is very confusing. Use smaller identifier-like test tag to denote that we are asserting specs array allocation success. Now it's much less distracting: process_subtest:PASS:specs_alloc 0 nsec Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240105000909.2818934-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	selftests/bpf: Test the inlining of bpf_kptr_xchg()	Hou Tao
	The test uses bpf_prog_get_info_by_fd() to obtain the xlated instructions of the program first. Since these instructions have already been rewritten by the verifier, the test then checks whether the rewritten instructions are as expected. And to ensure LLVM generates code exactly as expected, use inline assembly and a naked function. Suggested-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240105104819.3916743-4-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	selftests/bpf: Factor out get_xlated_program() helper	Hou Tao
	Both test_verifier and test_progs use get_xlated_program(), so move the helper into testing_helpers.h to reuse it. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20240105104819.3916743-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23	tools: iio: replace seekdir() in iio_generic_buffer	Petre Rodan
	Replace seekdir() with rewinddir() in order to fix a localized glibc bug. One of the glibc patches that stable Gentoo is using causes an improper directory stream positioning bug on 32bit arm. That in turn ends up as a floating point exception in iio_generic_buffer. The attached patch provides a fix by using an equivalent function which should not cause trouble for other distros and is easier to reason about in general as it obviously always goes back to the start. https://sourceware.org/bugzilla/show_bug.cgi?id=31212 Signed-off-by: Petre Rodan <petre.rodan@subdimension.ro> Link: https://lore.kernel.org/r/20240108103224.3986-1-petre.rodan@subdimension.ro Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
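As an illustration of the pattern (not the iio_generic_buffer code), a directory can be scanned twice with rewinddir() between passes. The helper names `count_entries()` and `scan_twice()` are hypothetical.

```c
#include <dirent.h>
#include <stddef.h>

/* Count the entries remaining in an open directory stream. */
static long count_entries(DIR *dp)
{
	long n = 0;

	while (readdir(dp) != NULL)
		n++;
	return n;
}

/*
 * Scan a directory twice. rewinddir() resets the stream to the first
 * entry, so no saved position from telldir() is needed.
 * Returns 0 on success and fills in both counts, -1 if opendir() fails.
 */
static int scan_twice(const char *path, long *first, long *second)
{
	DIR *dp = opendir(path);

	if (dp == NULL)
		return -1;
	*first = count_entries(dp);
	rewinddir(dp);
	*second = count_entries(dp);
	closedir(dp);
	return 0;
}
```

When the second pass always starts from the beginning, as the message says is the case here, rewinddir() expresses that intent directly.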
2024-01-23	selftest: Don't reuse port for SO_INCOMING_CPU test.	Kuniyuki Iwashima
	Jakub reported that ASSERT_EQ(cpu, i) in so_incoming_cpu.c seems to fire somewhat randomly.

	    # # RUN so_incoming_cpu.before_reuseport.test3 ...
	    # # so_incoming_cpu.c:191:test3:Expected cpu (32) == i (0)
	    # # test3: Test terminated by assertion
	    # # FAIL so_incoming_cpu.before_reuseport.test3
	    # not ok 3 so_incoming_cpu.before_reuseport.test3

	When the test failed, not-yet-accepted CLOSE_WAIT sockets received SYN with a "challenging" SEQ number, which was sent from an unexpected CPU that did not create the receiver.

	The test basically does:

	    1. for each cpu:
	      1-1. create a server
	      1-2. set SO_INCOMING_CPU

	    2. for each cpu:
	      2-1. set cpu affinity
	      2-2. create some clients
	      2-3. let clients connect() to the server on the same cpu
	      2-4. close() clients

	    3. for each server:
	      3-1. accept() all child sockets
	      3-2. check if all children have the same SO_INCOMING_CPU with the server

	The root cause was the close() in 2-4. and net.ipv4.tcp_tw_reuse.

	In a loop of 2., close() changed the client state to FIN_WAIT_2, and the peer transitioned to CLOSE_WAIT. In another loop of 2., connect() happened to select the same port of the FIN_WAIT_2 socket, and it was reused as the default value of net.ipv4.tcp_tw_reuse is 2.

	As a result, the new client sent SYN to the CLOSE_WAIT socket from a different CPU, and the receiver's sk_incoming_cpu was overwritten with unexpected CPU ID. Also, the SYN had a different SEQ number, so the CLOSE_WAIT socket responded with Challenge ACK. The new client properly returned RST and effectively killed the CLOSE_WAIT socket.

	This way, all clients were created successfully, but the error was detected later by 3-2., ASSERT_EQ(cpu, i).

	To avoid the failure, let's make sure that (i) the number of clients is less than the number of available ports and (ii) such reuse never happens.

	Fixes: 6df96146b202 ("selftest: Add test for SO_INCOMING_CPU.")
	Reported-by: Jakub Kicinski <kuba@kernel.org>
	Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
	Tested-by: Jakub Kicinski <kuba@kernel.org>
	Link: https://lore.kernel.org/r/20240120031642.67014-1-kuniyu@amazon.com
	Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-22	perf sched: Commit to evsel__taskstate() to parse task state info	Ze Gao
	Now that we have evsel__taskstate() which no longer relies on the hardcoded task state string and has good backward compatibility, we have a good reason to use it. Note TASK_STATE_TO_CHAR_STR and task bitmasks are useless now so we remove them for good. And now we pass the state info back and forth in a symbolic char which explains itself well instead. Signed-off-by: Ze Gao <zegao@tencent.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20240123022425.1611483-1-zegao@tencent.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-22	perf util: Add evsel__taskstate() to parse the task state info instead	Ze Gao
	Now that we have the __print_flags() parsing routines, we add a new helper evsel__taskstate() to extract the task state info from the recorded data. Signed-off-by: Ze Gao <zegao@tencent.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20240122070859.1394479-5-zegao@tencent.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-22	perf util: Add helpers to parse task state string from libtraceevent	Ze Gao
	Perf uses a hard coded string "RSDTtXZPI" to index the sched_switch prev_state field raw bitmask value. This works well except for when the kernel changes this string, in which case this will break again. Instead we add a new way to parse the task state string from the tracepoint print format already recorded by perf, which eliminates further dependencies on this hardcoded and unmaintainable macro, and this is exactly what libtraceevent[1] does for now. So we borrow the print flags parsing logic from libtraceevent[1]. And in get_states(), we walk the print arguments until the __print_flags() for the target state field is found, and use that to build the states string for future parsing. [1]: https://lore.kernel.org/linux-trace-devel/20231224140732.7d41698d@rorschach.local.home/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Ze Gao <zegao@tencent.com> Link: https://lore.kernel.org/r/20240122070859.1394479-4-zegao@tencent.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
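The fragile scheme that this series moves away from can be sketched as follows. This is an assumption-laden illustration, not perf's code: it assumes a prev_state of 0 means running and that bit n of the mask (counting from 1) selects character n of the string. `state_to_char()` is a hypothetical name.

```c
#include <strings.h> /* ffs() */

/*
 * Map a raw prev_state bitmask to a one-letter task state using a
 * hard-coded table. If the kernel reorders or extends its state bits,
 * this table silently goes stale, which is the problem described above.
 */
static char state_to_char(unsigned int prev_state)
{
	static const char states[] = "RSDTtXZPI";
	/* ffs() returns the 1-based index of the lowest set bit, 0 if none. */
	unsigned int bit = prev_state ? (unsigned int)ffs((int)prev_state) : 0;

	return bit < sizeof(states) - 1 ? states[bit] : '?';
}
```

Parsing the letters from the recorded tracepoint print format instead keeps the mapping tied to the kernel that produced the data.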
2024-01-22	perf sched: Sync state char array with the kernel	Ze Gao
	Update state char array to match the latest kernel definitions and remove unused state mapping macros. Note this is a preparatory patch for getting rid of the way to parse process state from raw bitmask value. Instead we are going to parse it from the recorded tracepoint print format, and this change marks why we're doing it. Signed-off-by: Ze Gao <zegao@tencent.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20240122070859.1394479-3-zegao@tencent.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-22	perf data: Minor code style alignment cleanup	Yang Jihong
	Minor code style alignment cleanup for perf_data__switch() and perf_data__write(). No functional change. Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240119040304.3708522-4-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-22	perf record: Check conflict between '--timestamp-filename' option and pipe mode before recording	Yang Jihong
	In pipe mode, no need to switch perf data output, therefore, '--timestamp-filename' option should not take effect. Check the conflict before recording and output WARNING. In this case, the check pipe mode in perf_data__switch() can be removed.

	Before:

	    # perf record --timestamp-filename -o- perf test -w noploop | perf report -i- --percent-limit=1
	    # To display the perf.data header info, please use --header/--header-only options.
	    #
	    [ perf record: Woken up 1 times to write data ]
	    [ perf record: Dump -.2024011812110182 ]
	    #
	    # Total Lost Samples: 0
	    #
	    # Samples: 4K of event 'cycles:P'
	    # Event count (approx.): 2176784359
	    #
	    # Overhead Command Shared Object Symbol
	    # ........ ....... .................... ......................................
	    #
	    97.83% perf perf [.] noploop
	    #
	    # (Tip: Print event counts in CSV format with: perf stat -x,)
	    #

	After:

	    # perf record --timestamp-filename -o- perf test -w noploop | perf report -i- --percent-limit=1
	    WARNING: --timestamp-filename option is not available in pipe mode.
	    # To display the perf.data header info, please use --header/--header-only options.
	    #
	    [ perf record: Woken up 1 times to write data ]
	    [ perf record: Captured and wrote 0.000 MB - ]
	    #
	    # Total Lost Samples: 0
	    #
	    # Samples: 4K of event 'cycles:P'
	    # Event count (approx.): 2185575421
	    #
	    # Overhead Command Shared Object Symbol
	    # ........ ....... ..................... .............................................
	    #
	    97.75% perf perf [.] noploop
	    #
	    # (Tip: Profiling branch (mis)predictions with: perf record -b / perf report)
	    #

	Fixes: ecfd7a9c044e ("perf record: Add '--timestamp-filename' option to append timestamp to output file name")
	Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
	Acked-by: Namhyung Kim <namhyung@kernel.org>
	Link: https://lore.kernel.org/r/20240119040304.3708522-3-yangjihong1@huawei.com
	Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-22	perf record: Fix possible incorrect free in record__switch_output()	Yang Jihong
	perf_data__switch() may not assign a legal value to 'new_filename'. In this case, 'new_filename' keeps its uninitialized on-stack value, which may cause an incorrect free and unexpected results. Fixes: 03724b2e9c45 ("perf record: Allow to limit number of reported perf.data files") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240119040304.3708522-2-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-22	perf dwarf-aux: Check allowed DWARF Ops	Namhyung Kim
	The DWARF location expression can be fairly complex and it'd be hard to match it with the condition correctly. So let's be conservative and only allow simple expressions. For now it just checks the first operation in the list. The following operations look ok:
	  * DW_OP_stack_value
	  * DW_OP_deref_size
	  * DW_OP_deref
	  * DW_OP_piece
	To refuse complex (and unsupported) location expressions, add check_allowed_ops() to compare the rest of the list. It seems earlier results contained those unsupported expressions. For example, I found some local struct variable is placed like below.
		<2><43d1517>: Abbrev Number: 62 (DW_TAG_variable)
		   <43d1518>   DW_AT_location    : 15 byte block: 91 50 93 8 91 78 93 4 93 84 8 91 68 93 4
		       (DW_OP_fbreg: -48; DW_OP_piece: 8; DW_OP_fbreg: -8; DW_OP_piece: 4; DW_OP_piece: 1028; DW_OP_fbreg: -24; DW_OP_piece: 4)
	Another example is something like this.
		0057c8be ffffffffffffffff ffffffff812109f0 (base address)
		0057c8ce ffffffff812112b5 ffffffff812112c8 (DW_OP_breg3 (rbx): 0; DW_OP_constu: 18446744073709551612; DW_OP_and; DW_OP_stack_value)
	It should refuse them. After the change, the stat shows:
		Annotate data type stats:
		total 294, ok 158 (53.7%), bad 136 (46.3%)
		-----------------------------------------------------------
		        30 : no_sym
		        32 : no_mem_ops
		        53 : no_var
		        14 : no_typeinfo
		         7 : bad_offset
	Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
	Reviewed-by: Ian Rogers <irogers@google.com>
	Cc: Stephane Eranian <eranian@google.com>
	Link: https://lore.kernel.org/r/20240117062657.985479-10-namhyung@kernel.org
	Signed-off-by: Namhyung Kim <namhyung@kernel.org>
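The filtering described in this commit can be sketched roughly as follows. This is a simplified Python model, not perf's C implementation: operations are represented by their names as strings, and the constant name and list-of-strings representation are assumptions made for illustration only.

```python
# Hypothetical model: a location expression is a list of DWARF op names.
# Only the operations named in the commit message may follow the first one.
ALLOWED_TRAILING_OPS = {
    "DW_OP_stack_value",
    "DW_OP_deref_size",
    "DW_OP_deref",
    "DW_OP_piece",
}


def check_allowed_ops(ops):
    """Return True if every operation after the first is in the allowed set."""
    return all(name in ALLOWED_TRAILING_OPS for name in ops[1:])


# The two expressions quoted in the commit message, reduced to op names.
struct_in_pieces = [
    "DW_OP_fbreg", "DW_OP_piece", "DW_OP_fbreg", "DW_OP_piece",
    "DW_OP_piece", "DW_OP_fbreg", "DW_OP_piece",
]
masked_register = [
    "DW_OP_breg3", "DW_OP_constu", "DW_OP_and", "DW_OP_stack_value",
]
```

Under this model both quoted expressions are refused, because DW_OP_fbreg and DW_OP_constu appear after the first operation, while a single DW_OP_fbreg is accepted.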
2024-01-22	perf annotate-data: Support stack variables	Namhyung Kim
	Local variables are allocated in the stack and the location list should look like base register(s) and an offset. Extend the die_find_variable_by_reg() to handle the following expressions
	  * DW_OP_breg{0..31}
	  * DW_OP_bregx
	  * DW_OP_fbreg
	Usually DWARF subprogram entries have frame base information and use it to locate stack variables like below:
		 <2><43d1575>: Abbrev Number: 62 (DW_TAG_variable)
		    <43d1576>   DW_AT_location    : 2 byte block: 91 7c  (DW_OP_fbreg: -4)  <--- here
		    <43d1579>   DW_AT_name        : (indirect string, offset: 0x2c00c9): i
		    <43d157d>   DW_AT_decl_file   : 1
		    <43d157e>   DW_AT_decl_line   : 78
		    <43d157f>   DW_AT_type        : <0x43d19d7>
	I found some differences on saving the frame base between gcc and clang. The gcc uses the CFA to get the base so it needs to check the current frame's CFI info. In this case, stack offset needs to be adjusted from the start of the CFA.
		 <1><1bb8d>: Abbrev Number: 102 (DW_TAG_subprogram)
		    <1bb8e>   DW_AT_name        : (indirect string, offset: 0x74d41): kernel_init
		    <1bb92>   DW_AT_decl_file   : 2
		    <1bb92>   DW_AT_decl_line   : 1440
		    <1bb94>   DW_AT_decl_column : 18
		    <1bb95>   DW_AT_prototyped  : 1
		    <1bb95>   DW_AT_type        : <0xcc>
		    <1bb99>   DW_AT_low_pc      : 0xffffffff81bab9e0
		    <1bba1>   DW_AT_high_pc     : 0x1b2
		    <1bba9>   DW_AT_frame_base  : 1 byte block: 9c  (DW_OP_call_frame_cfa)  <------ here
		    <1bbab>   DW_AT_call_all_calls: 1
		    <1bbab>   DW_AT_sibling     : <0x1bf5a>
	Clang, in contrast, sets it to a register directly, so the register and offset in the instruction can be checked directly.
		 <1><43d1542>: Abbrev Number: 60 (DW_TAG_subprogram)
		    <43d1543>   DW_AT_low_pc      : 0xffffffff816a7c60
		    <43d154b>   DW_AT_high_pc     : 0x98
		    <43d154f>   DW_AT_frame_base  : 1 byte block: 56  (DW_OP_reg6 (rbp))  <---------- here
		    <43d1551>   DW_AT_GNU_all_call_sites: 1
		    <43d1551>   DW_AT_name        : (indirect string, offset: 0x3bce91): foo
		    <43d1555>   DW_AT_decl_file   : 1
		    <43d1556>   DW_AT_decl_line   : 75
		    <43d1557>   DW_AT_prototyped  : 1
		    <43d1557>   DW_AT_type        : <0x43c7332>
		    <43d155b>   DW_AT_external    : 1
	Also it needs to update the offset after finding the type like global variables since the offset was from the frame base. Factor out match_var_offset() to check global and local variables in the same way.
	The type stats are improved too:
		Annotate data type stats:
		total 294, ok 160 (54.4%), bad 134 (45.6%)
		-----------------------------------------------------------
		        30 : no_sym
		        32 : no_mem_ops
		        51 : no_var
		        14 : no_typeinfo
		         7 : bad_offset
	Reviewed-by: Ian Rogers <irogers@google.com>
	Cc: Stephane Eranian <eranian@google.com>
	Cc: Masami Hiramatsu <mhiramat@kernel.org>
	Link: https://lore.kernel.org/r/20240117062657.985479-9-namhyung@kernel.org
	Signed-off-by: Namhyung Kim <namhyung@kernel.org>
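The offset handling for the gcc (CFA-based) case can be illustrated with a small sketch. It is a Python model under stated assumptions, not the perf code: the function signatures are hypothetical, the CFA is assumed to be `register + fb_offset`, and the variable is assumed to live at `CFA + var_start` with a known type size.

```python
def stack_access_offset(insn_offset, fb_offset):
    """Translate a register-relative offset from the instruction into an
    offset relative to the frame base, assuming CFA = register + fb_offset."""
    return insn_offset - fb_offset


def match_var_offset(var_start, type_size, access_offset):
    """Return the offset inside the variable if the access falls within
    [var_start, var_start + type_size), otherwise None.

    Hypothetical signature; only the function name comes from the commit.
    """
    if var_start <= access_offset < var_start + type_size:
        return access_offset - var_start
    return None
```

As an example under these assumptions: with CFA = rbp + 16, an access to `-4(%rbp)` is at CFA-relative offset -20, so it matches a 4-byte variable described by `DW_OP_fbreg: -20` at offset 0 inside the variable.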
2024-01-22	perf dwarf-aux: Add die_get_cfa()	Namhyung Kim
	The die_get_cfa() is to get frame base register and offset at the given instruction address (pc). This info will be used to locate stack variables which have location expression using DW_OP_fbreg. Reviewed-by: Ian Rogers <irogers@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20240117062657.985479-8-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-22	perf annotate-data: Support global variables	Namhyung Kim
	Global variables are accessed using PC-relative addresses, so they need to be handled separately. The PC-rel addressing is detected by using DWARF_REG_PC. On x86, the %rip register would be used. The address can be calculated using the ip and offset in the instruction. But it should start from the next instruction, so add calculate_pcrel_addr() to do it properly.
	But global variables defined in a different file would only have a declaration which doesn't include a location list. So it first tries to get the type info using the address, and then looks up the variable declarations using the name. The name of global variables should be taken from the symbol table. The declaration would have the type info. So extend find_var_type() to take both address and name for global variables.
	The stat now looks like:
		Annotate data type stats:
		total 294, ok 153 (52.0%), bad 141 (48.0%)
		-----------------------------------------------------------
		        30 : no_sym
		        32 : no_mem_ops
		        61 : no_var
		        10 : no_typeinfo
		         8 : bad_offset
	Reviewed-by: Ian Rogers <irogers@google.com>
	Cc: Stephane Eranian <eranian@google.com>
	Cc: Masami Hiramatsu <mhiramat@kernel.org>
	Link: https://lore.kernel.org/r/20240117062657.985479-7-namhyung@kernel.org
	Signed-off-by: Namhyung Kim <namhyung@kernel.org>
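The PC-relative address calculation mentioned here amounts to simple arithmetic, sketched below in Python. Only the name calculate_pcrel_addr() comes from the commit; the signature, and passing the instruction length explicitly, are assumptions for illustration (how perf itself locates the next instruction is not shown in this log).

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def calculate_pcrel_addr(ip, insn_len, offset):
    """Return the target of a %rip-relative operand.

    On x86-64 the displacement is relative to the address of the NEXT
    instruction, i.e. ip + insn_len, not to the current instruction.
    The result is wrapped to 64 bits.
    """
    return (ip + insn_len + offset) & MASK64
```

For instance, a 7-byte instruction at 0xffffffff81000690 with displacement 0x1234 refers to 0xffffffff810018cb in this model.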
2024-01-22	perf annotate-data: Handle PC-relative addressing	Namhyung Kim
	Extend find_data_type_die() to find data type from PC-relative address using die_find_variable_by_addr(). Users need to pass the address for the (global) variable. The offset for the variable should be updated after finding the type because the offset in the instruction is just to calculate the address for the variable. So it is changed to pass a pointer to offset and renamed to 'poffset'. First it searches variables in the CU DIE as it's likely that the global variables are defined in the file level. And then it iterates the scope DIEs to find a local (static) variable. Reviewed-by: Ian Rogers <irogers@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20240117062657.985479-6-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-22	perf annotate-data: Add stack operation pseudo type	Namhyung Kim
	A typical function prologue and epilogue include multiple stack operations to save and restore the current value of registers. On x86, it looks like below:
		push  r15
		push  r14
		push  r13
		push  r12
		...
		pop   r12
		pop   r13
		pop   r14
		pop   r15
		ret
	As these all touch the stack memory region, chances are high that they appear in memory profile data. But these are not used for any real purpose yet so it'd return no types. One of my profiles shows that a non-negligible portion of data came from the stack operations. It also seems GCC generates more stack operations than clang.
		Annotate Instruction stats
		total 264, ok 169 (64.0%), bad 95 (36.0%)
		  Name      :  Good   Bad
		-----------------------------------------------------------
		  movq      :    49    27
		  movl      :    24     9
		  popq      :     0    19   <-- here
		  cmpl      :    17     2
		  addq      :    14     1
		  cmpq      :    12     2
		  cmpxchgl  :     3     7
	Instead of treating them as unknown, let's create a separate pseudo type to represent those stack operations.
	Reviewed-by: Ian Rogers <irogers@google.com>
	Cc: Stephane Eranian <eranian@google.com>
	Cc: Masami Hiramatsu <mhiramat@kernel.org>
	Link: https://lore.kernel.org/r/20240117062657.985479-5-namhyung@kernel.org
	Signed-off-by: Namhyung Kim <namhyung@kernel.org>
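A rough sketch of the idea, in Python rather than perf's C: instructions that only save or restore registers get a dedicated label instead of counting as failures. The mnemonic set, the function names, and the label strings are all assumptions chosen for illustration; the commit message itself only shows push, pop, and ret in its listing.

```python
# Hypothetical set of x86 mnemonics treated as pure stack operations.
# An explicit set is used so that e.g. "popcnt" is not mistaken for "pop".
STACK_OP_MNEMONICS = {
    "push", "pushq", "pushl", "pushw",
    "pop", "popq", "popl", "popw",
    "ret", "retq",
}


def is_stack_operation(mnemonic):
    """Return True for instructions that only push/pop/return."""
    return mnemonic.lower() in STACK_OP_MNEMONICS


def classify(mnemonic, resolved_type):
    """Pick the type name to report for one sampled instruction."""
    if resolved_type is not None:
        return resolved_type
    if is_stack_operation(mnemonic):
        return "(stack operation)"  # placeholder label for the pseudo type
    return "(unknown)"
```

With such a classification, the 19 `popq` samples listed as "Bad" above would be reported under the pseudo type rather than as unresolved.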
2024-01-22	perf annotate-data: Handle array style accesses	Namhyung Kim
	On x86, instructions for array access often look like below.
		mov  0x1234(%rax,%rbx,8), %rcx
	Usually the first register holds the type information and the second one has the index. And the current code only looks up a variable for the first register. But it's possible to be the other way around, so it needs to check the second register if the first one failed.
	The stat changed like this.
		Annotate data type stats:
		total 294, ok 148 (50.3%), bad 146 (49.7%)
		-----------------------------------------------------------
		        30 : no_sym
		        32 : no_mem_ops
		        66 : no_var
		        10 : no_typeinfo
		         8 : bad_offset
	Reviewed-by: Ian Rogers <irogers@google.com>
	Cc: Stephane Eranian <eranian@google.com>
	Cc: Masami Hiramatsu <mhiramat@kernel.org>
	Link: https://lore.kernel.org/r/20240117062657.985479-4-namhyung@kernel.org
	Signed-off-by: Namhyung Kim <namhyung@kernel.org>
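The fallback order described in this commit can be sketched as below. This is an illustrative Python model only: `lookup_var` stands in for whatever DWARF lookup perf performs, and its `(register, offset)` signature is an assumption.

```python
def find_var_for_access(lookup_var, base_reg, index_reg, offset):
    """Try the base register first, then fall back to the index register.

    lookup_var(reg, offset) is a hypothetical callable returning a type
    description, or None when no variable is found for that register.
    """
    for reg in (base_reg, index_reg):
        if reg is None:
            continue  # operand has no such register
        var = lookup_var(reg, offset)
        if var is not None:
            return var
    return None
```

For `mov 0x1234(%rax,%rbx,8), %rcx` this would query `rax` first and only query `rbx` if nothing is found for `rax`.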
2024-01-22	perf annotate-data: Handle macro fusion on x86	Namhyung Kim
	When a sample comes from a conditional branch without a memory operand, it could be due to macro fusion with the previous instruction. So it needs to check the memory operand in the previous one.
	This improves the stat like below:
		Annotate data type stats:
		total 294, ok 147 (50.0%), bad 147 (50.0%)
		-----------------------------------------------------------
		        30 : no_sym
		        32 : no_mem_ops
		        71 : no_var
		         6 : no_typeinfo
		         8 : bad_offset
	Reviewed-by: Ian Rogers <irogers@google.com>
	Cc: Stephane Eranian <eranian@google.com>
	Cc: Masami Hiramatsu <mhiramat@kernel.org>
	Link: https://lore.kernel.org/r/20240117062657.985479-3-namhyung@kernel.org
	Signed-off-by: Namhyung Kim <namhyung@kernel.org>
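A minimal sketch of the macro-fusion fallback, assuming a simplified instruction model: each instruction is a dict with a mnemonic and a flag telling whether it has a memory operand. The data layout, the function name, and the "starts with j but is not jmp" test for conditional branches are assumptions for illustration, not perf's actual checks.

```python
def pick_insn_for_type_lookup(insns, idx):
    """Return the index of the instruction whose memory operand to use.

    insns is a list of {"name": str, "has_mem": bool} in address order and
    idx is the sampled instruction. Returns None if nothing suitable exists.
    """
    insn = insns[idx]
    if insn["has_mem"]:
        return idx
    name = insn["name"]
    # Rough x86 heuristic: jcc mnemonics start with "j" but are not "jmp*".
    is_cond_branch = name.startswith("j") and not name.startswith("jmp")
    if is_cond_branch and idx > 0 and insns[idx - 1]["has_mem"]:
        return idx - 1  # sample likely belongs to the fused cmp/test
    return None
```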
2024-01-22	perf annotate-data: Parse 'lock' prefix from llvm-objdump	Namhyung Kim
	For performance reasons, I prefer llvm-objdump over GNU's. But I found that llvm-objdump puts the x86 lock prefix on a separate line like below.
		ffffffff81000695: f0                    lock
		ffffffff81000696: ff 83 54 0b 00 00     incl 2900(%rbx)
	This should be parsed properly, but I just changed it to find the insn with the next offset for now.
	This improves the statistics as it can process more instructions.
		Annotate data type stats:
		total 294, ok 144 (49.0%), bad 150 (51.0%)
		-----------------------------------------------------------
		        30 : no_sym
		        35 : no_mem_ops
		        71 : no_var
		         6 : no_typeinfo
		         8 : bad_offset
	Reviewed-by: Ian Rogers <irogers@google.com>
	Cc: Stephane Eranian <eranian@google.com>
	Cc: Masami Hiramatsu <mhiramat@kernel.org>
	Link: https://lore.kernel.org/r/20240117062657.985479-2-namhyung@kernel.org
	Signed-off-by: Namhyung Kim <namhyung@kernel.org>
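The workaround can be pictured with the following Python sketch. It assumes the disassembly has already been reduced to `(offset, text)` pairs sorted by offset; that representation and the function name are hypothetical and do not reflect perf's data structures.

```python
def insn_for_sample(insns, offset):
    """Return the (offset, text) entry to analyze for a sample at `offset`.

    If the entry at that offset is a bare "lock" prefix, as emitted on its
    own line by llvm-objdump, use the next entry instead.
    """
    for i, (off, text) in enumerate(insns):
        if off == offset:
            if text.strip() == "lock" and i + 1 < len(insns):
                return insns[i + 1]
            return (off, text)
    return None


# The two lines quoted in the commit message.
disasm = [
    (0xffffffff81000695, "lock"),
    (0xffffffff81000696, "incl 2900(%rbx)"),
]
```

A sample at 0xffffffff81000695 then resolves to the `incl 2900(%rbx)` entry, which carries the memory operand.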
2024-01-22	perf build: Check whether pkg-config is installed when libtraceevent is linked	Yang Jihong
	If pkg-config is not installed when libtraceevent is linked, the build fails. The error information is as follows:
		$ make
		<SNIP>
		In file included from /home/yjh/projects_linux/perf-tool-next/linux/tools/perf/util/evsel.c:43:
		/home/yjh/projects_linux/perf-tool-next/linux/tools/perf/util/trace-event.h:149:62: error: operator '&&' has no right operand
		  149 | #if defined(LIBTRACEEVENT_VERSION) && LIBTRACEEVENT_VERSION >= MAKE_LIBTRACEEVENT_VERSION(1, 5, 0)
		      |                                                              ^~
		error: command '/usr/bin/gcc' failed with exit code 1
		cp: cannot stat 'python_ext_build/lib/perf.so': No such file or directory
		make[2]: *** [Makefile.perf:668: python/perf.cpython-310-x86_64-linux-gnu.so] Error 1
		make[2]: *** Waiting for unfinished jobs....
	Because pkg-config is not installed, the libtraceevent version cannot be obtained in the Makefile.config file. As a result, LIBTRACEEVENT_VERSION is empty. However, the preceding error information is not user-friendly. Identify errors in advance by checking that pkg-config is installed at compile time.
	The build results of various scenarios are as follows:
	1. build successful when libtraceevent is not linked and pkg-config is not installed
		$ pkg-config --version
		-bash: /usr/bin/pkg-config: No such file or directory
		$ make clean >/dev/null
		$ make NO_LIBTRACEEVENT=1 >/dev/null
		Makefile.config:1133: No alternatives command found, you need to set JDIR= to point to the root of your Java directory
		  PERF_VERSION = 6.7.rc6.gd988c9f511af
		$ echo $?
		0
	2. dummy pkg-config is missing when libtraceevent is linked
		$ pkg-config --version
		-bash: /usr/bin/pkg-config: No such file or directory
		$ make clean >/dev/null
		$ make >/dev/null
		Makefile.config:221: *** Error: pkg-config needed by libtraceevent is missing on this system, please install it.  Stop.
		make[1]: *** [Makefile.perf:251: sub-make] Error 2
		make: *** [Makefile:70: all] Error 2
		$ echo $?
		2
	3. build successful when libtraceevent is linked and pkg-config is installed
		$ pkg-config --version
		0.29.2
		$ make clean >/dev/null
		$ make >/dev/null
		Makefile.config:1133: No alternatives command found, you need to set JDIR= to point to the root of your Java directory
		  PERF_VERSION = 6.7.rc6.gd988c9f511af
		$ echo $?
		0
	Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
	Acked-by: Namhyung Kim <namhyung@kernel.org>
	Link: https://lore.kernel.org/r/20240112034019.3558584-1-yangjihong1@huawei.com
	Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-22	perf test: raise limit to 20 percent for perf_stat_--bpf-counters_test	Thomas Richter
	This test case often fails on s390 (about 2 out of 10 runs) because the difference between --bpf-counters event counting and s390 hardware counting exceeds the 10% limit in all failure cases. Raise the limit to 20% on s390 and the test case succeeds. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: gor@linux.ibm.com Cc: hca@linux.ibm.com Cc: sumanthk@linux.ibm.com Cc: svens@linux.ibm.com Link: https://lore.kernel.org/r/20240108084009.3959211-1-tmricht@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
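The kind of tolerance check being relaxed here can be sketched as follows. The actual perf test is a shell script whose exact arithmetic is not shown in this log, so the function below is a hypothetical Python model of "the two counts differ by at most N percent", with the first count taken as the reference.

```python
def counts_agree(base_count, bpf_count, limit_percent):
    """Return True if bpf_count is within limit_percent of base_count.

    Integer arithmetic is used to avoid floating-point rounding issues.
    """
    if base_count == 0:
        return bpf_count == 0
    return abs(bpf_count - base_count) * 100 <= base_count * limit_percent
```

In this model a 15% deviation fails with a limit of 10 and passes with a limit of 20, which matches the failure pattern described in the commit message.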
2024-01-22	tools/testing/nvdimm: Disable "missing prototypes / declarations" warnings	Dan Williams
	Prevent warnings of the form:
		tools/testing/nvdimm/config_check.c:4:6: error: no previous prototype for ‘check’ [-Werror=missing-prototypes]
	...by locally disabling some warnings.
	It turns out that:
		Commit 0fcb70851fbf ("Makefile.extrawarn: turn on missing-prototypes globally")
	...in addition to expanding in-tree coverage, also impacts out-of-tree module builds like those in tools/testing/nvdimm/.
	Filter out the warning options on unit test code that does not affect mainline builds.
	Reviewed-by: Alison Schofield <alison.schofield@intel.com>
	Link: https://lore.kernel.org/r/170543984331.460832.1780246477583036191.stgit@dwillia2-xfh.jf.intel.com
	Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-01-22	tools/testing/cxl: Disable "missing prototypes / declarations" warnings	Dan Williams
	Prevent warnings of the form:
		tools/testing/cxl/test/mock.c:44:6: error: no previous prototype for ‘__wrap_is_acpi_device_node’ [-Werror=missing-prototypes]
		tools/testing/cxl/test/mock.c:63:5: error: no previous prototype for ‘__wrap_acpi_table_parse_cedt’ [-Werror=missing-prototypes]
		tools/testing/cxl/test/mock.c:81:13: error: no previous prototype for ‘__wrap_acpi_evaluate_integer’ [-Werror=missing-prototypes]
	...by locally disabling some warnings.
	It turns out that:
		Commit 0fcb70851fbf ("Makefile.extrawarn: turn on missing-prototypes globally")
	...in addition to expanding in-tree coverage, also impacts out-of-tree module builds like those in tools/testing/cxl/.
	Filter out the warning options on unit test code that does not affect mainline builds.
	Reviewed-by: Alison Schofield <alison.schofield@intel.com>
	Link: https://lore.kernel.org/r/170543983780.460832.10920261849128601697.stgit@dwillia2-xfh.jf.intel.com
	Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-01-22	selftests/rseq: Do not skip !allowed_cpus for mm_cid	Mathieu Desnoyers
	Indexing with mm_cid is incompatible with skipping disallowed cpumask, because concurrency IDs are based on a virtual ID allocation which is unrelated to the physical CPU mask. These issues can be reproduced by running the rseq selftests under a taskset which excludes CPU 0, e.g. taskset -c 10-20 ./run_param_test.sh Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-22	selftests: livepatch: Test livepatching a heavily called syscall	Marcos Paulo de Souza
	The test proves that a syscall can be livepatched. It is interesting because syscalls are called in a tricky way. Also the process gets livepatched either when sleeping in the userspace or when entering or leaving the kernel space.
	The livepatch is a bit tricky:
	  1. The syscall function name is architecture specific. Also ARCH_HAS_SYSCALL_WRAPPER must be taken into account.
	  2. The syscall must stay working the same way for other processes on the system. It is solved by decrementing a counter only for PIDs of the test processes. It means that the test processes have to call the livepatched syscall at least once.
	The test creates one userspace process per online cpu. The processes are calling getpid in a busy loop. The intention is to create random locations when the livepatch gets enabled. Nothing is guaranteed. The magic is in the randomness.
	Reviewed-by: Joe Lawrence <joe.lawrence@redhat.com>
	Reviewed-by: Petr Mladek <pmladek@suse.com>
	Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
	Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-22	livepatch: Move tests from lib/livepatch to selftests/livepatch	Marcos Paulo de Souza
	The modules are being moved from lib/livepatch to tools/testing/selftests/livepatch/test_modules.
	This code move will allow writing more complex tests, for example userspace C code that calls a livepatched kernel function.
	The modules are now built as out-of-tree modules, but being part of the kernel source means they will be maintained.
	Another advantage of the code move is being able to easily change, debug and rebuild the tests by running make in the selftests/livepatch directory, which is not currently possible since the modules in lib/livepatch are built and installed using the "modules" target.
	The current approach also keeps the ability to execute the tests manually by executing the scripts inside the selftests/livepatch directory, as is currently supported. If the modules are modified, they need to be rebuilt before running the scripts though.
	The modules are built before running the selftests when using the kselftest invocations:
		make kselftest TARGETS=livepatch
	or
		make -C tools/testing/selftests/livepatch run_tests
	Having the modules built as out-of-tree modules requires replacing the currently used 'modprobe' with 'insmod' and adapting the test scripts that check the kernel message buffer.
	Now it is possible to only compile the modules by running:
		make -C tools/testing/selftests/livepatch/
	This way the test modules and other test programs can be built in order to be packaged if so desired.
	As there aren't any modules being built in lib/livepatch, remove the TEST_LIVEPATCH Kconfig and its references.
	Note: "make gen_tar" packages the pre-built binaries into the tarball. It means that it will store the test modules pre-built for the kernel running on the build host.
	Note that these modules need not be binary compatible with the kernel built from the same sources. But the same is true for other packaged selftest binaries. The entire kernel sources are needed for rebuilding the selftests on another system.
	Reviewed-by: Joe Lawrence <joe.lawrence@redhat.com>
	Reviewed-by: Petr Mladek <pmladek@suse.com>
	Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
	Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
	Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>