linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2023-04-15	selftests/bpf: Add refcounted_kptr tests	Dave Marchevsky
	Test refcounted local kptr functionality added in previous patches in the series. Usecases which pass verification: * Add refcounted local kptr to both tree and list. Then, read and - possibly, depending on test variant - delete from tree, then list. * Also test doing read-and-maybe-delete in opposite order * Stash a refcounted local kptr in a map_value, then add it to a rbtree. Read from both, possibly deleting after tree read. * Add refcounted local kptr to both tree and list. Then, try reading and deleting twice from one of the collections. * bpf_refcount_acquire of just-added non-owning ref should work, as should bpf_refcount_acquire of owning ref just out of bpf_obj_new Usecases which fail verification: * The simple successful bpf_refcount_acquire cases from above should both fail to verify if the newly-acquired owning ref is not dropped Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230415201811.343116-10-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-15	bpf: Migrate bpf_rbtree_remove to possibly fail	Dave Marchevsky
	This patch modifies bpf_rbtree_remove to account for possible failure due to the input rb_node already not being in any collection. The function can now return NULL, and does when the aforementioned scenario occurs. As before, on successful removal an owning reference to the removed node is returned. Adding KF_RET_NULL to bpf_rbtree_remove's kfunc flags - now KF_RET_NULL \| KF_ACQUIRE - provides the desired verifier semantics: * retval must be checked for NULL before use * if NULL, retval's ref_obj_id is released * retval is a "maybe acquired" owning ref, not a non-owning ref, so it will live past end of critical section (bpf_spin_unlock), and thus can be checked for NULL after the end of the CS BPF programs must add checks ============================ This does change bpf_rbtree_remove's verifier behavior. BPF program writers will need to add NULL checks to their programs, but the resulting UX looks natural: bpf_spin_lock(&glock); n = bpf_rbtree_first(&ghead); if (!n) { /* ... /} res = bpf_rbtree_remove(&ghead, &n->node); bpf_spin_unlock(&glock); if (!res) / Newly-added check after this patch / return 1; n = container_of(res, / ... /); / Do something else with n / bpf_obj_drop(n); return 0; The "if (!res)" check above is the only addition necessary for the above program to pass verification after this patch. bpf_rbtree_remove no longer clobbers non-owning refs ==================================================== An issue arises when bpf_rbtree_remove fails, though. Consider this example: struct node_data { long key; struct bpf_list_node l; struct bpf_rb_node r; struct bpf_refcount ref; }; long failed_sum; void bpf_prog() { struct node_data n = bpf_obj_new(/* ... /); struct bpf_rb_node res; n->key = 10; bpf_spin_lock(&glock); bpf_list_push_back(&some_list, &n->l); /* n is now a non-owning ref / res = bpf_rbtree_remove(&some_tree, &n->r, / ... /); if (!res) failed_sum += n->key; / not possible / bpf_spin_unlock(&glock); / if (res) { do something useful and drop } ... */ } The bpf_rbtree_remove in this example will always fail. Similarly to bpf_spin_unlock, bpf_rbtree_remove is a non-owning reference invalidation point. The verifier clobbers all non-owning refs after a bpf_rbtree_remove call, so the "failed_sum += n->key" line will fail verification, and in fact there's no good way to get information about the node which failed to add after the invalidation. This patch removes non-owning reference invalidation from bpf_rbtree_remove to allow the above usecase to pass verification. The logic for why this is now possible is as follows: Before this series, bpf_rbtree_add couldn't fail and thus assumed that its input, a non-owning reference, was in the tree. But it's easy to construct an example where two non-owning references pointing to the same underlying memory are acquired and passed to rbtree_remove one after another (see rbtree_api_release_aliasing in selftests/bpf/progs/rbtree_fail.c). So it was necessary to clobber non-owning refs to prevent this case and, more generally, to enforce "non-owning ref is definitely in some collection" invariant. This series removes that invariant and the failure / runtime checking added in this patch provide a clean way to deal with the aliasing issue - just fail to remove. Because the aliasing issue prevented by clobbering non-owning refs is no longer an issue, this patch removes the invalidate_non_owning_refs call from verifier handling of bpf_rbtree_remove. Note that bpf_spin_unlock - the other caller of invalidate_non_owning_refs - clobbers non-owning refs for a different reason, so its clobbering behavior remains unchanged. No BPF program changes are necessary for programs to remain valid as a result of this clobbering change. A valid program before this patch passed verification with its non-owning refs having shorter (or equal) lifetimes due to more aggressive clobbering. Also, update existing tests to check bpf_rbtree_remove retval for NULL where necessary, and move rbtree_api_release_aliasing from progs/rbtree_fail.c to progs/rbtree.c since it's now expected to pass verification. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230415201811.343116-8-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-15	selftests/bpf: Modify linked_list tests to work with macro-ified inserts	Dave Marchevsky
	The linked_list tests use macros and function pointers to reduce code duplication. Earlier in the series, bpf_list_push_{front,back} were modified to be macros, expanding to invoke actual kfuncs bpf_list_push_{front,back}_impl. Due to this change, a code snippet like: void (p)(void , void ) = (void )&bpf_list_##op; p(hexpr, nexpr); meant to do bpf_list_push_{front,back}(hexpr, nexpr), will no longer work as it's no longer valid to do &bpf_list_push_{front,back} since they're no longer functions. This patch fixes issues of this type, along with two other minor changes - one improvement and one fix - both related to the node argument to list_push_{front,back}. * The fix: migration of list_push tests away from (void , void ) func ptr uncovered that some tests were incorrectly passing pointer to node, not pointer to struct bpf_list_node within the node. This patch fixes such issues (CHECK(..., f) -> CHECK(..., &f->node)) * The improvement: In linked_list tests, the struct foo type has two list_node fields: node and node2, at byte offsets 0 and 40 within the struct, respectively. Currently node is used in ~all tests involving struct foo and lists. The verifier needs to do some work to account for the offset of bpf_list_node within the node type, so using node2 instead of node exercises that logic more in the tests. This patch migrates linked_list tests to use node2 instead of node. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230415201811.343116-7-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-15	bpf: Migrate bpf_rbtree_add and bpf_list_push_{front,back} to possibly fail	Dave Marchevsky
	Consider this code snippet: struct node { long key; bpf_list_node l; bpf_rb_node r; bpf_refcount ref; } int some_bpf_prog(void ctx) { struct node n = bpf_obj_new(/.../), m; bpf_spin_lock(&glock); bpf_rbtree_add(&some_tree, &n->r, / ... /); m = bpf_refcount_acquire(n); bpf_rbtree_add(&other_tree, &m->r, / ... /); bpf_spin_unlock(&glock); / ... / } After bpf_refcount_acquire, n and m point to the same underlying memory, and that node's bpf_rb_node field is being used by the some_tree insert, so overwriting it as a result of the second insert is an error. In order to properly support refcounted nodes, the rbtree and list insert functions must be allowed to fail. This patch adds such support. The kfuncs bpf_rbtree_add, bpf_list_push_{front,back} are modified to return an int indicating success/failure, with 0 -> success, nonzero -> failure. bpf_obj_drop on failure ======================= Currently the only reason an insert can fail is the example above: the bpf_{list,rb}_node is already in use. When such a failure occurs, the insert kfuncs will bpf_obj_drop the input node. This allows the insert operations to logically fail without changing their verifier owning ref behavior, namely the unconditional release_reference of the input owning ref. With insert that always succeeds, ownership of the node is always passed to the collection, since the node always ends up in the collection. With a possibly-failed insert w/ bpf_obj_drop, ownership of the node is always passed either to the collection (success), or to bpf_obj_drop (failure). Regardless, it's correct to continue unconditionally releasing the input owning ref, as something is always taking ownership from the calling program on insert. Keeping owning ref behavior unchanged results in a nice default UX for insert functions that can fail. If the program's reaction to a failed insert is "fine, just get rid of this owning ref for me and let me go on with my business", then there's no reason to check for failure since that's default behavior. e.g.: long important_failures = 0; int some_bpf_prog(void ctx) { struct node n, m, o; / all bpf_obj_new'd / bpf_spin_lock(&glock); bpf_rbtree_add(&some_tree, &n->node, / ... /); bpf_rbtree_add(&some_tree, &m->node, / ... /); if (bpf_rbtree_add(&some_tree, &o->node, / ... /)) { important_failures++; } bpf_spin_unlock(&glock); } If we instead chose to pass ownership back to the program on failed insert - by returning NULL on success or an owning ref on failure - programs would always have to do something with the returned ref on failure. The most likely action is probably "I'll just get rid of this owning ref and go about my business", which ideally would look like: if (n = bpf_rbtree_add(&some_tree, &n->node, / ... /)) bpf_obj_drop(n); But bpf_obj_drop isn't allowed in a critical section and inserts must occur within one, so in reality error handling would become a hard-to-parse mess. For refcounted nodes, we can replicate the "pass ownership back to program on failure" logic with this patch's semantics, albeit in an ugly way: struct node n = bpf_obj_new(/* ... /), m; bpf_spin_lock(&glock); m = bpf_refcount_acquire(n); if (bpf_rbtree_add(&some_tree, &n->node, /* ... /)) { / Do something with m / } bpf_spin_unlock(&glock); bpf_obj_drop(m); bpf_refcount_acquire is used to simulate "return owning ref on failure". This should be an uncommon occurrence, though. Addition of two verifier-fixup'd args to collection inserts =========================================================== The actual bpf_obj_drop kfunc is bpf_obj_drop_impl(void , struct btf_struct_meta ), with bpf_obj_drop macro populating the second arg with 0 and the verifier later filling in the arg during insn fixup. Because bpf_rbtree_add and bpf_list_push_{front,back} now might do bpf_obj_drop, these kfuncs need a btf_struct_meta parameter that can be passed to bpf_obj_drop_impl. Similarly, because the 'node' param to those insert functions is the bpf_{list,rb}_node within the node type, and bpf_obj_drop expects a pointer to the beginning of the node, the insert functions need to be able to find the beginning of the node struct. A second verifier-populated param is necessary: the offset of {list,rb}_node within the node type. These two new params allow the insert kfuncs to correctly call __bpf_obj_drop_impl: beginning_of_node = bpf_rb_node_ptr - offset if (already_inserted) __bpf_obj_drop_impl(beginning_of_node, btf_struct_meta->record); Similarly to other kfuncs with "hidden" verifier-populated params, the insert functions are renamed with _impl prefix and a macro is provided for common usage. For example, bpf_rbtree_add kfunc is now bpf_rbtree_add_impl and bpf_rbtree_add is now a macro which sets "hidden" args to 0. Due to the two new args BPF progs will need to be recompiled to work with the new _impl kfuncs. This patch also rewrites the "hidden argument" explanation to more directly say why the BPF program writer doesn't need to populate the arguments with anything meaningful. How does this new logic affect non-owning references? ===================================================== Currently, non-owning refs are valid until the end of the critical section in which they're created. We can make this guarantee because, if a non-owning ref exists, the referent was added to some collection. The collection will drop() its nodes when it goes away, but it can't go away while our program is accessing it, so that's not a problem. If the referent is removed from the collection in the same CS that it was added in, it can't be bpf_obj_drop'd until after CS end. Those are the only two ways to free the referent's memory and neither can happen until after the non-owning ref's lifetime ends. On first glance, having these collection insert functions potentially bpf_obj_drop their input seems like it breaks the "can't be bpf_obj_drop'd until after CS end" line of reasoning. But we care about the memory not being _freed_ until end of CS end, and a previous patch in the series modified bpf_obj_drop such that it doesn't free refcounted nodes until refcount == 0. So the statement can be more accurately rewritten as "can't be free'd until after CS end". We can prove that this rewritten statement holds for any non-owning reference produced by collection insert functions: If the input to the insert function is _not_ refcounted * We have an owning reference to the input, and can conclude it isn't in any collection * Inserting a node in a collection turns owning refs into non-owning, and since our input type isn't refcounted, there's no way to obtain additional owning refs to the same underlying memory * Because our node isn't in any collection, the insert operation cannot fail, so bpf_obj_drop will not execute * If bpf_obj_drop is guaranteed not to execute, there's no risk of memory being free'd * Otherwise, the input to the insert function is refcounted * If the insert operation fails due to the node's list_head or rb_root already being in some collection, there was some previous successful insert which passed refcount to the collection * We have an owning reference to the input, it must have been acquired via bpf_refcount_acquire, which bumped the refcount * refcount must be >= 2 since there's a valid owning reference and the node is already in a collection * Insert triggering bpf_obj_drop will decr refcount to >= 1, never resulting in a free So although we may do bpf_obj_drop during the critical section, this will never result in memory being free'd, and no changes to non-owning ref logic are needed in this patch. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230415201811.343116-6-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-15	bpf: Add bpf_refcount_acquire kfunc	Dave Marchevsky
	Currently, BPF programs can interact with the lifetime of refcounted local kptrs in the following ways: bpf_obj_new - Initialize refcount to 1 as part of new object creation bpf_obj_drop - Decrement refcount and free object if it's 0 collection add - Pass ownership to the collection. No change to refcount but collection is responsible for bpf_obj_dropping it In order to be able to add a refcounted local kptr to multiple collections we need to be able to increment the refcount and acquire a new owning reference. This patch adds a kfunc, bpf_refcount_acquire, implementing such an operation. bpf_refcount_acquire takes a refcounted local kptr and returns a new owning reference to the same underlying memory as the input. The input can be either owning or non-owning. To reinforce why this is safe, consider the following code snippets: struct node n = bpf_obj_new(typeof(n)); // A struct node m = bpf_refcount_acquire(n); // B In the above snippet, n will be alive with refcount=1 after (A), and since nothing changes that state before (B), it's obviously safe. If n is instead added to some rbtree, we can still safely refcount_acquire it: struct node n = bpf_obj_new(typeof(n)); struct node m; bpf_spin_lock(&glock); bpf_rbtree_add(&groot, &n->node, less); // A m = bpf_refcount_acquire(n); // B bpf_spin_unlock(&glock); In the above snippet, after (A) n is a non-owning reference, and after (B) m is an owning reference pointing to the same memory as n. Although n has no ownership of that memory's lifetime, it's guaranteed to be alive until the end of the critical section, and n would be clobbered if we were past the end of the critical section, so it's safe to bump refcount. Implementation details: * From verifier's perspective, bpf_refcount_acquire handling is similar to bpf_obj_new and bpf_obj_drop. Like the former, it returns a new owning reference matching input type, although like the latter, type can be inferred from concrete kptr input. Verifier changes in {check,fixup}_kfunc_call and check_kfunc_args are largely copied from aforementioned functions' verifier changes. * An exception to the above is the new KF_ARG_PTR_TO_REFCOUNTED_KPTR arg, indicated by new "__refcounted_kptr" kfunc arg suffix. This is necessary in order to handle both owning and non-owning input without adding special-casing to "__alloc" arg handling. Also a convenient place to confirm that input type has bpf_refcount field. * The implemented kfunc is actually bpf_refcount_acquire_impl, with 'hidden' second arg that the verifier sets to the type's struct_meta in fixup_kfunc_call. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230415201811.343116-5-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-13	selftests/bpf: Workaround for older vm_sockets.h.	Alexei Starovoitov
	Some distros ship with older vm_sockets.h that doesn't have VMADDR_CID_LOCAL which causes selftests build to fail: /tmp/work/bpf/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c:261:18: error: ‘VMADDR_CID_LOCAL’ undeclared (first use in this function); did you mean ‘VMADDR_CID_HOST’? 261 \| addr->svm_cid = VMADDR_CID_LOCAL; \| ^~~~~~~~~~~~~~~~ \| VMADDR_CID_HOST Workaround this issue by defining it on demand. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-13	selftests/bpf: Fix merge conflict due to SYS() macro change.	Alexei Starovoitov
	Fix merge conflict between bpf/bpf-next trees due to change of arguments in SYS() macro. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-13	Daniel Borkmann says:	Jakub Kicinski
	==================== pull-request: bpf-next 2023-04-13 We've added 260 non-merge commits during the last 36 day(s) which contain a total of 356 files changed, 21786 insertions(+), 11275 deletions(-). The main changes are: 1) Rework BPF verifier log behavior and implement it as a rotating log by default with the option to retain old-style fixed log behavior, from Andrii Nakryiko. 2) Adds support for using {FOU,GUE} encap with an ipip device operating in collect_md mode and add a set of BPF kfuncs for controlling encap params, from Christian Ehrig. 3) Allow BPF programs to detect at load time whether a particular kfunc exists or not, and also add support for this in light skeleton, from Alexei Starovoitov. 4) Optimize hashmap lookups when key size is multiple of 4, from Anton Protopopov. 5) Enable RCU semantics for task BPF kptrs and allow referenced kptr tasks to be stored in BPF maps, from David Vernet. 6) Add support for stashing local BPF kptr into a map value via bpf_kptr_xchg(). This is useful e.g. for rbtree node creation for new cgroups, from Dave Marchevsky. 7) Fix BTF handling of is_int_ptr to skip modifiers to work around tracing issues where a program cannot be attached, from Feng Zhou. 8) Migrate a big portion of test_verifier unit tests over to test_progs -a verifier_* via inline asm to ease {read,debug}ability, from Eduard Zingerman. 9) Several updates to the instruction-set.rst documentation which is subject to future IETF standardization (https://lwn.net/Articles/926882/), from Dave Thaler. 10) Fix BPF verifier in the __reg_bound_offset's 64->32 tnum sub-register known bits information propagation, from Daniel Borkmann. 11) Add skb bitfield compaction work related to BPF with the overall goal to make more of the sk_buff bits optional, from Jakub Kicinski. 12) BPF selftest cleanups for build id extraction which stand on its own from the upcoming integration work of build id into struct file object, from Jiri Olsa. 13) Add fixes and optimizations for xsk descriptor validation and several selftest improvements for xsk sockets, from Kal Conley. 14) Add BPF links for struct_ops and enable switching implementations of BPF TCP cong-ctls under a given name by replacing backing struct_ops map, from Kui-Feng Lee. 15) Remove a misleading BPF verifier env->bypass_spec_v1 check on variable offset stack read as earlier Spectre checks cover this, from Luis Gerhorst. 16) Fix issues in copy_from_user_nofault() for BPF and other tracers to resemble copy_from_user_nmi() from safety PoV, from Florian Lehner and Alexei Starovoitov. 17) Add --json-summary option to test_progs in order for CI tooling to ease parsing of test results, from Manu Bretelle. 18) Batch of improvements and refactoring to prep for upcoming bpf_local_storage conversion to bpf_mem_cache_{alloc,free} allocator, from Martin KaFai Lau. 19) Improve bpftool's visual program dump which produces the control flow graph in a DOT format by adding C source inline annotations, from Quentin Monnet. 20) Fix attaching fentry/fexit/fmod_ret/lsm to modules by extracting the module name from BTF of the target and searching kallsyms of the correct module, from Viktor Malik. 21) Improve BPF verifier handling of '<const> <cond> <non_const>' to better detect whether in particular jmp32 branches are taken, from Yonghong Song. 22) Allow BPF TCP cong-ctls to write app_limited of struct tcp_sock. A built-in cc or one from a kernel module is already able to write to app_limited, from Yixin Shen. Conflicts: Documentation/bpf/bpf_devel_QA.rst b7abcd9c656b ("bpf, doc: Link to submitting-patches.rst for general patch submission info") 0f10f647f455 ("bpf, docs: Use internal linking for link to netdev subsystem doc") https://lore.kernel.org/all/20230307095812.236eb1be@canb.auug.org.au/ include/net/ip_tunnels.h bc9d003dc48c3 ("ip_tunnel: Preserve pointer const in ip_tunnel_info_opts") ac931d4cdec3d ("ipip,ip_tunnel,sit: Add FOU support for externally controlled ipip devices") https://lore.kernel.org/all/20230413161235.4093777-1-broonie@kernel.org/ net/bpf/test_run.c e5995bc7e2ba ("bpf, test_run: fix crashes due to XDP frame overwriting/corruption") 294635a8165a ("bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES") https://lore.kernel.org/all/20230320102619.05b80a98@canb.auug.org.au/ ==================== Link: https://lore.kernel.org/r/20230413191525.7295-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-13	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Conflicts: tools/testing/selftests/net/config 62199e3f1658 ("selftests: net: Add VXLAN MDB test") 3a0385be133e ("selftests: add the missing CONFIG_IP_SCTP in net config") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-13	selftests/bpf: Adjust bpf_xdp_metadata_rx_hash for new arg	Jesper Dangaard Brouer
	Update BPF selftests to use the new RSS type argument for kfunc bpf_xdp_metadata_rx_hash. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/168132894068.340624.8914711185697163690.stgit@firesoul Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-13	selftests/bpf: xdp_hw_metadata remove bpf_printk and add counters	Jesper Dangaard Brouer
	The tool xdp_hw_metadata can be used by driver developers implementing XDP-hints metadata kfuncs. Remove all bpf_printk calls, as the tool already transfers all the XDP-hints related information via metadata area to AF_XDP userspace process. Add counters for providing remaining information about failure and skipped packet events. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/168132891533.340624.7313781245316405141.stgit@firesoul Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-13	selftests/bpf: Fix compiler warnings in bpf_testmod for kfuncs	Andrii Nakryiko
	Add -Wmissing-prototypes ignore in bpf_testmod.c, similarly to what we do in kernel code proper. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/oe-kbuild-all/202304080951.l14IDv3n-lkp@intel.com/ Link: https://lore.kernel.org/bpf/20230412034647.3968143-1-andrii@kernel.org
2023-04-13	selftests/bpf: Remove stand-along test_verifier_log test binary	Andrii Nakryiko
	test_prog's prog_tests/verifier_log.c is superseding test_verifier_log stand-alone test. It cover same checks and adds more, and is also integrated into test_progs test runner. Just remove test_verifier_log.c. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230412170655.1866831-1-andrii@kernel.org
2023-04-13	selftests/bpf: Keep the loop in bpf_testmod_loop_test	Song Liu
	Some compilers (for example clang-15) optimize bpf_testmod_loop_test and remove the loop: gcc version (gdb) disassemble bpf_testmod_loop_test Dump of assembler code for function bpf_testmod_loop_test: 0x0000000000000570 <+0>: callq 0x575 <bpf_testmod_loop_test+5> 0x0000000000000575 <+5>: xor %eax,%eax 0x0000000000000577 <+7>: test %edi,%edi 0x0000000000000579 <+9>: jle 0x587 <bpf_testmod_loop_test+23> 0x000000000000057b <+11>: xor %edx,%edx 0x000000000000057d <+13>: add %edx,%eax 0x000000000000057f <+15>: add $0x1,%edx 0x0000000000000582 <+18>: cmp %edx,%edi 0x0000000000000584 <+20>: jne 0x57d <bpf_testmod_loop_test+13> 0x0000000000000586 <+22>: retq 0x0000000000000587 <+23>: retq clang-15 version (gdb) disassemble bpf_testmod_loop_test Dump of assembler code for function bpf_testmod_loop_test: 0x0000000000000450 <+0>: nopl 0x0(%rax,%rax,1) 0x0000000000000455 <+5>: test %edi,%edi 0x0000000000000457 <+7>: jle 0x46b <bpf_testmod_loop_test+27> 0x0000000000000459 <+9>: lea -0x1(%rdi),%eax 0x000000000000045c <+12>: lea -0x2(%rdi),%ecx 0x000000000000045f <+15>: imul %rax,%rcx 0x0000000000000463 <+19>: shr %rcx 0x0000000000000466 <+22>: lea -0x1(%rdi,%rcx,1),%eax 0x000000000000046a <+26>: retq 0x000000000000046b <+27>: xor %eax,%eax 0x000000000000046d <+29>: retq Note: The jne instruction is removed in clang-15 version. Force the compile to keep the loop by making sum volatile. Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230412210423.900851-4-song@kernel.org
2023-04-13	selftests/bpf: Fix leaked bpf_link in get_stackid_cannot_attach	Song Liu
	skel->links.oncpu is leaked in one case. This causes test perf_branches fails when it runs after get_stackid_cannot_attach: ./test_progs -t get_stackid_cannot_attach,perf_branches 84 get_stackid_cannot_attach:OK test_perf_branches_common:PASS:test_perf_branches_load 0 nsec test_perf_branches_common:PASS:attach_perf_event 0 nsec test_perf_branches_common:PASS:set_affinity 0 nsec check_good_sample:FAIL:output not valid no valid sample from prog 146/1 perf_branches/perf_branches_hw:FAIL 146/2 perf_branches/perf_branches_no_hw:OK 146 perf_branches:FAIL All error logs: test_perf_branches_common:PASS:test_perf_branches_load 0 nsec test_perf_branches_common:PASS:attach_perf_event 0 nsec test_perf_branches_common:PASS:set_affinity 0 nsec check_good_sample:FAIL:output not valid no valid sample from prog 146/1 perf_branches/perf_branches_hw:FAIL 146 perf_branches:FAIL Summary: 1/1 PASSED, 0 SKIPPED, 1 FAILED Fix this by adding the missing bpf_link__destroy(). Fixes: 346938e9380c ("selftests/bpf: Add get_stackid_cannot_attach") Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230412210423.900851-3-song@kernel.org
2023-04-13	selftests/bpf: Use read_perf_max_sample_freq() in perf_event_stackmap	Song Liu
	Currently, perf_event sample period in perf_event_stackmap is set too low that the test fails randomly. Fix this by using the max sample frequency, from read_perf_max_sample_freq(). Move read_perf_max_sample_freq() to testing_helpers.c. Replace the CHECK() with if-printf, as CHECK is not available in testing_helpers.c. Fixes: 1da4864c2b20 ("selftests/bpf: Add callchain_stackid") Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230412210423.900851-2-song@kernel.org
2023-04-13	selftests/bpf: Fix use of uninitialized op_name in log tests	Lorenz Bauer
	One of the test assertions uses an uninitialized op_name, which leads to some headscratching if it fails. Use a string constant instead. Fixes: b1a7a480a112 ("selftests/bpf: Add fixed vs rotating verifier log tests") Signed-off-by: Lorenz Bauer <lmb@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230413094740.18041-1-lmb@isovalent.com
2023-04-12	selftests/bpf: Test FOU kfuncs for externally controlled ipip devices	Christian Ehrig
	Add tests for FOU and GUE encapsulation via the bpf_skb_{set,get}_fou_encap kfuncs, using ipip devices in collect-metadata mode. These tests make sure that we can successfully set and obtain FOU and GUE encap parameters using ingress / egress BPF tc-hooks. Signed-off-by: Christian Ehrig <cehrig@cloudflare.com> Link: https://lore.kernel.org/r/040193566ddbdb0b53eb359f7ac7bbd316f338b5.1680874078.git.cehrig@cloudflare.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-12	bpf: Remove bpf_cgroup_kptr_get() kfunc	David Vernet
	Now that bpf_cgroup_acquire() is KF_RCU \| KF_RET_NULL, bpf_cgroup_kptr_get() is redundant. Let's remove it, and update selftests to instead use bpf_cgroup_acquire() where appropriate. The next patch will update the BPF documentation to not mention bpf_cgroup_kptr_get(). Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230411041633.179404-2-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-12	bpf: Make bpf_cgroup_acquire() KF_RCU \| KF_RET_NULL	David Vernet
	struct cgroup is already an RCU-safe type in the verifier. We can therefore update bpf_cgroup_acquire() to be KF_RCU \| KF_RET_NULL, and subsequently remove bpf_cgroup_kptr_get(). This patch does the first of these by updating bpf_cgroup_acquire() to be KF_RCU \| KF_RET_NULL, and also updates selftests accordingly. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230411041633.179404-1-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-11	selftests/bpf: Add test to access u32 ptr argument in tracing program	Feng Zhou
	Adding verifier test for accessing u32 pointer argument in tracing programs. The test program loads 1nd argument of bpf_fentry_test9 function which is u32 pointer and checks that verifier allows that. Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230410085908.98493-3-zhoufeng.zf@bytedance.com
2023-04-11	selftests/bpf: Add verifier log tests for BPF_BTF_LOAD command	Andrii Nakryiko
	Add verifier log tests for BPF_BTF_LOAD command, which are very similar, conceptually, to BPF_PROG_LOAD tests. These are two separate commands dealing with verbose verifier log, so should be both tested separately. Test that log_buf==NULL condition does not return -ENOSPC. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Lorenz Bauer <lmb@isovalent.com> Link: https://lore.kernel.org/bpf/20230406234205.323208-20-andrii@kernel.org
2023-04-11	selftests/bpf: Add testing of log_buf==NULL condition for BPF_PROG_LOAD	Andrii Nakryiko
	Add few extra test conditions to validate that it's ok to pass log_buf==NULL and log_size==0 to BPF_PROG_LOAD command with the intent to get log_true_size without providing a buffer. Test that log_buf==NULL condition does not return -ENOSPC. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Lorenz Bauer <lmb@isovalent.com> Link: https://lore.kernel.org/bpf/20230406234205.323208-19-andrii@kernel.org
2023-04-11	selftests/bpf: Add tests to validate log_true_size feature	Andrii Nakryiko
	Add additional test cases validating that log_true_size is consistent between fixed and rotating log modes, and that log_true_size can be used exactly without causing -ENOSPC, while using just 1 byte shorter log buffer would cause -ENOSPC. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Lorenz Bauer <lmb@isovalent.com> Link: https://lore.kernel.org/bpf/20230406234205.323208-18-andrii@kernel.org
2023-04-11	selftests/bpf: Add fixed vs rotating verifier log tests	Andrii Nakryiko
	Add selftests validating BPF_LOG_FIXED behavior, which used to be the only behavior, and now default rotating BPF verifier log, which returns just up to last N bytes of full verifier log, instead of returning -ENOSPC. To stress test correctness of in-kernel verifier log logic, we force it to truncate program's verifier log to all lengths from 1 all the way to its full size (about 450 bytes today). This was a useful stress test while developing the feature. For both fixed and rotating log modes we expect -ENOSPC if log contents doesn't fit in user-supplied log buffer. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Lorenz Bauer <lmb@isovalent.com> Link: https://lore.kernel.org/bpf/20230406234205.323208-7-andrii@kernel.org
2023-04-11	veristat: Add more veristat control over verifier log options	Andrii Nakryiko
	Add --log-size to be able to customize log buffer sent to bpf() syscall for BPF program verification logging. Add --log-fixed to enforce BPF_LOG_FIXED behavior for BPF verifier log. This is useful in unlikely event that beginning of truncated verifier log is more important than the end of it (which with rotating verifier log behavior is the default now). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230406234205.323208-6-andrii@kernel.org
2023-04-11	bpf: Switch BPF verifier log to be a rotating log by default	Andrii Nakryiko
	Currently, if user-supplied log buffer to collect BPF verifier log turns out to be too small to contain full log, bpf() syscall returns -ENOSPC, fails BPF program verification/load, and preserves first N-1 bytes of the verifier log (where N is the size of user-supplied buffer). This is problematic in a bunch of common scenarios, especially when working with real-world BPF programs that tend to be pretty complex as far as verification goes and require big log buffers. Typically, it's when debugging tricky cases at log level 2 (verbose). Also, when BPF program is successfully validated, log level 2 is the only way to actually see verifier state progression and all the important details. Even with log level 1, it's possible to get -ENOSPC even if the final verifier log fits in log buffer, if there is a code path that's deep enough to fill up entire log, even if normally it would be reset later on (there is a logic to chop off successfully validated portions of BPF verifier log). In short, it's not always possible to pre-size log buffer. Also, what's worse, in practice, the end of the log most often is way more important than the beginning, but verifier stops emitting log as soon as initial log buffer is filled up. This patch switches BPF verifier log behavior to effectively behave as rotating log. That is, if user-supplied log buffer turns out to be too short, verifier will keep overwriting previously written log, effectively treating user's log buffer as a ring buffer. -ENOSPC is still going to be returned at the end, to notify user that log contents was truncated, but the important last N bytes of the log would be returned, which might be all that user really needs. This consistent -ENOSPC behavior, regardless of rotating or fixed log behavior, allows to prevent backwards compatibility breakage. The only user-visible change is which portion of verifier log user ends up seeing if buffer is too small. Given contents of verifier log itself is not an ABI, there is no breakage due to this behavior change. Specialized tools that rely on specific contents of verifier log in -ENOSPC scenario are expected to be easily adapted to accommodate old and new behaviors. Importantly, though, to preserve good user experience and not require every user-space application to adopt to this new behavior, before exiting to user-space verifier will rotate log (in place) to make it start at the very beginning of user buffer as a continuous zero-terminated string. The contents will be a chopped off N-1 last bytes of full verifier log, of course. Given beginning of log is sometimes important as well, we add BPF_LOG_FIXED (which equals 8) flag to force old behavior, which allows tools like veristat to request first part of verifier log, if necessary. BPF_LOG_FIXED flag is also a simple and straightforward way to check if BPF verifier supports rotating behavior. On the implementation side, conceptually, it's all simple. We maintain 64-bit logical start and end positions. If we need to truncate the log, start position will be adjusted accordingly to lag end position by N bytes. We then use those logical positions to calculate their matching actual positions in user buffer and handle wrap around the end of the buffer properly. Finally, right before returning from bpf_check(), we rotate user log buffer contents in-place as necessary, to make log contents contiguous. See comments in relevant functions for details. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Lorenz Bauer <lmb@isovalent.com> Link: https://lore.kernel.org/bpf/20230406234205.323208-4-andrii@kernel.org
2023-04-10	selftests/bpf: Reset err when symbol name already exist in kprobe_multi_test	Manu Bretelle
	When trying to add a name to the hashmap, an error code of EEXIST is returned and we continue as names are possibly duplicated in the sys file. If the last name in the file is a duplicate, we will continue to the next iteration of the while loop, and exit the loop with a value of err set to EEXIST and enter the error label with err set, which causes the test to fail when it should not. This change reset err to 0 before continue-ing into the next iteration, this way, if there is no more data to read from the file we iterate through, err will be set to 0. Behaviour prior to this change: ``` test_kprobe_multi_bench_attach:FAIL:get_syms unexpected error: -17 (errno 2) All error logs: test_kprobe_multi_bench_attach:FAIL:get_syms unexpected error: -17 (errno 2) Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED ``` After this change: ``` Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED ``` Signed-off-by: Manu Bretelle <chantr4@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230408022919.54601-1-chantr4@gmail.com
2023-04-07	selftests/bpf: Prevent infinite loop in veristat when base file is too short	Eduard Zingerman
	The following example forces veristat to loop indefinitely: $ cat two-ok file_name,prog_name,verdict,total_states file-a,a,success,12 file-b,b,success,67 $ cat add-failure file_name,prog_name,verdict,total_states file-a,a,success,12 file-b,b,success,67 file-b,c,failure,32 $ veristat -C two-ok add-failure <does not return> The loop is caused by handle_comparison_mode() not checking if `base` variable points to `fallback_stats` prior advancing joined results using `base`. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230407154125.896927-1-eddyz87@gmail.com
2023-04-07	selftests/bpf: Use PERF_COUNT_HW_CPU_CYCLES event for get_branch_snapshot	Song Liu
	perf_event with type=PERF_TYPE_RAW and config=0x1b00 turned out to be not reliable in ensuring LBR is active. Thus, test_progs:get_branch_snapshot is not reliable in some systems. Replace it with PERF_COUNT_HW_CPU_CYCLES event, which gives more consistent results. Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230407190130.2093736-1-song@kernel.org
2023-04-06	selftests/bpf: Add verifier tests for code pattern '<const> <cond_op> ↵	Yonghong Song
	<non_const>' Add various tests for code pattern '<const> <cond_op> <non_const>' to exercise the previous verifier patch. The following are veristat changed number of processed insns stat comparing the previous patch vs. this patch: File Program Insns (A) Insns (B) Insns (DIFF) ----------------------------------------------------- ---------------------------------------------------- --------- --------- ------------- test_seg6_loop.bpf.linked3.o __add_egr_x 12423 12314 -109 (-0.88%) Only one program is affected with minor change. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230406164510.1047757-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-06	bpf: Improve handling of pattern '<const> <cond_op> <non_const>' in verifier	Yonghong Song
	Currently, the verifier does not handle '<const> <cond_op> <non_const>' well. For example, ... 10: (79) r1 = (u64 )(r10 -16) ; R1_w=scalar() R10=fp0 11: (b7) r2 = 0 ; R2_w=0 12: (2d) if r2 > r1 goto pc+2 13: (b7) r0 = 0 14: (95) exit 15: (65) if r1 s> 0x1 goto pc+3 16: (0f) r0 += r1 ... At insn 12, verifier decides both true and false branch are possible, but actually only false branch is possible. Currently, the verifier already supports patterns '<non_const> <cond_op> <const>. Add support for patterns '<const> <cond_op> <non_const>' in a similar way. Also fix selftest 'verifier_bounds_mix_sign_unsign/bounds checks mixing signed and unsigned, variant 10' due to this change. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230406164505.1046801-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-06	selftests/bpf: Add tests for non-constant cond_op NE/EQ bound deduction	Yonghong Song
	Add various tests for code pattern '<non-const> NE/EQ <const>' implemented in the previous verifier patch. Without the verifier patch, these new tests will fail. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230406164500.1045715-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-06	selftests: xsk: Add test UNALIGNED_INV_DESC_4K1_FRAME_SIZE	Kal Conley
	Add unaligned descriptor test for frame size of 4001. Using an odd frame size ensures that the end of the UMEM is not near a page boundary. This allows testing descriptors that staddle the end of the UMEM but not a page. This test used to fail without the previous commit ("xsk: Fix unaligned descriptor validation"). Signed-off-by: Kal Conley <kal.conley@dectris.com> Link: https://lore.kernel.org/r/20230405235920.7305-3-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-06	selftests/bpf: fix xdp_redirect xdp-features selftest for veth driver	Lorenzo Bianconi
	xdp-features supported by veth driver are no more static, but they depends on veth configuration (e.g. if GRO is enabled/disabled or TX/RX queue configuration). Take it into account in xdp_redirect xdp-features selftest for veth driver. Fixes: fccca038f300 ("veth: take into account device reconfiguration for xdp_features flag") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/bc35455cfbb1d4f7f52536955ded81ad47d8dc54.1680777371.git.lorenzo@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05	selftests/bpf: Wait for receive in cg_storage_multi test	YiFei Zhu
	In some cases the loopback latency might be large enough, causing the assertion on invocations to be run before ingress prog getting executed. The assertion would fail and the test would flake. This can be reliably reproduced by arbitrarily increasing the loopback latency (thanks to [1]): tc qdisc add dev lo root handle 1: htb default 12 tc class add dev lo parent 1:1 classid 1:12 htb rate 20kbps ceil 20kbps tc qdisc add dev lo parent 1:12 netem delay 100ms Fix this by waiting on the receive end, instead of instantly returning to the assert. The call to read() will wait for the default SO_RCVTIMEO timeout of 3 seconds provided by start_server(). [1] https://gist.github.com/kstevens715/4598301 Reported-by: Martin KaFai Lau <martin.lau@linux.dev> Link: https://lore.kernel.org/bpf/9c5c8b7e-1d89-a3af-5400-14fde81f4429@linux.dev/ Fixes: 3573f384014f ("selftests/bpf: Test CGROUP_STORAGE behavior on shared egress + ingress") Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: YiFei Zhu <zhuyifei@google.com> Link: https://lore.kernel.org/r/20230405193354.1956209-1-zhuyifei@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05	selftests: xsk: Deflakify STATS_RX_DROPPED test	Kal Conley
	Fix flaky STATS_RX_DROPPED test. The receiver calls getsockopt after receiving the last (valid) packet which is not the final packet sent in the test (valid and invalid packets are sent in alternating fashion with the final packet being invalid). Since the last packet may or may not have been dropped already, both outcomes must be allowed. This issue could also be fixed by making sure the last packet sent is valid. This alternative is left as an exercise to the reader (or the benevolent maintainers of this file). This problem was quite visible on certain setups. On one machine this failure was observed 50% of the time. Also, remove a redundant assignment of pkt_stream->nb_pkts. This field is already initialized by __pkt_stream_alloc. Fixes: 27e934bec35b ("selftests: xsk: make stat tests not spin on getsockopt") Signed-off-by: Kal Conley <kal.conley@dectris.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230403120400.31018-1-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05	selftests: xsk: Disable IPv6 on VETH1	Kal Conley
	This change fixes flakiness in the BIDIRECTIONAL test: # [is_pkt_valid] expected length [60], got length [90] not ok 1 FAIL: SKB BUSY-POLL BIDIRECTIONAL When IPv6 is enabled, the interface will periodically send MLDv1 and MLDv2 packets. These packets can cause the BIDIRECTIONAL test to fail since it uses VETH0 for RX. For other tests, this was not a problem since they only receive on VETH1 and IPv6 was already disabled on VETH0. Fixes: a89052572ebb ("selftests/bpf: Xsk selftests framework") Signed-off-by: Kal Conley <kal.conley@dectris.com> Link: https://lore.kernel.org/r/20230405082905.6303-1-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05	selftests: xsk: Add test case for packets at end of UMEM	Kal Conley
	Add test case to testapp_invalid_desc for valid packets at the end of the UMEM. Signed-off-by: Kal Conley <kal.conley@dectris.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230403145047.33065-3-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05	selftests: xsk: Use correct UMEM size in testapp_invalid_desc	Kal Conley
	Avoid UMEM_SIZE macro in testapp_invalid_desc which is incorrect when the frame size is not XSK_UMEM__DEFAULT_FRAME_SIZE. Also remove the macro since it's no longer being used. Fixes: 909f0e28207c ("selftests: xsk: Add tests for 2K frame size") Signed-off-by: Kal Conley <kal.conley@dectris.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230403145047.33065-2-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05	selftests: xsk: Add xskxceiver.h dependency to Makefile	Kal Conley
	xskxceiver depends on xskxceiver.h so tell make about it. Signed-off-by: Kal Conley <kal.conley@dectris.com> Link: https://lore.kernel.org/r/20230403130151.31195-1-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-04	selftests/bpf: Add tracing tests for walking skb and req.	Alexei Starovoitov
	Add tracing tests for walking skb->sk and req->sk. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230404045029.82870-9-alexei.starovoitov@gmail.com
2023-04-04	selftests/bpf: Add RESOLVE_BTFIDS dependency to bpf_testmod.ko	Ilya Leoshkevich
	bpf_testmod.ko sometimes fails to build from a clean checkout: BTF [M] linux/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.ko /bin/sh: 1: linux-build//tools/build/resolve_btfids/resolve_btfids: not found The reason is that RESOLVE_BTFIDS may not yet be built. Fix by adding a dependency. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230403172935.1553022-1-iii@linux.ibm.com
2023-04-01	bpf: Remove now-defunct task kfuncs	David Vernet
	In commit 22df776a9a86 ("tasks: Extract rcu_users out of union"), the 'refcount_t rcu_users' field was extracted out of a union with the 'struct rcu_head rcu' field. This allows us to safely perform a refcount_inc_not_zero() on task->rcu_users when acquiring a reference on a task struct. A prior patch leveraged this by making struct task_struct an RCU-protected object in the verifier, and by bpf_task_acquire() to use the task->rcu_users field for synchronization. Now that we can use RCU to protect tasks, we no longer need bpf_task_kptr_get(), or bpf_task_acquire_not_zero(). bpf_task_kptr_get() is truly completely unnecessary, as we can just use RCU to get the object. bpf_task_acquire_not_zero() is now equivalent to bpf_task_acquire(). In addition to these changes, this patch also updates the associated selftests to no longer use these kfuncs. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230331195733.699708-3-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01	bpf: Make struct task_struct an RCU-safe type	David Vernet
	struct task_struct objects are a bit interesting in terms of how their lifetime is protected by refcounts. task structs have two refcount fields: 1. refcount_t usage: Protects the memory backing the task struct. When this refcount drops to 0, the task is immediately freed, without waiting for an RCU grace period to elapse. This is the field that most callers in the kernel currently use to ensure that a task remains valid while it's being referenced, and is what's currently tracked with bpf_task_acquire() and bpf_task_release(). 2. refcount_t rcu_users: A refcount field which, when it drops to 0, schedules an RCU callback that drops a reference held on the 'usage' field above (which is acquired when the task is first created). This field therefore provides a form of RCU protection on the task by ensuring that at least one 'usage' refcount will be held until an RCU grace period has elapsed. The qualifier "a form of" is important here, as a task can remain valid after task->rcu_users has dropped to 0 and the subsequent RCU gp has elapsed. In terms of BPF, we want to use task->rcu_users to protect tasks that function as referenced kptrs, and to allow tasks stored as referenced kptrs in maps to be accessed with RCU protection. Let's first determine whether we can safely use task->rcu_users to protect tasks stored in maps. All of the bpf_task* kfuncs can only be called from tracepoint, struct_ops, or BPF_PROG_TYPE_SCHED_CLS, program types. For tracepoint and struct_ops programs, the struct task_struct passed to a program handler will always be trusted, so it will always be safe to call bpf_task_acquire() with any task passed to a program. Note, however, that we must update bpf_task_acquire() to be KF_RET_NULL, as it is possible that the task has exited by the time the program is invoked, even if the pointer is still currently valid because the main kernel holds a task->usage refcount. For BPF_PROG_TYPE_SCHED_CLS, tasks should never be passed as an argument to the any program handlers, so it should not be relevant. The second question is whether it's safe to use RCU to access a task that was acquired with bpf_task_acquire(), and stored in a map. Because bpf_task_acquire() now uses task->rcu_users, it follows that if the task is present in the map, that it must have had at least one task->rcu_users refcount by the time the current RCU cs was started. Therefore, it's safe to access that task until the end of the current RCU cs. With all that said, this patch makes struct task_struct is an RCU-protected object. In doing so, we also change bpf_task_acquire() to be KF_ACQUIRE \| KF_RCU \| KF_RET_NULL, and adjust any selftests as necessary. A subsequent patch will remove bpf_task_kptr_get(), and bpf_task_acquire_not_zero() respectively. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230331195733.699708-2-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01	veristat: small fixed found in -O2 mode	Andrii Nakryiko
	Fix few potentially unitialized variables uses, found while building veristat.c in release (-O2) mode. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230331222405.3468634-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01	veristat: avoid using kernel-internal headers	Andrii Nakryiko
	Drop linux/compiler.h include, which seems to be needed for ARRAY_SIZE macro only. Redefine own version of ARRAY_SIZE instead. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230331222405.3468634-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01	veristat: improve version reporting	Andrii Nakryiko
	For packaging version of the tool is important, so add a simple way to specify veristat version for upstream mirror at Github. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230331222405.3468634-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01	veristat: relicense veristat.c as dual GPL-2.0-only or BSD-2-Clause licensed	Andrii Nakryiko
	Dual-license veristat.c to dual GPL-2.0-only or BSD-2-Clause license. This is needed to mirror it to Github to make it convenient for distro packagers to package veristat as a separate package. Veristat grew into a useful tool by itself, and there are already a bunch of users relying on veristat as generic BPF loading and verification helper tool. So making it easy to packagers by providing Github mirror just like we do for bpftool and libbpf is the next step to get veristat into the hands of users. Apart from few typo fixes, I'm the sole contributor to veristat.c so far, so no extra Acks should be needed for relicensing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230331222405.3468634-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-31	selftests/bpf: Fix conflicts with built-in functions in ↵	James Hilliard
	bench_local_storage_create The fork function in gcc is considered a built in function due to being used by libgcov when building with gnu extensions. Rename fork to sched_process_fork to prevent this conflict. See details: https://github.com/gcc-mirror/gcc/commit/d1c38823924506d389ca58d02926ace21bdf82fa https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82457 Fixes the following error: In file included from progs/bench_local_storage_create.c:6: progs/bench_local_storage_create.c:43:14: error: conflicting types for built-in function 'fork'; expected 'int(void)' [-Werror=builtin-declaration-mismatch] 43 \| int BPF_PROG(fork, struct task_struct parent, struct task_struct child) \| ^~~~ Fixes: cbe9d93d58b1 ("selftests/bpf: Add bench for task storage creation") Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230331075848.1642814-1-james.hilliard1@gmail.com