summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2023-12-06bpf: add BPF token support to BPF_PROG_LOAD commandAndrii Nakryiko
Add basic support of BPF token to BPF_PROG_LOAD. Wire through a set of allowed BPF program types and attach types, derived from BPF FS at BPF token creation time. Then make sure we perform bpf_token_capable() checks everywhere where it's relevant. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: add BPF token support to BPF_BTF_LOAD commandAndrii Nakryiko
Accept BPF token FD in BPF_BTF_LOAD command to allow BTF data loading through delegated BPF token. BTF loading is a pretty straightforward operation, so as long as BPF token is created with allow_cmds granting BPF_BTF_LOAD command, kernel proceeds to parsing BTF data and creating BTF object. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: add BPF token support to BPF_MAP_CREATE commandAndrii Nakryiko
Allow providing token_fd for BPF_MAP_CREATE command to allow controlled BPF map creation from unprivileged process through delegated BPF token. Wire through a set of allowed BPF map types to BPF token, derived from BPF FS at BPF token creation time. This, in combination with allowed_cmds allows to create a narrowly-focused BPF token (controlled by privileged agent) with a restrictive set of BPF maps that application can attempt to create. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: introduce BPF token objectAndrii Nakryiko
Add new kind of BPF kernel object, BPF token. BPF token is meant to allow delegating privileged BPF functionality, like loading a BPF program or creating a BPF map, from privileged process to a *trusted* unprivileged process, all while having a good amount of control over which privileged operations could be performed using provided BPF token. This is achieved through mounting BPF FS instance with extra delegation mount options, which determine what operations are delegatable, and also constraining it to the owning user namespace (as mentioned in the previous patch). BPF token itself is just a derivative from BPF FS and can be created through a new bpf() syscall command, BPF_TOKEN_CREATE, which accepts BPF FS FD, which can be attained through open() API by opening BPF FS mount point. Currently, BPF token "inherits" delegated command, map types, prog type, and attach type bit sets from BPF FS as is. In the future, having an BPF token as a separate object with its own FD, we can allow to further restrict BPF token's allowable set of things either at the creation time or after the fact, allowing the process to guard itself further from unintentionally trying to load undesired kind of BPF programs. But for now we keep things simple and just copy bit sets as is. When BPF token is created from BPF FS mount, we take reference to the BPF super block's owning user namespace, and then use that namespace for checking all the {CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN} capabilities that are normally only checked against init userns (using capable()), but now we check them using ns_capable() instead (if BPF token is provided). See bpf_token_capable() for details. Such setup means that BPF token in itself is not sufficient to grant BPF functionality. User namespaced process has to *also* have necessary combination of capabilities inside that user namespace. So while previously CAP_BPF was useless when granted within user namespace, now it gains a meaning and allows container managers and sys admins to have a flexible control over which processes can and need to use BPF functionality within the user namespace (i.e., container in practice). And BPF FS delegation mount options and derived BPF tokens serve as a per-container "flag" to grant overall ability to use bpf() (plus further restrict on which parts of bpf() syscalls are treated as namespaced). Note also, BPF_TOKEN_CREATE command itself requires ns_capable(CAP_BPF) within the BPF FS owning user namespace, rounding up the ns_capable() story of BPF token. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05selftests/bpf: validate precision logic in partial_stack_load_preserves_zerosAndrii Nakryiko
Enhance partial_stack_load_preserves_zeros subtest with detailed precision propagation log checks. We know expect fp-16 to be spilled, initially imprecise, zero const register, which is later marked as precise even when partial stack slot load is performed, even if it's not a register fill (!). Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-10-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05selftests/bpf: validate zero preservation for sub-slot loadsAndrii Nakryiko
Validate that 1-, 2-, and 4-byte loads from stack slots not aligned on 8-byte boundary still preserve zero, when loading from all-STACK_ZERO sub-slots, or when stack sub-slots are covered by spilled register with known constant zero value. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-8-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05selftests/bpf: validate STACK_ZERO is preserved on subreg spillAndrii Nakryiko
Add tests validating that STACK_ZERO slots are preserved when slot is partially overwritten with subregister spill. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05selftests/bpf: add stack access precision testAndrii Nakryiko
Add a new selftests that validates precision tracking for stack access instruction, using both r10-based and non-r10-based accesses. For non-r10 ones we also make sure to have non-zero var_off to validate that final stack offset is tracked properly in instruction history information inside verifier. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05bpf: support non-r10 register spill/fill to/from stack in precision trackingAndrii Nakryiko
Use instruction (jump) history to record instructions that performed register spill/fill to/from stack, regardless if this was done through read-only r10 register, or any other register after copying r10 into it *and* potentially adjusting offset. To make this work reliably, we push extra per-instruction flags into instruction history, encoding stack slot index (spi) and stack frame number in extra 10 bit flags we take away from prev_idx in instruction history. We don't touch idx field for maximum performance, as it's checked most frequently during backtracking. This change removes basically the last remaining practical limitation of precision backtracking logic in BPF verifier. It fixes known deficiencies, but also opens up new opportunities to reduce number of verified states, explored in the subsequent patches. There are only three differences in selftests' BPF object files according to veristat, all in the positive direction (less states). File Program Insns (A) Insns (B) Insns (DIFF) States (A) States (B) States (DIFF) -------------------------------------- ------------- --------- --------- ------------- ---------- ---------- ------------- test_cls_redirect_dynptr.bpf.linked3.o cls_redirect 2987 2864 -123 (-4.12%) 240 231 -9 (-3.75%) xdp_synproxy_kern.bpf.linked3.o syncookie_tc 82848 82661 -187 (-0.23%) 5107 5073 -34 (-0.67%) xdp_synproxy_kern.bpf.linked3.o syncookie_xdp 85116 84964 -152 (-0.18%) 5162 5130 -32 (-0.62%) Note, I avoided renaming jmp_history to more generic insn_hist to minimize number of lines changed and potential merge conflicts between bpf and bpf-next trees. Notice also cur_hist_entry pointer reset to NULL at the beginning of instruction verification loop. This pointer avoids the problem of relying on last jump history entry's insn_idx to determine whether we already have entry for current instruction or not. It can happen that we added jump history entry because current instruction is_jmp_point(), but also we need to add instruction flags for stack access. In this case, we don't want to entries, so we need to reuse last added entry, if it is present. Relying on insn_idx comparison has the same ambiguity problem as the one that was fixed recently in [0], so we avoid that. [0] https://patchwork.kernel.org/project/netdevbpf/patch/20231110002638.4168352-3-andrii@kernel.org/ Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reported-by: Tao Lyu <tao.lyu@epfl.ch> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05selftests/bpf: Make sure we trigger metadata kfuncs for dst 8080Stanislav Fomichev
xdp_metadata test is flaky sometimes: verify_xsk_metadata:FAIL:rx_hash_type unexpected rx_hash_type: actual 8 != expected 0 Where 8 means XDP_RSS_TYPE_L4_ANY and is exported from veth driver only when 'skb->l4_hash' condition is met. This makes me think that the program is triggering again for some other packet. Let's have a filter, similar to xdp_hw_metadata, where we trigger XDP kfuncs only for UDP packets destined to port 8080. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231204174423.3460052-1-sdf@google.com
2023-12-05selftests/bpf: Test bpf_kptr_xchg stashing of bpf_rb_rootDave Marchevsky
There was some confusion amongst Meta sched_ext folks regarding whether stashing bpf_rb_root - the tree itself, rather than a single node - was supported. This patch adds a small test which demonstrates this functionality: a local kptr with rb_root is created, a node is created and added to the tree, then the tree is kptr_xchg'd into a mapval. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20231204211722.571346-1-davemarchevsky@fb.com
2023-12-04selftests/bpf: Test outer map update operations in syscall programHou Tao
Syscall program is running with rcu_read_lock_trace being held, so if bpf_map_update_elem() or bpf_map_delete_elem() invokes synchronize_rcu_tasks_trace() when operating on an outer map, there will be dead-lock, so add a test to guarantee that it is dead-lock free. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20231204140425.1480317-8-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-04selftests/bpf: Add test cases for inner mapHou Tao
Add test cases to test the race between the destroy of inner map due to map-in-map update and the access of inner map in bpf program. The following 4 combinations are added: (1) array map in map array + bpf program (2) array map in map array + sleepable bpf program (3) array map in map htab + bpf program (4) array map in map htab + sleepable bpf program Before applying the fixes, when running `./test_prog -a map_in_map`, the following error was reported: ================================================================== BUG: KASAN: slab-use-after-free in array_map_update_elem+0x48/0x3e0 Read of size 4 at addr ffff888114f33824 by task test_progs/1858 CPU: 1 PID: 1858 Comm: test_progs Tainted: G O 6.6.0+ #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ...... Call Trace: <TASK> dump_stack_lvl+0x4a/0x90 print_report+0xd2/0x620 kasan_report+0xd1/0x110 __asan_load4+0x81/0xa0 array_map_update_elem+0x48/0x3e0 bpf_prog_be94a9f26772f5b7_access_map_in_array+0xe6/0xf6 trace_call_bpf+0x1aa/0x580 kprobe_perf_func+0xdd/0x430 kprobe_dispatcher+0xa0/0xb0 kprobe_ftrace_handler+0x18b/0x2e0 0xffffffffc02280f7 RIP: 0010:__x64_sys_getpgid+0x1/0x30 ...... </TASK> Allocated by task 1857: kasan_save_stack+0x26/0x50 kasan_set_track+0x25/0x40 kasan_save_alloc_info+0x1e/0x30 __kasan_kmalloc+0x98/0xa0 __kmalloc_node+0x6a/0x150 __bpf_map_area_alloc+0x141/0x170 bpf_map_area_alloc+0x10/0x20 array_map_alloc+0x11f/0x310 map_create+0x28a/0xb40 __sys_bpf+0x753/0x37c0 __x64_sys_bpf+0x44/0x60 do_syscall_64+0x36/0xb0 entry_SYSCALL_64_after_hwframe+0x6e/0x76 Freed by task 11: kasan_save_stack+0x26/0x50 kasan_set_track+0x25/0x40 kasan_save_free_info+0x2b/0x50 __kasan_slab_free+0x113/0x190 slab_free_freelist_hook+0xd7/0x1e0 __kmem_cache_free+0x170/0x260 kfree+0x9b/0x160 kvfree+0x2d/0x40 bpf_map_area_free+0xe/0x20 array_map_free+0x120/0x2c0 bpf_map_free_deferred+0xd7/0x1e0 process_one_work+0x462/0x990 worker_thread+0x370/0x670 kthread+0x1b0/0x200 ret_from_fork+0x3a/0x70 ret_from_fork_asm+0x1b/0x30 Last potentially related work creation: kasan_save_stack+0x26/0x50 __kasan_record_aux_stack+0x94/0xb0 kasan_record_aux_stack_noalloc+0xb/0x20 __queue_work+0x331/0x950 queue_work_on+0x75/0x80 bpf_map_put+0xfa/0x160 bpf_map_fd_put_ptr+0xe/0x20 bpf_fd_array_map_update_elem+0x174/0x1b0 bpf_map_update_value+0x2b7/0x4a0 __sys_bpf+0x2551/0x37c0 __x64_sys_bpf+0x44/0x60 do_syscall_64+0x36/0xb0 entry_SYSCALL_64_after_hwframe+0x6e/0x76 Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20231204140425.1480317-7-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-04selftests/bpf: Fix spelling mistake "get_signaure_size" -> "get_signature_size"Colin Ian King
There is a spelling mistake in an ASSERT_GT message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231204093940.2611954-1-colin.i.king@gmail.com
2023-12-02bpf: simplify tnum output if a fully known constantAndrii Nakryiko
Emit tnum representation as just a constant if all bits are known. Use decimal-vs-hex logic to determine exact format of emitted constant value, just like it's done for register range values. For that move tnum_strn() to kernel/bpf/log.c to reuse decimal-vs-hex determination logic and constants. Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231202175705.885270-12-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-02selftests/bpf: adjust global_func15 test to validate prog exit precisionAndrii Nakryiko
Add one more subtest to global_func15 selftest to validate that verifier properly marks r0 as precise and avoids erroneous state pruning of the branch that has return value outside of expected [0, 1] value. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231202175705.885270-11-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-02selftests/bpf: validate async callback return value check correctnessAndrii Nakryiko
Adjust timer/timer_ret_1 test to validate more carefully verifier logic of enforcing async callback return value. This test will pass only if return result is marked precise and read. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231202175705.885270-10-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-02bpf: enforce precise retval range on program exitAndrii Nakryiko
Similarly to subprog/callback logic, enforce return value of BPF program using more precise smin/smax range. We need to adjust a bunch of tests due to a changed format of an error message. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231202175705.885270-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-02selftests/bpf: add selftest validating callback result is enforcedAndrii Nakryiko
BPF verifier expects callback subprogs to return values from specified range (typically [0, 1]). This requires that r0 at exit is both precise (because we rely on specific value range) and is marked as read (otherwise state comparison will ignore such register as unimportant). Add a simple test that validates that all these conditions are enforced. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231202175705.885270-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-02bpf: provide correct register name for exception callback retval checkAndrii Nakryiko
bpf_throw() is checking R1, so let's report R1 in the log. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231202175705.885270-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-01selftests/bpf: Add test that uses fsverity and xattr to sign a fileSong Liu
This selftests shows a proof of concept method to use BPF LSM to enforce file signature. This test is added to verify_pkcs7_sig, so that some existing logic can be reused. This file signature method uses fsverity, which provides reliable and efficient hash (known as digest) of the file. The file digest is signed with asymmetic key, and the signature is stored in xattr. At the run time, BPF LSM reads file digest and the signature, and then checks them against the public key. Note that this solution does NOT require FS_VERITY_BUILTIN_SIGNATURES. fsverity is only used to provide file digest. The signature verification and access control is all implemented in BPF LSM. Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231129234417.856536-7-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-01selftests/bpf: Add tests for filesystem kfuncsSong Liu
Add selftests for two new filesystem kfuncs: 1. bpf_get_file_xattr 2. bpf_get_fsverity_digest These tests simply make sure the two kfuncs work. Another selftest will be added to demonstrate how to use these kfuncs to verify file signature. CONFIG_FS_VERITY is added to selftests config. However, this is not sufficient to guarantee bpf_get_fsverity_digest works. This is because fsverity need to be enabled at file system level (for example, with tune2fs on ext4). If local file system doesn't have this feature enabled, just skip the test. Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231129234417.856536-6-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-01selftests/bpf: Sort config in alphabetic orderSong Liu
Move CONFIG_VSOCKETS up, so the CONFIGs are in alphabetic order. Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231129234417.856536-5-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-01selftests/bpf: Fix erroneous bitmask operationJeroen van Ingen Schenau
xdp_synproxy_kern.c is a BPF program that generates SYN cookies on allowed TCP ports and sends SYNACKs to clients, accelerating synproxy iptables module. Fix the bitmask operation when checking the status of an existing conntrack entry within tcp_lookup() function. Do not AND with the bit position number, but with the bitmask value to check whether the entry found has the IPS_CONFIRMED flag set. Fixes: fb5cd0ce70d4 ("selftests/bpf: Add selftests for raw syncookie helpers") Signed-off-by: Jeroen van Ingen Schenau <jeroen.vaningenschenau@novoserve.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Minh Le Hoang <minh.lehoang@novoserve.com> Link: https://lore.kernel.org/xdp-newbies/CAAi1gX7owA+Tcxq-titC-h-KPM7Ri-6ZhTNMhrnPq5gmYYwKow@mail.gmail.com/T/#u Link: https://lore.kernel.org/bpf/20231130120353.3084-1-jeroen.vaningenschenau@novoserve.com
2023-11-30selftests: tc-testing: remove filters/tests.jsonPedro Tammela
Remove this generic file and move the tests to their appropriate files Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20231129222424.910148-5-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-30selftests: tc-testing: rename concurrency.json to flower.jsonPedro Tammela
All tests in this file pertain to flower, so name it appropriately Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20231129222424.910148-4-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-30selftests: tc-testing: remove spurious './' from MakefilePedro Tammela
Patchwork CI didn't like the extra './', so remove it. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20231129222424.910148-3-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-30selftests: tc-testing: remove spurious nsPlugin usagePedro Tammela
Tests using DEV2 should not be run in a dedicated net namespace, and in parallel, as this device cannot be shared. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20231129222424.910148-2-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-30docs: netlink: link to family documentations from spec infoJakub Kicinski
To increase the chances of people finding the rendered docs add a link to specs.rst and index.rst. Add a label in the generated index.rst and while at it adjust the title a little bit. Reviewed-by: Breno Leitao <leitao@debian.org> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231129041427.2763074-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-30Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-11-30 We've added 30 non-merge commits during the last 7 day(s) which contain a total of 58 files changed, 1598 insertions(+), 154 deletions(-). The main changes are: 1) Add initial TX metadata implementation for AF_XDP with support in mlx5 and stmmac drivers. Two types of offloads are supported right now, that is, TX timestamp and TX checksum offload, from Stanislav Fomichev with stmmac implementation from Song Yoong Siang. 2) Change BPF verifier logic to validate global subprograms lazily instead of unconditionally before the main program, so they can be guarded using BPF CO-RE techniques, from Andrii Nakryiko. 3) Add BPF link_info support for uprobe multi link along with bpftool integration for the latter, from Jiri Olsa. 4) Use pkg-config in BPF selftests to determine ld flags which is in particular needed for linking statically, from Akihiko Odaki. 5) Fix a few BPF selftest failures to adapt to the upcoming LLVM18, from Yonghong Song. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (30 commits) bpf/tests: Remove duplicate JSGT tests selftests/bpf: Add TX side to xdp_hw_metadata selftests/bpf: Convert xdp_hw_metadata to XDP_USE_NEED_WAKEUP selftests/bpf: Add TX side to xdp_metadata selftests/bpf: Add csum helpers selftests/xsk: Support tx_metadata_len xsk: Add option to calculate TX checksum in SW xsk: Validate xsk_tx_metadata flags xsk: Document tx_metadata_len layout net: stmmac: Add Tx HWTS support to XDP ZC net/mlx5e: Implement AF_XDP TX timestamp and checksum offload tools: ynl: Print xsk-features from the sample xsk: Add TX timestamp and TX checksum offload support xsk: Support tx_metadata_len selftests/bpf: Use pkg-config for libelf selftests/bpf: Override PKG_CONFIG for static builds selftests/bpf: Choose pkg-config for the target bpftool: Add support to display uprobe_multi links selftests/bpf: Add link_info test for uprobe_multi link selftests/bpf: Use bpf_link__destroy in fill_link_info tests ... ==================== Conflicts: Documentation/netlink/specs/netdev.yaml: 839ff60df3ab ("net: page_pool: add nlspec for basic access to page pools") 48eb03dd2630 ("xsk: Add TX timestamp and TX checksum offload support") https://lore.kernel.org/all/20231201094705.1ee3cab8@canb.auug.org.au/ While at it also regen, tree is dirty after: 48eb03dd2630 ("xsk: Add TX timestamp and TX checksum offload support") looks like code wasn't re-rendered after "render-max" was removed. Link: https://lore.kernel.org/r/20231130145708.32573-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-01Merge tag 'net-6.7-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf and wifi. Current release - regressions: - neighbour: fix __randomize_layout crash in struct neighbour - r8169: fix deadlock on RTL8125 in jumbo mtu mode Previous releases - regressions: - wifi: - mac80211: fix warning at station removal time - cfg80211: fix CQM for non-range use - tools: ynl-gen: fix unexpected response handling - octeontx2-af: fix possible buffer overflow - dpaa2: recycle the RX buffer only after all processing done - rswitch: fix missing dev_kfree_skb_any() in error path Previous releases - always broken: - ipv4: fix uaf issue when receiving igmp query packet - wifi: mac80211: fix debugfs deadlock at device removal time - bpf: - sockmap: af_unix stream sockets need to hold ref for pair sock - netdevsim: don't accept device bound programs - selftests: fix a char signedness issue - dsa: mv88e6xxx: fix marvell 6350 probe crash - octeontx2-pf: restore TC ingress police rules when interface is up - wangxun: fix memory leak on msix entry - ravb: keep reverse order of operations in ravb_remove()" * tag 'net-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits) net: ravb: Keep reverse order of operations in ravb_remove() net: ravb: Stop DMA in case of failures on ravb_open() net: ravb: Start TX queues after HW initialization succeeded net: ravb: Make write access to CXR35 first before accessing other EMAC registers net: ravb: Use pm_runtime_resume_and_get() net: ravb: Check return value of reset_control_deassert() net: libwx: fix memory leak on msix entry ice: Fix VF Reset paths when interface in a failed over aggregate bpf, sockmap: Add af_unix test with both sockets in map bpf, sockmap: af_unix stream sockets need to hold ref for pair sock tools: ynl-gen: always construct struct ynl_req_state ethtool: don't propagate EOPNOTSUPP from dumps ravb: Fix races between ravb_tx_timeout_work() and net related ops r8169: prevent potential deadlock in rtl8169_close r8169: fix deadlock on RTL8125 in jumbo mtu mode neighbour: Fix __randomize_layout crash in struct neighbour octeontx2-pf: Restore TC ingress police rules when interface is up octeontx2-pf: Fix adding mbox work queue entry when num_vfs > 64 net: stmmac: xgmac: Disable FPE MMC interrupts octeontx2-af: Fix possible buffer overflow ...
2023-11-29selftests: mptcp: add mptcp_lib_wait_local_port_listenGeliang Tang
To avoid duplicated code in different MPTCP selftests, we can add and use helpers defined in mptcp_lib.sh. wait_local_port_listen() helper is defined in diag.sh, mptcp_connect.sh, mptcp_join.sh and simult_flows.sh, export it into mptcp_lib.sh and rename it with mptcp_lib_ prefix. Use this new helper in all these scripts. Note: We only have IPv4 connections in this helper, not looking at IPv6 (tcp6) but that's OK because we only have IPv4 connections here in diag.sh. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-15-8d6b94150f6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29selftests: mptcp: add mptcp_lib_check_transferGeliang Tang
To avoid duplicated code in different MPTCP selftests, we can add and use helpers defined in mptcp_lib.sh. check_transfer() and print_file_err() helpers are defined both in mptcp_connect.sh and mptcp_sockopt.sh, export them into mptcp_lib.sh and rename them with mptcp_lib_ prefix. And use them in all scripts. Note: In mptcp_sockopt.sh it is OK to drop 'ret=1' in check_transfer() because it will be set in run_tests() anyway. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-14-8d6b94150f6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29selftests: mptcp: add mptcp_lib_make_fileGeliang Tang
To avoid duplicated code in different MPTCP selftests, we can add and use helpers defined in mptcp_lib.sh. make_file() helper in mptcp_sockopt.sh and userspace_pm.sh are the same. Export it into mptcp_lib.sh and rename it as mptcp_lib_kill_wait(). Use it in both mptcp_connect.sh and mptcp_join.sh. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-13-8d6b94150f6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29selftests: mptcp: add missing oflag=appendGeliang Tang
In mptcp_connect.sh we are missing something like "oflag=append" because this will write "${rem}" bytes at the beginning of the file where there is already some random bytes. It should write that at the end. This patch adds this missing 'oflag=append' flag for 'dd' command in make_file(). Suggested-by: Matthieu Baerts <matttbe@kernel.org> Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-12-8d6b94150f6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29selftests: mptcp: add mptcp_lib_get_counterGeliang Tang
To avoid duplicated code in different MPTCP selftests, we can add and use helpers defined in mptcp_lib.sh. The helper get_counter() in mptcp_join.sh and get_mib_counter() in mptcp_connect.sh have the same functionality, export get_counter() into mptcp_lib.sh and rename it as mptcp_lib_get_counter(). Use this new helper instead of get_counter() and get_mib_counter(). Use this helper in test_prio() in userspace_pm.sh too instead of open-coding. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-11-8d6b94150f6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29selftests: mptcp: add mptcp_lib_is_v6Geliang Tang
To avoid duplicated code in different MPTCP selftests, we can add and use helpers defined in mptcp_lib.sh. is_v6() helper is defined in mptcp_connect.sh, mptcp_join.sh and mptcp_sockopt.sh, so export it into mptcp_lib.sh and rename it as mptcp_lib_is_v6(). Use this new helper in all scripts. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-10-8d6b94150f6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29selftests: mptcp: add mptcp_lib_kill_waitGeliang Tang
To avoid duplicated code in different MPTCP selftests, we can add and use helpers defined in mptcp_lib.sh. Export kill_wait() helper in userspace_pm.sh into mptcp_lib.sh and rename it as mptcp_lib_kill_wait(). It can be used to instead of kill_wait() in mptcp_join.sh. Use the new helper in both scripts. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-9-8d6b94150f6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29selftests: mptcp: userspace pm send RM_ADDR for ID 0Geliang Tang
This patch adds a selftest for userspace PM to remove id 0 address. Use userspace_pm_add_addr() helper to add an id 10 address, then use userspace_pm_rm_addr() helper to remove id 0 address. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-8-8d6b94150f6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29selftests: mptcp: userspace pm remove initial subflowGeliang Tang
This patch adds a selftest for userspace PM to remove the initial subflow. Use userspace_pm_add_sf() to add a subflow, and pass initial IP address to userspace_pm_rm_sf() to remove the initial subflow. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-7-8d6b94150f6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29selftests: mptcp: userspace pm create id 0 subflowGeliang Tang
This patch adds a selftest to create id 0 subflow. Pass id 0 to the helper userspace_pm_add_sf() to create id 0 subflow. chk_mptcp_info shows one subflow but chk_subflows_total shows two subflows in each namespace. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-5-8d6b94150f6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29selftests: mptcp: update userspace pm test helpersGeliang Tang
This patch adds a new argument namespace to userspace_pm_add_addr() and userspace_pm_add_sf() to make these two helper more versatile. Add two more versatile helpers for userspace pm remove subflow or address: userspace_pm_rm_addr() and userspace_pm_rm_sf(). The original test helpers userspace_pm_rm_sf_addr_ns1() and userspace_pm_rm_sf_addr_ns2() can be replaced by these new helpers. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-4-8d6b94150f6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29selftests: mptcp: add chk_subflows_total helperGeliang Tang
This patch adds a new helper chk_subflows_total(), in it use the newly added counter mptcpi_subflows_total to get the "correct" amount of subflows, including the initial one. To be compatible with old 'ss' or kernel versions not supporting this counter, get the total subflows by listing TCP connections that are MPTCP subflows: ss -ti state state established state syn-sent state syn-recv | grep -c tcp-ulp-mptcp. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-3-8d6b94150f6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29selftests: mptcp: add evts_get_info helperGeliang Tang
This patch adds a new helper get_info_value(), using 'sed' command to parse the value of the given item name in the line with the given keyword, to make chk_mptcp_info() and pedit_action_pkts() more readable. Also add another helper evts_get_info() to use get_info_value() to parse the output of 'pm_nl_ctl events' command, to make all the userspace pm selftests more readable, both in mptcp_join.sh and userspace_pm.sh. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-2-8d6b94150f6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2023-11-30 We've added 5 non-merge commits during the last 7 day(s) which contain a total of 10 files changed, 66 insertions(+), 15 deletions(-). The main changes are: 1) Fix AF_UNIX splat from use after free in BPF sockmap, from John Fastabend. 2) Fix a syzkaller splat in netdevsim by properly handling offloaded programs (and not device-bound ones), from Stanislav Fomichev. 3) Fix bpf_mem_cache_alloc_flags() to initialize the allocation hint, from Hou Tao. 4) Fix netkit by rejecting IFLA_NETKIT_PEER_INFO in changelink, from Daniel Borkmann. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf, sockmap: Add af_unix test with both sockets in map bpf, sockmap: af_unix stream sockets need to hold ref for pair sock netkit: Reject IFLA_NETKIT_PEER_INFO in netkit_change_link bpf: Add missed allocation hint for bpf_mem_cache_alloc_flags() netdevsim: Don't accept device bound programs ==================== Link: https://lore.kernel.org/r/20231129234916.16128-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29tools: ynl: don't skip regeneration from make targetsJakub Kicinski
Commit 2b7ac0c87d98 ("tools: ynl-gen: don't touch the output file if content is the same") is working too well. It was added so that ynl-regen -f doesn't make us rebuild half of the kernel, if there are no actual changes in any generated code. When ynl-gen-c is called by make, however, we're better off trusting make's tracking and overwrite the file. Otherwise if output is identical we won't update file timestamps and make will retry code gen on every invocation. Link: https://lore.kernel.org/r/20231129193622.2912353-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29tools: ynl: order building samples after generated codeJakub Kicinski
Parallel builds of ynl: make -C tools/net/ynl/ -j 4 don't work correctly right now. samples get handled before generated, so build of samples does not notice that protos.a has changed. Order samples to be last. Link: https://lore.kernel.org/r/20231129193622.2912353-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29tools: ynl: make sure we use local headers for page-poolJakub Kicinski
Building samples generates the following warning: In file included from page-pool.c:11: generated/netdev-user.h:21:45: warning: ‘enum netdev_xdp_rx_metadata’ declared inside parameter list will not be visible outside of this definition or declaration 21 | const char *netdev_xdp_rx_metadata_str(enum netdev_xdp_rx_metadata value); | ^~~~~~~~~~~~~~~~~~~~~~ Our magic way of including uAPI headers assumes the sample name matches the family name. We need to copy the flags over. Fixes: 637567e4a3ef ("tools: ynl: add sample for getting page-pool information") Link: https://lore.kernel.org/r/20231129193622.2912353-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29tools: ynl: fix build of the page-pool sampleJakub Kicinski
The name of the "destroyed" field in the reply was not changed in the sample after we started calling it "detach_time". page-pool.c: In function ‘main’: page-pool.c:84:33: error: ‘struct <anonymous>’ has no member named ‘destroyed’ 84 | if (pp->_present.destroyed) | ^ Fixes: 637567e4a3ef ("tools: ynl: add sample for getting page-pool information") Link: https://lore.kernel.org/r/20231129193622.2912353-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>