summaryrefslogtreecommitdiff
path: root/tools/testing/selftests/bpf
AgeCommit message (Collapse)Author
2023-07-18selftests/bpf: Add more tests for check_max_stack_depth bugKumar Kartikeya Dwivedi
Another test which now exercies the path of the verifier where it will explore call chains rooted at the async callback. Without the prior fixes, this program loads successfully, which is incorrect. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230717161530.1238-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-05selftests/bpf: Add selftest for check_stack_max_depth bugKumar Kartikeya Dwivedi
Use the bpf_timer_set_callback helper to mark timer_cb as an async callback, and put a direct call to timer_cb in the main subprog. As the check_stack_max_depth happens after the do_check pass, the order does not matter. Without the previous fix, the test passes successfully. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230705144730.235802-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-28Merge tag 'net-next-6.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking changes from Jakub Kicinski: "WiFi 7 and sendpage changes are the biggest pieces of work for this release. The latter will definitely require fixes but I think that we got it to a reasonable point. Core: - Rework the sendpage & splice implementations Instead of feeding data into sockets page by page extend sendmsg handlers to support taking a reference on the data, controlled by a new flag called MSG_SPLICE_PAGES Rework the handling of unexpected-end-of-file to invoke an additional callback instead of trying to predict what the right combination of MORE/NOTLAST flags is Remove the MSG_SENDPAGE_NOTLAST flag completely - Implement SCM_PIDFD, a new type of CMSG type analogous to SCM_CREDENTIALS, but it contains pidfd instead of plain pid - Enable socket busy polling with CONFIG_RT - Improve reliability and efficiency of reporting for ref_tracker - Auto-generate a user space C library for various Netlink families Protocols: - Allow TCP to shrink the advertised window when necessary, prevent sk_rcvbuf auto-tuning from growing the window all the way up to tcp_rmem[2] - Use per-VMA locking for "page-flipping" TCP receive zerocopy - Prepare TCP for device-to-device data transfers, by making sure that payloads are always attached to skbs as page frags - Make the backoff time for the first N TCP SYN retransmissions linear. Exponential backoff is unnecessarily conservative - Create a new MPTCP getsockopt to retrieve all info (MPTCP_FULL_INFO) - Avoid waking up applications using TLS sockets until we have a full record - Allow using kernel memory for protocol ioctl callbacks, paving the way to issuing ioctls over io_uring - Add nolocalbypass option to VxLAN, forcing packets to be fully encapsulated even if they are destined for a local IP address - Make TCPv4 use consistent hash in TIME_WAIT and SYN_RECV. Ensure in-kernel ECMP implementation (e.g. Open vSwitch) select the same link for all packets. Support L4 symmetric hashing in Open vSwitch - PPPoE: make number of hash bits configurable - Allow DNS to be overwritten by DHCPACK in the in-kernel DHCP client (ipconfig) - Add layer 2 miss indication and filtering, allowing higher layers (e.g. ACL filters) to make forwarding decisions based on whether packet matched forwarding state in lower devices (bridge) - Support matching on Connectivity Fault Management (CFM) packets - Hide the "link becomes ready" IPv6 messages by demoting their printk level to debug - HSR: don't enable promiscuous mode if device offloads the proto - Support active scanning in IEEE 802.15.4 - Continue work on Multi-Link Operation for WiFi 7 BPF: - Add precision propagation for subprogs and callbacks. This allows maintaining verification efficiency when subprograms are used, or in fact passing the verifier at all for complex programs, especially those using open-coded iterators - Improve BPF's {g,s}setsockopt() length handling. Previously BPF assumed the length is always equal to the amount of written data. But some protos allow passing a NULL buffer to discover what the output buffer *should* be, without writing anything - Accept dynptr memory as memory arguments passed to helpers - Add routing table ID to bpf_fib_lookup BPF helper - Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands - Drop bpf_capable() check in BPF_MAP_FREEZE command (used to mark maps as read-only) - Show target_{obj,btf}_id in tracing link fdinfo - Addition of several new kfuncs (most of the names are self-explanatory): - Add a set of new dynptr kfuncs: bpf_dynptr_adjust(), bpf_dynptr_is_null(), bpf_dynptr_is_rdonly(), bpf_dynptr_size() and bpf_dynptr_clone(). - bpf_task_under_cgroup() - bpf_sock_destroy() - force closing sockets - bpf_cpumask_first_and(), rework bpf_cpumask_any*() kfuncs Netfilter: - Relax set/map validation checks in nf_tables. Allow checking presence of an entry in a map without using the value - Increase ip_vs_conn_tab_bits range for 64BIT builds - Allow updating size of a set - Improve NAT tuple selection when connection is closing Driver API: - Integrate netdev with LED subsystem, to allow configuring HW "offloaded" blinking of LEDs based on link state and activity (i.e. packets coming in and out) - Support configuring rate selection pins of SFP modules - Factor Clause 73 auto-negotiation code out of the drivers, provide common helper routines - Add more fool-proof helpers for managing lifetime of MDIO devices associated with the PCS layer - Allow drivers to report advanced statistics related to Time Aware scheduler offload (taprio) - Allow opting out of VF statistics in link dump, to allow more VFs to fit into the message - Split devlink instance and devlink port operations New hardware / drivers: - Ethernet: - Synopsys EMAC4 IP support (stmmac) - Marvell 88E6361 8 port (5x1GE + 3x2.5GE) switches - Marvell 88E6250 7 port switches - Microchip LAN8650/1 Rev.B0 PHYs - MediaTek MT7981/MT7988 built-in 1GE PHY driver - WiFi: - Realtek RTL8192FU, 2.4 GHz, b/g/n mode, 2T2R, 300 Mbps - Realtek RTL8723DS (SDIO variant) - Realtek RTL8851BE - CAN: - Fintek F81604 Drivers: - Ethernet NICs: - Intel (100G, ice): - support dynamic interrupt allocation - use meta data match instead of VF MAC addr on slow-path - nVidia/Mellanox: - extend link aggregation to handle 4, rather than just 2 ports - spawn sub-functions without any features by default - OcteonTX2: - support HTB (Tx scheduling/QoS) offload - make RSS hash generation configurable - support selecting Rx queue using TC filters - Wangxun (ngbe/txgbe): - add basic Tx/Rx packet offloads - add phylink support (SFP/PCS control) - Freescale/NXP (enetc): - report TAPRIO packet statistics - Solarflare/AMD: - support matching on IP ToS and UDP source port of outer header - VxLAN and GENEVE tunnel encapsulation over IPv4 or IPv6 - add devlink dev info support for EF10 - Virtual NICs: - Microsoft vNIC: - size the Rx indirection table based on requested configuration - support VLAN tagging - Amazon vNIC: - try to reuse Rx buffers if not fully consumed, useful for ARM servers running with 16kB pages - Google vNIC: - support TCP segmentation of >64kB frames - Ethernet embedded switches: - Marvell (mv88e6xxx): - enable USXGMII (88E6191X) - Microchip: - lan966x: add support for Egress Stage 0 ACL engine - lan966x: support mapping packet priority to internal switch priority (based on PCP or DSCP) - Ethernet PHYs: - Broadcom PHYs: - support for Wake-on-LAN for BCM54210E/B50212E - report LPI counter - Microsemi PHYs: support RGMII delay configuration (VSC85xx) - Micrel PHYs: receive timestamp in the frame (LAN8841) - Realtek PHYs: support optional external PHY clock - Altera TSE PCS: merge the driver into Lynx PCS which it is a variant of - CAN: Kvaser PCIEcan: - support packet timestamping - WiFi: - Intel (iwlwifi): - major update for new firmware and Multi-Link Operation (MLO) - configuration rework to drop test devices and split the different families - support for segmented PNVM images and power tables - new vendor entries for PPAG (platform antenna gain) feature - Qualcomm 802.11ax (ath11k): - Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID Advertisement (EMA) support in AP mode - support factory test mode - RealTek (rtw89): - add RSSI based antenna diversity - support U-NII-4 channels on 5 GHz band - RealTek (rtl8xxxu): - AP mode support for 8188f - support USB RX aggregation for the newer chips" * tag 'net-next-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1602 commits) net: scm: introduce and use scm_recv_unix helper af_unix: Skip SCM_PIDFD if scm->pid is NULL. net: lan743x: Simplify comparison netlink: Add __sock_i_ino() for __netlink_diag_dump(). net: dsa: avoid suspicious RCU usage for synced VLAN-aware MAC addresses Revert "af_unix: Call scm_recv() only after scm_set_cred()." phylink: ReST-ify the phylink_pcs_neg_mode() kdoc libceph: Partially revert changes to support MSG_SPLICE_PAGES net: phy: mscc: fix packet loss due to RGMII delays net: mana: use vmalloc_array and vcalloc net: enetc: use vmalloc_array and vcalloc ionic: use vmalloc_array and vcalloc pds_core: use vmalloc_array and vcalloc gve: use vmalloc_array and vcalloc octeon_ep: use vmalloc_array and vcalloc net: usb: qmi_wwan: add u-blox 0x1312 composition perf trace: fix MSG_SPLICE_PAGES build error ipvlan: Fix return value of ipvlan_queue_xmit() netfilter: nf_tables: fix underflow in chain reference counter netfilter: nf_tables: unbind non-anonymous set if rule construction fails ...
2023-06-28Merge tag 'v6.5-rc1-modules-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull module updates from Luis Chamberlain: "The changes queued up for modules are pretty tame, mostly code removal of moving of code. Only two minor functional changes are made, the only one which stands out is Sebastian Andrzej Siewior's simplification of module reference counting by removing preempt_disable() and that has been tested on linux-next for well over a month without no regressions. I'm now, I guess, also a kitchen sink for some kallsyms changes" [ There was a mis-communication about the concurrent module load changes that I had expected to come through Luis despite me authoring the patch. So some of the module updates were left hanging in the email ether, and I just committed them separately. It's my bad - I should have made it more clear that I expected my own patches to come through the module tree too. Now they missed linux-next, but hopefully that won't cause any issues - Linus ] * tag 'v6.5-rc1-modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: kallsyms: make kallsyms_show_value() as generic function kallsyms: move kallsyms_show_value() out of kallsyms.c kallsyms: remove unsed API lookup_symbol_attrs kallsyms: remove unused arch_get_kallsym() helper module: Remove preempt_disable() from module reference counting.
2023-06-24Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-06-23 We've added 49 non-merge commits during the last 24 day(s) which contain a total of 70 files changed, 1935 insertions(+), 442 deletions(-). The main changes are: 1) Extend bpf_fib_lookup helper to allow passing the route table ID, from Louis DeLosSantos. 2) Fix regsafe() in verifier to call check_ids() for scalar registers, from Eduard Zingerman. 3) Extend the set of cpumask kfuncs with bpf_cpumask_first_and() and a rework of bpf_cpumask_any*() kfuncs. Additionally, add selftests, from David Vernet. 4) Fix socket lookup BPF helpers for tc/XDP to respect VRF bindings, from Gilad Sever. 5) Change bpf_link_put() to use workqueue unconditionally to fix it under PREEMPT_RT, from Sebastian Andrzej Siewior. 6) Follow-ups to address issues in the bpf_refcount shared ownership implementation, from Dave Marchevsky. 7) A few general refactorings to BPF map and program creation permissions checks which were part of the BPF token series, from Andrii Nakryiko. 8) Various fixes for benchmark framework and add a new benchmark for BPF memory allocator to BPF selftests, from Hou Tao. 9) Documentation improvements around iterators and trusted pointers, from Anton Protopopov. 10) Small cleanup in verifier to improve allocated object check, from Daniel T. Lee. 11) Improve performance of bpf_xdp_pointer() by avoiding access to shared_info when XDP packet does not have frags, from Jesper Dangaard Brouer. 12) Silence a harmless syzbot-reported warning in btf_type_id_size(), from Yonghong Song. 13) Remove duplicate bpfilter_umh_cleanup in favor of umd_cleanup_helper, from Jarkko Sakkinen. 14) Fix BPF selftests build for resolve_btfids under custom HOSTCFLAGS, from Viktor Malik. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (49 commits) bpf, docs: Document existing macros instead of deprecated bpf, docs: BPF Iterator Document selftests/bpf: Fix compilation failure for prog vrf_socket_lookup selftests/bpf: Add vrf_socket_lookup tests bpf: Fix bpf socket lookup from tc/xdp to respect socket VRF bindings bpf: Call __bpf_sk_lookup()/__bpf_skc_lookup() directly via TC hookpoint bpf: Factor out socket lookup functions for the TC hookpoint. selftests/bpf: Set the default value of consumer_cnt as 0 selftests/bpf: Ensure that next_cpu() returns a valid CPU number selftests/bpf: Output the correct error code for pthread APIs selftests/bpf: Use producer_cnt to allocate local counter array xsk: Remove unused inline function xsk_buff_discard() bpf: Keep BPF_PROG_LOAD permission checks clear of validations bpf: Centralize permissions checks for all BPF map types bpf: Inline map creation logic in map_create() function bpf: Move unprivileged checks into map_create() and bpf_prog_load() bpf: Remove in_atomic() from bpf_link_put(). selftests/bpf: Verify that check_ids() is used for scalars in regsafe() bpf: Verify scalar ids mapping in regsafe() using check_ids() selftests/bpf: Check if mark_chain_precision() follows scalar ids ... ==================== Link: https://lore.kernel.org/r/20230623211256.8409-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: tools/testing/selftests/net/fcnal-test.sh d7a2fc1437f7 ("selftests: net: fcnal-test: check if FIPS mode is enabled") dd017c72dde6 ("selftests: fcnal: Test SO_DONTROUTE on TCP sockets.") https://lore.kernel.org/all/5007b52c-dd16-dbf6-8d64-b9701bfa498b@tessares.net/ https://lore.kernel.org/all/20230619105427.4a0df9b3@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-22selftests/bpf: Fix compilation failure for prog vrf_socket_lookupYonghong Song
When building the latest kernel/selftest with clang17 compiler: make LLVM=1 -j <== for kernel make -C tools/testing/selftests/bpf LLVM=1 -j <== for selftest I hit the following compilation error: [...] In file included from progs/vrf_socket_lookup.c:3: In file included from /usr/include/linux/ip.h:21: In file included from /usr/include/asm/byteorder.h:5: In file included from /usr/include/linux/byteorder/little_endian.h:13: /usr/include/linux/swab.h:136:8: error: unknown type name '__always_inline' 136 | static __always_inline unsigned long __swab(const unsigned long y) | ^ /usr/include/linux/swab.h:171:8: error: unknown type name '__always_inline' 171 | static __always_inline __u16 __swab16p(const __u16 *p) | ^ /usr/include/linux/swab.h:171:29: error: expected ';' after top level declarator 171 | static __always_inline __u16 __swab16p(const __u16 *p) | ^ [...] Basically, with header files in my local host which is based on 5.12 kernel, __always_inline is not defined and this caused compilation failure. Since __always_inline is defined in bpf_helpers.h, let us move bpf_helpers.h to an early position which fixed the problem. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230622061921.816772-1-yhs@fb.com
2023-06-21selftests/bpf: Add vrf_socket_lookup testsGilad Sever
Verify that socket lookup via TC/XDP with all BPF APIs is VRF aware. Signed-off-by: Gilad Sever <gilad9366@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eyal Birger <eyal.birger@gmail.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20230621104211.301902-5-gilad9366@gmail.com
2023-06-19selftests/bpf: Set the default value of consumer_cnt as 0Hou Tao
Considering that only bench_ringbufs.c supports consumer, just set the default value of consumer_cnt as 0. After that, update the validity check of consumer_cnt, remove unused consumer_thread code snippets and set consumer_cnt as 1 in run_bench_ringbufs.sh accordingly. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230613080921.1623219-5-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-19selftests/bpf: Ensure that next_cpu() returns a valid CPU numberHou Tao
When using option -a without --prod-affinity or --cons-affinity, if the number of producers and consumers is greater than the number of online CPUs, the benchmark will fail to run as shown below: $ getconf _NPROCESSORS_ONLN 8 $ ./bench bpf-loop -a -p9 Setting up benchmark 'bpf-loop'... setting affinity to CPU #8 failed: -22 Fix it by returning the remainder of next_cpu divided by the number of online CPUs in next_cpu(). Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230613080921.1623219-4-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-19selftests/bpf: Output the correct error code for pthread APIsHou Tao
The return value of pthread API is the error code when the called API fails, so output the return value instead of errno. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230613080921.1623219-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-19selftests/bpf: Use producer_cnt to allocate local counter arrayHou Tao
For count-local benchmark, use producer_cnt instead of consumer_cnt when allocating local counter array. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230613080921.1623219-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-19bpf: Centralize permissions checks for all BPF map typesAndrii Nakryiko
This allows to do more centralized decisions later on, and generally makes it very explicit which maps are privileged and which are not (e.g., LRU_HASH and LRU_PERCPU_HASH, which are privileged HASH variants, as opposed to unprivileged HASH and HASH_PERCPU; now this is explicit and easy to verify). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20230613223533.3689589-4-andrii@kernel.org
2023-06-13selftests/bpf: Verify that check_ids() is used for scalars in regsafe()Eduard Zingerman
Verify that the following example is rejected by verifier: r9 = ... some pointer with range X ... r6 = ... unbound scalar ID=a ... r7 = ... unbound scalar ID=b ... if (r6 > r7) goto +1 r7 = r6 if (r7 > X) goto exit r9 += r6 *(u64 *)r9 = Y Also add test cases to: - check that check_alu_op() for BPF_MOV instruction does not allocate scalar ID if source register is a constant; - check that unique scalar IDs are ignored when new verifier state is compared to cached verifier state; - check that two different scalar IDs in a verified state can't be mapped to the same scalar ID in current state. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230613153824.3324830-5-eddyz87@gmail.com
2023-06-13selftests/bpf: Check if mark_chain_precision() follows scalar idsEduard Zingerman
Check __mark_chain_precision() log to verify that scalars with same IDs are marked as precise. Use several scenarios to test that precision marks are propagated through: - registers of scalar type with the same ID within one state; - registers of scalar type with the same ID cross several states; - registers of scalar type with the same ID cross several stack frames; - stack slot of scalar type with the same ID; - multiple scalar IDs are tracked independently. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230613153824.3324830-3-eddyz87@gmail.com
2023-06-13bpf: Use scalar ids in mark_chain_precision()Eduard Zingerman
Change mark_chain_precision() to track precision in situations like below: r2 = unknown value ... --- state #0 --- ... r1 = r2 // r1 and r2 now share the same ID ... --- state #1 {r1.id = A, r2.id = A} --- ... if (r2 > 10) goto exit; // find_equal_scalars() assigns range to r1 ... --- state #2 {r1.id = A, r2.id = A} --- r3 = r10 r3 += r1 // need to mark both r1 and r2 At the beginning of the processing of each state, ensure that if a register with a scalar ID is marked as precise, all registers sharing this ID are also marked as precise. This property would be used by a follow-up change in regsafe(). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230613153824.3324830-2-eddyz87@gmail.com
2023-06-13selftests/bpf: add a test for subprogram extablesKrister Johansen
In certain situations a program with subprograms may have a NULL extable entry. This should not happen, and when it does, it turns a single trap into multiple. Add a test case for further debugging and to prevent regressions. The test-case contains three essentially identical versions of the same test because just one program may not be sufficient to trigger the oops. This is due to the fact that the items are stored in a binary tree and have identical values so it's possible to sometimes find the ksym with the extable. With 3 copies, this has been reliable on this author's test systems. When triggered out of this test case, the oops looks like this: BUG: kernel NULL pointer dereference, address: 000000000000000c #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 1132 Comm: test_progs Tainted: G OE 6.4.0-rc3+ #2 RIP: 0010:cmp_ex_search+0xb/0x30 Code: cc cc cc cc e8 36 cb 03 00 66 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 55 48 89 e5 48 8b 07 <48> 63 0e 48 01 f1 31 d2 48 39 c8 19 d2 48 39 c8 b8 01 00 00 00 0f RSP: 0018:ffffb30c4291f998 EFLAGS: 00010006 RAX: ffffffffc00b49da RBX: 0000000000000002 RCX: 000000000000000c RDX: 0000000000000002 RSI: 000000000000000c RDI: ffffb30c4291f9e8 RBP: ffffb30c4291f998 R08: ffffffffab1a42d0 R09: 0000000000000001 R10: 0000000000000000 R11: ffffffffab1a42d0 R12: ffffb30c4291f9e8 R13: 000000000000000c R14: 000000000000000c R15: 0000000000000000 FS: 00007fb5d9e044c0(0000) GS:ffff92e95ee00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000000c CR3: 000000010c3a2005 CR4: 00000000007706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> bsearch+0x41/0x90 ? __pfx_cmp_ex_search+0x10/0x10 ? bpf_prog_45a7907e7114d0ff_handle_fexit_ret_subprogs3+0x2a/0x6c search_extable+0x3b/0x60 ? bpf_prog_45a7907e7114d0ff_handle_fexit_ret_subprogs3+0x2a/0x6c search_bpf_extables+0x10d/0x190 ? bpf_prog_45a7907e7114d0ff_handle_fexit_ret_subprogs3+0x2a/0x6c search_exception_tables+0x5d/0x70 fixup_exception+0x3f/0x5b0 ? look_up_lock_class+0x61/0x110 ? __lock_acquire+0x6b8/0x3560 ? __lock_acquire+0x6b8/0x3560 ? __lock_acquire+0x6b8/0x3560 kernelmode_fixup_or_oops+0x46/0x110 __bad_area_nosemaphore+0x68/0x2b0 ? __lock_acquire+0x6b8/0x3560 bad_area_nosemaphore+0x16/0x20 do_kern_addr_fault+0x81/0xa0 exc_page_fault+0xd6/0x210 asm_exc_page_fault+0x2b/0x30 RIP: 0010:bpf_prog_45a7907e7114d0ff_handle_fexit_ret_subprogs3+0x2a/0x6c Code: f3 0f 1e fa 0f 1f 44 00 00 66 90 55 48 89 e5 f3 0f 1e fa 48 8b 7f 08 49 bb 00 00 00 00 00 80 00 00 4c 39 df 73 04 31 f6 eb 04 <48> 8b 77 00 49 bb 00 00 00 00 00 80 00 00 48 81 c7 7c 00 00 00 4c RSP: 0018:ffffb30c4291fcb8 EFLAGS: 00010282 RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000000 RDX: 00000000cddf1af1 RSI: 000000005315a00d RDI: ffffffffffffffea RBP: ffffb30c4291fcb8 R08: ffff92e644bf38a8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000800000000000 R12: ffff92e663652690 R13: 00000000000001c8 R14: 00000000000001c8 R15: 0000000000000003 bpf_trampoline_251255721842_2+0x63/0x1000 bpf_testmod_return_ptr+0x9/0xb0 [bpf_testmod] ? bpf_testmod_test_read+0x43/0x2d0 [bpf_testmod] sysfs_kf_bin_read+0x60/0x90 kernfs_fop_read_iter+0x143/0x250 vfs_read+0x240/0x2a0 ksys_read+0x70/0xe0 __x64_sys_read+0x1f/0x30 do_syscall_64+0x68/0xa0 ? syscall_exit_to_user_mode+0x77/0x1f0 ? do_syscall_64+0x77/0xa0 ? irqentry_exit+0x35/0xa0 ? sysvec_apic_timer_interrupt+0x4d/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7fb5da00a392 Code: ac 00 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb be 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24 RSP: 002b:00007ffc5b3cab68 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 000055bee7b8b100 RCX: 00007fb5da00a392 RDX: 00000000000001c8 RSI: 0000000000000000 RDI: 0000000000000009 RBP: 00007ffc5b3caba0 R08: 0000000000000000 R09: 0000000000000037 R10: 000055bee7b8c2a7 R11: 0000000000000246 R12: 000055bee78f1f60 R13: 00007ffc5b3cae90 R14: 0000000000000000 R15: 0000000000000000 </TASK> Modules linked in: bpf_testmod(OE) nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common intel_uncore_frequency_common ppdev nfit crct10dif_pclmul crc32_pclmul psmouse ghash_clmulni_intel sha512_ssse3 aesni_intel parport_pc crypto_simd cryptd input_leds parport rapl ena i2c_piix4 mac_hid serio_raw ramoops reed_solomon pstore_blk drm pstore_zone efi_pstore autofs4 [last unloaded: bpf_testmod(OE)] CR2: 000000000000000c Though there may be some variation, depending on which suprogram triggers the bug. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/4ebf95ec857cd785b81db69f3e408c039ad8408b.1686616663.git.kjlx@templeofstupid.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-12selftests/bpf: Update bpf_cpumask_any* tests to use bpf_cpumask_any_distribute*David Vernet
In a prior patch, we removed the bpf_cpumask_any() and bpf_cpumask_any_and() kfuncs, and replaced them with bpf_cpumask_any_distribute() and bpf_cpumask_any_distribute_and(). The advertised semantics between the two kfuncs were identical, with the former always returning the first CPU, and the latter actually returning any CPU. This patch updates the selftests for these kfuncs to use the new names. Signed-off-by: David Vernet <void@manifault.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20230610035053.117605-4-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-12selftests/bpf: Add test for new bpf_cpumask_first_and() kfuncDavid Vernet
A prior patch added a new kfunc called bpf_cpumask_first_and() which wraps cpumask_first_and(). This patch adds a selftest to validate its behavior. Signed-off-by: David Vernet <void@manifault.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20230610035053.117605-2-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-12selftests/bpf: Fix invalid pointer check in get_xlated_program()Eduard Zingerman
Dan Carpenter reported invalid check for calloc() result in test_verifier.c:get_xlated_program(): ./tools/testing/selftests/bpf/test_verifier.c:1365 get_xlated_program() warn: variable dereferenced before check 'buf' (see line 1364) ./tools/testing/selftests/bpf/test_verifier.c 1363 *cnt = xlated_prog_len / buf_element_size; 1364 *buf = calloc(*cnt, buf_element_size); 1365 if (!buf) { This should be if (!*buf) { 1366 perror("can't allocate xlated program buffer"); 1367 return -ENOMEM; This commit refactors the get_xlated_program() to avoid using double pointer type. Fixes: 933ff53191eb ("selftests/bpf: specify expected instructions in test_verifier tests") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Closes: https://lore.kernel.org/bpf/ZH7u0hEGVB4MjGZq@moroto/ Link: https://lore.kernel.org/bpf/20230609221637.2631800-1-eddyz87@gmail.com
2023-06-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: net/sched/sch_taprio.c d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping") dced11ef84fb ("net/sched: taprio: don't overwrite "sch" variable in taprio_dump_class_stats()") net/ipv4/sysctl_net_ipv4.c e209fee4118f ("net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294") ccce324dabfe ("tcp: make the first N SYN RTO backoffs linear") https://lore.kernel.org/all/20230605100816.08d41a7b@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08selftests/bpf: Add missing prototypes for several test kfuncsJiri Olsa
Adding missing prototypes for several kfuncs that are used by test_verifier tests. We don't really need kfunc prototypes for these tests, but adding them to silence 'make W=1' build and to have all test kfuncs declarations in bpf_testmod_kfunc.h. Also moving __diag_pop for -Wmissing-prototypes to cover also bpf_testmod_test_write and bpf_testmod_test_read and adding bpf_fentry_shadow_test in there as well. All of them need to be exported, but there's no need for declarations. Fixes: 65eb006d85a2 ("bpf: Move kernel test kfuncs to bpf_testmod") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Closes: https://lore.kernel.org/oe-kbuild-all/202306051319.EihCQZPs-lkp@intel.com Link: https://lore.kernel.org/bpf/20230607224046.236510-1-jolsa@kernel.org
2023-06-08selftests/bpf: Add test cases to assert proper ID tracking on spillMaxim Mikityanskiy
The previous commit fixed a verifier bypass by ensuring that ID is not preserved on narrowing spills. Add the test cases to check the problematic patterns. Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20230607123951.558971-3-maxtram95@gmail.com
2023-06-06selftests/bpf: Fix sockopt_sk selftestYonghong Song
Commit f4e4534850a9 ("net/netlink: fix NETLINK_LIST_MEMBERSHIPS length report") fixed NETLINK_LIST_MEMBERSHIPS length report which caused selftest sockopt_sk failure. The failure log looks like test_sockopt_sk:PASS:join_cgroup /sockopt_sk 0 nsec run_test:PASS:skel_load 0 nsec run_test:PASS:setsockopt_link 0 nsec run_test:PASS:getsockopt_link 0 nsec getsetsockopt:FAIL:Unexpected NETLINK_LIST_MEMBERSHIPS value unexpected Unexpected NETLINK_LIST_MEMBERSHIPS value: actual 8 != expected 4 run_test:PASS:getsetsockopt 0 nsec #201 sockopt_sk:FAIL In net/netlink/af_netlink.c, function netlink_getsockopt(), for NETLINK_LIST_MEMBERSHIPS, nlk->ngroups equals to 36. Before Commit f4e4534850a9, the optlen is calculated as ALIGN(nlk->ngroups / 8, sizeof(u32)) = 4 After that commit, the optlen is ALIGN(BITS_TO_BYTES(nlk->ngroups), sizeof(u32)) = 8 Fix the test by setting the expected optlen to be 8. Fixes: f4e4534850a9 ("net/netlink: fix NETLINK_LIST_MEMBERSHIPS length report") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230606172202.1606249-1-yhs@fb.com
2023-06-06selftests/bpf: Fix check_mtu using wrong variable typeJesper Dangaard Brouer
Dan Carpenter found via Smatch static checker, that unsigned 'mtu_lo' is never less than zero. Variable mtu_lo should have been an 'int', because read_mtu_device_lo() uses minus as error indications. Fixes: b62eba563229 ("selftests/bpf: Tests using bpf_check_mtu BPF-helper") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/bpf/168605104733.3636467.17945947801753092590.stgit@firesoul
2023-06-05selftests/bpf: Add missing selftests kconfig optionsDavid Vernet
Our selftests of course rely on the kernel being built with CONFIG_DEBUG_INFO_BTF=y, though this (nor its dependencies of CONFIG_DEBUG_INFO=y and CONFIG_DEBUG_INFO_DWARF4=y) are not specified. This causes the wrong kernel to be built, and selftests to similarly fail to build. Additionally, in the BPF selftests kconfig file, CONFIG_NF_CONNTRACK_MARK=y is specified, so that the 'u_int32_t mark' field will be present in the definition of struct nf_conn. While a dependency of CONFIG_NF_CONNTRACK_MARK=y, CONFIG_NETFILTER_ADVANCED=y, should be enabled by default, I've run into instances of CONFIG_NF_CONNTRACK_MARK not being set because CONFIG_NETFILTER_ADVANCED isn't set, and have to manually enable them with make menuconfig. Let's add these missing kconfig options to the file so that the necessary dependencies are in place to build vmlinux. Otherwise, we'll get errors like this when we try to compile selftests and generate vmlinux.h: $ cd /path/to/bpf-next $ make mrproper; make defconfig $ cat tools/testing/selftests/config >> .config $ make -j ... $ cd tools/testing/selftests/bpf $ make clean $ make -j ... LD [M] tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.ko tools/testing/selftests/bpf/tools/build/bpftool/bootstrap/bpftool btf dump file vmlinux format c > tools/testing/selftests/bpf/tools/build/bpftool/vmlinux.h libbpf: failed to find '.BTF' ELF section in vmlinux Error: failed to load BTF from bpf-next/vmlinux: No data available make[1]: *** [Makefile:208: tools/testing/selftests/bpf/tools/build/bpftool/vmlinux.h] Error 195 make[1]: *** Deleting file 'tools/testing/selftests/bpf/tools/build/bpftool/vmlinux.h' make: *** [Makefile:261: tools/testing/selftests/bpf/tools/sbin/bpftool] Error 2 Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20230602140108.1177900-1-void@manifault.com
2023-06-05selftests/bpf: Add test for non-NULLable PTR_TO_BTF_IDsDavid Vernet
In a recent patch, we taught the verifier that trusted PTR_TO_BTF_ID can never be NULL. This prevents the verifier from incorrectly failing to load certain programs where it gets confused and thinks a reference isn't dropped because it incorrectly assumes that a branch exists in which a NULL PTR_TO_BTF_ID pointer is never released. This patch adds a testcase that verifies this cannot happen. Signed-off-by: David Vernet <void@manifault.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230602150112.1494194-2-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-05bpf: Make bpf_refcount_acquire fallible for non-owning refsDave Marchevsky
This patch fixes an incorrect assumption made in the original bpf_refcount series [0], specifically that the BPF program calling bpf_refcount_acquire on some node can always guarantee that the node is alive. In that series, the patch adding failure behavior to rbtree_add and list_push_{front, back} breaks this assumption for non-owning references. Consider the following program: n = bpf_kptr_xchg(&mapval, NULL); /* skip error checking */ bpf_spin_lock(&l); if(bpf_rbtree_add(&t, &n->rb, less)) { bpf_refcount_acquire(n); /* Failed to add, do something else with the node */ } bpf_spin_unlock(&l); It's incorrect to assume that bpf_refcount_acquire will always succeed in this scenario. bpf_refcount_acquire is being called in a critical section here, but the lock being held is associated with rbtree t, which isn't necessarily the lock associated with the tree that the node is already in. So after bpf_rbtree_add fails to add the node and calls bpf_obj_drop in it, the program has no ownership of the node's lifetime. Therefore the node's refcount can be decr'd to 0 at any time after the failing rbtree_add. If this happens before the refcount_acquire above, the node might be free'd, and regardless refcount_acquire will be incrementing a 0 refcount. Later patches in the series exercise this scenario, resulting in the expected complaint from the kernel (without this patch's changes): refcount_t: addition on 0; use-after-free. WARNING: CPU: 1 PID: 207 at lib/refcount.c:25 refcount_warn_saturate+0xbc/0x110 Modules linked in: bpf_testmod(O) CPU: 1 PID: 207 Comm: test_progs Tainted: G O 6.3.0-rc7-02231-g723de1a718a2-dirty #371 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 RIP: 0010:refcount_warn_saturate+0xbc/0x110 Code: 6f 64 f6 02 01 e8 84 a3 5c ff 0f 0b eb 9d 80 3d 5e 64 f6 02 00 75 94 48 c7 c7 e0 13 d2 82 c6 05 4e 64 f6 02 01 e8 64 a3 5c ff <0f> 0b e9 7a ff ff ff 80 3d 38 64 f6 02 00 0f 85 6d ff ff ff 48 c7 RSP: 0018:ffff88810b9179b0 EFLAGS: 00010082 RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 RDX: 0000000000000202 RSI: 0000000000000008 RDI: ffffffff857c3680 RBP: ffff88810027d3c0 R08: ffffffff8125f2a4 R09: ffff88810b9176e7 R10: ffffed1021722edc R11: 746e756f63666572 R12: ffff88810027d388 R13: ffff88810027d3c0 R14: ffffc900005fe030 R15: ffffc900005fe048 FS: 00007fee0584a700(0000) GS:ffff88811b280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005634a96f6c58 CR3: 0000000108ce9002 CR4: 0000000000770ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> bpf_refcount_acquire_impl+0xb5/0xc0 (rest of output snipped) The patch addresses this by changing bpf_refcount_acquire_impl to use refcount_inc_not_zero instead of refcount_inc and marking bpf_refcount_acquire KF_RET_NULL. For owning references, though, we know the above scenario is not possible and thus that bpf_refcount_acquire will always succeed. Some verifier bookkeeping is added to track "is input owning ref?" for bpf_refcount_acquire calls and return false from is_kfunc_ret_null for bpf_refcount_acquire on owning refs despite it being marked KF_RET_NULL. Existing selftests using bpf_refcount_acquire are modified where necessary to NULL-check its return value. [0]: https://lore.kernel.org/bpf/20230415201811.343116-1-davemarchevsky@fb.com/ Fixes: d2dcc67df910 ("bpf: Migrate bpf_rbtree_add and bpf_list_push_{front,back} to possibly fail") Reported-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230602022647.1571784-5-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-02selftests/bpf: Add access_inner_map selftestRhys Rustad-Elliott
Add a selftest that accesses a BPF_MAP_TYPE_ARRAY (at a nonzero index) nested within a BPF_MAP_TYPE_HASH_OF_MAPS to flex a previously buggy case. Signed-off-by: Rhys Rustad-Elliott <me@rhysre.net> Link: https://lore.kernel.org/r/20230602190110.47068-3-me@rhysre.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-06-01selftests/bpf: Test table ID fib lookup BPF helperLouis DeLosSantos
Add additional test cases to `fib_lookup.c` prog_test. These test cases add a new /24 network to the previously unused veth2 device, removes the directly connected route from the main routing table and moves it to table 100. The first test case then confirms a fib lookup for a remote address in this directly connected network, using the main routing table fails. The second test case ensures the same fib lookup using table 100 succeeds. An additional pair of tests which function in the same manner are added for IPv6. Signed-off-by: Louis DeLosSantos <louis.delos.devel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230505-bpf-add-tbid-fib-lookup-v2-2-0a31c22c748c@gmail.com
2023-05-30selftests/bpf: Add a test where map key_type_id with decl_tag typeYonghong Song
Add two selftests where map creation key/value type_id's are decl_tags. Without previous patch, kernel warnings will appear similar to the one in the previous patch. With the previous patch, both kernel warnings are silenced. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20230530205034.266643-1-yhs@fb.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-05-26Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-05-26 We've added 54 non-merge commits during the last 10 day(s) which contain a total of 76 files changed, 2729 insertions(+), 1003 deletions(-). The main changes are: 1) Add the capability to destroy sockets in BPF through a new kfunc, from Aditi Ghag. 2) Support O_PATH fds in BPF_OBJ_PIN and BPF_OBJ_GET commands, from Andrii Nakryiko. 3) Add capability for libbpf to resize datasec maps when backed via mmap, from JP Kobryn. 4) Move all the test kfuncs for CI out of the kernel and into bpf_testmod, from Jiri Olsa. 5) Big batch of xsk selftest improvements to prep for multi-buffer testing, from Magnus Karlsson. 6) Show the target_{obj,btf}_id in tracing link's fdinfo and dump it via bpftool, from Yafang Shao. 7) Various misc BPF selftest improvements to work with upcoming LLVM 17, from Yonghong Song. 8) Extend bpftool to specify netdevice for resolving XDP hints, from Larysa Zaremba. 9) Document masking in shift operations for the insn set document, from Dave Thaler. 10) Extend BPF selftests to check xdp_feature support for bond driver, from Lorenzo Bianconi. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits) bpf: Fix bad unlock balance on freeze_mutex libbpf: Ensure FD >= 3 during bpf_map__reuse_fd() libbpf: Ensure libbpf always opens files with O_CLOEXEC selftests/bpf: Check whether to run selftest libbpf: Change var type in datasec resize func bpf: drop unnecessary bpf_capable() check in BPF_MAP_FREEZE command libbpf: Selftests for resizing datasec maps libbpf: Add capability for resizing datasec maps selftests/bpf: Add path_fd-based BPF_OBJ_PIN and BPF_OBJ_GET tests libbpf: Add opts-based bpf_obj_pin() API and add support for path_fd bpf: Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands libbpf: Start v1.3 development cycle bpf: Validate BPF object in BPF_OBJ_PIN before calling LSM bpftool: Specify XDP Hints ifname when loading program selftests/bpf: Add xdp_feature selftest for bond device selftests/bpf: Test bpf_sock_destroy selftests/bpf: Add helper to get port using getsockname bpf: Add bpf_sock_destroy kfunc bpf: Add kfunc filter function to 'struct btf_kfunc_id_set' bpf: udp: Implement batching for sockets iterator ... ==================== Link: https://lore.kernel.org/r/20230526222747.17775-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-26kallsyms: remove unused arch_get_kallsym() helperArnd Bergmann
The arch_get_kallsym() function was introduced so that x86 could override it, but that override was removed in bf904d2762ee ("x86/pti/64: Remove the SYSCALL64 entry trampoline"), so now this does nothing except causing a warning about a missing prototype: kernel/kallsyms.c:662:12: error: no previous prototype for 'arch_get_kallsym' [-Werror=missing-prototypes] 662 | int __weak arch_get_kallsym(unsigned int symnum, unsigned long *value, Restore the old behavior before d83212d5dd67 ("kallsyms, x86: Export addresses of PTI entry trampolines") to simplify the code and avoid the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Alan Maguire <alan.maguire@oracle.com> [mcgrof: fold in bpf selftest fix] Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-05-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: net/ipv4/raw.c 3632679d9e4f ("ipv{4,6}/raw: fix output xfrm lookup wrt protocol") c85be08fc4fa ("raw: Stop using RTO_ONLINK.") https://lore.kernel.org/all/20230525110037.2b532b83@canb.auug.org.au/ Adjacent changes: drivers/net/ethernet/freescale/fec_main.c 9025944fddfe ("net: fec: add dma_wmb to ensure correct descriptor values") 144470c88c5d ("net: fec: using the standard return codes when xdp xmit errors") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-25selftests/bpf: Check whether to run selftestDaniel Müller
The sockopt test invokes test__start_subtest and then unconditionally asserts the success. That means that even if deny-listed, any test will still run and potentially fail. Evaluate the return value of test__start_subtest() to achieve the desired behavior, as other tests do. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230525232248.640465-1-deso@posteo.net
2023-05-24libbpf: Selftests for resizing datasec mapsJP Kobryn
This patch adds test coverage for resizing datasec maps. The first two subtests resize the bss and custom data sections. In both cases, an initial array (of length one) has its element set to one. After resizing the rest of the array is filled with ones as well. A BPF program is then run to sum the respective arrays and back on the userspace side the sum is checked to be equal to the number of elements. The third subtest attempts to perform resizing under conditions that will result in either the resize failing or the BTF info being cleared. Signed-off-by: JP Kobryn <inwardvessel@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20230524004537.18614-3-inwardvessel@gmail.com
2023-05-23selftests/bpf: Add path_fd-based BPF_OBJ_PIN and BPF_OBJ_GET testsAndrii Nakryiko
Add a selftest demonstrating using detach-mounted BPF FS using new mount APIs, and pinning and getting BPF map using such mount. This demonstrates how something like container manager could setup BPF FS, pin and adjust all the necessary objects in it, all before exposing BPF FS to a particular mount namespace. Also add a few subtests validating all meaningful combinations of path_fd and pathname. We use mounted /sys/fs/bpf location for these. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230523170013.728457-5-andrii@kernel.org
2023-05-23selftests/bpf: Add xdp_feature selftest for bond deviceLorenzo Bianconi
Introduce selftests to check xdp_feature support for bond driver. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jussi Maki <joamaki@gmail.com> Link: https://lore.kernel.org/bpf/64cb8f20e6491f5b971f8d3129335093c359aad7.1684329998.git.lorenzo@kernel.org
2023-05-23bpf, sockmap: Test progs verifier error with latest clangJohn Fastabend
With a relatively recent clang (7090c10273119) and with this commit to fix warnings in selftests (c8ed668593972) that uses __sink(err) to resolve unused variables. We get the following verifier error. root@6e731a24b33a:/host/tools/testing/selftests/bpf# ./test_sockmap libbpf: prog 'bpf_sockmap': BPF program load failed: Permission denied libbpf: prog 'bpf_sockmap': -- BEGIN PROG LOAD LOG -- 0: R1=ctx(off=0,imm=0) R10=fp0 ; op = (int) skops->op; 0: (61) r2 = *(u32 *)(r1 +0) ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) ; switch (op) { 1: (16) if w2 == 0x4 goto pc+5 ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) 2: (56) if w2 != 0x5 goto pc+15 ; R2_w=5 ; lport = skops->local_port; 3: (61) r2 = *(u32 *)(r1 +68) ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) ; if (lport == 10000) { 4: (56) if w2 != 0x2710 goto pc+13 18: R1=ctx(off=0,imm=0) R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 ; __sink(err); 18: (bc) w1 = w0 R0 !read_ok processed 18 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1 -- END PROG LOAD LOG -- libbpf: prog 'bpf_sockmap': failed to load: -13 libbpf: failed to load object 'test_sockmap_kern.bpf.o' load_bpf_file: (-1) No such file or directory ERROR: (-1) load bpf failed libbpf: prog 'bpf_sockmap': BPF program load failed: Permission denied libbpf: prog 'bpf_sockmap': -- BEGIN PROG LOAD LOG -- 0: R1=ctx(off=0,imm=0) R10=fp0 ; op = (int) skops->op; 0: (61) r2 = *(u32 *)(r1 +0) ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) ; switch (op) { 1: (16) if w2 == 0x4 goto pc+5 ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) 2: (56) if w2 != 0x5 goto pc+15 ; R2_w=5 ; lport = skops->local_port; 3: (61) r2 = *(u32 *)(r1 +68) ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) ; if (lport == 10000) { 4: (56) if w2 != 0x2710 goto pc+13 18: R1=ctx(off=0,imm=0) R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 ; __sink(err); 18: (bc) w1 = w0 R0 !read_ok processed 18 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1 -- END PROG LOAD LOG -- libbpf: prog 'bpf_sockmap': failed to load: -13 libbpf: failed to load object 'test_sockhash_kern.bpf.o' load_bpf_file: (-1) No such file or directory ERROR: (-1) load bpf failed libbpf: prog 'bpf_sockmap': BPF program load failed: Permission denied libbpf: prog 'bpf_sockmap': -- BEGIN PROG LOAD LOG -- 0: R1=ctx(off=0,imm=0) R10=fp0 ; op = (int) skops->op; 0: (61) r2 = *(u32 *)(r1 +0) ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) ; switch (op) { 1: (16) if w2 == 0x4 goto pc+5 ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) 2: (56) if w2 != 0x5 goto pc+15 ; R2_w=5 ; lport = skops->local_port; 3: (61) r2 = *(u32 *)(r1 +68) ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) ; if (lport == 10000) { 4: (56) if w2 != 0x2710 goto pc+13 18: R1=ctx(off=0,imm=0) R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 ; __sink(err); 18: (bc) w1 = w0 R0 !read_ok processed 18 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1 -- END PROG LOAD LOG -- To fix simply remove the err value because its not actually used anywhere in the testing. We can investigate the root cause later. Future patch should probably actually test the err value as well. Although if the map updates fail they will get caught eventually by userspace. Fixes: c8ed668593972 ("selftests/bpf: fix lots of silly mistakes pointed out by compiler") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20230523025618.113937-15-john.fastabend@gmail.com
2023-05-23bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer with dropsJohn Fastabend
When BPF program drops pkts the sockmap logic 'eats' the packet and updates copied_seq. In the PASS case where the sk_buff is accepted we update copied_seq from recvmsg path so we need a new test to handle the drop case. Original patch series broke this resulting in test_sockmap_skb_verdict_fionread:PASS:ioctl(FIONREAD) error 0 nsec test_sockmap_skb_verdict_fionread:FAIL:ioctl(FIONREAD) unexpected ioctl(FIONREAD): actual 1503041772 != expected 256 After updated patch with fix. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20230523025618.113937-14-john.fastabend@gmail.com
2023-05-23bpf, sockmap: Test FIONREAD returns correct bytes in rx bufferJohn Fastabend
A bug was reported where ioctl(FIONREAD) returned zero even though the socket with a SK_SKB verdict program attached had bytes in the msg queue. The result is programs may hang or more likely try to recover, but use suboptimal buffer sizes. Add a test to check that ioctl(FIONREAD) returns the correct number of bytes. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20230523025618.113937-13-john.fastabend@gmail.com
2023-05-23bpf, sockmap: Test shutdown() correctly exits epoll and recv()=0John Fastabend
When session gracefully shutdowns epoll needs to wake up and any recv() readers should return 0 not the -EAGAIN they previously returned. Note we use epoll instead of select to test the epoll wake on shutdown event as well. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20230523025618.113937-12-john.fastabend@gmail.com
2023-05-23bpf, sockmap: Build helper to create connected socket pairJohn Fastabend
A common operation for testing is to spin up a pair of sockets that are connected. Then we can use these to run specific tests that need to send data, check BPF programs and so on. The sockmap_listen programs already have this logic lets move it into the new sockmap_helpers header file for general use. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20230523025618.113937-11-john.fastabend@gmail.com
2023-05-23bpf, sockmap: Pull socket helpers out of listen test for general useJohn Fastabend
No functional change here we merely pull the helpers in sockmap_listen.c into a header file so we can use these in other programs. The tests we are about to add aren't really _listen tests so doesn't make sense to add them here. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20230523025618.113937-10-john.fastabend@gmail.com
2023-05-19selftests/bpf: Test bpf_sock_destroyAditi Ghag
The test cases for destroying sockets mirror the intended usages of the bpf_sock_destroy kfunc using iterators. The destroy helpers set `ECONNABORTED` error code that we can validate in the test code with client sockets. But UDP sockets have an overriding error code from `disconnect()` called during abort, so the error code validation is only done for TCP sockets. The failure test cases validate that the `bpf_sock_destroy` kfunc is not allowed from program attach types other than BPF trace iterator, and such programs fail to load. Signed-off-by: Aditi Ghag <aditi.ghag@isovalent.com> Link: https://lore.kernel.org/r/20230519225157.760788-10-aditi.ghag@isovalent.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-05-19selftests/bpf: Add helper to get port using getsocknameAditi Ghag
The helper will be used to programmatically retrieve and pass ports in userspace and kernel selftest programs. Suggested-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Aditi Ghag <aditi.ghag@isovalent.com> Link: https://lore.kernel.org/r/20230519225157.760788-9-aditi.ghag@isovalent.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-05-17selftests/bpf: Make bpf_dynptr_is_rdonly() prototyype consistent with kernelYonghong Song
Currently kernel kfunc bpf_dynptr_is_rdonly() has prototype ... __bpf_kfunc bool bpf_dynptr_is_rdonly(struct bpf_dynptr_kern *ptr) ... while selftests bpf_kfuncs.h has: extern int bpf_dynptr_is_rdonly(const struct bpf_dynptr *ptr) __ksym; Such a mismatch might cause problems although currently it is okay in selftests. Fix it to prevent future potential surprise. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230517040409.4024618-1-yhs@fb.com
2023-05-17selftests/bpf: Fix dynptr/test_dynptr_is_nullYonghong Song
With latest llvm17, dynptr/test_dynptr_is_null subtest failed in my testing VM. The failure log looks like below: All error logs: tester_init:PASS:tester_log_buf 0 nsec process_subtest:PASS:obj_open_mem 0 nsec process_subtest:PASS:Can't alloc specs array 0 nsec verify_success:PASS:dynptr_success__open 0 nsec verify_success:PASS:bpf_object__find_program_by_name 0 nsec verify_success:PASS:dynptr_success__load 0 nsec verify_success:PASS:bpf_program__attach 0 nsec verify_success:FAIL:err unexpected err: actual 4 != expected 0 #65/9 dynptr/test_dynptr_is_null:FAIL The error happens for bpf prog test_dynptr_is_null in dynptr_success.c: if (bpf_dynptr_is_null(&ptr2)) { err = 4; goto exit; } The bpf_dynptr_is_null(&ptr) unexpectedly returned a non-zero value and the control went to the error path. Digging further, I found the root cause is due to function signature difference between kernel and user space. In kernel, we have ... __bpf_kfunc bool bpf_dynptr_is_null(struct bpf_dynptr_kern *ptr) ... while in bpf_kfuncs.h we have: extern int bpf_dynptr_is_null(const struct bpf_dynptr *ptr) __ksym; The kernel bpf_dynptr_is_null disasm code: ffffffff812f1a90 <bpf_dynptr_is_null>: ffffffff812f1a90: f3 0f 1e fa endbr64 ffffffff812f1a94: 0f 1f 44 00 00 nopl (%rax,%rax) ffffffff812f1a99: 53 pushq %rbx ffffffff812f1a9a: 48 89 fb movq %rdi, %rbx ffffffff812f1a9d: e8 ae 29 17 00 callq 0xffffffff81464450 <__asan_load8_noabort> ffffffff812f1aa2: 48 83 3b 00 cmpq $0x0, (%rbx) ffffffff812f1aa6: 0f 94 c0 sete %al ffffffff812f1aa9: 5b popq %rbx ffffffff812f1aaa: c3 retq Note that only 1-byte register %al is set and the other 7-bytes are not touched. In bpf program, the asm code for the above bpf_dynptr_is_null(&ptr2): 266: 85 10 00 00 ff ff ff ff call -0x1 267: b4 01 00 00 04 00 00 00 w1 = 0x4 268: 16 00 03 00 00 00 00 00 if w0 == 0x0 goto +0x3 <LBB9_8> Basically, 4-byte subregister is tested. This might cause error as the value other than the lowest byte might not be 0. This patch fixed the issue by using the identical func prototype across kernel and selftest user space. The fixed bpf asm code: 267: 85 10 00 00 ff ff ff ff call -0x1 268: 54 00 00 00 01 00 00 00 w0 &= 0x1 269: b4 01 00 00 04 00 00 00 w1 = 0x4 270: 16 00 03 00 00 00 00 00 if w0 == 0x0 goto +0x3 <LBB9_8> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230517040404.4023912-1-yhs@fb.com
2023-05-17selftests/bpf: Do not use sign-file as testcaseAlexey Gladkov
The sign-file utility (from scripts/) is used in prog_tests/verify_pkcs7_sig.c, but the utility should not be called as a test. Executing this utility produces the following error: selftests: /linux/tools/testing/selftests/bpf: urandom_read ok 16 selftests: /linux/tools/testing/selftests/bpf: urandom_read selftests: /linux/tools/testing/selftests/bpf: sign-file not ok 17 selftests: /linux/tools/testing/selftests/bpf: sign-file # exit=2 Also, urandom_read is mistakenly used as a test. It does not lead to an error, but should be moved over to TEST_GEN_FILES as well. The empty TEST_CUSTOM_PROGS can then be removed. Fixes: fc97590668ae ("selftests/bpf: Add test for bpf_verify_pkcs7_signature() kfunc") Signed-off-by: Alexey Gladkov <legion@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/ZEuWFk3QyML9y5QQ@example.org Link: https://lore.kernel.org/bpf/88e3ab23029d726a2703adcf6af8356f7a2d3483.1684316821.git.legion@kernel.org
2023-05-16selftests/xsk: adjust packet pacing for multi-buffer supportMagnus Karlsson
Modify the packet pacing algorithm so that it works with multi-buffer packets. This algorithm makes sure we do not send too many buffers to the receiving thread so that packets have to be dropped. The previous algorithm made the assumption that each packet only consumes one buffer, but that is not true anymore when multi-buffer support gets added. Instead, we find out what the largest packet size is in the packet stream and assume that each packet will consume this many buffers. This is conservative and overly cautious as there might be smaller packets in the stream that need fewer buffers per packet. But it keeps the algorithm simple. Also simplify it by removing the pthread conditional and just test if there is enough space in the Rx thread before trying to send one more batch. Also makes the tests run faster. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230516103109.3066-11-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>