Age Commit message (Author)
2019-08-31 nfp: bpf: add simple map op cache (Jakub Kicinski)
Each get_next and lookup call requires a round trip to the device. However, the device is capable of giving us a few entries back, instead of just one. In this patch we ask for a small yet reasonable number of entries (4) on every get_next call, and on subsequent get_next/lookup calls check this little cache for a hit. The cache is only kept for 250us, and is invalidated on every operation which may modify the map (e.g. delete or update call). Note that operations may be performed simultaneously, so we have to keep track of operations in flight. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
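The caching scheme described above can be modeled in a few lines. This is a rough Python sketch, not the driver code: the class `MapOpCache`, its methods, and the dict standing in for the device are hypothetical. Only the batch size (4) and the 250us lifetime come from the commit message. Tracking of operations in flight is left out, since the model is single-threaded.

```python
import time

CACHE_ENTRIES = 4       # entries requested per get_next call (from the commit message)
CACHE_TTL_S = 250e-6    # cache lifetime of 250us (from the commit message)


class MapOpCache:
    """Toy model of a short-lived cache in front of a slow offloaded map."""

    def __init__(self, device, clock=time.monotonic):
        self.device = device      # plain dict standing in for the device map
        self.clock = clock
        self.round_trips = 0
        self.invalidate()

    def invalidate(self):
        self.cache = {}           # key -> value from the last batch
        self.order = []           # [key the batch was fetched after, k1, k2, ...]
        self.filled_at = None

    def _valid(self):
        return (self.filled_at is not None
                and self.clock() - self.filled_at <= CACHE_TTL_S)

    def get_next(self, key=None):
        if self._valid() and key in self.order[:-1]:
            return self.order[self.order.index(key) + 1]   # cache hit
        self.round_trips += 1                               # one device round trip
        keys = sorted(k for k in self.device if key is None or k > key)
        self.cache = {k: self.device[k] for k in keys[:CACHE_ENTRIES]}
        self.order = [key] + list(self.cache)
        self.filled_at = self.clock()
        return self.order[1] if self.cache else None

    def lookup(self, key):
        if self._valid() and key in self.cache:
            return self.cache[key]                          # cache hit
        self.round_trips += 1
        return self.device.get(key)

    def update(self, key, value):
        self.invalidate()          # any modifying operation drops the cache
        self.round_trips += 1
        self.device[key] = value
```

Passing a fake `clock` makes the expiry behaviour easy to exercise without sleeping for microseconds.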
2019-08-31 nfp: bpf: rework MTU checking (Jakub Kicinski)
If control channel MTU is too low to support map operations a warning will be printed. This is not enough, we want to make sure probe fails in such scenario, as this would clearly be a faulty configuration. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 Merge branch 'bpf-bpftool-build-improvements' (Daniel Borkmann)
Quentin Monnet says: ==================== This set attempts to make it easier to build bpftool, in particular when passing a specific output directory. This is a follow-up to the conversation held last month by Lorenz, Ilya and Jakub [0]. The first patch is a minor fix to bpftool's Makefile, regarding the retrieval of the kernel version (which currently prints an irrelevant make warning on some invocations). The second patch improves the Makefile commands to support more "make" invocations, or to fix building with a custom output directory. On Jakub's suggestion, a script is also added to BPF selftests in order to keep track of the supported build variants. Building bpftool with "make tools/bpf" from the top of the repository generates files in "libbpf/" and "feature/" directories under tools/bpf/ and tools/bpf/bpftool/. The third patch ensures such directories are taken care of on "make clean", and adds them to the relevant .gitignore files. Finally, the fourth patch is a slightly modified version of Ilya's fix regarding libbpf.a appearing twice on the linking command for bpftool. [0] https://lore.kernel.org/bpf/CACAyw9-CWRHVH3TJ=Tke2x8YiLsH47sLCijdp=V+5M836R9aAA@mail.gmail.com/ v2: - Return error from check script if one of the make invocations returns non-zero (even if binary is successfully produced). - Run "make clean" from bpf/ and not only bpf/bpftool/ in that same script, when relevant. - Add a patch to clean up generated "feature/" and "libbpf/" directories. ==================== Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Lorenz Bauer <lmb@cloudflare.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 tools: bpftool: do not link twice against libbpf.a in Makefile (Quentin Monnet)
In bpftool's Makefile, $(LIBS) includes $(LIBBPF), therefore the library is used twice in the linking command. No need to have $(LIBBPF) (from $^) on that command, let's do with "$(OBJS) $(LIBS)" (but move $(LIBBPF) _before_ the -l flags in $(LIBS)). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 tools: bpf: account for generated feature/ and libbpf/ directories (Quentin Monnet)
When building "tools/bpf" from the top of the Linux repository, the build system passes a value for the $(OUTPUT) Makefile variable to tools/bpf/Makefile and tools/bpf/bpftool/Makefile, which results in generating "libbpf/" (for bpftool) and "feature/" (bpf and bpftool) directories inside the tree. This commit adds such directories to the relevant .gitignore files, and edits the Makefiles to ensure they are removed on "make clean". The use of "rm" is also made consistent throughout those Makefiles (relies on the $(RM) variable, uses "--" to prevent interpreting $(OUTPUT)/$(DESTDIR) as options). v2: - New patch. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 tools: bpftool: improve and check builds for different make invocations (Quentin Monnet)
There are a number of alternative "make" invocations that can be used to compile bpftool. The following invocations are expected to work: - through the kbuild system, from the top of the repository (make tools/bpf) - by telling make to change to the bpftool directory (make -C tools/bpf/bpftool) - by building the BPF tools from tools/ (cd tools && make bpf) - by running make from bpftool directory (cd tools/bpf/bpftool && make) Additionally, setting the O or OUTPUT variables should tell the build system to use a custom output path, for each of these alternatives. The following patch fixes the following invocations: $ make tools/bpf $ make tools/bpf O=<dir> $ make -C tools/bpf/bpftool OUTPUT=<dir> $ make -C tools/bpf/bpftool O=<dir> $ cd tools/ && make bpf O=<dir> $ cd tools/bpf/bpftool && make OUTPUT=<dir> $ cd tools/bpf/bpftool && make O=<dir> After this commit, the build still fails for two variants when passing the OUTPUT variable: $ make tools/bpf OUTPUT=<dir> $ cd tools/ && make bpf OUTPUT=<dir> In order to remember and check what make invocations are supposed to work, and to document the ones which do not, a new script is added to the BPF selftests. Note that some invocations require the kernel to be configured, so the script skips them if no .config file is found. v2: - In make_and_clean(), set $ERROR to 1 when "make" returns non-zero, even if the binary was produced. - Run "make clean" from the correct directory (bpf/ instead of bpftool/, when relevant). Reported-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 tools: bpftool: ignore make built-in rules for getting kernel version (Quentin Monnet)
Bpftool calls the toplevel Makefile to get the kernel version for the sources it is built from. But when the utility is built from the top of the kernel repository, it may dump the following error message for certain architectures (including x86): $ make tools/bpf [...] make[3]: *** [checkbin] Error 1 [...] This does not prevent bpftool compilation, but may feel disconcerting. The "checkbin" arch-dependent target is not supposed to be called for target "kernelversion", which is a simple "echo" of the version number. It turns out this is caused by the make invocation in tools/bpf/bpftool, which attempts to find implicit rules to apply. Extract from debug output: Reading makefiles... Reading makefile 'Makefile'... Reading makefile 'scripts/Kbuild.include' (search path) (no ~ expansion)... Reading makefile 'scripts/subarch.include' (search path) (no ~ expansion)... Reading makefile 'arch/x86/Makefile' (search path) (no ~ expansion)... Reading makefile 'scripts/Makefile.kcov' (search path) (no ~ expansion)... Reading makefile 'scripts/Makefile.gcc-plugins' (search path) (no ~ expansion)... Reading makefile 'scripts/Makefile.kasan' (search path) (no ~ expansion)... Reading makefile 'scripts/Makefile.extrawarn' (search path) (no ~ expansion)... Reading makefile 'scripts/Makefile.ubsan' (search path) (no ~ expansion)... Updating makefiles.... Considering target file 'scripts/Makefile.ubsan'. Looking for an implicit rule for 'scripts/Makefile.ubsan'. Trying pattern rule with stem 'Makefile.ubsan'. [...] Trying pattern rule with stem 'Makefile.ubsan'. Trying implicit prerequisite 'scripts/Makefile.ubsan.o'. Looking for a rule with intermediate file 'scripts/Makefile.ubsan.o'. Avoiding implicit rule recursion. Trying pattern rule with stem 'Makefile.ubsan'. Trying rule prerequisite 'prepare'. Trying rule prerequisite 'FORCE'. Found an implicit rule for 'scripts/Makefile.ubsan'. Considering target file 'prepare'. File 'prepare' does not exist. Considering target file 'prepare0'. File 'prepare0' does not exist. Considering target file 'archprepare'. File 'archprepare' does not exist. Considering target file 'archheaders'. File 'archheaders' does not exist. Finished prerequisites of target file 'archheaders'. Must remake target 'archheaders'. Putting child 0x55976f4f6980 (archheaders) PID 31743 on the chain. To avoid that, pass the -r and -R flags to eliminate the use of make built-in rules (and while at it, built-in variables) when running command "make kernelversion" from bpftool's Makefile. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 bpf: s390: add JIT support for multi-function programs (Yauheni Kaliuta)
This adds support for bpf-to-bpf function calls in the s390 JIT compiler. The JIT compiler converts the bpf call instructions to native branch instructions. After a round of the usual passes, the start addresses of the JITed images for the callee functions are known. Finally, to fixup the branch target addresses, we need to perform an extra pass. Because of the address range in which JITed images are allocated on s390, the offsets of the start addresses of these images from __bpf_call_base are as large as 64 bits. So, for a function call, the imm field of the instruction cannot be used to determine the callee's address. Use bpf_jit_get_func_addr() helper instead. The patch borrows a lot from: commit 8c11ea5ce13d ("bpf, arm64: fix getting subprog addr from aux for calls") commit e2c95a61656d ("bpf, ppc64: generalize fetching subprog into bpf_jit_get_func_addr") commit 8484ce8306f9 ("bpf: powerpc64: add JIT support for multi-function programs") (including the commit message). test_verifier (5.3-rc6 with CONFIG_BPF_JIT_ALWAYS_ON=y): without patch: Summary: 1501 PASSED, 0 SKIPPED, 47 FAILED with patch: Summary: 1540 PASSED, 0 SKIPPED, 8 FAILED Signed-off-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28 selftests/bpf: remove wrong nhoff in flow dissector test (Stanislav Fomichev)
.nhoff = 0 is (correctly) reset to ETH_HLEN on the next line so let's drop it. Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28 Merge branch 'bpf-misc-test-fixes' (Daniel Borkmann)
Stanislav Fomichev says: ==================== * add test__skip to indicate skipped tests * remove global success/error counts (use environment) * remove asserts from the tests * remove unused ret from send_signal test v3: * QCHECK -> CHECK_FAIL (Daniel Borkmann) v2: * drop patch that changes output to keep consistent with test_verifier (Alexei Starovoitov) * QCHECK instead of test__fail (Andrii Nakryiko) * test__skip count number of subtests (Andrii Nakryiko) ==================== Cc: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28 selftests/bpf: test_progs: remove unused ret (Stanislav Fomichev)
send_signal test returns static codes from the subtests which nobody looks at, let's rely on the CHECK macros instead. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28 selftests/bpf: test_progs: remove asserts from subtests (Stanislav Fomichev)
Otherwise they can bring the whole process down. Cc: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28 selftests/bpf: test_progs: remove global fail/success counts (Stanislav Fomichev)
Now that we have a global per-test/per-environment state, there is no longer a need to have global fail/success counters (and there is no need to save/get the diff before/after the test). Introduce CHECK_FAIL macro (suggested by Andrii) and convert existing tests to it. CHECK_FAIL uses new test__fail() to record the failure. Cc: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28 selftests/bpf: test_progs: test__skip (Stanislav Fomichev)
Export test__skip() to indicate skipped tests and use it in test_send_signal_nmi(). Cc: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28 Merge branch 'bpf-precision-tracking-tests' (Daniel Borkmann)
Alexei Starovoitov says: ==================== Add a few additional tests for precision tracking in the verifier. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28 selftests/bpf: add precision tracking test (Alexei Starovoitov)
Copy-paste of the existing test "calls: cross frame pruning - liveness propagation", but run with a different parentage chain heuristic, which stresses a different path in the precision tracking logic. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28 selftests/bpf: verifier precise tests (Alexei Starovoitov)
Use BPF_F_TEST_STATE_FREQ flag to check that precision tracking works as expected by comparing every step it takes. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28 tools/bpf: sync bpf.h (Alexei Starovoitov)
sync bpf.h from kernel/ to tools/ Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28 bpf: introduce verifier internal test flag (Alexei Starovoitov)
Introduce BPF_F_TEST_STATE_FREQ flag to stress test parentage chain and state pruning. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-21 tools: bpftool: add "bpftool map freeze" subcommand (Quentin Monnet)
Add a new subcommand to freeze maps from user space. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-21 tools: bpftool: show frozen status for maps (Quentin Monnet)
When listing maps, read their "frozen" status from procfs, and tell if maps are frozen. As commit log for map freezing command mentions that the feature might be extended with flags (e.g. for write-only instead of read-only) in the future, use an integer and not a boolean for JSON output. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
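As a rough illustration of reading the "frozen" status from procfs, the sketch below parses fdinfo-style text and returns the value as an integer rather than a boolean, in line with the JSON choice described above. The helper names and the sample text are hypothetical; a real consumer would read `/proc/<pid>/fdinfo/<fd>` for a map file descriptor.

```python
def parse_fdinfo(text):
    """Turn 'name:<tab>value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def frozen_status(text):
    """Return the 'frozen' field as an int, or 0 when the field is absent."""
    return int(parse_fdinfo(text).get("frozen", 0))


# Illustrative sample only, not captured from a real system.
SAMPLE_FDINFO = "pos:\t0\nflags:\t02000002\nmap_type:\t1\nfrozen:\t1\n"

print(frozen_status(SAMPLE_FDINFO))
```

Treating a missing field as 0 is an assumption made here so that the sketch also works on kernels that do not report the field.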
2019-08-21 bpf: sync bpf.h to tools/ (Peter Wu)
Fix a 'struct pt_reg' typo and clarify when bpf_trace_printk discards lines. Affects documentation only. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-21 bpf: clarify when bpf_trace_printk discards lines (Peter Wu)
I opened /sys/kernel/tracing/trace once and kept reading from it. bpf_trace_printk somehow did not seem to work; no entries were appended to that trace file. It turns out that tracing is disabled when that file is open. Save the next person some time and document this. The trace file is described in Documentation/trace/ftrace.rst, however the implication "tracing is disabled" did not immediately translate to "bpf_trace_printk silently discards entries". Signed-off-by: Peter Wu <peter@lekensteyn.nl> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-21 bpf: fix 'struct pt_reg' typo in documentation (Peter Wu)
There is no 'struct pt_reg'. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-21 bpf: clarify description for CONFIG_BPF_EVENTS (Peter Wu)
PERF_EVENT_IOC_SET_BPF supports uprobes since v4.3, and tracepoints since v4.7 via commit 04a22fae4cbc ("tracing, perf: Implement BPF programs attached to uprobes"), and commit 98b5c2c65c29 ("perf, bpf: allow bpf programs attach to tracepoints") respectively. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-21 btf: do not use CONFIG_OUTPUT_FORMAT (Ilya Leoshkevich)
Building s390 kernel with CONFIG_DEBUG_INFO_BTF fails, because CONFIG_OUTPUT_FORMAT is not defined. As a matter of fact, this variable appears to be x86-only, so other arches might be affected as well. Fix by obtaining this value from objdump output, just like it's already done for bin_arch. The exact objdump invocation is "inspired" by arch/powerpc/boot/wrapper. Also, use LANG=C for the existing bin_arch objdump invocation to avoid potential build issues on systems with non-English locale. Fixes: 341dfcf8d78e ("btf: expose BTF info through sysfs") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-21 samples: bpf: syscall_nrs: use mmap2 if defined (Ivan Khoronzhuk)
For arm32 xdp sockets mmap2 is preferred, so use it if it's defined. The declaration of __NR_mmap can be skipped, which breaks the build. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-21 xdp: xdp_umem: replace kmap on vmap for umem map (Ivan Khoronzhuk)
On 64-bit there is no reason to use vmap/vunmap, so use page_address as was done initially. On 32-bit, in some applications (for example the xdpsock_user.c sample) the number of pages in use can be large enough that the kmap memory runs out. In addition, kmap appears to be deprecated in such cases, as it can block and should rather be used for dynamic mm. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-21 libbpf: use LFS (_FILE_OFFSET_BITS) instead of direct mmap2 syscall (Ivan Khoronzhuk)
Drop the __NR_mmap2 fork in favor of LFS, that is, the _FILE_OFFSET_BITS=64 (glibc & bionic) / LARGEFILE64_SOURCE (for musl) decision. It allows mmap() to use a 64-bit offset that is passed to the mmap2 syscall. As a result, pgoff is not truncated and there is no need to access mmap2 directly on 32-bit systems. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
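The truncation this commit avoids is plain arithmetic, sketched below in Python. The helper names and the example offset are illustrative assumptions. The 4096-byte unit is how mmap2 expresses its offset on most architectures; a few use the page size instead.

```python
MMAP2_UNIT = 4096  # mmap2 takes its offset in 4096-byte units on most architectures


def low_32_bits(offset):
    """Byte offset as it survives a 32-bit off_t: the high bits are lost."""
    return offset & 0xFFFFFFFF


def mmap2_pgoff(offset):
    """Page offset that a 64-bit off_t lets the C library pass to mmap2."""
    if offset % MMAP2_UNIT:
        raise ValueError("offset must be a multiple of the mmap2 unit")
    return offset // MMAP2_UNIT


offset = 0x1_0000_0000               # illustrative offset at exactly 4 GiB
print(hex(low_32_bits(offset)))      # 0x0: the offset is truncated away
print(hex(mmap2_pgoff(offset)))      # 0x100000: intact when expressed in pages
```

With a 64-bit `off_t`, the C library can do the division itself, which is why the direct syscall is no longer needed.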
2019-08-20 Merge branch 'btf_get_next_id' (Alexei Starovoitov)
Quentin Monnet says: ==================== This set adds a new command BPF_BTF_GET_NEXT_ID to the bpf() system call, adds the relevant API function in libbpf, and uses it in bpftool to list all BTF objects loaded on the system (and to dump the ids of maps and programs associated with them, if any). The main motivation for listing BTF objects is introspection and debugging. By getting BPF program and map information, it should already be possible to list all BTF objects associated with at least one map or one program. But there may be unattached BTF objects, held by a file descriptor from a user space process only, and we may want to list them too. As a side note, it also turned out to be useful for examining the BTF objects attached to offloaded programs, which would not show in program information because the BTF id is not copied when retrieving such info. A fix is in progress on that side. v2: - Rebase patch with new libbpf function on top of Andrii's changes regarding libbpf versioning. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-20 tools: bpftool: implement "bpftool btf show|list" (Quentin Monnet)
Add a "btf list" (alias: "btf show") subcommand to bpftool in order to dump all BTF objects loaded on a system. When running the command, hash tables are built in bpftool to retrieve all the associations between BTF objects and BPF maps and programs. This allows for printing all such associations when listing the BTF objects. The command is added at the top of the subcommands for "bpftool btf", so that typing only "bpftool btf" also comes down to listing the programs. We could not have this with the previous command ("dump"), which required a BTF object id, so it should not break any previous behaviour. This also makes the "btf" command behaviour consistent with "prog" or "map". Bash completion is updated to use "bpftool btf" instead of "bpftool prog" to list the BTF ids, as it looks more consistent. Example output (plain): # bpftool btf show 9: size 2989B prog_ids 21 map_ids 15 17: size 2847B prog_ids 36 map_ids 30,29,28 26: size 2847B Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-20 libbpf: add bpf_btf_get_next_id() to cycle through BTF objects (Quentin Monnet)
Add an API function taking a BTF object id and providing the id of the next BTF object in the kernel. This can be used to list all BTF objects loaded on the system. v2: - Rebase on top of Andrii's changes regarding libbpf versioning. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-20 libbpf: refactor bpf_*_get_next_id() functions (Quentin Monnet)
In preparation for the introduction of a similar function for retrieving the id of the next BTF object, consolidate the code from bpf_prog_get_next_id() and bpf_map_get_next_id() in libbpf. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-20 tools: bpf: synchronise BPF UAPI header with tools (Quentin Monnet)
Synchronise the bpf.h header under tools, to report the addition of the new BPF_BTF_GET_NEXT_ID syscall command for bpf(). Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-20 bpf: add new BPF_BTF_GET_NEXT_ID syscall command (Quentin Monnet)
Add a new command for the bpf() system call: BPF_BTF_GET_NEXT_ID is used to cycle through all BTF objects loaded on the system. The motivation is to be able to inspect (list) all BTF objects present on the system. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
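The "cycle through" pattern can be sketched as a loop that starts from id 0 and keeps asking for the next id until the lookup reports that none is left. The Python below is a model only: `get_next_id` and `list_all_ids` are hypothetical stand-ins for the syscall command and its caller, and using ENOENT as the end-of-list signal is an assumption of this sketch. The ids are taken from the "bpftool btf show" example elsewhere in this log.

```python
import errno


def get_next_id(loaded_ids, start_id):
    """Model of a GET_NEXT_ID command: smallest loaded id greater than start_id."""
    candidates = [i for i in loaded_ids if i > start_id]
    if not candidates:
        raise OSError(errno.ENOENT, "no more objects")
    return min(candidates)


def list_all_ids(loaded_ids):
    """Walk every id by repeatedly asking for the one after the current id."""
    ids, current = [], 0
    while True:
        try:
            current = get_next_id(loaded_ids, current)
        except OSError as err:
            if err.errno == errno.ENOENT:
                break              # end of the list reached
            raise                  # any other error is unexpected
        ids.append(current)
    return ids


print(list_all_ids({26, 9, 17}))
```

Because each step only needs the previous id, the walk holds no state in the kernel between calls, at the cost of possibly missing objects that are loaded or freed while it runs.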
2019-08-20 test_bpf: Fix a new clang warning about xor-ing two numbers (Nathan Chancellor)
r369217 in clang added a new warning about potential misuse of the xor operator as an exponentiation operator: ../lib/test_bpf.c:870:13: warning: result of '10 ^ 300' is 294; did you mean '1e300'? [-Wxor-used-as-pow] { { 4, 10 ^ 300 }, { 20, 10 ^ 300 } }, ~~~^~~~~ 1e300 ../lib/test_bpf.c:870:13: note: replace expression with '0xA ^ 300' to silence this warning ../lib/test_bpf.c:870:31: warning: result of '10 ^ 300' is 294; did you mean '1e300'? [-Wxor-used-as-pow] { { 4, 10 ^ 300 }, { 20, 10 ^ 300 } }, ~~~^~~~~ 1e300 ../lib/test_bpf.c:870:31: note: replace expression with '0xA ^ 300' to silence this warning The commit link for this new warning has some good logic behind wanting to add it but this instance appears to be a false positive. Adopt its suggestion to silence the warning but not change the code. According to the differential review link in the clang commit, GCC may eventually adopt this warning as well. Link: https://github.com/ClangBuiltLinux/linux/issues/643 Link: https://github.com/llvm/llvm-project/commit/920890e26812f808a74c60ebc14cc636dac661c1 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
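The arithmetic behind the warning can be checked quickly. Python is used here only because its `^` operator is also bitwise XOR, so the numbers match the C expression; the variable names are illustrative.

```python
# '^' is bitwise XOR, not exponentiation.
xor_value = 10 ^ 300        # the expression clang warns about
hex_spelling = 0xA ^ 300    # same operands, spelled the way clang suggests
power = 10 ** 300           # what the warning suspects the author meant

print(xor_value)                   # 294
print(hex_spelling == xor_value)   # True
print(power == xor_value)          # False
```

Writing the left operand in hex changes nothing about the result; it only signals that a bitwise operation is intended.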
2019-08-20 bpf: add include guard to tnum.h (Masahiro Yamada)
Add a header include guard just in case. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-20 bpf: add BTF ids in procfs for file descriptors to BTF objects (Quentin Monnet)
Implement the show_fdinfo hook for BTF FDs file operations, and make it print the id of the BTF object. This allows for a quick retrieval of the BTF id from its FD; or it can help understanding what type of object (BTF) the file descriptor points to. v2: - Do not expose data_size, only btf_id, in FD info. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-20 bpf: Use PTR_ERR_OR_ZERO in xsk_map_inc() (YueHaibing)
Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-17 Merge branch 'bpf-af-xdp-xskmap-improvements' (Daniel Borkmann)
Björn Töpel says: ==================== This series (v5 and counting) adds two improvements for the XSKMAP, used by AF_XDP sockets. 1. Automatic cleanup when an AF_XDP socket goes out of scope/is released. Instead of requiring that the user manually clear the "released" state socket from the map, this is done automatically. Each socket tracks which maps it resides in, and removes itself from those maps at release. A notable implementation change is that the sockets reference the map, instead of the map referencing the sockets. This implies that when the XSKMAP is freed, it is by definition cleared of sockets. 2. The XSKMAP did not honor the BPF_EXIST/BPF_NOEXIST flag on insert, which this patch addresses. v1->v2: Fixed deadlock and broken cleanup. (Daniel) v2->v3: Rebased onto bpf-next v3->v4: {READ, WRITE}_ONCE consistency. (Daniel) Socket release/map update race. (Daniel) v4->v5: Avoid use-after-free on XSKMAP self-assignment [1]. (Daniel) Removed redundant assignment in xsk_map_update_elem(). Variable name consistency; Use map_entry everywhere. [1] https://lore.kernel.org/bpf/20190802081154.30962-1-bjorn.topel@gmail.com/T/#mc68439e97bc07fa301dad9fc4850ed5aa392f385 ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-17 xsk: support BPF_EXIST and BPF_NOEXIST flags in XSKMAP (Björn Töpel)
The XSKMAP did not honor the BPF_EXIST/BPF_NOEXIST flags when updating an entry. This patch addresses that. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
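For readers unfamiliar with these flags, the sketch below models the usual update semantics on a plain dict: BPF_NOEXIST only creates, BPF_EXIST only replaces, and BPF_ANY does either. The function `update_elem` is a hypothetical stand-in, not the kernel's xsk_map_update_elem(), and the errno values are the conventional ones for BPF map updates rather than something stated in the commit message.

```python
import errno

# Flag values as defined in the BPF UAPI header.
BPF_ANY, BPF_NOEXIST, BPF_EXIST = 0, 1, 2


def update_elem(entries, key, value, flags=BPF_ANY):
    """Insert or replace entries[key] while honoring the update flags."""
    if flags not in (BPF_ANY, BPF_NOEXIST, BPF_EXIST):
        raise OSError(errno.EINVAL, "unknown update flag")
    present = key in entries
    if flags == BPF_NOEXIST and present:
        raise OSError(errno.EEXIST, "entry already exists")
    if flags == BPF_EXIST and not present:
        raise OSError(errno.ENOENT, "entry does not exist")
    entries[key] = value


xskmap = {}
update_elem(xskmap, 0, "socket-a", BPF_NOEXIST)   # creates slot 0
update_elem(xskmap, 0, "socket-b", BPF_EXIST)     # replaces slot 0
print(xskmap)
```

An application can use BPF_NOEXIST to claim a slot without silently evicting a socket that another process placed there.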
2019-08-17 xsk: remove AF_XDP socket from map when the socket is released (Björn Töpel)
When an AF_XDP socket is released/closed the XSKMAP still holds a reference to the socket in a "released" state. The socket will still use the netdev queue resource, and block newly created sockets from attaching to that queue, but no user application can access the fill/complete/rx/tx queues. As a result, all applications need to explicitly clear the map entry from the old "zombie state" socket. This should be done automatically. In this patch, each socket tracks, and holds a reference to, the maps it resides in. When the socket is released, it removes itself from all maps. Suggested-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
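The bookkeeping idea, where the socket remembers which map slots point at it and clears them on release, can be sketched as follows. The classes `XskMap` and `Socket` and their methods are hypothetical; locking and reference counting, which the real patch has to deal with, are left out.

```python
class Socket:
    """Toy AF_XDP socket that remembers every (map, index) slot holding it."""

    def __init__(self, name):
        self.name = name
        self.slots = set()

    def release(self):
        # Remove ourselves from every map we are still installed in.
        for xsk_map, index in list(self.slots):
            if xsk_map.entries[index] is self:
                xsk_map.entries[index] = None
        self.slots.clear()


class XskMap:
    """Toy XSKMAP: a fixed-size array of socket references."""

    def __init__(self, size):
        self.entries = [None] * size

    def set(self, index, sock):
        old = self.entries[index]
        if old is sock:
            return
        if old is not None:
            old.slots.discard((self, index))   # old socket no longer lives here
        self.entries[index] = sock
        sock.slots.add((self, index))


map_a, map_b = XskMap(2), XskMap(2)
sock = Socket("xsk0")
map_a.set(0, sock)
map_b.set(1, sock)
sock.release()
print(map_a.entries, map_b.entries)
```

After `release()`, both maps are empty again without the application having to delete the entries itself.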
2019-08-17 Merge branch 'bpf-sk-storage-clone' (Daniel Borkmann)
Stanislav Fomichev says: ==================== Currently there is no way to propagate sk storage from the listener socket to a newly accepted one. Consider the following use case: fd = socket(); setsockopt(fd, SOL_IP, IP_TOS,...); /* ^^^ setsockopt BPF program triggers here and saves something * into sk storage of the listener. */ listen(fd, ...); while (client = accept(fd)) { /* At this point all association between listener * socket and newly accepted one is gone. New * socket will not have any sk storage attached. */ } Let's add new BPF_F_CLONE flag that can be specified when creating a socket storage map. This new flag indicates that map contents should be cloned when the socket is cloned. v4: * drop 'goto err' in bpf_sk_storage_clone (Yonghong Song) * add comment about race with bpf_sk_storage_map_free to the bpf_sk_storage_clone side as well (Daniel Borkmann) v3: * make sure BPF_F_NO_PREALLOC is always present when creating a map (Martin KaFai Lau) * don't call bpf_sk_storage_free explicitly, rely on sk_free_unlock_clone to do the cleanup (Martin KaFai Lau) v2: * remove spinlocks around selem_link_map/sk (Martin KaFai Lau) * BPF_F_CLONE on a map, not selem (Martin KaFai Lau) * hold a map while cloning (Martin KaFai Lau) * use BTF maps in selftests (Yonghong Song) * do proper cleanup selftests; don't call close(-1) (Yonghong Song) * export bpf_map_inc_not_zero ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
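The intended effect of the flag can be modeled with a small sketch: when a listener is cloned into an accepted socket, only storage belonging to maps created with the clone flag is copied. The classes and the boolean `clone` attribute below are hypothetical simplifications; the real flag is a map creation flag and the copy happens inside the kernel.

```python
import copy


class StorageMap:
    """Toy socket-storage map; 'clone' stands in for the BPF_F_CLONE flag."""

    def __init__(self, name, clone=False):
        self.name = name
        self.clone = clone


class Sock:
    """Toy socket with per-map storage."""

    def __init__(self):
        self.storage = {}                      # StorageMap -> stored value

    def clone_sock(self):
        """Model of accept(): the child inherits storage of clone maps only."""
        child = Sock()
        for storage_map, value in self.storage.items():
            if storage_map.clone:
                child.storage[storage_map] = copy.copy(value)
        return child


inherited = StorageMap("tos_state", clone=True)
private = StorageMap("listener_only")

listener = Sock()
listener.storage[inherited] = {"tos": 0x10}
listener.storage[private] = {"hits": 3}

accepted = listener.clone_sock()
print(sorted(m.name for m in accepted.storage))
```

Copying the value rather than sharing it means later writes on the accepted socket do not leak back into the listener.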
2019-08-17 selftests/bpf: add sockopt clone/inheritance test (Stanislav Fomichev)
Add a test that calls setsockopt on the listener socket which triggers BPF program. This BPF program writes to the sk storage and sets clone flag. Make sure that sk storage is cloned for a newly accepted connection. We have two cloned maps in the tests to make sure we hit both cases in bpf_sk_storage_clone: first element (sk_storage_alloc) and non-first element(s) (selem_link_map). Cc: Martin KaFai Lau <kafai@fb.com> Cc: Yonghong Song <yhs@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-17 bpf: sync bpf.h to tools/ (Stanislav Fomichev)
Sync new sk storage clone flag. Cc: Martin KaFai Lau <kafai@fb.com> Cc: Yonghong Song <yhs@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-17 bpf: support cloning sk storage on accept() (Stanislav Fomichev)
Add new helper bpf_sk_storage_clone which optionally clones sk storage and calls it from sk_clone_lock. Cc: Martin KaFai Lau <kafai@fb.com> Cc: Yonghong Song <yhs@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-17 bpf: export bpf_map_inc_not_zero (Stanislav Fomichev)
Rename existing bpf_map_inc_not_zero to __bpf_map_inc_not_zero to indicate that it's caller's responsibility to do proper locking. Create and export bpf_map_inc_not_zero wrapper that properly locks map_idr_lock. Will be used in the next commit to hold a map while cloning a socket. Cc: Martin KaFai Lau <kafai@fb.com> Cc: Yonghong Song <yhs@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-17 selftests/bpf: fix race in test_tcp_rtt test (Petar Penkov)
There is a race in this test between receiving the ACK for the single-byte packet sent in the test, and reading the values from the map. This patch fixes this by having the client wait until there are no more unacknowledged packets. Before: for i in {1..1000}; do ../net/in_netns.sh ./test_tcp_rtt; \ done | grep -c PASSED < trimmed error messages > 993 After: for i in {1..10000}; do ../net/in_netns.sh ./test_tcp_rtt; \ done | grep -c PASSED 10000 Fixes: b55873984dab ("selftests/bpf: test BPF_SOCK_OPS_RTT_CB") Signed-off-by: Petar Penkov <ppenkov@google.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-17 libbpf: relicense bpf_helpers.h and bpf_endian.h (Andrii Nakryiko)
bpf_helpers.h and bpf_endian.h contain useful macros and BPF helper definitions essential to almost every BPF program. Which makes them useful not just for selftests. To be able to expose them as part of libbpf, though, we need them to be dual-licensed as LGPL-2.1 OR BSD-2-Clause. This patch updates licensing of those two files. Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Hechao Li <hechaol@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Andrey Ignatov <rdna@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Adam Barth <arb@fb.com> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Josef Bacik <jbacik@fb.com> Acked-by: Joe Stringer <joe@wand.net.nz> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Acked-by: Adrian Ratiu <adrian.ratiu@collabora.com> Acked-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Petar Penkov <ppenkov@google.com> Acked-by: Teng Qin <palmtenor@gmail.com> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Michal Rostecki <mrostecki@opensuse.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Sargun Dhillon <sargun@sargun.me> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-17 net: Don't call XDP_SETUP_PROG when nothing is changed (Maxim Mikityanskiy)
Don't uninstall an XDP program when none is installed, and don't install an XDP program that has the same ID as the one already installed. dev_change_xdp_fd doesn't perform any checks in case it uninstalls an XDP program. It means that the driver's ndo_bpf can be called with XDP_SETUP_PROG asking to set it to NULL even if it's already NULL. This case happens if the user runs `ip link set eth0 xdp off` when there is no XDP program attached. The symmetrical case is possible when the user tries to set the program that is already set. The drivers typically perform some heavy operations on XDP_SETUP_PROG, so they all have to handle these cases internally to return early if they happen. This patch puts this check into the kernel code, so that all drivers will benefit from it. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
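The check being hoisted out of the drivers amounts to comparing the requested program with the installed one before doing any work. Below is a minimal Python model with hypothetical names (`Driver`, `change_xdp`); `None` stands for "no program", and program identity is reduced to an integer id.

```python
class Driver:
    """Toy driver whose setup_prog() stands in for a costly XDP_SETUP_PROG."""

    def __init__(self):
        self.prog_id = None         # None means no XDP program installed
        self.setup_calls = 0

    def setup_prog(self, prog_id):
        self.setup_calls += 1       # count how often the heavy path runs
        self.prog_id = prog_id


def change_xdp(driver, new_prog_id):
    """Install or uninstall (None) a program, skipping requests that change nothing."""
    if new_prog_id == driver.prog_id:
        return False                # covers "off when already off" and "same id again"
    driver.setup_prog(new_prog_id)
    return True


drv = Driver()
change_xdp(drv, None)   # nothing attached, nothing to remove
change_xdp(drv, 42)     # real install
change_xdp(drv, 42)     # same program again
change_xdp(drv, None)   # real uninstall
print(drv.setup_calls)  # 2
```

Only two of the four requests reach the driver, which is the saving the commit describes for drivers that do heavy work on every setup call.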