summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2022-09-02bpf: Support getting tunnel flagsShmulik Ladkani
Existing 'bpf_skb_get_tunnel_key' extracts various tunnel parameters (id, ttl, tos, local and remote) but does not expose ip_tunnel_info's tun_flags to the BPF program. It makes sense to expose tun_flags to the BPF program. Assume for example multiple GRE tunnels maintained on a single GRE interface in collect_md mode. The program expects origins to initiate over GRE, however different origins use different GRE characteristics (e.g. some prefer to use GRE checksum, some do not; some pass a GRE key, some do not, etc..). A BPF program getting tun_flags can therefore remember the relevant flags (e.g. TUNNEL_CSUM, TUNNEL_SEQ...) for each initiating remote. In the reply path, the program can use 'bpf_skb_set_tunnel_key' in order to correctly reply to the remote, using similar characteristics, based on the stored tunnel flags. Introduce BPF_F_TUNINFO_FLAGS flag for bpf_skb_get_tunnel_key. If specified, 'bpf_tunnel_key->tunnel_flags' is set with the tun_flags. Decided to use the existing unused 'tunnel_ext' as the storage for the 'tunnel_flags' in order to avoid changing bpf_tunnel_key's layout. Also, the following has been considered during the design: 1. Convert the "interesting" internal TUNNEL_xxx flags back to BPF_F_yyy and place into the new 'tunnel_flags' field. This has 2 drawbacks: - The BPF_F_yyy flags are from *set_tunnel_key* enumeration space, e.g. BPF_F_ZERO_CSUM_TX. It is awkward that it is "returned" into tunnel_flags from a *get_tunnel_key* call. - Not all "interesting" TUNNEL_xxx flags can be mapped to existing BPF_F_yyy flags, and it doesn't make sense to create new BPF_F_yyy flags just for purposes of the returned tunnel_flags. 2. Place key.tun_flags into 'tunnel_flags' but mask them, keeping only "interesting" flags. That's ok, but the drawback is that what's "interesting" for my usecase might be limiting for other usecases. Therefore I decided to expose what's in key.tun_flags *as is*, which seems most flexible. The BPF user can just choose to ignore bits he's not interested in. The TUNNEL_xxx are also UAPI, so no harm exposing them back in the get_tunnel_key call. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220831144010.174110-1-shmulik.ladkani@gmail.com
2022-09-01selftests: net: dsa: symlink the tc_actions.sh testVladimir Oltean
This has been validated on the Ocelot/Felix switch family (NXP LS1028A) and should be relevant to any switch driver that offloads the tc-flower and/or tc-matchall actions trap, drop, accept, mirred, for which DSA has operations. TEST: gact drop and ok (skip_hw) [ OK ] TEST: mirred egress flower redirect (skip_hw) [ OK ] TEST: mirred egress flower mirror (skip_hw) [ OK ] TEST: mirred egress matchall mirror (skip_hw) [ OK ] TEST: mirred_egress_to_ingress (skip_hw) [ OK ] TEST: gact drop and ok (skip_sw) [ OK ] TEST: mirred egress flower redirect (skip_sw) [ OK ] TEST: mirred egress flower mirror (skip_sw) [ OK ] TEST: mirred egress matchall mirror (skip_sw) [ OK ] TEST: trap (skip_sw) [ OK ] TEST: mirred_egress_to_ingress (skip_sw) [ OK ] Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220831170839.931184-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-01Merge tag 'kvm-s390-master-6.0-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD PCI interpretation compile fixes
2022-09-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
tools/testing/selftests/net/.gitignore sort the net-next version and use it Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-01selftests/bpf: Test concurrent updates on bpf_task_storage_busyHou Tao
Under full preemptible kernel, task local storage lookup operations on the same CPU may update per-cpu bpf_task_storage_busy concurrently. If the update of bpf_task_storage_busy is not preemption safe, the final value of bpf_task_storage_busy may become not-zero forever and bpf_task_storage_trylock() will always fail. So add a test case to ensure the update of bpf_task_storage_busy is preemption safe. Will skip the test case when CONFIG_PREEMPT is disabled, and it can only reproduce the problem probabilistically. By increasing TASK_STORAGE_MAP_NR_LOOP and running it under ARM64 VM with 4-cpus, it takes about four rounds to reproduce: > test_maps is modified to only run test_task_storage_map_stress_lookup() $ export TASK_STORAGE_MAP_NR_THREAD=256 $ export TASK_STORAGE_MAP_NR_LOOP=81920 $ export TASK_STORAGE_MAP_PIN_CPU=1 $ time ./test_maps test_task_storage_map_stress_lookup(135):FAIL:bad bpf_task_storage_busy got -2 real 0m24.743s user 0m6.772s sys 0m17.966s Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20220901061938.3789460-5-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-09-01selftests/bpf: Move sys_pidfd_open() into task_local_storage_helpers.hHou Tao
sys_pidfd_open() is defined twice in both test_bprm_opts.c and test_local_storage.c, so move it to a common header file. And it will be used in map_tests as well. Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20220901061938.3789460-4-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-09-01Merge tag 'net-6.0-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bluetooth, bpf and wireless. Current release - regressions: - bpf: - fix wrong last sg check in sk_msg_recvmsg() - fix kernel BUG in purge_effective_progs() - mac80211: - fix possible leak in ieee80211_tx_control_port() - potential NULL dereference in ieee80211_tx_control_port() Current release - new code bugs: - nfp: fix the access to management firmware hanging Previous releases - regressions: - ip: fix triggering of 'icmp redirect' - sched: tbf: don't call qdisc_put() while holding tree lock - bpf: fix corrupted packets for XDP_SHARED_UMEM - bluetooth: hci_sync: fix suspend performance regression - micrel: fix probe failure Previous releases - always broken: - tcp: make global challenge ack rate limitation per net-ns and default disabled - tg3: fix potential hang-up on system reboot - mac802154: fix reception for no-daddr packets Misc: - r8152: add PID for the lenovo onelink+ dock" * tag 'net-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (56 commits) net/smc: Remove redundant refcount increase Revert "sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb" tcp: make global challenge ack rate limitation per net-ns and default disabled tcp: annotate data-race around challenge_timestamp net: dsa: hellcreek: Print warning only once ip: fix triggering of 'icmp redirect' sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb selftests: net: sort .gitignore file Documentation: networking: correct possessive "its" kcm: fix strp_init() order and cleanup mlxbf_gige: compute MDIO period based on i1clk ethernet: rocker: fix sleep in atomic context bug in neigh_timer_handler net: lan966x: improve error handle in lan966x_fdma_rx_get_frame() nfp: fix the access to management firmware hanging net: phy: micrel: Make the GPIO to be non-exclusive net: virtio_net: fix notification coalescing comments net/sched: fix netdevice reference leaks in attach_default_qdiscs() net: sched: tbf: don't call qdisc_put() while holding tree lock net: Use u64_stats_fetch_begin_irq() for stats fetch. net: dsa: xrs700x: Use irqsave variant for u64 stats update ...
2022-09-01selftests/net: return back io_uring zc send testsPavel Begunkov
Enable io_uring zerocopy send tests back and fix them up to follow the new inteface. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c8e5018c516093bdad0b6e19f2f9847dea17e4d2.1662027856.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-01selftests/net: temporarily disable io_uring zc testPavel Begunkov
We're going to change API, to avoid build problems with a couple of following commits, disable io_uring testing. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/12b7507223df04fbd12aa05fc0cb544b51d7ed79.1662027856.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-31net-next: Fix IP_UNICAST_IF option behavior for connected socketsRichard Gobert
The IP_UNICAST_IF socket option is used to set the outgoing interface for outbound packets. The IP_UNICAST_IF socket option was added as it was needed by the Wine project, since no other existing option (SO_BINDTODEVICE socket option, IP_PKTINFO socket option or the bind function) provided the needed characteristics needed by the IP_UNICAST_IF socket option. [1] The IP_UNICAST_IF socket option works well for unconnected sockets, that is, the interface specified by the IP_UNICAST_IF socket option is taken into consideration in the route lookup process when a packet is being sent. However, for connected sockets, the outbound interface is chosen when connecting the socket, and in the route lookup process which is done when a packet is being sent, the interface specified by the IP_UNICAST_IF socket option is being ignored. This inconsistent behavior was reported and discussed in an issue opened on systemd's GitHub project [2]. Also, a bug report was submitted in the kernel's bugzilla [3]. To understand the problem in more detail, we can look at what happens for UDP packets over IPv4 (The same analysis was done separately in the referenced systemd issue). When a UDP packet is sent the udp_sendmsg function gets called and the following happens: 1. The oif member of the struct ipcm_cookie ipc (which stores the output interface of the packet) is initialized by the ipcm_init_sk function to inet->sk.sk_bound_dev_if (the device set by the SO_BINDTODEVICE socket option). 2. If the IP_PKTINFO socket option was set, the oif member gets overridden by the call to the ip_cmsg_send function. 3. If no output interface was selected yet, the interface specified by the IP_UNICAST_IF socket option is used. 4. If the socket is connected and no destination address is specified in the send function, the struct ipcm_cookie ipc is not taken into consideration and the cached route, that was calculated in the connect function is being used. Thus, for a connected socket, the IP_UNICAST_IF sockopt isn't taken into consideration. This patch corrects the behavior of the IP_UNICAST_IF socket option for connect()ed sockets by taking into consideration the IP_UNICAST_IF sockopt when connecting the socket. In order to avoid reconnecting the socket, this option is still ignored when applied on an already connected socket until connect() is called again by the Richard Gobert. Change the __ip4_datagram_connect function, which is called during socket connection, to take into consideration the interface set by the IP_UNICAST_IF socket option, in a similar way to what is done in the udp_sendmsg function. [1] https://lore.kernel.org/netdev/1328685717.4736.4.camel@edumazet-laptop/T/ [2] https://github.com/systemd/systemd/issues/11935#issuecomment-618691018 [3] https://bugzilla.kernel.org/show_bug.cgi?id=210255 Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220829111554.GA1771@debian Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31selftests/bpf: Add test cases for htab updateHou Tao
One test demonstrates the reentrancy of hash map update on the same bucket should fail, and another one shows concureently updates of the same hash map bucket should succeed and not fail due to the reentrancy checking for bucket lock. There is no trampoline support on s390x, so move htab_update to denylist. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20220831042629.130006-4-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-08-31selftest/bpf: Ensure no module loading in bpf_setsockopt(TCP_CONGESTION)Martin KaFai Lau
This patch adds a test to ensure bpf_setsockopt(TCP_CONGESTION, "not_exist") will not trigger the kernel module autoload. Before the fix: [ 40.535829] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274 [...] [ 40.552134] tcp_ca_find_autoload.constprop.0+0xcb/0x200 [ 40.552689] tcp_set_congestion_control+0x99/0x7b0 [ 40.553203] do_tcp_setsockopt+0x3ed/0x2240 [...] [ 40.556041] __bpf_setsockopt+0x124/0x640 Signed-off-by: Martin KaFai Lau <martin.lau@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220830231953.792412-1-martin.lau@linux.dev
2022-08-31selftests: net: sort .gitignore fileAxel Rasmussen
This is the result of `sort tools/testing/selftests/net/.gitignore`, but preserving the comment at the top. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Link: https://lore.kernel.org/r/20220829184748.1535580-1-axelrasmussen@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31libbpf: Add GCC support for bpf_tail_call_staticJames Hilliard
The bpf_tail_call_static function is currently not defined unless using clang >= 8. To support bpf_tail_call_static on GCC we can check if __clang__ is not defined to enable bpf_tail_call_static. We need to use GCC assembly syntax when the compiler does not define __clang__ as LLVM inline assembly is not fully compatible with GCC. Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220829210546.755377-1-james.hilliard1@gmail.com
2022-08-31selftests/xsk: Add missing close() on netns fdMaciej Fijalkowski
Commit 1034b03e54ac ("selftests: xsk: Simplify cleanup of ifobjects") removed close on netns fd, which is not correct, so let us restore it. Fixes: 1034b03e54ac ("selftests: xsk: Simplify cleanup of ifobjects") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220830133905.9945-1-maciej.fijalkowski@intel.com
2022-08-31perf script: Skip dummy event attr checkJiri Olsa
Hongtao Yu reported problem when displaying uregs in perf script for system wide perf.data: # perf script -F uregs | head -10 Samples for 'dummy:HG' event do not have UREGS attribute set. Cannot print 'uregs' field. The problem is the extra dummy event added for system wide, which does not have proper sample_type setup. Skipping attr check completely for dummy event as suggested by Namhyung, because it does not have any samples anyway. Reported-by: Hongtao Yu <hoy@fb.com> Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220831124041.219925-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-31perf metric: Return early if no CPU PMU table existsIan Rogers
Previous behavior is to segfault if there is no CPU PMU table and a metric is sought. To reproduce compile with NO_JEVENTS=1 then request a metric, for example, "perf stat -M IPC true". Committer testing: Before: $ make -k NO_JEVENTS=1 BUILD_BPF_SKEL=1 O=/tmp/build/perf-urgent -C tools/perf install-bin $ perf stat -M IPC true Segmentation fault (core dumped) $ After: $ perf stat -M IPC true Usage: perf stat [<options>] [<command>] -M, --metrics <metric/metric group list> monitor specified metrics or metric groups (separated by ,) $ Fixes: 00facc760903be66 ("perf jevents: Switch build to use jevents.py") Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <rogers.email@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220830164846.401143-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-31selftests/nolibc: Avoid generated files being committedFernanda Ma'rouf
After running the nolibc tests, the "git status" is not clean because the generated files are not ignored. Create a `.gitignore` inside the selftests/nolibc directory to ignore them. Cc: Ammar Faizi <ammarfaizi2@gnuweeb.org> Cc: Fernanda Ma'rouf <fernandafmr2@gmail.com> Signed-off-by: Fernanda Ma'rouf <fernandafmr12@gnuweeb.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: add a "help" targetWilly Tarreau
It presents the supported targets, and becomes the default target to save the user from having to read the makefile. The "all" target was placed after it and now points to "run" to do everything since it's no longer the default one. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: "sysroot" target installs a local copy of the sysrootWilly Tarreau
It's not convenient to rely on a sysroot built in another directory, especially when running cross-compilation tests, where one has to switch back and forth between directories. Let's make it possible to install the sysroot directly in the test directory. It's not big and even benefits from being copied by arch so that it's easier to switch between archs if needed. The new "sysroot" target does this, it just calls "headers_standalone" from nolibc to install the sysroot right here. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: add a "run" target to start the kernel in QEMUWilly Tarreau
The "run" target will build the kernel and start it in QEMU. The "rerun" target will not have the kernel dependency and will just try to start QEMU. The QEMU architecture used to start the kernel is derived from the configured ARCH. This might need to be improved for archs which include different variants under the same name (mips vs mipsel, +/-64, riscv32 vs riscv64). This could be tested for i386, x86, arm, arm64, mips and riscv (the later two reporting issues on some tests). It is possible to pass a test specification for nolibc-test in the TEST variable, which will be passed as-is as NOLIBC_TEST. On success, the number of successful tests is printed. On failure, failed lines are individually printed. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: add a "defconfig" targetWilly Tarreau
While most archs will work fine with "make defconfig", not all will do, and it's not always easy to remember the most suitable choice to use for a specific architecture. This adds a "defconfig" target to the Makefile so that one may easily run "make -C ... defconfig" and make sure to clean and rebuild a fresh config. This is *not* used by default because we want to preserve the user's config by default. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: add a "kernel" target to build the kernel with the initramfsWilly Tarreau
The "kernel" target rebuilds the kernel with the current config for the selected arch, with an initramfs containing the nolibc-test utility. Since image names depend on the architecture, the currently supported ones are referenced and resolved based on the architecture. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: support glibc as wellWilly Tarreau
Adding support for glibc can be useful to distinguish between bugs in nolibc and bugs in the kernel when a syscall reports an unusual value. It's not that much work and should not affect the long term maintainability of the tests. The necessary changes can essentially be summed up like this: - set _GNU_SOURCE a the top to access some definitions - many includes added when we know we don't come from nolibc (missing the stdio include guard) - disable gettid() which is not exposed by glibc - disable gettimeofday's support of bad pointers since these crash in glibc - add a simple itoa() for errorname(); strerror() is too verbose (no way to get short messages). strerrorname_np() was added in modern glibc (2.32) to do exactly this but that 's too recent to be usable as the default fallback. - use the standard ioperm() definition. May be we need to implement ioperm() in nolibc if that's useful. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: condition some tests on /proc existenceWilly Tarreau
If /proc is not available (program run inside a chroot or without sufficient permissions), it's better to disable the associated tests. Some will be preserved like the ones which check for a failure to create some entries there since they're still supposed to fail. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: recreate and populate /dev and /proc if missingWilly Tarreau
Most of the time the program will be run alone in an initramfs. There is no value in requiring the user to populate /dev and /proc for such tests, we can do it ourselves, and it participates to the tests at the same time. What's done here is that when called as init (getpid()==1) we check if /dev exists or create it, if /dev/console and /dev/null exists, otherwise we try to mount a devtmpfs there, and if it fails we fall back to mknod. The console is reopened if stdout was closed. Finally /proc is created and mounted if /proc/self cannot be found. This is sufficient for most tests. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: on x86, support exiting with isa-debug-exitWilly Tarreau
QEMU, when started with "-device isa-debug-exit -no-reboot" will exit with status code 2N+1 when N is written to 0x501. This is particularly convenient for automated tests but this is not portable. As such we only enable this on x86_64 when pid==1. In addition, this requires an ioperm() call but in order not to have to define arch-specific syscalls we just perform the syscall by hand there. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: exit with poweroff on success when getpid() == 1Willy Tarreau
The idea is to ease automated testing under qemu. If the test succeeds while running as PID 1, indicating the system was booted with init=/test, let's just power off so that qemu can exit with a successful code. In other situations it will exit and provoke a panic, which may be caught for example with CONFIG_PVPANIC. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: add a few tests for some libc functionsWilly Tarreau
The test series called "stdlib" covers some libc functions (string, stdlib etc). By default they are automatically run after "syscall" but may be requested in argument or in variable NOLIBC_TEST. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: implement a few tests for various syscallsWilly Tarreau
This adds 63 tests covering about 34 syscalls. Both successes and failures are tested. Two tests fail when run as unprivileged user (link_dir which returns EACCESS instead of EPERM, and chroot which returns EPERM). One test (execve("/")) expects to fail on EACCESS, but needs to have valid arguments otherwise the kernel will log a message. And a few tests require /proc to be mounted. The code is not pretty since all tests are one-liners, sometimes resulting in long lines, especially when using compount statements to preset a line, but it's convenient and doesn't obfuscate the code, which is important to understand what failed. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: support a test definition formatWilly Tarreau
It now becomes possible to pass a string either in argv[1] or in the NOLIBC_TEST environment variable (the former having precedence), to specify which tests to run. The format is: testname[:range]*[,testname...] Where a range is either a single value or the min and max numbers of the test IDs in a sequence, delimited by a dash. Multiple ranges are possible. This should provide enough flexibility to focus on certain failing parts just by playing with the boot command line in a boot loader or in qemu depending on what is accessible. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: add basic infrastructure to ease creation of nolibc testsWilly Tarreau
This creates a "nolibc" selftest that intends to test various parts of the nolibc component, both in terms of build and execution for a given architecture. The aim is for it to be as simple to run as a kernel build, by just passing the compiler (for the build) and the ARCH (for kernel and execution). It brings a basic squeleton made of a single C file that will ease testing and error reporting. The code will be arranged so that it remains easy to add basic tests for syscalls or library calls that may rely on a condition to be executed, and whose result is compared to a value or to an error with a specific errno value. Tests will just use a relative line number in switch/case statements as an index, saving the user from having to maintain arrays and complicated functions which can often just be one-liners. MAINTAINERS was updated. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31tools/nolibc: make sys_mmap() automatically use the right __NR_mmap definitionWilly Tarreau
__NR_mmap2 was used for i386 but it's also needed for other archs such as RISCV32 or ARM. Let's decide to use it based on the __NR_mmap2 definition as it's not defined on other archs. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31tools/nolibc: fix build warning in sys_mmap() when my_syscall6 is not definedWilly Tarreau
We return -ENOSYS when there's no syscall6() operation, but we must cast it to void* to avoid a warning. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31tools/nolibc: make argc 32-bit in riscv startup codeWilly Tarreau
The "ld a0, 0(sp)" instruction doesn't build on RISCV32 because that would load a 64-bit value into a 32-bit register. But argc 32-bit, not 64, so we ought to use "lw" here. Tested on both RISCV32 and RISCV64. Cc: Pranith Kumar <bobby.prani@gmail.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31tools/memory-model: Clarify LKMM's limitations in litmus-tests.txtPaul Heidekrüger
As discussed, clarify LKMM not recognizing certain kinds of orderings. In particular, highlight the fact that LKMM might deliberately make weaker guarantees than compilers and architectures. [ paulmck: Fix whitespace issue noted by checkpatch.pl. ] Link: https://lore.kernel.org/all/YpoW1deb%2FQeeszO1@ethstick13.dse.in.tum.de/T/#u Co-developed-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Paul Heidekrüger <paul.heidekrueger@in.tum.de> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Charalampos Mainas <charalampos.mainas@gmail.com> Cc: Pramod Bhatotia <pramod.bhatotia@in.tum.de> Cc: Soham Chakraborty <s.s.chakraborty@tudelft.nl> Cc: Martin Fink <martin.fink@in.tum.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31netfilter: remove nf_conntrack_helper sysctl and modparam togglesPablo Neira Ayuso
__nf_ct_try_assign_helper() remains in place but it now requires a template to configure the helper. A toggle to disable automatic helper assignment was added by: a9006892643a ("netfilter: nf_ct_helper: allow to disable automatic helper assignment") in 2012 to address the issues described in "Secure use of iptables and connection tracking helpers". Automatic conntrack helper assignment was disabled by: 3bb398d925ec ("netfilter: nf_ct_helper: disable automatic helper assignment") back in 2016. This patch removes the sysctl and modparam toggles, users now have to rely on explicit conntrack helper configuration via ruleset. Update tools/testing/selftests/netfilter/nft_conntrack_helper.sh to check that auto-assignment does not happen anymore. Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-30bpftool: Add support for querying cgroup_iter linkHao Luo
Support dumping info of a cgroup_iter link. This includes showing the cgroup's id and the order for walking the cgroup hierarchy. Example output is as follows: > bpftool link show 1: iter prog 2 target_name bpf_map 2: iter prog 3 target_name bpf_prog 3: iter prog 12 target_name cgroup cgroup_id 72 order self_only > bpftool -p link show [{ "id": 1, "type": "iter", "prog_id": 2, "target_name": "bpf_map" },{ "id": 2, "type": "iter", "prog_id": 3, "target_name": "bpf_prog" },{ "id": 3, "type": "iter", "prog_id": 12, "target_name": "cgroup", "cgroup_id": 72, "order": "self_only" } ] Signed-off-by: Hao Luo <haoluo@google.com> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220829231828.1016835-1-haoluo@google.com Signed-off-by: Martin KaFai Lau <martin.lau@linux.dev>
2022-08-30memblock tests: add tests for memblock_trim_memoryRebecca Mckeever
Add tests for memblock_trim_memory() for the following scenarios: - all regions aligned - one unaligned region that is smaller than the alignment - one unaligned region that is unaligned at the base - one unaligned region that is unaligned at the end Reviewed-by: Shaoqin Huang <shaoqin.huang@intel.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/0e5f55154a3b66581e04ba3717978795cbc08a5b.1661578349.git.remckee0@gmail.com
2022-08-30memblock tests: add tests for memblock_*bottom_up functionsRebecca Mckeever
Add simple tests for memblock_set_bottom_up() and memblock_bottom_up(). Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Shaoqin Huang <shaoqin.huang@intel.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/b03701d2faeaf00f7184e4b72903de4e5e939437.1661578349.git.remckee0@gmail.com
2022-08-30memblock tests: update alloc_nid_api to test memblock_alloc_try_nid_rawRebecca Mckeever
Update memblock_alloc_try_nid() tests so that they test either memblock_alloc_try_nid() or memblock_alloc_try_nid_raw() depending on the value of alloc_nid_test_flags. Run through all the existing tests in alloc_nid_api twice: once for memblock_alloc_try_nid() and once for memblock_alloc_try_nid_raw(). When the tests run memblock_alloc_try_nid(), they test that the entire memory region is zero. When the tests run memblock_alloc_try_nid_raw(), they test that the entire memory region is nonzero. The content of the memory region is initialized to nonzero, and we expect it to remain unchanged if running memblock_alloc_try_nid_raw(). Reviewed-by: Shaoqin Huang <shaoqin.huang@intel.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/6fa8938f67872841c10a00afb042947d1d280a04.1661578349.git.remckee0@gmail.com
2022-08-30memblock tests: update alloc_api to test memblock_alloc_rawRebecca Mckeever
Update memblock_alloc() tests so that they test either memblock_alloc() or memblock_alloc_raw() depending on the value of alloc_test_flags. Run through all the existing tests in memblock_alloc_api twice: once for memblock_alloc() and once for memblock_alloc_raw(). When the tests run memblock_alloc(), they test that the entire memory region is zero. When the tests run memblock_alloc_raw(), they test that the entire memory region is nonzero. The content of the memory region is initialized to nonzero, and we expect it to remain unchanged if running memblock_alloc_raw(). Reviewed-by: Shaoqin Huang <shaoqin.huang@intel.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/5a7cfb2f807ee2cb53ee77f9f5c910107b253d6e.1661578349.git.remckee0@gmail.com
2022-08-30memblock tests: add additional tests for basic api and memblock_allocRebecca Mckeever
Add tests for memblock_add(), memblock_reserve(), memblock_remove(), memblock_free(), and memblock_alloc() for the following test scenarios. memblock_add() and memblock_reserve(): - add/reserve a memory block in the gap between two existing memory blocks, and check that the blocks are merged into one region - try to add/reserve memblock regions that extend past PHYS_ADDR_MAX memblock_remove() and memblock_free(): - remove/free a region when it is the only available region + These tests ensure that the first region is overwritten with a "dummy" region when the last remaining region of that type is removed or freed. - remove/free() a region that overlaps with two existing regions of the relevant type - try to remove/free memblock regions that extend past PHYS_ADDR_MAX memblock_alloc(): - try to allocate a region that is larger than the total size of available memory (memblock.memory) Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Shaoqin Huang <shaoqin.huang@intel.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/c23c0393c5b9a53fe7f676996913c629495e9727.1661578349.git.remckee0@gmail.com
2022-08-30memblock tests: add labels to verbose output for generic alloc testsRebecca Mckeever
Generic tests for memblock_alloc*() functions do not use separate functions for testing top-down and bottom-up allocation directions. Therefore, the function name that is displayed in the verbose testing output does not include the allocation direction. Add an additional prefix when running generic tests for memblock_alloc*() functions that indicates which allocation direction is set. The prefix will be displayed when the tests are run in verbose mode. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Shaoqin Huang <shaoqin.huang@intel.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/fb76a42253d2a196a7daea29dd8121a69904f58e.1661578349.git.remckee0@gmail.com
2022-08-30memblock tests: update zeroed memory check for memblock_alloc_* testsRebecca Mckeever
Update the assert in memblock_alloc_try_nid() and memblock_alloc_from() tests that checks whether the memory is cleared so that it checks the entire chunk of allocated memory instead of just the first byte. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Shaoqin Huang <shaoqin.huang@intel.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/24b3271751756100142e65b75284d43b4d30c9b7.1661578349.git.remckee0@gmail.com
2022-08-30memblock tests: update tests to check if memblock_alloc zeroed memoryRebecca Mckeever
Add an assert in memblock_alloc() tests where allocation is expected to occur. The assert checks whether the entire chunk of allocated memory is cleared. The current memblock_alloc() tests do not check whether the allocated memory was zeroed. memblock_alloc() should zero the allocated memory since it is a wrapper for memblock_alloc_try_nid(). Reviewed-by: Shaoqin Huang <shaoqin.huang@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/83ffb941b65074f40eb14552f8bfe5b71fe50abd.1661578349.git.remckee0@gmail.com
2022-08-30memblock tests: update reference to obsolete build option in commentsRebecca Mckeever
The VERBOSE build option was replaced with the --verbose runtime option, but the comments describing the ASSERT_*() macros still refer to the VERBOSE build option. Update these comments so that they refer to the --verbose runtime option. Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/5f8a4c2bde34cc029282c68d47eda982d950f421.1660451025.git.remckee0@gmail.com
2022-08-30memblock tests: add command line help optionRebecca Mckeever
Add a help command line option to the help message. Add the help option to the short and long options so it will be recognized as a valid option. Usage: $ ./main -h Or: $ ./main --help Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/0f3b93a79de78c0da1ca90f74fe35e9a85c7cf93.1660451025.git.remckee0@gmail.com
2022-08-29selftests/bpf: Fix connect4_prog tcp/socket header type conflictJames Hilliard
There is a potential for us to hit a type conflict when including netinet/tcp.h and sys/socket.h, we can replace both of these includes with linux/tcp.h and bpf_tcp_helpers.h to avoid this conflict. Fixes errors like the below when compiling with gcc BPF backend: In file included from /usr/include/netinet/tcp.h:91, from progs/connect4_prog.c:11: /home/buildroot/opt/cross/lib/gcc/bpf/13.0.0/include/stdint.h:34:23: error: conflicting types for 'int8_t'; have 'char' 34 | typedef __INT8_TYPE__ int8_t; | ^~~~~~ In file included from /usr/include/x86_64-linux-gnu/sys/types.h:155, from /usr/include/x86_64-linux-gnu/bits/socket.h:29, from /usr/include/x86_64-linux-gnu/sys/socket.h:33, from progs/connect4_prog.c:10: /usr/include/x86_64-linux-gnu/bits/stdint-intn.h:24:18: note: previous declaration of 'int8_t' with type 'int8_t' {aka 'signed char'} 24 | typedef __int8_t int8_t; | ^~~~~~ /home/buildroot/opt/cross/lib/gcc/bpf/13.0.0/include/stdint.h:43:24: error: conflicting types for 'int64_t'; have 'long int' 43 | typedef __INT64_TYPE__ int64_t; | ^~~~~~~ /usr/include/x86_64-linux-gnu/bits/stdint-intn.h:27:19: note: previous declaration of 'int64_t' with type 'int64_t' {aka 'long long int'} 27 | typedef __int64_t int64_t; | ^~~~~~~ Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220829154710.3870139-1-james.hilliard1@gmail.com
2022-08-29selftests/bpf: Fix bind{4,6} tcp/socket header type conflictJames Hilliard
There is a potential for us to hit a type conflict when including netinet/tcp.h with sys/socket.h, we can remove these as they are not actually needed. Fixes errors like the below when compiling with gcc BPF backend: In file included from /usr/include/netinet/tcp.h:91, from progs/bind4_prog.c:10: /home/buildroot/opt/cross/lib/gcc/bpf/13.0.0/include/stdint.h:34:23: error: conflicting types for 'int8_t'; have 'char' 34 | typedef __INT8_TYPE__ int8_t; | ^~~~~~ In file included from /usr/include/x86_64-linux-gnu/sys/types.h:155, from /usr/include/x86_64-linux-gnu/bits/socket.h:29, from /usr/include/x86_64-linux-gnu/sys/socket.h:33, from progs/bind4_prog.c:9: /usr/include/x86_64-linux-gnu/bits/stdint-intn.h:24:18: note: previous declaration of 'int8_t' with type 'int8_t' {aka 'signed char'} 24 | typedef __int8_t int8_t; | ^~~~~~ /home/buildroot/opt/cross/lib/gcc/bpf/13.0.0/include/stdint.h:43:24: error: conflicting types for 'int64_t'; have 'long int' 43 | typedef __INT64_TYPE__ int64_t; | ^~~~~~~ /usr/include/x86_64-linux-gnu/bits/stdint-intn.h:27:19: note: previous declaration of 'int64_t' with type 'int64_t' {aka 'long long int'} 27 | typedef __int64_t int64_t; | ^~~~~~~ make: *** [Makefile:537: /home/buildroot/bpf-next/tools/testing/selftests/bpf/bpf_gcc/bind4_prog.o] Error 1 Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220826052925.980431-1-james.hilliard1@gmail.com