summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2024-06-13tools/perf: Fix timing issue with parallel threads in perf bench ↵Athira Rajeev
wake-up-parallel perf bench futex fails as below and hangs intermittently when attempted to run on on a powerpc system: ./perf bench futex wake-parallel Running 'futex/wake-parallel' benchmark: Run summary [PID 88588]: blocking on 640 threads (at [private] futex 0x10464b8c), 640 threads waking up 1 at a time. [Run 1]: Avg per-thread latency (waking 1/640 threads) in 0.1309 ms (+-53.27%) [Run 2]: Avg per-thread latency (waking 1/640 threads) in 0.0120 ms (+-31.16%) [Run 3]: Avg per-thread latency (waking 1/640 threads) in 0.1474 ms (+-92.47%) [Run 4]: Avg per-thread latency (waking 1/640 threads) in 0.2883 ms (+-67.75%) [Run 5]: Avg per-thread latency (waking 1/640 threads) in 0.4108 ms (+-39.60%) [Run 6]: Avg per-thread latency (waking 1/640 threads) in 0.7843 ms (+-78.98%) perf: couldn't wakeup all tasks (0/1) perf: couldn't wakeup all tasks (0/1) perf: couldn't wakeup all tasks (0/1) perf: couldn't wakeup all tasks (0/1) perf: couldn't wakeup all tasks (0/1) perf: couldn't wakeup all tasks (0/1) In the system, where perf bench wake-up-parallel is has system configuration of 640 cpus. After debugging, this turned out to be a timing issue. The benchmark creates threads equal to number of cpus and issues a futex_wait. Then it does a usleep for .1 second before initiating futex_wake. In system configuration with more threads, the usleep time is not enough. Patch changes the usleep from 100000 to 200000 With the patch, ran multiple iterations and there were no issues further seen Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Disha Goel <disgoel@linux.ibm.com> Cc: akanksha@linux.ibm.com Cc: kjain@linux.ibm.com Cc: maddy@linux.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240607044354.82225-3-atrajeev@linux.vnet.ibm.com
2024-06-13tools/perf: Fix perf bench epoll to enable the run when some CPU's are offlineAthira Rajeev
Perf bench epoll fails as below when attempted to run on on a powerpc system: ./perf bench epoll wait Running 'epoll/wait' benchmark: Run summary [PID 627653]: 79 threads monitoring on 64 file-descriptors for 8 secs. perf: pthread_create: No such file or directory In the setup where this perf bench was ran, difference was that partition had 640 CPU's, but not all CPUs were online. 80 CPUs were online. While creating threads and using epoll_wait , code sets the affinity using cpumask. The cpumask size used is 80 which is picked from "nrcpus = perf_cpu_map__nr(cpu)". Here the benchmark reports fail while setting affinity for cpu number which is greater than 80 or higher, because it attempts to set a bit position which is not allocated on the cpumask. Fix this by changing the size of cpumask to number of possible cpus and not the number of online cpus. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Disha Goel <disgoel@linux.ibm.com> Cc: akanksha@linux.ibm.com Cc: kjain@linux.ibm.com Cc: maddy@linux.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240607044354.82225-2-atrajeev@linux.vnet.ibm.com
2024-06-13tools/perf: Fix perf bench futex to enable the run when some CPU's are offlineAthira Rajeev
Perf bench futex fails as below when attempted to run on on a powerpc system: ./perf bench futex all Running futex/hash benchmark... Run summary [PID 626307]: 80 threads, each operating on 1024 [private] futexes for 10 secs. perf: pthread_create: No such file or directory In the setup where this perf bench was ran, difference was that partition had 640 CPU's, but not all CPUs were online. 80 CPUs were online. While blocking the threads with futex_wait, code sets the affinity using cpumask. The cpumask size used is 80 which is picked from "nrcpus = perf_cpu_map__nr(cpu)". Here the benchmark reports fail while setting affinity for cpu number which is greater than 80 or higher, because it attempts to set a bit position which is not allocated on the cpumask. Fix this by changing the size of cpumask to number of possible cpus and not the number of online cpus. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Disha Goel <disgoel@linux.ibm.com> Cc: akanksha@linux.ibm.com Cc: kjain@linux.ibm.com Cc: maddy@linux.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240607044354.82225-1-atrajeev@linux.vnet.ibm.com
2024-06-13perf record: Ensure space for lost samplesIan Rogers
Previous allocation didn't account for sample ID written after the lost samples event. Switch from malloc/free to a stack allocation. Reported-by: Milian Wolff <milian.wolff@kdab.com> Closes: https://lore.kernel.org/linux-perf-users/23879991.0LEYPuXRzz@milian-workstation/ Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240611050626.1223155-1-irogers@google.com
2024-06-13selftests: bpf: add testmod kfunc for nullable paramsVadim Fedorenko
Add special test to be sure that only __nullable BTF params can be replaced by NULL. This patch adds fake kfuncs in bpf_testmod to properly test different params. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Link: https://lore.kernel.org/r/20240613211817.1551967-6-vadfed@meta.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-13selftests: bpf: crypto: adjust bench to use nullable IVVadim Fedorenko
The bench shows some improvements, around 4% faster on decrypt. Before: Benchmark 'crypto-decrypt' started. Iter 0 (325.719us): hits 5.105M/s ( 5.105M/prod), drops 0.000M/s, total operations 5.105M/s Iter 1 (-17.295us): hits 5.224M/s ( 5.224M/prod), drops 0.000M/s, total operations 5.224M/s Iter 2 ( 5.504us): hits 4.630M/s ( 4.630M/prod), drops 0.000M/s, total operations 4.630M/s Iter 3 ( 9.239us): hits 5.148M/s ( 5.148M/prod), drops 0.000M/s, total operations 5.148M/s Iter 4 ( 37.885us): hits 5.198M/s ( 5.198M/prod), drops 0.000M/s, total operations 5.198M/s Iter 5 (-53.282us): hits 5.167M/s ( 5.167M/prod), drops 0.000M/s, total operations 5.167M/s Iter 6 (-17.809us): hits 5.186M/s ( 5.186M/prod), drops 0.000M/s, total operations 5.186M/s Summary: hits 5.092 ± 0.228M/s ( 5.092M/prod), drops 0.000 ±0.000M/s, total operations 5.092 ± 0.228M/s After: Benchmark 'crypto-decrypt' started. Iter 0 (268.912us): hits 5.312M/s ( 5.312M/prod), drops 0.000M/s, total operations 5.312M/s Iter 1 (124.869us): hits 5.354M/s ( 5.354M/prod), drops 0.000M/s, total operations 5.354M/s Iter 2 (-36.801us): hits 5.334M/s ( 5.334M/prod), drops 0.000M/s, total operations 5.334M/s Iter 3 (254.628us): hits 5.334M/s ( 5.334M/prod), drops 0.000M/s, total operations 5.334M/s Iter 4 (-77.691us): hits 5.275M/s ( 5.275M/prod), drops 0.000M/s, total operations 5.275M/s Iter 5 (-164.510us): hits 5.313M/s ( 5.313M/prod), drops 0.000M/s, total operations 5.313M/s Iter 6 (-81.376us): hits 5.346M/s ( 5.346M/prod), drops 0.000M/s, total operations 5.346M/s Summary: hits 5.326 ± 0.029M/s ( 5.326M/prod), drops 0.000 ±0.000M/s, total operations 5.326 ± 0.029M/s Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Link: https://lore.kernel.org/r/20240613211817.1551967-5-vadfed@meta.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-13selftests: bpf: crypto: use NULL instead of 0-sized dynptrVadim Fedorenko
Adjust selftests to use nullable option for state and IV arg. Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Link: https://lore.kernel.org/r/20240613211817.1551967-4-vadfed@meta.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts, no adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-13bpf: selftests: Do not use generated kfunc prototypes for arena progsDaniel Xu
When selftests are built with a new enough clang, the arena selftests opt-in to use LLVM address_space attribute annotations for arena pointers. These annotations are not emitted by kfunc prototype generation. This causes compilation errors when clang sees conflicting prototypes. Fix by opting arena selftests out of using generated kfunc prototypes. Fixes: 770abbb5a25a ("bpftool: Support dumping kfunc prototypes from BTF") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202406131810.c1B8hTm8-lkp@intel.com/ Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/fc59a617439ceea9ad8dfbb4786843c2169496ae.1718295425.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-13selftests/bpf: Add test coverage for reg_set_min_max handlingDaniel Borkmann
Add a test case for the jmp32/k fix to ensure selftests have coverage. Before fix: # ./vmtest.sh -- ./test_progs -t verifier_or_jmp32_k [...] ./test_progs -t verifier_or_jmp32_k tester_init:PASS:tester_log_buf 0 nsec process_subtest:PASS:obj_open_mem 0 nsec process_subtest:PASS:specs_alloc 0 nsec run_subtest:PASS:obj_open_mem 0 nsec run_subtest:FAIL:unexpected_load_success unexpected success: 0 #492/1 verifier_or_jmp32_k/or_jmp32_k: bit ops + branch on unknown value:FAIL #492 verifier_or_jmp32_k:FAIL Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED After fix: # ./vmtest.sh -- ./test_progs -t verifier_or_jmp32_k [...] ./test_progs -t verifier_or_jmp32_k #492/1 verifier_or_jmp32_k/or_jmp32_k: bit ops + branch on unknown value:OK #492 verifier_or_jmp32_k:OK Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20240613115310.25383-3-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-13Merge tag 'net-6.10-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth and netfilter. Slim pickings this time, probably a combination of summer, DevConf.cz, and the end of first half of the year at corporations. Current release - regressions: - Revert "igc: fix a log entry using uninitialized netdev", it traded lack of netdev name in a printk() for a crash Previous releases - regressions: - Bluetooth: L2CAP: fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ - geneve: fix incorrectly setting lengths of inner headers in the skb, confusing the drivers and causing mangled packets - sched: initialize noop_qdisc owner to avoid false-positive recursion detection (recursing on CPU 0), which bubbles up to user space as a sendmsg() error, while noop_qdisc should silently drop - netdevsim: fix backwards compatibility in nsim_get_iflink() Previous releases - always broken: - netfilter: ipset: fix race between namespace cleanup and gc in the list:set type" * tag 'net-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits) bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send() af_unix: Read with MSG_PEEK loops if the first unread byte is OOB bnxt_en: Cap the size of HWRM_PORT_PHY_QCFG forwarded response gve: Clear napi->skb before dev_kfree_skb_any() ionic: fix use after netif_napi_del() Revert "igc: fix a log entry using uninitialized netdev" net: bridge: mst: fix suspicious rcu usage in br_mst_set_state net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state net/ipv6: Fix the RT cache flush via sysctl using a previous delay net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters gve: ignore nonrelevant GSO type bits when processing TSO headers net: pse-pd: Use EOPNOTSUPP error code instead of ENOTSUPP netfilter: Use flowlabel flow key when re-routing mangled packets netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type netfilter: nft_inner: validate mandatory meta and payload tcp: use signed arithmetic in tcp_rtx_probe0_timed_out() mailmap: map Geliang's new email address mptcp: pm: update add_addr counters after connect mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID mptcp: ensure snd_una is properly initialized on connect ...
2024-06-13selftests/bpf: Validate CHECKSUM_COMPLETE optionVadim Fedorenko
Adjust skb program test to run with checksum validation. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240606145851.229116-2-vadfed@meta.com
2024-06-13bpf: Add CHECKSUM_COMPLETE to bpf test progsVadim Fedorenko
Add special flag to validate that TC BPF program properly updates checksum information in skb. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240606145851.229116-1-vadfed@meta.com
2024-06-13kselftest/arm64: Fix a couple of spelling mistakesColin Ian King
There are two spelling mistakes in some error messages. Fix them. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240613073429.1797451-1-colin.i.king@gmail.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12selftests: forwarding: router_mpath_hash: Add a new selftestPetr Machata
Add a selftest that exercises the sysctl added in the previous patches. Test that set/get works as expected; that across seeds we eventually hit all NHs (test_mpath_seed_*); and that a given seed keeps hitting the same NHs even across seed changes (test_mpath_seed_stability_*). Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240607151357.421181-6-petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-12selftests: forwarding: lib: Split sysctl_save() out of sysctl_set()Petr Machata
In order to be able to save the current value of a sysctl without changing it, split the relevant bit out of sysctl_set() into a new helper. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240607151357.421181-5-petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-12bpftool: Support dumping kfunc prototypes from BTFDaniel Xu
This patch enables dumping kfunc prototypes from bpftool. This is useful b/c with this patch, end users will no longer have to manually define kfunc prototypes. For the kernel tree, this also means we can optionally drop kfunc prototypes from: tools/testing/selftests/bpf/bpf_kfuncs.h tools/testing/selftests/bpf/bpf_experimental.h Example usage: $ make PAHOLE=/home/dxu/dev/pahole/build/pahole -j30 vmlinux $ ./tools/bpf/bpftool/bpftool btf dump file ./vmlinux format c | rg "__ksym;" | head -3 extern void cgroup_rstat_updated(struct cgroup *cgrp, int cpu) __weak __ksym; extern void cgroup_rstat_flush(struct cgroup *cgrp) __weak __ksym; extern struct bpf_key *bpf_lookup_user_key(u32 serial, u64 flags) __weak __ksym; Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/bf6c08f9263c4bd9d10a717de95199d766a13f61.1718207789.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-12bpf: selftests: xfrm: Opt out of using generated kfunc prototypesDaniel Xu
The xfrm_info selftest locally defines an aliased type such that folks with CONFIG_XFRM_INTERFACE=m/n configs can still build the selftests. See commit aa67961f3243 ("selftests/bpf: Allow building bpf tests with CONFIG_XFRM_INTERFACE=[m|n]"). Thus, it is simpler if this selftest opts out of using enerated kfunc prototypes. The preprocessor macro this commit uses will be introduced in the final commit. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/afe0bb1c50487f52542cdd5230c4aef9e36ce250.1718207789.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-12bpf: selftests: nf: Opt out of using generated kfunc prototypesDaniel Xu
The bpf-nf selftests play various games with aliased types such that folks with CONFIG_NF_CONNTRACK=m/n configs can still build the selftests. See commits: 1058b6a78db2 ("selftests/bpf: Do not fail build if CONFIG_NF_CONNTRACK=m/n") 92afc5329a5b ("selftests/bpf: Fix build errors if CONFIG_NF_CONNTRACK=m") Thus, it is simpler if these selftests opt out of using generated kfunc prototypes. The preprocessor macro this commit uses will be introduced in the final commit. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/044a5b10cb3abd0d71cb1c818ee0bfc4a2239332.1718207789.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-12bpf: treewide: Align kfunc signatures to prog point-of-viewDaniel Xu
Previously, kfunc declarations in bpf_kfuncs.h (and others) used "user facing" types for kfuncs prototypes while the actual kfunc definitions used "kernel facing" types. More specifically: bpf_dynptr vs bpf_dynptr_kern, __sk_buff vs sk_buff, and xdp_md vs xdp_buff. It wasn't an issue before, as the verifier allows aliased types. However, since we are now generating kfunc prototypes in vmlinux.h (in addition to keeping bpf_kfuncs.h around), this conflict creates compilation errors. Fix this conflict by using "user facing" types in kfunc definitions. This results in more casts, but otherwise has no additional runtime cost. Note, similar to 5b268d1ebcdc ("bpf: Have bpf_rdonly_cast() take a const pointer"), we also make kfuncs take const arguments where appropriate in order to make the kfunc more permissive. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/b58346a63a0e66bc9b7504da751b526b0b189a67.1718207789.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-12bpf: selftests: Namespace struct_opt callbacks in bpf_dctcpDaniel Xu
With generated kfunc prototypes, the existing callback names will conflict. Fix by namespacing with a bpf_ prefix. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/efe7aadad8a054e5aeeba94b1d2e4502eee09d7a.1718207789.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-12bpf: selftests: Fix bpf_map_sum_elem_count() kfunc prototypeDaniel Xu
The prototype in progs/map_percpu_stats.c is not in line with how the actual kfuncs are defined in kernel/bpf/map_iter.c. This causes compilation errors when kfunc prototypes are generated from BTF. Fix by aligning with actual kfunc definitions. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/0497e11a71472dcb71ada7c90ad691523ae87c3b.1718207789.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-12bpf: selftests: Fix bpf_cpumask_first_zero() kfunc prototypeDaniel Xu
The prototype in progs/nested_trust_common.h is not in line with how the actual kfuncs are defined in kernel/bpf/cpumask.c. This causes compilation errors when kfunc prototypes are generated from BTF. Fix by aligning with actual kfunc definitions. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/437936a4e554b02e04566dd6e3f0a5d08370cc8c.1718207789.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-12bpf: selftests: Fix fentry test kfunc prototypesDaniel Xu
Some prototypes in progs/get_func_ip_test.c were not in line with how the actual kfuncs are defined in net/bpf/test_run.c. This causes compilation errors when kfunc prototypes are generated from BTF. Fix by aligning with actual kfunc definitions. Also remove two unused prototypes. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/1e68870e7626b7b9c6420e65076b307fc404a2f0.1718207789.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-12bpf: selftests: Fix bpf_iter_task_vma_new() prototypeDaniel Xu
bpf_iter_task_vma_new() is defined as taking a u64 as its 3rd argument. u64 is a unsigned long long. bpf_experimental.h was defining the prototype as unsigned long. Fix by using __u64. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/fab4509bfee914f539166a91c3ff41e949f3df30.1718207789.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-12kselftest/arm64: Fix redundancy of a testcaseDev Jain
Currently, we are writing the same value as we read into the TLS register, hence we cannot confirm update of the register, making the testcase "verify_tpidr_one" redundant. Fix this. Signed-off-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240605115448.640717-1-dev.jain@arm.com [catalin.marinas@arm.com: remove the increment style change] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12kselftest/arm64: Include kernel mode NEON in fp-stressMark Brown
Currently fp-stress only covers userspace use of floating point, it does not cover any kernel mode uses. Since currently kernel mode floating point usage can't be preempted and there are explicit preemption points in the existing implementations this isn't so important for fp-stress but when we readd preemption it will be good to try to exercise it. When the arm64 accelerated crypto operations are implemented we can relatively straightforwardly trigger kernel mode floating point usage by using the crypto userspace API to hash data, using the splice() support in an effort to minimise copying. We use /proc/crypto to check which accelerated implementations are available, picking the first symmetric hash we find. We run the kernel mode test unconditionally, replacing the second copy of the FPSIMD testcase for systems with FPSIMD only. If we don't think there are any suitable kernel mode implementations we fall back to running another copy of fpsimd-stress. There are a number issues with this approach, we don't actually verify that we are using an accelerated (or even CPU) implementation of the algorithm being tested and even with attempting to use splice() to minimise copying there are sizing limits on how much data gets spliced at once. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240521-arm64-fp-stress-kernel-v1-1-e38f107baad4@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12tools/x86/kcpuid: Add missing dir via MakefileChristian Heusel
So far the Makefile just installed the csv into $(HWDATADIR)/cpuid.csv, which made it unaware about $DESTDIR. Add $DESTDIR to the install command and while at it also create the directory, should it not exist already. This eases the packaging of kcpuid and allows i.e. for the install on Arch to look like this: $ make BINDIR=/usr/bin DESTDIR="$pkgdir" -C tools/arch/x86/kcpuid install Some background on DESTDIR: DESTDIR is commonly used in packaging for staged installs (regardless of the used package manager): https://www.gnu.org/prep/standards/html_node/DESTDIR.html So the package is built and installed into a directory which the package manager later picks up and creates some archive from it. What is specific to Arch Linux here is only the usage of $pkgdir in the example, DESTDIR itself is widely used. [ bp: Extend the commit message with Christian's info on DESTDIR as a GNU coding standards thing. ] Signed-off-by: Christian Heusel <christian@heusel.eu> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240531111757.719528-2-christian@heusel.eu
2024-06-11selftests: mptcp: lib: use wait_local_port_listen helperGeliang Tang
This patch includes net_helper.sh into mptcp_lib.sh, uses the helper wait_local_port_listen() defined in it to implement the similar mptcp helper. This can drop some duplicate code. It looks like this helper from net_helper.sh was originally coming from MPTCP, but MPTCP selftests have not been updated to use it from this shared place. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-next-20240607-selftests-mptcp-net-lib-v1-6-e36986faac94@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-11selftests: mptcp: lib: use setup/cleanup_ns helpersGeliang Tang
This patch includes lib.sh into mptcp_lib.sh, uses setup_ns helper defined in lib.sh to set up namespaces in mptcp_lib_ns_init(), and uses cleanup_ns to delete namespaces in mptcp_lib_ns_exit(). Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-next-20240607-selftests-mptcp-net-lib-v1-5-e36986faac94@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-11selftests: net: lib: remove 'ns' var in setup_nsGeliang Tang
The helper setup_ns() doesn't work when a net namespace named "ns" is passed to it. For example, in net/mptcp/diag.sh, the name of the namespace is "ns". If "setup_ns ns" is used in it, diag.sh fails with errors: Invalid netns name "./mptcp_connect" Cannot open network namespace "10000": No such file or directory Cannot open network namespace "10000": No such file or directory That is because "ns" is also a local variable in setup_ns, and it will not set the value for the global variable that has been giving in argument. To solve this, we could rename the variable, but it sounds better to drop it, as we can resolve the name using the variable passed in argument instead. The other local variables -- "ns_list" and "ns_name" -- are more unlikely to conflict with existing global variables. They don't seem to be currently used in any other net selftests. Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-next-20240607-selftests-mptcp-net-lib-v1-4-e36986faac94@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-11selftests: net: lib: do not set ns var as readonlyMatthieu Baerts (NGI0)
It sounds good to mark the global netns variable as 'readonly', but Bash doesn't allow the creation of local variables with the same name. Because it looks like 'readonly' is mainly used here to check if a netns with that name has already been set, it sounds fine to check if a variable with this name has already been set instead. By doing that, we avoid having to modify helpers from MPTCP selftests using the same variable name as the one used to store the created netns name. While at it, also avoid an unnecessary call to 'eval' to set a local variable. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-next-20240607-selftests-mptcp-net-lib-v1-3-e36986faac94@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-11selftests: net: lib: remove ns from list after clean-upMatthieu Baerts (NGI0)
Instead of only appending items to the list, removing them when the netns has been deleted. By doing that, we can make sure 'cleanup_all_ns()' is not trying to remove already deleted netns. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-next-20240607-selftests-mptcp-net-lib-v1-2-e36986faac94@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-11selftests: net: lib: ignore possible errorsMatthieu Baerts (NGI0)
No need to disable errexit temporary, simply ignore the only possible and not handled error. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-next-20240607-selftests-mptcp-net-lib-v1-1-e36986faac94@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-12selftests/bpf: Add uretprobe shadow stack testJiri Olsa
Adding uretprobe shadow stack test that runs all existing uretprobe tests with shadow stack enabled if it's available. Link: https://lore.kernel.org/all/20240611112158.40795-9-jolsa@kernel.org/ Acked-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-06-12selftests/bpf: Add uretprobe syscall call from user space testJiri Olsa
Adding test to verify that when called from outside of the trampoline provided by kernel, the uretprobe syscall will cause calling process to receive SIGILL signal and the attached bpf program is not executed. Link: https://lore.kernel.org/all/20240611112158.40795-8-jolsa@kernel.org/ Acked-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-06-12selftests/bpf: Add uretprobe syscall test for regs changesJiri Olsa
Adding test that creates uprobe consumer on uretprobe which changes some of the registers. Making sure the changed registers are propagated to the user space when the ureptobe syscall trampoline is used on x86_64. To be able to do this, adding support to bpf_testmod to create uprobe via new attribute file: /sys/kernel/bpf_testmod_uprobe This file is expecting file offset and creates related uprobe on current process exe file and removes existing uprobe if offset is 0. The can be only single uprobe at any time. The uprobe has specific consumer that changes registers used in ureprobe syscall trampoline and which are later checked in the test. Link: https://lore.kernel.org/all/20240611112158.40795-7-jolsa@kernel.org/ Acked-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-06-12selftests/bpf: Add uretprobe syscall test for regs integrityJiri Olsa
Add uretprobe syscall test that compares register values before and after the uretprobe is hit. It also compares the register values seen from attached bpf program. Link: https://lore.kernel.org/all/20240611112158.40795-6-jolsa@kernel.org/ Acked-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-06-12selftests/x86: Add return uprobe shadow stack testJiri Olsa
Adding return uprobe test for shadow stack and making sure it's working properly. Borrowed some of the code from bpf selftests. Link: https://lore.kernel.org/all/20240611112158.40795-5-jolsa@kernel.org/ Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-06-11selftests/fchmodat2: fix clang build failure due to -static-libasanJohn Hubbard
gcc requires -static-libasan in order to ensure that Address Sanitizer's library is the first one loaded. However, this leads to build failures on clang, when building via: make LLVM=1 -C tools/testing/selftests However, clang already does the right thing by default: it statically links the Address Sanitizer if -fsanitize is specified. Therefore, simply omit -static-libasan for clang builds. And leave behind a comment, because the whole reason for static linking might not be obvious. Cc: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-11selftests/openat2: fix clang build failures: -static-libasan, LOCAL_HDRSJohn Hubbard
When building with clang via: make LLVM=1 -C tools/testing/selftests two distinct failures occur: 1) gcc requires -static-libasan in order to ensure that Address Sanitizer's library is the first one loaded. However, this leads to build failures on clang, when building via: make LLVM=1 -C tools/testing/selftests However, clang already does the right thing by default: it statically links the Address Sanitizer if -fsanitize is specified. Therefore, fix this by simply omitting -static-libasan for clang builds. And leave behind a comment, because the whole reason for static linking might not be obvious. 2) clang won't accept invocations of this form, but gcc will: $(CC) file1.c header2.h Fix this by using selftests/lib.mk facilities for tracking local header file dependencies: add them to LOCAL_HDRS, leaving only the .c files to be passed to the compiler. Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-11bpftool: Query only cgroup-related attach typesKenta Tada
When CONFIG_NETKIT=y, bpftool-cgroup shows error even if the cgroup's path is correct: $ bpftool cgroup tree /sys/fs/cgroup CgroupPath ID AttachType AttachFlags Name Error: can't query bpf programs attached to /sys/fs/cgroup: No such device or address >From strace and kernel tracing, I found netkit returned ENXIO and this command failed. I think this AttachType(BPF_NETKIT_PRIMARY) is not relevant to cgroup. bpftool-cgroup should query just only cgroup-related attach types. v2->v3: - removed an unnecessary check v1->v2: - used an array of cgroup attach types Signed-off-by: Kenta Tada <tadakentaso@gmail.com> Reviewed-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/r/20240607111704.6716-1-tadakentaso@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-11selftests: seccomp: fix format-zero-length warningsAmer Al Shanawany
fix the following errors by using string format specifier and an empty parameter: seccomp_benchmark.c:197:24: warning: zero-length gnu_printf format string [-Wformat-zero-length] 197 | ksft_print_msg(""); | ^~ seccomp_benchmark.c:202:24: warning: zero-length gnu_printf format string [-Wformat-zero-length] 202 | ksft_print_msg(""); | ^~ seccomp_benchmark.c:204:24: warning: zero-length gnu_printf format string [-Wformat-zero-length] 204 | ksft_print_msg(""); | ^~ Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312260235.Uj5ug8K9-lkp@intel.com/ Suggested-by: Kees Cook <kees@kernel.org> Signed-off-by: Amer Al Shanawany <amer.shanawany@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-11selftests: filesystems: fix warn_unused_result build warningsAmer Al Shanawany
Fix the following warnings by adding return check and error messages. statmount_test.c: In function ‘cleanup_namespace’: statmount_test.c:128:9: warning: ignoring return value of ‘fchdir’ declared with attribute ‘warn_unused_result’ [-Wunused-result] 128 | fchdir(orig_root); | ^~~~~~~~~~~~~~~~~ statmount_test.c:129:9: warning: ignoring return value of ‘chroot’ declared with attribute ‘warn_unused_result’ [-Wunused-result] 129 | chroot("."); | ^~~~~~~~~~~ Signed-off-by: Amer Al Shanawany <amer.shanawany@gmail.com> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-11x86/alternatives: Add nested alternatives macrosPeter Zijlstra
Instead of making increasingly complicated ALTERNATIVE_n() implementations, use a nested alternative expression. The only difference between: ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2) and ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1), newinst2, flag2) is that the outer alternative can add additional padding when the inner alternative is the shorter one, which then results in alt_instr::instrlen being inconsistent. However, this is easily remedied since the alt_instr entries will be consecutive and it is trivial to compute the max(alt_instr::instrlen) at runtime while patching. Specifically, after this the ALTERNATIVE_2 macro, after CPP expansion (and manual layout), looks like this: .macro ALTERNATIVE_2 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2 740: 740: \oldinstr ; 741: .skip -(((744f-743f)-(741b-740b)) > 0) * ((744f-743f)-(741b-740b)),0x90 ; 742: .pushsection .altinstructions,"a" ; altinstr_entry 740b,743f,\ft_flags1,742b-740b,744f-743f ; .popsection ; .pushsection .altinstr_replacement,"ax" ; 743: \newinstr1 ; 744: .popsection ; ; 741: .skip -(((744f-743f)-(741b-740b)) > 0) * ((744f-743f)-(741b-740b)),0x90 ; 742: .pushsection .altinstructions,"a" ; altinstr_entry 740b,743f,\ft_flags2,742b-740b,744f-743f ; .popsection ; .pushsection .altinstr_replacement,"ax" ; 743: \newinstr2 ; 744: .popsection ; .endm The only label that is ambiguous is 740, however they all reference the same spot, so that doesn't matter. NOTE: obviously only @oldinstr may be an alternative; making @newinstr an alternative would mean patching .altinstr_replacement which very likely isn't what is intended, also the labels will be confused in that case. [ bp: Debug an issue where it would match the wrong two insns and and consider them nested due to the same signed offsets in the .alternative section and use instr_va() to compare the full virtual addresses instead. - Use new labels to denote that the new, nested alternatives are being used when staring at preprocessed output. - Use the %c constraint everywhere instead of %P and document the difference for future reference. ] Signed-off-by: Peter Zijlstra <peterz@infradead.org> Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230628104952.GA2439977@hirez.programming.kicks-ass.net
2024-06-10mptcp: pm: update add_addr counters after connectYonglongLi
The creation of new subflows can fail for different reasons. If no subflow have been created using the received ADD_ADDR, the related counters should not be updated, otherwise they will never be decremented for events related to this ID later on. For the moment, the number of accepted ADD_ADDR is only decremented upon the reception of a related RM_ADDR, and only if the remote address ID is currently being used by at least one subflow. In other words, if no subflow can be created with the received address, the counter will not be decremented. In this case, it is then important not to increment pm.add_addr_accepted counter, and not to modify pm.accept_addr bit. Note that this patch does not modify the behaviour in case of failures later on, e.g. if the MP Join is dropped or rejected. The "remove invalid addresses" MP Join subtest has been modified to validate this case. The broadcast IP address is added before the "valid" address that will be used to successfully create a subflow, and the limit is decreased by one: without this patch, it was not possible to create the last subflow, because: - the broadcast address would have been accepted even if it was not usable: the creation of a subflow to this address results in an error, - the limit of 2 accepted ADD_ADDR would have then been reached. Fixes: 01cacb00b35c ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-3-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-10mptcp: pm: inc RmAddr MIB counter once per RM_ADDR IDYonglongLi
The RmAddr MIB counter is supposed to be incremented once when a valid RM_ADDR has been received. Before this patch, it could have been incremented as many times as the number of subflows connected to the linked address ID, so it could have been 0, 1 or more than 1. The "RmSubflow" is incremented after a local operation. In this case, it is normal to tied it with the number of subflows that have been actually removed. The "remove invalid addresses" MP Join subtest has been modified to validate this case. A broadcast IP address is now used instead: the client will not be able to create a subflow to this address. The consequence is that when receiving the RM_ADDR with the ID attached to this broadcast IP address, no subflow linked to this ID will be found. Fixes: 7a7e52e38a40 ("mptcp: add RM_ADDR related mibs") Cc: stable@vger.kernel.org Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-2-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-10Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-06-06 We've added 54 non-merge commits during the last 10 day(s) which contain a total of 50 files changed, 1887 insertions(+), 527 deletions(-). The main changes are: 1) Add a user space notification mechanism via epoll when a struct_ops object is getting detached/unregistered, from Kui-Feng Lee. 2) Big batch of BPF selftest refactoring for sockmap and BPF congctl tests, from Geliang Tang. 3) Add BTF field (type and string fields, right now) iterator support to libbpf instead of using existing callback-based approaches, from Andrii Nakryiko. 4) Extend BPF selftests for the latter with a new btf_field_iter selftest, from Alan Maguire. 5) Add new kfuncs for a generic, open-coded bits iterator, from Yafang Shao. 6) Fix BPF selftests' kallsyms_find() helper under kernels configured with CONFIG_LTO_CLANG_THIN, from Yonghong Song. 7) Remove a bunch of unused structs in BPF selftests, from David Alan Gilbert. 8) Convert test_sockmap section names into names understood by libbpf so it can deduce program type and attach type, from Jakub Sitnicki. 9) Extend libbpf with the ability to configure log verbosity via LIBBPF_LOG_LEVEL environment variable, from Mykyta Yatsenko. 10) Fix BPF selftests with regards to bpf_cookie and find_vma flakiness in nested VMs, from Song Liu. 11) Extend riscv32/64 JITs to introduce shift/add helpers to generate Zba optimization, from Xiao Wang. 12) Enable BPF programs to declare arrays and struct fields with kptr, bpf_rb_root, and bpf_list_head, from Kui-Feng Lee. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits) selftests/bpf: Drop useless arguments of do_test in bpf_tcp_ca selftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca selftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca selftests/bpf: Add start_test helper in bpf_tcp_ca selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca libbpf: Auto-attach struct_ops BPF maps in BPF skeleton selftests/bpf: Add btf_field_iter selftests selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT libbpf: Remove callback-based type/string BTF field visitor helpers bpftool: Use BTF field iterator in btfgen libbpf: Make use of BTF field iterator in BTF handling code libbpf: Make use of BTF field iterator in BPF linker code libbpf: Add BTF field iterator selftests/bpf: Ignore .llvm.<hash> suffix in kallsyms_find() selftests/bpf: Fix bpf_cookie and find_vma in nested VM selftests/bpf: Test global bpf_list_head arrays. selftests/bpf: Test global bpf_rb_root arrays and fields in nested struct types. selftests/bpf: Test kptr arrays and kptrs in nested struct fields. bpf: limit the number of levels of a nested struct type. bpf: look into the types of the fields of a struct type recursively. ... ==================== Link: https://lore.kernel.org/r/20240606223146.23020-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-10perf evsel: Refactor tool eventsIan Rogers
Tool events unnecessarily open a dummy perf event which is useless even with `perf record` which will still open a dummy event. Change the behavior of tool events so: - duration_time - call `rdclock` on open and then report the count as a delta since the start in evsel__read_counter. This moves code out of builtin-stat making it more general purpose. - user_time/system_time - open the fd as either `/proc/pid/stat` or `/proc/stat` for cases like system wide. evsel__read_counter will read the appropriate field out of the procfs file. These values were previously supplied by wait4, if the procfs read fails then the wait4 values are used, assuming the process/thread terminated. By reading user_time and system_time this way, interval mode, per PID and per CPU can be supported although there are restrictions given what the files provide (e.g. per PID can't be combined with per CPU). Opening any of the tool events for `perf record` is changed to return invalid. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linux.dev> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240503232849.17752-1-irogers@google.com
2024-06-10mlxsw: spectrum_acl: Fix ACL scale regression and firmware errorsIdo Schimmel
ACLs that reside in the algorithmic TCAM (A-TCAM) in Spectrum-2 and newer ASICs can share the same mask if their masks only differ in up to 8 consecutive bits. For example, consider the following filters: # tc filter add dev swp1 ingress pref 1 proto ip flower dst_ip 192.0.2.0/24 action drop # tc filter add dev swp1 ingress pref 1 proto ip flower dst_ip 198.51.100.128/25 action drop The second filter can use the same mask as the first (dst_ip/24) with a delta of 1 bit. However, the above only works because the two filters have different values in the common unmasked part (dst_ip/24). When entries have the same value in the common unmasked part they create undesired collisions in the device since many entries now have the same key. This leads to firmware errors such as [1] and to a reduced scale. Fix by adjusting the hash table key to only include the value in the common unmasked part. That is, without including the delta bits. That way the driver will detect the collision during filter insertion and spill the filter into the circuit TCAM (C-TCAM). Add a test case that fails without the fix and adjust existing cases that check C-TCAM spillage according to the above limitation. [1] mlxsw_spectrum2 0000:06:00.0: EMAD reg access failed (tid=3379b18a00003394,reg_id=3027(ptce3),type=write,status=8(resource not available)) Fixes: c22291f7cf45 ("mlxsw: spectrum: acl: Implement delta for ERP") Reported-by: Alexander Zubkov <green@qrator.net> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>