Age | Commit message (Collapse) | Author |
|
perf_sched__replay()
The start_work_mutex and work_done_wait_mutex are used only for the
'perf sched replay'. Put their initialization in perf_sched__replay () to
reduce unnecessary actions in other commands.
Simple functional testing:
# perf sched record perf bench sched messaging
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 0.197 [sec]
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 14.952 MB perf.data (134165 samples) ]
# perf sched replay
run measurement overhead: 108 nsecs
sleep measurement overhead: 65658 nsecs
the run test took 999991 nsecs
the sleep test took 1079324 nsecs
nr_run_events: 42378
nr_sleep_events: 43102
nr_wakeup_events: 31852
target-less wakeups: 17
multi-target wakeups: 712
task 0 ( swapper: 0), nr_events: 10451
task 1 ( swapper: 1), nr_events: 3
task 2 ( swapper: 2), nr_events: 1
<SNIP>
task 717 ( sched-messaging: 74483), nr_events: 152
task 718 ( sched-messaging: 74484), nr_events: 1944
task 719 ( sched-messaging: 74485), nr_events: 73
task 720 ( sched-messaging: 74486), nr_events: 163
task 721 ( sched-messaging: 74487), nr_events: 942
task 722 ( sched-messaging: 74488), nr_events: 78
task 723 ( sched-messaging: 74489), nr_events: 1090
------------------------------------------------------------
#1 : 1366.507, ravg: 1366.51, cpu: 7682.70 / 7682.70
#2 : 1410.072, ravg: 1370.86, cpu: 7723.88 / 7686.82
#3 : 1396.296, ravg: 1373.41, cpu: 7568.20 / 7674.96
#4 : 1381.019, ravg: 1374.17, cpu: 7531.81 / 7660.64
#5 : 1393.826, ravg: 1376.13, cpu: 7725.25 / 7667.11
#6 : 1401.581, ravg: 1378.68, cpu: 7594.82 / 7659.88
#7 : 1381.337, ravg: 1378.94, cpu: 7371.22 / 7631.01
#8 : 1373.842, ravg: 1378.43, cpu: 7894.92 / 7657.40
#9 : 1364.697, ravg: 1377.06, cpu: 7324.91 / 7624.15
#10 : 1363.613, ravg: 1375.72, cpu: 7209.55 / 7582.69
# echo $?
0
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240206083228.172607-2-yangjihong1@huawei.com
|
|
udpgso regression test configures routing and device MTU directly through
uAPI (Netlink, ioctl) to do its job. While there is nothing wrong with it,
it takes more effort than doing it from shell.
Looking forward, we would like to extend the udpgso regression tests to
cover the EIO corner case [1], once it gets addressed. That will require a
dummy device and device feature manipulation to set it up. Which means more
Netlink code.
So, in preparation, pull out network configuration into the shell script
part of the test, so it is easily extendable in the future.
Also, because it now easy to setup routing, add a second local IPv6
address. Because the second address is not managed by the kernel, we can
"replace" the corresponding local route with a reduced-MTU one. This
unblocks the disabled "ipv6 connected" test case. Add a similar setup for
IPv4 for symmetry.
[1] https://lore.kernel.org/netdev/87jzqsld6q.fsf@cloudflare.com/
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/r/20240207-jakub-krn-635-v3-1-3dfa3da8a7d3@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a test case into the netlink checks that will show the number of
nested action recursions won't exceed 16. Going to 17 on a small
clone call isn't enough to exhaust the stack on (most) systems, so
it should be safe to run even on systems that don't have the fix
applied.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240207132416.1488485-3-aconole@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The altnames test uses the forwarding/lib.sh and that dependency
currently causes failures when running the test after install:
make -C tools/testing/selftests/ TARGETS=net install
./tools/testing/selftests/kselftest_install/run_kselftest.sh \
-t net:altnames.sh
# ...
# ./altnames.sh: line 8: ./forwarding/lib.sh: No such file or directory
# RTNETLINK answers: Operation not permitted
# ./altnames.sh: line 73: tests_run: command not found
# ./altnames.sh: line 65: pre_cleanup: command not found
Address the issue leveraging the TEST_INCLUDES infrastructure
provided by commit 2a0683be5b4c ("selftests: Introduce Makefile variable
to list shared bash scripts")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/f7b1e9d468224cbc136d304362315499fe39848f.1707298927.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add 8 new mirred tdc tests that target mirred to block:
- Add mirred mirror to egress block action
- Add mirred mirror to ingress block action
- Add mirred redirect to egress block action
- Add mirred redirect to ingress block action
- Try to add mirred action with both dev and block
- Try to add mirred action without specifying neither dev nor block
- Replace mirred redirect to dev action with redirect to block
- Replace mirred redirect to block action with mirror to dev
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20240202020726.529170-1-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The redirection test case fails in the netdev CI on debug kernels
because an FDB entry is learned despite the presence of a tc filter that
redirects incoming traffic [1].
I am unable to reproduce the failure locally, but I can see how it can
happen given that learning is first enabled and only then the ingress tc
filter is configured. On debug kernels the time window between these two
operations is longer compared to regular kernels, allowing random
packets to be transmitted and trigger learning.
Fix by reversing the order and configure the ingress tc filter before
enabling learning.
[1]
[...]
# TEST: Locked port MAB redirect [FAIL]
# Locked entry created for redirected traffic
Fixes: 38c43a1ce758 ("selftests: forwarding: Add test case for traffic redirection from a locked port")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208155529.1199729-5-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Suppress the following grep warnings:
[...]
INFO: # Port group entries configuration tests - (*, G)
TEST: Common port group entries configuration tests (IPv4 (*, G)) [ OK ]
TEST: Common port group entries configuration tests (IPv6 (*, G)) [ OK ]
grep: warning: stray \ before /
grep: warning: stray \ before /
grep: warning: stray \ before /
TEST: IPv4 (*, G) port group entries configuration tests [ OK ]
grep: warning: stray \ before /
grep: warning: stray \ before /
grep: warning: stray \ before /
TEST: IPv6 (*, G) port group entries configuration tests [ OK ]
[...]
They do not fail the test, but do clutter the output.
Fixes: b6d00da08610 ("selftests: forwarding: Add bridge MDB test")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208155529.1199729-4-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After enabling a multicast querier on the bridge (like the test is
doing), the bridge will wait for the Max Response Delay before starting
to forward according to its MDB in order to let Membership Reports
enough time to be received and processed.
Currently, the test is waiting for exactly the default Max Response
Delay (10 seconds) which is racy and leads to failures [1].
Fix by reducing the Max Response Delay to 1 second.
[1]
[...]
# TEST: IPv4 host entries forwarding tests [FAIL]
# Packet locally received after flood
Fixes: b6d00da08610 ("selftests: forwarding: Add bridge MDB test")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208155529.1199729-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After enabling a multicast querier on the bridge (like the test is
doing), the bridge will wait for the Max Response Delay before starting
to forward according to its MDB in order to let Membership Reports
enough time to be received and processed.
Currently, the test is waiting for exactly the default Max Response
Delay (10 seconds) which is racy and leads to failures [1].
Fix by reducing the Max Response Delay to 1 second.
[1]
[...]
# TEST: L2 miss - Multicast (IPv4) [FAIL]
# Unregistered multicast filter was hit after adding MDB entry
Fixes: 8c33266ae26a ("selftests: forwarding: Add layer 2 miss test cases")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208155529.1199729-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The test toggles the carrier of a bridge port in order to test the
bridge backup port feature.
Due to the linkwatch delayed work the carrier change is not always
reflected fast enough to the bridge driver and packets are not forwarded
as the test expects, resulting in failures [1].
Fix by busy waiting on the bridge port state until it changes to the
desired state following the carrier change.
[1]
# Backup port
# -----------
[...]
# TEST: swp1 carrier off [ OK ]
# TEST: No forwarding out of swp1 [FAIL]
[ 641.995910] br0: port 1(swp1) entered disabled state
# TEST: No forwarding out of vx0 [ OK ]
Fixes: b408453053fb ("selftests: net: Add bridge backup port and backup nexthop ID test")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208123110.1063930-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The two tests that make use of multicast routig (router.sh and
router_multicast.sh) are currently failing in the netdev CI because the
kernel is missing multicast routing support.
Fix by adding the required config entries.
Fixes: 6d4efada3b82 ("selftests: forwarding: Add multicast routing test")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240208165538.1303021-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The reuseport_addr_any.sh is currently skipping DCCP tests and
pmtu.sh is skipping all the FOU/GUE related cases: add the missing
options.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/38d3ca7f909736c1aef56e6244d67c82a9bba6ff.1707326987.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
stat+std_output.sh test fails on my arm64 machine:
[root@localhost shell]# ./stat+std_output.sh
Checking STD output: no args Unknown event name in TopDownL1 # 0.18 retiring
[root@localhost shell]# ./stat+std_output.sh
Checking STD output: no args [Success]
Checking STD output: system wide [Success]
Checking STD output: interval [Success]
Checking STD output: per thread Unknown event name in tmux: server-1114960 # 0.41 frontend_bound
When no args specified `perf stat` will add TopdownL1 metric group
and the output will be like:
[root@localhost shell]# perf stat -- stress-ng --vm 1 --timeout 1
stress-ng: info: [3351733] setting to a 1 second run per stressor
stress-ng: info: [3351733] dispatching hogs: 1 vm
stress-ng: info: [3351733] successful run completed in 1.02s
Performance counter stats for 'stress-ng --vm 1 --timeout 1':
1,037.71 msec task-clock # 1.000 CPUs utilized
13 context-switches # 12.528 /sec
1 cpu-migrations # 0.964 /sec
67,544 page-faults # 65.090 K/sec
2,691,932,561 cycles # 2.594 GHz (74.56%)
6,571,333,653 instructions # 2.44 insn per cycle (74.92%)
521,863,142 branches # 502.901 M/sec (75.21%)
425,879 branch-misses # 0.08% of all branches (87.57%)
TopDownL1 # 0.61 retiring (87.67%)
# 0.03 frontend_bound (87.67%)
# 0.02 bad_speculation (87.67%)
# 0.34 backend_bound (74.61%)
1.038138390 seconds time elapsed
0.844849000 seconds user
0.189053000 seconds sys
Metrics in group TopDownL1 don't have event name on arm64 but are not
listed in the $skip_metric list which they should be listed. Add them
to the skip list as what does for x86 platforms in [1].
[1] commit 4d60e83dfcee ("perf test: Skip metrics w/o event name in stat STD output linter")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: linuxarm@huawei.com
Cc: kan.liang@linux.intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240207091222.54096-1-yangyicong@huawei.com
|
|
Currently perf does not record module section addresses except for
the .text section. In general that means perf cannot get module section
mappings correct (except for .text) when loading symbols from a kernel
module file. (Note using --kcore does not have this issue)
Improve that situation slightly by identifying executable sections that
use the same mapping as the .text section. That happens when an
executable section comes directly after the .text section, both in memory
and on file, something that can be determined by following the same layout
rules used by the kernel, refer kernel layout_sections(). Note whether
that happens is somewhat arbitrary, so this is not a final solution.
Example from tracing a virtual machine process:
Before:
$ perf script | grep unknown
CPU 0/KVM 1718 203.511270: 318341 cpu-cycles:P: ffffffffc13e8a70 [unknown] (/lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko)
$ perf script -vvv 2>&1 >/dev/null | grep kvm.intel | grep 'noinstr.text\|ffff'
Map: 0-7e0 41430 [kvm_intel].noinstr.text
Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
After:
$ perf script | grep 203.511270
CPU 0/KVM 1718 203.511270: 318341 cpu-cycles:P: ffffffffc13e8a70 vmx_vmexit+0x0 (/lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko)
$ perf script -vvv 2>&1 >/dev/null | grep kvm.intel | grep 'noinstr.text\|ffff'
Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
Reported-by: Like Xu <like.xu.linux@gmail.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240208085326.13432-3-adrian.hunter@intel.com
|
|
Dump kmaps if using 'perf --debug kmaps' or verbose > 2 (e.g. -vvv) for
tools 'perf script' and 'perf report' if there is no browser.
Example:
$ perf --debug kmaps script 2>&1 >/dev/null | grep kvm.intel
build id event received for /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko: 0691d75e10e72ebbbd45a44c59f6d00a5604badf [20]
Map: 0-3a3 4f5d8 [kvm_intel].modinfo
Map: 0-5240 5f280 [kvm_intel]__versions
Map: 0-30 64 [kvm_intel].note.Linux
Map: 0-14 644c0 [kvm_intel].orc_header
Map: 0-5297 43680 [kvm_intel].rodata
Map: 0-5bee 3b837 [kvm_intel].text.unlikely
Map: 0-7e0 41430 [kvm_intel].noinstr.text
Map: 0-2080 713c0 [kvm_intel].bss
Map: 0-26 705c8 [kvm_intel].data..read_mostly
Map: 0-5888 6a4c0 [kvm_intel].data
Map: 0-22 70220 [kvm_intel].data.once
Map: 0-40 705f0 [kvm_intel].data..percpu
Map: 0-1685 41d20 [kvm_intel].init.text
Map: 0-4b8 6fd60 [kvm_intel].init.data
Map: 0-380 70248 [kvm_intel]__dyndbg
Map: 0-8 70218 [kvm_intel].exit.data
Map: 0-438 4f980 [kvm_intel]__param
Map: 0-5f5 4ca0f [kvm_intel].rodata.str1.1
Map: 0-3657 493b8 [kvm_intel].rodata.str1.8
Map: 0-e0 70640 [kvm_intel].data..ro_after_init
Map: 0-500 70ec0 [kvm_intel].gnu.linkonce.this_module
Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
The example above shows how the module section mappings are all wrong
except for the main .text mapping at 0xffffffffc13a7000.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Like Xu <like.xu.linux@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240208085326.13432-2-adrian.hunter@intel.com
|
|
Cross-merge networking fixes after downstream PR.
No conflicts.
Adjacent changes:
drivers/net/ethernet/stmicro/stmmac/common.h
38cc3c6dcc09 ("net: stmmac: protect updates of 64-bit statistics counters")
fd5a6a71313e ("net: stmmac: est: Per Tx-queue error count for HLBF")
c5c3e1bfc9e0 ("net: stmmac: Offload queueMaxSDU from tc-taprio")
drivers/net/wireless/microchip/wilc1000/netdev.c
c9013880284d ("wifi: fill in MODULE_DESCRIPTION()s for wilc1000")
328efda22af8 ("wifi: wilc1000: do not realloc workqueue everytime an interface is added")
net/unix/garbage.c
11498715f266 ("af_unix: Remove io_uring code for GC.")
1279f9d9dec2 ("af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC.")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from WiFi and netfilter.
Current release - regressions:
- nic: intel: fix old compiler regressions
- netfilter: ipset: missing gc cancellations fixed
Current release - new code bugs:
- netfilter: ctnetlink: fix filtering for zone 0
Previous releases - regressions:
- core: fix from address in memcpy_to_iter_csum()
- netfilter: nfnetlink_queue: un-break NF_REPEAT
- af_unix: fix memory leak for dead unix_(sk)->oob_skb in GC.
- devlink: avoid potential loop in devlink_rel_nested_in_notify_work()
- iwlwifi:
- mvm: fix a battery life regression
- fix double-free bug
- mac80211: fix waiting for beacons logic
- nic: nfp: flower: prevent re-adding mac index for bonded port
Previous releases - always broken:
- rxrpc: fix generation of serial numbers to skip zero
- tipc: check the bearer type before calling tipc_udp_nl_bearer_add()
- tunnels: fix out of bounds access when building IPv6 PMTU error
- nic: hv_netvsc: register VF in netvsc_probe if NET_DEVICE_REGISTER
missed
- nic: atlantic: fix DMA mapping for PTP hwts ring
Misc:
- selftests: more fixes to deal with very slow hosts"
* tag 'net-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (80 commits)
netfilter: nft_set_pipapo: remove scratch_aligned pointer
netfilter: nft_set_pipapo: add helper to release pcpu scratch area
netfilter: nft_set_pipapo: store index in scratch maps
netfilter: nft_set_rbtree: skip end interval element from gc
netfilter: nfnetlink_queue: un-break NF_REPEAT
netfilter: nf_tables: use timestamp to check for set element timeout
netfilter: nft_ct: reject direction for ct id
netfilter: ctnetlink: fix filtering for zone 0
s390/qeth: Fix potential loss of L3-IP@ in case of network issues
netfilter: ipset: Missing gc cancellations fixed
octeontx2-af: Initialize maps.
net: ethernet: ti: cpsw: enable mac_managed_pm to fix mdio
net: ethernet: ti: cpsw_new: enable mac_managed_pm to fix mdio
netfilter: nft_set_pipapo: remove static in nft_pipapo_get()
netfilter: nft_compat: restrict match/target protocol to u16
netfilter: nft_compat: reject unused compat flag
netfilter: nft_compat: narrow down revision to unsigned 8-bits
net: intel: fix old compiler regressions
MAINTAINERS: Maintainer change for rds
selftests: cmsg_ipv6: repeat the exact packet
...
|
|
It is more accurate to check if KVM is enabled, instead of having the
architecture say so. Architectures always "have" KVM, so for example
checking CONFIG_HAVE_KVM in x86 code is pointless, but if KVM is disabled
in a specific build, there is no need for support code.
Alternatively, many of the #ifdefs could simply be deleted. However,
this would add completely dead code. For example, when KVM is disabled,
there should not be any posted interrupts, i.e. NOT wiring up the "dummy"
handlers and treating IRQs on those vectors as spurious is the right
thing to do.
Cc: x86@kernel.org
Cc: kbingham@kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Narrow down target/match revision to u8 in nft_compat.
2) Bail out with unused flags in nft_compat.
3) Restrict layer 4 protocol to u16 in nft_compat.
4) Remove static in pipapo get command that slipped through when
reducing set memory footprint.
5) Follow up incremental fix for the ipset performance regression,
this includes the missing gc cancellation, from Jozsef Kadlecsik.
6) Allow to filter by zone 0 in ctnetlink, do not interpret zone 0
as no filtering, from Felix Huettner.
7) Reject direction for NFT_CT_ID.
8) Use timestamp to check for set element expiration while transaction
is handled to prevent garbage collection from removing set elements
that were just added by this transaction. Packet path and netlink
dump/get path still use current time to check for expiration.
9) Restore NF_REPEAT in nfnetlink_queue, from Florian Westphal.
10) map_index needs to be percpu and per-set, not just percpu.
At this time its possible for a pipapo set to fill the all-zero part
with ones and take the 'might have bits set' as 'start-from-zero' area.
From Florian Westphal. This includes three patches:
- Change scratchpad area to a structure that provides space for a
per-set-and-cpu toggle and uses it of the percpu one.
- Add a new free helper to prepare for the next patch.
- Remove the scratch_aligned pointer and makes AVX2 implementation
use the exact same memory addresses for read/store of the matching
state.
netfilter pull request 24-02-08
* tag 'nf-24-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_set_pipapo: remove scratch_aligned pointer
netfilter: nft_set_pipapo: add helper to release pcpu scratch area
netfilter: nft_set_pipapo: store index in scratch maps
netfilter: nft_set_rbtree: skip end interval element from gc
netfilter: nfnetlink_queue: un-break NF_REPEAT
netfilter: nf_tables: use timestamp to check for set element timeout
netfilter: nft_ct: reject direction for ct id
netfilter: ctnetlink: fix filtering for zone 0
netfilter: ipset: Missing gc cancellations fixed
netfilter: nft_set_pipapo: remove static in nft_pipapo_get()
netfilter: nft_compat: restrict match/target protocol to u16
netfilter: nft_compat: reject unused compat flag
netfilter: nft_compat: narrow down revision to unsigned 8-bits
====================
Link: https://lore.kernel.org/r/20240208112834.1433-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
previously filtering for the default zone would actually skip the zone
filter and flush all zones.
Fixes: eff3c558bb7e ("netfilter: ctnetlink: support filtering by zone")
Reported-by: Ilya Maximets <i.maximets@ovn.org>
Closes: https://lore.kernel.org/netdev/2032238f-31ac-4106-8f22-522e76df5a12@ovn.org/
Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Correct header file is needed for getting CLOSE_RANGE_* macros.
Previously it was tested with newer glibc which didn't show the need to
include the header which was a mistake.
Link: https://lkml.kernel.org/r/20231024155137.219700-1-usama.anjum@collabora.com
Fixes: ec54424923cf ("selftests: core: remove duplicate defines")
Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com>
Link: https://lore.kernel.org/all/7161219e-0223-d699-d6f3-81abd9abf13b@arm.com
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Use slowwait instead of hard code sleep for bonding tests.
In function setup_prepare(), the client_create() will be called after
server_create(). So I think there is no need to sleep in server_create()
and remove it.
For lab_lib.sh, remove bonding module may affect other running bonding tests.
And some test env may buildin bond which can't be removed. The bonding
link should be removed by lag_reset_network() or netns delete.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240205130048.282087-5-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The purpose of grat_arp is testing commit 9949e2efb54e ("bonding: fix
send_peer_notif overflow"). As the send_peer_notif was defined to u8,
to overflow it, we need to
send_peer_notif = num_peer_notif * peer_notif_delay = num_grat_arp * peer_notify_delay / miimon > 255
(kernel) (kernel parameter) (user parameter)
e.g. 30 (num_grat_arp) * 1000 (peer_notify_delay) / 100 (miimon) > 255.
Which need 30s to complete sending garp messages. To save the testing time,
the only way is reduce the miimon number. Something like
30 (num_grat_arp) * 100 (peer_notify_delay) / 10 (miimon) > 255.
To save more time, the 50 num_grat_arp testing could be removed.
The arp_validate_test also need to check the mii_status, which sleep
too long. Use slowwait to save some time.
For other connection checkings, make sure active slave changed first.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240205130048.282087-4-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use tc filter to check if LACP was sent, which is accurate and save
more time.
No need to remove bonding module as some test env may buildin bonding.
And the bond link has been deleted.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240205130048.282087-3-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add slowwait functions to wait for some operations that may need a long time
to finish. The busywait executes the cmd too fast, which is kind of wasting
cpu in this scenario. At the same time, if shell debugging is enabled with
`set -x`. the busywait will output too much logs.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240205130048.282087-2-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After the series "Annotate kfuncs in .BTF_ids section"[0], kfuncs can be
generated from bpftool. Let's mark the existing cpumask kfunc declarations
__weak so they don't conflict with definitions that will eventually come
from vmlinux.h.
[0]. https://lore.kernel.org/all/cover.1706491398.git.dxu@dxuuu.xyz
Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/bpf/20240206081416.26242-5-laoar.shao@gmail.com
|
|
We should verify the return value of cpumask_success__load().
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240206081416.26242-4-laoar.shao@gmail.com
|
|
The .BTF_ids section is pre-filled with zeroed BTF ID entries during the
build and afterwards patched by resolve_btfids with correct values.
Since resolve_btfids always writes in host-native endianness, it relies
on libelf to do the translation when the target ELF is cross-compiled to
a different endianness (this was introduced in commit 61e8aeda9398
("bpf: Fix libelf endian handling in resolv_btfids")).
Unfortunately, the translation will corrupt the flags fields of SET8
entries because these were written during vmlinux compilation and are in
the correct endianness already. This will lead to numerous selftests
failures such as:
$ sudo ./test_verifier 502 502
#502/p sleepable fentry accept FAIL
Failed to load prog 'Invalid argument'!
bpf_fentry_test1 is not sleepable
verification time 34 usec
stack depth 0
processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
Summary: 0 PASSED, 0 SKIPPED, 1 FAILED
Since it's not possible to instruct libelf to translate just certain
values, let's manually bswap the flags (both global and entry flags) in
resolve_btfids when needed, so that libelf then translates everything
correctly.
Fixes: ef2c6f370a63 ("tools/resolve_btfids: Add support for 8-byte BTF sets")
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/7b6bff690919555574ce0f13d2a5996cacf7bf69.1707223196.git.vmalik@redhat.com
|
|
Instead of using magic offsets to access BTF ID set data, leverage types
from btf_ids.h (btf_id_set and btf_id_set8) which define the actual
layout of the data. Thanks to this change, set sorting should also
continue working if the layout changes.
This requires to sync the definition of 'struct btf_id_set8' from
include/linux/btf_ids.h to tools/include/linux/btf_ids.h. We don't sync
the rest of the file at the moment, b/c that would require to also sync
multiple dependent headers and we don't need any other defs from
btf_ids.h.
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/bpf/ff7f062ddf6a00815fda3087957c4ce667f50532.1707223196.git.vmalik@redhat.com
|
|
Pull kvm fixes from Paolo Bonzini:
"x86 guest:
- Avoid false positive for check that only matters on AMD processors
x86:
- Give a hint when Win2016 might fail to boot due to XSAVES &&
!XSAVEC configuration
- Do not allow creating an in-kernel PIT unless an IOAPIC already
exists
RISC-V:
- Allow ISA extensions that were enabled for bare metal in 6.8 (Zbc,
scalar and vector crypto, Zfh[min], Zihintntl, Zvfh[min], Zfa)
S390:
- fix CC for successful PQAP instruction
- fix a race when creating a shadow page"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
x86/coco: Define cc_vendor without CONFIG_ARCH_HAS_CC_PLATFORM
x86/kvm: Fix SEV check in sev_map_percpu_data()
KVM: x86: Give a hint when Win2016 might fail to boot due to XSAVES erratum
KVM: x86: Check irqchip mode before create PIT
KVM: riscv: selftests: Add Zfa extension to get-reg-list test
RISC-V: KVM: Allow Zfa extension for Guest/VM
KVM: riscv: selftests: Add Zvfh[min] extensions to get-reg-list test
RISC-V: KVM: Allow Zvfh[min] extensions for Guest/VM
KVM: riscv: selftests: Add Zihintntl extension to get-reg-list test
RISC-V: KVM: Allow Zihintntl extension for Guest/VM
KVM: riscv: selftests: Add Zfh[min] extensions to get-reg-list test
RISC-V: KVM: Allow Zfh[min] extensions for Guest/VM
KVM: riscv: selftests: Add vector crypto extensions to get-reg-list test
RISC-V: KVM: Allow vector crypto extensions for Guest/VM
KVM: riscv: selftests: Add scaler crypto extensions to get-reg-list test
RISC-V: KVM: Allow scalar crypto extensions for Guest/VM
KVM: riscv: selftests: Add Zbc extension to get-reg-list test
RISC-V: KVM: Allow Zbc extension for Guest/VM
KVM: s390: fix cc for successful PQAP
KVM: s390: vsie: fix race during shadow creation
|
|
Currently pipe mode doesn't set the file size and it results in a
misleading message of 0 data size at the end. Although it might miss
some accounting for pipe header or more, just displaying the data size
would reduce the possible confusion.
Before:
$ perf record -o- perf test -w noploop | perf report -i- -q --percent-limit=1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ] <====== (here)
99.58% perf perf [.] noploop
After:
$ perf record -o- perf test -w noploop | perf report -i- -q --percent-limit=1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.229 MB - ]
99.46% perf perf [.] noploop
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20240112231340.779469-1-namhyung@kernel.org
|
|
With the srcline option, the perf script only prints a source line at
the beginning of a sample with call/ret from functions, but not for
each jump in brstackinsn. It's useful to print a source line for each
jump in brstackinsn when the end user analyze the full assembler
sequences of branch sequences for the sample.
The srccode option can also be used to locate the source code line.
However, it's printed almost for every line and makes the output less
readable.
$perf script -F +brstackinsn,+srcline --xed
Before the patch,
tchain_edit_deb 1463275 15228549.107820: 282495 instructions:u: 401133 f3+0xd (/home/kan/os.li>
tchain_edit.c:22
f3+40: tchain_edit.c:20
000000000040114e jle 0x401133 # PRED 6 cycles [6]
0000000000401133 movl -0x4(%rbp), %eax
0000000000401136 and $0x1, %eax
0000000000401139 test %eax, %eax
000000000040113b jz 0x401143
000000000040113d addl $0x1, -0x4(%rbp)
0000000000401141 jmp 0x401147 # PRED 3 cycles [9] 2.00 IPC
0000000000401147 cmpl $0x3e7, -0x4(%rbp)
000000000040114e jle 0x401133 # PRED 6 cycles [15] 0.33 IPC
After the patch,
tchain_edit_deb 1463275 15228549.107820: 282495 instructions:u: 401133 f3+0xd (/home/kan/os.li>
tchain_edit.c:22
f3+40: tchain_edit.c:20
000000000040114e jle 0x401133 srcline: tchain_edit.c:20 # PRED 6 cycles [6]
0000000000401133 movl -0x4(%rbp), %eax
0000000000401136 and $0x1, %eax
0000000000401139 test %eax, %eax
000000000040113b jz 0x401143
000000000040113d addl $0x1, -0x4(%rbp)
0000000000401141 jmp 0x401147 srcline: tchain_edit.c:23 # PRED 3 cycles [9] 2.00 IPC
0000000000401147 cmpl $0x3e7, -0x4(%rbp)
000000000040114e jle 0x401133 srcline: tchain_edit.c:20 # PRED 6 cycles [15] 0.33 IPC
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: ahmad.yasin@intel.com
Cc: amiri.khalil@intel.com
Cc: ak@linux.intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240205145819.1943114-1-kan.liang@linux.intel.com
|
|
Updates to struct parse_events_error needed to be carried through to
PowerPC specific event parsing.
Fixes: fd7b8e8fb20f ("perf parse-events: Print all errors")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung <namhyung@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240206235902.2917395-1-irogers@google.com
|
|
Ensure that pidfd_getfd() reports -ESRCH if the task is already exiting.
Signed-off-by: Tycho Andersen <tandersen@netflix.com>
Link: https://lore.kernel.org/r/20240206192357.81942-1-tycho@tycho.pizza
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
cmsg_ipv6 test requests tcpdump to capture 4 packets,
and sends until tcpdump quits. Only the first packet
is "real", however, and the rest are basic UDP packets.
So if tcpdump doesn't start in time it will miss
the real packet and only capture the UDP ones.
This makes the test fail on slow machine (no KVM or with
debug enabled) 100% of the time, while it passes in fast
environments.
Repeat the "real" / expected packet.
Fixes: 9657ad09e1fa ("selftests: net: test IPV6_TCLASS")
Fixes: 05ae83d5a4a2 ("selftests: net: test IPV6_HOPLIMIT")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The seccomp benchmark test (for validating the benefit of bitmaps) can
be sensitive to scheduling speed, so pin the process to a single CPU,
which appears to significantly improve reliability, and loosen the
"close enough" checking to allow up to 10% variance instead of 1%.
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202402061002.3a8722fd-oliver.sang@intel.com
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Multi-attr elements could not be encoded because of missing logic in the
ynl code. Enable encoding of these attributes by checking if the
attribute is a multi-attr and if the value to be processed is a list.
This has been tested both with the taprio and ets qdisc which contain
this kind of attributes.
Signed-off-by: Alessandro Marcolini <alessandromarcolini99@gmail.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/c5bc9f5797168dbf7a4379c42f38d5de8ac7f38a.1706962013.git.alessandromarcolini99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Correct typo in SpecAttr docstring. Changed SpecSubMessageFormat
docstring.
Signed-off-by: Alessandro Marcolini <alessandromarcolini99@gmail.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/6ab1dea7fb1f635c0d8b237f03a49eaa448c2bf4.1706962013.git.alessandromarcolini99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Drop dirty_log_page_splitting_test's assertion that the number of 4KiB
pages remains the same across dirty logging being enabled and disabled, as
the test doesn't guarantee that mappings outside of the memslots being
dirty logged are stable, e.g. KVM's mappings for code and pages in
memslot0 can be zapped by things like NUMA balancing.
To preserve the spirit of the check, assert that (a) the number of 4KiB
pages after splitting is _at least_ the number of 4KiB pages across all
memslots under test, and (b) the number of hugepages before splitting adds
up to the number of pages across all memslots under test. (b) is a little
tenuous as it relies on memslot0 being incompatible with transparent
hugepages, but that holds true for now as selftests explicitly madvise()
MADV_NOHUGEPAGE for memslot0 (__vm_create() unconditionally specifies the
backing type as VM_MEM_SRC_ANONYMOUS).
Reported-by: Yi Lai <yi1.lai@intel.com>
Reported-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20240131222728.4100079-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
When finishing the final iteration of dirty_log_test testcase, set
host_quit _before_ the final "continue" so that the vCPU worker doesn't
run an extra iteration, and delete the hack-a-fix of an extra "continue"
from the dirty ring testcase. This fixes a bug where the extra post to
sem_vcpu_cont may not be consumed, which results in failures in subsequent
runs of the testcases. The bug likely was missed during development as
x86 supports only a single "guest mode", i.e. there aren't any subsequent
testcases after the dirty ring test, because for_each_guest_mode() only
runs a single iteration.
For the regular dirty log testcases, letting the vCPU run one extra
iteration is a non-issue as the vCPU worker waits on sem_vcpu_cont if and
only if the worker is explicitly told to stop (vcpu_sync_stop_requested).
But for the dirty ring test, which needs to periodically stop the vCPU to
reap the dirty ring, letting the vCPU resume the guest _after_ the last
iteration means the vCPU will get stuck without an extra "continue".
However, blindly firing off an post to sem_vcpu_cont isn't guaranteed to
be consumed, e.g. if the vCPU worker sees host_quit==true before resuming
the guest. This results in a dangling sem_vcpu_cont, which leads to
subsequent iterations getting out of sync, as the vCPU worker will
continue on before the main task is ready for it to resume the guest,
leading to a variety of asserts, e.g.
==== Test Assertion Failure ====
dirty_log_test.c:384: dirty_ring_vcpu_ring_full
pid=14854 tid=14854 errno=22 - Invalid argument
1 0x00000000004033eb: dirty_ring_collect_dirty_pages at dirty_log_test.c:384
2 0x0000000000402d27: log_mode_collect_dirty_pages at dirty_log_test.c:505
3 (inlined by) run_test at dirty_log_test.c:802
4 0x0000000000403dc7: for_each_guest_mode at guest_modes.c:100
5 0x0000000000401dff: main at dirty_log_test.c:941 (discriminator 3)
6 0x0000ffff9be173c7: ?? ??:0
7 0x0000ffff9be1749f: ?? ??:0
8 0x000000000040206f: _start at ??:?
Didn't continue vcpu even without ring full
Alternatively, the test could simply reset the semaphores before each
testcase, but papering over hacks with more hacks usually ends in tears.
Reported-by: Shaoqin Huang <shahuang@redhat.com>
Fixes: 84292e565951 ("KVM: selftests: Add dirty ring buffer test")
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Link: https://lore.kernel.org/r/20240202231831.354848-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
When the feature_flags and xdp_zc_max_segs fields were added to the libbpf
bpf_xdp_query_opts, the code writing them did not use the OPTS_SET() macro.
This causes libbpf to write to those fields unconditionally, which means
that programs compiled against an older version of libbpf (with a smaller
size of the bpf_xdp_query_opts struct) will have its stack corrupted by
libbpf writing out of bounds.
The patch adding the feature_flags field has an early bail out if the
feature_flags field is not part of the opts struct (via the OPTS_HAS)
macro, but the patch adding xdp_zc_max_segs does not. For consistency, this
fix just changes the assignments to both fields to use the OPTS_SET()
macro.
Fixes: 13ce2daa259a ("xsk: add new netlink attribute dedicated for ZC max frags")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240206125922.1992815-1-toke@redhat.com
|
|
[Differences from V2:
- Remove conditionals in the source files pragmas, as the
pragma is supported by both GCC and clang.]
Both GCC and clang implement the -Wno-address-of-packed-member
warning, which is enabled by -Wall, that warns about taking the
address of a packed struct field when it can lead to an "unaligned"
address.
This triggers the following errors (-Werror) when building three
particular BPF selftests with GCC:
progs/test_cls_redirect.c
986 | if (ipv4_is_fragment((void *)&encap->ip)) {
progs/test_cls_redirect_dynptr.c
410 | pkt_ipv4_checksum((void *)&encap_gre->ip);
progs/test_cls_redirect.c
521 | pkt_ipv4_checksum((void *)&encap_gre->ip);
progs/test_tc_tunnel.c
232 | set_ipv4_csum((void *)&h_outer.ip);
These warnings do not signal any real problem in the tests as far as I
can see.
This patch adds pragmas to these test files that inhibit the
-Waddress-of-packed-member warning.
Tested in bpf-next master.
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240206102330.7113-1-jose.marchesi@oracle.com
|
|
Leverage previously added MOCK_FLAGS_DEVICE_HUGE_IOVA flag to create an
IOMMU domain with more than MOCK_IO_PAGE_SIZE supported.
Plumb the hugetlb backing memory for buffer allocation and change the
expected page size to MOCK_HUGE_PAGE_SIZE (1M) when hugepage variant test
cases are used. These so far are limited to 128M and 256M IOVA range tests
cases which is when 1M hugepages can be used.
Link: https://lore.kernel.org/r/20240202133415.23819-9-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Rework the functions that test and set the bitmaps to receive a new
parameter (the pte_page_size) that reflects the expected PTE size in the
page tables. The same scheme is still used i.e. even bits are dirty and
odd page indexes aren't dirty. Here it just refactors to consider the size
of the PTE rather than hardcoded to IOMMU mock base page assumptions.
While at it, refactor dirty bitmap tests to use the idev_id created by the
fixture instead of creating a new one.
This is in preparation for doing tests with IOMMU hugepages where multiple
bits set as part of recording a whole hugepage as dirty and thus the
pte_page_size will vary depending on io hugepages or io base pages.
Link: https://lore.kernel.org/r/20240202133415.23819-6-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Exercise the dirty tracking bitmaps with byte unaligned addresses in
addition to the PAGE_SIZE unaligned bitmaps, using a address towards the
end of the page boundary.
In doing so, increase the tailroom we allocate for the bitmap from
MOCK_PAGE_SIZE(2K) into PAGE_SIZE(4K), such that we can test end of bitmap
boundary.
Link: https://lore.kernel.org/r/20240202133415.23819-4-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Add support for the independent control state machine per IEEE
802.1AX-2008 5.4.15 in addition to the existing implementation of the
coupled control state machine.
Introduces two new states, AD_MUX_COLLECTING and AD_MUX_DISTRIBUTING in
the LACP MUX state machine for separated handling of an initial
Collecting state before the Collecting and Distributing state. This
enables a port to be in a state where it can receive incoming packets
while not still distributing. This is useful for reducing packet loss when
a port begins distributing before its partner is able to collect.
Added new functions such as bond_set_slave_tx_disabled_flags and
bond_set_slave_rx_enabled_flags to precisely manage the port's collecting
and distributing states. Previously, there was no dedicated method to
disable TX while keeping RX enabled, which this patch addresses.
Note that the regular flow process in the kernel's bonding driver remains
unaffected by this patch. The extension requires explicit opt-in by the
user (in order to ensure no disruptions for existing setups) via netlink
support using the new bonding parameter coupled_control. The default value
for coupled_control is set to 1 so as to preserve existing behaviour.
Signed-off-by: Aahil Awatramani <aahila@google.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240202175858.1573852-1-aahila@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Selftests here check not only that connect()/accept() for
TCP-AO/TCP-MD5/non-signed-TCP combinations do/don't establish
connections, but also counters: those are per-AO-key, per-socket and
per-netns.
The counters are checked on the server's side, as the server listener
has TCP-AO/TCP-MD5/no keys for different peers. All tests run in
the same namespaces with the same veth pair, created in test_init().
After close() in both client and server, the sides go through
the regular FIN/ACK + FIN/ACK sequence, which goes in the background.
If the selftest has already started a new testing scenario, read
per-netns counters - it may fail in the end iff it doesn't expect
the TCPAOGood per-netns counters go up during the test.
Let's just kill both TCP-AO sides - that will avoid any asynchronous
background TCP-AO segments going to either sides.
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/all/20240201132153.4d68f45e@kernel.org/T/#u
Fixes: 6f0c472a6815 ("selftests/net: Add TCP-AO + TCP-MD5 + no sign listen socket tests")
Signed-off-by: Dmitry Safonov <dima@arista.com>
Link: https://lore.kernel.org/r/20240202-unsigned-md5-netns-counters-v1-1-8b90c37c0566@arista.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This test is time sensitive. It may fail on virtual machines and for
debug builds.
Continue to run in these environments to get code coverage. But
optionally suppress failure for timing errors (only). This is
controlled with environment variable KSFT_MACHINE_SLOW.
The test continues to return 0 (KSFT_PASS), rather than KSFT_XFAIL
as previously discussed. Because making so_txtime.c return that and
then making so_txtime.sh capture runs that pass that vs KSFT_FAIL
and pass it on added a bunch of (fragile bash) boilerplate, while the
result is interpreted the same as KSFT_PASS anyway.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240201162130.2278240-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Update the Power11 PVR to json mapfile to enable
json events. Power11 is PowerISA v3.1 compliant
and support Power10 events.
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Cc: atrajeev@linux.vnet.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240129120855.551529-1-maddy@linux.ibm.com
|
|
Mark dynptr kfuncs as __weak to allow
verifier_global_subprogs/arg_ctx_{perf,kprobe,raw_tp} subtests to be
loadable on old kernels. Because bpf_dynptr_from_xdp() kfunc is used
from arg_tag_dynptr BPF program in progs/verifier_global_subprogs.c
*and* is not marked as __weak, loading any subtest from
verifier_global_subprogs fails on old kernels that don't have
bpf_dynptr_from_xdp() kfunc defined. Even if arg_tag_dynptr program
itself is not loaded, libbpf bails out on non-weak reference to
bpf_dynptr_from_xdp (that can't be resolved), which shared across all
programs in progs/verifier_global_subprogs.c.
So mark all dynptr-related kfuncs as __weak to unblock libbpf CI ([0]).
In the upcoming "kfunc in vmlinux.h" work we should make sure that
kfuncs are always declared __weak as well.
[0] https://github.com/libbpf/libbpf/actions/runs/7792673215/job/21251250831?pr=776#step:4:7961
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240206004008.1541513-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|