Age | Commit message (Collapse) | Author |
|
struct perf_event_attr initialization is spread into
perf_event_initialize() and perf_event_attr_initialize() and setting
->config is hardcoded by the deepest level.
perf_event_attr init belongs to perf_event_attr_initialize() so move it
entirely there. Rename the other function
perf_event_initialized_read_format().
Call each init function directly from the test as they will take
different parameters (especially true after the perf related global
variables are moved to local variables).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Naming for perf event related functions, types, and variables is
inconsistent.
Make struct read_format and all functions related to perf events start
with "perf_". Adjust variable names towards the same direction but use
shorter names for variables where appropriate (pe prefix).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Perf event handling has functions that are the sole caller of another
perf event handling related function:
- reset_enable_llc_perf() calls perf_event_open_llc_miss()
- perf_event_measure() calls get_llc_perf()
Remove the extra layer of calls to make the code easier to follow by
moving the code into the calling function.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Perf counters are __u64 but the code converts them to unsigned long
before printing them out.
Remove unnecessary type conversion and retain the perf originating
value as __u64.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
show_cache_info() calculates results and provides generic cache
information. This makes it hard to alter pass/fail conditions.
Separate the test specific checks into CAT and CMT test files and
leave only the generic information part into show_cache_info().
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
measure_cache_vals() does a different thing depending on the test case
that called it:
- For CAT, it measures LLC misses through perf.
- For CMT, it measures LLC occupancy through resctrl.
Split these two functionalities into own functions the CAT and CMT
tests can call directly. Replace passing the struct resctrl_val_param
parameter with the filename because it's more generic and all those
functions need out of resctrl_val.
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
CAT test doesn't take shareable bits into account, i.e., the test might
be sharing cache with some devices (e.g., graphics).
Introduce get_mask_no_shareable() and use it to provision an
environment for CAT test where the allocated LLC is isolated better.
Excluding shareable_bits may create hole(s) into the cbm_mask, thus add
a new helper count_contiguous_bits() to find the longest contiguous set
of CBM bits.
create_bit_mask() is needed by an upcoming CAT test rewrite so make it
available in resctrl.h right away.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
CAT and CMT tests calculate size of the cache portion for the n-bits
cache allocation on their own.
Add cache_portion_size() helper that calculates size of the cache
portion for the given number of bits and use it to replace the existing
span calculations. This also prepares for the new CAT test that will
need to determine the size of the cache portion also during results
processing.
Rename also 'cache_size' local variables to 'cache_total_size' to
prevent misinterpretations.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
get_cache_size() does not modify cache_type so it could be const.
Mark cache_type const so that const char * can be passed to it. This
prevents warnings once many of the test parameters are marked const.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Callers of get_cbm_mask() are required to pass a string into which the
capacity bitmask (CBM) is read. Neither CAT nor CMT tests need the
bitmask as string but just convert it into an unsigned long value.
Another limitation is that the bit mask reader can only read
.../cbm_mask files.
Generalize the bit mask reading function into get_bit_mask() such that
it can be used to handle other files besides the .../cbm_mask and
handles the unsigned long conversion within get_bit_mask() using
fscanf(). Change get_cbm_mask() to use get_bit_mask() and rename it to
get_full_cbm() to better indicate what the function does.
Return error from get_full_cbm() if the bitmask is zero for some reason
because it makes the code more robust as the selftests naturally assume
the bitmask has some bits.
Also mark cache_type const while at it and remove useless comments that
are related to processing of CBM bits.
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
There are unnecessary nested calls in fill_buf.c:
- run_fill_buf() calls fill_cache()
- alloc_buffer() calls malloc_and_init_memory()
Simplify the code flow and remove those unnecessary call levels by
moving the called code inside the calling function and remove the
duplicated error print.
Resolve the difference in run_fill_buf() and fill_cache() parameter
name into 'buf_size' which is more descriptive than 'span'. Also, while
moving the allocation related code, rename 'p' into 'buf' to be
consistent in naming the variables.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
MBM, MBA and CMT test cases call run_fill_buf() that in turn calls
fill_cache() to alloc and loop indefinitely around the buffer. This
binds buffer allocation and running the benchmark into a single bundle
so that a selftest cannot allocate a buffer once and reuse it. CAT test
doesn't want to loop around the buffer continuously and after rewrite
it needs the ability to allocate the buffer separately.
Split buffer allocation out of fill_cache() into alloc_buffer(). This
change is part of preparation for the new CAT test that allocates a
buffer and does multiple passes over the same buffer (but not in an
infinite loop).
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
A number function comments state the function return non-zero on
failure but in reality they can only return 0 on success and < 0 on
error.
Update the comments to say < 0 on error to match the behavior.
While at it, improve cat_val() comment to state that 0 means the test
was run (either pass or fail).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
perf_event_open_llc_miss() calls ctrlc_handler() to cleanup if
perf_event_open() returns an error. Those cleanups, however, are not
the responsibility of perf_event_open_llc_miss() and it thus interferes
unnecessarily with the usual cleanup pattern. Worse yet,
ctrlc_handler() calls exit() in the end preventing the ordinary cleanup
done in the calling function from executing.
ctrlc_handler() should only be used as a signal handler, not during
normal error handling.
Remove call to ctrlc_handler() from perf_event_open_llc_miss(). As
unmounting resctrlfs and test cleanup are already handled properly
by error rollbacks in the calling functions, no other changes are
necessary.
Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
A number of functions in the resctrl selftests return errno. It is
problematic because errno is positive which is often counterintuitive.
Also, every site returning errno prints the error message already with
ksft_perror() so there is not much added value in returning the precise
error code.
Simply convert all places returning errno to return -1 that is typical
userspace error code in case of failures.
While at it, improve resctrl_val() comment to state that 0 means the
test was run (either pass or fail).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
The resctrl selftest code contains a number of perror() calls. Some of
them come with hash character and some don't. The kselftest framework
provides ksft_perror() that is compatible with test output formatting
so it should be used instead of adding custom hash signs.
Some perror() calls are too far away from anything that sets error.
For those call sites, ksft_print_msg() must be used instead.
Convert perror() to ksft_perror() or ksft_print_msg().
Other related changes:
- Remove hash signs
- Remove trailing stops & newlines from ksft_perror()
- Add terminating newlines for converted ksft_print_msg()
- Use consistent capitalization
- Small fixes/tweaks to typos & grammar of the messages
- Extract error printing out of PARENT_EXIT() to be able to
differentiate
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
The netdev CI is reporting failures for the pmtu test:
[ 115.929264] br0: port 2(vxlan_a) entered forwarding state
# 2024/02/08 17:33:22 socat[7871] E bind(7, {AF=10 [0000:0000:0000:0000:0000:0000:0000:0000]:50000}, 28): Address already in use
# 2024/02/08 17:33:22 socat[7877] E write(7, 0x5598fb6ff000, 8192): Connection refused
# TEST: IPv6, bridged vxlan4: PMTU exceptions [FAIL]
# File size 0 mismatches exepcted value in locally bridged vxlan test
The root cause is apparently a socket created by a previous iteration
of the relevant loop still lasting in LAST_ACK state.
Note that even the file size check is racy, the receiver process dumping
the file could still be running in background
Allow the listener to bound on the same local port via SO_REUSEADDR and
collect file output file size only after the listener completion.
Fixes: 136a1b434bbb ("selftests: net: test vxlan pmtu exceptions with tcp")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/4f51c11a1ce7ca7a4dabd926cffff63dadac9ba1.1707731086.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The helper waiting for a listener port can match any socket whose
hexadecimal representation of source or destination addresses
matches that of the given port.
Additionally, any socket state is accepted.
All the above can let the helper return successfully before the
relevant listener is actually ready, with unexpected results.
So far I could not find any related failure in the netdev CI, but
the next patch is going to make the critical event more easily
reproducible.
Address the issue matching the port hex only vs the relevant socket
field and additionally checking the socket state for TCP sockets.
Fixes: 3bdd9fd29cb0 ("selftests/net: synchronize udpgro tests' tx and rx connection")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/192b3dbc443d953be32991d1b0ca432bd4c65008.1707731086.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The mentioned test is failing in slow environments:
# SO_TXTIME ipv4 clock monotonic
# ./so_txtime: recv: timeout: Resource temporarily unavailable
not ok 1 selftests: net: so_txtime.sh # exit=1
Tuning the tolerance in the test binary is error-prone and doomed
to failures is slow-enough environment.
Just resort to suppress any error in such cases. Note to suppress
them we need first to refactor a bit the code moving it to explicit
error handling.
Fixes: af5136f95045 ("selftests/net: SO_TXTIME with ETF and FQ")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/2142d9ed4b5c5aa07dd1b455779625d91b175373.1707730902.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The gro self-tests sends the packets to be aggregated with
multiple write operations.
When running is slow environment, it's hard to guarantee that
the GRO engine will wait for the last packet in an intended
train.
The above causes almost deterministic failures in our CI for
the 'large' test-case.
Address the issue explicitly ignoring failures for such case
in slow environments (KSFT_MACHINE_SLOW==true).
Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/97d3ba83f5a2bfeb36f6bc0fb76724eb3dafb608.1707729403.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Older glibc's netinet/in.h may leave IPPROTO_MPTCP undefined when
building ip_local_port_range.c, that leads to "error: use of undeclared
identifier 'IPPROTO_MPTCP'".
Define IPPROTO_MPTCP in such cases, just like in other MPTCP selftests.
Fixes: 122db5e3634b ("selftests/net: add MPTCP coverage for IP_LOCAL_PORT_RANGE")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/netdev/CA+G9fYvGO5q4o_Td_kyQgYieXWKw6ktMa-Q0sBu6S-0y3w2aEQ@mail.gmail.com/
Signed-off-by: Maxim Galaganov <max@internet.ru>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Link: https://lore.kernel.org/r/20240209132512.254520-1-max@internet.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"21 hotfixes. 12 are cc:stable and the remainder pertain to post-6.7
issues or aren't considered to be needed in earlier kernel versions"
* tag 'mm-hotfixes-stable-2024-02-10-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits)
nilfs2: fix potential bug in end_buffer_async_write
mm/damon/sysfs-schemes: fix wrong DAMOS tried regions update timeout setup
nilfs2: fix hang in nilfs_lookup_dirty_data_buffers()
MAINTAINERS: Leo Yan has moved
mm/zswap: don't return LRU_SKIP if we have dropped lru lock
fs,hugetlb: fix NULL pointer dereference in hugetlbs_fill_super
mailmap: switch email address for John Moon
mm: zswap: fix objcg use-after-free in entry destruction
mm/madvise: don't forget to leave lazy MMU mode in madvise_cold_or_pageout_pte_range()
arch/arm/mm: fix major fault accounting when retrying under per-VMA lock
selftests: core: include linux/close_range.h for CLOSE_RANGE_* macros
mm/memory-failure: fix crash in split_huge_page_to_list from soft_offline_page
mm: memcg: optimize parent iteration in memcg_rstat_updated()
nilfs2: fix data corruption in dsync block recovery for small block sizes
mm/userfaultfd: UFFDIO_MOVE implementation should use ptep_get()
exit: wait_task_zombie: kill the no longer necessary spin_lock_irq(siglock)
fs/proc: do_task_stat: use sig->stats_lock to gather the threads/children stats
fs/proc: do_task_stat: move thread_group_cputime_adjusted() outside of lock_task_sighand()
getrusage: use sig->stats_lock rather than lock_task_sighand()
getrusage: move thread_group_cputime_adjusted() outside of lock_task_sighand()
...
|
|
This exact case was fail for async crypto and we weren't
catching it.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a test case into the netlink checks that will show the number of
nested action recursions won't exceed 16. Going to 17 on a small
clone call isn't enough to exhaust the stack on (most) systems, so
it should be safe to run even on systems that don't have the fix
applied.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240207132416.1488485-3-aconole@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The redirection test case fails in the netdev CI on debug kernels
because an FDB entry is learned despite the presence of a tc filter that
redirects incoming traffic [1].
I am unable to reproduce the failure locally, but I can see how it can
happen given that learning is first enabled and only then the ingress tc
filter is configured. On debug kernels the time window between these two
operations is longer compared to regular kernels, allowing random
packets to be transmitted and trigger learning.
Fix by reversing the order and configure the ingress tc filter before
enabling learning.
[1]
[...]
# TEST: Locked port MAB redirect [FAIL]
# Locked entry created for redirected traffic
Fixes: 38c43a1ce758 ("selftests: forwarding: Add test case for traffic redirection from a locked port")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208155529.1199729-5-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Suppress the following grep warnings:
[...]
INFO: # Port group entries configuration tests - (*, G)
TEST: Common port group entries configuration tests (IPv4 (*, G)) [ OK ]
TEST: Common port group entries configuration tests (IPv6 (*, G)) [ OK ]
grep: warning: stray \ before /
grep: warning: stray \ before /
grep: warning: stray \ before /
TEST: IPv4 (*, G) port group entries configuration tests [ OK ]
grep: warning: stray \ before /
grep: warning: stray \ before /
grep: warning: stray \ before /
TEST: IPv6 (*, G) port group entries configuration tests [ OK ]
[...]
They do not fail the test, but do clutter the output.
Fixes: b6d00da08610 ("selftests: forwarding: Add bridge MDB test")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208155529.1199729-4-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After enabling a multicast querier on the bridge (like the test is
doing), the bridge will wait for the Max Response Delay before starting
to forward according to its MDB in order to let Membership Reports
enough time to be received and processed.
Currently, the test is waiting for exactly the default Max Response
Delay (10 seconds) which is racy and leads to failures [1].
Fix by reducing the Max Response Delay to 1 second.
[1]
[...]
# TEST: IPv4 host entries forwarding tests [FAIL]
# Packet locally received after flood
Fixes: b6d00da08610 ("selftests: forwarding: Add bridge MDB test")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208155529.1199729-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After enabling a multicast querier on the bridge (like the test is
doing), the bridge will wait for the Max Response Delay before starting
to forward according to its MDB in order to let Membership Reports
enough time to be received and processed.
Currently, the test is waiting for exactly the default Max Response
Delay (10 seconds) which is racy and leads to failures [1].
Fix by reducing the Max Response Delay to 1 second.
[1]
[...]
# TEST: L2 miss - Multicast (IPv4) [FAIL]
# Unregistered multicast filter was hit after adding MDB entry
Fixes: 8c33266ae26a ("selftests: forwarding: Add layer 2 miss test cases")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208155529.1199729-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The test toggles the carrier of a bridge port in order to test the
bridge backup port feature.
Due to the linkwatch delayed work the carrier change is not always
reflected fast enough to the bridge driver and packets are not forwarded
as the test expects, resulting in failures [1].
Fix by busy waiting on the bridge port state until it changes to the
desired state following the carrier change.
[1]
# Backup port
# -----------
[...]
# TEST: swp1 carrier off [ OK ]
# TEST: No forwarding out of swp1 [FAIL]
[ 641.995910] br0: port 1(swp1) entered disabled state
# TEST: No forwarding out of vx0 [ OK ]
Fixes: b408453053fb ("selftests: net: Add bridge backup port and backup nexthop ID test")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208123110.1063930-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The reuseport_addr_any.sh is currently skipping DCCP tests and
pmtu.sh is skipping all the FOU/GUE related cases: add the missing
options.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/38d3ca7f909736c1aef56e6244d67c82a9bba6ff.1707326987.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from WiFi and netfilter.
Current release - regressions:
- nic: intel: fix old compiler regressions
- netfilter: ipset: missing gc cancellations fixed
Current release - new code bugs:
- netfilter: ctnetlink: fix filtering for zone 0
Previous releases - regressions:
- core: fix from address in memcpy_to_iter_csum()
- netfilter: nfnetlink_queue: un-break NF_REPEAT
- af_unix: fix memory leak for dead unix_(sk)->oob_skb in GC.
- devlink: avoid potential loop in devlink_rel_nested_in_notify_work()
- iwlwifi:
- mvm: fix a battery life regression
- fix double-free bug
- mac80211: fix waiting for beacons logic
- nic: nfp: flower: prevent re-adding mac index for bonded port
Previous releases - always broken:
- rxrpc: fix generation of serial numbers to skip zero
- tipc: check the bearer type before calling tipc_udp_nl_bearer_add()
- tunnels: fix out of bounds access when building IPv6 PMTU error
- nic: hv_netvsc: register VF in netvsc_probe if NET_DEVICE_REGISTER
missed
- nic: atlantic: fix DMA mapping for PTP hwts ring
Misc:
- selftests: more fixes to deal with very slow hosts"
* tag 'net-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (80 commits)
netfilter: nft_set_pipapo: remove scratch_aligned pointer
netfilter: nft_set_pipapo: add helper to release pcpu scratch area
netfilter: nft_set_pipapo: store index in scratch maps
netfilter: nft_set_rbtree: skip end interval element from gc
netfilter: nfnetlink_queue: un-break NF_REPEAT
netfilter: nf_tables: use timestamp to check for set element timeout
netfilter: nft_ct: reject direction for ct id
netfilter: ctnetlink: fix filtering for zone 0
s390/qeth: Fix potential loss of L3-IP@ in case of network issues
netfilter: ipset: Missing gc cancellations fixed
octeontx2-af: Initialize maps.
net: ethernet: ti: cpsw: enable mac_managed_pm to fix mdio
net: ethernet: ti: cpsw_new: enable mac_managed_pm to fix mdio
netfilter: nft_set_pipapo: remove static in nft_pipapo_get()
netfilter: nft_compat: restrict match/target protocol to u16
netfilter: nft_compat: reject unused compat flag
netfilter: nft_compat: narrow down revision to unsigned 8-bits
net: intel: fix old compiler regressions
MAINTAINERS: Maintainer change for rds
selftests: cmsg_ipv6: repeat the exact packet
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Narrow down target/match revision to u8 in nft_compat.
2) Bail out with unused flags in nft_compat.
3) Restrict layer 4 protocol to u16 in nft_compat.
4) Remove static in pipapo get command that slipped through when
reducing set memory footprint.
5) Follow up incremental fix for the ipset performance regression,
this includes the missing gc cancellation, from Jozsef Kadlecsik.
6) Allow to filter by zone 0 in ctnetlink, do not interpret zone 0
as no filtering, from Felix Huettner.
7) Reject direction for NFT_CT_ID.
8) Use timestamp to check for set element expiration while transaction
is handled to prevent garbage collection from removing set elements
that were just added by this transaction. Packet path and netlink
dump/get path still use current time to check for expiration.
9) Restore NF_REPEAT in nfnetlink_queue, from Florian Westphal.
10) map_index needs to be percpu and per-set, not just percpu.
At this time its possible for a pipapo set to fill the all-zero part
with ones and take the 'might have bits set' as 'start-from-zero' area.
From Florian Westphal. This includes three patches:
- Change scratchpad area to a structure that provides space for a
per-set-and-cpu toggle and uses it of the percpu one.
- Add a new free helper to prepare for the next patch.
- Remove the scratch_aligned pointer and makes AVX2 implementation
use the exact same memory addresses for read/store of the matching
state.
netfilter pull request 24-02-08
* tag 'nf-24-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_set_pipapo: remove scratch_aligned pointer
netfilter: nft_set_pipapo: add helper to release pcpu scratch area
netfilter: nft_set_pipapo: store index in scratch maps
netfilter: nft_set_rbtree: skip end interval element from gc
netfilter: nfnetlink_queue: un-break NF_REPEAT
netfilter: nf_tables: use timestamp to check for set element timeout
netfilter: nft_ct: reject direction for ct id
netfilter: ctnetlink: fix filtering for zone 0
netfilter: ipset: Missing gc cancellations fixed
netfilter: nft_set_pipapo: remove static in nft_pipapo_get()
netfilter: nft_compat: restrict match/target protocol to u16
netfilter: nft_compat: reject unused compat flag
netfilter: nft_compat: narrow down revision to unsigned 8-bits
====================
Link: https://lore.kernel.org/r/20240208112834.1433-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
previously filtering for the default zone would actually skip the zone
filter and flush all zones.
Fixes: eff3c558bb7e ("netfilter: ctnetlink: support filtering by zone")
Reported-by: Ilya Maximets <i.maximets@ovn.org>
Closes: https://lore.kernel.org/netdev/2032238f-31ac-4106-8f22-522e76df5a12@ovn.org/
Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Correct header file is needed for getting CLOSE_RANGE_* macros.
Previously it was tested with newer glibc which didn't show the need to
include the header which was a mistake.
Link: https://lkml.kernel.org/r/20231024155137.219700-1-usama.anjum@collabora.com
Fixes: ec54424923cf ("selftests: core: remove duplicate defines")
Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com>
Link: https://lore.kernel.org/all/7161219e-0223-d699-d6f3-81abd9abf13b@arm.com
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Pull kvm fixes from Paolo Bonzini:
"x86 guest:
- Avoid false positive for check that only matters on AMD processors
x86:
- Give a hint when Win2016 might fail to boot due to XSAVES &&
!XSAVEC configuration
- Do not allow creating an in-kernel PIT unless an IOAPIC already
exists
RISC-V:
- Allow ISA extensions that were enabled for bare metal in 6.8 (Zbc,
scalar and vector crypto, Zfh[min], Zihintntl, Zvfh[min], Zfa)
S390:
- fix CC for successful PQAP instruction
- fix a race when creating a shadow page"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
x86/coco: Define cc_vendor without CONFIG_ARCH_HAS_CC_PLATFORM
x86/kvm: Fix SEV check in sev_map_percpu_data()
KVM: x86: Give a hint when Win2016 might fail to boot due to XSAVES erratum
KVM: x86: Check irqchip mode before create PIT
KVM: riscv: selftests: Add Zfa extension to get-reg-list test
RISC-V: KVM: Allow Zfa extension for Guest/VM
KVM: riscv: selftests: Add Zvfh[min] extensions to get-reg-list test
RISC-V: KVM: Allow Zvfh[min] extensions for Guest/VM
KVM: riscv: selftests: Add Zihintntl extension to get-reg-list test
RISC-V: KVM: Allow Zihintntl extension for Guest/VM
KVM: riscv: selftests: Add Zfh[min] extensions to get-reg-list test
RISC-V: KVM: Allow Zfh[min] extensions for Guest/VM
KVM: riscv: selftests: Add vector crypto extensions to get-reg-list test
RISC-V: KVM: Allow vector crypto extensions for Guest/VM
KVM: riscv: selftests: Add scaler crypto extensions to get-reg-list test
RISC-V: KVM: Allow scalar crypto extensions for Guest/VM
KVM: riscv: selftests: Add Zbc extension to get-reg-list test
RISC-V: KVM: Allow Zbc extension for Guest/VM
KVM: s390: fix cc for successful PQAP
KVM: s390: vsie: fix race during shadow creation
|
|
Ensure that pidfd_getfd() reports -ESRCH if the task is already exiting.
Signed-off-by: Tycho Andersen <tandersen@netflix.com>
Link: https://lore.kernel.org/r/20240206192357.81942-1-tycho@tycho.pizza
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
cmsg_ipv6 test requests tcpdump to capture 4 packets,
and sends until tcpdump quits. Only the first packet
is "real", however, and the rest are basic UDP packets.
So if tcpdump doesn't start in time it will miss
the real packet and only capture the UDP ones.
This makes the test fail on slow machine (no KVM or with
debug enabled) 100% of the time, while it passes in fast
environments.
Repeat the "real" / expected packet.
Fixes: 9657ad09e1fa ("selftests: net: test IPV6_TCLASS")
Fixes: 05ae83d5a4a2 ("selftests: net: test IPV6_HOPLIMIT")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The seccomp benchmark test (for validating the benefit of bitmaps) can
be sensitive to scheduling speed, so pin the process to a single CPU,
which appears to significantly improve reliability, and loosen the
"close enough" checking to allow up to 10% variance instead of 1%.
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202402061002.3a8722fd-oliver.sang@intel.com
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Drop dirty_log_page_splitting_test's assertion that the number of 4KiB
pages remains the same across dirty logging being enabled and disabled, as
the test doesn't guarantee that mappings outside of the memslots being
dirty logged are stable, e.g. KVM's mappings for code and pages in
memslot0 can be zapped by things like NUMA balancing.
To preserve the spirit of the check, assert that (a) the number of 4KiB
pages after splitting is _at least_ the number of 4KiB pages across all
memslots under test, and (b) the number of hugepages before splitting adds
up to the number of pages across all memslots under test. (b) is a little
tenuous as it relies on memslot0 being incompatible with transparent
hugepages, but that holds true for now as selftests explicitly madvise()
MADV_NOHUGEPAGE for memslot0 (__vm_create() unconditionally specifies the
backing type as VM_MEM_SRC_ANONYMOUS).
Reported-by: Yi Lai <yi1.lai@intel.com>
Reported-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20240131222728.4100079-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
When finishing the final iteration of dirty_log_test testcase, set
host_quit _before_ the final "continue" so that the vCPU worker doesn't
run an extra iteration, and delete the hack-a-fix of an extra "continue"
from the dirty ring testcase. This fixes a bug where the extra post to
sem_vcpu_cont may not be consumed, which results in failures in subsequent
runs of the testcases. The bug likely was missed during development as
x86 supports only a single "guest mode", i.e. there aren't any subsequent
testcases after the dirty ring test, because for_each_guest_mode() only
runs a single iteration.
For the regular dirty log testcases, letting the vCPU run one extra
iteration is a non-issue as the vCPU worker waits on sem_vcpu_cont if and
only if the worker is explicitly told to stop (vcpu_sync_stop_requested).
But for the dirty ring test, which needs to periodically stop the vCPU to
reap the dirty ring, letting the vCPU resume the guest _after_ the last
iteration means the vCPU will get stuck without an extra "continue".
However, blindly firing off an post to sem_vcpu_cont isn't guaranteed to
be consumed, e.g. if the vCPU worker sees host_quit==true before resuming
the guest. This results in a dangling sem_vcpu_cont, which leads to
subsequent iterations getting out of sync, as the vCPU worker will
continue on before the main task is ready for it to resume the guest,
leading to a variety of asserts, e.g.
==== Test Assertion Failure ====
dirty_log_test.c:384: dirty_ring_vcpu_ring_full
pid=14854 tid=14854 errno=22 - Invalid argument
1 0x00000000004033eb: dirty_ring_collect_dirty_pages at dirty_log_test.c:384
2 0x0000000000402d27: log_mode_collect_dirty_pages at dirty_log_test.c:505
3 (inlined by) run_test at dirty_log_test.c:802
4 0x0000000000403dc7: for_each_guest_mode at guest_modes.c:100
5 0x0000000000401dff: main at dirty_log_test.c:941 (discriminator 3)
6 0x0000ffff9be173c7: ?? ??:0
7 0x0000ffff9be1749f: ?? ??:0
8 0x000000000040206f: _start at ??:?
Didn't continue vcpu even without ring full
Alternatively, the test could simply reset the semaphores before each
testcase, but papering over hacks with more hacks usually ends in tears.
Reported-by: Shaoqin Huang <shahuang@redhat.com>
Fixes: 84292e565951 ("KVM: selftests: Add dirty ring buffer test")
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Link: https://lore.kernel.org/r/20240202231831.354848-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Leverage previously added MOCK_FLAGS_DEVICE_HUGE_IOVA flag to create an
IOMMU domain with more than MOCK_IO_PAGE_SIZE supported.
Plumb the hugetlb backing memory for buffer allocation and change the
expected page size to MOCK_HUGE_PAGE_SIZE (1M) when hugepage variant test
cases are used. These so far are limited to 128M and 256M IOVA range tests
cases which is when 1M hugepages can be used.
Link: https://lore.kernel.org/r/20240202133415.23819-9-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Rework the functions that test and set the bitmaps to receive a new
parameter (the pte_page_size) that reflects the expected PTE size in the
page tables. The same scheme is still used i.e. even bits are dirty and
odd page indexes aren't dirty. Here it just refactors to consider the size
of the PTE rather than hardcoded to IOMMU mock base page assumptions.
While at it, refactor dirty bitmap tests to use the idev_id created by the
fixture instead of creating a new one.
This is in preparation for doing tests with IOMMU hugepages where multiple
bits set as part of recording a whole hugepage as dirty and thus the
pte_page_size will vary depending on io hugepages or io base pages.
Link: https://lore.kernel.org/r/20240202133415.23819-6-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Exercise the dirty tracking bitmaps with byte unaligned addresses in
addition to the PAGE_SIZE unaligned bitmaps, using a address towards the
end of the page boundary.
In doing so, increase the tailroom we allocate for the bitmap from
MOCK_PAGE_SIZE(2K) into PAGE_SIZE(4K), such that we can test end of bitmap
boundary.
Link: https://lore.kernel.org/r/20240202133415.23819-4-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Selftests here check not only that connect()/accept() for
TCP-AO/TCP-MD5/non-signed-TCP combinations do/don't establish
connections, but also counters: those are per-AO-key, per-socket and
per-netns.
The counters are checked on the server's side, as the server listener
has TCP-AO/TCP-MD5/no keys for different peers. All tests run in
the same namespaces with the same veth pair, created in test_init().
After close() in both client and server, the sides go through
the regular FIN/ACK + FIN/ACK sequence, which goes in the background.
If the selftest has already started a new testing scenario, read
per-netns counters - it may fail in the end iff it doesn't expect
the TCPAOGood per-netns counters go up during the test.
Let's just kill both TCP-AO sides - that will avoid any asynchronous
background TCP-AO segments going to either sides.
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/all/20240201132153.4d68f45e@kernel.org/T/#u
Fixes: 6f0c472a6815 ("selftests/net: Add TCP-AO + TCP-MD5 + no sign listen socket tests")
Signed-off-by: Dmitry Safonov <dima@arista.com>
Link: https://lore.kernel.org/r/20240202-unsigned-md5-netns-counters-v1-1-8b90c37c0566@arista.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In very slow environments, most big TCP cases including
segmentation and reassembly of big TCP packets have a good
chance to fail: by default the TCP client uses write size
well below 64K. If the host is low enough autocorking is
unable to build real big TCP packets.
Address the issue using much larger write operations.
Note that is hard to observe the issue without an extremely
slow and/or overloaded environment; reduce the TCP transfer
time to allow for much easier/faster reproducibility.
Fixes: 6bb382bcf742 ("selftests: add a selftest for big tcp")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Calling get_system_loc_code before checking devfd and errno fails the test
when the device is not available, the expected behaviour is a SKIP.
Change the order of 'SKIP_IF_MSG' to correctly SKIP when the /dev/
papr-vpd device is not available.
Test output before:
Test FAILED on line 271
Test output after:
[SKIP] Test skipped on line 266: /dev/papr-vpd not present
Signed-off-by: R Nageswara Sastry <rnsastry@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240131130859.14968-1-rnsastry@linux.ibm.com
|
|
Using hard-coded constant timeout to wait for some expected
event is deemed to fail sooner or later, especially in slow
env.
Our CI has spotted another of such race:
# TEST: ipv6: cleanup of cached exceptions - nexthop objects [FAIL]
# can't delete veth device in a timely manner, PMTU dst likely leaked
Replace the crude sleep with a loop looking for the expected condition
at low interval for a much longer range.
Fixes: b3cc4f8a8a41 ("selftests: pmtu: add explicit tests for PMTU exceptions cleanup")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/fd5c745e9bb665b724473af6a9373a8c2a62b247.1706812005.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The pmtu.sh test uses a few TCP listener in a problematic way:
It hard-codes a constant timeout to wait for the listener starting-up
in background. That introduces unneeded latency and on very slow and
busy host it can fail.
Additionally the test starts again the same listener in the same
namespace on the same port, just after the previous connection
completed. Fast host can attempt starting the new server before the
old one really closed the socket.
Address the issues using the wait_local_port_listen helper and
explicitly waiting for the background listener process exit.
Fixes: 136a1b434bbb ("selftests: net: test vxlan pmtu exceptions with tcp")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/f8e8f6d44427d8c45e9f6a71ee1a321047452087.1706812005.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The setup_ns helper marks the testns global variable as
readonly. Later attempts to set such variable are unsuccessful,
causing a couple test failures.
Avoid completely the variable re-initialization and let the
function access the global value.
Fixes: e9ce7ededf14 ("selftests: rtnetlink: use setup_ns in bonding test")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/6e7c937c8ff73ca52a21a4a536a13a76ec0173a8.1706812005.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The udpgro_fwd.sh self-tests are somewhat unstable. There are
a few timing constraints the we struggle to meet on very slow
environments.
Instead of skipping the whole tests in such envs, increase the
test resilience WRT very slow hosts: increase the inter-packets
timeouts, avoid resetting the counters every second and finally
disable reduce the background traffic noise.
Tested with:
for I in $(seq 1 100); do
./tools/testing/selftests/kselftest_install/run_kselftest.sh \
-t net:udpgro_fwd.sh || exit -1
done
in a slow environment.
Fixes: a062260a9d5f ("selftests: net: add UDP GRO forwarding self-tests")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/f4b6b11064a0d39182a9ae6a853abae3e9b4426a.1706812005.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|