Age | Commit message (Collapse) | Author |
|
Replace literal 0 with macro PKEY_UNRESTRICTED where pkey_*() functions
are used in mm selftests for memory protection keys.
Signed-off-by: Yury Khrustalev <yury.khrustalev@arm.com>
Suggested-by: Joey Gouly <joey.gouly@arm.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20250113170619.484698-3-yury.khrustalev@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Add a selftest for io_uring zero copy Rx. This test cannot run locally
and requires a remote host to be configured in net.config. The remote
host must have hardware support for zero copy Rx as listed in the
documentation page. The test will restore the NIC config back to before
the test and is idempotent.
liburing is required to compile the test and be installed on the remote
host running the test.
Signed-off-by: David Wei <dw@davidwei.uk>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20250215000947.789731-12-dw@davidwei.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next into for-6.15/io_uring-rx-zc
Merge networking zerocopy receive tree, to get the prep patches for
the io_uring rx zc support.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (63 commits)
net: add helpers for setting a memory provider on an rx queue
net: page_pool: add memory provider helpers
net: prepare for non devmem TCP memory providers
net: page_pool: add a mp hook to unregister_netdevice*
net: page_pool: add callback for mp info printing
netdev: add io_uring memory provider info
net: page_pool: create hooks for custom memory providers
net: generalise net_iov chunk owners
net: prefix devmem specific helpers
net: page_pool: don't cast mp param to devmem
tools: ynl: add all headers to makefile deps
eth: fbnic: set IFF_UNICAST_FLT to avoid enabling promiscuous mode when adding unicast addrs
eth: fbnic: add MAC address TCAM to debugfs
tools: ynl-gen: support limits using definitions
tools: ynl-gen: don't output external constants
net/mlx5e: Avoid WARN_ON when configuring MQPRIO with HTB offload enabled
net/mlx5e: Remove unused mlx5e_tc_flow_action struct
net/mlx5: Remove stray semicolon in LAG port selection table creation
net/mlx5e: Support FEC settings for 200G per lane link modes
net/mlx5: Add support for 200Gbps per lane link modes
...
|
|
We need the faux_device changes in here for future work.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Large set of fixes for vector handling, especially in the
interactions between host and guest state.
This fixes a number of bugs affecting actual deployments, and
greatly simplifies the FP/SIMD/SVE handling. Thanks to Mark Rutland
for dealing with this thankless task.
- Fix an ugly race between vcpu and vgic creation/init, resulting in
unexpected behaviours
- Fix use of kernel VAs at EL2 when emulating timers with nVHE
- Small set of pKVM improvements and cleanups
x86:
- Fix broken SNP support with KVM module built-in, ensuring the PSP
module is initialized before KVM even when the module
infrastructure cannot be used to order initcalls
- Reject Hyper-V SEND_IPI hypercalls if the local APIC isn't being
emulated by KVM to fix a NULL pointer dereference
- Enter guest mode (L2) from KVM's perspective before initializing
the vCPU's nested NPT MMU so that the MMU is properly tagged for
L2, not L1
- Load the guest's DR6 outside of the innermost .vcpu_run() loop, as
the guest's value may be stale if a VM-Exit is handled in the
fastpath"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
x86/sev: Fix broken SNP support with KVM module built-in
KVM: SVM: Ensure PSP module is initialized if KVM module is built-in
crypto: ccp: Add external API interface for PSP module initialization
KVM: arm64: vgic: Hoist SGI/PPI alloc from vgic_init() to kvm_create_vgic()
KVM: arm64: timer: Drop warning on failed interrupt signalling
KVM: arm64: Fix alignment of kvm_hyp_memcache allocations
KVM: arm64: Convert timer offset VA when accessed in HYP code
KVM: arm64: Simplify warning in kvm_arch_vcpu_load_fp()
KVM: arm64: Eagerly switch ZCR_EL{1,2}
KVM: arm64: Mark some header functions as inline
KVM: arm64: Refactor exit handlers
KVM: arm64: Refactor CPTR trap deactivation
KVM: arm64: Remove VHE host restore of CPACR_EL1.SMEN
KVM: arm64: Remove VHE host restore of CPACR_EL1.ZEN
KVM: arm64: Remove host FPSIMD saving for non-protected KVM
KVM: arm64: Unconditionally save+flush host FPSIMD/SVE/SME state
KVM: x86: Load DR6 with guest value only before entering .vcpu_run() loop
KVM: nSVM: Enter guest mode before initializing nested NPT MMU
KVM: selftests: Add CPUID tests for Hyper-V features that need in-kernel APIC
KVM: selftests: Manage CPUID array in Hyper-V CPUID test's core helper
...
|
|
The driver for the 8250 console is not used, as no port is found.
Instead the prom0 bootconsole is used the whole time.
The prom driver translates '\n' to '\r\n' before handing of the message
off to the firmware. The firmware performs the same translation again.
In the final output produced by QEMU each line ends with '\r\r\n'.
This breaks the kunit parser, which can only handle '\r\n' and '\n'.
Use the Zilog console instead. It works correctly, is the one documented
by the QEMU manual and also saves a bit of codesize:
Before=4051011, After=4023326, chg -0.68%
Observed on QEMU 9.2.0.
Link: https://lore.kernel.org/r/20250214-kunit-qemu-sparc-console-v1-1-ba1dfdf8f0b1@linutronix.de
Fixes: 87c9c1631788 ("kunit: tool: add support for QEMU")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
As noted in [0], SeaBIOS (QEMU default) makes a mess of the terminal,
qboot does not.
It turns out this is actually useful with kunit.py, since the user is
exposed to this issue if they set --raw_output=all.
qboot is also faster than SeaBIOS, but it's is marginal for this
usecase.
[0] https://lore.kernel.org/all/CA+i-1C0wYb-gZ8Mwh3WSVpbk-LF-Uo+njVbASJPe1WXDURoV7A@mail.gmail.com/
Both SeaBIOS and qboot are x86-specific.
Link: https://lore.kernel.org/r/20250124-kunit-qboot-v1-1-815e4d4c6f7c@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
kernfs_rename_lock is used to obtain stable kernfs_node::{name|parent}
pointer. This is a preparation to access kernfs_node::parent under RCU
and ensure that the pointer remains stable under the RCU lifetime
guarantees.
For a complete path, as it is done in kernfs_path_from_node(), the
kernfs_rename_lock is still required in order to obtain a stable parent
relationship while computing the relevant node depth. This must not
change while the nodes are inspected in order to build the path.
If the kernfs user never moves the nodes (changes the parent) then the
kernfs_rename_lock is not required and the RCU guarantees are
sufficient. This "restriction" can be set with
KERNFS_ROOT_INVARIANT_PARENT. Otherwise the lock is required.
Rename kernfs_node::parent to kernfs_node::__parent to denote the RCU
access and use RCU accessor while accessing the node.
Make cgroup use KERNFS_ROOT_INVARIANT_PARENT since the parent here can
not change.
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20250213145023.2820193-6-bigeasy@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add a simple repro for the issue of miscalculating LDX/STX/ST CO-RE
relocation size adjustment when the CO-RE relocation target type is an
ARRAY.
We need to make sure that compiler generates LDX/STX/ST instruction with
CO-RE relocation against entire ARRAY type, not ARRAY's element. With
the code pattern in selftest, we get this:
59: 61 71 00 00 00 00 00 00 w1 = *(u32 *)(r7 + 0x0)
00000000000001d8: CO-RE <byte_off> [5] struct core_reloc_arrays::a (0:0)
Where offset of `int a[5]` is embedded (through CO-RE relocation) into memory
load instruction itself.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250207014809.1573841-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Added test cases to ensure that programs with stack sizes exceeding 512
bytes are restricted in non-JITed mode, and can be executed normally in
JITed mode, even with stack sizes exceeding 512 bytes due to the presence
of may_goto instructions.
Test result:
echo "0" > /proc/sys/net/core/bpf_jit_enable
./test_progs -t verifier_stack_ptr
...
stack size 512 with may_goto with jit:SKIP
stack size 512 with may_goto without jit:OK
...
Summary: 1/27 PASSED, 25 SKIPPED, 0 FAILED
echo "1" > /proc/sys/net/core/bpf_jit_enable
./test_progs -t verifier_stack_ptr
...
stack size 512 with may_goto with jit:OK
stack size 512 with may_goto without jit:SKIP
...
Summary: 1/27 PASSED, 25 SKIPPED, 0 FAILED
Signed-off-by: Jiayuan Chen <mrpre@163.com>
Link: https://lore.kernel.org/r/20250214091823.46042-4-mrpre@163.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In some cases, the verification logic under the interpreter and JIT
differs, such as may_goto, and the test program behaves differently under
different runtime modes, requiring separate verification logic for each
result.
Introduce __load_if_JITed and __load_if_no_JITed annotation for tests.
Signed-off-by: Jiayuan Chen <mrpre@163.com>
Link: https://lore.kernel.org/r/20250214091823.46042-3-mrpre@163.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
When working on OpenRISC support for restartable sequences I noticed
and fixed these two issues with the riscv support bits.
1 The 'inc' argument to RSEQ_ASM_OP_R_DEREF_ADDV was being implicitly
passed to the macro. Fix this by adding 'inc' to the list of macro
arguments.
2 The inline asm input constraints for 'inc' and 'off' use "er", The
riscv gcc port does not have an "e" constraint, this looks to be
copied from the x86 port. Fix this by just using an "r" constraint.
I have compile tested this only for riscv. However, the same fixes I
use in the OpenRISC rseq selftests and everything passes with no issues.
Fixes: 171586a6ab66 ("selftests/rseq: riscv: Template memory ordering and percpu access mode")
Signed-off-by: Stafford Horne <shorne@gmail.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250114170721.3613280-1-shorne@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:
- Fix lock imbalance in a corner case of dispatch_to_local_dsq()
- Migration disabled tasks were confusing some BPF schedulers and its
handling had a bug. Fix it and simplify the default behavior by
dispatching them automatically
- ops.tick(), ops.disable() and ops.exit_task() were incorrectly
disallowing kfuncs that require the task argument to be the rq
operation is currently operating on and thus is rq-locked.
Allow them.
- Fix autogroup migration handling bug which was occasionally
triggering a warning in the cgroup migration path
- tools/sched_ext, selftest and other misc updates
* tag 'sched_ext-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: Use SCX_CALL_OP_TASK in task_tick_scx
sched_ext: Fix the incorrect bpf_list kfunc API in common.bpf.h.
sched_ext: selftests: Fix grammar in tests description
sched_ext: Fix incorrect assumption about migration disabled tasks in task_can_run_on_remote_rq()
sched_ext: Fix migration disabled handling in targeted dispatches
sched_ext: Implement auto local dispatching of migration disabled tasks
sched_ext: Fix incorrect time delta calculation in time_delta()
sched_ext: Fix lock imbalance in dispatch_to_local_dsq()
sched_ext: selftests/dsp_local_on: Fix selftest on UP systems
tools/sched_ext: Add helper to check task migration state
sched_ext: Fix incorrect autogroup migration detection
sched_ext: selftests/dsp_local_on: Fix sporadic failures
selftests/sched_ext: Fix enum resolution
sched_ext: Include task weight in the error state dump
sched_ext: Fixes typos in comments
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- Fix a race window where a newly forked task could escape cgroup.kill
- Remove incorrectly included steal time from cpu.stat::usage_usec
- Minor update in selftest
* tag 'cgroup-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: Remove steal time from usage_usec
selftests/cgroup: use bash in test_cpuset_v1_hp.sh
cgroup: fix race between fork and cgroup.kill
|
|
Now that the binary stats cache infrastructure is largely scope agnostic,
add support for vCPU-scoped stats. Like VM stats, open and cache the
stats FD when the vCPU is created so that it's guaranteed to be valid when
vcpu_get_stats() is invoked.
Account for the extra per-vCPU file descriptor in kvm_set_files_rlimit(),
so that tests that create large VMs don't run afoul of resource limits.
To sanity check that the infrastructure actually works, and to get a bit
of bonus coverage, add an assert in x86's xapic_ipi_test to verify that
the number of HLTs executed by the test matches the number of HLT exits
observed by KVM.
Tested-by: Manali Shukla <Manali.Shukla@amd.com>
Link: https://lore.kernel.org/r/20250111005049.1247555-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Move the max vCPUs test's RLIMIT_NOFILE adjustments to common code, and
use the new helper to adjust the resource limit for non-barebones VMs by
default. x86's recalc_apic_map_test creates 512 vCPUs, and a future
change will open the binary stats fd for all vCPUs, which will put the
recalc APIC test above some distros' default limit of 1024.
Link: https://lore.kernel.org/r/20250111005049.1247555-8-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Get and cache a VM's binary stats FD when the VM is opened, as opposed to
waiting until the stats are first used. Opening the stats FD outside of
__vm_get_stat() will allow converting it to a scope-agnostic helper.
Note, this doesn't interfere with kvm_binary_stats_test's testcase that
verifies a stats FD can be used after its own VM's FD is closed, as the
cached FD is also closed during kvm_vm_free().
Link: https://lore.kernel.org/r/20250111005049.1247555-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a struct and helpers to manage the binary stats cache, which is
currently used only for VM-scoped stats. This will allow expanding the
selftests infrastructure to provide support for vCPU-scoped binary stats,
which, except for the ioctl to get the stats FD are identical to VM-scoped
stats.
Defer converting __vm_get_stat() to a scope-agnostic helper to a future
patch, as getting the stats FD from KVM needs to be moved elsewhere
before it can be made completely scope-agnostic.
Link: https://lore.kernel.org/r/20250111005049.1247555-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Turn vm_get_stat() into a macro that generates a string for the stat name,
as opposed to taking a string. This will allow hardening stat usage in
the future to generate errors on unknown stats at compile time.
No functional change intended.
Link: https://lore.kernel.org/r/20250111005049.1247555-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Fail the test if it attempts to read a stat that doesn't exist, e.g. due
to a typo (hooray, strings), or because the test tried to get a stat for
the wrong scope. As is, there's no indiciation of failure and @data is
left untouched, e.g. holds '0' or random stack data in most cases.
Fixes: 8448ec5993be ("KVM: selftests: Add NX huge pages test")
Link: https://lore.kernel.org/r/20250111005049.1247555-4-seanjc@google.com
[sean: fixup spelling mistake, courtesy of Colin Ian King]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Building the test creates binaries 'wait-pipe' and
'sandbox-and-launch' which need to be gitignore'd.
Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com>
Link: https://lore.kernel.org/r/20250210161101.6024-1-bharadwaj.raju777@gmail.com
[mic: Sort entries]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Extend protocol fixture with test suits for MPTCP protocol.
Add CONFIG_MPTCP and CONFIG_MPTCP_IPV6 options in config.
Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com>
Link: https://lore.kernel.org/r/20250205093651.1424339-4-ivanov.mikhail1@huawei-partners.com
Cc: <stable@vger.kernel.org> # 6.7.x
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Extend protocol_variant structure with protocol field (Cf. socket(2)).
Extend protocol fixture with TCP test suits with protocol=IPPROTO_TCP
which can be used as an alias for IPPROTO_IP (=0) in socket(2).
Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com>
Link: https://lore.kernel.org/r/20250205093651.1424339-3-ivanov.mikhail1@huawei-partners.com
Cc: <stable@vger.kernel.org> # 6.7.x
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Since commit 5155cbcdbf03 ("af_unix: Add a prompt to
CONFIG_AF_UNIX_OOB"), the Landlock selftests's configuration is not
enough to build a minimal kernel. Because scoped_signal_test checks
with the MSG_OOB flag, we need to enable CONFIG_AF_UNIX_OOB for tests:
# RUN fown.no_sandbox.sigurg_socket ...
# scoped_signal_test.c:420:sigurg_socket:Expected 1 (1) == send(client_socket, ".", 1, MSG_OOB) (-1)
# sigurg_socket: Test terminated by assertion
# FAIL fown.no_sandbox.sigurg_socket
...
Cc: Günther Noack <gnoack@google.com>
Acked-by: Florent Revest <revest@chromium.org>
Link: https://lore.kernel.org/r/20250211132531.1625566-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Two sets of tests are added to exercise the not _locked and _locked
version of the kfuncs. For both tests, user space accesses xattr
security.bpf.foo on a testfile. The BPF program is triggered by user
space access (on LSM hook inode_[set|get]_xattr) and sets or removes
xattr security.bpf.bar. Then user space then validates that xattr
security.bpf.bar is set or removed as expected.
Note that, in both tests, the BPF programs use the not _locked kfuncs.
The verifier picks the proper kfuncs based on the calling context.
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250130213549.3353349-6-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Extend test_progs fs_kfuncs to cover different xattr names. Specifically:
xattr name "user.kfuncs" and "security.bpf.xxx" can be read from BPF
program with kfuncs bpf_get_[file|dentry]_xattr(); while "security.bpf"
and "security.selinux" cannot be read.
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250130213549.3353349-3-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Fix a race condition between the main test_progs thread and the traffic
monitoring thread. The traffic monitor thread tries to print a line
using multiple printf and use flockfile() to prevent the line from being
torn apart. Meanwhile, the main thread doing io redirection can reassign
or close stdout when going through tests. A deadlock as shown below can
happen.
main traffic_monitor_thread
==== ======================
show_transport()
-> flockfile(stdout)
stdio_hijack_init()
-> stdout = open_memstream(log_buf, log_cnt);
...
env.subtest_state->stdout_saved = stdout;
...
funlockfile(stdout)
stdio_restore_cleanup()
-> fclose(env.subtest_state->stdout_saved);
After the traffic monitor thread lock stdout, A new memstream can be
assigned to stdout by the main thread. Therefore, the traffic monitor
thread later will not be able to unlock the original stdout. As the
main thread tries to access the old stdout, it will hang indefinitely
as it is still locked by the traffic monitor thread.
The deadlock can be reproduced by running test_progs repeatedly with
traffic monitor enabled:
for ((i=1;i<=100;i++)); do
./test_progs -a flow_dissector_skb* -m '*'
done
Fix this by only calling printf once and remove flockfile()/funlockfile().
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250213233217.553258-1-ameryhung@gmail.com
|
|
Cross-merge networking fixes after downstream PR (net-6.14-rc3).
No conflicts or adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter, wireless and bluetooth.
Kalle Valo steps down after serving as the WiFi driver maintainer for
over a decade.
Current release - fix to a fix:
- vsock: orphan socket after transport release, avoid null-deref
- Bluetooth: L2CAP: fix corrupted list in hci_chan_del
Current release - regressions:
- eth:
- stmmac: correct Rx buffer layout when SPH is enabled
- iavf: fix a locking bug in an error path
- rxrpc: fix alteration of headers whilst zerocopy pending
- s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH
- Revert "netfilter: flowtable: teardown flow if cached mtu is stale"
Current release - new code bugs:
- rxrpc: fix ipv6 path MTU discovery, only ipv4 worked
- pse-pd: fix deadlock in current limit functions
Previous releases - regressions:
- rtnetlink: fix netns refleak with rtnl_setlink()
- wifi: brcmfmac: use random seed flag for BCM4355 and BCM4364
firmware
Previous releases - always broken:
- add missing RCU protection of struct net throughout the stack
- can: rockchip: bail out if skb cannot be allocated
- eth: ti: am65-cpsw: base XDP support fixes
Misc:
- ethtool: tsconfig: update the format of hwtstamp flags, changes the
uAPI but this uAPI was not in any release yet"
* tag 'net-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
net: pse-pd: Fix deadlock in current limit functions
rxrpc: Fix ipv6 path MTU discovery
Reapply "net: skb: introduce and use a single page frag cache"
s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH
mlxsw: Add return value check for mlxsw_sp_port_get_stats_raw()
ipv6: mcast: add RCU protection to mld_newpack()
team: better TEAM_OPTION_TYPE_STRING validation
Bluetooth: L2CAP: Fix corrupted list in hci_chan_del
Bluetooth: btintel_pcie: Fix a potential race condition
Bluetooth: L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd
net: ethernet: ti: am65_cpsw: fix tx_cleanup for XDP case
net: ethernet: ti: am65-cpsw: fix RX & TX statistics for XDP_TX case
net: ethernet: ti: am65-cpsw: fix memleak in certain XDP cases
vsock/test: Add test for SO_LINGER null ptr deref
vsock: Orphan socket after transport release
MAINTAINERS: Add sctp headers to the general netdev entry
Revert "netfilter: flowtable: teardown flow if cached mtu is stale"
iavf: Fix a locking bug in an error path
rxrpc: Fix alteration of headers whilst zerocopy pending
net: phylink: make configuring clock-stop dependent on MAC support
...
|
|
Fixed grammar for a few tests of sched_ext.
Signed-off-by: Devaansh Kumar <devaanshk840@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Explicitly close() a TCP_ESTABLISHED (connectible) socket with SO_LINGER
enabled.
As for now, test does not verify if close() actually lingers.
On an unpatched machine, may trigger a null pointer dereference.
Tested-by: Luigi Leonardi <leonardi@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250210-vsock-linger-nullderef-v3-2-ef6244d02b54@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Extract a private header and convert the prime_numbers self-test to a
KUnit test. I considered parameterizing the test using
`KUNIT_CASE_PARAM` but didn't see how it was possible since the test
logic is entangled with the test parameter generation logic.
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Link: https://lore.kernel.org/r/20250208-prime_numbers-kunit-convert-v5-2-b0cb82ae7c7d@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
The list is getting overly long and any modifications introduce a lot of
noise and are prone to conflicts. Split the string into a bash array
and break that into multiple lines.
Link: https://lore.kernel.org/r/20250211-nolibc-test-archs-v1-1-8e55aa3369cf@weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
|
|
Close/free a VM's binary stats cache when the VM is released, not when the
VM is fully freed. When a VM is re-created, e.g. for state save/restore
tests, the stats FD and descriptor points at the old, defunct VM. The FD
is still valid, in that the underlying stats file won't be freed until the
FD is closed, but reading stats will always pull information from the old
VM.
Note, this is a benign bug in the current code base as none of the tests
that recreate VMs use binary stats.
Fixes: 83f6e109f562 ("KVM: selftests: Cache binary stats metadata for duration of test")
Link: https://lore.kernel.org/r/20250111005049.1247555-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
When allocating and freeing a VM's cached binary stats info, check for a
NULL descriptor, not a '0' file descriptor, as '0' is a legal FD. E.g. in
the unlikely scenario the kernel installs the stats FD at entry '0',
selftests would reallocate on the next __vm_get_stat() and/or fail to free
the stats in kvm_vm_free().
Fixes: 83f6e109f562 ("KVM: selftests: Cache binary stats metadata for duration of test")
Link: https://lore.kernel.org/r/20250111005049.1247555-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Now that dirty_log_test doesn't require running multiple iterations to
verify dirty pages, and actually runs the requested number of iterations,
drop the requirement that the test run at least "3" (which was really "2"
at the time the test was written) iterations.
Link: https://lore.kernel.org/r/20250111003004.1235645-21-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Actually run all requested iterations, instead of iterations-1 (the count
starts at '1' due to the need to avoid '0' as an in-memory value for a
dirty page).
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-20-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Set the per-iteration variables at the start of each iteration instead of
setting them before the loop, and at the end of each iteration. To ensure
the vCPU doesn't race ahead before the first iteration, simply have the
vCPU worker want for sem_vcpu_cont, which conveniently avoids the need to
special case posting sem_vcpu_cont from the loop.
Link: https://lore.kernel.org/r/20250111003004.1235645-19-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Now that each iteration collects all dirty entries and ensures the guest
*completes* at least one write, tighten the exemptions for the last dirty
page of the previous iteration. Specifically, the only legal value (other
than the current iteration) is N-1.
Unlike the last page for the current iteration, the in-progress write from
the previous iteration is guaranteed to have completed, otherwise the test
would have hung.
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-18-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Ensure the vCPU fully completes at least one write in each dirty_log_test
iteration, as failure to dirty any pages complicates verification and
forces the test to be overly conservative about possible values. E.g.
verification needs to allow the last dirty page from a previous iteration
to have *any* value, because the vCPU could get stuck for multiple
iterations, which is unlikely but can happen in heavily overloaded and/or
nested virtualization setups.
Somewhat arbitrarily set the minimum to 0x100/256; high enough to be
interesting, but not so high as to lead to pointlessly long runtimes.
Opportunistically report the number of writes per iteration for debug
purposes, and so that a human can sanity check the test. Due to each
write targeting a random page, the number of dirty pages will likely be
lower than the number of total writes, but it shouldn't be absurdly lower
(which would suggest the pRNG is broken)
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-17-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a sanity check that a completely garbage value wasn't written to
the last dirty page in the ring, e.g. that it doesn't contain the *next*
iteration's value.
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-16-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Collect all dirty entries during each iteration of dirty_log_test by
doing a final collection after the vCPU has been stopped. To deal with
KVM's destructive approach to getting the dirty bitmaps, use a second
bitmap for the post-stop collection.
Collecting all entries that were dirtied during an iteration simplifies
the verification logic *and* improves test coverage.
- If a page is written during iteration X, but not seen as dirty until
X+1, the test can get a false pass if the page is also written during
X+1.
- If a dirty page used a stale value from a previous iteration, the test
would grant a false pass.
- If a missed dirty log occurs in the last iteration, the test would fail
to detect the issue.
E.g. modifying mark_page_dirty_in_slot() to dirty an unwritten gfn:
if (memslot && kvm_slot_dirty_track_enabled(memslot)) {
unsigned long rel_gfn = gfn - memslot->base_gfn;
u32 slot = (memslot->as_id << 16) | memslot->id;
if (!vcpu->extra_dirty &&
gfn_to_memslot(kvm, gfn + 1) == memslot) {
vcpu->extra_dirty = true;
mark_page_dirty_in_slot(kvm, memslot, gfn + 1);
}
if (kvm->dirty_ring_size && vcpu)
kvm_dirty_ring_push(vcpu, slot, rel_gfn);
else if (memslot->dirty_bitmap)
set_bit_le(rel_gfn, memslot->dirty_bitmap);
}
isn't detected with the current approach, even with an interval of 1ms
(when running nested in a VM; bare metal would be even *less* likely to
detect the bug due to the vCPU being able to dirty more memory). Whereas
collecting all dirty entries consistently detects failures with an
interval of 700ms or more (the longer interval means a higher probability
of an actual write to the prematurely-dirtied page).
Link: https://lore.kernel.org/r/20250111003004.1235645-15-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Print out the last dirty pages from the current and previous iteration on
verification failures. In many cases, bugs (especially test bugs) occur
on the edges, i.e. on or near the last pages, and being able to correlate
failures with the last pages can aid in debug.
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-14-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
When verifying pages in dirty_log_test, immediately continue on all "pass"
scenarios to make the logic consistent in how it handles pass vs. fail.
No functional change intended.
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-13-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
When running dirty_log_test using the dirty ring, post to sem_vcpu_stop
only when the main thread has explicitly requested that the vCPU stop.
Synchronizing the vCPU and main thread whenever the dirty ring happens to
be full is unnecessary, as KVM's ABI is to actively prevent the vCPU from
running until the ring is no longer full. I.e. attempting to run the vCPU
will simply result in KVM_EXIT_DIRTY_RING_FULL without ever entering the
guest. And if KVM doesn't exit, e.g. let's the vCPU dirty more pages,
then that's a KVM bug worth finding.
Posting to sem_vcpu_stop on ring full also makes it difficult to get the
test logic right, e.g. it's easy to let the vCPU keep running when it
shouldn't, as a ring full can essentially happen at any given time.
Opportunistically rework the handling of dirty_ring_vcpu_ring_full to
leave it set for the remainder of the iteration in order to simplify the
surrounding logic.
Link: https://lore.kernel.org/r/20250111003004.1235645-12-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
In the dirty_log_test guest code, exit to userspace only when the vCPU is
explicitly told to stop. Periodically exiting just to check if a flag has
been set is unnecessary, weirdly complex, and wastes time handling exits
that could be used to dirty memory.
Opportunistically convert 'i' to a uint64_t to guard against the unlikely
scenario that guest_num_pages exceeds the storage of an int.
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-11-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Now that the vCPU doesn't dirty every page on the first iteration for
architectures that support the dirty ring, honor vcpu_stop in the dirty
ring's vCPU worker, i.e. stop when the main thread says "stop". This will
allow plumbing vcpu_stop into the guest so that the vCPU doesn't need to
periodically exit to userspace just to see if it should stop.
Add a comment explaining that marking all pages as dirty is problematic
for the dirty ring, as it results in the guest getting stuck on "ring
full". This could be addressed by adding a GUEST_SYNC() in that initial
loop, but it's not clear how that would interact with s390's behavior.
Link: https://lore.kernel.org/r/20250111003004.1235645-10-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
s390 specific workaround causes the dirty-log mode of the test to dirty
all guest memory on the first iteration, which is very slow when the test
is run in a nested VM.
Limit this workaround to s390x.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Continue collecting entries from the dirty ring for the entire time the
vCPU is running. Collecting exactly once all but guarantees the vCPU will
encounter a "ring full" event and stop. While testing ring full is
interesting, stopping and doing nothing is not, especially for larger
intervals as the test effectively does nothing for a much longer time.
To balance continuous collection with letting the guest make forward
progress, chunk the interval waiting into 1ms loops (which also makes
the math dead simple).
To maintain coverage for "ring full", collect entries on subsequent
iterations if and only if the ring has been filled at least once. I.e.
let the ring fill up (if the interval allows), but after that contiuously
empty it so that the vCPU can keep running.
Opportunistically drop unnecessary zero-initialization of "count".
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-8-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Cache the page's value during verification in a local variable, re-reading
from the pointer is ugly and error prone, e.g. allows for bugs like
checking the pointer itself instead of the value.
No functional change intended.
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|