Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs pidfs updates from Christian Brauner:
- Allow retrieving exit information after a process has been reaped
through pidfds via the new PIDFD_INTO_EXIT extension for the
PIDFD_GET_INFO ioctl. Various tools need access to information about
a process/task even after it has already been reaped.
Pidfd polling allows waiting on either task exit or for a task to
have been reaped. The contract for PIDFD_INFO_EXIT is simply that
EPOLLHUP must be observed before exit information can be retrieved,
i.e., exit information is only provided once the task has been reaped
and then can be retrieved as long as the pidfd is open.
- Add PIDFD_SELF_{THREAD,THREAD_GROUP} sentinels allowing userspace to
forgo allocating a file descriptor for their own process. This is
useful in scenarios where users want to act on their own process
through pidfds and is akin to AT_FDCWD.
- Improve premature thread-group leader and subthread exec behavior
when polling on pidfds:
(1) During a multi-threaded exec by a subthread, i.e.,
non-thread-group leader thread, all other threads in the
thread-group including the thread-group leader are killed and the
struct pid of the thread-group leader will be taken over by the
subthread that called exec. IOW, two tasks change their TIDs.
(2) A premature thread-group leader exit means that the thread-group
leader exited before all of the other subthreads in the
thread-group have exited.
Both cases lead to inconsistencies for pidfd polling with
PIDFD_THREAD. Any caller that holds a PIDFD_THREAD pidfd to the
current thread-group leader may or may not see an exit notification
on the file descriptor depending on when poll is performed. If the
poll is performed before the exec of the subthread has concluded an
exit notification is generated for the old thread-group leader. If
the poll is performed after the exec of the subthread has concluded
no exit notification is generated for the old thread-group leader.
The correct behavior is to simply not generate an exit notification
on the struct pid of a subhthread exec because the struct pid is
taken over by the subthread and thus remains alive.
But this is difficult to handle because a thread-group may exit
premature as mentioned in (2). In that case an exit notification is
reliably generated but the subthreads may continue to run for an
indeterminate amount of time and thus also may exec at some point.
After this pull no exit notifications will be generated for a
PIDFD_THREAD pidfd for a thread-group leader until all subthreads
have been reaped. If a subthread should exec before no exit
notification will be generated until that task exits or it creates
subthreads and repeates the cycle.
This means an exit notification indicates the ability for the father
to reap the child.
* tag 'vfs-6.15-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits)
selftests/pidfd: third test for multi-threaded exec polling
selftests/pidfd: second test for multi-threaded exec polling
selftests/pidfd: first test for multi-threaded exec polling
pidfs: improve multi-threaded exec and premature thread-group leader exit polling
pidfs: ensure that PIDFS_INFO_EXIT is available
selftests/pidfd: add seventh PIDFD_INFO_EXIT selftest
selftests/pidfd: add sixth PIDFD_INFO_EXIT selftest
selftests/pidfd: add fifth PIDFD_INFO_EXIT selftest
selftests/pidfd: add fourth PIDFD_INFO_EXIT selftest
selftests/pidfd: add third PIDFD_INFO_EXIT selftest
selftests/pidfd: add second PIDFD_INFO_EXIT selftest
selftests/pidfd: add first PIDFD_INFO_EXIT selftest
selftests/pidfd: expand common pidfd header
pidfs/selftests: ensure correct headers for ioctl handling
selftests/pidfd: fix header inclusion
pidfs: allow to retrieve exit information
pidfs: record exit code and cgroupid at exit
pidfs: use private inode slab cache
pidfs: move setting flags into pidfs_alloc_file()
pidfd: rely on automatic cleanup in __pidfd_prepare()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs mount updates from Christian Brauner:
- Mount notifications
The day has come where we finally provide a new api to listen for
mount topology changes outside of /proc/<pid>/mountinfo. A mount
namespace file descriptor can be supplied and registered with
fanotify to listen for mount topology changes.
Currently notifications for mount, umount and moving mounts are
generated. The generated notification record contains the unique
mount id of the mount.
The listmount() and statmount() api can be used to query detailed
information about the mount using the received unique mount id.
This allows userspace to figure out exactly how the mount topology
changed without having to generating diffs of /proc/<pid>/mountinfo
in userspace.
- Support O_PATH file descriptors with FSCONFIG_SET_FD in the new mount
api
- Support detached mounts in overlayfs
Since last cycle we support specifying overlayfs layers via file
descriptors. However, we don't allow detached mounts which means
userspace cannot user file descriptors received via
open_tree(OPEN_TREE_CLONE) and fsmount() directly. They have to
attach them to a mount namespace via move_mount() first.
This is cumbersome and means they have to undo mounts via umount().
Allow them to directly use detached mounts.
- Allow to retrieve idmappings with statmount
Currently it isn't possible to figure out what idmapping has been
attached to an idmapped mount. Add an extension to statmount() which
allows to read the idmapping from the mount.
- Allow creating idmapped mounts from mounts that are already idmapped
So far it isn't possible to allow the creation of idmapped mounts
from already idmapped mounts as this has significant lifetime
implications. Make the creation of idmapped mounts atomic by allow to
pass struct mount_attr together with the open_tree_attr() system call
allowing to solve these issues without complicating VFS lookup in any
way.
The system call has in general the benefit that creating a detached
mount and applying mount attributes to it becomes an atomic operation
for userspace.
- Add a way to query statmount() for supported options
Allow userspace to query which mount information can be retrieved
through statmount().
- Allow superblock owners to force unmount
* tag 'vfs-6.15-rc1.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (21 commits)
umount: Allow superblock owners to force umount
selftests: add tests for mount notification
selinux: add FILE__WATCH_MOUNTNS
samples/vfs: fix printf format string for size_t
fs: allow changing idmappings
fs: add kflags member to struct mount_kattr
fs: add open_tree_attr()
fs: add copy_mount_setattr() helper
fs: add vfs_open_tree() helper
statmount: add a new supported_mask field
samples/vfs: add STATMOUNT_MNT_{G,U}IDMAP
selftests: add tests for using detached mount with overlayfs
samples/vfs: check whether flag was raised
statmount: allow to retrieve idmappings
uidgid: add map_id_range_up()
fs: allow detached mounts in clone_private_mount()
selftests/overlayfs: test specifying layers as O_PATH file descriptors
fs: support O_PATH fds with FSCONFIG_SET_FD
vfs: add notifications for mount attach and detach
fanotify: notify on mount attach and detach
...
|
|
Add ublk stripe target which can take 1~4 underlying backing files
or block device, with stripe size 4k ~ 512K.
Add two basic tests(write verify & mkfs/mount/umount) over ublk/stripe.
This target is helpful to cover multiple IOs aiming at same
fixed/registered IO kernel buffer.
It is also capable of verifying vectored registered (kernel)buffers
in future for zero copy, so far it isn't supported yet.
Todo: support vectored registered kernel buffer for ublk/zc.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-9-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Use the added target io handling helpers for simplifying loop io
completion.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-8-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Enable zero copy for null target so that we can evaluate performance
from zero copy or not.
Also this should be the simplest ublk zero copy implementation, which
can be served as zc example.
Add test for covering 'add -t null -z'.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-7-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
- pass 'truct dev_ctx *ctx' to target init function
- add 'private_data' to 'struct ublk_dev' for storing target specific data
- add 'private_data' to 'struct ublk_io' for storing per-IO data
- add 'tgt_ios' to 'struct ublk_io' for counting how many io_uring ios
for handling the current io command
- add helper ublk_get_io() for supporting stripe target
- add two helpers for simplifying target io handling
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-6-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Move two functions for initializing & de-initializing backing file
into common.c.
Also move one common helper into kublk.h.
Prepare for supporting ublk-stripe.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-5-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Increase max buffer size to 1MB, and 64KB is too small to evaluate
performance with builtin ublk server implementation.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Unify the sqe allocator helper, and we will use it for supporting
more cases, such as ublk stripe, in which variable sqe allocation
is required.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
block layer, ublk and io_uring might re-order IO in the past
- plug
- queue ublk io command via task work
Add one test for verifying if sequential WRITE IO is dispatched in order.
- null target is taken, so we can just observe io order from
`tracepoint:block:block_rq_complete` which represents the dispatch order
- WRITE IO is taken because READ may come from system-wide utility
Cc: Uday Shankar <ushankar@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
number is invalid
syzbot reported out-of-bounds read in check_atomic_load/store() when the
register number is invalid in this context:
https://syzkaller.appspot.com/bug?extid=a5964227adc0f904549c
To avoid the issue from now on, let's add tests where the register number
is invalid for load-acquire/store-release.
After discussion with Eduard, I decided to use R15 as invalid register
because the actual slab-out-of-bounds read issue occurs when the register
number is R12 or larger.
Signed-off-by: Kohei Enju <enjuk@amazon.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250322045340.18010-6-enjuk@amazon.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
syzbot reported the following splat [0].
In check_atomic_load/store(), register validity is not checked before
atomic_ptr_type_ok(). This causes the out-of-bounds read in is_ctx_reg()
called from atomic_ptr_type_ok() when the register number is MAX_BPF_REG
or greater.
Call check_load_mem()/check_store_reg() before atomic_ptr_type_ok()
to avoid the OOB read.
However, some tests introduced by commit ff3afe5da998 ("selftests/bpf: Add
selftests for load-acquire and store-release instructions") assume
calling atomic_ptr_type_ok() before checking register validity.
Therefore the swapping of order unintentionally changes verifier messages
of these tests.
For example in the test load_acquire_from_pkt_pointer(), expected message
is 'BPF_ATOMIC loads from R2 pkt is not allowed' although actual messages
are different.
validate_msgs:FAIL:754 expect_msg
VERIFIER LOG:
=============
Global function load_acquire_from_pkt_pointer() doesn't return scalar. Only those are supported.
0: R1=ctx() R10=fp0
; asm volatile ( @ verifier_load_acquire.c:140
0: (61) r2 = *(u32 *)(r1 +0) ; R1=ctx() R2_w=pkt(r=0)
1: (d3) r0 = load_acquire((u8 *)(r2 +0))
invalid access to packet, off=0 size=1, R2(id=0,off=0,r=0)
R2 offset is outside of the packet
processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
=============
EXPECTED SUBSTR: 'BPF_ATOMIC loads from R2 pkt is not allowed'
#505/19 verifier_load_acquire/load-acquire from pkt pointer:FAIL
This is because instructions in the test don't pass check_load_mem() and
therefore don't enter the atomic_ptr_type_ok() path.
In this case, we have to modify instructions so that they pass the
check_load_mem() and trigger atomic_ptr_type_ok().
Similarly for store-release tests, we need to modify instructions so that
they pass check_store_reg().
Like load_acquire_from_pkt_pointer(), modify instructions in:
load_acquire_from_sock_pointer()
store_release_to_ctx_pointer()
store_release_to_pkt_pointer()
Also in store_release_to_sock_pointer(), check_store_reg() returns error
early and atomic_ptr_type_ok() is not triggered, since write to sock
pointer is not possible in general.
We might be able to remove the test, but for now let's leave it and just
change the expected message.
[0]
BUG: KASAN: slab-out-of-bounds in is_ctx_reg kernel/bpf/verifier.c:6185 [inline]
BUG: KASAN: slab-out-of-bounds in atomic_ptr_type_ok+0x3d7/0x550 kernel/bpf/verifier.c:6223
Read of size 4 at addr ffff888141b0d690 by task syz-executor143/5842
CPU: 1 UID: 0 PID: 5842 Comm: syz-executor143 Not tainted 6.14.0-rc3-syzkaller-gf28214603dc6 #0
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:408 [inline]
print_report+0x16e/0x5b0 mm/kasan/report.c:521
kasan_report+0x143/0x180 mm/kasan/report.c:634
is_ctx_reg kernel/bpf/verifier.c:6185 [inline]
atomic_ptr_type_ok+0x3d7/0x550 kernel/bpf/verifier.c:6223
check_atomic_store kernel/bpf/verifier.c:7804 [inline]
check_atomic kernel/bpf/verifier.c:7841 [inline]
do_check+0x89dd/0xedd0 kernel/bpf/verifier.c:19334
do_check_common+0x1678/0x2080 kernel/bpf/verifier.c:22600
do_check_main kernel/bpf/verifier.c:22691 [inline]
bpf_check+0x165c8/0x1cca0 kernel/bpf/verifier.c:23821
Reported-by: syzbot+a5964227adc0f904549c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a5964227adc0f904549c
Tested-by: syzbot+a5964227adc0f904549c@syzkaller.appspotmail.com
Fixes: e24bbad29a8d ("bpf: Introduce load-acquire and store-release instructions")
Fixes: ff3afe5da998 ("selftests/bpf: Add selftests for load-acquire and store-release instructions")
Signed-off-by: Kohei Enju <enjuk@amazon.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250322045340.18010-5-enjuk@amazon.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
create_pagecache_thp_and_fd() was previously writing a file sized at twice
the PMD size by making a per-byte write syscall. This was quite slow when
the PMD size is 4M, but completely intolerable for 32M (PMD size for
arm64's 16K page size), and 512M (PMD size for arm64's 64K page size).
The byte pattern has a 256 byte period, so let's create a 1K buffer and
fill it with exactly 4 periods. Then we can write the buffer as many
times as is required to fill the file. This makes things much more
tolerable.
The test now passes for 16K page size. It still fails for 64K page size
because MAX_PAGECACHE_ORDER is too small for 512M folio size (I think).
Link: https://lkml.kernel.org/r/20250318174343.243631-3-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: Rafael Aquini <raquini@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
uffd-unit-tests uses a memory area with a fixed 32M size. Then it
calculates the number of pages by dividing by page_size, which itself is
either the base page size or the PMD huge page size depending on the test
config. For the latter, we end up with nr_pages=1 for arm64 16K base
pages, and nr_pages=0 for 64K base pages. This doesn't end well.
So let's make the 32M size a floor and also ensure that we have at least 2
pages given the PMD size. With this change, the tests pass on arm64 64K
base page size configuration.
Link: https://lkml.kernel.org/r/20250318174343.243631-2-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: Rafael Aquini <raquini@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
As discussed here:
https://lore.kernel.org/lkml/Z9RRkL1hom48z3Tt@google.com/
This code could benefit from some more commentary.
To avoid needing to comment the same thing in multiple places (I guess
more of these SKIPs will need to be added over time, for now I am only
like 20% of the way through Project Run run_vmtests.sh Successfully), add
a dummy "skip tests for this specific reason" function that basically just
serves as a hook to hang comments on.
Link: https://lkml.kernel.org/r/20250317-9pfs-comments-v1-1-9ac96043e146@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Firstly ublk char device node may not be created by udev yet, so wait
a while until it can be opened or timeout.
Secondly delete created ublk device in case of start failure, otherwise
the device becomes zombie.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250321135324.259677-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Lei Chen reported a bug with CLOCK_MONOTONIC_COARSE having inconsistencies
when NTP is adjusting the clock frequency.
This has gone seemingly undetected for ~15 years, illustrating a clear gap
in our testing.
The skew_consistency test is intended to catch this sort of problem, but
was focused on only evaluating CLOCK_MONOTONIC, and thus missed the problem
on CLOCK_MONOTONIC_COARSE.
So adjust the test to run with all clockids for 60 seconds each instead of
10 minutes with just CLOCK_MONOTONIC.
Reported-by: Lei Chen <lei.chen@smartx.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250320200306.1712599-2-jstultz@google.com
Closes: https://lore.kernel.org/lkml/20250310030004.3705801-1-lei.chen@smartx.com/
|
|
Expands the self-tests to include the 'release' feature in
sysdata.
Verifies that enabling the 'release' feature appends the
correct data and ensures that disabling it functions as expected.
When enabled, the message should have an item similar to in the
userdata: `release=$(uname -r)`
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250314-netcons_release-v1-5-07979c4b86af@debian.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Some fixes may require user space to check if they are applied on the
running kernel before using a specific feature. For instance, this
applies when a restriction was previously too restrictive and is now
getting relaxed (e.g. for compatibility reasons). However, non-visible
changes for legitimate use (e.g. security fixes) do not require an
erratum.
Because fixes are backported down to a specific Landlock ABI, we need a
way to avoid cherry-pick conflicts. The solution is to only update a
file related to the lower ABI impacted by this issue. All the ABI files
are then used to create a bitmask of fixes.
The new errata interface is similar to the one used to get the supported
Landlock ABI version, but it returns a bitmask instead because the order
of fixes may not match the order of versions, and not all fixes may
apply to all versions.
The actual errata will come with dedicated commits. The description is
not actually used in the code but serves as documentation.
Create the landlock_abi_version symbol and use its value to check errata
consistency.
Update test_base's create_ruleset_checks_ordering tests and add errata
tests.
This commit is backportable down to the first version of Landlock.
Fixes: 3532b0b4352c ("landlock: Enable user space to infer supported features")
Cc: Günther Noack <gnoack@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250318161443.279194-3-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
All implementations of chacha_init_arch() just call
chacha_init_generic(), so it is pointless. Just delete it, and replace
chacha_init() with what was previously chacha_init_generic().
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
For loop target, write cache isn't enabled, and each write isn't be
marked as DSYNC too.
Fix it by enabling write cache, meantime fix FLUSH implementation
by not taking LBA range into account, and there isn't such info
for FLUSH command.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250321004758.152572-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Some user decides test result by exit code only, and wouldn't like to be
bothered by the test result.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250320013743.4167489-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
ublk_drv may be built-in, so don't show modprobe failure, and we
do check `/dev/ublk-control` for skipping test if ublk_drv isn't
enabled.
Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250320013743.4167489-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add one dependency helper which can include new uapi definition which
isn't synced from kernel.
This way also helps a lot for downstream test deployment.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250320013743.4167489-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Martin KaFai Lau says:
====================
pull-request: bpf-next 2025-03-13
The following pull-request contains BPF updates for your *net-next* tree.
We've added 4 non-merge commits during the last 3 day(s) which contain
a total of 2 files changed, 35 insertions(+), 12 deletions(-).
The main changes are:
1) bpf_getsockopt support for TCP_BPF_RTO_MIN and TCP_BPF_DELACK_MAX,
from Jason Xing
bpf-next-for-netdev
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
selftests/bpf: Add bpf_getsockopt() for TCP_BPF_DELACK_MAX and TCP_BPF_RTO_MIN
tcp: bpf: Support bpf_getsockopt for TCP_BPF_DELACK_MAX
tcp: bpf: Support bpf_getsockopt for TCP_BPF_RTO_MIN
tcp: bpf: Introduce bpf_sol_tcp_getsockopt to support TCP_BPF flags
====================
Link: https://patch.msgid.link/20250313221620.2512684-1-martin.lau@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Cross-merge networking fixes after downstream PR (net-6.14-rc8).
Conflict:
tools/testing/selftests/net/Makefile
03544faad761 ("selftest: net: add proc_net_pktgen")
3ed61b8938c6 ("selftests: net: test for lwtunnel dst ref loops")
tools/testing/selftests/net/config:
85cb3711acb8 ("selftests: net: Add test cases for link and peer netns")
3ed61b8938c6 ("selftests: net: test for lwtunnel dst ref loops")
Adjacent commits:
tools/testing/selftests/net/Makefile
c935af429ec2 ("selftests: net: add support for testing SO_RCVMARK and SO_RCVPRIORITY")
355d940f4d5a ("Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
There are scenarios where env.{sub,}test_state->stdout_saved, can be
NULL, e.g. sometimes when the watchdog timeout kicks in, or if the
open_memstream syscall is not available.
Avoid crashing test_progs by adding an explicit NULL check prior the
fclose() call.
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20250318081648.122523-1-bjorn@kernel.org
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.15
- Nested virtualization support for VGICv3, giving the nested
hypervisor control of the VGIC hardware when running an L2 VM
- Removal of 'late' nested virtualization feature register masking,
making the supported feature set directly visible to userspace
- Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage
of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers
- Paravirtual interface for discovering the set of CPU implementations
where a VM may run, addressing a longstanding issue of guest CPU
errata awareness in big-little systems and cross-implementation VM
migration
- Userspace control of the registers responsible for identifying a
particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1),
allowing VMs to be migrated cross-implementation
- pKVM updates, including support for tracking stage-2 page table
allocations in the protected hypervisor in the 'SecPageTable' stat
- Fixes to vPMU, ensuring that userspace updates to the vPMU after
KVM_RUN are reflected into the backing perf events
|
|
KVM/riscv changes for 6.15
- Disable the kernel perf counter during configure
- KVM selftests improvements for PMU
- Fix warning at the time of KVM module removal
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from can, bluetooth and ipsec.
This contains a last minute revert of a recent GRE patch, mostly to
allow me stating there are no known regressions outstanding.
Current release - regressions:
- revert "gre: Fix IPv6 link-local address generation."
- eth: ti: am65-cpsw: fix NAPI registration sequence
Previous releases - regressions:
- ipv6: fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw().
- mptcp: fix data stream corruption in the address announcement
- bluetooth: fix connection regression between LE and non-LE adapters
- can:
- flexcan: only change CAN state when link up in system PM
- ucan: fix out of bound read in strscpy() source
Previous releases - always broken:
- lwtunnel: fix reentry loops
- ipv6: fix TCP GSO segmentation with NAT
- xfrm: force software GSO only in tunnel mode
- eth: ti: icssg-prueth: add lock to stats
Misc:
- add Andrea Mayer as a maintainer of SRv6"
* tag 'net-6.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits)
MAINTAINERS: Add Andrea Mayer as a maintainer of SRv6
Revert "gre: Fix IPv6 link-local address generation."
Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."
net/neighbor: add missing policy for NDTPA_QUEUE_LENBYTES
tools headers: Sync uapi/asm-generic/socket.h with the kernel sources
mptcp: Fix data stream corruption in the address announcement
selftests: net: test for lwtunnel dst ref loops
net: ipv6: ioam6: fix lwtunnel_output() loop
net: lwtunnel: fix recursion loops
net: ti: icssg-prueth: Add lock to stats
net: atm: fix use after free in lec_send()
xsk: fix an integer overflow in xp_create_and_assign_umem()
net: stmmac: dwc-qos-eth: use devm_kzalloc() for AXI data
selftests: drv-net: use defer in the ping test
phy: fix xa_alloc_cyclic() error handling
dpll: fix xa_alloc_cyclic() error handling
devlink: fix xa_alloc_cyclic() error handling
ipv6: Set errno after ip_fib_metrics_init() in ip6_route_info_create().
ipv6: Fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw().
net: ipv6: fix TCP GSO segmentation with NAT
...
|
|
devices."
This reverts commit 6f50175ccad4278ed3a9394c00b797b75441bd6e.
Commit 183185a18ff9 ("gre: Fix IPv6 link-local address generation.") is
going to be reverted. So let's revert the corresponding kselftest
first.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/259a9e98f7f1be7ce02b53d0b4afb7c18a8ff747.1742418408.git.gnault@redhat.com
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Ensure that during a multi-threaded exec and premature thread-group
leader exit no exit notification is generated.
Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-4-da678ce805bf@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Ensure that during a multi-threaded exec and premature thread-group
leader exit no exit notification is generated.
Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-3-da678ce805bf@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add first test for premature thread-group leader exit.
Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-2-da678ce805bf@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
As recently specified by commit 0ea09cbf8350 ("docs: netdev: add a note
on selftest posting") in net-next, the selftest is therefore shipped in
this series. However, this selftest does not really test this series. It
needs this series to avoid crashing the kernel. What it really tests,
thanks to kmemleak, is what was fixed by the following commits:
- commit c71a192976de ("net: ipv6: fix dst refleaks in rpl, seg6 and
ioam6 lwtunnels")
- commit 92191dd10730 ("net: ipv6: fix dst ref loops in rpl, seg6 and
ioam6 lwtunnels")
- commit c64a0727f9b1 ("net: ipv6: fix dst ref loop on input in seg6
lwt")
- commit 13e55fbaec17 ("net: ipv6: fix dst ref loop on input in rpl
lwt")
- commit 0e7633d7b95b ("net: ipv6: fix dst ref loop in ila lwtunnel")
- commit 5da15a9c11c1 ("net: ipv6: fix missing dst ref drop in ila
lwtunnel")
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Link: https://patch.msgid.link/20250314120048.12569-4-justin.iurman@uliege.be
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This patch checks if the newly added net.mptcp.path_manager is mapped
successfully from or to the old net.mptcp.pm_type in userspace_pm.sh.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-12-f4e4a88efc50@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
test_xdp_vlan.sh isn't used by the BPF CI.
Migrate test_xdp_vlan.sh in prog_tests/xdp_vlan.c.
It uses the same BPF programs located in progs/test_xdp_vlan.c and the
same network topology.
Remove test_xdp_vlan*.sh and their Makefile entries.
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250221-xdp_vlan-v1-2-7d29847169af@bootlin.com/
|
|
The __load() helper expects BPF sections to be names 'xdp' or 'tc'
Rename BPF sections so they can be loaded with the __load() helper in
upcoming patch.
Rename the BPF functions with their previous section's name.
Update the 'ip link' commands in the script to use the program name
instead of the section name to load the BPF program.
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250221-xdp_vlan-v1-1-7d29847169af@bootlin.com
|
|
* kvm-arm64/writable-midr:
: Writable implementation ID registers, courtesy of Sebastian Ott
:
: Introduce a new capability that allows userspace to set the
: ID registers that identify a CPU implementation: MIDR_EL1, REVIDR_EL1,
: and AIDR_EL1. Also plug a hole in KVM's trap configuration where
: SMIDR_EL1 was readable at EL1, despite the fact that KVM does not
: support SME.
KVM: arm64: Fix documentation for KVM_CAP_ARM_WRITABLE_IMP_ID_REGS
KVM: arm64: Copy MIDR_EL1 into hyp VM when it is writable
KVM: arm64: Copy guest CTR_EL0 into hyp VM
KVM: selftests: arm64: Test writes to MIDR,REVIDR,AIDR
KVM: arm64: Allow userspace to change the implementation ID registers
KVM: arm64: Load VPIDR_EL2 with the VM's MIDR_EL1 value
KVM: arm64: Maintain per-VM copy of implementation ID regs
KVM: arm64: Set HCR_EL2.TID1 unconditionally
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/pv-cpuid:
: Paravirtualized implementation ID, courtesy of Shameer Kolothum
:
: Big-little has historically been a pain in the ass to virtualize. The
: implementation ID (MIDR, REVIDR, AIDR) of a vCPU can change at the whim
: of vCPU scheduling. This can be particularly annoying when the guest
: needs to know the underlying implementation to mitigate errata.
:
: "Hyperscalers" face a similar scheduling problem, where VMs may freely
: migrate between hosts in a pool of heterogenous hardware. And yes, our
: server-class friends are equally riddled with errata too.
:
: In absence of an architected solution to this wart on the ecosystem,
: introduce support for paravirtualizing the implementation exposed
: to a VM, allowing the VMM to describe the pool of implementations that a
: VM may be exposed to due to scheduling/migration.
:
: Userspace is expected to intercept and handle these hypercalls using the
: SMCCC filter UAPI, should it choose to do so.
smccc: kvm_guest: Fix kernel builds for 32 bit arm
KVM: selftests: Add test for KVM_REG_ARM_VENDOR_HYP_BMAP_2
smccc/kvm_guest: Enable errata based on implementation CPUs
arm64: Make _midr_in_range_list() an exported function
KVM: arm64: Introduce KVM_REG_ARM_VENDOR_HYP_BMAP_2
KVM: arm64: Specify hypercall ABI for retrieving target implementations
arm64: Modify _midr_range() functions to read MIDR/REVIDR internally
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/nv-idregs:
: Changes to exposure of NV features, courtesy of Marc Zyngier
:
: Apply NV-specific feature restrictions at reset rather than at the point
: of KVM_RUN. This makes the true feature set visible to userspace, a
: necessary step towards save/restore support or NV VMs.
:
: Add an additional vCPU feature flag for selecting the E2H0 flavor of NV,
: such that the VHE-ness of the VM can be applied to the feature set.
KVM: arm64: selftests: Test that TGRAN*_2 fields are writable
KVM: arm64: Allow userspace to write ID_AA64MMFR0_EL1.TGRAN*_2
KVM: arm64: Advertise FEAT_ECV when possible
KVM: arm64: Make ID_AA64MMFR4_EL1.NV_frac writable
KVM: arm64: Allow userspace to limit NV support to nVHE
KVM: arm64: Move NV-specific capping to idreg sanitisation
KVM: arm64: Enforce NV limits on a per-idregs basis
KVM: arm64: Make ID_REG_LIMIT_FIELD_ENUM() more widely available
KVM: arm64: Consolidate idreg callbacks
KVM: arm64: Advertise NV2 in the boot messages
KVM: arm64: Mark HCR.EL2.{NV*,AT} RES0 when ID_AA64MMFR4_EL1.NV_frac is 0
KVM: arm64: Mark HCR.EL2.E2H RES0 when ID_AA64MMFR1_EL1.VH is zero
KVM: arm64: Hide ID_AA64MMFR2_EL1.NV from guest and userspace
arm64: cpufeature: Handle NV_frac as a synonym of NV2
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
We leave most of the defconfigs alone (there's over 70 of them),
but let's remove CONFIG_SCHED_DEBUG from the scheduler self-test
Kconfig files.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/Z9szt3MpQmQ56TRd@gmail.com
|
|
When the rseq UAPI header is included, 'union rseq' clashes with 'struct
rseq'. It's not the case in the rseq selftests but it does break the KVM
selftests that also include this file.
Rename 'union rseq' to 'union rseq_tls' to fix this.
Fixes: e6644c967d3c ("rseq/selftests: Ensure the rseq ABI TLS is actually 1024 bytes")
Reported-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/r/20250319202144.1141542-1-mjeanson@efficios.com
|
|
Before tc's recent change to fix rounding errors, several tests which
specified a burst size of "1m" would translate back to being 1048574
bytes (2b less than 1Mb). sprint_size prints this as "1024Kb".
With the tc fix, the burst size is instead correctly reported as
1048576 bytes (precisely 1Mb), which sprint_size prints as "1Mb".
This updates the expected output in the tests' matchPattern values
to accept either the old or the new output.
Signed-off-by: Jonathan Lennox <jonathan.lennox@8x8.com>
Link: https://patch.msgid.link/20250312174804.313107-1-jonathan.lennox@8x8.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Make sure the test cleans up after itself. The XDP off statements
at the end of the test may not be reached.
Fixes: 75cc19c8ff89 ("selftests: drv-net: add xdp cases for ping.py")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250312131040.660386-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Introduce selftests that trigger AA, ABBA deadlocks, and test the edge
case where the held locks table runs out of entries, since we then
fallback to the timeout as the final line of defense. Also exercise
verifier's AA detection where applicable.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250316040541.108729-26-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
KVM selftests changes for 6.15, part 2
- Fix a variety of flaws, bugs, and false failures/passes dirty_log_test, and
improve its coverage by collecting all dirty entries on each iteration.
- Fix a few minor bugs related to handling of stats FDs.
- Add infrastructure to make vCPU and VM stats FDs available to tests by
default (open the FDs during VM/vCPU creation).
- Relax an assertion on the number of HLT exits in the xAPIC IPI test when
running on a CPU that supports AMD's Idle HLT (which elides interception of
HLT if a virtual IRQ is pending and unmasked).
- Misc cleanups and fixes.
|
|
into HEAD
KVM selftests changes for 6.15, part 1
- Misc cleanups and prep work.
- Annotate _no_printf() with "printf" so that pr_debug() statements are
checked by the compiler for default builds (and pr_info() when QUIET).
- Attempt to whack the last LLC references/misses mole in the Intel PMU
counters test by adding a data load and doing CLFLUSH{OPT} on the data
instead of the code being executed. The theory is that modern Intel CPUs
have learned new code prefetching tricks that bypass the PMU counters.
- Fix a flaw in the Intel PMU counters test where it asserts that an event is
counting correctly without actually knowing what the event counts on the
underlying hardware.
|
|
KVM x86 misc changes for 6.15:
- Fix a bug in PIC emulation that caused KVM to emit a spurious KVM_REQ_EVENT.
- Add a helper to consolidate handling of mp_state transitions, and use it to
clear pv_unhalted whenever a vCPU is made RUNNABLE.
- Defer runtime CPUID updates until KVM emulates a CPUID instruction, to
coalesce updates when multiple pieces of vCPU state are changing, e.g. as
part of a nested transition.
- Fix a variety of nested emulation bugs, and add VMX support for synthesizing
nested VM-Exit on interception (instead of injecting #UD into L2).
- Drop "support" for PV Async #PF with proctected guests without SEND_ALWAYS,
as KVM can't get the current CPL.
- Misc cleanups
|
|
The KVM RISC-V allows Zaamo/Zalrsc extensions for Guest/VM so add these
extensions to get-reg-list test.
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240619153913.867263-6-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
|