Age | Commit message (Collapse) | Author |
|
Add a selftest to validate the behavior of the NUMA-aware scheduler
functionalities, including idle CPU selection within nodes, per-node
DSQs and CPU to node mapping.
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Add cxl-test emulation of Get Supported Features mailbox command.
Currently only adding a test feature with feature identifier of
all f's for testing.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Link: https://patch.msgid.link/20250220194438.2281088-4-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
CXL spec r3.2 8.2.9.6.1 Get Supported Features (Opcode 0500h)
The command retrieve the list of supported device-specific features
(identified by UUID) and general information about each Feature.
The driver will retrieve the Feature entries in order to make checks and
provide information for the Get Feature and Set Feature command. One of
the main piece of information retrieved are the effects a Set Feature
command would have for a particular feature. The retrieved Feature
entries are stored in the cxl_mailbox context.
The setup of Features is initiated via devm_cxl_setup_features() during the
pci probe function before the cxl_memdev is enumerated.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Tested-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250220194438.2281088-3-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add support for 'bla' instruction.
This is done by 'flagging' the address as an absolute address so that
arch_jump_destination() can calculate it as expected. Because code is
_always_ 4 bytes aligned, use bit 30 as flag.
Also add support for 'b' and 'ba' instructions. Objtool call them jumps.
And make sure the special 'bl .+4' used by clang in relocatable code is
not seen as an 'unannotated intra-function call'. clang should use the
special 'bcl 20,31,.+4' form like gcc but for the time being it does not
so lets work around that.
Link: https://github.com/llvm/llvm-project/issues/128644
Reviewed-by: Segher Boessenkool <segher@kewrnel.crashing.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/bf0b4d554547bc34fa3d1af5b4e62a84c0bc182b.1740470510.git.christophe.leroy@csgroup.eu
|
|
Add xstate testing specifically for those vector register states,
validating kernel's context switching and ensuring ABI compliance.
Use the established xstate testing framework.
Alternatively, this invocation could be placed directly in
xstate.c::main(). However, the current test file naming convention, which
clearly specifies the tested area, seems reasonable. Adding avx.c
considerably aligns with that convention.
The test output should be like this for ZMM_Hi256 as an example:
$ avx_64
...
[RUN] AVX-512 ZMM_Hi256: check context switches, 10 iterations, 5 threads.
[OK] No incorrect case was found.
[RUN] AVX-512 ZMM_Hi256: inject xstate via ptrace().
[OK] 'xfeatures' in SW reserved area was correctly written
[OK] xstate was correctly updated.
[RUN] AVX-512 ZMM_Hi256: load xstate and raise SIGUSR1
[OK] 'magic1' is valid
[OK] 'xfeatures' in SW reserved area is valid
[OK] 'xfeatures' in XSAVE header is valid
[OK] xstate delivery was successful
[OK] 'magic2' is valid
[RUN] AVX-512 ZMM_Hi256: load new xstate from sighandler and check it after sigreturn
[OK] xstate was restored correctly
But systems without AVX-512 will look like:
...
The kernel does not support feature number: 5
The kernel does not support feature number: 6
The kernel does not support feature number: 7
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226010731.2456-10-chang.seok.bae@intel.com
|
|
The established xstate test code is designed to be generic, but certain
xstates require special handling and cannot be tested without additional
adjustments.
Clarify which xstates are currently supported, and enforce testing only
for them.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226010731.2456-9-chang.seok.bae@intel.com
|
|
Currently, each of the three xstate tests runs as a separate invocation,
requiring the xstate number to be passed and state information to be
reconstructed repeatedly. This approach arose from their individual and
isolated development, but now it makes sense to unify them.
Introduce a wrapper function that first verifies feature availability
from the kernel and constructs the necessary state information once. The
wrapper then sequentially invokes all tests to ensure consistent
execution.
Update the AMX test to use this unified invocation.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226010731.2456-8-chang.seok.bae@intel.com
|
|
With the refactored test cases, another xstate exposure to userspace is
through signal delivery. While amx.c includes signal-related scenarios,
its primary focus is on xstate permission management, which is largely
specific to dynamic states.
The remaining gap is testing xstate preservation and restoration across
signal delivery. The kernel defines an ABI for presenting xstate in the
signal frame, closely resembling the hardware XSAVE format, where xstate
modification is also possible.
Introduce a new test case to verify xstate preservation across signal
delivery and return, that is ensuring ABI compatibility by:
- Loading xstate before raising a signal.
- Verifying correct exposure in the signal frame
- Modifying xstate in the signal frame before returning.
- Checking the state restoration upon signal return.
Integrate this test into the AMX test suite as an initial usage site.
Expected output:
$ amx_64
...
[RUN] AMX Tile data: load xstate and raise SIGUSR1
[OK] 'magic1' is valid
[OK] 'xfeatures' in SW reserved area is valid
[OK] 'xfeatures' in XSAVE header is valid
[OK] xstate delivery was successful
[OK] 'magic2' is valid
[RUN] AMX Tile data: load new xstate from sighandler and check it after sigreturn
[OK] xstate was restored correctly
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226010731.2456-7-chang.seok.bae@intel.com
|
|
Following the refactoring of the context switching test, the ptrace test is
another component reusable for other xstate features. As part of this
restructuring, add a missing check to validate the
user_xstateregs->xstate_fx_sw field in the ABI.
Also, replace err() and fatal_error() with ksft_exit_fail_msg() for
consistency in error handling.
Expected output:
$ amx_64
...
[RUN] AMX Tile data: inject xstate via ptrace().
[OK] 'xfeatures' in SW reserved area was correctly written
[OK] xstate was correctly updated.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226010731.2456-6-chang.seok.bae@intel.com
|
|
The existing context switching and ptrace tests in amx.c are not specific
to dynamic states, making them reusable for general xstate testing.
As a first step, move the context switching test to xstate.c. Refactor
the test code to allow specifying which xstate component being tested.
To decouple the test from dynamic states, remove the permission request
code. In fact, The permission request inside the test wrapper was
redundant.
Additionally, replace fatal_error() with ksft_exit_fail_msg() for
consistency in error handling.
Expected output:
$ amx_64
...
[RUN] AMX Tile data: check context switches, 10 iterations, 5 threads.
[OK] No incorrect case was found.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226010731.2456-5-chang.seok.bae@intel.com
|
|
After moving essential helpers from amx.c, the code remains neutral
regarding which xstate components it handles. However, explicitly listing
known components helps users identify which features are ready for
testing.
Enumerate xstate components to facilitate identification. Extend struct
xstate_info to include a name field, providing a human-readable
identifier.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226010731.2456-4-chang.seok.bae@intel.com
|
|
The AMX test introduced several XSAVE-related helper functions, but so
far, it has been the only user of them. These helpers can be generalized
for broader test of multiple xstate features.
Move most XSAVE-related code into xsave.h, making it shareable. The
restructuring includes:
* Establishing low-level XSAVE helpers for saving and restoring register
states, as well as handling XSAVE buffers.
* Generalizing state data manipuldations: set_rand_data()
* Introducing a generic feature query helper: get_xstate_info()
While doing so, remove unused defines in amx.c.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226010731.2456-3-chang.seok.bae@intel.com
|
|
The x86 selftests frequently register and clean up signal handlers, but
the sethandler() and clearhandler() functions have been redundantly
copied across multiple .c files.
Move these functions to helpers.h to enable reuse across tests,
eliminating around 250 lines of duplicate code.
Converge the error handling by using ksft_exit_fail_msg(), which is
functionally equivalent with err() within the selftest framework.
This change is a prerequisite for the upcoming xstate selftest, which
requires signal handling for registering and cleaning up handlers.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226010731.2456-2-chang.seok.bae@intel.com
|
|
Assert that MIDR_EL1, REVIDR_EL1, AIDR_EL1 are writable from userspace,
that the changed values are visible to guests, and that they are
preserved across a vCPU reset.
Signed-off-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20250225005401.679536-6-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Add a selftest that verifies symmetric RSS hash is working as intended.
The test runs iterations of traffic, swapping the src/dst UDP ports, and
verifies that the same RX queue is receiving the traffic in both cases.
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250224174416.499070-5-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Instead of guessing a port and checking whether it's available, get an
available port from the OS.
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250224174416.499070-4-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some distributions may not enable MPTCP by default. All other MPTCP tests
source mptcp_lib.sh to ensure MPTCP is enabled before testing. However,
the ip_local_port_range test is the only one that does not include this
step.
Let's also ensure MPTCP is enabled in netns for ip_local_port_range so
that it passes on all distributions.
Suggested-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250224094013.13159-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix tools/ quiet build Makefile infrastructure that was broken when
working on tools/perf/ without testing on other tools/ living
utilities.
* tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
tools: Remove redundant quiet setup
tools: Unify top-level quiet infrastructure
|
|
Make all the SCX_OPS_* and SCX_PICK_IDLE_* flags available to the
user-space part of the schedulers via the compat interface.
This allows schedulers / selftests to set all the ops flags in
user-space, rather than having them split between BPF and user-space.
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Acked-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
There are a couple of inconsistent indents around code/literal blocks.
Adjust them to make this file easier to parse.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Fix a trivial typo.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Not all annotated accesses provide the semantics their syntactic tags
would imply. For example, an 'acquire tag on a write does not imply that
the write is finally in the Acquire set and provides acquire ordering.
To distinguish in those cases between the syntactic tags and actual
sets, we capitalize the former, so 'ACQUIRE tags may be present on both
reads and writes, but only reads will appear in the Acquire set.
For tags where the two concepts are the same we do not use specific
capitalization to make this distinction.
Reported-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Tested-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Akira Yokosawa <akiyks@gmail.com> # herdtools7.7.58
|
|
A new version of herd7 provides a -lkmmv2 switch which overrides the old herd7
behavior of simply ignoring any softcoded tags in the .def and .bell files. We
port LKMM to this version of herd7 by providing the switch in linux-kernel.cfg
and reporting an error if the LKMM is used without this switch.
To preserve the semantics of LKMM, we also softcode the Noreturn tag on atomic
RMW which do not return a value and define atomic_add_unless with an Mb tag in
linux-kernel.def.
We update the herd-representation.txt accordingly and clarify some of the
resulting combinations.
Co-developed-by: Hernan Ponce de Leon <hernan.poncedeleon@huaweicloud.com>
Signed-off-by: Hernan Ponce de Leon <hernan.poncedeleon@huaweicloud.com>
Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Tested-by: Boqun Feng <boqun.feng@gmail.com>
Tested-by: Akira Yokosawa <akiyks@gmail.com> # herdtools7.7.58
|
|
Fix the following objtool warning during build time:
fs/bcachefs/btree_cache.o: warning: objtool: btree_node_lock.constprop.0() falls through to next function bch2_recalc_btree_reserve()
fs/bcachefs/btree_update.o: warning: objtool: bch2_trans_update_get_key_cache() falls through to next function need_whiteout_for_snapshot()
bch2_trans_unlocked_or_in_restart_error() is an Obviously Correct (tm)
panic() wrapper, add it to the list of known noreturns.
Fixes: b318882022a8 ("bcachefs: bch2_trans_verify_not_unlocked_or_in_restart()")
Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
Link: https://lore.kernel.org/r/20250218064230.219997-1-youling.tang@linux.dev
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
A C jump table (such as the one used by the BPF interpreter) is a const
global array of absolute code addresses, and this means that the actual
values in the table may not be known until the kernel is booted (e.g.,
when using KASLR or when the kernel VA space is sized dynamically).
When using PIE codegen, the compiler will default to placing such const
global objects in .data.rel.ro (which is annotated as writable), rather
than .rodata (which is annotated as read-only). As C jump tables are
explicitly emitted into .rodata, this used to result in warnings for
LoongArch builds (which uses PIE codegen for the entire kernel) like
Warning: setting incorrect section attributes for .rodata..c_jump_table
due to the fact that the explicitly specified .rodata section inherited
the read-write annotation that the compiler uses for such objects when
using PIE codegen.
This warning was suppressed by explicitly adding the read-only
annotation to the __attribute__((section(""))) string, by commit
c5b1184decc8 ("compiler.h: specify correct attribute for .rodata..c_jump_table")
Unfortunately, this hack does not work on Clang's integrated assembler,
which happily interprets the appended section type and permission
specifiers as part of the section name, which therefore no longer
matches the hard-coded pattern '.rodata..c_jump_table' that objtool
expects, causing it to emit a warning
kernel/bpf/core.o: warning: objtool: ___bpf_prog_run+0x20: sibling call from callable instruction with modified stack frame
Work around this, by emitting C jump tables into .data.rel.ro instead,
which is treated as .rodata by the linker script for all builds, not
just PIE based ones.
Fixes: c5b1184decc8 ("compiler.h: specify correct attribute for .rodata..c_jump_table")
Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn> # on LoongArch
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250221135704.431269-6-ardb+git@google.com
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
Exception branch returns without freeing 'fi'.
Signed-off-by: liuye <liuye@kylinos.cn>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250114082650.113105-1-liuye@kylinos.cn
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- A fix for cacheinfo DT probing to avoid reading non-boolean
properties as booleans.
- A fix for cpufeature to use bitmap_equal() instead of memcmp(), so
unused bits are ignored.
- Fixes for cmpxchg and futex cmpxchg that properly encode the sign
extension requirements on inline asm, which results in spurious
successes. This manifests in at least inode_set_ctime_current, but is
likely just a disaster waiting to happen.
- A fix for the rseq selftests, which was using an invalid constraint.
- A pair of fixes for signal frame size handling:
- We were reserving space for an extra empty extension context
header on systems with extended signal context, thus resulting in
unnecessarily large allocations.
- We weren't properly checking for available extensions before
calculating the signal stack size, which resulted in undersized
stack allocations on some systems (at least those with T-Head
custom vectors).
Also, we've added Alex as a reviewer. He's been helping out a ton
lately, thanks!
* tag 'riscv-for-linus-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
MAINTAINERS: Add myself as a riscv reviewer
riscv: signal: fix signal_minsigstksz
riscv: signal: fix signal frame size
rseq/selftests: Fix riscv rseq_offset_deref_addv inline asm
riscv/futex: sign extend compare value in atomic cmpxchg
riscv/atomic: Do proper sign extension also for unsigned in arch_cmpxchg
riscv: cpufeature: use bitmap_equal() instead of memcmp()
riscv: cacheinfo: Use of_property_present() for non-boolean properties
|
|
Shawn Lin <shawn.lin@rock-chips.com> says:
This patchset adds initial UFS controller supprt for RK3576 SoC.
Patch 1 is the dt-bindings. Patch 2-4 deal with rpm and spm support
in advanced suggested by Ulf. Patch 5 exports two new APIs for host
driver. Patch 6 and 7 are the host driver and dtsi support.
Link: https://lore.kernel.org/r/1738736156-119203-1-git-send-email-shawn.lin@rock-chips.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In parse_abi function,the dyn_test fails because the
enable_file isn’t closed after successfully registering an event.
By adding wait_for_delete(), the dyn_test now passes as expected.
Link: https://lore.kernel.org/r/20250221033555.326716-1-realxxyq@163.com
Signed-off-by: Yiqian Xun <xunyiqian@kylinos.cn>
Acked-by: Beau Belgrave <beaub@linux.microsoft.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Test XDP and HDS interaction. While at it add a test for using the IOCTL,
as that turned out to be the real culprit.
Testing bnxt:
# NETIF=eth0 ./ksft-net-drv/drivers/net/hds.py
KTAP version 1
1..12
ok 1 hds.get_hds
ok 2 hds.get_hds_thresh
ok 3 hds.set_hds_disable # SKIP disabling of HDS not supported by the device
ok 4 hds.set_hds_enable
ok 5 hds.set_hds_thresh_zero
ok 6 hds.set_hds_thresh_max
ok 7 hds.set_hds_thresh_gt
ok 8 hds.set_xdp
ok 9 hds.enabled_set_xdp
ok 10 hds.ioctl
ok 11 hds.ioctl_set_xdp
ok 12 hds.ioctl_enabled_set_xdp
# Totals: pass:11 fail:0 xfail:0 xpass:0 skip:1 error:0
and netdevsim:
# ./ksft-net-drv/drivers/net/hds.py
KTAP version 1
1..12
ok 1 hds.get_hds
ok 2 hds.get_hds_thresh
ok 3 hds.set_hds_disable
ok 4 hds.set_hds_enable
ok 5 hds.set_hds_thresh_zero
ok 6 hds.set_hds_thresh_max
ok 7 hds.set_hds_thresh_gt
ok 8 hds.set_xdp
ok 9 hds.enabled_set_xdp
ok 10 hds.ioctl
ok 11 hds.ioctl_set_xdp
ok 12 hds.ioctl_enabled_set_xdp
# Totals: pass:12 fail:0 xfail:0 xpass:0 skip:0 error:0
Netdevsim needs a sane default for tx/rx ring size.
ethtool 6.11 is needed for the --disable-netlink option.
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Tested-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20250221025141.1132944-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a selftest case to iou-zcrx where the sender sends 4x4K = 16K and
the receiver does 4x4K recvzc requests. Validate that the requests
complete successfully and that the data is not corrupted.
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://lore.kernel.org/r/20250224041319.2389785-3-dw@davidwei.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Some tools like KGraphViewer interpret the "ON" nodes not having an
explicitly set fill colour as them being entirely black, which obscures
the text on them and looks funny. In fact, I thought they were off for
the longest time. Comparing to the output of the `dot` tool, I assume
they are supposed to be white.
Instead of speclawyering over who's in the wrong and must immediately
atone for their wickedness at the altar of RFC2119, just be explicit
about it, set the fillcolor to white, and nobody gets confused.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Tested-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Link: https://patch.msgid.link/20250221-dapm-graph-node-colour-v1-1-514ed0aa7069@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Similarly to scx_bpf_nr_cpu_ids(), introduce a new kfunc
scx_bpf_nr_node_ids() to expose the maximum number of NUMA nodes in the
system.
BPF schedulers can use this information together with the new node-aware
kfuncs, for example to create per-node DSQs, validate node IDs, etc.
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Add a rudimentary test for validating KVM's handling of L1 hypervisor
intercepts during instruction emulation on behalf of L2. To minimize
complexity and avoid overlap with other tests, only validate KVM's
handling of instructions that L1 wants to intercept, i.e. that generate a
nested VM-Exit. Full testing of emulation on behalf of L2 is better
achieved by running existing (forced) emulation tests in a VM, (although
on VMX, getting L0 to emulate on #UD requires modifying either L1 KVM to
not intercept #UD, or modifying L0 KVM to prioritize L0's exception
intercepts over L1's intercepts, as is done by KVM for SVM).
Since emulation should never be successful, i.e. L2 always exits to L1,
dynamically generate the L2 code stream instead of adding a helper for
each instruction. Doing so requires hand coding instruction opcodes, but
makes it significantly easier for the test to compute the expected "next
RIP" and instruction length.
Link: https://lore.kernel.org/r/20250201015518.689704-12-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
"Function graph accounting fixes:
- Fix the manage ops hashes
The function graph registers a "manager ops" and "sub-ops" to
ftrace. The manager ops does not have any callback but calls the
sub-ops callbacks. The manage ops hashes (what is used to tell
ftrace what functions to attach to) is built on the sub-ops it
manages.
There was an error in the way it built the hash. An empty hash
means to attach to all functions. When the manager ops had one
sub-ops it properly copied its hash. But when the manager ops had
more than one sub-ops, it went into a loop to make a set of all
functions it needed to add to the hash. If any of the subops hashes
was empty, that would mean to attach to all functions. The error
was that the first iteration of the loop passed in an empty hash to
start with in order to add the other hashes. That starting hash was
mistaken as to attach to all functions. This made the manage ops
attach to all functions whenever it had two or more sub-ops, even
if each sub-op was attached to only a single function.
- Do not add duplicate entries to the manager ops hash
If two or more subops hashes trace the same function, an entry for
that function will be added to the manager ops for each subops.
This causes waste and extra overhead.
Fprobe accounting fixes:
- Remove last function from fprobe hash
Fprobes has a ftrace hash to manage which functions an fprobe is
attached to. It also has a counter of how many fprobes are
attached. When the last fprobe is removed, it unregisters the
fprobe from ftrace but does not remove the functions the last
fprobe was attached to from the hash. This leaves the old functions
attached. When a new fprobe is added, the fprobe infrastructure
attaches to not only the functions of the new fprobe, but also to
the functions of the last fprobe.
- Fix accounting of the fprobe counter
When a fprobe is added, it updates a counter. If the counter goes
from zero to one, it attaches its ops to ftrace. When an fprobe is
removed, the counter is decremented. If the counter goes from 1 to
zero, it removes the fprobes ops from ftrace.
There was an issue where if two fprobes trace the same function,
the addition of each fprobe would increment the counter. But when
removing the first of the fprobes, it would notice that another
fprobe is still attached to one of its functions no it does not
remove the functions from the ftrace ops.
But it also did not decrement the counter, so when the last fprobe
is removed, the counter is still one. This leaves the fprobes
callback still registered with ftrace and it being called by the
functions defined by the fprobes ops hash. Worse yet, because all
the functions from the fprobe ops hash have been removed, that
tells ftrace that it wants to trace all functions.
Thus, this puts the state of the system where every function is
calling the fprobe callback handler (which does nothing as there
are no registered fprobes), but this causes a good 13% slow down of
the entire system.
Other updates:
- Add a selftest to test the above issues to prevent regressions.
- Fix preempt count accounting in function tracing
Better recursion protection was added to function tracing which
added another layer of preempt disable. As the preempt_count gets
traced in the event, it needs to subtract the amount of preempt
disabling the tracer does to record what the preempt_count was when
the trace was triggered.
- Fix memory leak in output of set_event
A variable is passed by the seq_file functions in the location that
is set by the return of the next() function. The start() function
allocates it and the stop() function frees it. But when the last
item is found, the next() returns NULL which leaks the data that
was allocated in start(). The m->private is used for something
else, so have next() free the data when it returns NULL, as stop()
will then just receive NULL in that case"
* tag 'ftrace-v6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix memory leak when reading set_event file
ftrace: Correct preemption accounting for function tracing.
selftests/ftrace: Update fprobe test to check enabled_functions file
fprobe: Fix accounting of when to unregister from function graph
fprobe: Always unregister fgraph function from ops
ftrace: Do not add duplicate entries in subops manager ops
ftrace: Fix accounting of adding subops to a manager ops
|
|
Remove a leftover shell script reference from commit 313b38a6ecb4
("lib/prime_numbers: convert self-test to KUnit").
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202502171110.708d965a-lkp@intel.com
Fixes: 313b38a6ecb4 ("lib/prime_numbers: convert self-test to KUnit")
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Link: https://lore.kernel.org/r/20250217-fix-prime-numbers-v1-1-eb0ca7235e60@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
This test adds coverage of expected errors during rseq registration and
unregistration, it disables glibc integration and will thus always
exercise the rseq syscall explictly.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250121213402.1754762-1-mjeanson@efficios.com
|
|
Recent change in how get_user() handles pointers:
https://lore.kernel.org/all/20241024013214.129639-1-torvalds@linux-foundation.org/
has a specific case for LAM. It assigns a different bitmask that's
later used to check whether a pointer comes from userland in get_user().
Add test case to LAM that utilizes a ioctl (FIOASYNC) syscall which uses
get_user() in its implementation. Execute the syscall with differently
tagged pointers to verify that valid user pointers are passing through
and invalid kernel/non-canonical pointers are not.
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/1624d9d1b9502517053a056652d50dc5d26884ac.1737990375.git.maciej.wieczor-retman@intel.com
|
|
Until LASS is merged into the kernel:
https://lore.kernel.org/all/20241028160917.1380714-1-alexander.shishkin@linux.intel.com/
LAM is left disabled in the config file. Running the LAM selftest with
disabled LAM only results in unhelpful output.
Use one of LAM syscalls() to determine whether the kernel was compiled
with LAM support (CONFIG_ADDRESS_MASKING) or not. Skip running the tests
in the latter case.
Merge CPUID checking function with the one mentioned above to achieve a
single function that shows LAM's availability from both CPU and the
kernel.
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/251d0f45f6a768030115e8d04bc85458910cb0dc.1737990375.git.maciej.wieczor-retman@intel.com
|
|
In current form cpu_has_la57() reports platform's support for LA57
through reading the output of cpuid. A much more useful information is
whether 5-level paging is actually enabled on the running system.
Check whether 5-level paging is enabled by trying to map a page in the
high linear address space.
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/8b1ca51b13e6d94b5a42b6930d81b692cbb0bcbb.1737990375.git.maciej.wieczor-retman@intel.com
|
|
The current test marks all unexpected return values as failed and sets ret
to 1. If a test is skipped, the entire test also returns 1, incorrectly
indicating failure.
To fix this, add a skipped variable and set ret to 4 if it was previously
0. Otherwise, keep ret set to 1.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20250220085326.1512814-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add tests for FIB rules that match on DSCP with a mask. Test both good
and bad flows and both the input and output paths.
# ./fib_rule_tests.sh
IPv6 FIB rule tests
[...]
TEST: rule6 check: dscp redirect to table [ OK ]
TEST: rule6 check: dscp no redirect to table [ OK ]
TEST: rule6 del by pref: dscp redirect to table [ OK ]
TEST: rule6 check: iif dscp redirect to table [ OK ]
TEST: rule6 check: iif dscp no redirect to table [ OK ]
TEST: rule6 del by pref: iif dscp redirect to table [ OK ]
TEST: rule6 check: dscp masked redirect to table [ OK ]
TEST: rule6 check: dscp masked no redirect to table [ OK ]
TEST: rule6 del by pref: dscp masked redirect to table [ OK ]
TEST: rule6 check: iif dscp masked redirect to table [ OK ]
TEST: rule6 check: iif dscp masked no redirect to table [ OK ]
TEST: rule6 del by pref: iif dscp masked redirect to table [ OK ]
[...]
Tests passed: 316
Tests failed: 0
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/20250220080525.831924-7-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Martin KaFai Lau says:
====================
pull-request: bpf-next 2025-02-20
We've added 19 non-merge commits during the last 8 day(s) which contain
a total of 35 files changed, 1126 insertions(+), 53 deletions(-).
The main changes are:
1) Add TCP_RTO_MAX_MS support to bpf_set/getsockopt, from Jason Xing
2) Add network TX timestamping support to BPF sock_ops, from Jason Xing
3) Add TX metadata Launch Time support, from Song Yoong Siang
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
igc: Add launch time support to XDP ZC
igc: Refactor empty frame insertion for launch time support
net: stmmac: Add launch time support to XDP ZC
selftests/bpf: Add launch time request to xdp_hw_metadata
xsk: Add launch time hardware offload support to XDP Tx metadata
selftests/bpf: Add simple bpf tests in the tx path for timestamping feature
bpf: Support selective sampling for bpf timestamping
bpf: Add BPF_SOCK_OPS_TSTAMP_SENDMSG_CB callback
bpf: Add BPF_SOCK_OPS_TSTAMP_ACK_CB callback
bpf: Add BPF_SOCK_OPS_TSTAMP_SND_HW_CB callback
bpf: Add BPF_SOCK_OPS_TSTAMP_SND_SW_CB callback
bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback
net-timestamp: Prepare for isolating two modes of SO_TIMESTAMPING
bpf: Disable unsafe helpers in TX timestamping callbacks
bpf: Prevent unsafe access to the sock fields in the BPF timestamping callback
bpf: Prepare the sock_ops ctx and call bpf prog for TX timestamping
bpf: Add networking timestamping support to bpf_get/setsockopt()
selftests/bpf: Add rto max for bpf_setsockopt test
bpf: Support TCP_RTO_MAX_MS for bpf_setsockopt
====================
Link: https://patch.msgid.link/20250221022104.386462-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
- Add test for creating link in another netns when a link of the same
name and ifindex exists in current netns.
- Add test to verify that link is created in target netns directly -
no link new/del events should be generated in link netns or current
netns.
- Add test cases to verify that link-netns is set as expected for
various drivers and combination of namespace-related parameters.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Link: https://patch.msgid.link/20250219125039.18024-14-shaw.leon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Change netns of current thread and switch back on context exit.
For example:
with NetNSEnter("ns1"):
ip("link add dummy0 type dummy")
The command be executed in netns "ns1".
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Link: https://patch.msgid.link/20250219125039.18024-13-shaw.leon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A few bugs were found in the fprobe accounting logic along with it using
the function graph infrastructure. Update the fprobe selftest to catch
those bugs in case they or something similar shows up in the future.
The test now checks the enabled_functions file which shows all the
functions attached to ftrace or fgraph. When enabling a fprobe, make sure
that its corresponding function is also added to that file. Also add two
more fprobes to enable to make sure that the fprobe logic works properly
with multiple probes.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Link: https://lore.kernel.org/20250220202055.733001756@goodmis.org
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Tested-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
new patches
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Fix the grammatical/spelling errors in sysctl/sysctl.sh.
This fixes all errors pointed out by codespell in the file.
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
|
|
The test is for AF_XDP, we refer to AF_XDP as XSK.
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Tested-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250219234956.520599-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Avoid exceptions when xsk attr is not present, and add a proper ksft
helper for "not in" condition.
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Tested-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250219234956.520599-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|