|
Global variables are accessed using PC-relative address so it needs to
be handled separately. The PC-rel addressing is detected by using
DWARF_REG_PC. On x86, %rip register would be used.
The address can be calculated using the ip and offset in the
instruction. But it should start from the next instruction so add
calculate_pcrel_addr() to do it properly.
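To illustrate the calculation described above, here is a minimal sketch. It is not the actual perf code: the function name and its parameters are hypothetical. The target address of a PC-relative operand is the address of the next instruction plus the displacement:

```python
def pcrel_target(ip: int, insn_len: int, disp: int) -> int:
    """Return the address referenced by a PC-relative operand.

    ip       -- address of the current instruction
    insn_len -- length of the current instruction in bytes
    disp     -- signed displacement encoded in the instruction
    """
    # On x86-64, %rip holds the address of the *next* instruction
    # while the current one executes, so the base is ip + insn_len.
    next_ip = ip + insn_len
    # Wrap to 64 bits so that negative displacements behave like hardware.
    return (next_ip + disp) & 0xFFFFFFFFFFFFFFFF


# Hypothetical example: a 7-byte instruction at 0x1000 with offset 0x2000.
print(hex(pcrel_target(0x1000, 7, 0x2000)))  # 0x3007
```

Using only the ip without adding the instruction length would give an address that is off by the size of the instruction.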
But global variables defined in a different file would only have a
declaration which doesn't include a location list. So it first tries
to get the type info using the address, and then looks up the variable
declarations by name. The name of a global variable should be obtained
from the symbol table. The declaration would have the type info.
So extend find_var_type() to take both address and name for global
variables.
The stats now look like:
Annotate data type stats:
total 294, ok 153 (52.0%), bad 141 (48.0%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
61 : no_var
10 : no_typeinfo
8 : bad_offset
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-7-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Extend find_data_type_die() to find data type from PC-relative address
using die_find_variable_by_addr(). Users need to pass the address for
the (global) variable.
The offset for the variable should be updated after finding the type
because the offset in the instruction is only used to calculate the address
of the variable. So the function now takes a pointer to the offset, renamed
to 'poffset'.
First it searches variables in the CU DIE as it's likely that the global
variables are defined in the file level. And then it iterates the scope
DIEs to find a local (static) variable.
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-6-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
A typical function prologue and epilogue include multiple stack
operations to save and restore the current value of registers.
On x86, it looks like below:
push r15
push r14
push r13
push r12
...
pop r12
pop r13
pop r14
pop r15
ret
As these all touch the stack memory region, chances are high that they
appear in memory profile data. But these are not used for any real
purpose yet, so they would return no types.
One of my profiles shows that a non-negligible portion of the data came from
the stack operations. It also seems GCC generates more stack operations
than clang.
Annotate Instruction stats
total 264, ok 169 (64.0%), bad 95 (36.0%)
Name : Good Bad
-----------------------------------------------------------
movq : 49 27
movl : 24 9
popq : 0 19 <-- here
cmpl : 17 2
addq : 14 1
cmpq : 12 2
cmpxchgl : 3 7
Instead of treating them as unknown, let's create a separate pseudo type
to represent those stack operations.
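The idea can be sketched roughly as follows. This is an illustration only: the function name is hypothetical, and the mnemonic prefixes are an assumption based on the instructions in the listing above, not a statement of which instructions perf actually checks.

```python
# Assumption: mnemonics that only save/restore registers or return
# addresses on the stack, taken from the prologue/epilogue shown above.
STACK_OP_PREFIXES = ("push", "pop", "ret")


def classify(mnemonic, resolved_type=None):
    """Return a type label for a sampled instruction."""
    if resolved_type is not None:
        # A real data type was found through the normal lookup.
        return resolved_type
    if mnemonic.startswith(STACK_OP_PREFIXES):
        # Prefix match so that suffixed forms such as 'popq' are covered.
        return "(stack operation)"
    return "(unknown)"


print(classify("popq"))                 # (stack operation)
print(classify("movq"))                 # (unknown)
print(classify("movq", "struct page"))  # struct page
```

Keeping stack operations in their own bucket means they no longer inflate the unknown count in the statistics.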
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-5-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
On x86, instructions for array access often look like below.
mov 0x1234(%rax,%rbx,8), %rcx
Usually the first register holds the type information and the second one
has the index. And the current code only looks up a variable for the
first register. But it can be the other way around, so it
needs to check the second register if the first one failed.
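A small sketch of the fallback, under stated assumptions: the regular expression, the function name and the register-to-type table below are all hypothetical and stand in for the real DWARF variable lookup.

```python
import re

# Matches AT&T-style memory operands such as "0x1234(%rax,%rbx,8)".
OPERAND_RE = re.compile(
    r"(?P<disp>-?(?:0x)?[0-9a-fA-F]+)?"
    r"\(%(?P<base>\w+)(?:,%(?P<index>\w+))?(?:,(?P<scale>[1248]))?\)"
)


def find_type(operand, reg_types):
    """Look up a type via the first register, then via the second one."""
    m = OPERAND_RE.fullmatch(operand)
    if m is None:
        return None
    for reg in (m.group("base"), m.group("index")):
        # reg_types is a stand-in for the variable lookup by register.
        if reg is not None and reg in reg_types:
            return reg_types[reg]
    return None


# Only the second register is known to hold a typed variable here.
print(find_type("0x1234(%rax,%rbx,8)", {"rbx": "long[]"}))  # long[]
```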
The stat changed like this.
Annotate data type stats:
total 294, ok 148 (50.3%), bad 146 (49.7%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
66 : no_var
10 : no_typeinfo
8 : bad_offset
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-4-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
When a sample comes from a conditional branch without a memory
operand, it could be due to a macro fusion with a previous instruction.
So it needs to check the memory operand in the previous one.
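As a rough sketch of that check (the Insn class and helper name are hypothetical, and treating every 'j*' mnemonic except 'jmp*' as a conditional branch is a simplification):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Insn:
    name: str                          # mnemonic, e.g. "cmpl" or "jne"
    mem_operand: Optional[str] = None  # memory operand text, if any


def insn_for_sample(insns: List[Insn], idx: int) -> Insn:
    """Pick the instruction whose memory operand should be analyzed."""
    insn = insns[idx]
    is_cond_branch = insn.name.startswith("j") and not insn.name.startswith("jmp")
    if insn.mem_operand is None and is_cond_branch and idx > 0:
        prev = insns[idx - 1]
        # With macro fusion the sample may really belong to the
        # preceding instruction, so use its memory operand instead.
        if prev.mem_operand is not None:
            return prev
    return insn


code = [Insn("cmpl", "0x10(%rdi)"), Insn("jne")]
print(insn_for_sample(code, 1).name)  # cmpl
```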
This improves the stat like below:
Annotate data type stats:
total 294, ok 147 (50.0%), bad 147 (50.0%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
71 : no_var
6 : no_typeinfo
8 : bad_offset
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
For performance reasons, I prefer llvm-objdump over GNU's. But I
found that llvm-objdump puts x86 lock prefix in a separate line like
below.
ffffffff81000695: f0 lock
ffffffff81000696: ff 83 54 0b 00 00 incl 2900(%rbx)
This should be parsed properly, but I just changed to find the insn
with next offset for now.
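The workaround can be pictured with the sketch below. It is an illustration under assumptions, not the perf implementation: disassembly lines are modeled as hypothetical (offset, text) tuples sorted by offset.

```python
def find_insn(lines, offset):
    """Return the (offset, text) entry to use for a sampled offset."""
    for i, (off, text) in enumerate(lines):
        if off != offset:
            continue
        # llvm-objdump prints the lock prefix on its own line; in that
        # case use the instruction at the next offset instead.
        if text == "lock" and i + 1 < len(lines):
            return lines[i + 1]
        return (off, text)
    return None


# The two lines from the llvm-objdump output shown above.
disasm = [
    (0xFFFFFFFF81000695, "lock"),
    (0xFFFFFFFF81000696, "incl 2900(%rbx)"),
]
print(find_insn(disasm, 0xFFFFFFFF81000695)[1])  # incl 2900(%rbx)
```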
This improves the statistics as it can process more instructions.
Annotate data type stats:
total 294, ok 144 (49.0%), bad 150 (51.0%)
-----------------------------------------------------------
30 : no_sym
35 : no_mem_ops
71 : no_var
6 : no_typeinfo
8 : bad_offset
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
If pkg-config is not installed when libtraceevent is linked, the build fails.
The error information is as follows:
$ make
<SNIP>
In file included from /home/yjh/projects_linux/perf-tool-next/linux/tools/perf/util/evsel.c:43:
/home/yjh/projects_linux/perf-tool-next/linux/tools/perf/util/trace-event.h:149:62: error: operator '&&' has no right operand
149 | #if defined(LIBTRACEEVENT_VERSION) && LIBTRACEEVENT_VERSION >= MAKE_LIBTRACEEVENT_VERSION(1, 5, 0)
| ^~
error: command '/usr/bin/gcc' failed with exit code 1
cp: cannot stat 'python_ext_build/lib/perf*.so': No such file or directory
make[2]: *** [Makefile.perf:668: python/perf.cpython-310-x86_64-linux-gnu.so] Error 1
make[2]: *** Waiting for unfinished jobs....
Because pkg-config is not installed, the libtraceevent version cannot be
obtained in Makefile.config. As a result, LIBTRACEEVENT_VERSION is empty.
However, the preceding error information is not user-friendly.
Identify errors in advance by checking that pkg-config is installed at
compile time.
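The actual check lives in Makefile.config; the Python sketch below only mirrors the logic for illustration. The function name and its parameters are hypothetical, and the lookup function is injectable so that the behavior can be exercised without changing the system.

```python
import shutil


def check_pkg_config(need_libtraceevent, which=shutil.which):
    """Fail early when pkg-config is required but not installed."""
    if need_libtraceevent and which("pkg-config") is None:
        raise RuntimeError(
            "pkg-config needed by libtraceevent is missing on this system, "
            "please install it."
        )
    return True


# Building without libtraceevent does not need pkg-config at all.
print(check_pkg_config(False, which=lambda name: None))  # True
```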
The build results of various scenarios are as follows:
1. build successful when libtraceevent is not linked and pkg-config is not installed
$ pkg-config --version
-bash: /usr/bin/pkg-config: No such file or directory
$ make clean >/dev/null
$ make NO_LIBTRACEEVENT=1 >/dev/null
Makefile.config:1133: No alternatives command found, you need to set JDIR= to point to the root of your Java directory
PERF_VERSION = 6.7.rc6.gd988c9f511af
$ echo $?
0
2. build fails with a clear error when libtraceevent is linked and pkg-config is missing
$ pkg-config --version
-bash: /usr/bin/pkg-config: No such file or directory
$ make clean >/dev/null
$ make >/dev/null
Makefile.config:221: *** Error: pkg-config needed by libtraceevent is missing on this system, please install it. Stop.
make[1]: *** [Makefile.perf:251: sub-make] Error 2
make: *** [Makefile:70: all] Error 2
$ echo $?
2
3. build successful when libtraceevent is linked and pkg-config is installed
$ pkg-config --version
0.29.2
$ make clean >/dev/null
$ make >/dev/null
Makefile.config:1133: No alternatives command found, you need to set JDIR= to point to the root of your Java directory
PERF_VERSION = 6.7.rc6.gd988c9f511af
$ echo $?
0
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240112034019.3558584-1-yangjihong1@huawei.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
This test case often fails on s390 (about 2 out of 10 runs) because the
difference between --bpf-counters event counting and s390 hardware
counting exceeds the 10% limit in all failure cases.
Raise the limit to 20% on s390 so that the test case succeeds.
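A minimal sketch of such a tolerance check follows. The function name and the counts are hypothetical, and measuring the difference relative to the first count is an assumption rather than a description of the test script.

```python
def within_limit(reference: int, measured: int, limit_pct: int) -> bool:
    """Check that 'measured' is within limit_pct percent of 'reference'."""
    allowed = reference * limit_pct / 100
    return abs(measured - reference) <= allowed


# Hypothetical counts that differ by 15%.
print(within_limit(1_000_000, 1_150_000, 10))  # False
print(within_limit(1_000_000, 1_150_000, 20))  # True
```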
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: sumanthk@linux.ibm.com
Cc: svens@linux.ibm.com
Link: https://lore.kernel.org/r/20240108084009.3959211-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Prevent warnings of the form:
tools/testing/nvdimm/config_check.c:4:6: error: no previous prototype
for ‘check’ [-Werror=missing-prototypes]
...by locally disabling some warnings.
It turns out that:
Commit 0fcb70851fbf ("Makefile.extrawarn: turn on missing-prototypes globally")
...in addition to expanding in-tree coverage, also impacts out-of-tree
module builds like those in tools/testing/nvdimm/.
Filter out the warning options on unit test code that does not affect
mainline builds.
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/170543984331.460832.1780246477583036191.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Prevent warnings of the form:
tools/testing/cxl/test/mock.c:44:6: error: no previous prototype for
‘__wrap_is_acpi_device_node’ [-Werror=missing-prototypes]
tools/testing/cxl/test/mock.c:63:5: error: no previous prototype for
‘__wrap_acpi_table_parse_cedt’ [-Werror=missing-prototypes]
tools/testing/cxl/test/mock.c:81:13: error: no previous prototype for
‘__wrap_acpi_evaluate_integer’ [-Werror=missing-prototypes]
...by locally disabling some warnings.
It turns out that:
Commit 0fcb70851fbf ("Makefile.extrawarn: turn on missing-prototypes globally")
...in addition to expanding in-tree coverage, also impacts out-of-tree
module builds like those in tools/testing/cxl/.
Filter out the warning options on unit test code that does not affect
mainline builds.
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/170543983780.460832.10920261849128601697.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Indexing with mm_cid is incompatible with skipping disallowed cpumask,
because concurrency IDs are based on a virtual ID allocation which is
unrelated to the physical CPU mask.
These issues can be reproduced by running the rseq selftests under a
taskset which excludes CPU 0, e.g.
taskset -c 10-20 ./run_param_test.sh
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
The test proves that a syscall can be livepatched. It is interesting
because syscalls are called in a tricky way. Also the process gets
livepatched either when sleeping in the userspace or when entering
or leaving the kernel space.
The livepatch is a bit tricky:
1. The syscall function name is architecture specific. Also
ARCH_HAS_SYSCALL_WRAPPER must be taken into account.
2. The syscall must stay working the same way for other processes
on the system. It is solved by decrementing a counter only
for PIDs of the test processes. It means that the test processes
have to call the livepatched syscall at least once.
The test creates one userspace process per online cpu. The processes
are calling getpid in a busy loop. The intention is to create random
locations when the livepatch gets enabled. Nothing is guaranteed.
The magic is in the randomness.
Reviewed-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
The modules are being moved from lib/livepatch to
tools/testing/selftests/livepatch/test_modules.
Moving the code will allow writing more complex tests, for example
userspace C code that calls a livepatched kernel function.
The modules are now built as out-of-tree
modules, but being part of the kernel source means they will be maintained.
Another advantage of moving the code is being able to easily change,
debug and rebuild the tests by running make in the selftests/livepatch
directory, which is not currently possible since the modules in
lib/livepatch are built and installed using the "modules" target.
The current approach also keeps the ability to execute the tests manually
by executing the scripts inside selftests/livepatch directory, as it's
currently supported. If the modules are modified, they need to be
rebuilt before running the scripts though.
The modules are built before running the selftests when using the
kselftest invocations:
make kselftest TARGETS=livepatch
or
make -C tools/testing/selftests/livepatch run_tests
Building the modules as out-of-tree modules requires replacing the
currently used 'modprobe' with 'insmod' and adapting the test scripts that
check the kernel message buffer.
Now it is possible to only compile the modules by running:
make -C tools/testing/selftests/livepatch/
This way the test modules and other test program can be built in order
to be packaged if so desired.
As there aren't any modules being built on lib/livepatch, remove the
TEST_LIVEPATCH Kconfig and its references.
Note: "make gen_tar" packages the pre-built binaries into the tarball.
It means that it will store the test modules pre-built for
the kernel running on the build host.
Note that these modules need not be binary compatible with
the kernel built from the same sources. But the same
is true for other packaged selftest binaries.
The entire kernel sources are needed for rebuilding
the selftests on another system.
Reviewed-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Add TEST_GEN_MODS_DIR variable for kselftests. It can point to
a directory containing kernel modules that will be used by
selftest scripts.
The modules are built as external modules for the running kernel.
As a result they are always binary compatible and the same tests
can be used for older or newer kernels.
The build requires "kernel-devel" package to be installed.
For example, in the upstream sources, the rpm devel package
is produced by "make rpm-pkg"
The modules can be built independently by
make -C tools/testing/selftests/livepatch/
or they will be automatically built before running the tests via
make -C tools/testing/selftests/livepatch/ run_tests
Note that they are _not_ built when running the standalone
tests by calling, for example, ./test-state.sh.
Along with TEST_GEN_MODS_DIR, it was necessary to create a new install
rule. INSTALL_MODS_RULE is needed because INSTALL_SINGLE_RULE would
copy the entire TEST_GEN_MODS_DIR directory to the destination, even
the files created by Kbuild to compile the modules. The new install
rule copies only the .ko files, as we would expect the gen_tar to work.
Reviewed-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
When walking directory trees, instead of looking for specific files and
running dirname to get the parent folder, traverse all folders and
ignore the ones not containing the desired files. This avoids the need
to call dirname inside the loop, which drastically decreases run time:
Running locally on a mt8192-asurada-spherion, which reports 160 test
cases, has gone from 5.5s to 2.9s, while running remotely with an
nfsroot has gone from 13.5s to 5.5s.
This change has a side-effect, which is that the root DT node now
also shows in the output, even though it isn't expected to bind to a
driver. However there shouldn't be a matching driver for the board
compatible, so the end result will be just an extra skipped test:
ok 1 / # SKIP
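The traversal strategy can be sketched as below. This is an illustration, not the selftest script: the function name is hypothetical, and using a file called 'compatible' as the desired file is an assumption made for the demo tree.

```python
import os
import tempfile


def dirs_containing(root, wanted):
    """Visit every directory once and keep those that contain 'wanted'."""
    result = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # No per-file dirname call: the directory path is already known.
        if wanted in filenames:
            result.append(dirpath)
    return sorted(result)


# Build a tiny throwaway tree: the root and soc/uart have the file,
# chosen does not.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "soc", "uart"))
    os.makedirs(os.path.join(root, "chosen"))
    for d in ("", os.path.join("soc", "uart")):
        open(os.path.join(root, d, "compatible"), "w").close()
    found = [os.path.relpath(p, root) for p in dirs_containing(root, "compatible")]

print(found)
```

Note that the root directory itself is reported too, which corresponds to the side effect described above.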
Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/all/310391e8-fdf2-4c2f-a680-7744eb685177@sirena.org.uk
Fixes: 14571ab1ad21 ("kselftest: Add new test for detecting unprobed Devicetree devices")
Tested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20240122-dt-kselftest-dirname-perf-fix-v2-1-f1630532fd38@collabora.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Replace SYS_<syscall> with __NR_<syscall>. Using the __NR_<syscall>
notation, provided by UAPI, is useful to build tests on systems without
the SYS_<syscall> definitions.
Replace SYS_move_mount with __NR_move_mount
Similar changes: commit 87129ef13603 ("selftests/landlock: Make tests build with old libc")
Acked-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Hu.Yadi <hu.yadi@h3c.com>
Link: https://lore.kernel.org/r/20240111113229.10820-1-hu.yadi@h3c.com
Reviewed-by: Berlin <berlin@h3c.com>
Suggested-by: Jiao <jiaoxupo@h3c.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
A build issue comes up because both mount.h headers are included in dev_in_maps.c:
In file included from dev_in_maps.c:10:
/usr/include/sys/mount.h:35:3: error: expected identifier before numeric constant
35 | MS_RDONLY = 1, /* Mount read-only. */
| ^~~~~~~~~
In file included from dev_in_maps.c:13:
After removing one of them to resolve the conflict, another error comes up:
dev_in_maps.c:170:6: error: implicit declaration of function ‘mount’ [-Werror=implicit-function-declaration]
170 | if (mount(NULL, "/", NULL, MS_SLAVE | MS_REC, NULL) == -1) {
| ^~~~~
cc1: all warnings being treated as errors
Then add a sys_mount definition to solve it.
After both changes, dev_in_maps.c builds correctly on my machine (gcc 10.2, glibc-2.32, kernel-5.10).
Signed-off-by: Hu Yadi <hu.yadi@h3c.com>
Link: https://lore.kernel.org/r/20240112074059.29673-1-hu.yadi@h3c.com
Acked-by: Andrei Vagin <avagin@google.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Allow the user to specify outside CFLAGS values as a make argument.
This corrects an issue where CFLAGS is passed as a make argument for
cpupower, but bench's makefile does not inherit and append to it.
Signed-off-by: Stanley Chan <schan@cloudflare.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:
- Support for tuning for systems with fast misaligned accesses.
- Support for SBI-based suspend.
- Support for the new SBI debug console extension.
- The T-Head CMOs now use PA-based flushes.
- Support for enabling the V extension in kernel code.
- Optimized IP checksum routines.
- Various ftrace improvements.
- Support for archrandom, which depends on the Zkr extension.
- The build is no longer broken under NET=n, KUNIT=y for ports that
don't define their own ipv6 checksum.
* tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (56 commits)
lib: checksum: Fix build with CONFIG_NET=n
riscv: lib: Check if output in asm goto supported
riscv: Fix build error on rv32 + XIP
riscv: optimize ELF relocation function in riscv
RISC-V: Implement archrandom when Zkr is available
riscv: Optimize hweight API with Zbb extension
riscv: add dependency among Image(.gz), loader(.bin), and vmlinuz.efi
samples: ftrace: Add RISC-V support for SAMPLE_FTRACE_DIRECT[_MULTI]
riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support
riscv: ftrace: Make function graph use ftrace directly
riscv: select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
lib/Kconfig.debug: Update AS_HAS_NON_CONST_LEB128 comment and name
riscv: Restrict DWARF5 when building with LLVM to known working versions
riscv: Hoist linker relaxation disabling logic into Kconfig
kunit: Add tests for csum_ipv6_magic and ip_fast_csum
riscv: Add checksum library
riscv: Add checksum header
riscv: Add static key for misaligned accesses
asm-generic: Improve csum_fold
RISC-V: selftests: cbo: Ensure asm operands match constraints
...
|
|
When tests are run by runner.sh, bond_options.sh gets killed before
it can complete:
make -C tools/testing/selftests run_tests TARGETS="drivers/net/bonding"
[...]
# timeout set to 120
# selftests: drivers/net/bonding: bond_options.sh
# TEST: prio (active-backup miimon primary_reselect 0) [ OK ]
# TEST: prio (active-backup miimon primary_reselect 1) [ OK ]
# TEST: prio (active-backup miimon primary_reselect 2) [ OK ]
# TEST: prio (active-backup arp_ip_target primary_reselect 0) [ OK ]
# TEST: prio (active-backup arp_ip_target primary_reselect 1) [ OK ]
# TEST: prio (active-backup arp_ip_target primary_reselect 2) [ OK ]
#
not ok 7 selftests: drivers/net/bonding: bond_options.sh # TIMEOUT 120 seconds
This test includes many sleep statements, at least some of which are
related to timers in the operation of the bonding driver itself. Increase
the test timeout to allow the test to complete.
I ran the test in slightly different VMs (including one without HW
virtualization support) and got runtimes of 13m39.760s, 13m31.238s, and
13m2.956s. Use a ~1.5x "safety factor" and set the timeout to 1200s.
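The arithmetic behind the chosen timeout can be checked with a short sketch. The helper name is hypothetical; the runtimes and the 1200s value are the ones quoted above.

```python
def to_seconds(duration: str) -> float:
    """Convert a 'time'-style string such as '13m39.760s' to seconds."""
    minutes, seconds = duration.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


runtimes = ["13m39.760s", "13m31.238s", "13m2.956s"]
worst = max(to_seconds(r) for r in runtimes)
timeout = 1200

# Ratio between the new timeout and the slowest observed run.
factor = timeout / worst
print(f"worst run: {worst:.2f}s, factor: {factor:.2f}")
```

The slowest run is about 820 seconds, so 1200s leaves a margin of a bit under 1.5x.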
Fixes: 42a8d4aaea84 ("selftests: bonding: add bonding prio option test")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20240116104402.1203850a@kernel.org/#t
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240118001233.304759-1-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
"Add Namhyung Kim as tools/perf/ co-maintainer, we're taking turns
processing patches, switching roles from perf-tools to perf-tools-next
at each Linux release.
Data profiling:
- Associate samples that identify loads and stores with data
structures. This uses events available on Intel, AMD and others and
DWARF info:
# To get memory access samples in kernel for 1 second (on Intel)
$ perf mem record -a -K --ldlat=4 -- sleep 1
# Similar for the AMD (but it requires 6.3+ kernel for BPF filters)
$ perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000' -- sleep 1
Then, amongst several modes of post processing, one can do things like:
$ perf report -s type,typeoff --hierarchy --group --stdio
...
#
# Samples: 10K of events 'cpu/mem-loads,ldlat=4/P, cpu/mem-stores/P, dummy:u'
# Event count (approx.): 602758064
#
# Overhead Data Type / Data Type Offset
# ........................... ............................
#
26.09% 3.28% 0.00% long unsigned int
26.09% 3.28% 0.00% long unsigned int +0 (no field)
18.48% 0.73% 0.00% struct page
10.83% 0.02% 0.00% struct page +8 (lru.next)
3.90% 0.28% 0.00% struct page +0 (flags)
3.45% 0.06% 0.00% struct page +24 (mapping)
0.25% 0.28% 0.00% struct page +48 (_mapcount.counter)
0.02% 0.06% 0.00% struct page +32 (index)
0.02% 0.00% 0.00% struct page +52 (_refcount.counter)
0.02% 0.01% 0.00% struct page +56 (memcg_data)
0.00% 0.01% 0.00% struct page +16 (lru.prev)
15.37% 17.54% 0.00% (stack operation)
15.37% 17.54% 0.00% (stack operation) +0 (no field)
11.71% 50.27% 0.00% (unknown)
11.71% 50.27% 0.00% (unknown) +0 (no field)
$ perf annotate --data-type
...
Annotate type: 'struct cfs_rq' in [kernel.kallsyms] (13 samples):
============================================================================
samples offset size field
13 0 640 struct cfs_rq {
2 0 16 struct load_weight load {
2 0 8 unsigned long weight;
0 8 4 u32 inv_weight;
};
0 16 8 unsigned long runnable_weight;
0 24 4 unsigned int nr_running;
1 28 4 unsigned int h_nr_running;
...
$ perf annotate --data-type=page --group
Annotate type: 'struct page' in [kernel.kallsyms] (480 samples):
event[0] = cpu/mem-loads,ldlat=4/P
event[1] = cpu/mem-stores/P
event[2] = dummy:u
===================================================================================
samples offset size field
447 33 0 0 64 struct page {
108 8 0 0 8 long unsigned int flags;
319 13 0 8 40 union {
319 13 0 8 40 struct {
236 2 0 8 16 union {
236 2 0 8 16 struct list_head lru {
236 1 0 8 8 struct list_head* next;
0 1 0 16 8 struct list_head* prev;
};
236 2 0 8 16 struct {
236 1 0 8 8 void* __filler;
0 1 0 16 4 unsigned int mlock_count;
};
236 2 0 8 16 struct list_head buddy_list {
236 1 0 8 8 struct list_head* next;
0 1 0 16 8 struct list_head* prev;
};
236 2 0 8 16 struct list_head pcp_list {
236 1 0 8 8 struct list_head* next;
0 1 0 16 8 struct list_head* prev;
};
};
82 4 0 24 8 struct address_space* mapping;
1 7 0 32 8 union {
1 7 0 32 8 long unsigned int index;
1 7 0 32 8 long unsigned int share;
};
0 0 0 40 8 long unsigned int private;
};
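To show how sample offsets relate to the fields in such a listing, here is a minimal sketch. It is not how perf is implemented: the function names and the sample offsets are hypothetical, and only a few 'struct page' members are copied from the output above. For members inside a union, the first listed name is used.

```python
from collections import Counter

# (offset, size, name) for a few 'struct page' members from the listing.
PAGE_FIELDS = [
    (0, 8, "flags"),
    (8, 8, "lru.next"),
    (16, 8, "lru.prev"),
    (24, 8, "mapping"),
    (32, 8, "index"),
    (40, 8, "private"),
]


def field_at(offset, fields=PAGE_FIELDS):
    """Return the name of the field covering 'offset'."""
    for start, size, name in fields:
        if start <= offset < start + size:
            return name
    return "(no field)"


def count_by_field(sample_offsets):
    """Aggregate sampled offsets into per-field sample counts."""
    return Counter(field_at(off) for off in sample_offsets)


# Hypothetical offsets of six samples within the struct.
counts = count_by_field([8, 8, 12, 0, 24, 100])
print(counts["lru.next"], counts["flags"], counts["(no field)"])  # 3 1 1
```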
This uses the existing annotate code, calling objdump to do the
disassembly, with improvements to avoid having this take too long,
but longer term a switch to a disassembler library, possibly
reusing code in the kernel will be pursued.
This is the initial implementation, please use it and report
impressions and bugs. Make sure the kernel-debuginfo packages match
the running kernel. The 'perf report' phase for non short perf.data
files may take a while.
There is a great article about it on LWN:
https://lwn.net/Articles/955709/ - "Data-type profiling for perf"
One last test I did while writing this text, on an AMD Ryzen 5950X,
using a distro kernel, while doing a simple 'find /' on an
otherwise idle system resulted in:
# uname -r
6.6.9-100.fc38.x86_64
# perf -vv | grep BPF_
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
bpf_skeletons: [ on ] # HAVE_BPF_SKEL
# rpm -qa | grep kernel-debuginfo
kernel-debuginfo-common-x86_64-6.6.9-100.fc38.x86_64
kernel-debuginfo-6.6.9-100.fc38.x86_64
#
# perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000'
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.199 MB perf.data (2913 samples) ]
#
# ls -la perf.data
-rw-------. 1 root root 2346486 Jan 9 18:36 perf.data
# perf evlist
ibs_op//
dummy:u
# perf evlist -v
ibs_op//: type: 11, size: 136, config: 0, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1
dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
#
# perf report -s type,typeoff --hierarchy --group --stdio
# Total Lost Samples: 0
#
# Samples: 2K of events 'ibs_op//, dummy:u'
# Event count (approx.): 1904553038
#
# Overhead Data Type / Data Type Offset
# ................... ............................
#
73.70% 0.00% (unknown)
73.70% 0.00% (unknown) +0 (no field)
3.01% 0.00% long unsigned int
3.00% 0.00% long unsigned int +0 (no field)
0.01% 0.00% long unsigned int +2 (no field)
2.73% 0.00% struct task_struct
1.71% 0.00% struct task_struct +52 (on_cpu)
0.38% 0.00% struct task_struct +2104 (rcu_read_unlock_special.b.blocked)
0.23% 0.00% struct task_struct +2100 (rcu_read_lock_nesting)
0.14% 0.00% struct task_struct +2384 ()
0.06% 0.00% struct task_struct +3096 (signal)
0.05% 0.00% struct task_struct +3616 (cgroups)
0.05% 0.00% struct task_struct +2344 (active_mm)
0.02% 0.00% struct task_struct +46 (flags)
0.02% 0.00% struct task_struct +2096 (migration_disabled)
0.01% 0.00% struct task_struct +24 (__state)
0.01% 0.00% struct task_struct +3956 (mm_cid_active)
0.01% 0.00% struct task_struct +1048 (cpus_ptr)
0.01% 0.00% struct task_struct +184 (se.group_node.next)
0.01% 0.00% struct task_struct +20 (thread_info.cpu)
0.00% 0.00% struct task_struct +104 (on_rq)
0.00% 0.00% struct task_struct +2456 (pid)
1.36% 0.00% struct module
0.59% 0.00% struct module +952 (kallsyms)
0.42% 0.00% struct module +0 (state)
0.23% 0.00% struct module +8 (list.next)
0.12% 0.00% struct module +216 (syms)
0.95% 0.00% struct inode
0.41% 0.00% struct inode +40 (i_sb)
0.22% 0.00% struct inode +0 (i_mode)
0.06% 0.00% struct inode +76 (i_rdev)
0.06% 0.00% struct inode +56 (i_security)
<SNIP>
perf top/report:
- Don't ignore job control, allowing control+Z + bg to work.
- Add s390 raw data interpretation for PAI (Processor Activity
Instrumentation) counters.
perf archive:
- Add new option '--all' to pack perf.data with DSOs.
- Add new option '--unpack' to expand tarballs.
Initialization speedups:
- Lazily initialize zstd streams to save memory when not using it.
- Lazily allocate/size mmap event copy.
- Lazy load kernel symbols in 'perf record'.
- Be lazier in allocating lost samples buffer in 'perf record'.
- Don't synthesize BPF events when disabled via the command line
(perf record --no-bpf-event).
Assorted improvements:
- Show note on AMD systems that the :p, :pp, :ppp and :P are all the
same, as IBS (Instruction Based Sampling) is used and it is
inherently precise, not having levels of precision like in Intel
systems.
- When 'cycles' isn't available, fall back to the "task-clock" event
when not system wide, not to 'cpu-clock'.
- Add --debug-file option to redirect debug output, e.g.:
$ perf --debug-file /tmp/perf.log record -v true
- Shrink 'struct map' to under one cacheline by avoiding function
pointers for selecting if addresses are identity or DSO relative,
and using just a byte for some boolean struct members.
- Resolve the arch specific strerrno just once to use in
perf_env__arch_strerrno().
- Reduce memory for recording PERF_RECORD_LOST_SAMPLES event.
Assorted fixes:
- Fix the default 'perf top' usage on Intel hybrid systems, now it
starts with a browser showing the number of samples for Efficiency
(cpu_atom/cycles/P) and Performance (cpu_core/cycles/P). This
behaviour is similar on ARM64, with its respective set of
big.LITTLE processors.
- Fix segfault on build_mem_topology() error path.
- Fix 'perf mem' error on hybrid related to availability of mem event
in a PMU.
- Fix missing reference count gets (map, maps) in the db-export code.
- Avoid recursively taking env->bpf_progs.lock in the 'perf_env'
code.
- Use the newly introduced maps__for_each_map() to add missing
locking around iteration of 'struct map' entries.
- Parse NOTE segments until the build id is found, don't stop on the
first one, ELF files may have several such NOTE segments.
- Remove 'egrep' usage, it's deprecated, use 'grep -E' instead.
- Warn first about missing libelf, not libbpf, that depends on
libelf.
- Use alternative to 'find ... -printf' as this isn't supported in
busybox.
- Address python 3.6 DeprecationWarning for string escapes.
- Fix memory leak in uniq() in libsubcmd.
- Fix man page formatting for 'perf lock'
- Fix some spelling mistakes.
perf tests:
- Fail shell tests that need some symbol in perf itself if it is
stripped. These tests check if a symbol is resolved, if some hot
function is indeed detected by profiling, etc.
- The 'perf test sigtrap' test is currently failing on PREEMPT_RT,
skip it if sleeping spinlocks are detected (using BTF) and point to
the mailing list discussion about it. This test is also being
skipped on several architectures (powerpc, s390x, arm and aarch64)
due to other pending issues with instruction breakpoints.
- Adjust test case perf record offcpu profiling tests for s390.
- Fix 'Setup struct perf_event_attr' fails on s390 on z/VM guest,
addressing issues caused by the fallback from cycles to task-clock
done in this release.
- Fix mask for VG register in the user-regs test.
- Use shellcheck on 'perf test' shell scripts automatically to make
sure changes don't introduce things it flags as problematic.
- Add option to change objdump binary and allow it to be set via
'perf config'.
- Add basic 'perf script', 'perf list --json' and 'perf diff' tests.
- Basic branch counter support.
- Make DSO tests a suite rather than individual.
- Remove atomics from test_loop to avoid test failures.
- Fix call chain match on powerpc for the record+probe_libc_inet_pton
test.
- Improve Intel hybrid tests.
Vendor event files (JSON):
powerpc:
- Update datasource event name to fix duplicate events on IBM's
Power10.
- Add PVN for HX-C2000 CPU with Power8 Architecture.
Intel:
- Alderlake/rocketlake metric fixes.
- Update emeraldrapids events to v1.02.
- Update icelakex events to v1.23.
- Update sapphirerapids events to v1.17.
- Add skx, clx, icx and spr upi bandwidth metric.
AMD:
- Add Zen 4 memory controller events.
RISC-V:
- Add StarFive Dubhe-80 and Dubhe-90 JSON files.
https://www.starfivetech.com/en/site/cpu-u
- Add T-HEAD C9xx JSON file.
https://github.com/riscv-software-src/opensbi/blob/master/docs/platform/thead-c9xx.md
ARM64:
- Remove UTF-8 characters from cmn.json, that were causing build
failure in some distros.
- Add core PMU events and metrics for Ampere One X.
- Rename Ampere One's BPU_FLUSH_MEM_FAULT to GPC_FLUSH_MEM_FAULT
libperf:
- Rename several perf_cpu_map constructor names to clarify what they
really do.
- Ditto for some other methods, coping with some issues in their
semantics, like perf_cpu_map__empty() ->
perf_cpu_map__has_any_cpu_or_is_empty().
- Document perf_cpu_map__nr()'s behavior
perf stat:
- Exit if parse groups fails.
- Combine the -A/--no-aggr and --no-merge options.
- Fix help message for --metric-no-threshold option.
Hardware tracing:
ARM64 CoreSight:
- Bump minimum OpenCSD version to ensure a bugfix is present.
- Add 'T' itrace option for timestamp trace
- Set start vm addr of executable file to 0 and don't ignore first
sample on the arm-cs-trace-disasm.py 'perf script'"
* tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (179 commits)
MAINTAINERS: Add Namhyung as tools/perf/ co-maintainer
perf test: test case 'Setup struct perf_event_attr' fails on s390 on z/vm
perf db-export: Fix missing reference count get in call_path_from_sample()
perf tests: Add perf script test
libsubcmd: Fix memory leak in uniq()
perf TUI: Don't ignore job control
perf vendor events intel: Update sapphirerapids events to v1.17
perf vendor events intel: Update icelakex events to v1.23
perf vendor events intel: Update emeraldrapids events to v1.02
perf vendor events intel: Alderlake/rocketlake metric fixes
perf x86 test: Add hybrid test for conflicting legacy/sysfs event
perf x86 test: Update hybrid expectations
perf vendor events amd: Add Zen 4 memory controller events
perf stat: Fix hard coded LL miss units
perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event
perf env: Avoid recursively taking env->bpf_progs.lock
perf annotate: Add --insn-stat option for debugging
perf annotate: Add --type-stat option for debugging
perf annotate: Support event group display
perf annotate: Add --data-type option
...
|
|
The KVM RISC-V allows Zfa extension for Guest/VM so let us
add this extension to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The KVM RISC-V allows Zvfh[min] extensions for Guest/VM so let us
add these extensions to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The KVM RISC-V allows Zihintntl extension for Guest/VM so let us
add this extension to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The KVM RISC-V allows Zfh[min] extensions for Guest/VM so let us
add these extensions to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The KVM RISC-V allows vector crypto extensions for Guest/VM so let us
add these extensions to get-reg-list test. This includes extensions
Zvbb, Zvbc, Zvkb, Zvkg, Zvkned, Zvknha, Zvknhb, Zvksed, Zvksh, and Zvkt.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The KVM RISC-V allows scalar crypto extensions for Guest/VM so let us
add these extensions to get-reg-list test. This includes extensions
Zbkb, Zbkc, Zbkx, Zknd, Zkne, Zknh, Zkr, Zksed, Zksh, and Zkt.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The KVM RISC-V allows Zbc extension for Guest/VM so let us add
this extension to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf and netfilter.
Previous releases - regressions:
- Revert "net: rtnetlink: Enslave device before bringing it up",
breaks the case inverse to the one it was trying to fix
- net: dsa: fix oob access in DSA's netdevice event handler, which
dereferenced netdev_priv() before checking that it's a DSA port
- sched: track device in tcf_block_get/put_ext() only for clsact
binder types
- net: tls, fix WARNING in __sk_msg_free when record becomes full
during splice and MORE hint set
- sfp-bus: fix SFP mode detect from bitrate
- drv: stmmac: prevent DSA tags from breaking COE
Previous releases - always broken:
- bpf: fix no forward progress in bpf_iter_udp if output buffer is
too small
- bpf: reject variable offset alu on registers with a type of
PTR_TO_FLOW_KEYS to prevent oob access
- netfilter: tighten input validation
- net: add more sanity check in virtio_net_hdr_to_skb()
- rxrpc: fix use of Don't Fragment flag on RESPONSE packets, avoid
infinite loop
- amt: do not use the portion of skb->cb area which may get clobbered
- mptcp: improve validation of the MPTCPOPT_MP_JOIN MPTCP option
Misc:
- spring cleanup of inactive maintainers"
* tag 'net-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits)
i40e: Include types.h to some headers
ipv6: mcast: fix data-race in ipv6_mc_down / mld_ifc_work
selftests: mlxsw: qos_pfc: Adjust the test to support 8 lanes
selftests: mlxsw: qos_pfc: Remove wrong description
mlxsw: spectrum_router: Register netdevice notifier before nexthop
mlxsw: spectrum_acl_tcam: Fix stack corruption
mlxsw: spectrum_acl_tcam: Fix NULL pointer dereference in error path
mlxsw: spectrum_acl_erp: Fix error flow of pool allocation failure
ethtool: netlink: Add missing ethnl_ops_begin/complete
selftests: bonding: Add more missing config options
selftests: netdevsim: add a config file
libbpf: warn on unexpected __arg_ctx type when rewriting BTF
selftests/bpf: add tests confirming type logic in kernel for __arg_ctx
bpf: enforce types for __arg_ctx-tagged arguments in global subprogs
bpf: extract bpf_ctx_convert_map logic and make it more reusable
libbpf: feature-detect arg:ctx tag support in kernel
ipvs: avoid stat macros calls from preemptible context
netfilter: nf_tables: reject NFT_SET_CONCAT with not field length description
netfilter: nf_tables: skip dead set elements in netlink dump
netfilter: nf_tables: do not allow mismatch field size and set key length
...
|
|
Pull CXL (Compute Express Link) updates from Dan Williams:
"The bulk of this update is support for enumerating the performance
capabilities of CXL memory targets and connecting that to a platform
CXL memory QoS class. Some follow-on work remains to hook up this data
into core-mm policy, but that is saved for v6.9.
The next significant update is unifying how CXL event records (things
like background scrub errors) are processed between so called
"firmware first" and native error record retrieval. The CXL driver
handler that processes the record retrieved from the device mailbox is
now the handler for that same record format coming from an EFI/ACPI
notification source.
This also contains miscellaneous feature updates, like Get Timestamp,
and other fixups.
Summary:
- Add support for parsing the Coherent Device Attribute Table (CDAT)
- Add support for calculating a platform CXL QoS class from CDAT data
- Unify the tracing of EFI CXL Events with native CXL Events.
- Add Get Timestamp support
- Miscellaneous cleanups and fixups"
* tag 'cxl-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (41 commits)
cxl/core: use sysfs_emit() for attr's _show()
cxl/pci: Register for and process CPER events
PCI: Introduce cleanup helpers for device reference counts and locks
acpi/ghes: Process CXL Component Events
cxl/events: Create a CXL event union
cxl/events: Separate UUID from event structures
cxl/events: Remove passing a UUID to known event traces
cxl/events: Create common event UUID defines
cxl/events: Promote CXL event structures to a core header
cxl: Refactor to use __free() for cxl_root allocation in cxl_endpoint_port_probe()
cxl: Refactor to use __free() for cxl_root allocation in cxl_find_nvdimm_bridge()
cxl: Fix device reference leak in cxl_port_perf_data_calculate()
cxl: Convert find_cxl_root() to return a 'struct cxl_root *'
cxl: Introduce put_cxl_root() helper
cxl/port: Fix missing target list lock
cxl/port: Fix decoder initialization when nr_targets > interleave_ways
cxl/region: fix x9 interleave typo
cxl/trace: Pass UUID explicitly to event traces
cxl/region: use %pap format to print resource_size_t
cxl/region: Add dev_dbg() detail on failure to allocate HPA space
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd updates from Jason Gunthorpe:
"This brings the first of three planned user IO page table invalidation
operations:
- IOMMU_HWPT_INVALIDATE allows invalidating the IOTLB integrated into
the iommu itself. The Intel implementation will also generate an
ATC invalidation to flush the device IOTLB as it unambiguously
knows the device, but other HW will not.
It goes along with the prior PR to implement userspace IO page tables
(aka nested translation for VMs) to allow Intel to have full
functionality for simple cases. An Intel implementation of the
operation is provided.
Also fix a small bug in the selftest mock iommu driver probe"
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
iommufd/selftest: Check the bus type during probe
iommu/vt-d: Add iotlb flush for nested domain
iommufd: Add data structure for Intel VT-d stage-1 cache invalidation
iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
iommufd/selftest: Add mock_domain_cache_invalidate_user support
iommu: Add iommu_copy_struct_from_user_array helper
iommufd: Add IOMMU_HWPT_INVALIDATE
iommu: Add cache_invalidate_user op
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:
- Allow kernel trace instance creation to specify what events are
created
Inside the kernel, a subsystem may create a tracing instance that it
can use to send events to user space. This sub-system may not care
about the thousands of events that exist in eventfs. Allow the
sub-system to specify what sub-systems of events it cares about, and
only those events are exposed to this instance.
- Allow the ring buffer to be broken up into bigger sub-buffers than
just the architecture page size.
A new tracefs file called "buffer_subbuf_size_kb" is created. The
user can now specify a minimum size the sub-buffer may be in
kilobytes. Note that the implementation currently makes the
sub-buffer size a power of 2 pages (1, 2, 4, 8, 16, ...) but the user
only writes in kilobyte size, and the sub-buffer will be updated to
the next size that can accommodate it. If the user writes in
10, it will change the size to be 4 pages on x86 (16K), as that is
the next available size that can hold 10K.
- Update the debug output when a corrupt time is detected in the ring
buffer. If the ring buffer detects inconsistent timestamps, there's a
debug config option that will dump the contents of the meta data of
the sub-buffer that is used for debugging. Add some more information
to this dump that helps with debugging.
- Add more timestamp debugging checks (only triggers when the config is
enabled)
- Increase the trace_seq iterator to 2 page sizes.
- Allow strings written into tracefs_marker to be larger. Up to just
under 2 page sizes (based on what trace_seq can hold).
- Increase the trace_marker_raw write to be as big as a sub-buffer can
hold.
- Remove 32 bit time stamp logic, now that the rb_time_cmpxchg() has
been removed.
- More selftests were added.
- Some code clean ups as well.
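The rounding rule described for "buffer_subbuf_size_kb" can be sketched as
below. This is only a Python model of the arithmetic, not the kernel code;
the function name and the 4K default page size are assumptions made for
illustration.

```python
def subbuf_size_for_request(requested_kb, page_kb=4):
    """Return (pages, size_kb) for the smallest power-of-2 number of pages
    whose total size can hold requested_kb kilobytes.

    page_kb defaults to 4 (the x86 page size used in the example text).
    """
    if requested_kb <= 0:
        raise ValueError("requested size must be positive")
    pages = 1
    # Double the page count until the sub-buffer is large enough.
    while pages * page_kb < requested_kb:
        pages *= 2
    return pages, pages * page_kb


if __name__ == "__main__":
    # Writing 10 selects 4 pages (16K) with 4K pages.
    print(subbuf_size_for_request(10))
```

With 4K pages a request of 10 lands on 4 pages because 2 pages (8K) is too
small and 3 pages is not a power of 2.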
* tag 'trace-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (29 commits)
ring-buffer: Remove stale comment from ring_buffer_size()
tracing histograms: Simplify parse_actions() function
tracing/selftests: Remove exec permissions from trace_marker.tc test
ring-buffer: Use subbuf_order for buffer page masking
tracing: Update subbuffer with kilobytes not page order
ringbuffer/selftest: Add basic selftest to test changing subbuf order
ring-buffer: Add documentation on the buffer_subbuf_order file
ring-buffer: Just update the subbuffers when changing their allocation order
ring-buffer: Keep the same size when updating the order
tracing: Stop the tracing while changing the ring buffer subbuf size
tracing: Update snapshot order along with main buffer order
ring-buffer: Make sure the spare sub buffer used for reads has same size
ring-buffer: Do no swap cpu buffers if order is different
ring-buffer: Clear pages on error in ring_buffer_subbuf_order_set() failure
ring-buffer: Read and write to ring buffers with custom sub buffer size
ring-buffer: Set new size of the ring buffer sub page
ring-buffer: Add interface for configuring trace sub buffer size
ring-buffer: Page size per ring buffer
ring-buffer: Have ring_buffer_print_page_header() be able to access ring_buffer_iter
ring-buffer: Check if absolute timestamp goes backwards
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SGX updates from Dave Hansen:
"This time, these are entirely confined to SGX selftests fixes.
The mini SGX enclave built by the selftests has garnered some
attention because it stands alone and does not need the sizable
infrastructure of the official SGX SDK. I think that's why folks are
suddenly interested in cleaning it up.
- Clean up selftest compilation issues, mostly from non-gcc compilers
- Avoid building selftests when not on x86"
* tag 'x86_sgx_for_6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests/sgx: Skip non X86_64 platform
selftests/sgx: Remove incomplete ABI sanitization code in test enclave
selftests/sgx: Discard unsupported ELF sections
selftests/sgx: Ensure expected location of test enclave buffer
selftests/sgx: Ensure test enclave buffer is entirely preserved
selftests/sgx: Fix linker script asserts
selftests/sgx: Handle relocations in test enclave
selftests/sgx: Produce static-pie executable for test enclave
selftests/sgx: Remove redundant enclave base address save/restore
selftests/sgx: Specify freestanding environment for enclave compilation
selftests/sgx: Separate linker options
selftests/sgx: Include memory clobber for inline asm in test enclave
selftests/sgx: Fix uninitialized pointer dereferences in encl_get_entry
selftests/sgx: Fix uninitialized pointer dereference in error path
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2024-01-18
We've added 10 non-merge commits during the last 5 day(s) which contain
a total of 12 files changed, 806 insertions(+), 51 deletions(-).
The main changes are:
1) Fix an issue in bpf_iter_udp under backward progress which prevents
user space process from finishing iteration, from Martin KaFai Lau.
2) Fix BPF verifier to reject variable offset alu on registers with a type
of PTR_TO_FLOW_KEYS to prevent oob access, from Hao Sun.
3) Follow up fixes for kernel- and libbpf-side logic around handling
arg:ctx tagged arguments of BPF global subprogs, from Andrii Nakryiko.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
libbpf: warn on unexpected __arg_ctx type when rewriting BTF
selftests/bpf: add tests confirming type logic in kernel for __arg_ctx
bpf: enforce types for __arg_ctx-tagged arguments in global subprogs
bpf: extract bpf_ctx_convert_map logic and make it more reusable
libbpf: feature-detect arg:ctx tag support in kernel
selftests/bpf: Add test for alu on PTR_TO_FLOW_KEYS
bpf: Reject variable offset alu on PTR_TO_FLOW_KEYS
selftests/bpf: Test udp and tcp iter batching
bpf: Avoid iter->offset making backward progress in bpf_iter_udp
bpf: iter_udp: Retry with a larger batch size without going back to the previous bucket
====================
Link: https://lore.kernel.org/r/20240118153936.11769-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
'qos_pfc' test checks PFC behavior. The idea is to limit the traffic
using a shaper somewhere in the flow of the packets. In this area, the
buffer is smaller than the buffer at the beginning of the flow, so it fills
up until there is no more space left. The test configures PFC there,
which is supposed to notice that the headroom is filling up and send PFC
Xoff to tell the transmitter to stop sending traffic for the priorities
sharing this PG.
The Xon/Xoff threshold is auto-configured and always equal to
2*(MTU rounded up to cell size). Even after sending the PFC Xoff packet,
traffic will keep arriving until the transmitter receives and processes
the PFC packet. This amount of traffic is known as the PFC delay allowance.
Currently the buffer for the delay traffic is configured as 100KB. The
MTU in the test is 10KB, therefore the threshold for Xoff is about 20KB.
This allows 80KB extra to be stored in this buffer.
8-lane ports use two buffers among which the configured buffer is split,
the Xoff threshold then applies to each buffer in parallel.
The test does not take into account the behavior of 8-lane ports. When the
ports are configured to 400Gbps with 8 lanes or 800Gbps with 8 lanes,
packets are dropped and the test fails.
Check if the relevant ports use 8 lanes, in such case double the size of
the buffer, as the headroom is split half-half.
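The buffer arithmetic above can be checked with a small model. This is a
sketch in Python, not part of the selftest; the function name is
hypothetical and the cell-size rounding of the MTU is ignored, so the
numbers are the approximate ones quoted in the text.

```python
def delay_headroom_kb(buffer_kb, mtu_kb, lanes):
    """Space left for PFC delay traffic in each physical buffer.

    Assumptions for illustration: the Xoff threshold is 2 * MTU (cell-size
    rounding ignored) and an 8-lane port splits the configured buffer
    evenly across two buffers, each with its own Xoff threshold.
    """
    xoff_threshold_kb = 2 * mtu_kb
    n_buffers = 2 if lanes == 8 else 1
    per_buffer_kb = buffer_kb // n_buffers
    return per_buffer_kb - xoff_threshold_kb


if __name__ == "__main__":
    print(delay_headroom_kb(100, 10, lanes=4))  # single buffer
    print(delay_headroom_kb(100, 10, lanes=8))  # split buffer, less headroom
    print(delay_headroom_kb(200, 10, lanes=8))  # doubled buffer
```

Under these assumptions an 8-lane port with the 100KB configuration has
only 30KB per buffer for delay traffic instead of 80KB, and doubling the
configured size restores the 80KB.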
Cc: Shuah Khan <shuah@kernel.org>
Fixes: bfa804784e32 ("selftests: mlxsw: Add a PFC test")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/23ff11b7dff031eb04a41c0f5254a2b636cd8ebb.1705502064.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the diagram of the topology, $swp3 and $swp4 are described as 1Gbps
ports. This is wrong information, the test does not configure such speed.
Cc: Shuah Khan <shuah@kernel.org>
Fixes: bfa804784e32 ("selftests: mlxsw: Add a PFC test")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/0087e2d416aff7e444d15f7c2958fc1d438dc27e.1705502064.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When tc filters are first added to a net device, the corresponding local
port gets bound to an ACL group in the device. The group contains a list
of ACLs. In turn, each ACL points to a different TCAM region where the
filters are stored. During forwarding, the ACLs are sequentially
evaluated until a match is found.
One reason to place filters in different regions is when they are added
with decreasing priorities and in an alternating order so that two
consecutive filters can never fit in the same region because of their
key usage.
In Spectrum-2 and newer ASICs the firmware started to report that the
maximum number of ACLs in a group is more than 16, but the layout of the
register that configures ACL groups (PAGT) was not updated to account
for that. It is therefore possible to hit stack corruption [1] in the
rare case where more than 16 ACLs in a group are required.
Fix by limiting the maximum ACL group size to the minimum between what
the firmware reports and the maximum ACLs that fit in the PAGT register.
Add a test case to make sure the machine does not crash when this
condition is hit.
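The clamp described above amounts to taking a minimum. A sketch in Python,
assuming the PAGT register holds 16 ACLs as implied by the text; the names
are hypothetical and this is not the driver code.

```python
# Assumed capacity of the PAGT register layout, taken from the "more than
# 16" wording above.
PAGT_MAX_ACLS = 16


def max_acl_group_size(fw_reported_max):
    """Largest ACL group that is safe to program into PAGT."""
    return min(fw_reported_max, PAGT_MAX_ACLS)


if __name__ == "__main__":
    # Firmware that reports more than 16 is limited to what PAGT can hold.
    print(max_acl_group_size(32))
    print(max_acl_group_size(8))
```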
[1]
Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: mlxsw_sp_acl_tcam_group_update+0x116/0x120
[...]
dump_stack_lvl+0x36/0x50
panic+0x305/0x330
__stack_chk_fail+0x15/0x20
mlxsw_sp_acl_tcam_group_update+0x116/0x120
mlxsw_sp_acl_tcam_group_region_attach+0x69/0x110
mlxsw_sp_acl_tcam_vchunk_get+0x492/0xa20
mlxsw_sp_acl_tcam_ventry_add+0x25/0xe0
mlxsw_sp_acl_rule_add+0x47/0x240
mlxsw_sp_flower_replace+0x1a9/0x1d0
tc_setup_cb_add+0xdc/0x1c0
fl_hw_replace_filter+0x146/0x1f0
fl_change+0xc17/0x1360
tc_new_tfilter+0x472/0xb90
rtnetlink_rcv_msg+0x313/0x3b0
netlink_rcv_skb+0x58/0x100
netlink_unicast+0x244/0x390
netlink_sendmsg+0x1e4/0x440
____sys_sendmsg+0x164/0x260
___sys_sendmsg+0x9a/0xe0
__sys_sendmsg+0x7a/0xc0
do_syscall_64+0x40/0xe0
entry_SYSCALL_64_after_hwframe+0x63/0x6b
Fixes: c3ab435466d5 ("mlxsw: spectrum: Extend to support Spectrum-2 ASIC")
Reported-by: Orel Hagag <orelh@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/2d91c89afba59c22587b444994ae419dbea8d876.1705502064.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Lately, a bug was found when many TC filters are added - at some point,
several bugs are printed to dmesg [1] and the switch crashes with a
segmentation fault.
The issue starts when gen_pool_free() fails because of unexpected
behavior - an attempt to free memory which is already freed. This leads to
a BUG() call which crashes the switch and triggers many other bugs.
Trying to track down the unexpected behavior led to a bug in eRP code. The
function mlxsw_sp_acl_erp_table_alloc() gets a pointer to the allocated
index, sets the value and returns an error code. When gen_pool_alloc()
fails it returns address 0, we track it and return -ENOBUFS outside, BUT
the call for gen_pool_alloc() has already overridden the index in erp_table
structure. This is a problem when such allocation is done as part of
table expansion. This is not a new table, which will not be used in case
of allocation failure. We try to expand eRP table and override the
current index (non-zero) with zero. Then, it leads to an unexpected
behavior when address 0 is freed twice. Note that address 0 is valid in
erp_table->base_index and indeed other tables use it.
gen_pool_alloc() fails in case that there is no space left in the
pre-allocated pool, in our case, the pool is limited to
ACL_MAX_ERPT_BANK_SIZE, which is read from hardware. When more than max
erp entries are required, we exceed the limit and return an error, this
error leads to "Failed to migrate vregion" print.
Fix this by changing erp_table->base_index only in case of a successful
allocation.
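The difference between the old and the fixed behavior can be modeled as
below. This is a toy Python sketch, not the driver code: the pool is a
plain list, and ErpTable, toy_pool_alloc() and the two table_alloc_*()
functions are hypothetical stand-ins for the structures named above.

```python
import errno


class ErpTable:
    """Toy stand-in for the eRP table; only base_index matters here."""

    def __init__(self, base_index):
        self.base_index = base_index


def toy_pool_alloc(free_indexes):
    """Return a free index, or None when the pool is exhausted."""
    return free_indexes.pop(0) if free_indexes else None


def table_alloc_buggy(table, free_indexes):
    idx = toy_pool_alloc(free_indexes)
    # Bug: the index is stored before the failure check, so a failed
    # allocation overwrites a valid base_index with 0.
    table.base_index = 0 if idx is None else idx
    return -errno.ENOBUFS if idx is None else 0


def table_alloc_fixed(table, free_indexes):
    idx = toy_pool_alloc(free_indexes)
    if idx is None:
        return -errno.ENOBUFS  # base_index is left untouched
    table.base_index = idx
    return 0


if __name__ == "__main__":
    old, new = ErpTable(8), ErpTable(8)
    table_alloc_buggy(old, [])   # expansion fails, index clobbered
    table_alloc_fixed(new, [])   # expansion fails, index preserved
    print(old.base_index, new.base_index)
```

In the buggy variant a failed expansion leaves the table pointing at index
0, which another table may legitimately own, so the later free of that
index happens twice.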
Add a test case for such a scenario. Without this fix it causes
segmentation fault:
$ TESTS="max_erp_entries_test" ./tc_flower.sh
./tc_flower.sh: line 988: 1560 Segmentation fault tc filter del dev $h2 ingress chain $i protocol ip pref $i handle $j flower &>/dev/null
[1]:
kernel BUG at lib/genalloc.c:508!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 6 PID: 3531 Comm: tc Not tainted 6.7.0-rc5-custom-ga6893f479f5e #1
Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 07/12/2021
RIP: 0010:gen_pool_free_owner+0xc9/0xe0
...
Call Trace:
<TASK>
__mlxsw_sp_acl_erp_table_other_dec+0x70/0xa0 [mlxsw_spectrum]
mlxsw_sp_acl_erp_mask_destroy+0xf5/0x110 [mlxsw_spectrum]
objagg_obj_root_destroy+0x18/0x80 [objagg]
objagg_obj_destroy+0x12c/0x130 [objagg]
mlxsw_sp_acl_erp_mask_put+0x37/0x50 [mlxsw_spectrum]
mlxsw_sp_acl_ctcam_region_entry_remove+0x74/0xa0 [mlxsw_spectrum]
mlxsw_sp_acl_ctcam_entry_del+0x1e/0x40 [mlxsw_spectrum]
mlxsw_sp_acl_tcam_ventry_del+0x78/0xd0 [mlxsw_spectrum]
mlxsw_sp_flower_destroy+0x4d/0x70 [mlxsw_spectrum]
mlxsw_sp_flow_block_cb+0x73/0xb0 [mlxsw_spectrum]
tc_setup_cb_destroy+0xc1/0x180
fl_hw_destroy_filter+0x94/0xc0 [cls_flower]
__fl_delete+0x1ac/0x1c0 [cls_flower]
fl_destroy+0xc2/0x150 [cls_flower]
tcf_proto_destroy+0x1a/0xa0
...
mlxsw_spectrum3 0000:07:00.0: Failed to migrate vregion
mlxsw_spectrum3 0000:07:00.0: Failed to migrate vregion
Fixes: f465261aa105 ("mlxsw: spectrum_acl: Implement common eRP core")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/4cfca254dfc0e5d283974801a24371c7b6db5989.1705502064.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As a followup to commit 03fb8565c880 ("selftests: bonding: add missing
build configs"), add more networking-specific config options which are
needed for bonding tests.
For testing, I used the minimal config generated by virtme-ng and I added
the options in the config file. All bonding tests passed.
Fixes: bbb774d921e2 ("net: Add tests for bonding and team address list management") # for ipv6
Fixes: 6cbe791c0f4e ("kselftest: bonding: add num_grat_arp test") # for tc options
Fixes: 222c94ec0ad4 ("selftests: bonding: add tests for ether type changes") # for nlmon
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Link: https://lore.kernel.org/r/20240116154926.202164-1-bpoirier@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
netdevsim tests aren't very well integrated with kselftest,
which has its advantages and disadvantages. But regardless
of the intended integration - a config file to know what kernel
to build is very useful, add one.
Fixes: fc4c93f145d7 ("selftests: add basic netdevsim devlink flash testing")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20240116154311.1945801-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The device is exported with a fuzz of 4, meaning that the `+ t` here
is removed by the fuzz algorithm, making those tests fail.
Not sure why, but when I ran this locally it was passing, but not in the
VM of the CI.
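For reference, the way a fuzz value can swallow a small offset is sketched
below. It is a Python model written after the defuzz step of the kernel
input core as I understand it; treat the exact thresholds as an assumption
to be checked against drivers/input/input.c. Values are assumed to be
non-negative so that Python's floor division matches C integer division.

```python
def defuzz(value, old_val, fuzz):
    """Model of absolute-axis defuzzing (non-negative values assumed)."""
    if fuzz:
        # Very small changes are dropped: the old value is reported again.
        if old_val - fuzz // 2 < value < old_val + fuzz // 2:
            return old_val
        # Small changes are smoothed heavily toward the old value.
        if old_val - fuzz < value < old_val + fuzz:
            return (old_val * 3 + value) // 4
        # Medium changes are averaged with the old value.
        if old_val - fuzz * 2 < value < old_val + fuzz * 2:
            return (old_val + value) // 2
    return value


if __name__ == "__main__":
    # With fuzz 4, moving from 100 to 101 is reported as 100 again.
    print(defuzz(101, 100, 4))
```

With a fuzz of 4 an increment of 1 never reaches the test, which matches
the failure described above.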
Fixes: b0fb904d074e ("HID: wacom: Add additional tests of confidence behavior")
Link: https://gitlab.freedesktop.org/bentiss/hid/-/jobs/53692957#L3315
Acked-by: Jason Gerecke <jason.gerecke@wacom.com>
Link: https://lore.kernel.org/r/20240117-b4-wip-wacom-tests-fixes-v1-1-f317784f3c36@kernel.org
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
|
|
On kernels that don't support arg:ctx tag, before adjusting global
subprog BTF information to match kernel's expected canonical type names,
make sure that types used by user are meaningful, and if not, warn and
don't do BTF adjustments.
This is similar to checks that kernel performs, but narrower in scope,
as only a small subset of BPF program types can be accommodated by
libbpf using canonical type names.
Libbpf unconditionally allows `struct pt_regs *` for perf_event program
types, unlike kernel, which supports that conditionally on architecture.
This is done to keep things simple and not cause unnecessary false
positives. This seems like a minor and harmless deviation, which in
real-world programs will be caught by kernels with arg:ctx tag support
anyways. So KISS principle.
This logic is hard to test (especially on latest kernels), so manual
testing was performed instead. Libbpf emitted the following warning for
perf_event program with wrong context argument type:
libbpf: prog 'arg_tag_ctx_perf': subprog 'subprog_ctx_tag' arg#0 is expected to be of `struct bpf_perf_event_data *` type
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240118033143.3384355-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add a bunch of global subprogs across a variety of program types to
validate expected kernel type enforcement logic for __arg_ctx arguments.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240118033143.3384355-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add feature detector of kernel-side arg:ctx (__arg_ctx) tag support. If
this is detected, libbpf will avoid doing any __arg_ctx-related BTF
rewriting and checks in favor of letting kernel handle this completely.
test_global_funcs/ctx_arg_rewrite subtest is adjusted to do the same
feature detection (albeit in a much simpler, though round-about and
inefficient, way), and skip the tests. This is done to still be able to
execute this test on older kernels (like in libbpf CI).
Note, BPF token series ([0]) does a major refactor and code moving of
libbpf-internal feature detection "framework", so to avoid unnecessary
conflicts we keep newly added feature detection stand-alone with ad-hoc
result caching. Once things settle, there will be a small follow up to
re-integrate everything back and move code into its final place in
newly-added (by BPF token series) features.c file.
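The "stand-alone with ad-hoc result caching" shape mentioned above looks
roughly like the following. This is a Python sketch of the pattern only,
not libbpf code; the probe callable and all names are hypothetical.

```python
_arg_ctx_supported = None  # ad-hoc cache: None means "not probed yet"


def kernel_supports_arg_ctx(probe):
    """Run the (possibly expensive) probe once and remember the answer."""
    global _arg_ctx_supported
    if _arg_ctx_supported is None:
        _arg_ctx_supported = bool(probe())
    return _arg_ctx_supported


def needs_btf_rewrite(probe):
    """Rewriting and checks are only done when the kernel lacks support."""
    return not kernel_supports_arg_ctx(probe)


if __name__ == "__main__":
    calls = []

    def fake_probe():
        calls.append(1)
        return True

    print(needs_btf_rewrite(fake_probe), needs_btf_rewrite(fake_probe))
    print(len(calls))  # the probe itself ran only once
```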
[0] https://patchwork.kernel.org/project/netdevbpf/list/?series=814209&state=*
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240118033143.3384355-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The 'i' constraint expects a constant operand, which fn and its
constant derivative MK_CBO(fn) are, but passing fn through a function
as a parameter and using a local variable for MK_CBO(fn) allow the
compiler to lose sight of that when no optimization is done. Use
a macro instead of a function and skip the local variable to ensure
the compiler uses constants, matching the asm constraints.
Reported-by: Yunhui Cui <cuiyunhui@bytedance.com>
Closes: https://lore.kernel.org/all/20240117082514.42967-1-cuiyunhui@bytedance.com
Fixes: a29e2a48afe3 ("RISC-V: selftests: Add CBO tests")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240117130933.57514-2-ajones@ventanamicro.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc and other driver updates from Greg KH:
"Here is the big set of char/misc and other driver subsystem changes
for 6.8-rc1.
Other than lots of binder driver changes (as you can see by the merge
conflicts) included in here are:
- lots of iio driver updates and additions
- spmi driver updates
- eeprom driver updates
- firmware driver updates
- ocxl driver updates
- mhi driver updates
- w1 driver updates
- nvmem driver updates
- coresight driver updates
- platform driver remove callback api changes
- tags.sh script updates
- bus_type constant marking cleanups
- lots of other small driver updates
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (341 commits)
android: removed duplicate linux/errno
uio: Fix use-after-free in uio_open
drivers: soc: xilinx: add check for platform
firmware: xilinx: Export function to use in other module
scripts/tags.sh: remove find_sources
scripts/tags.sh: use -n to test archinclude
scripts/tags.sh: add local annotation
scripts/tags.sh: use more portable -path instead of -wholename
scripts/tags.sh: Update comment (addition of gtags)
firmware: zynqmp: Convert to platform remove callback returning void
firmware: turris-mox-rwtm: Convert to platform remove callback returning void
firmware: stratix10-svc: Convert to platform remove callback returning void
firmware: stratix10-rsu: Convert to platform remove callback returning void
firmware: raspberrypi: Convert to platform remove callback returning void
firmware: qemu_fw_cfg: Convert to platform remove callback returning void
firmware: mtk-adsp-ipc: Convert to platform remove callback returning void
firmware: imx-dsp: Convert to platform remove callback returning void
firmware: coreboot_table: Convert to platform remove callback returning void
firmware: arm_scpi: Convert to platform remove callback returning void
firmware: arm_scmi: Convert to platform remove callback returning void
...
|
|
Pull kvm updates from Paolo Bonzini:
"Generic:
- Use memdup_array_user() to harden against overflow.
- Unconditionally advertise KVM_CAP_DEVICE_CTRL for all
architectures.
- Clean up Kconfigs that all KVM architectures were selecting
- New functionality around "guest_memfd", a new userspace API that
creates an anonymous file and returns a file descriptor that refers
to it. guest_memfd files are bound to their owning virtual machine,
cannot be mapped, read, or written by userspace, and cannot be
resized. guest_memfd files do however support PUNCH_HOLE, which can
be used to switch a memory area between guest_memfd and regular
anonymous memory.
- New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify
per-page attributes for a given page of guest memory; right now the
only attribute is whether the guest expects to access memory via
guest_memfd or not, which in Confidential SVMs backed by SEV-SNP,
TDX or ARM64 pKVM is checked by firmware or hypervisor that
guarantees confidentiality (AMD PSP, Intel TDX module, or EL2 in
the case of pKVM).
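
As a rough userspace analogue of the overflow hardening mentioned in the first item above, the sketch below checks the element-count multiplication before copying an array. dup_array_checked is a hypothetical name; the kernel helper copies from user memory and reports errors differently, so this only illustrates the overflow check itself:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace analogue: duplicate an array of n elements of
 * the given size, refusing requests whose byte count would overflow.
 */
static void *dup_array_checked(const void *src, size_t n, size_t size)
{
	size_t bytes;
	void *dst;

	/* n * size would wrap around size_t: reject instead of truncating. */
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;

	bytes = n * size;
	/* Request at least one byte so a zero-length copy still allocates. */
	dst = malloc(bytes ? bytes : 1);
	if (dst)
		memcpy(dst, src, bytes);
	return dst;
}
```

Without the check, a huge n could wrap the product to a small value, so the allocation would be far smaller than the caller believes.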
x86:
- Support for "software-protected VMs" that can use the new
guest_memfd and page attributes infrastructure. This is mostly
useful for testing, since there is no pKVM-like infrastructure to
provide a meaningfully reduced TCB.
- Fix a relatively benign off-by-one error when splitting huge pages
during CLEAR_DIRTY_LOG.
- Fix a bug where KVM could incorrectly test-and-clear dirty bits in
non-leaf TDP MMU SPTEs if a racing thread replaces a huge SPTE with
a non-huge SPTE.
- Use more generic lockdep assertions in paths that don't actually
care about whether the caller is a reader or a writer.
- Let Xen guests opt out of having PV clock reported as "based on a
stable TSC", because some of them don't expect the "TSC stable" bit
(added to the pvclock ABI by KVM, but never set by Xen) to be set.
- Revert a bogus, made-up nested SVM consistency check for
TLB_CONTROL.
- Advertise flush-by-ASID support for nSVM unconditionally, as KVM
always flushes on nested transitions, i.e. always satisfies flush
requests. This allows running bleeding edge versions of VMware
Workstation on top of KVM.
- Sanity check that the CPU supports flush-by-ASID when enabling SEV
support.
- On AMD machines with vNMI, always rely on hardware instead of
intercepting IRET in some cases to detect unmasking of NMIs
- Support for virtualizing Linear Address Masking (LAM)
- Fix a variety of vPMU bugs where KVM fails to stop/reset counters
and other state prior to refreshing the vPMU model.
- Fix a double-overflow PMU bug by tracking emulated counter events
using a dedicated field instead of snapshotting the "previous"
counter. If the hardware PMC count triggers overflow that is
recognized in the same VM-Exit that KVM manually bumps an event
count, KVM would pend PMIs for both the hardware-triggered overflow
and for KVM-triggered overflow.
- Turn off KVM_WERROR by default for all configs so that it's not
inadvertently enabled by non-KVM developers, which can be
problematic for subsystems that require no regressions for W=1
builds.
- Advertise all of the host-supported CPUID bits that enumerate
IA32_SPEC_CTRL "features".
- Don't force a masterclock update when a vCPU synchronizes to the
current TSC generation, as updating the masterclock can cause
kvmclock's time to "jump" unexpectedly, e.g. when userspace
hotplugs a pre-created vCPU.
- Use RIP-relative address to read kvm_rebooting in the VM-Enter
fault paths, partly as a super minor optimization, but mostly to
make KVM play nice with position independent executable builds.
- Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on
CONFIG_HYPERV as a minor optimization, and to self-document the
code.
- Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV
"emulation" at build time.
ARM64:
- LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB base
granule sizes. Branch shared with the arm64 tree.
- Large Fine-Grained Trap rework, bringing some sanity to the
feature, although there is more to come. This comes with a prefix
branch shared with the arm64 tree.
- Some additional Nested Virtualization groundwork, mostly
introducing the NV2 VNCR support and retargeting the NV support to
that version of the architecture.
- A small set of vgic fixes and associated cleanups.
Loongarch:
- Optimization for memslot hugepage checking
- Cleanup and fix some HW/SW timer issues
- Add LSX/LASX (128bit/256bit SIMD) support
RISC-V:
- KVM_GET_REG_LIST improvement for vector registers
- Generate ISA extension reg_list using macros in get-reg-list
selftest
- Support for reporting steal time along with selftest
s390:
- Bugfixes
Selftests:
- Fix an annoying goof where the NX hugepage test prints out garbage
instead of the magic token needed to run the test.
- Fix build errors when a header is deleted/moved due to a missing
flag in the Makefile.
- Detect if KVM bugged/killed a selftest's VM and print out a helpful
message instead of complaining that a random ioctl() failed.
- Annotate the guest printf/assert helpers with __printf(), and fix
the various bugs that were lurking due to lack of said annotation"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (185 commits)
x86/kvm: Do not try to disable kvmclock if it was not enabled
KVM: x86: add missing "depends on KVM"
KVM: fix direction of dependency on MMU notifiers
KVM: introduce CONFIG_KVM_COMMON
KVM: arm64: Add missing memory barriers when switching to pKVM's hyp pgd
KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cache
RISC-V: KVM: selftests: Add get-reg-list test for STA registers
RISC-V: KVM: selftests: Add steal_time test support
RISC-V: KVM: selftests: Add guest_sbi_probe_extension
RISC-V: KVM: selftests: Move sbi_ecall to processor.c
RISC-V: KVM: Implement SBI STA extension
RISC-V: KVM: Add support for SBI STA registers
RISC-V: KVM: Add support for SBI extension registers
RISC-V: KVM: Add SBI STA info to vcpu_arch
RISC-V: KVM: Add steal-update vcpu request
RISC-V: KVM: Add SBI STA extension skeleton
RISC-V: paravirt: Implement steal-time support
RISC-V: Add SBI STA extension definitions
RISC-V: paravirt: Add skeleton for pv-time support
RISC-V: KVM: Fix indentation in kvm_riscv_vcpu_set_reg_csr()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for many new extensions in hwprobe, along with a handful of
cleanups
- Various cleanups to our page table handling code, so we always use
{READ,WRITE}_ONCE
- Support for the which-cpus flavor of hwprobe
- Support for XIP kernels has been resurrected
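
To illustrate the {READ,WRITE}_ONCE item above, here is a simplified userspace sketch of single-access helpers applied to a page table entry. The macro definitions, the pte_sketch_t type, and treating bit 0 as the present bit are all assumptions for this example, not the kernel's actual code:

```c
#include <stdint.h>

/* Simplified stand-ins: force exactly one volatile access to x. */
#define READ_ONCE_SKETCH(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical page table entry type for this sketch. */
typedef struct { uint64_t pte; } pte_sketch_t;

static int pte_present_sketch(pte_sketch_t *ptep)
{
	/* Load the entry once; every later check uses this snapshot. */
	uint64_t val = READ_ONCE_SKETCH(ptep->pte);

	return (val & 1u) != 0;
}

static void set_pte_sketch(pte_sketch_t *ptep, uint64_t val)
{
	/* Store the whole entry in one access. */
	WRITE_ONCE_SKETCH(ptep->pte, val);
}
```

The point of the snapshot is that an entry changed concurrently by another CPU cannot be re-read with a different value halfway through a check.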
* tag 'riscv-for-linus-6.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (52 commits)
riscv: hwprobe: export Zicond extension
riscv: hwprobe: export Zacas ISA extension
riscv: add ISA extension parsing for Zacas
dt-bindings: riscv: add Zacas ISA extension description
riscv: hwprobe: export Ztso ISA extension
riscv: add ISA extension parsing for Ztso
use linux/export.h rather than asm-generic/export.h
riscv: Remove SHADOW_OVERFLOW_STACK_SIZE macro
riscv; fix __user annotation in save_v_state()
riscv: fix __user annotation in traps_misaligned.c
riscv: Select ARCH_WANTS_NO_INSTR
riscv: Remove obsolete rv32_defconfig file
riscv: Allow disabling of BUILTIN_DTB for XIP
riscv: Fixed wrong register in XIP_FIXUP_FLASH_OFFSET macro
riscv: Make XIP bootable again
riscv: Fix set_direct_map_default_noflush() to reset _PAGE_EXEC
riscv: Fix module_alloc() that did not reset the linear mapping permissions
riscv: Fix wrong usage of lm_alias() when splitting a huge linear mapping
riscv: Check if the code to patch lies in the exit section
riscv: Use the same CPU operations for all CPUs
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
"For once not mostly MM-related.
17 hotfixes. 10 address post-6.7 issues and the other 7 are cc:stable"
* tag 'mm-hotfixes-stable-2024-01-12-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
userfaultfd: avoid huge_zero_page in UFFDIO_MOVE
MAINTAINERS: add entry for shrinker
selftests: mm: hugepage-vmemmap fails on 64K page size systems
mm/memory_hotplug: fix memmap_on_memory sysfs value retrieval
mailmap: switch email for Tanzir Hasan
mailmap: add old address mappings for Randy
kernel/crash_core.c: make __crash_hotplug_lock static
efi: disable mirror feature during crashkernel
kexec: do syscore_shutdown() in kernel_kexec
mailmap: update entry for Manivannan Sadhasivam
fs/proc/task_mmu: move mmu notification mechanism inside mm lock
mm: zswap: switch maintainers to recently active developers and reviewers
scripts/decode_stacktrace.sh: optionally use LLVM utilities
kasan: avoid resetting aux_lock
lib/Kconfig.debug: disable CONFIG_DEBUG_INFO_BTF for Hexagon
MAINTAINERS: update LTP maintainers
kdump: defer the insertion of crashkernel resources
|
|
This is a follow-up of commit a159cbe81d3b ("selftests: rtnetlink: check
enslaving iface in a bond") after the merge of net-next into net.
The goal is to follow the new convention,
see commit d3b6b1116127 ("selftests/net: convert rtnetlink.sh to run it in
unique namespace") for more details.
Let's also use the generic dummy name instead of defining a new one.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240115135922.3662648-1-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|