Age | Commit message (Collapse) | Author |
|
If sysfs isn't mounted then we may fail to read a PMU's type. In this
situation resort to lookup of wellknown types. Only applies to
software, tracepoint and breakpoint PMUs.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
scnprintf is declared in linux/kernel.h, directly depend upon it.
Add missing SPDX comments.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add missing breakpoint and raw types. Avoid a switch, just use a
lookup array. Switch the type to unsigned to avoid checking negative
values.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Long names like ucsi_source_psy_USBC000:001 when prefixed with hwmon_
exceed the buffer size and the last digit is lost. This causes
confusion with similar names like ucsi_source_psy_USBC000:002. Extend
the buffer size to avoid this.
Fixes: 53cc0b351ec9 ("perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
It uses evsel->priv to save per-cpu timing information. It should be
freed when the evsel is released.
Add the priv destructor for evsel same as thread to handle that.
Fixes: 49394a2a24c78ce0 ("perf sched timehist: Introduce timehist command")
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250703014942.1369397-6-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The pipe mode header dumping was accidentally removed when tracing of
header feature events in pipe mode was added.
Minor spelling tweak to header test failure message.
Fixes: 61051f9a8452 ("perf header: In pipe mode dump features without --header/-I")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250703042000.2740640-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
With the dso_data embedded in a dso there is a reference counted
pointer to the dso rather than using container_of with reference count
checking. This data can hold the dso live meaning that no dso__put
ever deletes it. Add a check for this case and close the dso_data when
it happens. There isn't an infinite loop as the dso_data clears the
file descriptor prior to putting on the dso.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250624190326.2038704-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Hold the path to the hwmon_pmu rather than the file descriptor. The
file descriptor is somewhat problematic in that it reflects the
directory state when opened, something that may vary in testing. Using
a path simplifies testing and to some extent cleanup as the hwmon_pmu
is owned by the pmus list and intentionally global and leaked when
perf terminates, the file descriptor being left open looks like a
leak.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250624190326.2038704-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The kcore loading creates a set of list nodes that have reference
counted references to maps of the kcore. The list node freeing in the
success path wasn't releasing the maps, add the missing puts. It is
unclear why this leak was being missed by leak sanitizer.
Fixes: 83720209961f ("perf map: Move map list node into symbol")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250624190326.2038704-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
With NO_LIBDW=1 a new unused-parameter warning/error has appeared:
```
util/genelf.c: In function ‘jit_write_elf’:
util/genelf.c:163:32: error: unused parameter ‘load_addr’ [-Werror=unused-parameter]
163 | jit_write_elf(int fd, uint64_t load_addr, const char *sym,
```
Fixes: e3f612c1d8f3 ("perf genelf: Remove libcrypto dependency and use built-in sha1()")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250702175402.761818-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
When someone has a global shellcheckrc file, for example at
~/.config/shellcheckrc, with the directive 'shell=sh', building perf
will fail with many shellcheck errors like:
In tests/shell/base_probe/test_adding_kernel.sh line 294:
(( TEST_RESULT += $? ))
^---------------------^ SC3006 (warning): In POSIX sh, standalone ((..)) is undefined.
For more information:
https://www.shellcheck.net/wiki/SC3006 -- In POSIX sh, standalone ((..)) is...
make[5]: *** [tests/Build:91: tests/shell/base_probe/test_adding_kernel.sh.shellcheck_log] Error 1
Passing the '-s bash' option ensures that it runs correctly regardless
of a developers global configuration.
This patch adds '-s bash' and other options to the SHELLCHECK variable
in Makefile.perf and makes use of the variable consistently.
Signed-off-by: Collin Funk <collin.funk1@gmail.com>
Link: https://lore.kernel.org/r/63491dbc8439edf2e949d80e264b9d22332fea61.1751082075.git.collin.funk1@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
There are spelling mistakes in some literal strings. Fix these.
Fixes: 28917cb17f9d ("perf drm_pmu: Add a tool like PMU to expose DRM information")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250630125128.562895-1-colin.i.king@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Follow up:
lore.kernel.org/CAP-5=fVDF4-qYL1Lm7efgiHk7X=_nw_nEFMBZFMcsnOOJgX4Kg@mail.gmail.com/
The patch adds unit aggregation during evsel merge the aggregated uncore
counters. Change the name of the column to `ctrs` and `counters` for
json mode.
Tested on a 2-socket machine with SNC3, uncore_imc_[0-11] and
cpumask="0,120"
Before:
perf stat -e clockticks -I 1000 --per-socket
# time socket cpus counts unit events
1.001085024 S0 1 9615386315 clockticks
1.001085024 S1 1 9614287448 clockticks
perf stat -e clockticks -I 1000 --per-node
# time node cpus counts unit events
1.001029867 N0 1 3205726984 clockticks
1.001029867 N1 1 3205444421 clockticks
1.001029867 N2 1 3205234018 clockticks
1.001029867 N3 1 3205224660 clockticks
1.001029867 N4 1 3205207213 clockticks
1.001029867 N5 1 3205528246 clockticks
After:
perf stat -e clockticks -I 1000 --per-socket
# time socket ctrs counts unit events
1.001026071 S0 12 9619677996 clockticks
1.001026071 S1 12 9618612614 clockticks
perf stat -e clockticks -I 1000 --per-node
# time node ctrs counts unit events
1.001027449 N0 4 3207251859 clockticks
1.001027449 N1 4 3207315930 clockticks
1.001027449 N2 4 3206981828 clockticks
1.001027449 N3 4 3206566126 clockticks
1.001027449 N4 4 3206032609 clockticks
1.001027449 N5 4 3205651355 clockticks
Tested with JSON output linter:
perf test "perf stat JSON output linter"
94: perf stat JSON output linter : Ok
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Link: https://lore.kernel.org/r/20250627201818.479421-1-ctshao@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Recently it added -fno-strict-aliasing to sync with the kernel behavior.
But it caused an error due to potential uninitialized access like below:
In file included from util/symbol.c:27:
In function ‘dso__set_symbol_names_len’,
inlined from ‘dso__sort_by_name’ at util/symbol.c:638:4:
util/dso.h:654:46: error: ‘len’ may be used uninitialized [-Werror=maybe-uninitialized]
654 | RC_CHK_ACCESS(dso)->symbol_names_len = len;
| ^
util/symbol.c: In function ‘dso__sort_by_name’:
util/symbol.c:634:24: note: ‘len’ was declared here
634 | size_t len;
| ^~~
Let's just initialize it with 0.
Fixes: 55a18d2f3ff79c90 ("perf build: enable -fno-strict-aliasing")
Closes: https://lore.kernel.org/r/aF7JC8zkG5-_-nY_@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Recently it uses llvm and capstone to speed up annotation or disassembly
of instructions. But they don't support source code view yet. Until it
fixed, we can force to use objdump for source code annotation.
To prevent performance loss, it's disabled by default and turned it on
when user requests it in TUI by pressing 's' key.
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250625230339.702610-1-namhyung@kernel.org
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
genelf is the only file in perf that depends on libcrypto (or openssl)
which only calculates a Build ID (SHA1, MD5, or URANDOM). SHA1 was
expected to be the default option, but MD5 was used by default due to
previous issues when linking against Java. This commit switches genelf
to use the in-house sha1(), and also removes MD5 and URANDOM options
since we have a reliable SHA1 implementation to rely on. It passes the
tools/perf/tests/shell/test_java_symbol.sh test.
Signed-off-by: Yuzhuo Jing <yuzhuo@google.com>
Co-developed-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250625202311.23244-4-ebiggers@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
SHA-1 can be written in fewer than 100 lines of code. Just add a basic
SHA-1 implementation so that there's no need to use an external library
or try to pull in the kernel's SHA-1 implementation. The kernel's SHA-1
implementation is not really intended to be pulled into userspace
programs in the way that it was proposed to do so for perf
(https://lore.kernel.org/r/20250521225307.743726-3-yuzhuo@google.com/),
and it's also likely to undergo some refactoring in the future. There's
no need to tie userspace tools to it.
Include a test for sha1() in the util test suite.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250625202311.23244-3-ebiggers@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Calling perf top with branch filters enabled on Intel CPU's
with branch counters logging (A.K.A LBR event logging [1]) support
results in a segfault.
$ perf top -e '{cpu_core/cpu-cycles/,cpu_core/event=0xc6,umask=0x3,frontend=0x11,name=frontend_retired_dsb_miss/}' -j any,counter
...
Thread 27 "perf" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffafff76c0 (LWP 949003)]
perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653
653 *width = env->cpu_pmu_caps ? env->br_cntr_width :
(gdb) bt
#0 perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653
#1 0x00000000005b1599 in symbol__account_br_cntr (branch=0x7fffcc3db580, evsel=0xfea2d0, offset=12, br_cntr=8) at util/annotate.c:345
#2 0x00000000005b17fb in symbol__account_cycles (addr=5658172, start=5658160, sym=0x7fffcc0ee420, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:389
#3 0x00000000005b1976 in addr_map_symbol__account_cycles (ams=0x7fffcd7b01d0, start=0x7fffcd7b02b0, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:422
#4 0x000000000068d57f in hist__account_cycles (bs=0x110d288, al=0x7fffafff6540, sample=0x7fffafff6760, nonany_branch_mode=false, total_cycles=0x0, evsel=0xfea2d0) at util/hist.c:2850
#5 0x0000000000446216 in hist_iter__top_callback (iter=0x7fffafff6590, al=0x7fffafff6540, single=true, arg=0x7fffffff9e00) at builtin-top.c:737
#6 0x0000000000689787 in hist_entry_iter__add (iter=0x7fffafff6590, al=0x7fffafff6540, max_stack_depth=127, arg=0x7fffffff9e00) at util/hist.c:1359
#7 0x0000000000446710 in perf_event__process_sample (tool=0x7fffffff9e00, event=0x110d250, evsel=0xfea2d0, sample=0x7fffafff6760, machine=0x108c968) at builtin-top.c:845
#8 0x0000000000447735 in deliver_event (qe=0x7fffffffa120, qevent=0x10fc200) at builtin-top.c:1211
#9 0x000000000064ccae in do_flush (oe=0x7fffffffa120, show_progress=false) at util/ordered-events.c:245
#10 0x000000000064d005 in __ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP, timestamp=0) at util/ordered-events.c:324
#11 0x000000000064d0ef in ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP) at util/ordered-events.c:342
#12 0x00000000004472a9 in process_thread (arg=0x7fffffff9e00) at builtin-top.c:1120
#13 0x00007ffff6e7dba8 in start_thread (arg=<optimized out>) at pthread_create.c:448
#14 0x00007ffff6f01b8c in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
The cause is that perf_env__find_br_cntr_info tries to access a
null pointer pmu_caps in the perf_env struct. A similar issue exists
for homogeneous core systems which use the cpu_pmu_caps structure.
Fix this by populating cpu_pmu_caps and pmu_caps structures with
values from sysfs when calling perf top with branch stack sampling
enabled.
[1], LBR event logging introduced here:
https://lore.kernel.org/all/20231025201626.3000228-5-kan.liang@linux.intel.com/
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250612163659.1357950-2-thomas.falcon@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
perf_pmus__find_core_pmu() is implemented in util/pmus.c but its
prototpye is in util/pmu.h. Move it to util/pmus.h.
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250612163659.1357950-1-thomas.falcon@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
And make builtin-trace.c less conditional. Dummy functions will be
called when BUILD_BPF_SKEL=0 is used. This makes the builtin-trace.c
slightly smaller and simpler by removing the skeleton and its helpers.
The conditional guard of trace__init_syscalls_bpf_prog_array_maps() is
changed from the HAVE_BPF_SKEL to HAVE_LIBBPF_SUPPORT as it doesn't
have a skeleton in the code directly. And a dummy function is added so
that it can be called unconditionally. The function will succeed only
if the both conditions are true.
Do not include trace_augment.h from the BPF code and move the definition
of TRACE_AUG_MAX_BUF to the BPF directly.
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250623225721.21553-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
If there are no values in bpf_prog_info or bpf_btf feature don't write
the data into the header.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The perf.data file may contain a bpf_prog_info or bpf_btf feature. If
the contents of these are empty then nothing is displayed. Rather than
display nothing and not account for the file space, display an empty
message.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
In pipe mode attr events capture the perf_event_attr. Allow their
dumping as they normally start the file.
Before:
```
$ perf record -o - -a sleep 1 | perf script -D -i -
. ... raw event: size 272 bytes
. 0000: 40 00 00 00 00 00 10 01 00 00 00 00 88 00 00 00 @...............
. 0010: 00 00 00 00 00 00 00 00 a0 0f 00 00 00 00 00 00 ................
. 0020: 87 01 01 00 00 00 00 00 14 00 00 00 00 00 00 00 ................
. 0030: 01 84 05 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0090: 91 08 00 00 00 00 00 00 92 08 00 00 00 00 00 00 ................
. 00a0: 93 08 00 00 00 00 00 00 94 08 00 00 00 00 00 00 ................
. 00b0: 95 08 00 00 00 00 00 00 96 08 00 00 00 00 00 00 ................
. 00c0: 97 08 00 00 00 00 00 00 98 08 00 00 00 00 00 00 ................
. 00d0: 99 08 00 00 00 00 00 00 9a 08 00 00 00 00 00 00 ................
. 00e0: 9b 08 00 00 00 00 00 00 9c 08 00 00 00 00 00 00 ................
. 00f0: 9d 08 00 00 00 00 00 00 9e 08 00 00 00 00 00 00 ................
. 0100: 9f 08 00 00 00 00 00 00 a0 08 00 00 00 00 00 00 ................
-1 -1 0 [0x110]: PERF_RECORD_ATTR
0x110@pipe [0x110]: event: 64
...
```
After:
```
$ perf record -o - -a sleep 1 | perf script -D -i -
0@pipe [0x110]: event: 64
.
. ... raw event: size 272 bytes
. 0000: 40 00 00 00 00 00 10 01 00 00 00 00 88 00 00 00 @...............
. 0010: 00 00 00 00 00 00 00 00 a0 0f 00 00 00 00 00 00 ................
. 0020: 87 01 01 00 00 00 00 00 14 00 00 00 00 00 00 00 ................
. 0030: 01 84 05 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0090: 5c 08 00 00 00 00 00 00 5d 08 00 00 00 00 00 00 \.......].......
. 00a0: 5e 08 00 00 00 00 00 00 5f 08 00 00 00 00 00 00 ^......._.......
. 00b0: 60 08 00 00 00 00 00 00 61 08 00 00 00 00 00 00 `.......a.......
. 00c0: 62 08 00 00 00 00 00 00 63 08 00 00 00 00 00 00 b.......c.......
. 00d0: 64 08 00 00 00 00 00 00 65 08 00 00 00 00 00 00 d.......e.......
. 00e0: 66 08 00 00 00 00 00 00 67 08 00 00 00 00 00 00 f.......g.......
. 00f0: 68 08 00 00 00 00 00 00 69 08 00 00 00 00 00 00 h.......i.......
. 0100: 6a 08 00 00 00 00 00 00 6b 08 00 00 00 00 00 00 j.......k.......
-1 -1 0 [0x110]: PERF_RECORD_ATTR, type = 0 (PERF_TYPE_HARDWARE), size = 136, config = 0 (PERF_COUNT_HW_CPU_CYCLES), { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format = ID|LOST, disabled = 1, freq = 1, precise_ip = 3, sample_id_all = 1
0x110@pipe [0x110]: event: 64
...
```
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
In pipe mode the header features are contained within events. While
other events dump details the header features only dump if --header or
-I are passed, which doesn't make sense as in pipe mode there is no
perf file header. Make the printing of the information conditional on
dump_trace as with other events.
Before:
```
$ perf record -o - -a sleep 1 | perf script -D -i -
...
0x2c8@pipe [0x54]: event: 80
.
. ... raw event: size 84 bytes
. 0000: 50 00 00 00 00 00 54 00 05 00 00 00 00 00 00 00 P.....T.........
. 0010: 40 00 00 00 36 2e 31 35 2e 72 63 37 2e 67 61 64 @...6.15.rc7.gad
. 0020: 32 61 36 39 31 63 39 39 66 62 00 00 00 00 00 00 2a691c99fb......
. 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0050: 00 00 00 00 ....
0 0 0x2c8 [0x54]: PERF_RECORD_FEATURE
```
After:
```
$ perf record -o - -a sleep 1 | perf script -D -i -
...
0x2c8@pipe [0x54]: event: 80
.
. ... raw event: size 84 bytes
. 0000: 50 00 00 00 00 00 54 00 05 00 00 00 00 00 00 00 P.....T.........
. 0010: 40 00 00 00 36 2e 31 35 2e 72 63 37 2e 67 61 64 @...6.15.rc7.gad
. 0020: 32 61 36 39 31 63 39 39 66 62 00 00 00 00 00 00 2a691c99fb......
. 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0050: 00 00 00 00 ....
0 0 0x2c8 [0x54]: PERF_RECORD_FEATURE, # perf version : 6.15.rc7.gad2a691c99fb
```
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
DRM clients expose information through usage stats as documented in
Documentation/gpu/drm-usage-stats.rst (available online at
https://docs.kernel.org/gpu/drm-usage-stats.html). Add a tool like
PMU, similar to the hwmon PMU, that exposes DRM information. For
example on a tigerlake laptop:
```
$ perf list drm
List of pre-defined events (to be used in -e or -M):
drm:
drm-active-stolen-system0
[Total memory active in one or more engines. Unit: drm_i915]
drm-active-system0
[Total memory active in one or more engines. Unit: drm_i915]
drm-engine-capacity-video
[Engine capacity. Unit: drm_i915]
drm-engine-copy
[Utilization in ns. Unit: drm_i915]
drm-engine-render
[Utilization in ns. Unit: drm_i915]
drm-engine-video
[Utilization in ns. Unit: drm_i915]
drm-engine-video-enhance
[Utilization in ns. Unit: drm_i915]
drm-purgeable-stolen-system0
[Size of resident and purgeable memory bufers. Unit: drm_i915]
drm-purgeable-system0
[Size of resident and purgeable memory bufers. Unit: drm_i915]
drm-resident-stolen-system0
[Size of resident memory bufers. Unit: drm_i915]
drm-resident-system0
[Size of resident memory bufers. Unit: drm_i915]
drm-shared-stolen-system0
[Size of shared memory bufers. Unit: drm_i915]
drm-shared-system0
[Size of shared memory bufers. Unit: drm_i915]
drm-total-stolen-system0
[Size of shared and private memory. Unit: drm_i915]
drm-total-system0
[Size of shared and private memory. Unit: drm_i915]
```
System wide data can be gathered:
```
$ perf stat -x, -I 1000 -e drm-active-stolen-system0,drm-active-system0,drm-engine-capacity-video,drm-engine-copy,drm-engine-render,drm-engine-video,drm-engine-video-enhance,drm-purgeable-stolen-system0,drm-purgeable-system0,drm-resident-stolen-system0,drm-resident-system0,drm-shared-stolen-system0,drm-shared-system0,drm-total-stolen-system0,drm-total-system0
1.000904910,0,bytes,drm-active-stolen-system0,1,100.00,,
1.000904910,0,bytes,drm-active-system0,1,100.00,,
1.000904910,36,capacity,drm-engine-capacity-video,1,100.00,,
1.000904910,0,ns,drm-engine-copy,1,100.00,,
1.000904910,1472970566175,ns,drm-engine-render,1,100.00,,
1.000904910,0,ns,drm-engine-video,1,100.00,,
1.000904910,0,ns,drm-engine-video-enhance,1,100.00,,
1.000904910,0,bytes,drm-purgeable-stolen-system0,1,100.00,,
1.000904910,38199296,bytes,drm-purgeable-system0,1,100.00,,
1.000904910,0,bytes,drm-resident-stolen-system0,1,100.00,,
1.000904910,4643196928,bytes,drm-resident-system0,1,100.00,,
1.000904910,0,bytes,drm-shared-stolen-system0,1,100.00,,
1.000904910,1886871552,bytes,drm-shared-system0,1,100.00,,
1.000904910,0,bytes,drm-total-stolen-system0,1,100.00,,
1.000904910,4643196928,bytes,drm-total-system0,1,100.00,,
2.264426839,0,bytes,drm-active-stolen-system0,1,100.00,,
```
Or for a particular process:
```
$ perf stat -x, -I 1000 -e drm-active-stolen-system0,drm-active-system0,drm-engine-capacity-video,drm-engine-copy,drm-engine-render,drm-engine-video,drm-engine-video-enhance,drm-purgeable-stolen-system0,drm-purgeable-system0,drm-resident-stolen-system0,drm-resident-system0,drm-shared-stolen-system0,drm-shared-system0,drm-total-stolen-system0,drm-total-system0 -p 200027
1.001040274,0,bytes,drm-active-stolen-system0,6,100.00,,
1.001040274,0,bytes,drm-active-system0,6,100.00,,
1.001040274,12,capacity,drm-engine-capacity-video,6,100.00,,
1.001040274,0,ns,drm-engine-copy,6,100.00,,
1.001040274,1542300,ns,drm-engine-render,6,100.00,,
1.001040274,0,ns,drm-engine-video,6,100.00,,
1.001040274,0,ns,drm-engine-video-enhance,6,100.00,,
1.001040274,0,bytes,drm-purgeable-stolen-system0,6,100.00,,
1.001040274,13516800,bytes,drm-purgeable-system0,6,100.00,,
1.001040274,0,bytes,drm-resident-stolen-system0,6,100.00,,
1.001040274,27746304,bytes,drm-resident-system0,6,100.00,,
1.001040274,0,bytes,drm-shared-stolen-system0,6,100.00,,
1.001040274,0,bytes,drm-shared-system0,6,100.00,,
1.001040274,0,bytes,drm-total-stolen-system0,6,100.00,,
1.001040274,27746304,bytes,drm-total-system0,6,100.00,,
2.016629075,0,bytes,drm-active-stolen-system0,6,100.00,,
```
As with the hwmon PMU, high numbered PMU types are used to encode
multiple possible "DRM" PMUs. The appropriate fdinfo is found by
scanning /proc and filtering which fdinfos to read with stat. To avoid
some unneeding scanning, events not starting with "drm-" are
ignored. The patch builds on commit 57e13264dcea ("perf pmus:
Restructure pmu_read_sysfs to scan fewer PMUs") and later so that only
if full wild carding is being done, the PMU starts with "drm_" or the
event starts with "drm-" will /proc be scanned. That is there should
be little to no cost in this PMU unless DRM events are requested.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250624231837.179536-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add perf_pmus__scan_for_event that only reads sysfs for pmus that
could contain a given event.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250624231837.179536-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Symbolize stack traces by creating a live machine. Add this
functionality to dump_stack and switch dump_stack users to use
it. Switch TUI to use it. Add stack traces to the child test function
which can be useful to diagnose blocked code.
Example output:
```
$ perf test -vv PERF_RECORD_
...
7: PERF_RECORD_* events & perf_sample fields:
7: PERF_RECORD_* events & perf_sample fields : Running (1 active)
^C
Signal (2) while running tests.
Terminating tests with the same signal
Internal test harness failure. Completing any started tests:
: 7: PERF_RECORD_* events & perf_sample fields:
---- unexpected signal (2) ----
#0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0
#1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
#2 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64
#3 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72
#4 0x7fc12fef1393 in __nanosleep nanosleep.c:26
#5 0x7fc12ff02d68 in __sleep sleep.c:55
#6 0x55788c63196b in test__PERF_RECORD perf-record.c:0
#7 0x55788c620fb0 in run_test_child builtin-test.c:0
#8 0x55788c5bd18d in start_command run-command.c:127
#9 0x55788c621ef3 in __cmd_test builtin-test.c:0
#10 0x55788c6225bf in cmd_test ??:0
#11 0x55788c5afbd0 in run_builtin perf.c:0
#12 0x55788c5afeeb in handle_internal_command perf.c:0
#13 0x55788c52b383 in main ??:0
#14 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74
#15 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
#16 0x55788c52b9d1 in _start ??:0
---- unexpected signal (2) ----
#0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0
#1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
#2 0x7fc12fea3a14 in pthread_sigmask@GLIBC_2.2.5 pthread_sigmask.c:45
#3 0x7fc12fe49fd9 in __GI___sigprocmask sigprocmask.c:26
#4 0x7fc12ff2601b in __longjmp_chk longjmp.c:36
#5 0x55788c6210c0 in print_test_result.isra.0 builtin-test.c:0
#6 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
#7 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64
#8 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72
#9 0x7fc12fef1393 in __nanosleep nanosleep.c:26
#10 0x7fc12ff02d68 in __sleep sleep.c:55
#11 0x55788c63196b in test__PERF_RECORD perf-record.c:0
#12 0x55788c620fb0 in run_test_child builtin-test.c:0
#13 0x55788c5bd18d in start_command run-command.c:127
#14 0x55788c621ef3 in __cmd_test builtin-test.c:0
#15 0x55788c6225bf in cmd_test ??:0
#16 0x55788c5afbd0 in run_builtin perf.c:0
#17 0x55788c5afeeb in handle_internal_command perf.c:0
#18 0x55788c52b383 in main ??:0
#19 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74
#20 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
#21 0x55788c52b9d1 in _start ??:0
7: PERF_RECORD_* events & perf_sample fields : Skip (permissions)
```
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250624210500.2121303-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Without the change `perf `hangs up on charaster devices. On my system
it's enough to run system-wide sampler for a few seconds to get the
hangup:
$ perf record -a -g --call-graph=dwarf
$ perf report
# hung
`strace` shows that hangup happens on reading on a character device
`/dev/dri/renderD128`
$ strace -y -f -p 2780484
strace: Process 2780484 attached
pread64(101</dev/dri/renderD128>, strace: Process 2780484 detached
It's call trace descends into `elfutils`:
$ gdb -p 2780484
(gdb) bt
#0 0x00007f5e508f04b7 in __libc_pread64 (fd=101, buf=0x7fff9df7edb0, count=0, offset=0)
at ../sysdeps/unix/sysv/linux/pread64.c:25
#1 0x00007f5e52b79515 in read_file () from /<<NIX>>/elfutils-0.192/lib/libelf.so.1
#2 0x00007f5e52b25666 in libdw_open_elf () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#3 0x00007f5e52b25907 in __libdw_open_file () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#4 0x00007f5e52b120a9 in dwfl_report_elf@@ELFUTILS_0.156 ()
from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#5 0x000000000068bf20 in __report_module (al=al@entry=0x7fff9df80010, ip=ip@entry=139803237033216, ui=ui@entry=0x5369b5e0)
at util/dso.h:537
#6 0x000000000068c3d1 in report_module (ip=139803237033216, ui=0x5369b5e0) at util/unwind-libdw.c:114
#7 frame_callback (state=0x535aef10, arg=0x5369b5e0) at util/unwind-libdw.c:242
#8 0x00007f5e52b261d3 in dwfl_thread_getframes () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#9 0x00007f5e52b25bdb in get_one_thread_cb () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#10 0x00007f5e52b25faa in dwfl_getthreads () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#11 0x00007f5e52b26514 in dwfl_getthread_frames () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#12 0x000000000068c6ce in unwind__get_entries (cb=cb@entry=0x5d4620 <unwind_entry>, arg=arg@entry=0x10cd5fa0,
thread=thread@entry=0x1076a290, data=data@entry=0x7fff9df80540, max_stack=max_stack@entry=127,
best_effort=best_effort@entry=false) at util/thread.h:152
#13 0x00000000005dae95 in thread__resolve_callchain_unwind (evsel=0x106006d0, thread=0x1076a290, cursor=0x10cd5fa0,
sample=0x7fff9df80540, max_stack=127, symbols=true) at util/machine.c:2939
#14 thread__resolve_callchain_unwind (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, sample=0x7fff9df80540,
max_stack=127, symbols=true) at util/machine.c:2920
#15 __thread__resolve_callchain (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, evsel@entry=0x7fff9df80440,
sample=0x7fff9df80540, parent=parent@entry=0x7fff9df804a0, root_al=root_al@entry=0x7fff9df80440, max_stack=127, symbols=true)
at util/machine.c:2970
#16 0x00000000005d0cb2 in thread__resolve_callchain (thread=<optimized out>, cursor=<optimized out>, evsel=0x7fff9df80440,
sample=<optimized out>, parent=0x7fff9df804a0, root_al=0x7fff9df80440, max_stack=127) at util/machine.h:198
#17 sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fff9df804a0,
evsel=evsel@entry=0x106006d0, al=al@entry=0x7fff9df80440, max_stack=max_stack@entry=127) at util/callchain.c:1127
#18 0x0000000000617e08 in hist_entry_iter__add (iter=iter@entry=0x7fff9df80480, al=al@entry=0x7fff9df80440, max_stack_depth=127,
arg=arg@entry=0x7fff9df81ae0) at util/hist.c:1255
#19 0x000000000045d2d0 in process_sample_event (tool=0x7fff9df81ae0, event=<optimized out>, sample=0x7fff9df80540,
evsel=0x106006d0, machine=<optimized out>) at builtin-report.c:334
#20 0x00000000005e3bb1 in perf_session__deliver_event (session=0x105ff2c0, event=0x7f5c7d735ca0, tool=0x7fff9df81ae0,
file_offset=2914716832, file_path=0x105ffbf0 "perf.data") at util/session.c:1367
#21 0x00000000005e8d93 in do_flush (oe=0x105ffa50, show_progress=false) at util/ordered-events.c:245
#22 __ordered_events__flush (oe=0x105ffa50, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:324
#23 0x00000000005e1f64 in perf_session__process_user_event (session=0x105ff2c0, event=0x7f5c7d752b18, file_offset=2914835224,
file_path=0x105ffbf0 "perf.data") at util/session.c:1419
#24 0x00000000005e47c7 in reader__read_event (rd=rd@entry=0x7fff9df81260, session=session@entry=0x105ff2c0,
--Type <RET> for more, q to quit, c to continue without paging--
quit
prog=prog@entry=0x7fff9df81220) at util/session.c:2132
#25 0x00000000005e4b37 in reader__process_events (rd=0x7fff9df81260, session=0x105ff2c0, prog=0x7fff9df81220)
at util/session.c:2181
#26 __perf_session__process_events (session=0x105ff2c0) at util/session.c:2226
#27 perf_session__process_events (session=session@entry=0x105ff2c0) at util/session.c:2390
#28 0x0000000000460add in __cmd_report (rep=0x7fff9df81ae0) at builtin-report.c:1076
#29 cmd_report (argc=<optimized out>, argv=<optimized out>) at builtin-report.c:1827
#30 0x00000000004c5a40 in run_builtin (p=p@entry=0xd8f7f8 <commands+312>, argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0)
at perf.c:351
#31 0x00000000004c5d63 in handle_internal_command (argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0) at perf.c:404
#32 0x0000000000442de3 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:448
#33 main (argc=<optimized out>, argv=0x7fff9df844b0) at perf.c:556
The hangup happens because nothing in` perf` or `elfutils` checks if a
mapped file is easily readable.
The change conservatively skips all non-regular files.
Signed-off-by: Sergei Trofimovich <slyich@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250505174419.2814857-1-slyich@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Lower non-error debug messages to verbose 3 or larger.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250623161930.1421216-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
To get the fixes in libbpf and perf tools.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
This is an end-to-end test for the PERF_RECORD_BPF_METADATA support.
It adds a new "bpf_metadata_perf_version" variable to perf's BPF programs,
so that when they are loaded, there will be at least one BPF program with
some metadata to parse. The test invokes "perf record" in a way that loads
one of those BPF programs, and then sifts through the output to find its
BPF metadata.
Signed-off-by: Blake Jones <blakejones@google.com>
Link: https://lore.kernel.org/r/20250612194939.162730-6-blakejones@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Here's some example "perf script -D" output for the new event type. The
": unhandled!" message is from tool.c, analogous to other behavior there.
I've elided some rows with all NUL characters for brevity, and I wrapped
one of the >75-column lines to fit in the commit guidelines.
0x50fc8@perf.data [0x260]: event: 84
.
. ... raw event: size 608 bytes
. 0000: 54 00 00 00 00 00 60 02 62 70 66 5f 70 72 6f 67 T.....`.bpf_prog
. 0010: 5f 31 65 30 61 32 65 33 36 36 65 35 36 66 31 61 _1e0a2e366e56f1a
. 0020: 32 5f 70 65 72 66 5f 73 61 6d 70 6c 65 5f 66 69 2_perf_sample_fi
. 0030: 6c 74 65 72 00 00 00 00 00 00 00 00 00 00 00 00 lter............
. 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[...]
. 0110: 74 65 73 74 5f 76 61 6c 75 65 00 00 00 00 00 00 test_value......
. 0120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[...]
. 0150: 34 32 00 00 00 00 00 00 00 00 00 00 00 00 00 00 42..............
. 0160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[...]
0 0x50fc8 [0x260]: PERF_RECORD_BPF_METADATA \
prog bpf_prog_1e0a2e366e56f1a2_perf_sample_filter
entry 0: test_value = 42
: unhandled!
Signed-off-by: Blake Jones <blakejones@google.com>
Link: https://lore.kernel.org/r/20250612194939.162730-5-blakejones@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
This collects metadata for any BPF programs that were loaded during a
"perf record" run, and emits it at the end of the run.
Signed-off-by: Blake Jones <blakejones@google.com>
Link: https://lore.kernel.org/r/20250612194939.162730-4-blakejones@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Look for .rodata maps, find ones with 'bpf_metadata_' variables, extract
their values as strings, and create a new PERF_RECORD_BPF_METADATA
synthetic event using that data. The code gets invoked from the existing
routine perf_event__synthesize_one_bpf_prog().
For example, a BPF program with the following variables:
const char bpf_metadata_version[] SEC(".rodata") = "3.14159";
int bpf_metadata_value[] SEC(".rodata") = 42;
would generate a PERF_RECORD_BPF_METADATA record with:
.prog_name = <BPF program name, e.g. "bpf_prog_a1b2c3_foo">
.nr_entries = 2
.entries[0].key = "version"
.entries[0].value = "3.14159"
.entries[1].key = "value"
.entries[1].value = "42"
Each of the BPF programs and subprograms that share those variables would
get a distinct PERF_RECORD_BPF_METADATA record, with the ".prog_name"
showing the name of each program or subprogram. The prog_name is
deliberately the same as the ".name" field in the corresponding
PERF_RECORD_KSYMBOL record.
This code only gets invoked if support for displaying BTF char arrays
as strings is detected.
Signed-off-by: Blake Jones <blakejones@google.com>
Link: https://lore.kernel.org/r/20250612194939.162730-3-blakejones@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
It is possible for systems to have a greater socket id number than the
number of cpus present on a machine, so this test is obselete and should
be removed.
Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250618142921.4053400-2-ashelat@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Commit 7b100989b4f6bce7 ("perf evlist: Remove __evlist__add_default")
changed to use "cycles:P" as a default event. But the problem is it
cannot set other default modifiers correctly.
perf kvm needs to set attr.exclude_host by default but it didn't work
because of the logic in the parse_events__modifier_list(). Also the
exclude_GH_default was applied only if ":u" modifier was specified -
which is strange. Move it out after handling the ":GH" and check
perf_host and perf_guest properly.
Before:
$ ./perf kvm record -vv true |& grep exclude
(nothing)
But specifying an event (without a modifier) works:
$ ./perf kvm record -vv -e cycles true |& grep exclude
exclude_host 1
After:
It now works for the both cases:
$ ./perf kvm record -vv true |& grep exclude
exclude_host 1
$ ./perf kvm record -vv -e cycles true |& grep exclude
exclude_host 1
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250606225431.2109754-1-namhyung@kernel.org
Fixes: 35c8d21371e9b342 ("perf tools: Don't set attr.exclude_guest by default")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Test that disabling rdpmc support via /sys/bus/event_source/cpu*/rdpmc
disables reading in the mmap (libperf read support will fallback to
using a system call).
Test all hybrid PMUs support rdpmc.
Ensure hybrid PMUs use the correct CPU to rdpmc the correct
event. Previously the test would open cycles or instructions with no
extended type then rdpmc it on whatever CPU. This could fail/skip due
to which CPU the test was scheduled upon.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250614004528.1652860-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add missing close() to avoid leaking perf events.
In past perfs this mattered little as the function was just used by 'perf
list'.
As the function is now used to detect hybrid PMUs leaking the perf event
is somewhat more painful.
Fixes: b41f1cec91c37eee ("perf list: Skip unsupported events")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/20250614004108.1650988-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Also add SYM_PIC_ALIAS() to tools/perf/util/include/linux/linkage.h.
This is to get the changes from:
419cbaf6a56a6e4b ("x86/boot: Add a bunch of PIC aliases")
That addresses these perf tools build warning:
Warning: Kernel ABI header differences:
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/aEry7L3fibwIG5au@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add thread safety annotations for comm_list and add locking for two
instances where the list is accessed without the lock held (in
contradiction to ____thread__set_comm()).
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250529192206.971199-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
pert script tests fails with segmentation fault as below:
92: perf script tests:
--- start ---
test child forked, pid 103769
DB test
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB /tmp/perf-test-script.7rbftEpOzX/perf.data (9 samples) ]
/usr/libexec/perf-core/tests/shell/script.sh: line 35:
103780 Segmentation fault (core dumped)
perf script -i "${perfdatafile}" -s "${db_test}"
--- Cleaning up ---
---- end(-1) ----
92: perf script tests : FAILED!
Backtrace pointed to :
#0 0x0000000010247dd0 in maps.machine ()
#1 0x00000000101d178c in db_export.sample ()
#2 0x00000000103412c8 in python_process_event ()
#3 0x000000001004eb28 in process_sample_event ()
#4 0x000000001024fcd0 in machines.deliver_event ()
#5 0x000000001025005c in perf_session.deliver_event ()
#6 0x00000000102568b0 in __ordered_events__flush.part.0 ()
#7 0x0000000010251618 in perf_session.process_events ()
#8 0x0000000010053620 in cmd_script ()
#9 0x00000000100b5a28 in run_builtin ()
#10 0x00000000100b5f94 in handle_internal_command ()
#11 0x0000000010011114 in main ()
Further investigation reveals that this occurs in the `perf script tests`,
because it uses `db_test.py` script. This script sets `perf_db_export_mode = True`.
With `perf_db_export_mode` enabled, if a sample originates from a hypervisor,
perf doesn't set maps for "[H]" sample in the code. Consequently, `al->maps` remains NULL
when `maps__machine(al->maps)` is called from `db_export__sample`.
As al->maps can be NULL in case of Hypervisor samples , use thread->maps
because even for Hypervisor sample, machine should exist.
If we don't have machine for some reason, return -1 to avoid segmentation fault.
Reported-by: Disha Goel <disgoel@linux.ibm.com>
Signed-off-by: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Link: https://lore.kernel.org/r/20250429065132.36839-1-adityab1@linux.ibm.com
Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Now the target doesn't have a uid, it is handled through BPF filters,
remove the uid options to thread_map creation. Tidy up the functions
used in tests to avoid passing unused arguments.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Gathering threads with a uid by scanning /proc is inherently racy
leading to perf_event_open failures that quit perf. All users of the
functionality now use BPF filters, so remove uid and uid_str from
target.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Finding user processes by scanning /proc is inherently racy and
results in perf_event_open failures. Use a BPF filter to drop samples
where the uid doesn't match.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add parse_uid_filter filter as a helper to parse_filter, that
constructs a uid filter string. As uid filters don't work with
tracepoint filters, add a is_possible_tp_filter function so the
tracepoint filter isn't attempted for tracepoint evsels.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Allow parse_uid to be called without a struct target. Rather than have
two errors, remove TARGET_ERRNO__USER_NOT_FOUND and use
TARGET_ERRNO__INVALID_UID as the handling is identical.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Rather than manually scanning PMUs, use evsel__find_pmu that can use
the PMU set during event parsing.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The BPF filter needs libbpf/BPF-skeleton support and root privilege.
Add error messages to help users understand the problem easily.
When it's not build with BPF support (make BUILD_BPF_SKEL=0).
$ sudo perf record -e cycles --filter "pid != 0" true
Error: BPF filter is requested but perf is not built with BPF.
Please make sure to build with libbpf and BPF skeleton.
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
--filter <filter>
event filter
When it supports BPF but runs without root or CAP_BPF. Note that it
also checks pinned BPF filters.
$ perf record -e cycles --filter "pid != 0" -o /dev/null true
Error: BPF filter only works for users with the CAP_BPF capability!
Please run 'perf record --setup-filter pin' as root first.
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
--filter <filter>
event filter
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174835.1852481-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for the FWFT SBI extension, which is part of SBI 3.0 and a
dependency for many new SBI and ISA extensions
- Support for getrandom() in the VDSO
- Support for mseal
- Optimized routines for raid6 syndrome and recovery calculations
- kexec_file() supports loading Image-formatted kernel binaries
- Improvements to the instruction patching framework to allow for
atomic instruction patching, along with rules as to how systems need
to behave in order to function correctly
- Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha,
some SiFive vendor extensions
- Various fixes and cleanups, including: misaligned access handling,
perf symbol mangling, module loading, PUD THPs, and improved uaccess
routines
* tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (69 commits)
riscv: uaccess: Only restore the CSR_STATUS SUM bit
RISC-V: vDSO: Wire up getrandom() vDSO implementation
riscv: enable mseal sysmap for RV64
raid6: Add RISC-V SIMD syndrome and recovery calculations
riscv: mm: Add support for Svinval extension
RISC-V: Documentation: Add enough title underlines to CMODX
riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE
MAINTAINERS: Update Atish's email address
riscv: uaccess: do not do misaligned accesses in get/put_user()
riscv: process: use unsigned int instead of unsigned long for put_user()
riscv: make unsafe user copy routines use existing assembly routines
riscv: hwprobe: export Zabha extension
riscv: Make regs_irqs_disabled() more clear
perf symbols: Ignore mapping symbols on riscv
RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND
riscv: module: Optimize PLT/GOT entry counting
riscv: Add support for PUD THP
riscv: xchg: Prefetch the destination word for sc.w
riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop
riscv: Add support for Zicbop
...
|
|
RISCV ELF use mapping symbols with special names $x, $d to
identify regions of RISCV code or code with different ISAs[1].
These symbols don't identify functions, so will confuse the
perf output.
The patch filters out these symbols at load time, similar to
"4886f2ca perf symbols: Ignore mapping symbols on aarch64".
[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/
master/riscv-elf.adoc#mapping-symbol
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250409025202.201046-1-haibo1.xu@intel.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
|