summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2025-07-01perf test: Name the noploop processIan Rogers
Name the noploop process "perf-noploop" so that tests can easily check for its existence. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250628012302.1242532-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-30perf build: Specify shellcheck should use bashCollin Funk
When someone has a global shellcheckrc file, for example at ~/.config/shellcheckrc, with the directive 'shell=sh', building perf will fail with many shellcheck errors like: In tests/shell/base_probe/test_adding_kernel.sh line 294: (( TEST_RESULT += $? )) ^---------------------^ SC3006 (warning): In POSIX sh, standalone ((..)) is undefined. For more information: https://www.shellcheck.net/wiki/SC3006 -- In POSIX sh, standalone ((..)) is... make[5]: *** [tests/Build:91: tests/shell/base_probe/test_adding_kernel.sh.shellcheck_log] Error 1 Passing the '-s bash' option ensures that it runs correctly regardless of a developers global configuration. This patch adds '-s bash' and other options to the SHELLCHECK variable in Makefile.perf and makes use of the variable consistently. Signed-off-by: Collin Funk <collin.funk1@gmail.com> Link: https://lore.kernel.org/r/63491dbc8439edf2e949d80e264b9d22332fea61.1751082075.git.collin.funk1@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-30perf test annotate: Use --percent-limit rather than head to reduce outputIan Rogers
The annotate test was sped up by Thomas Richter <tmricht@linux.ibm.com> in commit 658a8805cb60 ("perf test: Speed up test case 70 annotate basic tests") by reducing the annotate output using head. This causes flakes on hybrid machines where the first event dumped may not have the samples for the test within it. Rather than reduce the output using `head` switch to `--percent-limit 10` which will stop annotate dumping functions that have an overhead of less than 10%, the noploop program should be using more. Add the missing objdump option for the pipe mode version of the objdump with a command test. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20250628015832.1271229-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-30perf test: Add basic callgraph test to record testingIan Rogers
Give some basic perf record callgraph coverage. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20250628015553.1270748-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-30perf drm_pmu: Fix spelling mistake "bufers" -> "buffers"Colin Ian King
There are spelling mistakes in some literal strings. Fix these. Fixes: 28917cb17f9d ("perf drm_pmu: Add a tool like PMU to expose DRM information") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250630125128.562895-1-colin.i.king@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-30perf test: perf header test fails on s390Thomas Richter
commit 2d584688643fa ("perf test: Add header shell test") introduced a new test case for perf header. It fails on s390 because call graph option -g is not supported on s390. Also the option --call-graph dwarf is only supported for the event cpu-clock. Remove this option and the test succeeds. Output after: # ./perf test 76 76: perf header tests : Ok Fixes: 2d584688643fa ("perf test: Add header shell test") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Ian Rogers <irogers@google.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Link: https://lore.kernel.org/r/20250630091613.3061664-1-tmricht@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-27perf stat: Fix uncore aggregation numberChun-Tse Shao
Follow up: lore.kernel.org/CAP-5=fVDF4-qYL1Lm7efgiHk7X=_nw_nEFMBZFMcsnOOJgX4Kg@mail.gmail.com/ The patch adds unit aggregation during evsel merge the aggregated uncore counters. Change the name of the column to `ctrs` and `counters` for json mode. Tested on a 2-socket machine with SNC3, uncore_imc_[0-11] and cpumask="0,120" Before: perf stat -e clockticks -I 1000 --per-socket # time socket cpus counts unit events 1.001085024 S0 1 9615386315 clockticks 1.001085024 S1 1 9614287448 clockticks perf stat -e clockticks -I 1000 --per-node # time node cpus counts unit events 1.001029867 N0 1 3205726984 clockticks 1.001029867 N1 1 3205444421 clockticks 1.001029867 N2 1 3205234018 clockticks 1.001029867 N3 1 3205224660 clockticks 1.001029867 N4 1 3205207213 clockticks 1.001029867 N5 1 3205528246 clockticks After: perf stat -e clockticks -I 1000 --per-socket # time socket ctrs counts unit events 1.001026071 S0 12 9619677996 clockticks 1.001026071 S1 12 9618612614 clockticks perf stat -e clockticks -I 1000 --per-node # time node ctrs counts unit events 1.001027449 N0 4 3207251859 clockticks 1.001027449 N1 4 3207315930 clockticks 1.001027449 N2 4 3206981828 clockticks 1.001027449 N3 4 3206566126 clockticks 1.001027449 N4 4 3206032609 clockticks 1.001027449 N5 4 3205651355 clockticks Tested with JSON output linter: perf test "perf stat JSON output linter" 94: perf stat JSON output linter : Ok Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Chun-Tse Shao <ctshao@google.com> Link: https://lore.kernel.org/r/20250627201818.479421-1-ctshao@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-27perf build: Fix a build error on REFCNT_CHECKING=1Namhyung Kim
Recently it added -fno-strict-aliasing to sync with the kernel behavior. But it caused an error due to potential uninitialized access like below: In file included from util/symbol.c:27: In function ‘dso__set_symbol_names_len’, inlined from ‘dso__sort_by_name’ at util/symbol.c:638:4: util/dso.h:654:46: error: ‘len’ may be used uninitialized [-Werror=maybe-uninitialized] 654 | RC_CHK_ACCESS(dso)->symbol_names_len = len; | ^ util/symbol.c: In function ‘dso__sort_by_name’: util/symbol.c:634:24: note: ‘len’ was declared here 634 | size_t len; | ^~~ Let's just initialize it with 0. Fixes: 55a18d2f3ff79c90 ("perf build: enable -fno-strict-aliasing") Closes: https://lore.kernel.org/r/aF7JC8zkG5-_-nY_@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26tools/perf: Add --exclude-buildids option to perf archive commandTianyou Li
When make a perf archive, it may contains the binaries that user did not want to ship with, add --exclude-buildids option to specify a file which contains the buildids need to be excluded. The file can be generated from command: perf buildid-list -i perf.data --with-hits | grep -v "^ " > exclude-buildids.txt Then remove the lines from the exclude-buildids.txt for buildids should be included. Signed-off-by: Tianyou Li <tianyou.li@intel.com> Reviewed-by: Wangyang Guo <wangyang.guo@intel.com> Link: https://lore.kernel.org/r/20250625161509.2599646-1-tianyou.li@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26perf annotate: Fix source code annotate with objdumpNamhyung Kim
Recently it uses llvm and capstone to speed up annotation or disassembly of instructions. But they don't support source code view yet. Until it fixed, we can force to use objdump for source code annotation. To prevent performance loss, it's disabled by default and turned it on when user requests it in TUI by pressing 's' key. Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250625230339.702610-1-namhyung@kernel.org Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26tools: Remove libcrypto dependencyYuzhuo Jing
Remove all occurrence of libcrypto in the build system. Signed-off-by: Yuzhuo Jing <yuzhuo@google.com> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250625202311.23244-5-ebiggers@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26perf genelf: Remove libcrypto dependency and use built-in sha1()Yuzhuo Jing
genelf is the only file in perf that depends on libcrypto (or openssl) which only calculates a Build ID (SHA1, MD5, or URANDOM). SHA1 was expected to be the default option, but MD5 was used by default due to previous issues when linking against Java. This commit switches genelf to use the in-house sha1(), and also removes MD5 and URANDOM options since we have a reliable SHA1 implementation to rely on. It passes the tools/perf/tests/shell/test_java_symbol.sh test. Signed-off-by: Yuzhuo Jing <yuzhuo@google.com> Co-developed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250625202311.23244-4-ebiggers@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26perf util: add a basic SHA-1 implementationEric Biggers
SHA-1 can be written in fewer than 100 lines of code. Just add a basic SHA-1 implementation so that there's no need to use an external library or try to pull in the kernel's SHA-1 implementation. The kernel's SHA-1 implementation is not really intended to be pulled into userspace programs in the way that it was proposed to do so for perf (https://lore.kernel.org/r/20250521225307.743726-3-yuzhuo@google.com/), and it's also likely to undergo some refactoring in the future. There's no need to tie userspace tools to it. Include a test for sha1() in the util test suite. Signed-off-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250625202311.23244-3-ebiggers@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26perf build: enable -fno-strict-aliasingEric Biggers
perf pulls in code from kernel headers that assumes it is being built with -fno-strict-aliasing, namely put_unaligned_*() from <linux/unaligned.h> which write the data using packed structs that lack the may_alias attribute. Enable -fno-strict-aliasing to prevent miscompilations in sha1.c which would otherwise occur due to this issue. Signed-off-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250625202311.23244-2-ebiggers@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26perf top: populate PMU capabilities data in perf_envThomas Falcon
Calling perf top with branch filters enabled on Intel CPU's with branch counters logging (A.K.A LBR event logging [1]) support results in a segfault. $ perf top -e '{cpu_core/cpu-cycles/,cpu_core/event=0xc6,umask=0x3,frontend=0x11,name=frontend_retired_dsb_miss/}' -j any,counter ... Thread 27 "perf" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fffafff76c0 (LWP 949003)] perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653 653 *width = env->cpu_pmu_caps ? env->br_cntr_width : (gdb) bt #0 perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653 #1 0x00000000005b1599 in symbol__account_br_cntr (branch=0x7fffcc3db580, evsel=0xfea2d0, offset=12, br_cntr=8) at util/annotate.c:345 #2 0x00000000005b17fb in symbol__account_cycles (addr=5658172, start=5658160, sym=0x7fffcc0ee420, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:389 #3 0x00000000005b1976 in addr_map_symbol__account_cycles (ams=0x7fffcd7b01d0, start=0x7fffcd7b02b0, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:422 #4 0x000000000068d57f in hist__account_cycles (bs=0x110d288, al=0x7fffafff6540, sample=0x7fffafff6760, nonany_branch_mode=false, total_cycles=0x0, evsel=0xfea2d0) at util/hist.c:2850 #5 0x0000000000446216 in hist_iter__top_callback (iter=0x7fffafff6590, al=0x7fffafff6540, single=true, arg=0x7fffffff9e00) at builtin-top.c:737 #6 0x0000000000689787 in hist_entry_iter__add (iter=0x7fffafff6590, al=0x7fffafff6540, max_stack_depth=127, arg=0x7fffffff9e00) at util/hist.c:1359 #7 0x0000000000446710 in perf_event__process_sample (tool=0x7fffffff9e00, event=0x110d250, evsel=0xfea2d0, sample=0x7fffafff6760, machine=0x108c968) at builtin-top.c:845 #8 0x0000000000447735 in deliver_event (qe=0x7fffffffa120, qevent=0x10fc200) at builtin-top.c:1211 #9 0x000000000064ccae in do_flush (oe=0x7fffffffa120, show_progress=false) at util/ordered-events.c:245 #10 0x000000000064d005 in __ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP, timestamp=0) at util/ordered-events.c:324 #11 0x000000000064d0ef in ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP) at util/ordered-events.c:342 #12 0x00000000004472a9 in process_thread (arg=0x7fffffff9e00) at builtin-top.c:1120 #13 0x00007ffff6e7dba8 in start_thread (arg=<optimized out>) at pthread_create.c:448 #14 0x00007ffff6f01b8c in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78 The cause is that perf_env__find_br_cntr_info tries to access a null pointer pmu_caps in the perf_env struct. A similar issue exists for homogeneous core systems which use the cpu_pmu_caps structure. Fix this by populating cpu_pmu_caps and pmu_caps structures with values from sysfs when calling perf top with branch stack sampling enabled. [1], LBR event logging introduced here: https://lore.kernel.org/all/20231025201626.3000228-5-kan.liang@linux.intel.com/ Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250612163659.1357950-2-thomas.falcon@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26perf tools: move perf_pmus__find_core_pmu() prototype to pmus.hThomas Falcon
perf_pmus__find_core_pmu() is implemented in util/pmus.c but its prototpye is in util/pmu.h. Move it to util/pmus.h. Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250612163659.1357950-1-thomas.falcon@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26perf trace: Split BPF skel code to util/bpf_trace_augment.cNamhyung Kim
And make builtin-trace.c less conditional. Dummy functions will be called when BUILD_BPF_SKEL=0 is used. This makes the builtin-trace.c slightly smaller and simpler by removing the skeleton and its helpers. The conditional guard of trace__init_syscalls_bpf_prog_array_maps() is changed from the HAVE_BPF_SKEL to HAVE_LIBBPF_SUPPORT as it doesn't have a skeleton in the code directly. And a dummy function is added so that it can be called unconditionally. The function will succeed only if the both conditions are true. Do not include trace_augment.h from the BPF code and move the definition of TRACE_AUG_MAX_BUF to the BPF directly. Reviewed-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250623225721.21553-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26perf test: Change all remaining #!/bin/sh to #!/bin/bashJames Clark
There are 43 instances of posix shell tests and 35 instances of bash. To give us a single consistent language for testing in, replace all #!/bin/sh to #!/bin/bash. Common sources that are included in both different shells will now work as expected. And we no longer have to fix up bashisms that appear to work when someone's system has sh symlinked to bash, but don't work on other systems that have both shells installed. Although we could have chosen sh, it's not backwards compatible so it wouldn't be possible to bulk convert without re-writing the existing bash tests. Choosing bash also gives us some nicer features including 'local' variable definitions and regexes in if statements that are already widely used in the tests. It's not expected that there are any users with only sh available due to the large number of bash tests that exist. Discussed in relation to running shellcheck here: https://lore.kernel.org/linux-perf-users/e3751a74be34bbf3781c4644f518702a7270220b.1749785642.git.collin.funk1@gmail.com/ Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Collin Funk <collin.funk1@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250623-james-perf-bash-tests-v1-1-f572f54d4559@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25perf header: Don't write empty BPF/BTF infoIan Rogers
If there are no values in bpf_prog_info or bpf_btf feature don't write the data into the header. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25perf header: Display message if BPF/BTF info is emptyIan Rogers
The perf.data file may contain a bpf_prog_info or bpf_btf feature. If the contents of these are empty then nothing is displayed. Rather than display nothing and not account for the file space, display an empty message. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25perf header: Allow tracing of attr eventsIan Rogers
In pipe mode attr events capture the perf_event_attr. Allow their dumping as they normally start the file. Before: ``` $ perf record -o - -a sleep 1 | perf script -D -i - . ... raw event: size 272 bytes . 0000: 40 00 00 00 00 00 10 01 00 00 00 00 88 00 00 00 @............... . 0010: 00 00 00 00 00 00 00 00 a0 0f 00 00 00 00 00 00 ................ . 0020: 87 01 01 00 00 00 00 00 14 00 00 00 00 00 00 00 ................ . 0030: 01 84 05 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0090: 91 08 00 00 00 00 00 00 92 08 00 00 00 00 00 00 ................ . 00a0: 93 08 00 00 00 00 00 00 94 08 00 00 00 00 00 00 ................ . 00b0: 95 08 00 00 00 00 00 00 96 08 00 00 00 00 00 00 ................ . 00c0: 97 08 00 00 00 00 00 00 98 08 00 00 00 00 00 00 ................ . 00d0: 99 08 00 00 00 00 00 00 9a 08 00 00 00 00 00 00 ................ . 00e0: 9b 08 00 00 00 00 00 00 9c 08 00 00 00 00 00 00 ................ . 00f0: 9d 08 00 00 00 00 00 00 9e 08 00 00 00 00 00 00 ................ . 0100: 9f 08 00 00 00 00 00 00 a0 08 00 00 00 00 00 00 ................ -1 -1 0 [0x110]: PERF_RECORD_ATTR 0x110@pipe [0x110]: event: 64 ... ``` After: ``` $ perf record -o - -a sleep 1 | perf script -D -i - 0@pipe [0x110]: event: 64 . . ... raw event: size 272 bytes . 0000: 40 00 00 00 00 00 10 01 00 00 00 00 88 00 00 00 @............... . 0010: 00 00 00 00 00 00 00 00 a0 0f 00 00 00 00 00 00 ................ . 0020: 87 01 01 00 00 00 00 00 14 00 00 00 00 00 00 00 ................ . 0030: 01 84 05 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0090: 5c 08 00 00 00 00 00 00 5d 08 00 00 00 00 00 00 \.......]....... . 00a0: 5e 08 00 00 00 00 00 00 5f 08 00 00 00 00 00 00 ^......._....... . 00b0: 60 08 00 00 00 00 00 00 61 08 00 00 00 00 00 00 `.......a....... . 00c0: 62 08 00 00 00 00 00 00 63 08 00 00 00 00 00 00 b.......c....... . 00d0: 64 08 00 00 00 00 00 00 65 08 00 00 00 00 00 00 d.......e....... . 00e0: 66 08 00 00 00 00 00 00 67 08 00 00 00 00 00 00 f.......g....... . 00f0: 68 08 00 00 00 00 00 00 69 08 00 00 00 00 00 00 h.......i....... . 0100: 6a 08 00 00 00 00 00 00 6b 08 00 00 00 00 00 00 j.......k....... -1 -1 0 [0x110]: PERF_RECORD_ATTR, type = 0 (PERF_TYPE_HARDWARE), size = 136, config = 0 (PERF_COUNT_HW_CPU_CYCLES), { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format = ID|LOST, disabled = 1, freq = 1, precise_ip = 3, sample_id_all = 1 0x110@pipe [0x110]: event: 64 ... ``` Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25perf header: In pipe mode dump features without --header/-IIan Rogers
In pipe mode the header features are contained within events. While other events dump details the header features only dump if --header or -I are passed, which doesn't make sense as in pipe mode there is no perf file header. Make the printing of the information conditional on dump_trace as with other events. Before: ``` $ perf record -o - -a sleep 1 | perf script -D -i - ... 0x2c8@pipe [0x54]: event: 80 . . ... raw event: size 84 bytes . 0000: 50 00 00 00 00 00 54 00 05 00 00 00 00 00 00 00 P.....T......... . 0010: 40 00 00 00 36 2e 31 35 2e 72 63 37 2e 67 61 64 @...6.15.rc7.gad . 0020: 32 61 36 39 31 63 39 39 66 62 00 00 00 00 00 00 2a691c99fb...... . 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0050: 00 00 00 00 .... 0 0 0x2c8 [0x54]: PERF_RECORD_FEATURE ``` After: ``` $ perf record -o - -a sleep 1 | perf script -D -i - ... 0x2c8@pipe [0x54]: event: 80 . . ... raw event: size 84 bytes . 0000: 50 00 00 00 00 00 54 00 05 00 00 00 00 00 00 00 P.....T......... . 0010: 40 00 00 00 36 2e 31 35 2e 72 63 37 2e 67 61 64 @...6.15.rc7.gad . 0020: 32 61 36 39 31 63 39 39 66 62 00 00 00 00 00 00 2a691c99fb...... . 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0050: 00 00 00 00 .... 0 0 0x2c8 [0x54]: PERF_RECORD_FEATURE, # perf version : 6.15.rc7.gad2a691c99fb ``` Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25perf tests: Add a DRM PMU testIan Rogers
The test opens any DRM devices so that the shell has fdinfo files containing the DRM data. The test then uses perf stat to make sure the events can be read. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250624231837.179536-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25perf drm_pmu: Add a tool like PMU to expose DRM informationIan Rogers
DRM clients expose information through usage stats as documented in Documentation/gpu/drm-usage-stats.rst (available online at https://docs.kernel.org/gpu/drm-usage-stats.html). Add a tool like PMU, similar to the hwmon PMU, that exposes DRM information. For example on a tigerlake laptop: ``` $ perf list drm List of pre-defined events (to be used in -e or -M): drm: drm-active-stolen-system0 [Total memory active in one or more engines. Unit: drm_i915] drm-active-system0 [Total memory active in one or more engines. Unit: drm_i915] drm-engine-capacity-video [Engine capacity. Unit: drm_i915] drm-engine-copy [Utilization in ns. Unit: drm_i915] drm-engine-render [Utilization in ns. Unit: drm_i915] drm-engine-video [Utilization in ns. Unit: drm_i915] drm-engine-video-enhance [Utilization in ns. Unit: drm_i915] drm-purgeable-stolen-system0 [Size of resident and purgeable memory bufers. Unit: drm_i915] drm-purgeable-system0 [Size of resident and purgeable memory bufers. Unit: drm_i915] drm-resident-stolen-system0 [Size of resident memory bufers. Unit: drm_i915] drm-resident-system0 [Size of resident memory bufers. Unit: drm_i915] drm-shared-stolen-system0 [Size of shared memory bufers. Unit: drm_i915] drm-shared-system0 [Size of shared memory bufers. Unit: drm_i915] drm-total-stolen-system0 [Size of shared and private memory. Unit: drm_i915] drm-total-system0 [Size of shared and private memory. Unit: drm_i915] ``` System wide data can be gathered: ``` $ perf stat -x, -I 1000 -e drm-active-stolen-system0,drm-active-system0,drm-engine-capacity-video,drm-engine-copy,drm-engine-render,drm-engine-video,drm-engine-video-enhance,drm-purgeable-stolen-system0,drm-purgeable-system0,drm-resident-stolen-system0,drm-resident-system0,drm-shared-stolen-system0,drm-shared-system0,drm-total-stolen-system0,drm-total-system0 1.000904910,0,bytes,drm-active-stolen-system0,1,100.00,, 1.000904910,0,bytes,drm-active-system0,1,100.00,, 1.000904910,36,capacity,drm-engine-capacity-video,1,100.00,, 1.000904910,0,ns,drm-engine-copy,1,100.00,, 1.000904910,1472970566175,ns,drm-engine-render,1,100.00,, 1.000904910,0,ns,drm-engine-video,1,100.00,, 1.000904910,0,ns,drm-engine-video-enhance,1,100.00,, 1.000904910,0,bytes,drm-purgeable-stolen-system0,1,100.00,, 1.000904910,38199296,bytes,drm-purgeable-system0,1,100.00,, 1.000904910,0,bytes,drm-resident-stolen-system0,1,100.00,, 1.000904910,4643196928,bytes,drm-resident-system0,1,100.00,, 1.000904910,0,bytes,drm-shared-stolen-system0,1,100.00,, 1.000904910,1886871552,bytes,drm-shared-system0,1,100.00,, 1.000904910,0,bytes,drm-total-stolen-system0,1,100.00,, 1.000904910,4643196928,bytes,drm-total-system0,1,100.00,, 2.264426839,0,bytes,drm-active-stolen-system0,1,100.00,, ``` Or for a particular process: ``` $ perf stat -x, -I 1000 -e drm-active-stolen-system0,drm-active-system0,drm-engine-capacity-video,drm-engine-copy,drm-engine-render,drm-engine-video,drm-engine-video-enhance,drm-purgeable-stolen-system0,drm-purgeable-system0,drm-resident-stolen-system0,drm-resident-system0,drm-shared-stolen-system0,drm-shared-system0,drm-total-stolen-system0,drm-total-system0 -p 200027 1.001040274,0,bytes,drm-active-stolen-system0,6,100.00,, 1.001040274,0,bytes,drm-active-system0,6,100.00,, 1.001040274,12,capacity,drm-engine-capacity-video,6,100.00,, 1.001040274,0,ns,drm-engine-copy,6,100.00,, 1.001040274,1542300,ns,drm-engine-render,6,100.00,, 1.001040274,0,ns,drm-engine-video,6,100.00,, 1.001040274,0,ns,drm-engine-video-enhance,6,100.00,, 1.001040274,0,bytes,drm-purgeable-stolen-system0,6,100.00,, 1.001040274,13516800,bytes,drm-purgeable-system0,6,100.00,, 1.001040274,0,bytes,drm-resident-stolen-system0,6,100.00,, 1.001040274,27746304,bytes,drm-resident-system0,6,100.00,, 1.001040274,0,bytes,drm-shared-stolen-system0,6,100.00,, 1.001040274,0,bytes,drm-shared-system0,6,100.00,, 1.001040274,0,bytes,drm-total-stolen-system0,6,100.00,, 1.001040274,27746304,bytes,drm-total-system0,6,100.00,, 2.016629075,0,bytes,drm-active-stolen-system0,6,100.00,, ``` As with the hwmon PMU, high numbered PMU types are used to encode multiple possible "DRM" PMUs. The appropriate fdinfo is found by scanning /proc and filtering which fdinfos to read with stat. To avoid some unneeding scanning, events not starting with "drm-" are ignored. The patch builds on commit 57e13264dcea ("perf pmus: Restructure pmu_read_sysfs to scan fewer PMUs") and later so that only if full wild carding is being done, the PMU starts with "drm_" or the event starts with "drm-" will /proc be scanned. That is there should be little to no cost in this PMU unless DRM events are requested. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250624231837.179536-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25perf parse-events: Avoid scanning PMUs that can't contain eventsIan Rogers
Add perf_pmus__scan_for_event that only reads sysfs for pmus that could contain a given event. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250624231837.179536-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25perf debug: Add function symbols to dump_stackIan Rogers
Symbolize stack traces by creating a live machine. Add this functionality to dump_stack and switch dump_stack users to use it. Switch TUI to use it. Add stack traces to the child test function which can be useful to diagnose blocked code. Example output: ``` $ perf test -vv PERF_RECORD_ ... 7: PERF_RECORD_* events & perf_sample fields: 7: PERF_RECORD_* events & perf_sample fields : Running (1 active) ^C Signal (2) while running tests. Terminating tests with the same signal Internal test harness failure. Completing any started tests: : 7: PERF_RECORD_* events & perf_sample fields: ---- unexpected signal (2) ---- #0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0 #1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0 #2 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64 #3 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72 #4 0x7fc12fef1393 in __nanosleep nanosleep.c:26 #5 0x7fc12ff02d68 in __sleep sleep.c:55 #6 0x55788c63196b in test__PERF_RECORD perf-record.c:0 #7 0x55788c620fb0 in run_test_child builtin-test.c:0 #8 0x55788c5bd18d in start_command run-command.c:127 #9 0x55788c621ef3 in __cmd_test builtin-test.c:0 #10 0x55788c6225bf in cmd_test ??:0 #11 0x55788c5afbd0 in run_builtin perf.c:0 #12 0x55788c5afeeb in handle_internal_command perf.c:0 #13 0x55788c52b383 in main ??:0 #14 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74 #15 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128 #16 0x55788c52b9d1 in _start ??:0 ---- unexpected signal (2) ---- #0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0 #1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0 #2 0x7fc12fea3a14 in pthread_sigmask@GLIBC_2.2.5 pthread_sigmask.c:45 #3 0x7fc12fe49fd9 in __GI___sigprocmask sigprocmask.c:26 #4 0x7fc12ff2601b in __longjmp_chk longjmp.c:36 #5 0x55788c6210c0 in print_test_result.isra.0 builtin-test.c:0 #6 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0 #7 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64 #8 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72 #9 0x7fc12fef1393 in __nanosleep nanosleep.c:26 #10 0x7fc12ff02d68 in __sleep sleep.c:55 #11 0x55788c63196b in test__PERF_RECORD perf-record.c:0 #12 0x55788c620fb0 in run_test_child builtin-test.c:0 #13 0x55788c5bd18d in start_command run-command.c:127 #14 0x55788c621ef3 in __cmd_test builtin-test.c:0 #15 0x55788c6225bf in cmd_test ??:0 #16 0x55788c5afbd0 in run_builtin perf.c:0 #17 0x55788c5afeeb in handle_internal_command perf.c:0 #18 0x55788c52b383 in main ??:0 #19 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74 #20 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128 #21 0x55788c52b9d1 in _start ??:0 7: PERF_RECORD_* events & perf_sample fields : Skip (permissions) ``` Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250624210500.2121303-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-24perf tools: Remove excess variable declarationsBhaskar Chowdhury
I thought array declaration might be done in the same line as assigning the value to it. Hence, getting rid of extra steps of reiterating the array name. Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Link: https://lore.kernel.org/r/20250611100256.31089-1-unixbhaskar@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-24perf test: Replace grep perl regexp with awkChun-Tse Shao
perl is not universal on all machines and should be replaced with awk, which is much more common. Before: $ perf test "probe libc's inet_pton & backtrace it with ping" -v --- start --- test child forked, pid 145431 grep: Perl matching not supported in a --disable-perl-regexp build FAIL: could not add event ---- end(-1) ---- 121: probe libc's inet_pton & backtrace it with ping : FAILED! After: $ perf test "probe libc's inet_pton & backtrace it with ping" -v 121: probe libc's inet_pton & backtrace it with ping : Ok Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Chun-Tse Shao <ctshao@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250620174034.819894-1-ctshao@google.com [ fold James' suggestion not to escape _ in the event pattern. ] Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-24libperf evsel: Add missed puts and assertsIan Rogers
A missed evsel__close before evsel__delete was the source of leaking perf events due to a hybrid test. Add asserts in debug builds so that this shouldn't happen in the future. Add puts missing on the cpu map and thread maps. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-24perf trace: Add missed freeing of ordered events and threadIan Rogers
Caught by leak sanitizer running "perf trace BTF general tests". Make the ordered_events initialization unconditional and early so that trace__exit cleanup is simple - ordered_events__init doesn't allocate and just sets up 4 values and inits 3 list heads. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250617223356.2752099-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-24perf script: Add -e option to flamegraph scriptTianyou Li
When processing the perf data file generated with multiple events, the flamegraph script will count all the events regardless of different event names. This patch tries to add a -e option to specify the event name that the flamegraph will be generated accordingly. If the -e option omitted, the behavior remains unchanged. Signed-off-by: Tianyou Li <tianyou.li@intel.com> Reviewed-by: Pan Deng <pan.deng@intel.com> Reviewed-by: Zhiguo Zhou <zhiguo.zhou@intel.com> Reviewed-by: Wangyang Guo <wangyang.guo@intel.com> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Link: https://lore.kernel.org/r/20250610040536.2390060-2-tianyou.li@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-24perf script: Handle -i option for perf script flamegraphTianyou Li
If specify the perf data file with -i option, the script will try to read the header information regardless of the file name specified, instead it will try to access the perf.data. This simple patch use the file name from -i option for command perf report --header-only to read the header. Signed-off-by: Tianyou Li <tianyou.li@intel.com> Reviewed-by: Pan Deng <pan.deng@intel.com> Reviewed-by: Zhiguo Zhou <zhiguo.zhou@intel.com> Reviewed-by: Wangyang Guo <wangyang.guo@intel.com> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Link: https://lore.kernel.org/r/20250610040536.2390060-1-tianyou.li@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-24perf trace: Show zero value in STRARRAYNamhyung Kim
The STRARRAY macro is to print values in a pre-defined array. But sometimes it hides the value because it's 0. The value of 0 can have a meaning in this case so set 'show_zero' field. For example, it can show CREATE_MAP cmd in the bpf syscall. Acked-by: Howard Chu <howardchu95@gmail.com> Link: https://lore.kernel.org/r/20250502204056.973977-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-24perf unwind-libdw: skip non-regular filesSergei Trofimovich
Without the change `perf `hangs up on charaster devices. On my system it's enough to run system-wide sampler for a few seconds to get the hangup: $ perf record -a -g --call-graph=dwarf $ perf report # hung `strace` shows that hangup happens on reading on a character device `/dev/dri/renderD128` $ strace -y -f -p 2780484 strace: Process 2780484 attached pread64(101</dev/dri/renderD128>, strace: Process 2780484 detached It's call trace descends into `elfutils`: $ gdb -p 2780484 (gdb) bt #0 0x00007f5e508f04b7 in __libc_pread64 (fd=101, buf=0x7fff9df7edb0, count=0, offset=0) at ../sysdeps/unix/sysv/linux/pread64.c:25 #1 0x00007f5e52b79515 in read_file () from /<<NIX>>/elfutils-0.192/lib/libelf.so.1 #2 0x00007f5e52b25666 in libdw_open_elf () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1 #3 0x00007f5e52b25907 in __libdw_open_file () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1 #4 0x00007f5e52b120a9 in dwfl_report_elf@@ELFUTILS_0.156 () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1 #5 0x000000000068bf20 in __report_module (al=al@entry=0x7fff9df80010, ip=ip@entry=139803237033216, ui=ui@entry=0x5369b5e0) at util/dso.h:537 #6 0x000000000068c3d1 in report_module (ip=139803237033216, ui=0x5369b5e0) at util/unwind-libdw.c:114 #7 frame_callback (state=0x535aef10, arg=0x5369b5e0) at util/unwind-libdw.c:242 #8 0x00007f5e52b261d3 in dwfl_thread_getframes () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1 #9 0x00007f5e52b25bdb in get_one_thread_cb () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1 #10 0x00007f5e52b25faa in dwfl_getthreads () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1 #11 0x00007f5e52b26514 in dwfl_getthread_frames () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1 #12 0x000000000068c6ce in unwind__get_entries (cb=cb@entry=0x5d4620 <unwind_entry>, arg=arg@entry=0x10cd5fa0, thread=thread@entry=0x1076a290, data=data@entry=0x7fff9df80540, max_stack=max_stack@entry=127, best_effort=best_effort@entry=false) at util/thread.h:152 #13 0x00000000005dae95 in thread__resolve_callchain_unwind (evsel=0x106006d0, thread=0x1076a290, cursor=0x10cd5fa0, sample=0x7fff9df80540, max_stack=127, symbols=true) at util/machine.c:2939 #14 thread__resolve_callchain_unwind (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, sample=0x7fff9df80540, max_stack=127, symbols=true) at util/machine.c:2920 #15 __thread__resolve_callchain (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, evsel@entry=0x7fff9df80440, sample=0x7fff9df80540, parent=parent@entry=0x7fff9df804a0, root_al=root_al@entry=0x7fff9df80440, max_stack=127, symbols=true) at util/machine.c:2970 #16 0x00000000005d0cb2 in thread__resolve_callchain (thread=<optimized out>, cursor=<optimized out>, evsel=0x7fff9df80440, sample=<optimized out>, parent=0x7fff9df804a0, root_al=0x7fff9df80440, max_stack=127) at util/machine.h:198 #17 sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fff9df804a0, evsel=evsel@entry=0x106006d0, al=al@entry=0x7fff9df80440, max_stack=max_stack@entry=127) at util/callchain.c:1127 #18 0x0000000000617e08 in hist_entry_iter__add (iter=iter@entry=0x7fff9df80480, al=al@entry=0x7fff9df80440, max_stack_depth=127, arg=arg@entry=0x7fff9df81ae0) at util/hist.c:1255 #19 0x000000000045d2d0 in process_sample_event (tool=0x7fff9df81ae0, event=<optimized out>, sample=0x7fff9df80540, evsel=0x106006d0, machine=<optimized out>) at builtin-report.c:334 #20 0x00000000005e3bb1 in perf_session__deliver_event (session=0x105ff2c0, event=0x7f5c7d735ca0, tool=0x7fff9df81ae0, file_offset=2914716832, file_path=0x105ffbf0 "perf.data") at util/session.c:1367 #21 0x00000000005e8d93 in do_flush (oe=0x105ffa50, show_progress=false) at util/ordered-events.c:245 #22 __ordered_events__flush (oe=0x105ffa50, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:324 #23 0x00000000005e1f64 in perf_session__process_user_event (session=0x105ff2c0, event=0x7f5c7d752b18, file_offset=2914835224, file_path=0x105ffbf0 "perf.data") at util/session.c:1419 #24 0x00000000005e47c7 in reader__read_event (rd=rd@entry=0x7fff9df81260, session=session@entry=0x105ff2c0, --Type <RET> for more, q to quit, c to continue without paging-- quit prog=prog@entry=0x7fff9df81220) at util/session.c:2132 #25 0x00000000005e4b37 in reader__process_events (rd=0x7fff9df81260, session=0x105ff2c0, prog=0x7fff9df81220) at util/session.c:2181 #26 __perf_session__process_events (session=0x105ff2c0) at util/session.c:2226 #27 perf_session__process_events (session=session@entry=0x105ff2c0) at util/session.c:2390 #28 0x0000000000460add in __cmd_report (rep=0x7fff9df81ae0) at builtin-report.c:1076 #29 cmd_report (argc=<optimized out>, argv=<optimized out>) at builtin-report.c:1827 #30 0x00000000004c5a40 in run_builtin (p=p@entry=0xd8f7f8 <commands+312>, argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0) at perf.c:351 #31 0x00000000004c5d63 in handle_internal_command (argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0) at perf.c:404 #32 0x0000000000442de3 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:448 #33 main (argc=<optimized out>, argv=0x7fff9df844b0) at perf.c:556 The hangup happens because nothing in` perf` or `elfutils` checks if a mapped file is easily readable. The change conservatively skips all non-regular files. Signed-off-by: Sergei Trofimovich <slyich@gmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250505174419.2814857-1-slyich@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-24perf build: The bfd features are opt-in, stop testing for them by defaultArnaldo Carvalho de Melo
These are leftovers noticed while updating a build container. We don't need those so that test-all.c can build and thus speed up the feature detection. Test for those features only if the user asks for BUILD_NONDISTRO=1 to build with libbfd. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250620212435.93846-4-acme@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-24perf build: Add the libpfm devel fedora package name to the hintArnaldo Carvalho de Melo
Just to follow the pattern with other devel packages. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250620212435.93846-3-acme@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-24perf build: Suggest java-latest-openjdk-devel instead of old 1.8.0 oneArnaldo Carvalho de Melo
Just tidying up the suggestion to pick the latest and not some specific version. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250620212435.93846-2-acme@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-24perf srcline: Lower verbosity on addr2line debug messagesIan Rogers
Lower non-error debug messages to verbose 3 or larger. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250623161930.1421216-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-22Merge tag 'v6.16-rc3' into perf-tools-nextNamhyung Kim
To get the fixes in libbpf and perf tools. Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-22Merge tag 'locking_urgent_for_v6.16_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Borislav Petkov: - Make sure the switch to the global hash is requested always under a lock so that two threads requesting that simultaneously cannot get to inconsistent state - Reject negative NUMA nodes earlier in the futex NUMA interface handling code - Selftests fixes * tag 'locking_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Verify under the lock if hash can be replaced futex: Handle invalid node numbers supplied by user selftests/futex: Set the home_node in futex_numa_mpol selftests/futex: getopt() requires int as return value.
2025-06-22Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "ARM: - Fix another set of FP/SIMD/SVE bugs affecting NV, and plugging some missing synchronisation - A small fix for the irqbypass hook fixes, tightening the check and ensuring that we only deal with MSI for both the old and the new route entry - Rework the way the shadow LRs are addressed in a nesting configuration, plugging an embarrassing bug as well as simplifying the whole process - Add yet another fix for the dreaded arch_timer_edge_cases selftest RISC-V: - Fix the size parameter check in SBI SFENCE calls - Don't treat SBI HFENCE calls as NOPs x86 TDX: - Complete API for handling complex TDVMCALLs in userspace. This was delayed because the spec lacked a way for userspace to deny supporting these calls; the new exit code is now approved" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: TDX: Exit to userspace for GetTdVmCallInfo KVM: TDX: Handle TDG.VP.VMCALL<GetQuote> KVM: TDX: Add new TDVMCALL status code for unsupported subfuncs KVM: arm64: VHE: Centralize ISBs when returning to host KVM: arm64: Remove cpacr_clear_set() KVM: arm64: Remove ad-hoc CPTR manipulation from kvm_hyp_handle_fpsimd() KVM: arm64: Remove ad-hoc CPTR manipulation from fpsimd_sve_sync() KVM: arm64: Reorganise CPTR trap manipulation KVM: arm64: VHE: Synchronize CPTR trap deactivation KVM: arm64: VHE: Synchronize restore of host debug registers KVM: arm64: selftests: Close the GIC FD in arch_timer_edge_cases KVM: arm64: Explicitly treat routing entry type changes as changes KVM: arm64: nv: Fix tracking of shadow list registers RISC-V: KVM: Don't treat SBI HFENCE calls as NOPs RISC-V: KVM: Fix the size parameter check in SBI SFENCE calls
2025-06-21Merge tag 'perf-tools-fixes-for-v6.16-1-2025-06-20' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix some file descriptor leaks that stand out with recent changes to 'perf list' - Fix prctl include to fix building 'perf bench futex' hash with musl libc - Restrict 'perf test' uniquifying entry to machines with 'uncore_imc' PMUs - Document new output fields (op, cache, mem, dtlb, snoop) used with 'perf mem' - Synchronize kernel header copies * tag 'perf-tools-fixes-for-v6.16-1-2025-06-20' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: tools headers x86 cpufeatures: Sync with the kernel sources perf bench futex: Fix prctl include in musl libc perf test: Directory file descriptor leak perf evsel: Missed close() when probing hybrid core PMUs tools headers: Synchronize linux/bits.h with the kernel sources tools arch amd ibs: Sync ibs.h with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources tools headers: Syncronize linux/build_bug.h with the kernel sources tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench' tools headers UAPI: Sync linux/kvm.h with the kernel sources tools headers UAPI: Sync the drm/drm.h with the kernel sources perf beauty: Update copy of linux/socket.h with the kernel sources tools headers UAPI: Sync kvm header with the kernel sources tools headers x86 svm: Sync svm headers with the kernel sources tools headers UAPI: Sync KVM's vmx.h header with the kernel sources tools kvm headers arm64: Update KVM header from the kernel sources tools headers UAPI: Sync linux/prctl.h with the kernel sources to pick FUTEX knob perf mem: Document new output fields (op, cache, mem, dtlb, snoop) tools headers: Update the fs headers with the kernel sources perf test: Restrict uniquifying test to machines with 'uncore_imc'
2025-06-20perf test: add test for BPF metadata collectionBlake Jones
This is an end-to-end test for the PERF_RECORD_BPF_METADATA support. It adds a new "bpf_metadata_perf_version" variable to perf's BPF programs, so that when they are loaded, there will be at least one BPF program with some metadata to parse. The test invokes "perf record" in a way that loads one of those BPF programs, and then sifts through the output to find its BPF metadata. Signed-off-by: Blake Jones <blakejones@google.com> Link: https://lore.kernel.org/r/20250612194939.162730-6-blakejones@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf tools: display the new PERF_RECORD_BPF_METADATA eventBlake Jones
Here's some example "perf script -D" output for the new event type. The ": unhandled!" message is from tool.c, analogous to other behavior there. I've elided some rows with all NUL characters for brevity, and I wrapped one of the >75-column lines to fit in the commit guidelines. 0x50fc8@perf.data [0x260]: event: 84 . . ... raw event: size 608 bytes . 0000: 54 00 00 00 00 00 60 02 62 70 66 5f 70 72 6f 67 T.....`.bpf_prog . 0010: 5f 31 65 30 61 32 65 33 36 36 65 35 36 66 31 61 _1e0a2e366e56f1a . 0020: 32 5f 70 65 72 66 5f 73 61 6d 70 6c 65 5f 66 69 2_perf_sample_fi . 0030: 6c 74 65 72 00 00 00 00 00 00 00 00 00 00 00 00 lter............ . 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [...] . 0110: 74 65 73 74 5f 76 61 6c 75 65 00 00 00 00 00 00 test_value...... . 0120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [...] . 0150: 34 32 00 00 00 00 00 00 00 00 00 00 00 00 00 00 42.............. . 0160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [...] 0 0x50fc8 [0x260]: PERF_RECORD_BPF_METADATA \ prog bpf_prog_1e0a2e366e56f1a2_perf_sample_filter entry 0: test_value = 42 : unhandled! Signed-off-by: Blake Jones <blakejones@google.com> Link: https://lore.kernel.org/r/20250612194939.162730-5-blakejones@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf record: collect BPF metadata from new programsBlake Jones
This collects metadata for any BPF programs that were loaded during a "perf record" run, and emits it at the end of the run. Signed-off-by: Blake Jones <blakejones@google.com> Link: https://lore.kernel.org/r/20250612194939.162730-4-blakejones@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf record: collect BPF metadata from existing BPF programsBlake Jones
Look for .rodata maps, find ones with 'bpf_metadata_' variables, extract their values as strings, and create a new PERF_RECORD_BPF_METADATA synthetic event using that data. The code gets invoked from the existing routine perf_event__synthesize_one_bpf_prog(). For example, a BPF program with the following variables: const char bpf_metadata_version[] SEC(".rodata") = "3.14159"; int bpf_metadata_value[] SEC(".rodata") = 42; would generate a PERF_RECORD_BPF_METADATA record with: .prog_name = <BPF program name, e.g. "bpf_prog_a1b2c3_foo"> .nr_entries = 2 .entries[0].key = "version" .entries[0].value = "3.14159" .entries[1].key = "value" .entries[1].value = "42" Each of the BPF programs and subprograms that share those variables would get a distinct PERF_RECORD_BPF_METADATA record, with the ".prog_name" showing the name of each program or subprogram. The prog_name is deliberately the same as the ".name" field in the corresponding PERF_RECORD_KSYMBOL record. This code only gets invoked if support for displaying BTF char arrays as strings is detected. Signed-off-by: Blake Jones <blakejones@google.com> Link: https://lore.kernel.org/r/20250612194939.162730-3-blakejones@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf build: detect support for libbpf's emit_strings optionBlake Jones
This creates a config option that detects libbpf's ability to display character arrays as strings, which was just added to the BPF tree (https://git.kernel.org/bpf/bpf-next/c/87c9c79a02b4). To test this change, I built perf (from later in this patch set) with: - static libbpf (default, using source from kernel tree) - dynamic libbpf (LIBBPF_DYNAMIC=1 LIBBPF_INCLUDE=/usr/local/include) For both the static and dynamic versions, I used headers with and without the ".emit_strings" option. I verified that of the four resulting binaries, the two with ".emit_strings" would successfully record BPF_METADATA events, and the two without wouldn't. All four binaries would successfully display BPF_METADATA events, because the relevant bit of libbpf code is only used during "perf record". Signed-off-by: Blake Jones <blakejones@google.com> Link: https://lore.kernel.org/r/20250612194939.162730-2-blakejones@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf header: remove unecessary core id testAnubhav Shelat
It is possible for systems to have a greater socket id number than the number of cpus present on a machine, so this test is obselete and should be removed. Signed-off-by: Anubhav Shelat <ashelat@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250618142921.4053400-2-ashelat@redhat.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf test: Add header shell testIan Rogers
Add a shell test that sanity checks perf data and pipe mode produce expected header fields. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250619002555.100896-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf parse-events: Set default GH modifier properlyNamhyung Kim
Commit 7b100989b4f6bce7 ("perf evlist: Remove __evlist__add_default") changed to use "cycles:P" as a default event. But the problem is it cannot set other default modifiers correctly. perf kvm needs to set attr.exclude_host by default but it didn't work because of the logic in the parse_events__modifier_list(). Also the exclude_GH_default was applied only if ":u" modifier was specified - which is strange. Move it out after handling the ":GH" and check perf_host and perf_guest properly. Before: $ ./perf kvm record -vv true |& grep exclude (nothing) But specifying an event (without a modifier) works: $ ./perf kvm record -vv -e cycles true |& grep exclude exclude_host 1 After: It now works for the both cases: $ ./perf kvm record -vv true |& grep exclude exclude_host 1 $ ./perf kvm record -vv -e cycles true |& grep exclude exclude_host 1 Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250606225431.2109754-1-namhyung@kernel.org Fixes: 35c8d21371e9b342 ("perf tools: Don't set attr.exclude_guest by default") Signed-off-by: Namhyung Kim <namhyung@kernel.org>