summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)Author
2025-06-20perf record: collect BPF metadata from new programsBlake Jones
This collects metadata for any BPF programs that were loaded during a "perf record" run, and emits it at the end of the run. Signed-off-by: Blake Jones <blakejones@google.com> Link: https://lore.kernel.org/r/20250612194939.162730-4-blakejones@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf record: collect BPF metadata from existing BPF programsBlake Jones
Look for .rodata maps, find ones with 'bpf_metadata_' variables, extract their values as strings, and create a new PERF_RECORD_BPF_METADATA synthetic event using that data. The code gets invoked from the existing routine perf_event__synthesize_one_bpf_prog(). For example, a BPF program with the following variables: const char bpf_metadata_version[] SEC(".rodata") = "3.14159"; int bpf_metadata_value[] SEC(".rodata") = 42; would generate a PERF_RECORD_BPF_METADATA record with: .prog_name = <BPF program name, e.g. "bpf_prog_a1b2c3_foo"> .nr_entries = 2 .entries[0].key = "version" .entries[0].value = "3.14159" .entries[1].key = "value" .entries[1].value = "42" Each of the BPF programs and subprograms that share those variables would get a distinct PERF_RECORD_BPF_METADATA record, with the ".prog_name" showing the name of each program or subprogram. The prog_name is deliberately the same as the ".name" field in the corresponding PERF_RECORD_KSYMBOL record. This code only gets invoked if support for displaying BTF char arrays as strings is detected. Signed-off-by: Blake Jones <blakejones@google.com> Link: https://lore.kernel.org/r/20250612194939.162730-3-blakejones@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf build: detect support for libbpf's emit_strings optionBlake Jones
This creates a config option that detects libbpf's ability to display character arrays as strings, which was just added to the BPF tree (https://git.kernel.org/bpf/bpf-next/c/87c9c79a02b4). To test this change, I built perf (from later in this patch set) with: - static libbpf (default, using source from kernel tree) - dynamic libbpf (LIBBPF_DYNAMIC=1 LIBBPF_INCLUDE=/usr/local/include) For both the static and dynamic versions, I used headers with and without the ".emit_strings" option. I verified that of the four resulting binaries, the two with ".emit_strings" would successfully record BPF_METADATA events, and the two without wouldn't. All four binaries would successfully display BPF_METADATA events, because the relevant bit of libbpf code is only used during "perf record". Signed-off-by: Blake Jones <blakejones@google.com> Link: https://lore.kernel.org/r/20250612194939.162730-2-blakejones@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf header: remove unecessary core id testAnubhav Shelat
It is possible for systems to have a greater socket id number than the number of cpus present on a machine, so this test is obselete and should be removed. Signed-off-by: Anubhav Shelat <ashelat@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250618142921.4053400-2-ashelat@redhat.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf test: Add header shell testIan Rogers
Add a shell test that sanity checks perf data and pipe mode produce expected header fields. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250619002555.100896-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf parse-events: Set default GH modifier properlyNamhyung Kim
Commit 7b100989b4f6bce7 ("perf evlist: Remove __evlist__add_default") changed to use "cycles:P" as a default event. But the problem is it cannot set other default modifiers correctly. perf kvm needs to set attr.exclude_host by default but it didn't work because of the logic in the parse_events__modifier_list(). Also the exclude_GH_default was applied only if ":u" modifier was specified - which is strange. Move it out after handling the ":GH" and check perf_host and perf_guest properly. Before: $ ./perf kvm record -vv true |& grep exclude (nothing) But specifying an event (without a modifier) works: $ ./perf kvm record -vv -e cycles true |& grep exclude exclude_host 1 After: It now works for the both cases: $ ./perf kvm record -vv true |& grep exclude exclude_host 1 $ ./perf kvm record -vv -e cycles true |& grep exclude exclude_host 1 Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250606225431.2109754-1-namhyung@kernel.org Fixes: 35c8d21371e9b342 ("perf tools: Don't set attr.exclude_guest by default") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf test: Expand user space event reading (rdpmc) testsIan Rogers
Test that disabling rdpmc support via /sys/bus/event_source/cpu*/rdpmc disables reading in the mmap (libperf read support will fallback to using a system call). Test all hybrid PMUs support rdpmc. Ensure hybrid PMUs use the correct CPU to rdpmc the correct event. Previously the test would open cycles or instructions with no extended type then rdpmc it on whatever CPU. This could fail/skip due to which CPU the test was scheduled upon. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20250614004528.1652860-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf vendor events arm64: Update FUJITSU-MONAKA pmu eventKotaro, Tokai
Update pmu events for FUJITSU-MONAKA. And, also updated common-and-microarch.json. FUJITSU-MONAKA PMU Events Specification v1.1 and Errata v1.0 URL: https://github.com/fujitsu/FUJITSU-MONAKA Arm Architecture Reference Version L.b URL: https://developer.arm.com/documentation/ddi0487/lb/?lang=en Signed-off-by: Kotaro, Tokai <fj0635gf@aa.jp.fujitsu.com> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250618063618.1244363-1-fj0635gf@aa.jp.fujitsu.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-17perf bench futex: Fix prctl include in musl libcArnaldo Carvalho de Melo
Namhyung Kim reported: I've updated the perf-tools-next to v6.16-rc1 and found a build error like below on alpine linux 3.18. In file included from bench/futex.c:6: /usr/include/sys/prctl.h:88:8: error: redefinition of 'struct prctl_mm_map' 88 | struct prctl_mm_map { | ^~~~~~~~~~~~ In file included from bench/futex.c:5: /linux/tools/include/uapi/linux/prctl.h:134:8: note: originally defined here 134 | struct prctl_mm_map { | ^~~~~~~~~~~~ make[4]: *** [/linux/tools/build/Makefile.build:86: /build/bench/futex.o] Error 1 git bisect says it's the first commit introduced the failure. So both /usr/include/sys/prctl.h and /linux/tools/include/uapi/linux/prctl.h provide struct prctl_mm_map but their include guard must be different. /usr/include/sys/prctl.h provided by glibc contains the prctl() declaration. It includes also linux/prctl.h. The /usr/include/sys/prctl.h on alpine linux is different. This is probably coming from musl. It contains the PR_* definition and the prctl() declaration. So it clashes here because now the one struct is available twice. The man page for prctl(2) says: | #include <linux/prctl.h> /* Definition of PR_* constants */ | #include <sys/prctl.h> so musl doesn't follow this. So don't include linux/prctl.h explicitely and add some new defines needed if they aren't available. Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reported-by: Namhyung Kim <namhyung@kernel.org> Closes: https://lore.kernel.org/r/20250611092542.F4ooE2FL@linutronix.de Link: https://www.openwall.com/lists/musl/2025/06/12/11 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-06-17perf test: Directory file descriptor leakIan Rogers
Add missed close when iterating over the script directories. Fixes: f3295f5b067d3c26 ("perf tests: Use scandirat for shell script finding") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/20250614004108.1650988-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-06-17perf evsel: Missed close() when probing hybrid core PMUsIan Rogers
Add missing close() to avoid leaking perf events. In past perfs this mattered little as the function was just used by 'perf list'. As the function is now used to detect hybrid PMUs leaking the perf event is somewhat more painful. Fixes: b41f1cec91c37eee ("perf list: Skip unsupported events") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/20250614004108.1650988-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-06-17tools arch amd ibs: Sync ibs.h with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes from: 861c6b1185fbb2e3 ("x86/platform/amd: Add standard header guards to <asm/amd/ibs.h>") A small change to tools/perf/check-headers.sh was made to cope with the move of this header done in: 3846389c03a85188 ("x86/platform/amd: Move the <asm/amd-ibs.h> header to <asm/amd/ibs.h>") That don't result in any changes in the tools, just address this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/asm/amd/ibs.h arch/x86/include/asm/amd/ibs.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/aEtCi0pup5FEwnzn@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-06-16tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench'Arnaldo Carvalho de Melo
Also add SYM_PIC_ALIAS() to tools/perf/util/include/linux/linkage.h. This is to get the changes from: 419cbaf6a56a6e4b ("x86/boot: Add a bunch of PIC aliases") That addresses these perf tools build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/aEry7L3fibwIG5au@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-06-16perf beauty: Update copy of linux/socket.h with the kernel sourcesArnaldo Carvalho de Melo
To pick the changes in: b1e904999542ad67 ("net: pass const to msg_data_left()") That don't result in any changes in the tables generated from that header. This silences this perf build warning: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h Please see tools/include/uapi/README for details. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Breno Leitao <leitao@debian.org> Cc: Ian Rogers <irogers@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/aErrK24XLUILFH_P@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-06-16tools headers UAPI: Sync linux/prctl.h with the kernel sources to pick FUTEX ↵Arnaldo Carvalho de Melo
knob To pick the changes in: 63e8595c060a1fef ("futex: Allow to make the private hash immutable") 80367ad01d93ac78 ("futex: Add basic infrastructure for local task local hash") That adds a FUTEX knob: $ tools/perf/trace/beauty/prctl_option.sh > before $ cp include/uapi/linux/prctl.h tools/perf/trace/beauty/include/uapi/linux/prctl.h $ tools/perf/trace/beauty/prctl_option.sh > after $ diff -u before after --- before 2025-06-09 14:50:45.162579336 -0300 +++ after 2025-06-09 14:50:52.797660024 -0300 @@ -72,6 +72,7 @@ [75] = "SET_SHADOW_STACK_STATUS", [76] = "LOCK_SHADOW_STACK_STATUS", [77] = "TIMER_CREATE_RESTORE_IDS", + [78] = "FUTEX_HASH", }; static const char *prctl_set_mm_options[] = { [1] = "START_CODE", $ That now will be used to decode the syscall option and also to compose filters, for instance: [root@five ~]# perf trace -e syscalls:sys_enter_prctl --filter option==SET_NAME 0.000 Isolated Servi/3474327 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23f13b7aee) 0.032 DOM Worker/3474327 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23deb25670) 7.920 :3474328/3474328 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24fbb10) 7.935 StreamT~s #374/3474328 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24fb970) 8.400 Isolated Servi/3474329 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24bab10) 8.418 StreamT~s #374/3474329 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24ba970) ^C[root@five ~]# This addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/prctl.h include/uapi/linux/prctl.h Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/aEiYOtKkrVDT03hZ@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-06-16perf mem: Document new output fields (op, cache, mem, dtlb, snoop)Namhyung Kim
Update the documentation of the new fields with examples and caveats. Also update the related documentation for AMD IBS. Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250610005742.2173050-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-06-16tools headers: Update the fs headers with the kernel sourcesArnaldo Carvalho de Melo
To pick up changes from: 5d894321c49e6137 ("fs: add atomic write unit max opt to statx") a516403787e08119 ("fs/proc: extend the PAGEMAP_SCAN ioctl to report guard regions") c07d3aede2b26830 ("fscrypt: add support for hardware-wrapped keys") These are used to beautify fs syscall arguments, albeit the changes in this update are not affecting those beautifiers. This addresses these tools/ build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/fscrypt.h include/uapi/linux/fscrypt.h diff -u tools/include/uapi/linux/stat.h include/uapi/linux/stat.h diff -u tools/perf/trace/beauty/include/uapi/linux/fs.h include/uapi/linux/fs.h diff -u tools/perf/trace/beauty/include/uapi/linux/stat.h include/uapi/linux/stat.h Please see tools/include/uapi/README for details (it's in the first patch of this series). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Eric Biggers <ebiggers@google.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Liam R. Howlett <liam.howlett@oracle.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/aEce1keWdO-vGeqe@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-06-16perf test: Restrict uniquifying test to machines with 'uncore_imc'Chun-Tse Shao
The test would fail if target machine does not have 'uncore_imc' devices. Since event uniquifying behavior is similar among different architectures, we are restricting the test to only run on machines with `uncore_imc` devices. Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Chun-Tse Shao <ctshao@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250521224513.1104129-1-ctshao@google.com [ Skip the test, i.e. return 2, instead of returning 0 as if the test had succeed ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-06-11perf thread: Ensure comm_lock held for comm_listIan Rogers
Add thread safety annotations for comm_list and add locking for two instances where the list is accessed without the lock held (in contradiction to ____thread__set_comm()). Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250529192206.971199-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf: Fix libjvmti.c sign compare errorYuzhuo Jing
Fix the compile errors when compiling with -Werror=sign-compare. This is a follow-up patch to a previous patch series for a separate issue. Link: https://lore.kernel.org/lkml/aC9lXhPFcs5fkHWH@x1/ Signed-off-by: Yuzhuo Jing <yuzhuo@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604173632.2362759-1-yuzhuo@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf script: perf script tests fails with segfaultAditya Bodkhe
pert script tests fails with segmentation fault as below: 92: perf script tests: --- start --- test child forked, pid 103769 DB test [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB /tmp/perf-test-script.7rbftEpOzX/perf.data (9 samples) ] /usr/libexec/perf-core/tests/shell/script.sh: line 35: 103780 Segmentation fault (core dumped) perf script -i "${perfdatafile}" -s "${db_test}" --- Cleaning up --- ---- end(-1) ---- 92: perf script tests : FAILED! Backtrace pointed to : #0 0x0000000010247dd0 in maps.machine () #1 0x00000000101d178c in db_export.sample () #2 0x00000000103412c8 in python_process_event () #3 0x000000001004eb28 in process_sample_event () #4 0x000000001024fcd0 in machines.deliver_event () #5 0x000000001025005c in perf_session.deliver_event () #6 0x00000000102568b0 in __ordered_events__flush.part.0 () #7 0x0000000010251618 in perf_session.process_events () #8 0x0000000010053620 in cmd_script () #9 0x00000000100b5a28 in run_builtin () #10 0x00000000100b5f94 in handle_internal_command () #11 0x0000000010011114 in main () Further investigation reveals that this occurs in the `perf script tests`, because it uses `db_test.py` script. This script sets `perf_db_export_mode = True`. With `perf_db_export_mode` enabled, if a sample originates from a hypervisor, perf doesn't set maps for "[H]" sample in the code. Consequently, `al->maps` remains NULL when `maps__machine(al->maps)` is called from `db_export__sample`. As al->maps can be NULL in case of Hypervisor samples , use thread->maps because even for Hypervisor sample, machine should exist. If we don't have machine for some reason, return -1 to avoid segmentation fault. Reported-by: Disha Goel <disgoel@linux.ibm.com> Signed-off-by: Aditya Bodkhe <aditya.b1@linux.ibm.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Disha Goel <disgoel@linux.ibm.com> Link: https://lore.kernel.org/r/20250429065132.36839-1-adityab1@linux.ibm.com Suggested-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf test trace: Change the regex pattern in the struct testHoward Chu
Ian mentioned a reliably occurred failure in the trace_btf_general test where he obtained trace output of: sleep/279619 clock_nanosleep(0, 0, {1,1,}, 0x7ffcd47b6450) = 0 But the regex pattern used for verification is "^sleep/[0-9]+ clock_nanosleep\(0, 0, \{1,\}, ..." This lead to a mismatch. The reason is, different sleep commands use different timespec data to call clock_nanosleep, on my machine, the value of tv_nsec is 0. ~~~ $ sudo /tmp/perf/perf trace -e clock_nanosleep -- sleep 1 0.000 (1000.196 ms): sleep/54261 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7ffe13529550) = 0 ~~~ While Ian had this trace log: ~~~ $ sudo /tmp/perf/perf trace -e clock_nanosleep -- sleep 1 0.000 (1000.208 ms): sleep/1710732 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 1 }, rmtp: 0x7ffc091f4090) = 0 ~~~ Because sleep's behavior of setting 'tv_nsec' is not certain, and tv_sec is most definitely 1, this patch relaxes the key regex pattern to '\{1,.*\}' for a better chance of matching. Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250528191148.89118-7-howardchu95@gmail.com Reported-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf test trace: Use --sort-events in BTF general testsHoward Chu
Without the '--sort-events' flag, perf trace doesn't receive and process events based on their arrival time, thus PERF_RECORD_COMM event that assigns the correct comm to a PID, may be delivered and processed after regular samples, causing trace outputs not having a 'comm', e.g. 'mv', instead, having the default PID placeholder, e.g. ':14514'. Hopefully this answers Namhyung's question in [1]. You can simply justify the statement with this diff: [2]. Now, simply run this command multiple times: $ touch /tmp/file1 && sudo /tmp/perf trace -e renameat* -- mv /tmp/file1 /tmp/file2 && rm -f /tmp/file2 And you should see two types of results: $ touch /tmp/file1 && sudo /tmp/perf trace -e renameat* -- mv /tmp/file1 /tmp/file2 && rm -f /tmp/file2 [debug] deliver [debug] machine__process_comm_event [OVERRIDE] old :1221169 new mv str mv [debug] deliver [debug] deliver [debug] deliver [debug] deliver [debug] deliver [debug] deliver [debug] deliver [debug] deliver [debug] deliver [debug] deliver 0.000 ( 0.013 ms): mv/1221169 renameat2(olddfd: CWD, oldname: "/tmp/file1", newdfd: CWD, newname: "/tmp/file2", flags: NOREPLACE) = 0 [debug] deliver $ touch /tmp/file1 && sudo /tmp/perf trace -e renameat* -- mv /tmp/file1 /tmp/file2 && rm -f /tmp/file2 [debug] deliver [debug] deliver [debug] deliver [debug] deliver [debug] deliver [debug] deliver [debug] deliver 0.000 ( 0.014 ms): :1221398/1221398 renameat2(olddfd: CWD, oldname: "/tmp/file1", newdfd: CWD, newname: "/tmp/file2", flags: NOREPLACE) = 0 [debug] deliver [debug] deliver [debug] machine__process_comm_event [OVERRIDE] old :1221398 new mv str mv [debug] deliver [debug] deliver [debug] deliver Anyway, use --sort-events in BTF general tests to avoid :PID, a comm is preferred. [1]: https://lore.kernel.org/linux-perf-users/Z_AeswETE5xLcPT8@google.com/ [2]: https://gist.githubusercontent.com/Sberm/6b72b2a1cf1c62244f1f996481769baf/raw/529667bd74a2e7e1953bbd4be545bf875da8a3e7/unsorted.patch Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250528191148.89118-6-howardchu95@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf test trace: Remove set -e for BTF general testsHoward Chu
Remove set -e and print error messages in BTF general tests. Before: $ sudo /tmp/perf test btf -vv 108: perf trace BTF general tests: 108: perf trace BTF general tests : Running --- start --- test child forked, pid 889299 Checking if vmlinux BTF exists Testing perf trace's string augmentation String augmentation test failed ---- end(-1) ---- 108: perf trace BTF general tests : FAILED! After: $ sudo /tmp/perf test btf -vv 108: perf trace BTF general tests: 108: perf trace BTF general tests : Running --- start --- test child forked, pid 886551 Checking if vmlinux BTF exists Testing perf trace's string augmentation String augmentation test failed, output: :886566/886566 renameat2(CWD, "/tmp/file1_RcMa", CWD, "/tmp/file2_RcMa", NOREPLACE) = 0---- end(-1) ---- 108: perf trace BTF general tests : FAILED! Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250528191148.89118-5-howardchu95@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf test trace: Stop tracing hrtimer_setup event in trace enum testHoward Chu
The event 'timer:hrtimer_setup' is relatively new, for older kernels, perf trace enum tests won't run as the event 'timer:hrtimer_setup' cannot be found. It was originally called 'timer:hrtimer_init', before being renamed in: commit 244132c4e577 ("tracing/timers: Rename the hrtimer_init event to hrtimer_setup") Using timer:hrtimer_start should be enough for current testing, and hopefully 'start' won't be renamed in the future. Before: $ sudo /tmp/perf test enum -vv 107: perf trace enum augmentation tests: 107: perf trace enum augmentation tests : Running --- start --- test child forked, pid 786187 Checking if vmlinux exists Tracing syscall landlock_add_rule Tracing non-syscall tracepoint timer:hrtimer_setup,timer:hrtimer_start [tracepoint failure] Failed to trace timer:hrtimer_setup,timer:hrtimer_start tracepoint, output: event syntax error: 'timer:hrtimer_setup,timer:hrtimer_start' \___ unknown tracepoint Error: File /sys/kernel/tracing//events/timer/hrtimer_setup not found. Hint: Perhaps this kernel misses some CONFIG_ setting to enable this feature?. Run 'perf list' for a list of valid events Usage: perf trace [<options>] [<command>] or: perf trace [<options>] -- <command> [<options>] or: perf trace record [<options>] [<command>] or: perf trace record [<options>] -- <command> [<options>] -e, --event <event> event/syscall selector. use 'perf list' to list available events ---- end(-1) ---- 107: perf trace enum augmentation tests : FAILED! After: $ sudo /tmp/perf test enum -vv 107: perf trace enum augmentation tests: 107: perf trace enum augmentation tests : Running --- start --- test child forked, pid 808547 Checking if vmlinux exists Tracing syscall landlock_add_rule Tracing non-syscall tracepoint timer:hrtimer_start ---- end(0) ---- 107: perf trace enum augmentation tests : Ok Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250528191148.89118-4-howardchu95@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf test trace: Remove set -e and print trace test's error messagesHoward Chu
Currently perf test utilizes the set -e option in shell that exit immediately if a command exits with a non-zero status, this prevents further error handling and introduces ambiguity. This patch removes set -e and prints the error message after invoking perf trace during perf tests. In my case, the command that exits with a non-zero status is perf trace instead of grep, because it can't find the 'timer:hrtimer_setup' tracepoint, see below. Before: $ sudo /tmp/perf test enum -vv 107: perf trace enum augmentation tests: 107: perf trace enum augmentation tests : Running --- start --- test child forked, pid 783533 Checking if vmlinux exists Tracing syscall landlock_add_rule Tracing non-syscall tracepoint syscall ---- end(-1) ---- 107: perf trace enum augmentation tests : FAILED! After: $ sudo /tmp/perf test enum -vv 107: perf trace enum augmentation tests: 107: perf trace enum augmentation tests : Running --- start --- test child forked, pid 851658 Checking if vmlinux exists Tracing syscall landlock_add_rule Tracing non-syscall tracepoint timer:hrtimer_setup,timer:hrtimer_start [tracepoint failure] Failed to trace tracepoint timer:hrtimer_setup,timer:hrtimer_start, output: event syntax error: 'timer:hrtimer_setup,timer:hrtimer_start' \___ unknown tracepoint Error: File /sys/kernel/tracing//events/timer/hrtimer_setup not found. Hint: Perhaps this kernel misses some CONFIG_ setting to enable this feature?. Run 'perf list' for a list of valid events Usage: perf trace [<options>] [<command>] or: perf trace [<options>] -- <command> [<options>] or: perf trace record [<options>] [<command>] or: perf trace record [<options>] -- <command> [<options>] -e, --event <event> event/syscall selector. use 'perf list' to list available events---- end(-1) ---- 107: perf trace enum augmentation tests : FAILED! Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250528191148.89118-3-howardchu95@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf test trace: Use shell's -f flag to check if vmlinux existsHoward Chu
To match the style of the existing codebase, no functional changes were applied. Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250528191148.89118-2-howardchu95@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf trace: Remove --map-dump documentationHoward Chu
The --map-dump option was removed in 5e6da6be3082 ("perf trace: Migrate BPF augmentation to use a skeleton"), this patch removes its remaining documentation. Fixes: 5e6da6be3082 ("perf trace: Migrate BPF augmentation to use a skeleton") Signed-off-by: Howard Chu <howardchu95@gmail.com> Link: https://lore.kernel.org/r/20250601173252.717780-1-howardchu95@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf thread_map: Remove uid optionsIan Rogers
Now the target doesn't have a uid, it is handled through BPF filters, remove the uid options to thread_map creation. Tidy up the functions used in tests to avoid passing unused arguments. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174545.2853620-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf target: Remove uid from targetIan Rogers
Gathering threads with a uid by scanning /proc is inherently racy leading to perf_event_open failures that quit perf. All users of the functionality now use BPF filters, so remove uid and uid_str from target. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174545.2853620-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf bench evlist-open-close: Switch user option to use BPF filterIan Rogers
Finding user processes by scanning /proc is inherently racy and results in perf_event_open failures. Use a BPF filter to drop samples where the uid doesn't match. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174545.2853620-9-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf trace: Switch user option to use BPF filterIan Rogers
Finding user processes by scanning /proc is inherently racy and results in perf_event_open failures. Use a BPF filter to drop samples where the uid doesn't match. Ensure adding the BPF filter forces system-wide. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174545.2853620-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf top: Switch user option to use BPF filterIan Rogers
Finding user processes by scanning /proc is inherently racy and results in perf_event_open failures. Use a BPF filter to drop samples where the uid doesn't match. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174545.2853620-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf tests record: Add basic uid filtering testIan Rogers
Based on the system-wide test with changes around how failure is handled as BPF permissions are a bigger issue than perf event paranoia. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174545.2853620-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf record: Switch user option to use BPF filterIan Rogers
Finding user processes by scanning /proc is inherently racy and results in perf_event_open failures. Use a BPF filter to drop samples where the uid doesn't match. Ensure adding the BPF filter forces system-wide. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174545.2853620-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf parse-events: Add parse_uid_filter helperIan Rogers
Add parse_uid_filter filter as a helper to parse_filter, that constructs a uid filter string. As uid filters don't work with tracepoint filters, add a is_possible_tp_filter function so the tracepoint filter isn't attempted for tracepoint evsels. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174545.2853620-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf target: Separate parse_uid into its own functionIan Rogers
Allow parse_uid to be called without a struct target. Rather than have two errors, remove TARGET_ERRNO__USER_NOT_FOUND and use TARGET_ERRNO__INVALID_UID as the handling is identical. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174545.2853620-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf parse-events filter: Use evsel__find_pmuIan Rogers
Rather than manually scanning PMUs, use evsel__find_pmu that can use the PMU set during event parsing. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174545.2853620-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09perf bpf-filter: Improve error messagesNamhyung Kim
The BPF filter needs libbpf/BPF-skeleton support and root privilege. Add error messages to help users understand the problem easily. When it's not build with BPF support (make BUILD_BPF_SKEL=0). $ sudo perf record -e cycles --filter "pid != 0" true Error: BPF filter is requested but perf is not built with BPF. Please make sure to build with libbpf and BPF skeleton. Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] --filter <filter> event filter When it supports BPF but runs without root or CAP_BPF. Note that it also checks pinned BPF filters. $ perf record -e cycles --filter "pid != 0" -o /dev/null true Error: BPF filter only works for users with the CAP_BPF capability! Please run 'perf record --setup-filter pin' as root first. Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] --filter <filter> event filter Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174835.1852481-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-06Merge tag 'riscv-for-linus-6.16-mw1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for the FWFT SBI extension, which is part of SBI 3.0 and a dependency for many new SBI and ISA extensions - Support for getrandom() in the VDSO - Support for mseal - Optimized routines for raid6 syndrome and recovery calculations - kexec_file() supports loading Image-formatted kernel binaries - Improvements to the instruction patching framework to allow for atomic instruction patching, along with rules as to how systems need to behave in order to function correctly - Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha, some SiFive vendor extensions - Various fixes and cleanups, including: misaligned access handling, perf symbol mangling, module loading, PUD THPs, and improved uaccess routines * tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (69 commits) riscv: uaccess: Only restore the CSR_STATUS SUM bit RISC-V: vDSO: Wire up getrandom() vDSO implementation riscv: enable mseal sysmap for RV64 raid6: Add RISC-V SIMD syndrome and recovery calculations riscv: mm: Add support for Svinval extension RISC-V: Documentation: Add enough title underlines to CMODX riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE MAINTAINERS: Update Atish's email address riscv: uaccess: do not do misaligned accesses in get/put_user() riscv: process: use unsigned int instead of unsigned long for put_user() riscv: make unsafe user copy routines use existing assembly routines riscv: hwprobe: export Zabha extension riscv: Make regs_irqs_disabled() more clear perf symbols: Ignore mapping symbols on riscv RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND riscv: module: Optimize PLT/GOT entry counting riscv: Add support for PUD THP riscv: xchg: Prefetch the destination word for sc.w riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop riscv: Add support for Zicbop ...
2025-06-05Merge tag 'riscv-mw2-6.16-rc1' of ↵Palmer Dabbelt
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux into for-next riscv patches for 6.16-rc1, part 2 * Performance improvements - Add support for vdso getrandom - Implement raid6 calculations using vectors - Introduce svinval tlb invalidation * Cleanup - A bunch of deduplication of the macros we use for manipulating instructions * Misc - Introduce a kunit test for kprobes - Add support for mseal as riscv fits the requirements (thanks to Lorenzo for making sure of that :)) [Palmer: There was a rebase between part 1 and part 2, so I've had to do some more git surgery here... at least two rounds of surgery...] * alex-pr-2: (866 commits) RISC-V: vDSO: Wire up getrandom() vDSO implementation riscv: enable mseal sysmap for RV64 raid6: Add RISC-V SIMD syndrome and recovery calculations riscv: mm: Add support for Svinval extension riscv: Add kprobes KUnit test riscv: kprobes: Remove duplication of RV_EXTRACT_ITYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_UTYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_RD_REG riscv: kprobes: Remove duplication of RVC_EXTRACT_BTYPE_IMM riscv: kprobes: Remove duplication of RVC_EXTRACT_C2_RS1_REG riscv: kproves: Remove duplication of RVC_EXTRACT_JTYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_BTYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_RS1_REG riscv: kprobes: Remove duplication of RV_EXTRACT_JTYPE_IMM riscv: kprobes: Move branch_funct3 to insn.h riscv: kprobes: Move branch_rs2_idx to insn.h Linux 6.15-rc6 Input: xpad - fix xpad_device sorting Input: xpad - add support for several more controllers Input: xpad - fix Share button on Xbox One controllers ...
2025-06-05perf symbols: Ignore mapping symbols on riscvHaibo Xu
RISCV ELF use mapping symbols with special names $x, $d to identify regions of RISCV code or code with different ISAs[1]. These symbols don't identify functions, so will confuse the perf output. The patch filters out these symbols at load time, similar to "4886f2ca perf symbols: Ignore mapping symbols on aarch64". [1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/ master/riscv-elf.adoc#mapping-symbol Signed-off-by: Haibo Xu <haibo1.xu@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250409025202.201046-1-haibo1.xu@intel.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
2025-06-03Merge tag 'perf-tools-for-v6.16-1-2025-06-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Arnaldo Carvalho de Melo: "perf report/top/annotate TUI: - Accept the left arrow key as a Zoom out if done on the first column - Show if source code toggle status in title, to help spotting bugs with the various disassemblers (capstone, llvm, objdump) - Provide feedback on unhandled hotkeys Build: - Better inform when certain features are not available with warnings in the build process and in 'perf version --build-options' or 'perf -vv' perf record: - Improve the --off-cpu code by synthesizing events for switch-out -> switch-in intervals using a BPF program. This can be fine tuned using a --off-cpu-thresh knob perf report: - Add 'tgid' sort key perf mem/c2c: - Add 'op', 'cache', 'snoop', 'dtlb' output fields - Add support for 'ldlat' on AMD IBS (Instruction Based Sampling) perf ftrace: - Use process/session specific trace settings instead of messing with the global ftrace knobs perf trace: - Implement syscall summary in BPF - Support --summary-mode=cgroup - Always print return value for syscalls returning a pid - The rseq and set_robust_list don't return a pid, just -errno perf lock contention: - Symbolize zone->lock using BTF - Add -J/--inject-delay option to estimate impact on application performance by optimization of kernel locking behavior perf stat: - Improve hybrid support for the NMI watchdog warning Symbol resolution: - Handle 'u' and 'l' symbols in /proc/kallsyms, resolving some Rust symbols - Improve Rust demangler Hardware tracing: Intel PT: - Fix PEBS-via-PT data_src - Do not default to recording all switch events - Fix pattern matching with python3 on the SQL viewer script arm64: - Fixups for the hip08 hha PMU Vendor events: - Update Intel events/metrics files for alderlake, alderlaken, arrowlake, bonnell, broadwell, broadwellde, broadwellx, cascadelakex, clearwaterforest, elkhartlake, emeraldrapids, grandridge, graniterapids, haswell, haswellx, icelake, icelakex, ivybridge, ivytown, jaketown, lunarlake, meteorlake, nehalemep, nehalemex, rocketlake, sandybridge, sapphirerapids, sierraforest, skylake, skylakex, snowridgex, tigerlake, westmereep-dp, westmereep-sp, westmereep-sx python support: - Add support for event counts in the python binding, add a counting.py example perf list: - Display the PMU name associated with a perf metric in JSON perf test: - Hybrid improvements for metric value validation test - Fix LBR test by ignoring idle task - Add AMD IBS sw filter ana d'ldlat' tests - Add 'perf trace --summary-mode=cgroup' test - Add tests for the various language symbol demanglers Miscellaneous: - Allow specifying the cpu an event will be tied using '-e event/cpu=N/' - Sync various headers with the kernel sources - Add annotations to use clang's -Wthread-safety and fix some problems it detected - Make dump_stack() use perf's symbol resolution to provide better backtraces - Intel TPEBS support cleanups and fixes. TPEBS stands for Timed PEBS (Precision Event-Based Sampling), that adds timing info, the retirement latency of instructions - Various memory allocation (some detected by ASAN) and reference counting fixes - Add a 8-byte aligned PERF_RECORD_COMPRESSED2 to replace PERF_RECORD_COMPRESSED - Skip unsupported event types in perf.data files, don't stop when finding one - Improve lookups using hashmaps and binary searches" * tag 'perf-tools-for-v6.16-1-2025-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (206 commits) perf callchain: Always populate the addr_location map when adding IP perf lock contention: Reject more than 10ms delays for safety perf trace: Set errpid to false for rseq and set_robust_list perf symbol: Move demangling code out of symbol-elf.c perf trace: Always print return value for syscalls returning a pid perf script: Print PERF_AUX_FLAG_COLLISION flag perf mem: Show absolute percent in mem_stat output perf mem: Display sort order only if it's available perf mem: Describe overhead calculation in brief perf record: Fix incorrect --user-regs comments Revert "perf thread: Ensure comm_lock held for comm_list" perf test trace_summary: Skip --bpf-summary tests if no libbpf perf test intel-pt: Skip jitdump test if no libelf perf intel-tpebs: Avoid race when evlist is being deleted perf test demangle-java: Don't segv if demangling fails perf symbol: Fix use-after-free in filename__read_build_id perf pmu: Avoid segv for missing name/alias_name in wildcarding perf machine: Factor creating a "live" machine out of dwarf-unwind perf test: Add AMD IBS sw filter test perf mem: Count L2 HITM for c2c statistic ...
2025-05-31perf callchain: Always populate the addr_location map when adding IPIan Rogers
Dropping symbols also meant the callchain maps wasn't populated, but the callchain map is needed to find the DSO. Plumb the symbols option better, falling back to thread__find_map() rather than thread__find_symbol() when symbols are disabled. Fixes: 02b2705017d2e5ad ("perf callchain: Allow symbols to be optional when resolving a callchain") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Graham Woodward <graham.woodward@arm.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Levi Yun <yeoreum.yun@arm.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <martin.liska@hey.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Matt Fleming <matt@readmodwrite.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steve Clevenger <scclevenger@os.amperecomputing.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Yujie Liu <yujie.liu@intel.com> Cc: Zhongqiu Han <quic_zhonhan@quicinc.com> Cc: Zixian Cai <fzczx123@gmail.com> Link: https://lore.kernel.org/r/20250529044000.759937-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-31perf lock contention: Reject more than 10ms delays for safetyNamhyung Kim
Delaying kernel operations can be dangerous and the kernel may kill (non-sleepable) BPF programs running for long in the future. Limit the max delay to 10ms and update the document about it. $ sudo ./perf lock con -abl -J 100000us@cgroup_mutex true lock delay is too long: 100000us (> 10ms) Usage: perf lock contention [<options>] -J, --inject-delay <TIME@FUNC> Inject delays to specific locks Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250515181042.555189-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-29perf trace: Set errpid to false for rseq and set_robust_listAnubhav Shelat
The 'rseq' and 'set_robust_list' syscalls don't return a pid, so set errpid for both to false. Fixes: 0c1019e3463b263a ("perf trace: Mark the 'rseq' arg in the rseq syscall as coming from user space") Fixes: 1de5b5dcb8353f36 ("perf trace: Mark the 'head' arg in the set_robust_list syscall as coming from user space") Signed-off-by: Anubhav Shelat <ashelat@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250529143334.1469669-2-ashelat@redhat.com [ Remove explicit .errpid = false, omitting its initialization zeroes it, as noted by Namhyung ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf symbol: Move demangling code out of symbol-elf.cIan Rogers
symbol-elf.c is used when building with libelf, symbol-minimal is used otherwise. There is no reason the demangling code with no dependencies on libelf is part of symbol-elf.c so move to symbol.c. This allows demangling tests to pass with NO_LIBELF=1. Structurally, while moving the functions rename demangle_sym() to dso__demangle_sym() which is already a function exposed in symbol.h and the only purpose of which in symbol-elf.c was to call demangle_sym(). Change the calls to demangle_sym() in symbol-elf.c to calls to dso__demangle_sym(). Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250528210858.499898-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf trace: Always print return value for syscalls returning a pidAnubhav Shelat
The syscalls that were consistently observed were set_robust_list and rseq. This is because perf cannot find their child process. This change ensures that the return value is always printed. Before: 0.256 ( 0.001 ms): set_robust_list(head: 0x7f09c77dba20, len: 24) = 0.259 ( 0.001 ms): rseq(rseq: 0x7f09c77dc0e0, rseq_len: 32, sig: 1392848979) = After: 0.270 ( 0.002 ms): set_robust_list(head: 0x7f0bb14a6a20, len: 24) = 0 0.273 ( 0.002 ms): rseq(rseq: 0x7f0bb14a70e0, rseq_len: 32, sig: 1392848979) = 0 Committer notes: As discussed in the thread in the Link: tag below, these two don't return a pid, but for syscalls returning one, we need to print the result and if we manage to find the children in 'perf trace' data structures, then print its name as well. Fixes: 11c8e39f5133aed9 ("perf trace: Infrastructure to show COMM strings for syscalls returning PIDs") Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Anubhav Shelat <ashelat@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250403160411.159238-2-ashelat@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf script: Print PERF_AUX_FLAG_COLLISION flagLeo Yan
Print out the collision flag for AUX trace data. This is helpful for inspecting sample collisions. After: 0x217b60@/data_nvme1n1/niayan01/upstream/perf.data [0x40]: event: 11 . . ... raw event: size 64 bytes . 0000: 0b 00 00 00 00 00 40 00 d2 ef 3f 00 00 00 00 00 ......@...?..... . 0010: ff 0f 00 00 00 00 00 00 08 00 00 00 00 00 00 00 ................ . 0020: 1c 01 00 00 1c 01 00 00 10 bf 38 d6 11 01 00 00 ..........8..... . 0030: 03 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00 ................ 3 1176120114960 0x217b60 [0x40]: PERF_RECORD_AUX offset: 0x3fefd2 size: 0xfff flags: 0x8 [C] The added character '[C]' indicates the collision. Signed-off-by: Leo Yan <leo.yan@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250528153519.188644-1-leo.yan@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf mem: Show absolute percent in mem_stat outputNamhyung Kim
Currently the output sums up to 100% for each entry. But it can be confusing when it's displayed with 'overhead'. Before: $ perf mem report -F overhead,sample,cache,comm ... # -------------- Cache -------------- # Overhead Samples L1 L2 L3 L1-buf Other Command # ........ ............ ................................... ............... # 25.38% 517 34.6% 0.0% 15.8% 23.3% 26.2% swapper 9.03% 239 35.4% 0.8% 9.1% 22.1% 32.6% chrome 8.61% 233 45.3% 1.2% 8.9% 22.7% 21.9% Chrome_ChildIOT 7.81% 189 33.6% 0.4% 5.5% 35.9% 24.6% Isolated Web Co 3.73% 103 40.4% 0.3% 2.7% 39.4% 17.2% gnome-shell Let's convert it to use absolute percent value so that it can add up to the overhead for that entry. After: # -------------- Cache -------------- # Overhead Samples L1 L2 L3 L1-buf Other Command # ........ ............ ................................... ............... # 25.38% 517 8.8% 0.0% 4.0% 5.9% 6.7% swapper 9.03% 239 3.2% 0.1% 0.8% 2.0% 2.9% chrome 8.61% 233 3.9% 0.1% 0.8% 2.0% 1.9% Chrome_ChildIOT 7.81% 189 2.6% 0.0% 0.4% 2.8% 1.9% Isolated Web Co 3.73% 103 1.5% 0.0% 0.1% 1.5% 0.6% gnome-shell This aligns well with the existing 'mem' sort key. $ perf mem report -s comm,mem -H ... # # Overhead Samples Command / Memory access # ......................... .......................................... # 25.38% 517 swapper 8.78% 150 L1 hit 6.66% 72 RAM hit 5.92% 137 LFB/MAB hit 4.02% 157 L3 hit 0.00% 1 L3 miss 9.03% 239 chrome 3.19% 117 L1 hit 2.94% 35 RAM hit 1.99% 48 LFB/MAB hit 0.82% 32 L3 hit 0.08% 5 L2 hit 0.00% 2 L3 miss We can add an option or a config to change the setting later. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20250523222157.1259998-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>