Age | Commit message | Author
2025-07-07 | perf vendor events: Add PantherLake events | Ian Rogers
Bring in the events at v1.00: https://github.com/intel/perfmon/commit/d90a6737d0e4e6fbea4a5951e829615fd8317c24 Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-12-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-07 | perf vendor events: Update MeteorLake events | Ian Rogers
Update events from v1.13 to v1.14. Bring in the event updates v1.14: https://github.com/intel/perfmon/commit/6c53969b8d1a83afe6ae90149c8dd4ee416027ef Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-07 | perf vendor events: Update LunarLake events | Ian Rogers
Update events from v1.11 to v1.14. Bring in the event updates v1.14: https://github.com/intel/perfmon/commit/95634fec10542c0c466eb2c6d9a81e0c24fb1123 https://github.com/intel/perfmon/commit/84a49938387ac592af0a622273e4e8e4997e987d Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-07 | perf vendor events: Update IcelakeX events | Ian Rogers
Update events from v1.27 to v1.28. Bring in the event updates v1.28: https://github.com/intel/perfmon/commit/c52728a46cf37ba271c09b1eb7093cfc82dfbf29 Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-9-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-07 | perf vendor events: Update GraniteRapids events | Ian Rogers
Update events from v1.08 to v1.10. Bring in the event updates v1.10 https://github.com/intel/perfmon/commit/96259a932e2ce5f70ed7d347ca92fdeb78f83aa5 https://github.com/intel/perfmon/commit/19e315c8d2e0b44e170a6e60de44c9359062a6aa Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-07 | perf vendor events: Update GrandRidge events | Ian Rogers
Update events from v1.07 to v1.09. Bring in the event updates v1.09: https://github.com/intel/perfmon/commit/8c74d09c8544421256a79f4f21e548ad756f5b7f https://github.com/intel/perfmon/commit/18c7d2a75e45eacf5553f900ae2097a1290f5bed Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-07 | perf vendor events: Update EmeraldRapids events | Ian Rogers
Update events from v1.11 to v1.14. Bring in the event updates v1.14: https://github.com/intel/perfmon/commit/6f6e4c8c906992b450cb2014d0501a9ec1cda0d0 https://github.com/intel/perfmon/commit/e363f82276c129aec60402a1d64efbbd41af844e Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-07 | perf vendor events: Update CascadelakeX events | Ian Rogers
Update events from v1.23 to v1.25. Bring in the event updates v1.25: https://github.com/intel/perfmon/commit/86f146e15626b0fd3b032cab4538cafaaf2d0635 https://github.com/intel/perfmon/commit/fef03ffc333ae44d1e9d695b4e67e5bbb4429729 Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-07 | perf vendor events: Update Arrowlake events | Ian Rogers
Update events from v1.08 to v1.09. Bring in the event updates v1.09: https://github.com/intel/perfmon/commit/cf3be6daf0a751ad270b67890dfdb2261dfc75da Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-07 | perf vendor events: Update AlderlakeN events | Ian Rogers
Update events from v1.29 to v1.31. Bring in the event updates v1.31: https://github.com/intel/perfmon/commit/5a1269c8af70e32a548e74e1fda736189c398ddc https://github.com/intel/perfmon/commit/76c6d2c348c067e9ae1b616b35ee982da6d873b4 Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-07 | perf vendor events: Update Alderlake events | Ian Rogers
Update events from v1.29 to v1.31. Bring in the event updates v1.31: https://github.com/intel/perfmon/commit/5a1269c8af70e32a548e74e1fda736189c398ddc https://github.com/intel/perfmon/commit/76c6d2c348c067e9ae1b616b35ee982da6d873b4 Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-03 | perf test: Add more test cases to sched test | Namhyung Kim
  $ sudo ./perf test -vv 92
  92: perf sched tests:
  --- start ---
  test child forked, pid 1360101
  Sched record
  pid 1360105's current affinity list: 0-3
  pid 1360105's new affinity list: 0
  pid 1360107's current affinity list: 0-3
  pid 1360107's new affinity list: 0
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 4.330 MB /tmp/__perf_test_sched.perf.data.b3319 (12246 samples) ]
  Sched latency
  Sched script
  Sched map
  Sched timehist
  Samples of sched_switch event do not have callchains.
  ---- end(0) ----
  92: perf sched tests : Ok

Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250703014942.1369397-9-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-03 | perf sched: Fix memory leaks in 'perf sched latency' | Namhyung Kim
The work_atoms should be freed after use. Add free_work_atoms() to make sure to release all. It should use list_splice_init() when merging atoms to prevent accessing invalid pointers. Fixes: b1ffe8f3e0c96f552 ("perf sched: Finish latency => atom rename and misc cleanups") Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250703014942.1369397-8-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-03 | perf sched: Use RC_CHK_EQUAL() to compare pointers | Namhyung Kim
So that it can check two pointers to the same object properly when REFCNT_CHECKING is on. Fixes: 78c32f4cb12f9430 ("libperf rc_check: Add RC_CHK_EQUAL") Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250703014942.1369397-7-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-03 | perf sched: Fix memory leaks for evsel->priv in timehist | Namhyung Kim
It uses evsel->priv to save per-cpu timing information. It should be freed when the evsel is released. Add the priv destructor for evsel same as thread to handle that. Fixes: 49394a2a24c78ce0 ("perf sched timehist: Introduce timehist command") Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250703014942.1369397-6-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-03 | perf sched: Fix thread leaks in 'perf sched timehist' | Namhyung Kim
Add missing thread__put() after machine__findnew_thread() or timehist_get_thread(). Also idle threads' last_thread should be refcounted properly. Fixes: 699b5b920db04a6f ("perf sched timehist: Save callchain when entering idle") Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250703014942.1369397-5-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-03 | perf sched: Fix memory leaks in 'perf sched map' | Namhyung Kim
It maintains per-cpu pointers for the current thread but it doesn't release the refcounts. Fixes: 5e895278697c014e ("perf sched: Move curr_thread initialization to perf_sched__map()") Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250703014942.1369397-4-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-03 | perf sched: Free thread->priv using priv_destructor | Namhyung Kim
Many perf sched subcommands save a priv data structure in the thread but forget to free it. As it's an opaque 'void *', a destructor that knows how to free the data needs to be registered. In this case, the regular 'free()' is fine. Fixes: 04cb4fc4d40a5bf1 ("perf thread: Allow tools to register a thread->priv destructor") Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250703014942.1369397-3-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
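The destructor idea in this message can be sketched roughly as follows. This is a minimal illustration and not perf's code: `struct demo_thread`, `demo_thread__set_priv_destructor()` and the other `demo_` names are hypothetical stand-ins, and the call counter exists only so the sketch can be checked.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a thread object; only the opaque slot matters. */
struct demo_thread {
	int tid;
	void *priv;	/* tool-owned data; the owner does not know its type */
};

/* Destructor registered by the tool; NULL means nothing was registered. */
static void (*demo_priv_destructor)(void *priv);

/* Counts destructor calls, kept only so the sketch can be checked. */
static int demo_priv_frees;

static void demo_thread__set_priv_destructor(void (*destructor)(void *priv))
{
	/* Register once, before any thread is deleted. */
	assert(demo_priv_destructor == NULL);
	demo_priv_destructor = destructor;
}

static struct demo_thread *demo_thread__new(int tid)
{
	struct demo_thread *thread = calloc(1, sizeof(*thread));

	if (thread)
		thread->tid = tid;
	return thread;
}

static void demo_thread__delete(struct demo_thread *thread)
{
	if (!thread)
		return;
	/* The owner cannot free priv itself, so it defers to the callback. */
	if (demo_priv_destructor && thread->priv) {
		demo_priv_destructor(thread->priv);
		demo_priv_frees++;
	}
	free(thread);
}
```

When the private data is a single flat allocation, passing plain `free` as the destructor is enough, which matches the case the message describes.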
2025-07-03 | perf sched: Make sure it frees the usage string | Namhyung Kim
The parse_options_subcommand() allocates the usage string based on the given subcommands. So it should reach the end of the function to free the string to prevent memory leaks. Fixes: 1a5efc9e13f357ab ("libsubcmd: Don't free the usage string") Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250703014942.1369397-2-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-03 | perf tests make: Add NO_LIBDW=1 to minimal and add standalone test | Ian Rogers
Add the missing test coverage of NO_LIBDW=1 as a standalone test and add NO_LIBDW=1 to the minimal test configuration. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250703053622.3141424-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-03 | perf header: Fix pipe mode header dumping | Ian Rogers
The pipe mode header dumping was accidentally removed when tracing of header feature events in pipe mode was added. Minor spelling tweak to header test failure message. Fixes: 61051f9a8452 ("perf header: In pipe mode dump features without --header/-I") Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250703042000.2740640-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-02 | perf test: In forked mode add check that fds aren't leaked | Ian Rogers
When a test is forked no file descriptors should be open, however, parent ones may have been inherited - in particular those of the pipes of other forked child test processes. Add a loop to clean-up/close those file descriptors prior to running the test. At the end of the test assert that no additional file descriptors are present as this would indicate a file descriptor leak. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250624190326.2038704-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
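The close-then-assert pattern described here might look roughly like the sketch below. It is not the code from the patch: the `demo_` helpers and the `DEMO_MAX_FD` scan bound are assumptions made for illustration, and a real implementation could just as well enumerate `/proc/self/fd`.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical upper bound for the scan; chosen only for this sketch. */
#define DEMO_MAX_FD 256

/* Close every descriptor above stderr that was inherited from the parent. */
static void demo_close_inherited_fds(void)
{
	for (int fd = STDERR_FILENO + 1; fd < DEMO_MAX_FD; fd++)
		close(fd);	/* EBADF for unused slots is expected and ignored */
}

/* Count descriptors above stderr that are currently open. */
static int demo_count_extra_fds(void)
{
	int nr = 0;

	for (int fd = STDERR_FILENO + 1; fd < DEMO_MAX_FD; fd++) {
		if (fcntl(fd, F_GETFD) != -1)
			nr++;
	}
	return nr;
}

/* Called after the test body: anything still open was leaked by the test. */
static void demo_assert_no_leaked_fds(void)
{
	assert(demo_count_extra_fds() == 0);
}
```

A forked child would call `demo_close_inherited_fds()` before running the test body and `demo_assert_no_leaked_fds()` afterwards, so a descriptor left open by the test trips the assertion.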
2025-07-02 | perf dso: With ref count checking, avoid dso_data holding dso live | Ian Rogers
With the dso_data embedded in a dso there is a reference counted pointer to the dso rather than using container_of with reference count checking. This data can hold the dso live meaning that no dso__put ever deletes it. Add a check for this case and close the dso_data when it happens. There isn't an infinite loop as the dso_data clears the file descriptor prior to putting on the dso. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250624190326.2038704-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-02 | perf hwmon_pmu: Hold path rather than fd | Ian Rogers
Hold the path to the hwmon_pmu rather than the file descriptor. The file descriptor is somewhat problematic in that it reflects the directory state when opened, something that may vary in testing. Using a path simplifies testing and to some extent cleanup as the hwmon_pmu is owned by the pmus list and intentionally global and leaked when perf terminates, the file descriptor being left open looks like a leak. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250624190326.2038704-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-02 | perf test code-reading: Avoid a leak of cpus and threads | Ian Rogers
The perf_evlist__set_maps does the necessary gets on the arguments passed, so the reference count bumping isn't necessary and creates a memory leak. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250624190326.2038704-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-02 | perf dso: Add missed dso__put to dso__load_kcore | Ian Rogers
The kcore loading creates a set of list nodes that have reference counted references to maps of the kcore. The list node freeing in the success path wasn't releasing the maps, add the missing puts. It is unclear why this leak was being missed by leak sanitizer. Fixes: 83720209961f ("perf map: Move map list node into symbol") Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250624190326.2038704-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-02 | perf genelf: Fix NO_LIBDW=1 build | Ian Rogers
With NO_LIBDW=1 a new unused-parameter warning/error has appeared:
```
util/genelf.c: In function ‘jit_write_elf’:
util/genelf.c:163:32: error: unused parameter ‘load_addr’ [-Werror=unused-parameter]
  163 | jit_write_elf(int fd, uint64_t load_addr, const char *sym,
```

Fixes: e3f612c1d8f3 ("perf genelf: Remove libcrypto dependency and use built-in sha1()")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250702175402.761818-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-02 | perf list: Add IBM z17 event descriptions | Thomas Richter
Update IBM z17 counter description using document SA23-2260-08: "The Load-Program-Parameter and the CPU-Measurement Facilities" released in May 2025 to include counter definitions for IBM z17 counter sets:
* Basic counter set
* Problem/user counter set
* Crypto counter set.

Use document SA23-2261-09: "The CPU-Measurement Facility Extended Counters Definition for z10, z196/z114, zEC12/zBC12, z13/z13s, z14, z15, z16 and z17" released in April 2025 to include counter definitions for IBM z17
* Extended counter set
* MT-Diagnostic counter set.

Use document SA22-7832-14: "z/Architecture Principles of Operation." released in April 2025 to include counter definitions for IBM z17
* PAI-Crypto counter set
* PAI-Extension counter set.

Use document "CPU MF Formulas and Updates April 2025" released in April 2025 to include metric calculations.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250623132731.899525-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-02 | perf tools: Fix use-after-free in help_unknown_cmd() | Namhyung Kim
Currently perf aborts when it finds an invalid command. I guess it depends on the environment as I have some custom commands in the path.

  $ perf bad-command
  perf: 'bad-command' is not a perf-command. See 'perf --help'.
  Aborted (core dumped)

It's because the exclude_cmds() in libsubcmd has a use-after-free when it removes some entries. After copying one entry to another, it keeps the pointer in both positions. The next copy operation then frees the latter one, but that is the same entry as the previous one.

For example, let's say cmds = { A, B, C, D, E } and excludes = { B, E }.

  ci  cj  ei   cmds-name  excludes
  -----------+--------------------
   0   0   0 |     A         B       :  cmp < 0, ci == cj
   1   1   0 |     B         B       :  cmp == 0
   2   1   1 |     C         E       :  cmp < 0, ci != cj

At this point, it frees cmds->names[1] and cmds->names[1] is assigned the value of cmds->names[2].

   3   2   1 |     D         E       :  cmp < 0, ci != cj

Now it frees cmds->names[2] but it's the same as cmds->names[1]. So accessing cmds->names[1] will be invalid.

This makes the subcmd tests succeed.

  $ perf test subcmd
  69: libsubcmd help tests :
  69.1: Load subcmd names : Ok
  69.2: Uniquify subcmd names : Ok
  69.3: Exclude duplicate subcmd names : Ok

Fixes: 4b96679170c6 ("libsubcmd: Avoid SEGV/use-after-free when commands aren't excluded")
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250701201027.1171561-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
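One way to avoid the double free walked through above is to move each kept pointer and clear the slot it came from, so no two slots ever alias the same allocation. The sketch below illustrates that idea on a simplified list of heap-allocated strings; `struct demo_names` and `demo_exclude()` are hypothetical names, and this is not claimed to be the exact change made in libsubcmd.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a sorted command-name list. */
struct demo_names {
	size_t cnt;
	char **names;	/* sorted; each entry is heap-allocated */
};

/*
 * Remove every entry of 'cmds' that also appears in 'excludes'.
 * Both lists must be sorted. ci reads, cj writes, ei walks the excludes.
 */
static void demo_exclude(struct demo_names *cmds, const struct demo_names *excludes)
{
	size_t ci = 0, cj = 0, ei = 0;

	assert(cmds && excludes);
	while (ci < cmds->cnt && ei < excludes->cnt) {
		int cmp = strcmp(cmds->names[ci], excludes->names[ei]);

		if (cmp < 0) {
			/* Keep the entry: move it down and clear the old slot. */
			if (ci != cj) {
				cmds->names[cj] = cmds->names[ci];
				cmds->names[ci] = NULL;
			}
			ci++;
			cj++;
		} else if (cmp == 0) {
			/* Excluded: free exactly once and leave the slot empty. */
			free(cmds->names[ci]);
			cmds->names[ci] = NULL;
			ci++;
			ei++;
		} else {
			ei++;
		}
	}
	/* Slide any remaining entries down over the holes. */
	while (ci < cmds->cnt) {
		if (ci != cj) {
			cmds->names[cj] = cmds->names[ci];
			cmds->names[ci] = NULL;
		}
		ci++;
		cj++;
	}
	cmds->cnt = cj;
}
```

With cmds = { A, B, C, D, E } and excludes = { B, E }, each of B and E is freed once and the surviving pointers are moved rather than duplicated, so no later step can free an entry that is still referenced.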
2025-07-01 | perf test: Add libsubcmd help tests | Namhyung Kim
Add a set of tests for subcmd routines. Currently it fails the last one since there's a bug. It'll be fixed by the next commit.

  $ perf test subcmd
  69: libsubcmd help tests :
  69.1: Load subcmd names : Ok
  69.2: Uniquify subcmd names : Ok
  69.3: Exclude duplicate subcmd names : FAILED!

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250701201027.1171561-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-01 | perf test: Check test suite description properly | Namhyung Kim
Currently perf test checks the given string against the descriptions of both test suites and cases (subtests). But sometimes it's confusing since the subtests don't contain the important keyword. I think it's better to check at the suite level and run the whole suite together if the string matches the suite description.

Before:
  $ perf test hwmon
  (no output)

After:
  $ perf test hwmon
  10: Hwmon PMU :
  10.1: Basic parsing test : Ok
  10.2: Parsing without PMU name : Ok
  10.3: Parsing with PMU name : Ok

Keep the existing behavior when the string matches only a test description.

  $ perf test "Equal cpu map"
  39.5: Equal cpu map : Ok

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250701201027.1171561-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
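The selection rule described in this message can be sketched as a small predicate: a match on the suite description selects every case in the suite, otherwise only cases whose own description matches are selected. The types and function names below are hypothetical, and the case-insensitive substring match is an assumption inferred from `hwmon` matching `Hwmon PMU` in the example output.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of a test suite and its subtests. */
struct demo_suite {
	const char *desc;
	const char *const *cases;	/* subtest descriptions */
	size_t nr_cases;
};

/* Case-insensitive substring search (assumed matching rule for this sketch). */
static bool demo_contains_nocase(const char *hay, const char *needle)
{
	size_t n = strlen(needle);

	for (; *hay; hay++) {
		size_t i = 0;

		while (i < n && hay[i] &&
		       tolower((unsigned char)hay[i]) == tolower((unsigned char)needle[i]))
			i++;
		if (i == n)
			return true;
	}
	return n == 0;
}

/* Decide whether case 'idx' of 'suite' should run for the given filter. */
static bool demo_should_run(const struct demo_suite *suite, size_t idx, const char *filter)
{
	assert(suite && filter && idx < suite->nr_cases);
	/* A suite-level match runs the whole suite together. */
	if (demo_contains_nocase(suite->desc, filter))
		return true;
	/* Otherwise fall back to matching the individual case. */
	return demo_contains_nocase(suite->cases[idx], filter);
}
```

Checking the suite first is what lets a keyword that appears only in the suite title still run all of its subtests.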
2025-07-01 | perf test: Add sched latency and script shell tests | Ian Rogers
Add shell tests covering the `perf sched latency` and `perf sched script` commands. The test creates 2 noploop processes on the same forced CPU, it then checks that the process appears in the `perf sched` output. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250628012302.1242532-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-01 | perf test: Name the noploop process | Ian Rogers
Name the noploop process "perf-noploop" so that tests can easily check for its existence. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250628012302.1242532-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-30 | perf build: Specify shellcheck should use bash | Collin Funk
When someone has a global shellcheckrc file, for example at ~/.config/shellcheckrc, with the directive 'shell=sh', building perf will fail with many shellcheck errors like:

  In tests/shell/base_probe/test_adding_kernel.sh line 294:
  (( TEST_RESULT += $? ))
  ^---------------------^ SC3006 (warning): In POSIX sh, standalone ((..)) is undefined.

  For more information:
    https://www.shellcheck.net/wiki/SC3006 -- In POSIX sh, standalone ((..)) is...
  make[5]: *** [tests/Build:91: tests/shell/base_probe/test_adding_kernel.sh.shellcheck_log] Error 1

Passing the '-s bash' option ensures that it runs correctly regardless of a developer's global configuration. This patch adds '-s bash' and other options to the SHELLCHECK variable in Makefile.perf and makes use of the variable consistently.

Signed-off-by: Collin Funk <collin.funk1@gmail.com>
Link: https://lore.kernel.org/r/63491dbc8439edf2e949d80e264b9d22332fea61.1751082075.git.collin.funk1@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-30 | perf test annotate: Use --percent-limit rather than head to reduce output | Ian Rogers
The annotate test was sped up by Thomas Richter <tmricht@linux.ibm.com> in commit 658a8805cb60 ("perf test: Speed up test case 70 annotate basic tests") by reducing the annotate output using head. This causes flakes on hybrid machines where the first event dumped may not have the samples for the test within it. Rather than reduce the output using `head` switch to `--percent-limit 10` which will stop annotate dumping functions that have an overhead of less than 10%, the noploop program should be using more. Add the missing objdump option for the pipe mode version of the objdump with a command test. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20250628015832.1271229-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-30 | perf test: Add basic callgraph test to record testing | Ian Rogers
Give some basic perf record callgraph coverage. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20250628015553.1270748-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-30 | perf drm_pmu: Fix spelling mistake "bufers" -> "buffers" | Colin Ian King
There are spelling mistakes in some literal strings. Fix these. Fixes: 28917cb17f9d ("perf drm_pmu: Add a tool like PMU to expose DRM information") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250630125128.562895-1-colin.i.king@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-30 | perf test: perf header test fails on s390 | Thomas Richter
Commit 2d584688643fa ("perf test: Add header shell test") introduced a new test case for perf header. It fails on s390 because call graph option -g is not supported on s390. Also the option --call-graph dwarf is only supported for the event cpu-clock. Remove this option and the test succeeds.

Output after:
  # ./perf test 76
  76: perf header tests : Ok

Fixes: 2d584688643fa ("perf test: Add header shell test")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Link: https://lore.kernel.org/r/20250630091613.3061664-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-27 | perf stat: Fix uncore aggregation number | Chun-Tse Shao
Follow up: lore.kernel.org/CAP-5=fVDF4-qYL1Lm7efgiHk7X=_nw_nEFMBZFMcsnOOJgX4Kg@mail.gmail.com/

The patch adds unit aggregation when evsel merges the aggregated uncore counters. Change the column name to `ctrs`, and to `counters` in JSON mode.

Tested on a 2-socket machine with SNC3, uncore_imc_[0-11] and cpumask="0,120"

Before:
  perf stat -e clockticks -I 1000 --per-socket
  #           time socket cpus      counts unit events
       1.001085024 S0        1  9615386315      clockticks
       1.001085024 S1        1  9614287448      clockticks

  perf stat -e clockticks -I 1000 --per-node
  #           time node   cpus      counts unit events
       1.001029867 N0        1  3205726984      clockticks
       1.001029867 N1        1  3205444421      clockticks
       1.001029867 N2        1  3205234018      clockticks
       1.001029867 N3        1  3205224660      clockticks
       1.001029867 N4        1  3205207213      clockticks
       1.001029867 N5        1  3205528246      clockticks

After:
  perf stat -e clockticks -I 1000 --per-socket
  #           time socket ctrs      counts unit events
       1.001026071 S0       12  9619677996      clockticks
       1.001026071 S1       12  9618612614      clockticks

  perf stat -e clockticks -I 1000 --per-node
  #           time node   ctrs      counts unit events
       1.001027449 N0        4  3207251859      clockticks
       1.001027449 N1        4  3207315930      clockticks
       1.001027449 N2        4  3206981828      clockticks
       1.001027449 N3        4  3206566126      clockticks
       1.001027449 N4        4  3206032609      clockticks
       1.001027449 N5        4  3205651355      clockticks

Tested with JSON output linter:
  perf test "perf stat JSON output linter"
  94: perf stat JSON output linter : Ok

Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Link: https://lore.kernel.org/r/20250627201818.479421-1-ctshao@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-27 | perf build: Fix a build error on REFCNT_CHECKING=1 | Namhyung Kim
Recently -fno-strict-aliasing was added to sync with the kernel behavior. But it caused an error due to a potential uninitialized access like below:

  In file included from util/symbol.c:27:
  In function ‘dso__set_symbol_names_len’,
      inlined from ‘dso__sort_by_name’ at util/symbol.c:638:4:
  util/dso.h:654:46: error: ‘len’ may be used uninitialized [-Werror=maybe-uninitialized]
    654 |         RC_CHK_ACCESS(dso)->symbol_names_len = len;
        |                                              ^
  util/symbol.c: In function ‘dso__sort_by_name’:
  util/symbol.c:634:24: note: ‘len’ was declared here
    634 |                 size_t len;
        |                        ^~~

Let's just initialize it with 0.

Fixes: 55a18d2f3ff79c90 ("perf build: enable -fno-strict-aliasing")
Closes: https://lore.kernel.org/r/aF7JC8zkG5-_-nY_@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26 | tools/perf: Add --exclude-buildids option to perf archive command | Tianyou Li
When making a perf archive, it may contain binaries that the user does not want to ship. Add an --exclude-buildids option to specify a file containing the buildids that need to be excluded. The file can be generated with the command:

  perf buildid-list -i perf.data --with-hits | grep -v "^ " > exclude-buildids.txt

Then remove from exclude-buildids.txt the lines for buildids that should be included.

Signed-off-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
Link: https://lore.kernel.org/r/20250625161509.2599646-1-tianyou.li@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26 | perf annotate: Fix source code annotate with objdump | Namhyung Kim
Recently perf started using llvm and capstone to speed up annotation and disassembly of instructions. But they don't support the source code view yet. Until that is fixed, we can force the use of objdump for source code annotation. To prevent performance loss, it's disabled by default and turned on when the user requests it in the TUI by pressing the 's' key. Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250625230339.702610-1-namhyung@kernel.org Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26 | tools: Remove libcrypto dependency | Yuzhuo Jing
Remove all occurrences of libcrypto in the build system. Signed-off-by: Yuzhuo Jing <yuzhuo@google.com> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250625202311.23244-5-ebiggers@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26 | perf genelf: Remove libcrypto dependency and use built-in sha1() | Yuzhuo Jing
genelf is the only file in perf that depends on libcrypto (or openssl) which only calculates a Build ID (SHA1, MD5, or URANDOM). SHA1 was expected to be the default option, but MD5 was used by default due to previous issues when linking against Java. This commit switches genelf to use the in-house sha1(), and also removes MD5 and URANDOM options since we have a reliable SHA1 implementation to rely on. It passes the tools/perf/tests/shell/test_java_symbol.sh test. Signed-off-by: Yuzhuo Jing <yuzhuo@google.com> Co-developed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250625202311.23244-4-ebiggers@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26 | perf util: add a basic SHA-1 implementation | Eric Biggers
SHA-1 can be written in fewer than 100 lines of code. Just add a basic SHA-1 implementation so that there's no need to use an external library or try to pull in the kernel's SHA-1 implementation. The kernel's SHA-1 implementation is not really intended to be pulled into userspace programs in the way that it was proposed to do so for perf (https://lore.kernel.org/r/20250521225307.743726-3-yuzhuo@google.com/), and it's also likely to undergo some refactoring in the future. There's no need to tie userspace tools to it. Include a test for sha1() in the util test suite. Signed-off-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250625202311.23244-3-ebiggers@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
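To give a sense of how compact a basic SHA-1 can be, here is an independent sketch written from the published algorithm. It is not the implementation added by this commit; the `demo_` names are hypothetical, and it hashes a whole buffer in one call rather than offering a streaming interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t demo_rol32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* Fold one 64-byte block into the running state h[0..4]. */
static void demo_sha1_block(uint32_t h[5], const unsigned char *blk)
{
	uint32_t w[80], a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
	int i;

	/* Load 16 big-endian words, then expand them to 80. */
	for (i = 0; i < 16; i++)
		w[i] = ((uint32_t)blk[4 * i] << 24) | ((uint32_t)blk[4 * i + 1] << 16) |
		       ((uint32_t)blk[4 * i + 2] << 8) | blk[4 * i + 3];
	for (i = 16; i < 80; i++)
		w[i] = demo_rol32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);

	for (i = 0; i < 80; i++) {
		uint32_t f, k, t;

		if (i < 20) {
			f = (b & c) | (~b & d);
			k = 0x5A827999;
		} else if (i < 40) {
			f = b ^ c ^ d;
			k = 0x6ED9EBA1;
		} else if (i < 60) {
			f = (b & c) | (b & d) | (c & d);
			k = 0x8F1BBCDC;
		} else {
			f = b ^ c ^ d;
			k = 0xCA62C1D6;
		}
		t = demo_rol32(a, 5) + f + e + k + w[i];
		e = d;
		d = c;
		c = demo_rol32(b, 30);
		b = a;
		a = t;
	}
	h[0] += a;
	h[1] += b;
	h[2] += c;
	h[3] += d;
	h[4] += e;
}

/* Hash 'len' bytes at 'data' and write the 20-byte digest to 'out'. */
static void demo_sha1(const void *data, size_t len, unsigned char out[20])
{
	uint32_t h[5] = { 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0 };
	const unsigned char *p = data;
	unsigned char tail[128] = { 0 };
	size_t rem = len % 64, full = len - rem, tail_len = rem < 56 ? 64 : 128;
	uint64_t bits = (uint64_t)len * 8;
	size_t i;

	assert(data != NULL && out != NULL);
	for (i = 0; i < full; i += 64)
		demo_sha1_block(h, p + i);

	/* Padding: leftover bytes, a 0x80 marker, zeros, 64-bit bit length. */
	memcpy(tail, p + full, rem);
	tail[rem] = 0x80;
	for (i = 0; i < 8; i++)
		tail[tail_len - 1 - i] = (unsigned char)(bits >> (8 * i));
	for (i = 0; i < tail_len; i += 64)
		demo_sha1_block(h, tail + i);

	for (i = 0; i < 5; i++) {
		out[4 * i] = (unsigned char)(h[i] >> 24);
		out[4 * i + 1] = (unsigned char)(h[i] >> 16);
		out[4 * i + 2] = (unsigned char)(h[i] >> 8);
		out[4 * i + 3] = (unsigned char)h[i];
	}
}
```

The block function works on bytes and shifts only, so it does not depend on host endianness or on aligned access to the input buffer.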
2025-06-26 | perf build: enable -fno-strict-aliasing | Eric Biggers
perf pulls in code from kernel headers that assumes it is being built with -fno-strict-aliasing, namely put_unaligned_*() from <linux/unaligned.h> which write the data using packed structs that lack the may_alias attribute. Enable -fno-strict-aliasing to prevent miscompilations in sha1.c which would otherwise occur due to this issue. Signed-off-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250625202311.23244-2-ebiggers@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
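For contrast with the packed-struct stores mentioned above, the sketch below shows an unaligned big-endian store and load written with `memcpy` and byte shifts, which stays well-defined whether or not strict aliasing is enabled. This is not what the kernel headers do and not what the commit changes; the `demo_` helpers are hypothetical and only illustrate why the aliasing question arises for this kind of code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value big-endian at an arbitrarily aligned address. */
static void demo_put_unaligned_be32(uint32_t val, void *p)
{
	unsigned char bytes[4] = {
		(unsigned char)(val >> 24),
		(unsigned char)(val >> 16),
		(unsigned char)(val >> 8),
		(unsigned char)val,
	};

	assert(p != NULL);
	/* memcpy never creates a typed access to the destination. */
	memcpy(p, bytes, sizeof(bytes));
}

/* Load a 32-bit big-endian value from an arbitrarily aligned address. */
static uint32_t demo_get_unaligned_be32(const void *p)
{
	unsigned char bytes[4];

	assert(p != NULL);
	memcpy(bytes, p, sizeof(bytes));
	return ((uint32_t)bytes[0] << 24) | ((uint32_t)bytes[1] << 16) |
	       ((uint32_t)bytes[2] << 8) | bytes[3];
}
```

Because perf reuses the kernel's headers as they are, building with -fno-strict-aliasing is the approach this commit takes instead of rewriting those helpers.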
2025-06-26 | perf top: populate PMU capabilities data in perf_env | Thomas Falcon
Calling perf top with branch filters enabled on Intel CPUs with branch counters logging (A.K.A LBR event logging [1]) support results in a segfault.

  $ perf top -e '{cpu_core/cpu-cycles/,cpu_core/event=0xc6,umask=0x3,frontend=0x11,name=frontend_retired_dsb_miss/}' -j any,counter
  ...
  Thread 27 "perf" received signal SIGSEGV, Segmentation fault.
  [Switching to Thread 0x7fffafff76c0 (LWP 949003)]
  perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653
  653 *width = env->cpu_pmu_caps ? env->br_cntr_width :
  (gdb) bt
  #0 perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653
  #1 0x00000000005b1599 in symbol__account_br_cntr (branch=0x7fffcc3db580, evsel=0xfea2d0, offset=12, br_cntr=8) at util/annotate.c:345
  #2 0x00000000005b17fb in symbol__account_cycles (addr=5658172, start=5658160, sym=0x7fffcc0ee420, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:389
  #3 0x00000000005b1976 in addr_map_symbol__account_cycles (ams=0x7fffcd7b01d0, start=0x7fffcd7b02b0, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:422
  #4 0x000000000068d57f in hist__account_cycles (bs=0x110d288, al=0x7fffafff6540, sample=0x7fffafff6760, nonany_branch_mode=false, total_cycles=0x0, evsel=0xfea2d0) at util/hist.c:2850
  #5 0x0000000000446216 in hist_iter__top_callback (iter=0x7fffafff6590, al=0x7fffafff6540, single=true, arg=0x7fffffff9e00) at builtin-top.c:737
  #6 0x0000000000689787 in hist_entry_iter__add (iter=0x7fffafff6590, al=0x7fffafff6540, max_stack_depth=127, arg=0x7fffffff9e00) at util/hist.c:1359
  #7 0x0000000000446710 in perf_event__process_sample (tool=0x7fffffff9e00, event=0x110d250, evsel=0xfea2d0, sample=0x7fffafff6760, machine=0x108c968) at builtin-top.c:845
  #8 0x0000000000447735 in deliver_event (qe=0x7fffffffa120, qevent=0x10fc200) at builtin-top.c:1211
  #9 0x000000000064ccae in do_flush (oe=0x7fffffffa120, show_progress=false) at util/ordered-events.c:245
  #10 0x000000000064d005 in __ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP, timestamp=0) at util/ordered-events.c:324
  #11 0x000000000064d0ef in ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP) at util/ordered-events.c:342
  #12 0x00000000004472a9 in process_thread (arg=0x7fffffff9e00) at builtin-top.c:1120
  #13 0x00007ffff6e7dba8 in start_thread (arg=<optimized out>) at pthread_create.c:448
  #14 0x00007ffff6f01b8c in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

The cause is that perf_env__find_br_cntr_info tries to access a null pointer pmu_caps in the perf_env struct. A similar issue exists for homogeneous core systems which use the cpu_pmu_caps structure.

Fix this by populating cpu_pmu_caps and pmu_caps structures with values from sysfs when calling perf top with branch stack sampling enabled.

[1], LBR event logging introduced here: https://lore.kernel.org/all/20231025201626.3000228-5-kan.liang@linux.intel.com/

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250612163659.1357950-2-thomas.falcon@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26 | perf tools: move perf_pmus__find_core_pmu() prototype to pmus.h | Thomas Falcon
perf_pmus__find_core_pmu() is implemented in util/pmus.c but its prototype is in util/pmu.h. Move it to util/pmus.h. Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250612163659.1357950-1-thomas.falcon@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26 | perf trace: Split BPF skel code to util/bpf_trace_augment.c | Namhyung Kim
And make builtin-trace.c less conditional. Dummy functions will be called when BUILD_BPF_SKEL=0 is used. This makes builtin-trace.c slightly smaller and simpler by removing the skeleton and its helpers. The conditional guard of trace__init_syscalls_bpf_prog_array_maps() is changed from HAVE_BPF_SKEL to HAVE_LIBBPF_SUPPORT as it doesn't have a skeleton in the code directly. And a dummy function is added so that it can be called unconditionally. The function will succeed only if both conditions are true. Do not include trace_augment.h from the BPF code and move the definition of TRACE_AUG_MAX_BUF to the BPF code directly. Reviewed-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250623225721.21553-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26 | perf test: Change all remaining #!/bin/sh to #!/bin/bash | James Clark
There are 43 instances of POSIX shell tests and 35 instances of bash. To give us a single consistent language for testing in, replace all #!/bin/sh with #!/bin/bash. Common sources that are included in both different shells will now work as expected. And we no longer have to fix up bashisms that appear to work when someone's system has sh symlinked to bash, but don't work on other systems that have both shells installed. Although we could have chosen sh, it's not backwards compatible so it wouldn't be possible to bulk convert without re-writing the existing bash tests. Choosing bash also gives us some nicer features including 'local' variable definitions and regexes in if statements that are already widely used in the tests. It's not expected that there are any users with only sh available due to the large number of bash tests that exist. Discussed in relation to running shellcheck here: https://lore.kernel.org/linux-perf-users/e3751a74be34bbf3781c4644f518702a7270220b.1749785642.git.collin.funk1@gmail.com/ Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Collin Funk <collin.funk1@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250623-james-perf-bash-tests-v1-1-f572f54d4559@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>