summaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)Author
2023-04-04perf list: Use relative path for tracepoint scanNamhyung Kim
Committer notes: Added missing #include <unistd.h> for the close() prototype to fix this on Alma Linux 8: 1 21.54 almalinux:8 : FAIL gcc version 8.5.0 20210514 (Red Hat 8.5.0-16) (GCC) util/print-events.c: In function 'print_tracepoint_events': util/print-events.c:103:4: error: implicit declaration of function 'close'; did you mean 'clone'? [-Werror=implicit-function-declaration] close(evt_fd); ^~~~~ clone Also use the newly added scandirat feature test to check if that function is available, providing a HAVE_SCANDIRAT_SUPPORT conditional warning to the user if it isn't available. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf intel-pt: Fix CYC timestamps after standalone CBRAdrian Hunter
After a standalone CBR (not associated with TSC), update the cycles reference timestamp and reset the cycle count, so that CYC timestamps are calculated relative to that point with the new frequency. Fixes: cc33618619cefc6d ("perf tools: Add Intel PT support for decoding CYC packets") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230403154831.8651-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf auxtrace: Fix address filter entire kernel sizeAdrian Hunter
kallsyms is not completely in address order. In find_entire_kern_cb(), calculate the kernel end from the maximum address not the last symbol. Example: Before: $ sudo cat /proc/kallsyms | grep ' [twTw] ' | tail -1 ffffffffc00b8bd0 t bpf_prog_6deef7357e7b4530 [bpf] $ sudo cat /proc/kallsyms | grep ' [twTw] ' | sort | tail -1 ffffffffc15e0cc0 t iwl_mvm_exit [iwlmvm] $ perf.d093603a05aa record -v --kcore -e intel_pt// --filter 'filter *' -- uname |& grep filter Address filter: filter 0xffffffff93200000/0x2ceba000 After: $ perf.8fb0f7a01f8e record -v --kcore -e intel_pt// --filter 'filter *' -- uname |& grep filter Address filter: filter 0xffffffff93200000/0x2e3e2000 Fixes: 1b36c03e356936d6 ("perf record: Add support for using symbols in address filters") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230403154831.8651-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf arm-spe: Add raw decoding for SPEv1.3 MTE and MOPS load/storeRob Herring
Arm SPEv1.3 adds new load/store operation subclasses for Memory Tagging Extension (MTE) and memory operations (MOPS). The memory operations are memcpy and memset. Add support for decoding these new subclasses in the raw decoding. Reviewed-by: Leo Yan <leo.yan@linaro.org Signed-off-by: Rob Herring <robh@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230327162057.4057188-1-robh@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf cs-etm: Handle PERF_RECORD_AUX_OUTPUT_HW_ID packetMike Leach
When using dynamically assigned CoreSight trace IDs the drivers can output the ID / CPU association as a PERF_RECORD_AUX_OUTPUT_HW_ID packet. Update cs-etm decoder to handle this packet by setting the CPU/Trace ID mapping. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Mike Leach <mike.leach@linaro.org> Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Darren Hart <darren@os.amperecomputing.com> Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230331055645.26918-2-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf cs-etm: Move mapping of Trace ID and cpu into helper functionMike Leach
The information to associate Trace ID and CPU will be changing. Drivers will start outputting this as a hardware ID packet in the data file which if present will be used in preference to the AUXINFO values. To prepare for this we provide a helper functions to do the individual ID mapping, and one to extract the IDs from the completed metadata blocks. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Mike Leach <mike.leach@linaro.org> Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Darren Hart <darren@os.amperecomputing.com> Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230331055645.26918-2-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf lock contention: Show detail failure reason for BPFNamhyung Kim
It can fail to collect lock stat from BPF for various reasons. For example, I've got a report that sometimes time calculation seems wrong in case of contended spinlocks. I suspect the time delta went negative for some reason. Count them separately and show in the output like below: $ sudo perf lock contention -abE5 sleep 10 contended total wait max wait avg wait type caller 13 785.61 us 79.36 us 60.43 us spinlock remove_wait_queue+0x14 10 469.02 us 87.51 us 46.90 us spinlock prepare_to_wait+0x27 9 289.09 us 69.08 us 32.12 us spinlock finish_wait+0x36 114 251.05 us 8.56 us 2.20 us spinlock try_to_wake_up+0x1f5 132 188.63 us 5.01 us 1.43 us spinlock __wake_up_common_lock+0x62 === output for debug === bad: 1, total: 279 bad rate: 0.36 % histogram of failure reasons task: 1 stack: 0 time: 0 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230327225711.245738-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf symbol: Remove unused branch_callstackAdrian Hunter
branch_callstack was added by commit 8b7bad58efb7 ("perf callchain: Support handling complete branch stacks as histograms") but never used. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230330131833.12864-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf block-range: Move debug code behind ifndef NDEBUGIan Rogers
Make good on a comment and avoid a unused-but-set-variable warning. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230330183827.1412303-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf symbol: Add command line support for addr2line pathIan Rogers
Allow addr2line to be set either on the command line or via the perfconfig file. This doesn't currently work with llvm-addr2line as the addr2line code emits two things: 1) the address to decode, 2) a bogus ',' value. The expectation is the bogus value will generate: ?? ??:0 that terminates the addr2line reading. However, the output from llvm-addr2line is a single line with just the input ',' locking up the addr2line reading that is expecting a second line. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf annotate: Allow objdump to be set in perfconfigIan Rogers
Allow the setting of the objdump command in the perfconfig. Update man page for this new option. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf annotate: Own objdump_path and disassembler_style stringsIan Rogers
Make struct annotation_options own the strings objdump_path and disassembler_style, freeing them on exit. Add missing strdup for disassembler_style when read from a config file. Committer notes: Converted free(obj->member) to zfree(&obj->member) in annotation_options__exit() Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf annotate: Add init/exit to annotation_options remove defaultIan Rogers
The annotation__default_options global variable was used to initialize annotation_options. Switch to the init/exit pattern as later changes will give ownership over strings and this will be necessary to avoid memory leaks. Committer note: Fix the GTK2=1 build, hist_entry__gtk_annotate() needs to receive a 'struct annotation_options' pointer. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf tools: Avoid warning in do_realloc_array_as_needed()Adrian Hunter
do_realloc_array_as_needed() used memcpy() of zero size with a NULL pointer. Check the size first to avoid sanitize warning. Discovered using EXTRA_CFLAGS="-fsanitize=undefined -fsanitize=address". Reported-by: kernel test robot <yujie.liu@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/oe-lkp/202303061424.6ad43294-yujie.liu@intel.com Link: https://lore.kernel.org/r/20230316194156.8320-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf symbols: Fix unaligned access in get_x86_64_plt_disp()Adrian Hunter
Use memcpy() to avoid unaligned access. Discovered using EXTRA_CFLAGS="-fsanitize=undefined -fsanitize=address". Fixes: ce4c8e7966f317ef ("perf symbols: Get symbols for .plt.got for x86-64") Reported-by: kernel test robot <yujie.liu@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/oe-lkp/202303061424.6ad43294-yujie.liu@intel.com Link: https://lore.kernel.org/r/20230316194156.8320-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf symbols: Fix use-after-free in get_plt_got_name()Adrian Hunter
Fix use-after-free in get_plt_got_name(). Discovered using EXTRA_CFLAGS="-fsanitize=undefined -fsanitize=address". Fixes: ce4c8e7966f317ef ("perf symbols: Get symbols for .plt.got for x86-64") Reported-by: kernel test robot <yujie.liu@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/oe-lkp/202303061424.6ad43294-yujie.liu@intel.com Link: https://lore.kernel.org/r/20230316194156.8320-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf metrics: Add has_pmem literalIan Rogers
Add literal so that if nvdimms aren't installed we can record fewer events. The file detection mechanism was suggested by Dan Williams <dan.j.williams@intel.com> in: https://lore.kernel.org/linux-perf-users/641bbe1eced26_1b98bb29440@dwillia2-xfh.jf.intel.com.notmuch/ Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230324072218.181880-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf lock contention: Fix msan issue in lock_contention_read()Namhyung Kim
I got a report of a msan failure like below: $ sudo perf lock con -ab -- sleep 1 ... ==224416==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x5651160d6c96 in lock_contention_read util/bpf_lock_contention.c:290:8 #1 0x565115f90870 in __cmd_contention builtin-lock.c:1919:3 #2 0x565115f90870 in cmd_lock builtin-lock.c:2385:8 #3 0x565115f03a83 in run_builtin perf.c:330:11 #4 0x565115f03756 in handle_internal_command perf.c:384:8 #5 0x565115f02d53 in run_argv perf.c:428:2 #6 0x565115f02d53 in main perf.c:562:3 #7 0x7f43553bc632 in __libc_start_main #8 0x565115e865a9 in _start It was because the 'key' variable is not initialized. Actually it'd be set by bpf_map_get_next_key() but msan didn't seem to understand it. Let's make msan happy by initializing the variable. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230324001922.937634-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf hist: Improve srcfile sort key performance (really)Namhyung Kim
The earlier commit f0cdde28fecc0d7f ("perf hist: Improve srcfile sort key performance") updated the srcfile logic but missed to change the ->cmp() callback which is called for every sample. It should use the same logic like in the srcline to speed up the processing because it'd return the same information repeatedly for the same address. The real processing will be done in sort__srcfile_collapse(). Fixes: f0cdde28fecc0d7f ("perf hist: Improve srcfile sort key performance") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230323025005.191239-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf report: Append inlines to non-DWARF callchainsArtem Savkov
Append information about inlined functions to FP and LBR callchains from DWARF debuginfo when available. Do so by calling append_inlines() from add_callchain_ip(). Testing it: Frame-pointer mode recorded with 'perf record --call-graph=fp --freq=max -- ./a.out' #include <stdio.h> #include <stdint.h> static __attribute__((noinline)) uint32_t func5(uint32_t i) { return i + 10; } static uint32_t func4(uint32_t i) { return func5(i + 5); } static inline uint32_t func3(uint32_t i) { return func4(i + 4); } static __attribute__((noinline)) uint32_t func2(uint32_t i) { return func3(i + 3); } static uint32_t func1(uint32_t i) { return func2(i + 2); } __attribute__((noinline)) uint64_t entry(void) { uint64_t ret = 0; uint32_t i = 0; for (i = 0; i < 1000000; i++) { ret += func1(i); ret -= func2(i); ret += func3(i); ret += func4(i); ret -= func5(i); } return ret; } int main(int argc, char **argv) { printf("%s\n", __func__); return entry(); } ====== Here is the output I get with '--call-graph callee --no-children' ====== # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 250 of event 'cycles:u' # Event count (approx.): 26819859 # # Overhead Command Shared Object Symbol # ........ ....... .................... ..................................... # 43.58% a.out a.out [.] func5 | |--28.93%--entry | main | __libc_start_call_main | --14.65%--func4 (inlined) | |--10.45%--entry | main | __libc_start_call_main | --4.20%--func3 (inlined) entry main __libc_start_call_main 38.80% a.out a.out [.] entry | |--23.27%--func4 (inlined) | | | |--20.28%--func3 (inlined) | | func2 | | main | | __libc_start_call_main | | | --2.99%--entry | main | __libc_start_call_main | |--8.17%--func5 | main | __libc_start_call_main | |--3.89%--func1 (inlined) | entry | main | __libc_start_call_main | --3.48%--entry main __libc_start_call_main 13.07% a.out a.out [.] func2 | ---func5 main __libc_start_call_main 1.54% a.out [unknown] [k] 0xffffffff81e011b7 1.16% a.out [unknown] [k] 0xffffffff81e00193 | --0.57%--__mmap64 (inlined) __mmap64 (inlined) 0.34% a.out ld-linux-x86-64.so.2 [.] __tunable_get_val 0.34% a.out ld-linux-x86-64.so.2 [.] strcmp 0.32% a.out libc.so.6 [.] strchr 0.31% a.out ld-linux-x86-64.so.2 [.] _dl_relocate_object 0.22% a.out ld-linux-x86-64.so.2 [.] _dl_init_paths 0.18% a.out ld-linux-x86-64.so.2 [.] get_common_cache_info.constprop.0 0.14% a.out ld-linux-x86-64.so.2 [.] __GI___tunables_init # # (Tip: Show individual samples with: perf script) # ====== It does not seem to be out of order, or at least it is consistent with what I get with dwarf unwinders. Committer notes: Adrian Hunter pointed out that this breaks --branch-history, so don't do it for branches, see the second Link below. Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: <asavkov@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230316133557.868731-2-asavkov@redhat.com Link: https://lore.kernel.org/r/54129783-2960-84e1-05e9-97ac70ffb432@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-21perf tools: Add support for perf_event_attr::config3Rob Herring
perf_event_attr has gained a new field, config3, so add support for it extending the existing configN support. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220914-arm-perf-tool-spe1-2-v2-v5-2-2cf5210b2f77@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-21perf kvm: Reference count 'struct kvm_info'Leo Yan
hists__add_entry_ops() doesn't allocate a new histogram entry if it has an existing entry for a KVM event, in this case, find_create_kvm_event() allocates a 'struct kvm_info' but it's not used by any histograms and never freed. To fix the memory leak, this patch first introduces a refcnt and a set of functions for refcnt operations on 'struct kvm_info'. When the data structure is not anymore used (the refcnt hits zero) kvm_info__zput() will free the memory used. Committer: Provide a nop version of kvm_info__zput() to be used when HAVE_KVM_STAT_SUPPORT isn't defined as it is used unconditionally in hists__findnew_entry() and hist_entry__delete(). Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230320061619.29520-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-20perf report: Add 'simd' sort fieldGerman Gomez
Add 'simd' sort field to visualize SIMD ops in 'perf report'. Rows are labeled with the SIMD ISA, and the type of predicate (if any): - [p] partial predicate - [e] empty predicate (no elements in the vector being used) Example with Arm SPE and SVE (Scalable Vector Extension): #include <arm_sve.h> double src[1025], dst[1025]; int main(void) { svfloat64_t vc = svdup_f64(1); for(;;) for(int i = 0; i < 1025; i += svcntd()) { svbool_t pg = svwhilelt_b64(i, 1025); svfloat64_t vsrc = svld1(pg, &src[i]); svfloat64_t vdst = svadd_x(pg, vsrc, vc); svst1(pg, &dst[i], vdst); } return 0; } ... compiled using "gcc-11 -march=armv8-a+sve -O3" Profiling on a platform that implements FEAT_SVE and FEAT_SPEv1p1: $ perf record -e arm_spe_0// -- ./a.out $ perf report --itrace=i1i -s overhead,pid,simd,sym Overhead Pid:Command Simd Symbol ........ ................ ....... ...................... 53.76% 10758:program [.] main 46.14% 10758:program [.] SVE [.] main 0.09% 10758:program [p] SVE [.] main The report shows 0.09% of the sampled SVE operations use partial predicates due to src and dst arrays not being multiples of the vector register lengths. Signed-off-by: German Gomez <german.gomez@arm.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman.Khandual@arm.com Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230320151509.1137462-2-james.clark@arm.com Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-20perf arm-spe: Add SVE flags to the SPE samplesGerman Gomez
Add flags from the Scalable Vector Extension (SVE) to the SPE samples which are available from Armv8.3 (FEAT_SPEv1p1). These will be displayed in a new SIMD sort field in a later commit. Signed-off-by: German Gomez <german.gomez@arm.com> Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20230320151509.1137462-2-james.clark@arm.com Cc: Anshuman.Khandual@arm.com Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: John Garry <john.g.garry@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-20perf arm-spe: Refactor arm-spe to support operation packet typeGerman Gomez
Extend the decoder of Arm SPE records to support more fields from the operation packet type. Not all fields are being decoded by this commit. Only those needed to support the use-case SVE load/store/other operations. Suggested-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: German Gomez <german.gomez@arm.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman.Khandual@arm.com Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230320151509.1137462-2-james.clark@arm.com Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-20perf event: Add 'simd_flags' field to 'struct perf_sample'German Gomez
Add new field to 'struct perf_sample' to store flags related to SIMD ops. It will be used to store SIMD information from SVE and NEON when profiling using ARM SPE. Signed-off-by: German Gomez <german.gomez@arm.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman.Khandual@arm.com Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230320151509.1137462-2-james.clark@arm.com Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-20perf intel-pt: Add support for new branch instructions ERETS and ERETUAdrian Hunter
Intel Flexible Return and Event Delivery (FRED) adds instructions ERETS (return to supervisor) and ERETU (return to user). Intel PT instruction decoder needs to know about these instructions because they are branch instructions. Similar to IRET instructions, when the decoder encounters one of these instructions it will match it to a TIP (target instruction pointer) packet that informs what the branch destination is. The existing "x86 instruction decoder - new instructions" test can be used to test the result e.g. $ perf test -v ins |& grep eret Decoded ok: f2 0f 01 ca erets Decoded ok: f3 0f 01 ca eretu Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230320183517.15099-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-20perf symbol: Sort names under write lockIan Rogers
If finding a name doesn't find the sorted names then they are allocated and sorted. This shouldn't be done under a read lock as another reader may access it. Release the read lock and acquire the write lock, then release the write lock and reacquire the read lock. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: André Almeida <andrealmeid@collabora.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20230320033810.980165-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-20perf bpf_counter: Use public cpumap accessorsIan Rogers
Avoid the use of internal apis via the cpumap accessor functions. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: André Almeida <andrealmeid@collabora.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20230320033810.980165-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-20perf symbol: Avoid memory leak from abi::__cxa_demangleIan Rogers
Rather than allocate memory, allow abi::__cxa_demangle to do that. This avoids a problem where on error NULL was returned triggering a memory leak. Fixes: 3b4e4efe88f615f1 ("perf symbol: Add abi::__cxa_demangle C++ demangling support") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: André Almeida <andrealmeid@collabora.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20230320033810.980165-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf kvm: Add TUI mode for stat reportLeo Yan
Since we have supported histograms list and prepared the dimensions in the tool, this patch adds TUI mode for stat report. It also adds UI progress for sorting for better user experience. Committer notes: kvm_display() is only used by functions enclosed in: #if defined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT) So do it with this new function as well. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf kvm: Polish sorting keyLeo Yan
Since histograms supports sorting, the tool doesn't need to maintain the mapping between the sorting keys and the corresponding comparison callbacks, therefore, this patch removes structure kvm_event_key. But we still need to validate the sorting key, this patch uses an array for sorting keys and renames function select_key() to is_valid_key() to validate the sorting key passed by user. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf kvm: Use histograms list to replace cached listLeo Yan
perf kvm tool defines its own cached list which is managed with RB tree, histograms also provide RB tree to manage data entries. Since now we have introduced histograms in the tool, it's not necessary to use the self defined list and we can directly use histograms list to manage KVM events. This patch changes to use histograms list to track KVM events, and it invokes the common function hists__output_resort_cb() to sort result, this also give us flexibility to extend more sorting key words easily. After histograms list supported, the cached list is redundant so remove the relevant code for it. Committer notes: kvm_hists__reinit() is only used by functions enclosed in: #if defined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT) So do it with this new function as well. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf kvm: Add dimensions for KVM event statisticsLeo Yan
To support KVM event statistics, this patch firstly registers histograms columns and sorting fields; every column or field has its own format structure, the format structure is dereferenced to access the dimension, finally the dimension provides the comparison callback for sorting result. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf hist: Add 'kvm_info' field in histograms entryLeo Yan
__hists__add_entry() creates a temporary entry and compare it with existed histograms entries, if any existed entry equals to the temporary entry it skips to allocation to avoid duplication. The problem for support KVM event in histograms is it doesn't contain any info to identify KVM event and can be used for comparison entries. This patch adds 'kvm_info' field in the histograms entry which contains the KVM event's key, this identifier will be used for comparison histograms entries in later change. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf kvm: Parse address location for samplesLeo Yan
Parse address location for samples and save it into the structure 'perf_kvm_stat', it is to be used by histograms entry. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf kvm: Introduce histograms data structuresLeo Yan
This is a preparation to support histograms in perf kvm tool. As first step, this patch defines histograms data structures and initialize them. Committer notes: Those are only used by functions enclosed in: #if efined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT) So do this for these new functions and struct as well. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf kvm: Use macro to replace variable 'decode_str_len'Leo Yan
The variable 'decode_str_len' defines the string length for KVM event name and every arch defines its own values. This introduces complexity that the variable definition are spreading in multiple source files under arch folder. This patch refactors code to use a macro KVM_EVENT_NAME_LEN to define event name length and thus remove the definitions in arch files. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf kvm: Use subtraction for comparison metricsLeo Yan
Currently the metrics comparison uses greater operator (>), it returns the boolean value (0 or 1). This patch changes to use subtraction as comparison result, which can be used by histograms sorting. Since the subtraction result is u64 type, we change key_cmp_fun's return type to int64_t to avoid overflow. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf kvm: Add pointer to 'perf_kvm_stat' in kvm eventLeo Yan
Sometimes, handling kvm events needs to base on global variables, e.g. when read event counts we need to know the target vcpu ID; the global variables are stored in structure perf_kvm_stat. This patch adds add a 'perf_kvm_stat' pointer in kvm event structure, it is to be used by later refactoring. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf bpf filter: Show warning for missing sample flagsNamhyung Kim
For a BPF filter to work properly, users need to provide appropriate options to enable the sample types. Otherwise the BPF program would see an invalid value (i.e. always 0) and filter won't work well. Show a warning message if sample types are missing like below. $ sudo ./perf record -e cycles --filter 'addr < 100' true Error: cycles event does not have PERF_SAMPLE_ADDR Hint: please add -d option to perf record. failed to set filter "BPF" on event cycles with 22 (Invalid argument) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf bpf filter: Add logical OR operatorNamhyung Kim
It supports two or more expressions connected as a group and the group result is considered true when one of them returns true. The new group operators (GROUP_BEGIN and GROUP_END) are added to setup and check the condition. As it doesn't allow nested groups, the condition is saved in local variables. For example, the following is to get samples only if the data source memory level is L2 cache or the weight value is greater than 30. $ sudo ./perf record -adW -e cpu/mem-loads/pp \ > --filter 'mem_lvl == l2 || weight > 30' -- sleep 1 $ sudo ./perf script -F data_src,weight 10668100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 47 11868100242 |OP LOAD|LVL LFB/MAB or LFB/MAB hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 57 10668100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 56 10650100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L2 miss|LCK No|BLK N/A 144 10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 16 10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 20 11868100242 |OP LOAD|LVL LFB/MAB or LFB/MAB hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 189 1026a100142 |OP LOAD|LVL L1 or L1 hit|SNP None|TLB L1 or L2 hit|LCK Yes|BLK N/A 193 10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 18 ... Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf bpf filter: Add data_src sample data supportNamhyung Kim
The data_src has many entries to express memory behaviors. Add each term separately so that users can combine them for their purpose. I didn't add prefix for the constants for simplicity as they are mostly distinguishable but I had to use l1_miss and l2_hit for mem_dtlb since mem_lvl has different values for the same names. Note that I decided mem_lvl to be used as an alias of mem_lvlnum as it's deprecated now. According to the comment in the UAPI header, users should use the mix of mem_lvlnum, mem_remote and mem_snoop. Also the SNOOPX bits are concatenated to mem_snoop for simplicity. The following terms are used for data_src and the corresponding perf sample data fields: * mem_op : { load, store, pfetch, exec } * mem_lvl: { l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem } * mem_snoop: { none, hit, miss, hitm, fwd, peer } * mem_remote: { remote } * mem_lock: { locked } * mem_dtlb { l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault } * mem_blk { by_data, by_addr } * mem_hops { hops0, hops1, hops2, hops3 } We can now use a filter expression like below: 'mem_op == load, mem_lvl <= l2, mem_dtlb == l1_hit' 'mem_dtlb == l2_miss, mem_hops > hops1' 'mem_lvl == ram, mem_remote == 1' Note that 'na' is shared among the terms as it has the same value except for mem_lvl. I don't have a good idea to handle that for now. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf bpf filter: Add more weight sample data supportNamhyung Kim
The weight data consists of a couple of fields with the PERF_SAMPLE_WEIGHT_STRUCT. Add weight{1,2,3} term to select them separately. Also add their aliases like 'ins_lat', 'p_stage_cyc' and 'retire_lat'. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf bpf filter: Add 'pid' sample data supportNamhyung Kim
The pid is special because it's saved in the PERF_SAMPLE_TID together. So it needs to differenciate tid and pid using the 'part' field in the perf bpf filter entry struct. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf record: Record dropped sample countNamhyung Kim
When it uses bpf filters, event might drop some samples. It'd be nice if it can report how many samples it lost. As LOST_SAMPLES event can carry the similar information, let's use it for bpf filters. To indicate it's from BPF filters, add a new misc flag for that and do not display cpu load warnings. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf record: Add BPF event filter supportNamhyung Kim
Use --filter option to set BPF filter for generic events other than the tracepoints or Intel PT. The BPF program will check the sample data and filter according to the expression. For example, the below is the typical perf record for frequency mode. The sample period started from 1 and increased gradually. $ sudo ./perf record -e cycles true $ sudo ./perf script perf-exec 2272336 546683.916875: 1 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916892: 1 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916899: 3 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916905: 17 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916911: 100 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916917: 589 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916924: 3470 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916930: 20465 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) true 2272336 546683.916940: 119873 cycles: ffffffff8283afdd perf_iterate_ctx+0x2d ([kernel.kallsyms]) true 2272336 546683.917003: 461349 cycles: ffffffff82892517 vma_interval_tree_insert+0x37 ([kernel.kallsyms]) true 2272336 546683.917237: 635778 cycles: ffffffff82a11400 security_mmap_file+0x20 ([kernel.kallsyms]) When you add a BPF filter to get samples having periods greater than 1000, the output would look like below: $ sudo ./perf record -e cycles --filter 'period > 1000' true $ sudo ./perf script perf-exec 2273949 546850.708501: 5029 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms]) perf-exec 2273949 546850.708508: 32409 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms]) perf-exec 2273949 546850.708526: 143369 cycles: ffffffff82b4cdbf xas_start+0x5f ([kernel.kallsyms]) perf-exec 2273949 546850.708600: 372650 cycles: ffffffff8286b8f7 __pagevec_lru_add+0x117 ([kernel.kallsyms]) perf-exec 2273949 546850.708791: 482953 cycles: ffffffff829190de __mod_memcg_lruvec_state+0x4e ([kernel.kallsyms]) true 2273949 546850.709036: 501985 cycles: ffffffff828add7c tlb_gather_mmu+0x4c ([kernel.kallsyms]) true 2273949 546850.709292: 503065 cycles: 7f2446d97c03 _dl_map_object_deps+0x973 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) Committer notes: Add stubs for perf_bpf_filter__prepare() and perf_bpf_filter__destroy() to tools/perf/util/python.c to keep it building. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf bpf filter: Implement event sample filteringNamhyung Kim
The BPF program will be attached to a perf_event and be triggered when it overflows. It'd iterate the filters map and compare the sample value according to the expression. If any of them fails, the sample would be dropped. Also it needs to have the corresponding sample data for the expression so it compares data->sample_flags with the given value. To access the sample data, it uses the bpf_cast_to_kern_ctx() kfunc which was added in v6.2 kernel. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf bpf filter: Introduce basic BPF filter expressionNamhyung Kim
This implements a tiny parser for the filter expressions used for BPF. Each expression will be converted to struct perf_bpf_filter_expr and be passed to a BPF map. For now, I'd like to start with the very basic comparisons like EQ or GT. The LHS should be a term for sample data and the RHS is a number. The expressions are connected by a comma. For example, period > 10000 ip < 0x1000000000000, cpu == 3 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf top: Fix rare segfault in thread__comm_len()liuwenyu
In thread__comm_len(),strlen() is called outside of the thread->comm_lock critical section,which may cause a UAF problems if comm__free() is called by the process_thread concurrently. backtrace of the core file is as follows: (gdb) bt #0 __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77 #1 0x000055ad15d31de5 in thread__comm_len (thread=0x7f627d20e300) at util/thread.c:320 #2 0x000055ad15d4fade in hists__calc_col_len (h=0x7f627d295940, hists=0x55ad1772bfe0) at util/hist.c:103 #3 hists__calc_col_len (hists=0x55ad1772bfe0, h=0x7f627d295940) at util/hist.c:79 #4 0x000055ad15d52c8c in output_resort (hists=hists@entry=0x55ad1772bfe0, prog=0x0, use_callchain=false, cb=cb@entry=0x0, cb_arg=0x0) at util/hist.c:1926 #5 0x000055ad15d530a4 in evsel__output_resort_cb (evsel=evsel@entry=0x55ad1772bde0, prog=prog@entry=0x0, cb=cb@entry=0x0, cb_arg=cb_arg@entry=0x0) at util/hist.c:1945 #6 0x000055ad15d53110 in evsel__output_resort (evsel=evsel@entry=0x55ad1772bde0, prog=prog@entry=0x0) at util/hist.c:1950 #7 0x000055ad15c6ae9a in perf_top__resort_hists (t=t@entry=0x7ffcd9cbf4f0) at builtin-top.c:311 #8 0x000055ad15c6cc6d in perf_top__print_sym_table (top=0x7ffcd9cbf4f0) at builtin-top.c:346 #9 display_thread (arg=0x7ffcd9cbf4f0) at builtin-top.c:700 #10 0x00007f6282fab4fa in start_thread (arg=<optimized out>) at pthread_create.c:443 #11 0x00007f628302e200 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 The reason is that strlen() get a pointer to a memory that has been freed. The string pointer is stored in the structure comm_str, which corresponds to a rb_tree node,when the node is erased, the memory of the string is also freed. In thread__comm_len(),it gets the pointer within the thread->comm_lock critical section, but passed to strlen() outside of the thread->comm_lock critical section, and the perf process_thread may called comm__free() concurrently, cause this segfault problem. The process is as follows: display_thread process_thread -------------- -------------- thread__comm_len -> thread__comm_str # held the comm read lock -> __thread__comm_str(thread) # release the comm read lock thread__delete # held the comm write lock -> comm__free -> comm_str__put(comm->comm_str) -> zfree(&cs->str) # release the comm write lock # The memory of the string pointed to by comm has been free. -> thread->comm_len = strlen(comm); This patch expand the critical section range of thread->comm_lock in thread__comm_len(), to make strlen() called safe. Signed-off-by: Wenyu Liu <liuwenyu7@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Feilong Lin <linfeilong@huawei.com> Cc: Hewenliang <hewenliang4@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Yunfeng Ye <yeyunfeng@huawei.com> Link: https://lore.kernel.org/r/322bfb49-840b-f3b6-9ef1-f9ec3435b07e@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>