summaryrefslogtreecommitdiff
path: root/tools/perf/util/disasm.c
AgeCommit message (Collapse)Author
2024-12-09perf disasm: Return a proper error when not determining the file typeArnaldo Carvalho de Melo
Before: ⬢ [acme@toolbox a]$ perf annotate --stdio2 -i acme-perf-injected.data 'java.lang.String com.fasterxml.jackson.core.sym.CharsToNameCanonicalizer.findSymbol(char[], int, int, int)' Error: Couldn't annotate java.lang.String com.fasterxml.jackson.core.sym.CharsToNameCanonicalizer.findSymbol(char[], int, int, int): Internal error: Invalid -1 error code ⬢ [acme@toolbox a]$ After: ⬢ [acme@toolbox a]$ perf annotate --stdio2 -i acme-perf-injected.data 'java.lang.String com.fasterxml.jackson.core.sym.CharsToNameCanonicalizer.findSymbol(char[], int, int, int)' Error: Couldn't annotate java.lang.String com.fasterxml.jackson.core.sym.CharsToNameCanonicalizer.findSymbol(char[], int, int, int): Couldn't determine the file /tmp/perf-3308868.map type. ⬢ [acme@toolbox a]$ Reported-by: Francesco Nigro <fnigro@redhat.com> Reported-by: Ilan Green <igreen@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io> Link: https://lore.kernel.org/lkml/Z092D9-r_iOgwIWM@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-13perf disasm: Allow configuring what disassemblers to useArnaldo Carvalho de Melo
The perf tools annotation code used for a long time parsing the output of binutils's objdump (or its reimplementations, like llvm's) to then parse and augment it with samples, allow navigation, etc. More recently disassemblers from the capstone and llvm (libraries, not parsing the output of tools using those libraries to mimic binutils's objdump output) were introduced. So when all those methods are available, there is a static preference for a series of attempts of disassembling a binary, with the 'llvm, capstone, objdump' sequence being hard coded. This patch allows users to change that sequence, specifying via a 'perf config' 'annotate.disassemblers' entry which and in what order disassemblers should be attempted. As alluded to in the comments in the source code of this series, this flexibility is useful for users and developers alike, elliminating the requirement to rebuild the tool with some specific set of libraries to see how the output of disassembling would be for one of these methods. root@x1:~# rm -f ~/.perfconfig root@x1:~# perf annotate -v --stdio2 update_load_avg <SNIP> symbol__disassemble: filename=/usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux, sym=update_load_avg, start=0xffffffffb6148fe0, en> annotating [0x6ff7170] /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux : [0x7407ca0] update_load_avg Disassembled with llvm annotate.disassemblers=llvm,capstone,objdump Samples: 66 of event 'cpu_atom/cycles/P', 10000 Hz, Event count (approx.): 5185444, [percent: local period] update_load_avg() /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux Percent 0xffffffff81148fe0 <update_load_avg>: 1.61 pushq %r15 pushq %r14 1.00 pushq %r13 movl %edx,%r13d 1.90 pushq %r12 pushq %rbp movq %rsi,%rbp pushq %rbx movq %rdi,%rbx subq $0x18,%rsp 15.14 movl 0x1a4(%rdi),%eax root@x1:~# perf config annotate.disassemblers=capstone root@x1:~# cat ~/.perfconfig # this file is auto-generated. [annotate] disassemblers = capstone root@x1:~# root@x1:~# perf annotate -v --stdio2 update_load_avg <SNIP> Disassembled with capstone annotate.disassemblers=capstone Samples: 66 of event 'cpu_atom/cycles/P', 10000 Hz, Event count (approx.): 5185444, [percent: local period] update_load_avg() /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux Percent 0xffffffff81148fe0 <update_load_avg>: 1.61 pushq %r15 pushq %r14 1.00 pushq %r13 movl %edx,%r13d 1.90 pushq %r12 pushq %rbp movq %rsi,%rbp pushq %rbx movq %rdi,%rbx subq $0x18,%rsp 15.14 movl 0x1a4(%rdi),%eax root@x1:~# perf config annotate.disassemblers=objdump,capstone root@x1:~# perf config annotate.disassemblers annotate.disassemblers=objdump,capstone root@x1:~# cat ~/.perfconfig # this file is auto-generated. [annotate] disassemblers = objdump,capstone root@x1:~# perf annotate -v --stdio2 update_load_avg Executing: objdump --start-address=0xffffffff81148fe0 \ --stop-address=0xffffffff811497aa \ -d --no-show-raw-insn -S -C "$1" Disassembled with objdump annotate.disassemblers=objdump,capstone Samples: 66 of event 'cpu_atom/cycles/P', 10000 Hz, Event count (approx.): 5185444, [percent: local period] update_load_avg() /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux Percent Disassembly of section .text: ffffffff81148fe0 <update_load_avg>: #define DO_ATTACH 0x4 ffffffff81148fe0 <update_load_avg>: #define DO_ATTACH 0x4 #define DO_DETACH 0x8 /* Update task and its cfs_rq load average */ static inline void update_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags) { 1.61 push %r15 push %r14 1.00 push %r13 mov %edx,%r13d 1.90 push %r12 push %rbp mov %rsi,%rbp push %rbx mov %rdi,%rbx sub $0x18,%rsp } /* rq->task_clock normalized against any time this cfs_rq has spent throttled */ static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq) { if (unlikely(cfs_rq->throttle_count)) 15.14 mov 0x1a4(%rdi),%eax root@x1:~# After adding a way to select the disassembler from the command line a 'perf test' comparing the output of the various diassemblers should be introduced, to test these codebases. Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steinar H. Gunderson <sesse@google.com> Link: https://lore.kernel.org/r/20241111151734.1018476-4-acme@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-13perf disasm: Define stubs for the LLVM and capstone disassemblersArnaldo Carvalho de Melo
This reduces the number of ifdefs in the main symbol__disassemble() method and paves the way for allowing the user to configure the disassemblers of preference. Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steinar H. Gunderson <sesse@google.com> Link: https://lore.kernel.org/r/20241111151734.1018476-3-acme@kernel.org [ Applied fixes from Masami Hiramatsu and Aditya Bodkhe for when capstone devel files are not available ] Link: https://lore.kernel.org/r/B78FB6DF-24E9-4A3C-91C9-535765EC0E2A@ibm.com Link: https://lore.kernel.org/r/173145729034.2747044.453926054000880254.stgit@mhiramat.roam.corp.google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-11perf disasm: Introduce symbol__disassemble_objdump()Arnaldo Carvalho de Melo
With the first disassemble method in perf, the parsing of objdump output, just like we have for llvm and capstone. This paves the way to allow the user to specify what disassemblers are preferred and to also to at some point allow building without the objdump method. Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steinar H. Gunderson <sesse@google.com> Link: https://lore.kernel.org/r/20241111151734.1018476-2-acme@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-09perf dwarf-regs: Add EM_HOST and EF_HOST definesIan Rogers
Computed from the build architecture defines, EM_HOST and EF_HOST give values that can be used in dwarf register lookup. Place in dwarf-regs.h so the value can be shared. Move some dwarf-regs.c constants used for EM_HOST to dwarf-regs.h. Add CSky constants that may be missing. In disasm.c add an include of dwarf-regs.h as the included arch/*/annotate/instructions.c files make use of the constants and we want the elf.h/dwarf-regs.h dependency to be explicit. Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Anup Patel <anup@brainfault.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: David S. Miller <davem@davemloft.net> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Shenlin Liang <liangshenlin@eswincomputing.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Chen Pei <cp0613@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Aditya Gupta <adityag@linux.ibm.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Cc: Bibo Mao <maobibo@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Atish Patra <atishp@rivosinc.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: linux-csky@vger.kernel.org Link: https://lore.kernel.org/r/20241108234606.429459-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-23perf disasm: Fix not cleaning up disasm_line in symbol__disassemble_raw()Li Huafei
In symbol__disassemble_raw(), the created disasm_line should be discarded before returning an error. When creating disasm_line fails, break the loop and then release the created lines. Fixes: 0b971e6bf1c3 ("perf annotate: Add support to capture and parse raw instruction in powerpc using dso__data_read_offset utility") Signed-off-by: Li Huafei <lihuafei1@huawei.com> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: sesse@google.com Cc: kjain@linux.ibm.com Link: https://lore.kernel.org/r/20241019154157.282038-3-lihuafei1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-23perf disasm: Use disasm_line__free() to properly free disasm_lineLi Huafei
symbol__disassemble_capstone_powerpc() goto the 'err' label when it failed in the loop that created disasm_line, and then used free() directly to free disasm_line. Since the structure disasm_line contains members that allocate memory dynamically, this can result in a memory leak. In fact, we can simply break the loop when it fails in the middle of the loop, and disasm_line__free() will then be called to properly free the created line. Other error paths do not need to consider freeing disasm_line. Fixes: c5d60de1813a ("perf annotate: Add support to use libcapstone in powerpc") Signed-off-by: Li Huafei <lihuafei1@huawei.com> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: sesse@google.com Cc: kjain@linux.ibm.com Link: https://lore.kernel.org/r/20241019154157.282038-2-lihuafei1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-23perf disasm: Use disasm_line__free() to properly free disasm_lineLi Huafei
The structure disasm_line contains members that require dynamically allocated memory and need to be freed correctly using disasm_line__free(). This patch fixes the incorrect release in symbol__disassemble_capstone(). Fixes: 6d17edc113de ("perf annotate: Use libcapstone to disassemble") Signed-off-by: Li Huafei <lihuafei1@huawei.com> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: sesse@google.com Cc: kjain@linux.ibm.com Link: https://lore.kernel.org/r/20241019154157.282038-1-lihuafei1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-18perf build: Rename HAVE_DWARF_SUPPORT to HAVE_LIBDW_SUPPORTIan Rogers
In Makefile.config for unwinding the name dwarf implies either libunwind or libdw. Make it clearer that HAVE_DWARF_SUPPORT is really just defined when libdw is present by renaming to HAVE_LIBDW_SUPPORT. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Leo Yan <leo.yan@arm.com> Cc: Anup Patel <anup@brainfault.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: David S. Miller <davem@davemloft.net> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Shenlin Liang <liangshenlin@eswincomputing.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Chen Pei <cp0613@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Aditya Gupta <adityag@linux.ibm.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Cc: Bibo Mao <maobibo@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Atish Patra <atishp@rivosinc.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: linux-csky@vger.kernel.org Link: https://lore.kernel.org/r/20241017001354.56973-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17perf disasm: Fix capstone memory leakIan Rogers
The insn argument passed to cs_disasm needs freeing. To support accurately having count, add an additional free_count variable. Fixes: c5d60de1813a ("perf annotate: Add support to use libcapstone in powerpc") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Cc: David S. Miller <davem@davemloft.net> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20241016235622.52166-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-03perf annotate: LLVM-based disassemblerSteinar H. Gunderson
Support using LLVM as a disassembler method, allowing helperless annotation in non-distro builds. (It is also much faster than using libbfd or bfd objdump on binaries with a lot of debug information.) This is nearly identical to the output of llvm-objdump; there are some very rare whitespace differences, some minor changes to demangling (since we use perf's regular demangling and not LLVM's own) and the occasional case where llvm-objdump makes a different choice when multiple symbols share the same address. It should work across all of LLVM's supported architectures, although I've only tested 64-bit x86, and finding the right triple from perf's idea of machine architecture can sometimes be a bit tricky. Ideally, we should have some way of finding the triplet just from the file itself. Committer notes: Address this on 32-bit systems by using PRIu64 from inttypes.h 3 17.58 almalinux:9-i386 : FAIL gcc version 11.4.1 20231218 (Red Hat 11.4.1-3) (GCC) util/llvm-c-helpers.cpp: In function ‘char* make_symbol_relative_string(dso*, const char*, u64, u64)’: util/llvm-c-helpers.cpp:150:52: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘u64’ {aka +‘long long unsigned int’} [-Werror=format=] 150 | snprintf(buf, sizeof(buf), "%s+0x%lx", | ~~^ | | | long unsigned int | %llx 151 | demangled ? demangled : sym_name, addr - base_addr); | ~~~~~~~~~~~~~~~~ | | | u64 {aka long long unsigned int} cc1plus: all warnings being treated as errors Signed-off-by: Steinar H. Gunderson <sesse@google.com> Cc: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20240803152008.2818485-3-sesse@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-03perf annotate: Split out read_symbol()Steinar H. Gunderson
The Capstone disassembler code has a useful code snippet to read the bytes for a given code symbol into memory. Split it out into its own function, so that the LLVM disassembler can use it in the next patch. Signed-off-by: Steinar H. Gunderson <sesse@google.com> Cc: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20240803152008.2818485-2-sesse@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14perf annotate: Display the branch counter histogramKan Liang
Display the branch counter histogram in the annotation view. Press 'B' to display the branch counter's abbreviation list as well. Samples: 1M of events 'anon group { branch-instructions:ppp, branch-misses }', 4000 Hz, Event count (approx.): f3 /home/sdp/test/tchain_edit [Percent: local period] Percent │ IPC Cycle Branch Counter (Average IPC: 1.39, IPC Coverage: 29.4%) │ 0000000000401755 <f3>: 0.00 0.00 │ endbr64 │ push %rbp │ mov %rsp,%rbp │ movl $0x0,-0x4(%rbp) 0.00 0.00 │1.33 3 |A |- | ↓ jmp 25 11.03 11.03 │ 11: mov -0x4(%rbp),%eax │ and $0x1,%eax │ test %eax,%eax 17.13 17.13 │2.41 1 |A |- | ↓ je 21 │ addl $0x1,-0x4(%rbp) 21.84 21.84 │2.22 2 |AA |- | ↓ jmp 25 17.13 17.13 │ 21: addl $0x1,-0x4(%rbp) 21.84 21.84 │ 25: cmpl $0x270f,-0x4(%rbp) 11.03 11.03 │0.61 3 |A |- | ↑ jle 11 │ nop │ pop %rbp 0.00 0.00 │0.24 20 |AA |B | ← ret Originally-by: Tinghao Zhang <tinghao.zhang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240813160208.2493643-8-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14perf disasm: Fix memory leak for locked operationsIan Rogers
lock__parse() calls disasm_line__parse() passing &ops->locked.ins.name that will use strdup() to populate it. Ensure ops->locked.ins.name is freed in lock__delete(). Found with address/leak sanitizer. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20240813040613.882075-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-05perf annotate: Set al->data_nr using the notes->src->nr_eventsNamhyung Kim
This is a preparation to support skipping empty events. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240803211332.1107222-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-01perf bpf: Move BPF disassembly routines to separate file to avoid clash with ↵Arnaldo Carvalho de Melo
capstone bpf headers There is a clash of the libbpf and capstone libraries, that ends up with: In file included from /usr/include/capstone/capstone.h:325, from util/disasm.c:1513: /usr/include/capstone/bpf.h:94:14: error: ‘bpf_insn’ defined as wrong kind of tag 94 | typedef enum bpf_insn { So far we're just trying to avoid this by not having both headers included in the same .c or .h file, do it one more time by moving the BPF diassembly routines from util/disasm.c to util/disasm_bpf.c. This is only being hit when building with BUILD_NONDISTRO=1, i.e. building with binutils-devel, that isn't the in the default build due to a licencing clash. We need to reimplement what is now isolated in util/disasm_bpf.c using some other library to have BPF annotation feature that now only is available with BUILD_NONDISTRO=1. Fixes: 6d17edc113de1e21 ("perf annotate: Use libcapstone to disassemble") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZqpUSKPxMwaQKORr@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Set instruction name to be used with insn-stat when using raw ↵Athira Rajeev
instruction Since the "ins.name" is not set while using raw instruction, 'perf annotate' with insn-stat gives wrong data: Result from "./perf annotate --data-type --insn-stat": Annotate Instruction stats total 615, ok 419 (68.1%), bad 196 (31.9%) Name : Good Bad ----------------------------------------------------------- : 419 196 This patch sets "dl->ins.name" in arch specific function "check_ppc_insn" while initialising "struct disasm_line". Also update "ins_find" function to pass "struct disasm_line" as a parameter so as to set its name field in arch specific call. With the patch changes: Annotate Instruction stats total 609, ok 446 (73.2%), bad 163 (26.8%) Name/opcode : Good Bad ----------------------------------------------------------- 58 : 323 80 32 : 49 43 34 : 33 11 OP_31_XOP_LDX : 8 20 40 : 23 0 OP_31_XOP_LWARX : 5 1 OP_31_XOP_LWZX : 2 3 OP_31_XOP_LDARX : 3 0 33 : 0 2 OP_31_XOP_LBZX : 0 1 OP_31_XOP_LWAX : 0 1 OP_31_XOP_LHZX : 0 1 Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-16-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Add support to use libcapstone in powerpcAthira Rajeev
Now perf uses the capstone library to disassemble the instructions in x86. capstone is used (if available) for perf annotate to speed up. Currently it only supports x86 architecture. This patch includes changes to enable this in powerpc. For now, only for data type sort keys, this method is used and only binary code (raw instruction) is read. This is because powerpc approach to understand instructions and reg fields uses raw instruction. The "cs_disasm" is currently not enabled. While attempting to do cs_disasm, observation is that some of the instructions were not identified (ex: extswsli, maddld) and it had to fallback to use objdump. Hence enabling "cs_disasm" is added in comment section as a TODO for powerpc. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-15-atrajeev@linux.vnet.ibm.com [ Use dso__nsinfo(dso) as required to match EXTRA_CFLAGS=-DREFCNT_CHECKING=1 build expectations ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Use capstone_init and remove open_capstone_handle from disasm.cAthira Rajeev
capstone_init is made availbale for all archs to use and updated to enable support for CS_ARCH_PPC as well. Patch removes open_capstone_handle and uses capstone_init in all the places. Committer notes: Avoid including capstone/capstone.h from print_insn.h to not break the build in builtin-script.c due to the namespace clash with libbpf: /usr/include/capstone/bpf.h:94:14: error: 'bpf_insn' defined as wrong kind of tag Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-14-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Update instruction tracking for powerpcAthira Rajeev
Add instruction tracking function "update_insn_state_powerpc" for powerpc. Example sequence in powerpc: ld r10,264(r3) mr r31,r3 <<after some sequence> ld r9,312(r31) Consider ithe sample is pointing to: "ld r9,312(r31)". Here the memory reference is hit at "312(r31)" where 312 is the offset and r31 is the source register. Previous instruction sequence shows that register state of r3 is moved to r31. So to identify the data type for r31 access, the previous instruction ("mr") needs to be tracked and the state type entry has to be updated. Current instruction tracking support in perf tools infrastructure is specific to x86. Patch adds this support for powerpc as well. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-12-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Add some of the arithmetic instructions to support ↵Athira Rajeev
instruction tracking in powerpc Data-type profiling has the concept of instruction tracking. Example sequence in powerpc: ld r10,264(r3) mr r31,r3 <<after some sequence> ld r9,312(r31) or differently lwz r10,264(r3) add r31, r3, RB lwz r9, 0(r31) If a sample is hit at "lwz r9, 0(r31)", data type of r31 depends on previous instruction sequence here. So to track the previous instructions, patch adds changes to identify some of the arithmetic instructions which are having opcode as 31. Since memory instructions also has cases with opcode 31, use the bits 22:30 to filter the arithmetic instructions here. Also there are instructions with just two operands like "addme", "addze". This patch adds new instructions ops "arithmetic_ops" to handle this Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-10-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Add support to identify memory instructions of opcode 31 in ↵Athira Rajeev
powerpc There are memory instructions in powerpc with opcode as 31. Example: "ldx RT,RA,RB" , Its X form is as below: ______________________________________ | 31 | RT | RA | RB | 21 |/| -------------------------------------- 0 6 11 16 21 30 31 The opcode for "ldx" is 31. There are other instructions also with opcode 31 which are memory insn like ldux, stbx, lwzx, lhaux But all instructions with opcode 31 are not memory. Example is add instruction: "add RT,RA,RB" The value in bit 21-30 [ 21 for ldx ] is different for these instructions. Patch uses this value to assign instruction ops for these cases. The naming convention and value to identify these are picked from defines in "arch/powerpc/include/asm/ppc-opcode.h" Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-9-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Add parse function for memory instructions in powerpcAthira Rajeev
Use the raw instruction code and macros to identify memory instructions, extract register fields and also offset. The implementation addresses the D-form, X-form, DS-form instructions. Two main functions are added. New parse function "load_store__parse" as instruction ops parser for memory instructions. Unlike other parsers (like mov__parse), this one fills in the "multi_regs" field for source/target and new added "mem_ref" field. No other fields are set because, here there is no need to parse the disassembled code and arch specific macros will take care of extracting offset and regs which is easier and will be precise. In powerpc, all instructions with a primary opcode from 32 to 63 are memory instructions. Update "ins__find" function to have "raw_insn" also as a parameter. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-8-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Update parameters for reg extract functions to use raw ↵Athira Rajeev
instruction on powerpc Use the raw instruction code and macros to identify memory instructions, extract register fields and also offset. The implementation addresses the D-form, X-form, DS-form instructions. Adds "mem_ref" field to check whether source/target has memory reference. Add function "get_powerpc_regs" which will set these fields: reg1, reg2, offset depending of where it is source or target ops. Update "parse" callback for "struct ins_ops" to also pass "struct disasm_line" as argument. This is needed in parse functions where opcode is used to determine whether to set multi_regs and other fields Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-7-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Add support to capture and parse raw instruction in powerpc ↵Athira Rajeev
using dso__data_read_offset utility Add support to capture and parse raw instruction in powerpc. Currently, the perf tool infrastructure uses two ways to disassemble and understand the instruction. One is objdump and other option is via libcapstone. Currently, the perf tool infrastructure uses "--no-show-raw-insn" option with "objdump" while disassemble. Example from powerpc with this option for an instruction address is: Snippet from: objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux> c0000000010224b4: lwz r10,0(r9) This line "lwz r10,0(r9)" is parsed to extract instruction name, registers names and offset. Also to find whether there is a memory reference in the operands, "memory_ref_char" field of objdump is used. For x86, "(" is used as memory_ref_char to tackle instructions of the form "mov (%rax), %rcx". In case of powerpc, not all instructions using "(" are the only memory instructions. Example, above instruction can also be of extended form (X form) "lwzx r10,0,r19". Inorder to easy identify the instruction category and extract the source/target registers, patch adds support to use raw instruction for powerpc. Approach used is to read the raw instruction directly from the DSO file using "dso__data_read_offset" utility which is already implemented in perf infrastructure in "util/dso.c". Example: 38 01 81 e8 ld r4,312(r1) Here "38 01 81 e8" is the raw instruction representation. In powerpc, this translates to instruction form: "ld RT,DS(RA)" and binary code as: | 58 | RT | RA | DS | | ------------------------------------- 0 6 11 16 30 31 Function "symbol__disassemble_dso" is updated to read raw instruction directly from DSO using dso__data_read_offset utility. In case of above example, this captures: line: 38 01 81 e8 The above works well when 'perf report' is invoked with only sort keys for data type ie type and typeoff. Because there is no instruction level annotation needed if only data type information is requested for. For annotating sample, along with type and typeoff sort key, "sym" sort key is also needed. And by default invoking just "perf report" uses sort key "sym" that displays the symbol information. With approach changes in powerpc which first reads DSO for raw instruction, "perf annotate" and "perf report" + a key breaks since it doesn't do the instruction level disassembly. Snippet of result from 'perf report': Samples: 1K of event 'mem-loads', 4000 Hz, Event count (approx.): 937238 do_work /usr/bin/pmlogger [Percent: local period] Percent│ ea230010 │ 3a550010 │ 3a600000 │ 38f60001 │ 39490008 │ 42400438 51.44 │ 81290008 │ 7d485378 Here, raw instruction is displayed in the output instead of human readable annotated form. One way to get the appropriate data is to specify "--objdump path", by which code annotation will be done. But the default behaviour will be changed. To fix this breakage, check if "sym" sort key is set. If so fallback and use the libcapstone/objdump way of disassmbling the sample. With the changes and "perf report" Samples: 1K of event 'mem-loads', 4000 Hz, Event count (approx.): 937238 do_work /usr/bin/pmlogger [Percent: local period] Percent│ ld r17,16(r3) │ addi r18,r21,16 │ li r19,0 │ 8b0: rldicl r10,r10,63,33 │ addi r10,r10,1 │ mtctr r10 │ ↓ b 8e4 │ 8c0: addi r7,r22,1 │ addi r10,r9,8 │ ↓ bdz d00 51.44 │ lwz r9,8(r9) │ mr r8,r10 │ cmpw r20,r9 Committer notes: Just add the extern for 'sort_order' in disasm.c so that we don't end up breaking the build due to this type colision with capstone and libbpf: In file included from /usr/include/capstone/capstone.h:325, from /git/perf-6.10.0/tools/perf/util/print_insn.h:23, from builtin-script.c:38: /usr/include/capstone/bpf.h:94:14: error: 'bpf_insn' defined as wrong kind of tag 94 | typedef enum bpf_insn { I reported this to the bpf mailing list, see one of the links below. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-6-atrajeev@linux.vnet.ibm.com Link: https://lore.kernel.org/bpf/ZqOltPk9VQGgJZAA@x1/T/#u Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Add disasm_line__parse() to parse raw instruction for powerpcAthira Rajeev
Currently, the perf tool infrastructure uses the disasm_line__parse function to parse disassembled line. Example snippet from objdump: objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux> c0000000010224b4: lwz r10,0(r9) This line "lwz r10,0(r9)" is parsed to extract instruction name, registers names and offset. In powerpc, the approach for data type profiling uses raw instruction instead of result from objdump to identify the instruction category and extract the source/target registers. Example: 38 01 81 e8 ld r4,312(r1) Here "38 01 81 e8" is the raw instruction representation. Add function "disasm_line__parse_powerpc" to handle parsing of raw instruction. Also update "struct disasm_line" to save the binary code/ With the change, function captures: line -> "38 01 81 e8 ld r4,312(r1)" raw instruction "38 01 81 e8" Raw instruction is used later to extract the reg/offset fields. Macros are added to extract opcode and register fields. "struct disasm_line" is updated to carry union of "bytes" and "raw_insn" of 32 bit to carry raw code (raw). Function "disasm_line__parse_powerpc fills the raw instruction hex value and can use macros to get opcode. There is no changes in existing code paths, which parses the disassembled code. The size of raw instruction depends on architecture. In case of powerpc, the parsing the disasm line needs to handle cases for reading binary code directly from DSO as well as parsing the objdump result. Hence adding the logic into separate function instead of updating "disasm_line__parse". The architecture using the instruction name and present approach is not altered. Since this approach targets powerpc, the macro implementation is added for powerpc as of now. Since the disasm_line__parse is used in other cases (perf annotate) and not only data tye profiling, the powerpc callback includes changes to work with binary code as well as mnemonic representation. Also in case if the DSO read fails and libcapstone is not supported, the approach fallback to use objdump as option. Hence as option, patch has changes to ensure objdump option also works well. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-5-atrajeev@linux.vnet.ibm.com [ Add check for strndup() result ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Add "update_insn_state" callback function to handle arch ↵Athira Rajeev
specific instruction tracking Add "update_insn_state" callback to "struct arch" to handle instruction tracking. Currently updating instruction state is handled by static function "update_insn_state_x86" which is defined in "annotate-data.c". Make this as a callback for specific arch and move to archs specific file "arch/x86/annotate/instructions.c" . This will help to add helper function for other platforms in file: "arch/<platform>/annotate/instructions.c" and make changes/updates easier. Define callback "update_insn_state" as part of "struct arch", also make some of the debug functions non-static so that it can be referenced from other places. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-3-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-12perf dso: Fix address sanitizer buildIan Rogers
Various files had been missed from having accessor functions added for the sake of dso reference count checking. Add the function calls and missing dso accessor functions. Fixes: ee756ef7491e ("perf dso: Add reference count checking and accessor functions") Signed-off-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yunseong Kim <yskelg@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20240704011745.1021288-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-05-06perf dso: Add reference count checking and accessor functionsIan Rogers
Add reference count checking to struct dso, this can help with implementing correct reference counting discipline. To avoid RC_CHK_ACCESS everywhere, add accessor functions for the variables in struct dso. The majority of the change is mechanical in nature and not easy to split up. Committer testing: 'perf test' up to this patch shows no regressions. But: util/symbol.c: In function ‘dso__load_bfd_symbols’: util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’ 1683 | dso__set_adjust_symbols(dso); | ^~~~~~~~~~~~~~~~~~~~~~~ In file included from util/symbol.c:21: util/dso.h:268:20: note: declared here 268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val) | ^~~~~~~~~~~~~~~~~~~~~~~ make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1 MKDIR /tmp/tmp.ZWHbQftdN6/tests/workloads/ make[6]: *** Waiting for unfinished jobs.... This was updated: - symbols__fixup_end(&dso->symbols, false); - symbols__fixup_duplicate(&dso->symbols); - dso->adjust_symbols = 1; + symbols__fixup_end(dso__symbols(dso), false); + symbols__fixup_duplicate(dso__symbols(dso)); + dso__set_adjust_symbols(dso); But not build tested with BUILD_NONDISTRO and libbfd devel files installed (binutils-devel on fedora). Add the missing argument: symbols__fixup_end(dso__symbols(dso), false); symbols__fixup_duplicate(dso__symbols(dso)); - dso__set_adjust_symbols(dso); + dso__set_adjust_symbols(dso, true); Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Chengen Du <chengen.du@canonical.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20240504213803.218974-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26perf annotate: Update DSO binary type when trying build-idNamhyung Kim
dso__disassemble_filename() tries to get the filename for objdump (or capstone) using build-id. But I found sometimes it didn't disassemble some functions. It turned out that those functions belong to a DSO which has no binary type set. It seems it sets the binary type for some special files only - like kernel (kallsyms or kcore) or BPF images. And there's a logic to skip dso with DSO_BINARY_TYPE__NOT_FOUND. As it's checked the build-id cache link, it should set the binary type as DSO_BINARY_TYPE__BUILD_ID_CACHE. Fixes: 873a83731f1cc85c ("perf annotate: Skip DSOs not found") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240425005157.1104789-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26perf annotate: Fallback disassemble to objdump when capstone failsNamhyung Kim
I found some cases that capstone failed to disassemble. Probably my capstone is an old version but anyway there's a chance it can fail. And then it silently stopped in the middle. In my case, it didn't understand "RDPKRU" instruction. Let's check if the capstone disassemble reached the end of the function and fallback to objdump if not. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240425005157.1104789-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf annotate: Skip DSOs not foundNamhyung Kim
In some data file, I see the following messages repeated. It seems it doesn't have DSOs in the system and the dso->binary_type is set to DSO_BINARY_TYPE__NOT_FOUND. Let's skip them to avoid the followings. No output from objdump --start-address=0x0000000000000000 --stop-address=0x00000000000000d4 -d --no-show-raw-insn -C "$1" Error running objdump --start-address=0x0000000000000000 --stop-address=0x0000000000000631 -d --no-show-raw-insn -C "$1" ... Closes: https://lore.kernel.org/linux-perf-users/15e1a2847b8cebab4de57fc68e033086aa6980ce.camel@yandex.ru/ Reported-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240410185117.1987239-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf annotate: Add symbol name when using capstoneNamhyung Kim
This is to keep the existing behavior with objdump. It needs to show symbol information of global variables like below: Percent | Source code & Disassembly of elf for cycles:P (1 samples, percent: local period) ------------------------------------------------------------------------------------------------ : 0 0xffffffff81338f70 <vm_normal_page>: 0.00 : ffffffff81338f70: endbr64 0.00 : ffffffff81338f74: callq 0xffffffff81083a40 0.00 : ffffffff81338f79: movq %rdi, %r8 0.00 : ffffffff81338f7c: movq %rdx, %rdi 0.00 : ffffffff81338f7f: callq *0x17021c3(%rip) # ffffffff82a3b148 <pv_ops+0x1e8> 0.00 : ffffffff81338f85: movq 0xffbf3c(%rip), %rdx # ffffffff82334ec8 <physical_mask> 0.00 : ffffffff81338f8c: testq %rax, %rax ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 0.00 : ffffffff81338f8f: je 0xffffffff81338fd0 here 0.00 : ffffffff81338f91: movq %rax, %rcx 0.00 : ffffffff81338f94: andl $1, %ecx Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240329215812.537846-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf annotate: Use libcapstone to disassembleNamhyung Kim
Now it can use the capstone library to disassemble the instructions. Let's use that (if available) for perf annotate to speed up. Currently it only supports x86 architecture. With this change I can see ~3x speed up in data type profiling. But note that capstone cannot give the source file and line number info. For now, users should use the external objdump for that by specifying the --objdump option explicitly. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240329215812.537846-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf annotate: Split out util/disasm.cNamhyung Kim
The util/annotate.c code has both disassembly and sample annotation related codes. Factor out the disasm part so that it can be handled more easily. No functional changes intended. Committer notes: Add missing include env.h, util.h, bpf-event.h and bpf-util.h to disasm.c, to fix things like: util/disasm.c: In function ‘symbol__disassemble_bpf’: util/disasm.c:1203:9: error: implicit declaration of function ‘perf_exe’ [-Werror=implicit-function-declaration] 1203 | perf_exe(tpath, sizeof(tpath)); | ^~~~~~~~ Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240329215812.537846-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>