path: root/tools/perf/util
Age | Commit message | Author
2015-12-14 | perf tools: Use same signal handling strategy as 'record' | Arnaldo Carvalho de Melo
I.e. don't exit with the signal number; instead, set the signal handler to the default one and then raise the signal again.

Noticed while trying to dump the stack at segfaults in the forked process that 'perf test' uses to run each test; 'perf test' inspects the signal info for each test.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5x5r176wnoqxi5p6id05wv9w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-11 | irq_poll: make blk-iopoll available outside the block layer | Christoph Hellwig
The new name is irq_poll as iopoll is already taken. Better suggestions welcome.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
2015-12-11 | perf tools: Clear struct machine during machine__init() | Wang Nan
Many test cases use a stack-allocated 'struct machine', including:

  test__hists_link
  test__hists_filter
  test__mmap_thread_lookup
  test__thread_mg_share
  test__hists_output
  test__hists_cumulate

Also, in non-test code (for example, machine__new_host()) there is code that uses malloc() to allocate a struct machine.

These are dangerous operations that cause some tests to fail or hang in machines__exit(). For example, in

  machines__exit -> machine__destroy_kernel_maps -> map_groups__remove -> maps__remove -> pthread_rwlock_wrlock

an incorrectly initialized lock causes unintended behavior.

This patch zeroes the structure with memset() in machine__init() to ensure all fields in 'struct machine' are initialized to zero.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1449541544-67621-17-git-send-email-wangnan0@huawei.com
[ Use memset, see 'man bzero' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
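To illustrate why zeroing in the init function helps, here is a minimal sketch. The struct `demo_machine` and its fields are hypothetical stand-ins, not perf's actual 'struct machine'. Whatever garbage a stack or malloc() allocation contains, the init function leaves every field it does not set explicitly at zero.

```c
#include <string.h>

/* Hypothetical stand-in for a larger structure with many fields. */
struct demo_machine {
	int   pid;
	char *root_dir;
	void *kernel_maps;
};

/*
 * Zero the whole structure first, then fill in what is known.
 * Callers may pass stack or malloc()ed memory with arbitrary contents.
 */
int demo_machine__init(struct demo_machine *machine, int pid)
{
	memset(machine, 0, sizeof(*machine));
	machine->pid = pid;
	return 0;
}
```

The design choice is that the init function, not each caller, is responsible for a clean starting state, so adding a field later does not require touching every allocation site.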
2015-12-11 | perf data: Add u32_hex data type | Wang Nan
Add hexadecimal u32 to the base data types, which is useful for raw output because raw data is u32 aligned.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1449541544-67621-12-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
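As a rough illustration of why a hexadecimal u32 representation suits u32-aligned raw data, the sketch below formats a buffer of 32-bit words as hex. It is not the perf data conversion code; `demo_format_u32_hex` and its output layout are assumptions made only for this example.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format 'nr' 32-bit words as space-separated hex into 'buf'.
 * Returns the number of characters written (excluding the NUL);
 * stops early, leaving a valid string, if 'buf' is too small.
 */
size_t demo_format_u32_hex(const uint32_t *words, size_t nr,
			   char *buf, size_t size)
{
	size_t off = 0;
	size_t i;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < nr; i++) {
		int n = snprintf(buf + off, size - off, "%s0x%08" PRIx32,
				 i ? " " : "", words[i]);

		if (n < 0 || (size_t)n >= size - off) {
			buf[off] = '\0';	/* drop the truncated word */
			break;
		}
		off += (size_t)n;
	}
	return off;
}
```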
2015-12-10 | perf symbols: Fix dso__load_sym to put dso | Masami Hiramatsu
Fix dso__load_sym to put the dso, because dsos__add already got it.

The refcnt debugger explains the problem:
----
==== [0] ====
Unreclaimed dso: 0x19dd200
Refcount +1 => 1 at
  ./perf(dso__new+0x1ff) [0x4a62df]
  ./perf(dso__load_sym+0xe89) [0x503509]
  ./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
  ./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
  ./perf() [0x50539a]
  ./perf(convert_perf_probe_events+0xd79) [0x50ad39]
  ./perf() [0x45600f]
  ./perf(cmd_probe+0x6c) [0x4566bc]
  ./perf() [0x47abc5]
  ./perf(main+0x610) [0x421f90]
  /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
  ./perf() [0x4220a9]
Refcount +1 => 2 at
  ./perf(dso__get+0x34) [0x4a65f4]
  ./perf(map__new2+0x76) [0x4be216]
  ./perf(dso__load_sym+0xee1) [0x503561]
  ./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
  ./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
  ./perf() [0x50539a]
  ./perf(convert_perf_probe_events+0xd79) [0x50ad39]
  ./perf() [0x45600f]
  ./perf(cmd_probe+0x6c) [0x4566bc]
  ./perf() [0x47abc5]
  ./perf(main+0x610) [0x421f90]
  /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
  ./perf() [0x4220a9]
Refcount +1 => 3 at
  ./perf(dsos__add+0xf3) [0x4a6bc3]
  ./perf(dso__load_sym+0xfc1) [0x503641]
  ./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
  ./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
  ./perf() [0x50539a]
  ./perf(convert_perf_probe_events+0xd79) [0x50ad39]
  ./perf() [0x45600f]
  ./perf(cmd_probe+0x6c) [0x4566bc]
  ./perf() [0x47abc5]
  ./perf(main+0x610) [0x421f90]
  /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
  ./perf() [0x4220a9]
Refcount -1 => 2 at
  ./perf(dso__put+0x2f) [0x4a664f]
  ./perf(map_groups__exit+0xb9) [0x4bee29]
  ./perf(machine__delete+0xb0) [0x4b93d0]
  ./perf(exit_probe_symbol_maps+0x28) [0x506718]
  ./perf() [0x45628a]
  ./perf(cmd_probe+0x6c) [0x4566bc]
  ./perf() [0x47abc5]
  ./perf(main+0x610) [0x421f90]
  /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
  ./perf() [0x4220a9]
Refcount -1 => 1 at
  ./perf(dso__put+0x2f) [0x4a664f]
  ./perf(machine__delete+0xfe) [0x4b941e]
  ./perf(exit_probe_symbol_maps+0x28) [0x506718]
  ./perf() [0x45628a]
  ./perf(cmd_probe+0x6c) [0x4566bc]
  ./perf() [0x47abc5]
  ./perf(main+0x610) [0x421f90]
  /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
  ./perf() [0x4220a9]
----

So, in dso__load_sym, the dso is gotten 3 times: by dso__new, map__new2, and dsos__add. The last two references are released by map_groups and machine__delete, respectively. However, the first reference, taken by dso__new, is never released.

Committer note:

Changed the place where the reference count is dropped to:

Fix it by dropping it right after creating curr_map, since we know that either that operation failed and we need to drop the dso refcount, or that it succeeded and we have it referenced via curr_map->dso.

Then only drop the curr_map refcount after we call dsos__add() to make sure we hold a reference to it via curr_map->dso.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021118.10245.49869.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
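The get/put discipline behind this fix (and the other refcount fixes in this log) can be sketched with a toy object. Everything here is hypothetical (`demo_dso`, `demo_map`, `demo_load`, the `demo_live_objs` counter) and only mirrors the pattern, not perf's implementation: the creator owns the initial reference and must drop it once a longer-lived holder has taken its own.

```c
#include <stdlib.h>

int demo_live_objs;	/* objects allocated and not yet freed */

struct demo_dso {
	int refcnt;
};

/* Creation hands the caller the first reference. */
struct demo_dso *demo_dso__new(void)
{
	struct demo_dso *dso = calloc(1, sizeof(*dso));

	if (dso) {
		dso->refcnt = 1;
		demo_live_objs++;
	}
	return dso;
}

struct demo_dso *demo_dso__get(struct demo_dso *dso)
{
	if (dso)
		dso->refcnt++;
	return dso;
}

void demo_dso__put(struct demo_dso *dso)
{
	if (dso && --dso->refcnt == 0) {
		free(dso);
		demo_live_objs--;
	}
}

/* A holder that takes its own reference. */
struct demo_map {
	struct demo_dso *dso;
};

void demo_map__init(struct demo_map *map, struct demo_dso *dso)
{
	map->dso = demo_dso__get(dso);
}

void demo_map__exit(struct demo_map *map)
{
	demo_dso__put(map->dso);
	map->dso = NULL;
}

/* Creator: must drop its own reference once the holder has one. */
int demo_load(struct demo_map *map)
{
	struct demo_dso *dso = demo_dso__new();	/* refcnt == 1 */

	if (!dso)
		return -1;
	demo_map__init(map, dso);		/* refcnt == 2 */
	demo_dso__put(dso);			/* refcnt == 1, owned by map */
	return 0;
}
```

Without the `demo_dso__put()` in `demo_load()`, the object would still have a count of 1 after `demo_map__exit()` and would never be freed, which is the shape of the leak shown in the trace above.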
2015-12-10 | perf tools: Make perf_session__register_idle_thread drop the refcount | Masami Hiramatsu
Note that since the thread was already inserted into the session list, it will be released when the session is released. Also, in the perf_session__register_idle_thread() failure path, the thread should be put before returning.

The refcnt debugger shows that perf_session__register_idle_thread gets the returned thread, but the caller (__cmd_top) does not put the returned idle thread.
----
==== [0] ====
Unreclaimed thread@0x24e6240
Refcount +1 => 0 at
  ./perf(thread__new+0xe5) [0x4c8a75]
  ./perf(machine__findnew_thread+0x9a) [0x4bbdba]
  ./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
  ./perf(cmd_top+0xd7d) [0x43cf6d]
  ./perf() [0x47ba35]
  ./perf(main+0x617) [0x4225b7]
  /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
  ./perf() [0x42272d]
Refcount +1 => 1 at
  ./perf(thread__get+0x2c) [0x4c8bcc]
  ./perf(machine__findnew_thread+0xee) [0x4bbe0e]
  ./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
  ./perf(cmd_top+0xd7d) [0x43cf6d]
  ./perf() [0x47ba35]
  ./perf(main+0x617) [0x4225b7]
  /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
  ./perf() [0x42272d]
Refcount +1 => 2 at
  ./perf(thread__get+0x2c) [0x4c8bcc]
  ./perf(machine__findnew_thread+0x112) [0x4bbe32]
  ./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
  ./perf(cmd_top+0xd7d) [0x43cf6d]
  ./perf() [0x47ba35]
  ./perf(main+0x617) [0x4225b7]
  /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
  ./perf() [0x42272d]
----

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021122.10245.69707.stgit@localhost.localdomain
[ Drop the refcount in perf_session__register_idle_thread() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-10 | perf top: Delete half-processed hist entries when exit | Namhyung Kim
After sample processing is done, hist entries are in both hists->entries and hists->entries_in (or hists->entries_collapsed). So I guess perf report does not have leaks on hists.

But for perf top, it's possible to have half-processed entries which are only in hists->entries_in. Eventually they would go to hists->entries and get freed, but they cannot be deleted by the current hists__delete_entries(). This patch adds the hists__delete_all_entries() function to delete those entries.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-and-Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1449734015-9148-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-10 | perf tools: Get rid of exit_browser() from usage_with_options() | Namhyung Kim
Since all of its users call it before setup_browser(), there's no need to call exit_browser() inside the function.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449716459-23004-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-10 | perf thread_map: Free strlist on constructor error path | Namhyung Kim
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449716459-23004-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-09 | perf tools: Move cmd_version() to builtin-version.c | Josh Poimboeuf
Move cmd_version() to its own file so that help.c can be moved to a library.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/e908b1b68f20ab6d8d33941d5571c23110622e60.1449548395.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-09 | perf tools: Save cmdline arguments earlier | Josh Poimboeuf
perf_env__set_cmdline() only saves the arguments the first time it's called. It doesn't need to be called every time the options and suboptions are parsed. Instead it can just be called once.

This also has the advantage of making the option parsing code less perf-specific so it can be moved out to a library.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/19b76a5aa1b688bd635bd65d80bbc103a978d75e.1449548395.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-09 | perf tools: Move term functions out of util.c | Josh Poimboeuf
The term functions are needed by help.c which is going to be moved into a separate library. Move them out of util.c and into their own file.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/9a39c854dd156b55ebda57e427594c9a59dcb40f.1449548395.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-09 | perf tools: Remove unused pager_use_color variable | Josh Poimboeuf
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/e540c61b3068761181db6d9b1b3411990bafdb2f.1449548395.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-09 | perf tools: Fix write_numa_topology to put cpu_map instead of free | Masami Hiramatsu
Fix write_numa_topology to put the cpu_map instead of freeing it, because cpu_map is managed by refcount.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021135.10245.79046.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-09 | perf machine: Fix machine.vmlinux_maps to make sure to clear the old one | Masami Hiramatsu
Make sure the old machine.vmlinux_maps entries are cleared when they are renewed. Otherwise the previous maps in vmlinux_maps can leak, because they are simply overwritten.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021133.10245.93730.stgit@localhost.localdomain
[ Simplified the memset, same end result ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-09 | perf tools: Fix maps__fixup_overlappings to put used maps | Masami Hiramatsu
Since __map_groups__insert got the given map, we don't need to keep it. So put the maps.

The refcnt debugger shows that map_groups__fixup_overlappings() got a map twice but the group released it just once. This pattern usually indicates that the leak happens at the caller site.
----
==== [0] ====
Unreclaimed map@0x39d3ae0
Refcount +1 => 1 at
  ./perf(map_groups__fixup_overlappings+0x335) [0x4c1865]
  ./perf(thread__insert_map+0x30) [0x4c8e00]
  ./perf(machine__process_mmap2_event+0x106) [0x4bd876]
  ./perf() [0x4c378e]
  ./perf() [0x4c4393]
  ./perf(perf_session__process_events+0x38a) [0x4c654a]
  ./perf(cmd_record+0xe24) [0x42fc94]
  ./perf() [0x47b745]
  ./perf(main+0x617) [0x422547]
  /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2eca2deaf5]
  ./perf() [0x4226bd]
Refcount +1 => 2 at
  ./perf(map_groups__fixup_overlappings+0x3c5) [0x4c18f5]
  ./perf(thread__insert_map+0x30) [0x4c8e00]
  ./perf(machine__process_mmap2_event+0x106) [0x4bd876]
  ./perf() [0x4c378e]
  ./perf() [0x4c4393]
  ./perf(perf_session__process_events+0x38a) [0x4c654a]
  ./perf(cmd_record+0xe24) [0x42fc94]
  ./perf() [0x47b745]
  ./perf(main+0x617) [0x422547]
  /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2eca2deaf5]
  ./perf() [0x4226bd]
Refcount -1 => 1 at
  ./perf(map_groups__exit+0x92) [0x4c0962]
  ./perf(map_groups__put+0x60) [0x4c0bc0]
  ./perf(thread__put+0x90) [0x4c8a40]
  ./perf(machine__delete_threads+0x7e) [0x4bad9e]
  ./perf(perf_session__delete+0x4f) [0x4c499f]
  ./perf(cmd_record+0xb6d) [0x42f9dd]
  ./perf() [0x47b745]
  ./perf(main+0x617) [0x422547]
  /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2eca2deaf5]
  ./perf() [0x4226bd]
----

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021131.10245.41485.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-09 | perf hists: Fix hists_evsel to release hists | Masami Hiramatsu
Since hists__init doesn't set the destructor of hists_evsel (which is an extended evsel structure), when a hists_evsel is released, the extended part of the hists_evsel is not deleted (note that the hists_evsel object itself is freed).

Fix it by adding a destructor for hists_evsel and setting it up.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021129.10245.28710.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-09 | perf tools: Fix map_groups__clone to put cloned map | Masami Hiramatsu
Fix map_groups__clone to put the cloned map after inserting it into the map_groups.

The refcnt debugger shows:
----
==== [0] ====
Unreclaimed map: 0x2a27ee0
Refcount +1 => 1 at
  ./perf(map_groups__clone+0x8d) [0x4bb7ed]
  ./perf(thread__fork+0xbe) [0x4c1f9e]
  ./perf(machine__process_fork_event+0x216) [0x4b79a6]
  ./perf(perf_event__synthesize_threads+0x38b) [0x48135b]
  ./perf(cmd_top+0xdc6) [0x43cb76]
  ./perf() [0x477223]
  ./perf(main+0x617) [0x422077]
  /lib64/libc.so.6(__libc_start_main+0xf0) [0x7ff806af8fe0]
  ./perf() [0x4221ed]
Refcount +1 => 2 at
  ./perf(map_groups__clone+0x128) [0x4bb888]
  ./perf(thread__fork+0xbe) [0x4c1f9e]
  ./perf(machine__process_fork_event+0x216) [0x4b79a6]
  ./perf(perf_event__synthesize_threads+0x38b) [0x48135b]
  ./perf(cmd_top+0xdc6) [0x43cb76]
  ./perf() [0x477223]
  ./perf(main+0x617) [0x422077]
  /lib64/libc.so.6(__libc_start_main+0xf0) [0x7ff806af8fe0]
  ./perf() [0x4221ed]
Refcount -1 => 1 at
  ./perf(map_groups__exit+0x87) [0x4ba757]
  ./perf(map_groups__put+0x68) [0x4ba9a8]
  ./perf(thread__put+0x8b) [0x4c1aeb]
  ./perf(machine__delete_threads+0x81) [0x4b48f1]
  ./perf(perf_session__delete+0x4f) [0x4be63f]
  ./perf(cmd_top+0x1094) [0x43ce44]
  ./perf() [0x477223]
  ./perf(main+0x617) [0x422077]
  /lib64/libc.so.6(__libc_start_main+0xf0) [0x7ff806af8fe0]
  ./perf() [0x4221ed]
----

This shows that map_groups__clone gets the map twice, while it is put only once, in map_groups__exit.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021120.10245.95388.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07 | perf annotate: ARM support | Russell King
Add basic support to parse ARM assembly. This:

* enables perf to correctly show the disassembly, rather than chopping some constants off at the '#' (which is not a comment character on ARM).

* allows perf to identify ARM instructions that branch to other parts within the same function, thereby properly annotating them.

* allows perf to identify function calls, allowing called functions to be followed in the annotated view.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/n/tip-owp1uj0nmcgfrlppfyeetuyf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
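The first bullet point can be pictured with a small sketch: if comment stripping is driven by a per-architecture comment character, '#' immediates on ARM survive. The helper `demo_strip_comment` is hypothetical, and using ';' as the ARM comment character in the disassembly is an assumption of this example, not something stated in the commit message.

```c
#include <string.h>

/*
 * Cut 'line' at the first occurrence of 'comment_char' and trim
 * trailing blanks. The line is modified in place.
 */
void demo_strip_comment(char *line, char comment_char)
{
	char *c = strchr(line, comment_char);
	size_t len;

	if (c)
		*c = '\0';

	len = strlen(line);
	while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t'))
		line[--len] = '\0';
}
```

Calling it with '#' on an ARM line such as `mov r0, #1` would chop the immediate off, which is the misbehaviour the commit describes; calling it with the assumed ';' keeps the operand intact.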
2015-12-07 | perf evlist: Factor perf_evlist__(enable|disable) functions | Jiri Olsa
Use the perf_evsel__(enable|disable) functions in the perf_evlist__(enable|disable) functions in order to centralize the ioctl enable/disable calls. This way we eliminate 2 places that call ioctl directly.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449133606-14429-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07 | perf evsel: Introduce disable() method | Jiri Olsa
Add the perf_evsel__disable function as a complement to the perf_evsel__enable function. Both will be used in the following patch to factor perf_evlist__(enable|disable).

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449133606-14429-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07 | perf evsel: Use event maps directly in perf_evsel__enable | Jiri Olsa
All events now share proper cpu and thread maps. There's no need to pass those maps from the evlist; it's safe to use the evsel maps for enabling an event.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449133606-14429-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07 | perf machine: Pass correct string to dso__adjust_kmod_long_name | Wang Nan
There's a mistake in dso__adjust_kmod_long_name(): it uses strdup() to duplicate the new long_name of a dso, but passes the original string to dso__set_long_name(), which causes random crashes during cleanup.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Fixes: c03d5184f0e9 ("perf machine: Adjust dso->long_name for offline module")
Link: http://lkml.kernel.org/r/1449455785-42020-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
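The class of bug described here can be sketched with a toy structure. The names `demo_kmod_dso`, `demo_set_long_name` and `demo_adjust_long_name` are hypothetical and only mimic the ownership rule: if a name is flagged as allocated, cleanup will free() it, so the pointer stored must be the strdup() result and not the caller's string.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct demo_kmod_dso {
	const char *long_name;
	bool	    long_name_allocated;	/* true: free() on cleanup */
};

void demo_set_long_name(struct demo_kmod_dso *dso, const char *name,
			bool allocated)
{
	if (dso->long_name_allocated)
		free((char *)dso->long_name);
	dso->long_name = name;
	dso->long_name_allocated = allocated;
}

int demo_adjust_long_name(struct demo_kmod_dso *dso, const char *filename)
{
	char *dup_filename = strdup(filename);

	if (!dup_filename)
		return -1;

	/*
	 * Pass the duplicate, not 'filename': the dso now owns the
	 * string and will free() it later. Passing 'filename' here
	 * would leak the copy and free memory the dso does not own.
	 */
	demo_set_long_name(dso, dup_filename, true);
	return 0;
}

void demo_kmod_dso__exit(struct demo_kmod_dso *dso)
{
	demo_set_long_name(dso, NULL, false);
}
```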
2015-11-27 | perf bpf: Rename bpf config to program config | Wang Nan
The following patches are going to introduce BPF object level configuration to enable setting values into BPF maps. To avoid confusion, this patch renames the existing 'config' in bpf-loader.c to 'program config'. The following patches will introduce 'object config'.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1448614067-197576-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-27 | perf hists: Do not skip elided fields when processing samples | Namhyung Kim
If the user gives a filter, perf marks the corresponding column as elided and omits it from the output. But it should still process and aggregate samples using the field; otherwise samples are aggregated as if the column was not there, resulting in incorrect output.

For example, I'd like to set a filter on native_write_msr_safe. The original overhead of the function is negligible.

  $ perf report | grep native_write_msr_safe
  0.00% swapper [kernel.vmlinux] native_write_msr_safe
  0.00% perf [kernel.vmlinux] native_write_msr_safe

However, adding the -S option gives different output.

  $ perf report -S native_write_msr_safe --percentage absolute | \
  > grep -e swapper -e perf
  51.47% swapper [kernel.vmlinux]
  4.14% perf [kernel.vmlinux]

This is because it aggregated samples using comm and dso only. In fact, the above values are the same when it sorts with -s comm,dso.

  $ perf report -s comm,dso | grep -e swapper -e perf
  51.47% swapper [kernel.vmlinux]
  4.14% perf [kernel.vmlinux]

This resulted in a TUI failure with -ERANGE, since it tries to increase the sample hit count for annotation with wrong symbols due to the incorrect aggregation.

This patch fixes it by not skipping elided fields when comparing samples in order to insert them into the hists.

Committer note:

After the patch, with a different workload:

  # perf report --show-total-period -S native_write_msr_safe --stdio
  #
  # symbol: native_write_msr_safe
  #
  # Samples: 455 of event 'cycles:pp'
  # Event count (approx.): 134787489
  #
  # Overhead Period Command Shared Object
  # ........ ...... ............... ................
  #
  0.22% 293081 qemu-system-x86 [vmlinux]
  0.19% 255914 swapper [vmlinux]
  0.00% 2054 Timer [vmlinux]
  0.00% 1021 firefox [vmlinux]
  0.00% 2 perf [vmlinux]
  # perf report --show-total-period | grep native_write_msr_safe
  Failed to open /tmp/perf-14838.map, continuing without symbols
  0.22% 293081 qemu-system-x86 [vmlinux] [k] native_write_msr_safe
  0.19% 255914 swapper [vmlinux] [k] native_write_msr_safe
  0.00% 2054 Timer [vmlinux] [k] native_write_msr_safe
  0.00% 1021 firefox [vmlinux] [k] native_write_msr_safe
  0.00% 2 perf [vmlinux] [k] native_write_msr_safe
  #

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1448645559-31167-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-27 | perf list: Robustify event printing routine | Arnaldo Carvalho de Melo
When a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") added PERF_COUNT_SW_BPF_OUTPUT we ended up with a new entry in the event_symbols_sw array that wasn't initialized, thus set to NULL. Fix print_symbol_events() to check for that case so that we don't crash if this happens again.

  (gdb) bt
  #0 __match_glob (ignore_space=false, pat=<optimized out>, str=<optimized out>) at util/string.c:198
  #1 strglobmatch (str=<optimized out>, pat=pat@entry=0x7fffffffe61d "stall") at util/string.c:252
  #2 0x00000000004993a5 in print_symbol_events (type=1, syms=0x872880 <event_symbols_sw+160>, max=11, name_only=false, event_glob=0x7fffffffe61d "stall") at util/parse-events.c:1615
  #3 print_events (event_glob=event_glob@entry=0x7fffffffe61d "stall", name_only=false) at util/parse-events.c:1675
  #4 0x000000000042c79e in cmd_list (argc=1, argv=0x7fffffffe390, prefix=<optimized out>) at builtin-list.c:68
  #5 0x00000000004788a5 in run_builtin (p=p@entry=0x871758 <commands+120>, argc=argc@entry=2, argv=argv@entry=0x7fffffffe390) at perf.c:370
  #6 0x0000000000420ab0 in handle_internal_command (argv=0x7fffffffe390, argc=2) at perf.c:429
  #7 run_argv (argv=0x7fffffffe110, argcp=0x7fffffffe11c) at perf.c:473
  #8 main (argc=2, argv=0x7fffffffe390) at perf.c:588
  (gdb) p event_symbols_sw[PERF_COUNT_SW_BPF_OUTPUT]
  $4 = {symbol = 0x0, alias = 0x0}
  (gdb)

A patch to robustify perf to not segfault when the next counter gets added in the kernel will follow this one.

Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-57wysblcjfrseb0zg5u7ek10@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
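The defensive check described above can be sketched as follows. This is not the code of print_symbol_events(): the structure and function names are hypothetical, and a plain strstr() stands in for the glob matching that perf uses. The point is only that a table slot whose symbol is NULL (an enum value added without a matching table entry) is skipped before any string function touches it.

```c
#include <stddef.h>
#include <string.h>

struct demo_event_symbol {
	const char *symbol;
	const char *alias;
};

/*
 * Count the entries that would be printed for 'filter' (NULL means
 * "all"). Entries with a NULL symbol are uninitialized slots and are
 * skipped instead of being handed to the matcher.
 */
size_t demo_count_matching(const struct demo_event_symbol *syms,
			   size_t max, const char *filter)
{
	size_t i, nr = 0;

	for (i = 0; i < max; i++) {
		if (syms[i].symbol == NULL)
			continue;	/* uninitialized slot */
		if (filter != NULL && strstr(syms[i].symbol, filter) == NULL)
			continue;
		nr++;
	}
	return nr;
}
```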
2015-11-27 | perf list: Add support for PERF_COUNT_SW_BPF_OUT | Arnaldo Carvalho de Melo
When PERF_COUNT_SW_BPF_OUTPUT was added to the kernel we should've added it to tools/perf, where it is used just to list events.

This ended up causing a segfault in commands like "perf list stall".

Fix it by adding that new software counter.

A patch to robustify perf to not segfault when the next counter gets added in the kernel will follow this one.

Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-uya354upi3eprsey6mi5962d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-27 | perf buildid-list: Show running kernel build id fix | Michael Petlan
The --kernel option of the perf buildid-list tool should show the running kernel's buildid. The functionality was lost during other changes to the related code.

The build_id__sprintf() function should return the length of the build-id string, but it returned the length of the build-id raw data instead. Because of that, a return value check prevented the final string from being printed.

With this patch build_id__sprintf() returns the correct value, so the --kernel option works again.

Before:

  # perf buildid-list --kernel
  #

After:

  # perf buildid-list --kernel
  972c1edab5bdc06cc224af45d510af662a3c6972
  #

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
LPU-Reference: 1448632089.24573.114.camel@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
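The difference between the two lengths is easy to see in a sketch: each raw byte becomes two hex characters, so the string is twice as long as the raw data. The function below is a hypothetical stand-in (`demo_build_id__sprintf`), not the perf implementation, and it assumes the return value is the string length without the terminating NUL.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Write 'len' raw build-id bytes as lowercase hex into 'bf', which
 * must have room for 2 * len + 1 characters. Returns the length of
 * the resulting string (2 * len), not the raw length.
 */
int demo_build_id__sprintf(const uint8_t *build_id, int len, char *bf)
{
	char *bid = bf;
	int i;

	for (i = 0; i < len; i++) {
		sprintf(bid, "%02x", (unsigned int)build_id[i]);
		bid += 2;
	}
	*bid = '\0';	/* also covers len == 0 */

	return (int)(bid - bf);
}
```

A caller that compares the return value against the expected string length would reject a correct string if the function returned `len` instead.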
2015-11-26 | perf evlist: Display WEIGHT sample type bit | Jiri Olsa
Add the WEIGHT bit_name call to display sample_type properly.

  $ perf evlist -v
  cpu/mem-loads/pp: ...SNIP... sample_type: IP|TID|TIME|ADDR|ID|CPU|DATA_SRC|WEIGHT ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1448465815-27404-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
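A table-driven bit-name printer of the kind this commit extends might look like the sketch below. The enum values, the `DEMO_BIT_NAME` macro and the function name are all hypothetical; they do not reproduce perf's tables or the kernel's sample type values. The relevant property is that a bit without a table entry is silently omitted from the output, which is why a missing entry goes unnoticed until someone looks for it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit values, for illustration only. */
enum {
	DEMO_SAMPLE_IP       = 1u << 0,
	DEMO_SAMPLE_TID      = 1u << 1,
	DEMO_SAMPLE_DATA_SRC = 1u << 2,
	DEMO_SAMPLE_WEIGHT   = 1u << 3,
};

struct demo_bit_name {
	uint64_t    bit;
	const char *name;
};

#define DEMO_BIT_NAME(n) { DEMO_SAMPLE_##n, #n }

static const struct demo_bit_name demo_bits[] = {
	DEMO_BIT_NAME(IP), DEMO_BIT_NAME(TID),
	DEMO_BIT_NAME(DATA_SRC), DEMO_BIT_NAME(WEIGHT),
	{ 0, NULL },
};

/* Render the set bits of 'value' as "A|B|C" into 'buf'. */
void demo_sample_type__snprintf(char *buf, size_t size, uint64_t value)
{
	const struct demo_bit_name *b;
	size_t off = 0;
	int first = 1;

	if (size == 0)
		return;
	buf[0] = '\0';

	for (b = demo_bits; b->name; b++) {
		int n;

		if (!(value & b->bit))
			continue;
		n = snprintf(buf + off, size - off, "%s%s",
			     first ? "" : "|", b->name);
		if (n < 0 || (size_t)n >= size - off) {
			buf[off] = '\0';	/* drop the truncated name */
			break;
		}
		off += (size_t)n;
		first = 0;
	}
}
```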
2015-11-26 | perf symbols: Add the path to vmlinux.debug | Ekaterina Tumanova
Currently, when debuginfo is separated into vmlinux.debug, its contents are ignored. Let's change that and add it to the vmlinux_path list.

Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Acked-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1448469166-61363-3-git-send-email-tumanova@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-26 | perf symbols: Refactor vmlinux_path__init() to ease path additions | Ekaterina Tumanova
Refactor vmlinux_path__init() to ease subsequent additions of new vmlinux locations.

Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Acked-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1448469166-61363-2-git-send-email-tumanova@linux.vnet.ibm.com
[ Rename vmlinux_path__update() to vmlinux_path__add() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
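One way to picture such a refactoring is a small "add" helper plus a table of default locations, so that a new location becomes a one-line change. This is only a sketch under assumptions: `demo_path_list`, its fixed capacity, and the two example paths are hypothetical and not taken from the perf source.

```c
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_PATHS 8

struct demo_path_list {
	char *entries[DEMO_MAX_PATHS];
	int   nr;
};

/* Append a private copy of 'new_entry'; returns 0 or -1 on failure. */
int demo_path_list__add(struct demo_path_list *list, const char *new_entry)
{
	size_t len = strlen(new_entry) + 1;
	char *copy;

	if (list->nr >= DEMO_MAX_PATHS)
		return -1;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, new_entry, len);
	list->entries[list->nr++] = copy;
	return 0;
}

void demo_path_list__exit(struct demo_path_list *list)
{
	while (list->nr > 0)
		free(list->entries[--list->nr]);
}

/* Adding a location later means adding one line to this table. */
int demo_path_list__init(struct demo_path_list *list)
{
	static const char *const defaults[] = {
		"vmlinux",		/* illustrative entries only */
		"/boot/vmlinux",
	};
	size_t i;

	list->nr = 0;
	for (i = 0; i < sizeof(defaults) / sizeof(defaults[0]); i++) {
		if (demo_path_list__add(list, defaults[i]) < 0) {
			demo_path_list__exit(list);	/* undo partial init */
			return -1;
		}
	}
	return 0;
}
```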
2015-11-26 | perf tools: Correctly identify anon_hugepage when generating map (v2) | Yannick Brosseau
When parsing /proc/xxx/maps, the sscanf in perf_event__synthesize_mmap_events truncates the map name at the space in "/anon_hugepage (deleted)". is_anon_memory() then only receives the string "/anon_hugepage" and does not detect it.

We change is_anon_memory() to only compare the first part of the string, effectively ignoring whether " (deleted)" is there.

Signed-off-by: Yannick Brosseau <scientist@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Joshua Zhu <zhu.wen-jie@hp.com>
Cc: kernel-team@fb.com
Link: http://lkml.kernel.org/r/1448538152-2898-1-git-send-email-scientist@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
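Comparing only the leading part of the name can be sketched with strncmp(). The function below is a hypothetical simplification (`demo_is_anon_memory`) that checks just two names; it is not the full perf helper. It accepts both the truncated name and the full "(deleted)" form.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Treat a mapping as anonymous if it is named "//anon" or if its name
 * starts with "/anon_hugepage", with or without a " (deleted)" suffix.
 */
bool demo_is_anon_memory(const char *filename)
{
	static const char hugepage[] = "/anon_hugepage";

	return strcmp(filename, "//anon") == 0 ||
	       strncmp(filename, hugepage, sizeof(hugepage) - 1) == 0;
}
```

The trade-off of a prefix match is that any other name beginning with the same prefix is also classified as anonymous.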
2015-11-26perf machine: Adjust dso->long_name for offline moduleWang Nan
Something unexpected may happen if copy statically linked perf to a production environment: # ./perf probe -m ./mymodule.ko my_func [mymodule] with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko Error: Failed to add events. # ./perf buildid-cache -a ./mymodule.ko # ./perf probe -m ./mymodule.ko my_func Added new event: probe:my_func (on my_func in /home/wangnan/kmodule/mymodule.ko) You can now use it in all perf tools, such as: perf record -e probe:my_func -aR sleep 1 Where: # ldd ./perf not a dynamic executable # strace -e open ./perf probe -m ./mymodule.ko my_func ... open("/home/wangnan/kmodule/mymodule.ko", O_RDONLY) = 3 open("/home/wangnan/kmodule/../lib64/elfutils/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) ... open("/lib64/tls/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/lib64/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/usr/lib64/tls/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/usr/lib64/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("[mymodule]", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/wangnan/.debug/.build-id/32/6ab42550ef3d24944f53c817533728367effeb", O_RDONLY) = -1 ENOENT (No such file or directory) open("[mymodule]", O_RDONLY) = -1 ENOENT (No such file or directory) In the above example, probe fails before we put the module into buildid-cache. However, user would expect it success in both case because perf is able to find probe points actually. The reason is because perf won't utilize module's full path if it failed to open debuginfo. 
In: convert_to_probe_trace_events -> find_probe_trace_events_from_map -> get_target_map -> kernel_get_module_map -> machine__findnew_module_map -> map_groups__find_by_name map_groups__find_by_name() is able to find the map of that module, but this information comes from /proc/modules, before perf knows the real path of the offline module. Therefore, map->dso->long_name is set to something like '[mymodule]', which prevents dso__load() from finding the real path of the module file. On the other hand, if dso__load() can get the offline module through the buildid cache, it can read the symbol table from that ko. Even if debuginfo is not available, 'perf probe' can succeed if the '.symtab' can be found. This patch improves machine__findnew_module_map(): when dso->long_name starts with '[' (the module path was not found when parsing /proc/modules), it fixes the name with dso__set_long_name(), so that the following dso__load() can find the symbol table. This patch won't interfere with buildid matching. Here is the test result: # ./perf probe -m ./mymodule.ko my_func Added new event: probe:my_func (on my_func in /home/wangnan/kmodule/mymodule.ko) You can now use it in all perf tools, such as: perf record -e probe:my_func -aR sleep 1 # ./perf probe -d '*' Removed event: probe:my_func # mv ./mymodule.{ko,.bak} # mv ./moduleb.ko mymodule.ko # ./perf probe -m ./mymodule.ko my_func /home/wangnan/kmodule/mymodule.ko with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko Error: Failed to add events. # ./perf probe -v -m ./mymodule.ko my_func probe-definition(0): my_func symbol:my_func file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Could not open debuginfo. Try to use symbols. symsrc__init: build id mismatch for /home/wangnan/kmodule/mymodule.ko. 
/home/wangnan/kmodule/mymodule.ko with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko Error: Failed to add events. Reason: No such file or directory (Code: -2) Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1448510397-187965-1-git-send-email-wangnan0@huawei.com [ Renamed adjust_dso_long_name() to dso__adjust_kmod_long_name() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
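The long_name adjustment this commit describes can be sketched roughly as below. It is an assumption-laden toy, not the perf implementation: struct toy_dso and toy_adjust_long_name() are hypothetical, and the real code goes through dso__set_long_name() rather than replacing the string by hand.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, stripped-down stand-in for perf's struct dso. */
struct toy_dso {
	char *long_name; /* heap-allocated, owned by the dso */
};

/*
 * If the long name is still a '[module]' placeholder (no path was found
 * while parsing /proc/modules), replace it with the real module path.
 * Returns 1 if the name was replaced, 0 if left alone, -1 on allocation
 * failure.
 */
static int toy_adjust_long_name(struct toy_dso *dso, const char *path)
{
	size_t len;
	char *copy;

	assert(dso != NULL && dso->long_name != NULL && path != NULL);
	if (dso->long_name[0] != '[')
		return 0;

	len = strlen(path) + 1;
	copy = malloc(len);
	if (copy == NULL)
		return -1;
	memcpy(copy, path, len);

	free(dso->long_name);
	dso->long_name = copy;
	return 1;
}
```

A name that already holds a path is left untouched, which is one way to avoid disturbing a long name that was resolved earlier.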
2015-11-26perf top: Fix freeze on --call-graph flat/foldedNamhyung Kim
The callchain rbtree is rebuilt periodically, so its root needs to be reinitialized every time. Otherwise the rbtree insertion can get stuck on stale pointers. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1448521700-32062-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-26perf callchain: Honor hide_unresolvedNamhyung Kim
If user requested to hide unresolved entries, skip unresolved callchains as well as hist entries. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1448521700-32062-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-25perf probe: Fix to free temporal Dwarf_Frame correctlyMasami Hiramatsu
The commit 05c8d802fa52 ("perf probe: Fix to free temporal Dwarf_Frame") tried to fix the memory leak of Dwarf_Frame, but it released the frame at the wrong point. Since dwarf_frame_cfa(frame, &pf->fb_ops, &nops) can return an address inside the frame data structure to pf->fb_ops, we cannot release the frame before using pf->fb_ops. This reverts that commit and correctly releases the frame afterwards (right before returning from call_probe_finder). Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Reported-by: Michael Petlan <mpetlan@redhat.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 05c8d802fa52 ("perf probe: Fix to free temporal Dwarf_Frame") LPU-Reference: 20151125103432.1473.31009.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-23perf callchain: Add missing parent_val initializationJiri Olsa
Add the missing parent_val initialization for callchain_node. Its absence causes a segfault in perf top: $ sudo perf top -g perf: Segmentation fault -------- backtrace -------- free_callchain_node(+0x29) in perf [0x4a4b3e] free_callchain(+0x29) in perf [0x4a5a83] hist_entry__delete(+0x126) in perf [0x4c6649] hists__delete_entry(+0x6e) in perf [0x4c66dc] hists__decay_entries(+0x7d) in perf [0x4c6776] perf_top__sort_new_samples(+0x7c) in perf [0x436a78] hist_browser__run(+0xf2) in perf [0x507760] perf_evsel__hists_browse(+0x1da) in perf [0x507c8d] perf_evlist__tui_browse_hists(+0x3e) in perf [0x5088cf] display_thread_tui(+0x7f) in perf [0x437953] start_thread(+0xc5) in libpthread-2.21.so [0x7f7068fbb555] __clone(+0x6d) in libc-2.21.so [0x7f7066fc3b9d] [0x0] Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 4b3a3212233a ("perf hists browser: Support flat callchains") Link: http://lkml.kernel.org/r/20151121102355.GA17313@krava.local Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-23perf callchain: Add order support for libdw DWARF unwinderJiri Olsa
As reported by Milian, currently for DWARF unwind (both libdw and libunwind) we display callchain in callee order only. Adding the support to follow callchain order setup to libdw DWARF unwinder, so we could get following output for report: $ perf record --call-graph dwarf ls ... $ perf report --no-children --stdio 21.12% ls libc-2.21.so [.] __strcoll_l | ---__strcoll_l mpsort_with_tmp mpsort_with_tmp mpsort_with_tmp sort_files main __libc_start_main _start $ perf report --stdio --no-children -g caller 21.12% ls libc-2.21.so [.] __strcoll_l | ---_start __libc_start_main main sort_files mpsort_with_tmp mpsort_with_tmp mpsort_with_tmp __strcoll_l Reported-and-Tested-by: Milian Wolff <milian.wolff@kdab.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Wang Nan <wangnan0@huawei.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jan Kratochvil <jkratoch@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20151119130119.GA26617@krava.brq.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-23perf callchain: Add order support for libunwind DWARF unwinderJiri Olsa
As reported by Milian, currently for DWARF unwind (both libdw and libunwind) we display callchain in callee order only. Adding the support to follow callchain order setup to libunwind DWARF unwinder, so we could get following output for report: $ perf record --call-graph dwarf ls ... $ perf report --no-children --stdio 39.26% ls libc-2.21.so [.] __strcoll_l | ---__strcoll_l mpsort_with_tmp mpsort_with_tmp sort_files main __libc_start_main _start 0 $ perf report -g caller --no-children --stdio ... 39.26% ls libc-2.21.so [.] __strcoll_l | ---0 _start __libc_start_main main sort_files mpsort_with_tmp mpsort_with_tmp __strcoll_l Based-on-patch-by: Milian Wolff <milian.wolff@kdab.com> Reported-and-Tested-by: Milian Wolff <milian.wolff@kdab.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Wang Nan <wangnan0@huawei.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20151118075247.GA5416@krava.brq.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
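The caller/callee ordering added for the DWARF unwinders can be illustrated with a small sketch. It assumes the unwinder yields frames innermost first (callee order), as in the default report output above; enum toy_order and toy_order_entries() are hypothetical names, not perf APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the callchain order setting. */
enum toy_order {
	TOY_ORDER_CALLEE, /* innermost frame first, as unwound */
	TOY_ORDER_CALLER  /* outermost frame first */
};

/*
 * Copy 'nr' unwound entries from 'in' to 'out' in the requested order.
 * 'in' is assumed to be in callee order; caller order simply walks it
 * backwards.
 */
static void toy_order_entries(const char *const *in, size_t nr,
			      enum toy_order order, const char **out)
{
	size_t i;

	assert((in != NULL && out != NULL) || nr == 0);
	for (i = 0; i < nr; i++) {
		size_t src = (order == TOY_ORDER_CALLER) ? nr - 1 - i : i;

		out[i] = in[src];
	}
}
```

With the frames from the example above, TOY_ORDER_CALLER would start at _start and end at __strcoll_l, matching the '-g caller' output.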
2015-11-23perf callchain: Move initial entry call into get_entries functionJiri Olsa
Move the initial entry call into the get_entries function so that all entry processing is in one place. This will be useful for the next change, which adds ordering logic. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Milian Wolff <milian.wolff@kdab.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1447772739-18471-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-19perf hists browser: Support flat callchainsNamhyung Kim
The flat callchain mode is to print all chains in a single, simple hierarchy so make it easy to see. Currently perf report --tui doesn't show flat callchains properly. With flat callchains, only leaf nodes are added to the final rbtree so it should show entries in parent nodes. To do that, add parent_val list to struct callchain_node and show them along with the (normal) val list. For example, consider following callchains with '-g graph'. $ perf report -g graph - 39.93% swapper [kernel.vmlinux] [k] intel_idle intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle - cpu_startup_entry 28.63% start_secondary - 11.30% rest_init start_kernel x86_64_start_reservations x86_64_start_kernel Before: $ perf report -g flat - 39.93% swapper [kernel.vmlinux] [k] intel_idle 28.63% start_secondary - 11.30% rest_init start_kernel x86_64_start_reservations x86_64_start_kernel After: $ perf report -g flat - 39.93% swapper [kernel.vmlinux] [k] intel_idle - 28.63% intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry start_secondary - 11.30% intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry start_kernel x86_64_start_reservations x86_64_start_kernel Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1447047946-1691-8-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-19perf report: Add callchain value optionNamhyung Kim
The -g/--call-graph option now accepts a setting for how callchain values are displayed. Possible values are 'percent', 'period' and 'count'. 'percent' is the same as before and is the default behavior. 'period' displays the raw period value rather than the percentage. 'count' displays the number of occurrences. $ perf report --no-children --stdio -g percent ... 39.93% swapper [kernel.vmlinux] [k] intel_idel | ---intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry | |--28.63%-- start_secondary | --11.30%-- rest_init $ perf report --no-children --show-total-period --stdio -g period ... 39.93% 13018705 swapper [kernel.vmlinux] [k] intel_idel | ---intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry | |--9334403-- start_secondary | --3684302-- rest_init $ perf report --no-children --show-nr-samples --stdio -g count ... 39.93% 80 swapper [kernel.vmlinux] [k] intel_idel | ---intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry | |--57-- start_secondary | --23-- rest_init Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1447047946-1691-6-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
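The three display modes could be sketched as a single formatting helper. This is only an illustration of the idea: enum toy_value_mode and toy_value_snprintf() are hypothetical, and the way perf actually computes and prints these values may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical mirror of the 'percent', 'period' and 'count' settings. */
enum toy_value_mode {
	TOY_VALUE_PERCENT,
	TOY_VALUE_PERIOD,
	TOY_VALUE_COUNT
};

/*
 * Format one callchain value according to 'mode'. 'period' is the raw
 * period of the chain, 'total' the period it is measured against, and
 * 'count' the number of occurrences. Returns what snprintf() returns.
 */
static int toy_value_snprintf(char *buf, size_t size, enum toy_value_mode mode,
			      unsigned long long period,
			      unsigned long long total, unsigned int count)
{
	double percent;

	assert(buf != NULL && size > 0);
	switch (mode) {
	case TOY_VALUE_PERIOD:
		return snprintf(buf, size, "%llu", period);
	case TOY_VALUE_COUNT:
		return snprintf(buf, size, "%u", count);
	case TOY_VALUE_PERCENT:
	default:
		percent = total ? (double)period * 100.0 / (double)total : 0.0;
		return snprintf(buf, size, "%.2f%%", percent);
	}
}
```

Keeping the mode switch in one helper means each output path only has to ask for a formatted string, which fits the abstraction of the print function described in the related commit.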
2015-11-19perf callchain: Add count fields to struct callchain_nodeNamhyung Kim
These fields track the number of occurrences of each callchain. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Brendan Gregg <brendan.d.gregg@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1447047946-1691-5-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-19perf callchain: Abstract callchain print functionNamhyung Kim
This is preparation for printing other types of callchain value, such as count or period. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1447047946-1691-4-git-send-email-namhyung@kernel.org [ renamed new _sprintf_ operation to _scnprintf_ ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-19perf report: Support folded callchain mode on --stdioNamhyung Kim
Add new call chain option (-g) 'folded' to print callchains in a line. The callchains are separated by semicolons, and preceded by (absolute) percent values and a space. For example, the following 20 lines can be printed in 3 lines with the folded output mode: $ perf report -g flat --no-children | grep -v ^# | head -20 60.48% swapper [kernel.vmlinux] [k] intel_idle 54.60% intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry start_secondary 5.88% intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry rest_init start_kernel x86_64_start_reservations x86_64_start_kernel $ perf report -g folded --no-children | grep -v ^# | head -3 60.48% swapper [kernel.vmlinux] [k] intel_idle 54.60% intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary 5.88% intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel This mode is supported only for --stdio now and intended to be used by some scripts like in FlameGraphs[1]. Support for other UI might be added later. [1] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html Requested-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1447047946-1691-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
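The folded line format (a percent value, a space, then symbols joined by semicolons) can be sketched as below. toy_fold_callchain() is a hypothetical helper written for illustration and is not how perf builds the line internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write "<percent>% sym0;sym1;...;symN" into 'buf'.
 * Returns the number of characters written (excluding the trailing NUL),
 * or -1 if the line does not fit in 'size' bytes.
 */
static int toy_fold_callchain(char *buf, size_t size, double percent,
			      const char *const *syms, size_t nr)
{
	size_t i;
	int len, ret;

	assert(buf != NULL && size > 0);
	assert(syms != NULL || nr == 0);

	len = snprintf(buf, size, "%.2f%% ", percent);
	if (len < 0 || (size_t)len >= size)
		return -1;

	for (i = 0; i < nr; i++) {
		/* A semicolon separates entries; none before the first. */
		ret = snprintf(buf + len, size - (size_t)len, "%s%s",
			       i ? ";" : "", syms[i]);
		if (ret < 0 || (size_t)ret >= size - (size_t)len)
			return -1;
		len += ret;
	}
	return len;
}
```

One line per chain keeps the output easy to post-process with line-oriented scripts, which is the use case the commit message names.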
2015-11-19perf machine: Fix machine__findnew_module_map to put dsoMasami Hiramatsu
Fix machine__findnew_module_map to drop the reference to the dso because it is already referenced by both machine__findnew_module_dso() and map__new2(). Refcnt debugger shows: ==== [1] ==== Unreclaimed dso: 0x1ffd980 Refcount +1 => 1 at ./perf(dso__new+0x1ff) [0x4a62df] ./perf(__dsos__addnew+0x29) [0x4a6e19] ./perf() [0x4b8b91] ./perf(modules__parse+0xfc) [0x4a9d5c] ./perf() [0x4b8460] ./perf(machine__create_kernel_maps+0x150) [0x4bb550] ./perf(machine__new_host+0xfa) [0x4bb75a] ./perf(init_probe_symbol_maps+0x93) [0x506623] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5] ./perf() [0x4220a9] This map_groups__insert(0x4b8b91) already gets a reference to the new dso: ---- eu-addr2line -e ./perf -f 0x4b8b91 map_groups__insert inlined at util/machine.c:586 in machine__create_module util/map.h:207 ---- So this dso refcnt will be released when map_groups gets released. [snip] Refcount +1 => 2 at ./perf(dso__get+0x34) [0x4a65f4] ./perf() [0x4b8b35] ./perf(modules__parse+0xfc) [0x4a9d5c] ./perf() [0x4b8460] ./perf(machine__create_kernel_maps+0x150) [0x4bb550] ./perf(machine__new_host+0xfa) [0x4bb75a] ./perf(init_probe_symbol_maps+0x93) [0x506623] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5] ./perf() [0x4220a9] Here, machine__findnew_module_dso(0x4b8b35) gets the dso (and stores it in a local variable): ---- # eu-addr2line -e ./perf -f 0x4b8b35 machine__findnew_module_dso inlined at util/machine.c:578 in machine__create_module util/machine.c:514 ---- Refcount +1 => 3 at ./perf(dso__get+0x34) [0x4a65f4] ./perf(map__new2+0x76) [0x4be1c6] ./perf() [0x4b8b4f] ./perf(modules__parse+0xfc) [0x4a9d5c] ./perf() [0x4b8460] ./perf(machine__create_kernel_maps+0x150) [0x4bb550] ./perf(machine__new_host+0xfa) [0x4bb75a] ./perf(init_probe_symbol_maps+0x93) 
[0x506623] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5] ./perf() [0x4220a9] But also map__new2() gets the dso which will be put when the map is released. So, we have to drop the constructor reference obtained in machine__findnew_module_dso(). Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20151118064035.30709.58824.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-19perf tools: Fix machine__create_kernel_maps to put kernel dso refcountMasami Hiramatsu
Fix machine__create_kernel_maps() to put kernel dso because the dso has been gotten via __machine__create_kernel_maps(). Refcnt debugger shows: ==== [0] ==== Unreclaimed dso: 0x3036ab0 Refcount +1 => 1 at ./perf(dso__new+0x1ff) [0x4a62df] ./perf(__dsos__addnew+0x29) [0x4a6e19] ./perf(dsos__findnew+0xd1) [0x4a7181] ./perf(machine__findnew_kernel+0x27) [0x4a5e17] ./perf() [0x4b8cf2] ./perf(machine__create_kernel_maps+0x28) [0x4bb428] ./perf(machine__new_host+0xfa) [0x4bb74a] ./perf(init_probe_symbol_maps+0x93) [0x506613] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5] ./perf() [0x4220a9] [snip] Refcount +1 => 2 at ./perf(dsos__findnew+0x7e) [0x4a712e] ./perf(machine__findnew_kernel+0x27) [0x4a5e17] ./perf() [0x4b8cf2] ./perf(machine__create_kernel_maps+0x28) [0x4bb428] ./perf(machine__new_host+0xfa) [0x4bb74a] ./perf(init_probe_symbol_maps+0x93) [0x506613] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5] ./perf() [0x4220a9] [snip] Refcount -1 => 1 at ./perf(dso__put+0x2f) [0x4a664f] ./perf(machine__delete+0xfe) [0x4b93ee] ./perf(exit_probe_symbol_maps+0x28) [0x5066b8] ./perf() [0x45628a] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5] ./perf() [0x4220a9] Actually, dsos__findnew gets the dso before returning it, so the dso user (in this case machine__create_kernel_maps) has to put the dso after used. 
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20151118064033.30709.98954.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-19perf tools: Fix __dsos__addnew to put dso after adding it to the listMasami Hiramatsu
__dsos__addnew should drop the constructor reference to dso after adding it to the list, because __dsos__add() will get a reference that will be kept while it is in the list. This fixes DSO leaks when entries are removed from the list and the refcount never gets to zero. Refcnt debugger shows: ==== [0] ==== Unreclaimed dso: 0x2fccab0 Refcount +1 => 1 at ./perf(dso__new+0x1ff) [0x4a62df] ./perf(__dsos__addnew+0x29) [0x4a6e19] ./perf(dsos__findnew+0xd1) [0x4a7281] ./perf(machine__findnew_kernel+0x27) [0x4a5e17] ./perf() [0x4b8df2] ./perf(machine__create_kernel_maps+0x28) [0x4bb528] ./perf(machine__new_host+0xfa) [0x4bb84a] ./perf(init_probe_symbol_maps+0x93) [0x506713] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5] ./perf() [0x4220a9] Refcount +1 => 2 at ./perf(__dsos__addnew+0xfb) [0x4a6eeb] ./perf(dsos__findnew+0xd1) [0x4a7281] ./perf(machine__findnew_kernel+0x27) [0x4a5e17] ./perf() [0x4b8df2] ./perf(machine__create_kernel_maps+0x28) [0x4bb528] ./perf(machine__new_host+0xfa) [0x4bb84a] ./perf(init_probe_symbol_maps+0x93) [0x506713] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5] ./perf() [0x4220a9] Refcount +1 => 3 at ./perf(dsos__findnew+0x7e) [0x4a722e] ./perf(machine__findnew_kernel+0x27) [0x4a5e17] ./perf() [0x4b8df2] ./perf(machine__create_kernel_maps+0x28) [0x4bb528] ./perf(machine__new_host+0xfa) [0x4bb84a] ./perf(init_probe_symbol_maps+0x93) [0x506713] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5] ./perf() [0x4220a9] [snip] Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20151118064031.30709.81460.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-19perf tools: Fix to put new map after inserting to map_groups in dso__load_symMasami Hiramatsu
Fix dso__load_sym to put the map object which is already inserted into kmaps. Refcnt debugger shows: ==== [0] ==== Unreclaimed map: 0x39113e0 Refcount +1 => 1 at ./perf(map__new2+0xb5) [0x4be155] ./perf(dso__load_sym+0xee1) [0x503461] ./perf(dso__load_vmlinux+0xbf) [0x4aa6df] ./perf(dso__load_vmlinux_path+0x8c) [0x4aa83c] ./perf() [0x50528a] ./perf(convert_perf_probe_events+0xd79) [0x50ac29] ./perf() [0x45600f] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f152368baf5] ./perf() [0x4220a9] Refcount +1 => 2 at ./perf(maps__insert+0x9a) [0x4bfffa] ./perf(dso__load_sym+0xf89) [0x503509] ./perf(dso__load_vmlinux+0xbf) [0x4aa6df] ./perf(dso__load_vmlinux_path+0x8c) [0x4aa83c] ./perf() [0x50528a] ./perf(convert_perf_probe_events+0xd79) [0x50ac29] ./perf() [0x45600f] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f152368baf5] ./perf() [0x4220a9] Refcount -1 => 1 at ./perf(map_groups__exit+0x94) [0x4bed04] ./perf(machine__delete+0xb0) [0x4b9300] ./perf(exit_probe_symbol_maps+0x28) [0x506608] ./perf() [0x45628a] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f152368baf5] ./perf() [0x4220a9] This means that dso__load_sym calls map__new2 and maps__insert, both of which bump the map refcount, but map_groups__exit drops just one reference. Fix it by dropping the refcount after inserting the map into kmaps. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20151118064026.30709.50038.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
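The reference-counting rule behind this fix (and the related dso fixes) is that the constructor hands out one reference, the container takes its own on insert, and the creator must drop the constructor reference once the object is in the container. A minimal sketch with hypothetical toy_* names, not the perf map or dso code:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a map or dso. */
struct toy_obj {
	int refcnt;
};

/* Number of objects allocated and not yet freed (for leak checks). */
static int toy_live_objects;

static struct toy_obj *toy_obj_new(void)
{
	struct toy_obj *obj = calloc(1, sizeof(*obj));

	if (obj != NULL) {
		obj->refcnt = 1; /* the constructor reference */
		toy_live_objects++;
	}
	return obj;
}

static struct toy_obj *toy_obj_get(struct toy_obj *obj)
{
	if (obj != NULL)
		obj->refcnt++;
	return obj;
}

static void toy_obj_put(struct toy_obj *obj)
{
	if (obj == NULL)
		return;
	assert(obj->refcnt > 0);
	if (--obj->refcnt == 0) {
		free(obj);
		toy_live_objects--;
	}
}

/* Hypothetical one-slot container standing in for kmaps or a dso list. */
struct toy_list {
	struct toy_obj *slot;
};

/* The container takes its own reference on insert ... */
static void toy_list_insert(struct toy_list *list, struct toy_obj *obj)
{
	list->slot = toy_obj_get(obj);
}

/* ... and drops exactly that one reference on exit. */
static void toy_list_exit(struct toy_list *list)
{
	toy_obj_put(list->slot);
	list->slot = NULL;
}

/* Create an object, insert it, then drop the constructor reference. */
static int toy_list_add_new(struct toy_list *list)
{
	struct toy_obj *obj = toy_obj_new();

	if (obj == NULL)
		return -1;
	toy_list_insert(list, obj);
	toy_obj_put(obj); /* without this put the object is never freed */
	return 0;
}
```

If the final toy_obj_put() in toy_list_add_new() is omitted, the count stays at 1 after toy_list_exit(), which is the same shape of leak the refcnt debugger output above reports.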
2015-11-19perf tools: Make perf_exec_path() always return malloc'd stringMasami Hiramatsu
Since system_path() returns malloc'd string if given path is not an absolute path, perf_exec_path() sometimes returns a static string and sometimes returns a malloc'd string depending on the environment variables or command options. This may cause a memory leak because the caller can not unconditionally free the returned string. This fixes perf_exec_path() and system_path() to always return a malloc'd string, so the caller can always free it. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20151119060453.14210.65666.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
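The ownership rule this commit establishes, always returning heap memory so the caller can free unconditionally, can be sketched as follows. toy_system_path() and its prefix argument are hypothetical; the real system_path() and perf_exec_path() have different signatures and lookup logic.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: heap copy of 's', or NULL on allocation failure. */
static char *toy_dup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Return a heap-allocated path in every case: a copy of 'path' when it
 * is already absolute, otherwise "<prefix>/<path>". The caller always
 * owns the result and can always free() it.
 */
static char *toy_system_path(const char *prefix, const char *path)
{
	size_t len;
	char *buf;

	assert(prefix != NULL && path != NULL);
	if (path[0] == '/')
		return toy_dup(path);

	len = strlen(prefix) + 1 + strlen(path) + 1;
	buf = malloc(len);
	if (buf != NULL)
		snprintf(buf, len, "%s/%s", prefix, path);
	return buf;
}
```

Copying the absolute path costs one extra allocation, but it removes the need for callers to know which branch produced the string.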