summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)Author
2023-04-04perf map: Add accessor for dsoIan Rogers
Later changes will add reference count checking for struct map, with dso being the most frequently accessed variable. Add an accessor so that the reference count check is only necessary in one place. Additional changes: - add a dso variable to avoid repeated map__dso calls. - in builtin-mem.c dump_raw_samples, code only partially tested for dso == NULL. Make the possibility of NULL consistent. - in thread.c thread__memcpy fix use of spaces and use tabs. Committer notes: Did missing conversions on these files: tools/perf/arch/powerpc/util/skip-callchain-idx.c tools/perf/arch/powerpc/util/sym-handling.c tools/perf/ui/browsers/hists.c tools/perf/ui/gtk/annotate.c tools/perf/util/cs-etm.c tools/perf/util/thread.c tools/perf/util/unwind-libunwind-local.c tools/perf/util/unwind-libunwind.c Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf maps: Add functions to access mapsIan Rogers
Introduce functions to access struct maps. These functions reduce the number of places reference counting is necessary. While tidying APIs do some small const-ification, in particlar to unwind_libunwind_ops. Committer notes: Fixed up tools/perf/util/unwind-libunwind.c: - return ops->get_entries(cb, arg, thread, data, max_stack); + return ops->get_entries(cb, arg, thread, data, max_stack, best_effort); Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf maps: Remove rb_node from struct mapIan Rogers
struct map is reference counted, having it also be a node in an red-black tree complicates the reference counting. Switch to having a map_rb_node which is a red-block tree node but points at the reference counted struct map. This reference is responsible for a single reference count. Committer notes: Fixed up tools/perf/util/unwind-libunwind-local.c to use map_rb_node as well. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf map: Move map list node into symbolIan Rogers
Using a perf map as a list node is only done in symbol. Move the list_node struct into symbol as a single pointer to the map. This makes reference count behavior more obvious and easy to check. Committer notes: Some changes to reduce the number of lines touched by keeping, for instance, the 'new_map' variable and setting it to new_node->map, so that we keep more of the project history in place and keep as much as possible the value of the 'git blame' tool. Also use map__zput() when putting a struct members, so that when we free the container struct we can get use-after-free errors as NULL pointer derefs sometimes. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf jit: Fix a few memory leaksIan Rogers
As reported by leak sanitizer. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Brian Robbins <brianrob@linux.microsoft.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20230403203545.1872196-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf build: Allow C++ demangle without libelfIan Rogers
The cxa demangle support isn't dependent on libelf and so we no longer need to disable demangling if libelf isn't present. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230403211021.1892231-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf srcline: Avoid addr2line SIGPIPEsIan Rogers
Ignore SIGPIPEs when addr2line is configured. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230403184033.1836023-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf srcline: Support for llvm-addr2lineIan Rogers
The sentinel value differs for llvm-addr2line. Configure this once and then detect when reading records. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230403184033.1836023-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf srcline: Simplify addr2line subprocessIan Rogers
Don't wrap stdin and stdout of subprocess with streams, use the api/io library for buffering. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230403184033.1836023-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04tools api: Add io__getlineIan Rogers
Reads a line to allocated memory up to a newline following the getline API. Committer notes: It also adds this new function to the 'api io' 'perf test' entry: $ perf test "api io" 64: Test api io : Ok $ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230403184033.1836023-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf intel-pt: Use perf_pmu__scan_file_at() if possibleNamhyung Kim
Intel-PT calls perf_pmu__scan_file() a lot, let's use relative address when it accesses multiple files at one place. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf pmu: Add perf_pmu__{open,scan}_file_at()Namhyung Kim
These two helpers will also use openat() to reduce the overhead with relative pathnames. Convert other functions in pmu_lookup() to use the new helpers. Committer testing: Before: ⬢[acme@toolbox perf-tools-next]$ perf bench internals pmu-scan # Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 100 times Average PMU scanning took: 2729.040 usec (+- 7.117 usec) ⬢[acme@toolbox perf-tools-next]$ After: ⬢[acme@toolbox perf-tools-next]$ perf bench internals pmu-scan # Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 100 times Average PMU scanning took: 2419.870 usec (+- 9.057 usec) ⬢[acme@toolbox perf-tools-next]$ Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf pmu: Use relative path in setup_pmu_alias_list()Namhyung Kim
Likewise, x86 needs to traverse the PMU list to build alias. Let's use the new helpers to use relative paths. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf pmu: Use relative path in perf_pmu__caps_parse()Namhyung Kim
Likewise, it needs to traverse the pmu/caps directory, let's use openat() with the dirfd instead of open() using the absolute path. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: LKML <linux-kernel@vger.kernel.org> Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf pmu: Use relative path for sysfs scanNamhyung Kim
The PMU information is in the kernel sysfs so it needs to scan the directory to get the whole information like event aliases, formats and so on. During the traversal, it opens a lot of files and directories like below: dir = opendir("/sys/bus/event_source/devices"); while (dentry = readdir(dir)) { char buf[PATH_MAX]; snprintf(buf, sizeof(buf), "%s/%s", "/sys/bus/event_source/devices", dentry->d_name); fd = open(buf, O_RDONLY); ... } But this is not good since it needs to copy the string to build the absolute pathname, and it makes redundant pathname walk (from the /sys) unnecessarily. We can use openat(2) to open the file in the given directory. While it's not a problem ususally, it can be a problem when the kernel has contentions on the sysfs. Add a couple of new helper to return the file descriptor of PMU directory so that it can use it with relative paths. * perf_pmu__event_source_devices_fd() - returns a fd for the PMU root ("/sys/bus/event_source/devices") * perf_pmu__pathname_fd() - returns a fd for "<pmu>/<file>" under the PMU root Now the above code can be converted something like below: dirfd = perf_pmu__event_source_devices_fd(); dir = fdopendir(dirfd); while (dentry = readdir(dir)) { fd = openat(dirfd, dentry->d_name, O_RDONLY); ... } Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf bench: Add pmu-scan benchmarkNamhyung Kim
The pmu-scan benchmark will repeatedly scan the sysfs to get the available PMU information. $ ./perf bench internals pmu-scan # Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 100 times Average PMU scanning took: 6850.990 usec (+- 48.445 usec) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf pmu: Add perf_pmu__destroy() functionNamhyung Kim
It seems there's no function to delete the perf pmu struct. Add the perf_pmu__destroy() to do the job. While at it, add some more helper functions to delete pmu aliases and caps. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf tools: Fix a asan issue in parse_events_multi_pmu_add()Namhyung Kim
In the parse_events_multi_pmu_add() it passes the 'config' variable twice to parse_events_term__num() - one for config and another for loc_term. I'm not sure about the second one as it's converted to YYLTYPE variable. Asan reports it like below: In function ‘parse_events_term__num’, inlined from ‘parse_events_multi_pmu_add’ at util/parse-events.c:1602:6: util/parse-events.c:2653:64: error: array subscript ‘YYLTYPE[0]’ is partly outside array bounds of ‘char[8]’ [-Werror=array-bounds] 2653 | .err_term = loc_term ? loc_term->first_column : 0, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~ util/parse-events.c: In function ‘parse_events_multi_pmu_add’: util/parse-events.c:1587:15: note: object ‘config’ of size 8 1587 | char *config; | ^~~~~~ cc1: all warnings being treated as errors Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf list: Use relative path for tracepoint scanNamhyung Kim
Committer notes: Added missing #include <unistd.h> for the close() prototype to fix this on Alma Linux 8: 1 21.54 almalinux:8 : FAIL gcc version 8.5.0 20210514 (Red Hat 8.5.0-16) (GCC) util/print-events.c: In function 'print_tracepoint_events': util/print-events.c:103:4: error: implicit declaration of function 'close'; did you mean 'clone'? [-Werror=implicit-function-declaration] close(evt_fd); ^~~~~ clone Also use the newly added scandirat feature test to check if that function is available, providing a HAVE_SCANDIRAT_SUPPORT conditional warning to the user if it isn't available. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04tools build: Add a feature test for scandirat(), that is not implemented so ↵Arnaldo Carvalho de Melo
far in musl and uclibc We use it just when listing tracepoint events, and for root, so just emit a warning about it to get users to ask the library maintainers to implement it, as suggested in this systemd ticket: https://github.com/systemd/casync/issues/129 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZCwv4z5Dh%2FdHUMG6@kernel.org/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf intel-pt: Fix CYC timestamps after standalone CBRAdrian Hunter
After a standalone CBR (not associated with TSC), update the cycles reference timestamp and reset the cycle count, so that CYC timestamps are calculated relative to that point with the new frequency. Fixes: cc33618619cefc6d ("perf tools: Add Intel PT support for decoding CYC packets") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230403154831.8651-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf auxtrace: Fix address filter entire kernel sizeAdrian Hunter
kallsyms is not completely in address order. In find_entire_kern_cb(), calculate the kernel end from the maximum address not the last symbol. Example: Before: $ sudo cat /proc/kallsyms | grep ' [twTw] ' | tail -1 ffffffffc00b8bd0 t bpf_prog_6deef7357e7b4530 [bpf] $ sudo cat /proc/kallsyms | grep ' [twTw] ' | sort | tail -1 ffffffffc15e0cc0 t iwl_mvm_exit [iwlmvm] $ perf.d093603a05aa record -v --kcore -e intel_pt// --filter 'filter *' -- uname |& grep filter Address filter: filter 0xffffffff93200000/0x2ceba000 After: $ perf.8fb0f7a01f8e record -v --kcore -e intel_pt// --filter 'filter *' -- uname |& grep filter Address filter: filter 0xffffffff93200000/0x2e3e2000 Fixes: 1b36c03e356936d6 ("perf record: Add support for using symbols in address filters") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230403154831.8651-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf arm-spe: Add raw decoding for SPEv1.3 MTE and MOPS load/storeRob Herring
Arm SPEv1.3 adds new load/store operation subclasses for Memory Tagging Extension (MTE) and memory operations (MOPS). The memory operations are memcpy and memset. Add support for decoding these new subclasses in the raw decoding. Reviewed-by: Leo Yan <leo.yan@linaro.org Signed-off-by: Rob Herring <robh@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230327162057.4057188-1-robh@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf cs-etm: Handle PERF_RECORD_AUX_OUTPUT_HW_ID packetMike Leach
When using dynamically assigned CoreSight trace IDs the drivers can output the ID / CPU association as a PERF_RECORD_AUX_OUTPUT_HW_ID packet. Update cs-etm decoder to handle this packet by setting the CPU/Trace ID mapping. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Mike Leach <mike.leach@linaro.org> Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Darren Hart <darren@os.amperecomputing.com> Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230331055645.26918-2-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf cs-etm: Update record event to use new Trace ID protocolMike Leach
Trace IDs are now dynamically allocated. Previously used the static association algorithm that is no longer used. The 'cpu * 2 + seed' was outdated and broken for systems with high core counts (>46). as it did not scale and was broken for larger core counts. Trace ID will now be sent in PERF_RECORD_AUX_OUTPUT_HW_ID record. Legacy ID algorithm renamed and retained for limited backward compatibility use. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Mike Leach <mike.leach@linaro.org> Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Darren Hart <darren@os.amperecomputing.com> Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230331055645.26918-2-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf cs-etm: Move mapping of Trace ID and cpu into helper functionMike Leach
The information to associate Trace ID and CPU will be changing. Drivers will start outputting this as a hardware ID packet in the data file which if present will be used in preference to the AUXINFO values. To prepare for this we provide a helper functions to do the individual ID mapping, and one to extract the IDs from the completed metadata blocks. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Mike Leach <mike.leach@linaro.org> Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Darren Hart <darren@os.amperecomputing.com> Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230331055645.26918-2-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf lock contention: Show detail failure reason for BPFNamhyung Kim
It can fail to collect lock stat from BPF for various reasons. For example, I've got a report that sometimes time calculation seems wrong in case of contended spinlocks. I suspect the time delta went negative for some reason. Count them separately and show in the output like below: $ sudo perf lock contention -abE5 sleep 10 contended total wait max wait avg wait type caller 13 785.61 us 79.36 us 60.43 us spinlock remove_wait_queue+0x14 10 469.02 us 87.51 us 46.90 us spinlock prepare_to_wait+0x27 9 289.09 us 69.08 us 32.12 us spinlock finish_wait+0x36 114 251.05 us 8.56 us 2.20 us spinlock try_to_wake_up+0x1f5 132 188.63 us 5.01 us 1.43 us spinlock __wake_up_common_lock+0x62 === output for debug === bad: 1, total: 279 bad rate: 0.36 % histogram of failure reasons task: 1 stack: 0 time: 0 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230327225711.245738-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf lock contention: Fix debug stat if no contentionNamhyung Kim
It should not divide if the total number is 0. Otherwise it'd show NaN in the bad rate output. Also add a whitespace in the "output for debug" message. $ sudo perf lock contention -abv true Looking at the vmlinux_path (8 entries long) symsrc__init: cannot get elf header. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols contended total wait max wait avg wait type caller === output for debug=== bad: 0, total: 0 bad rate: -nan % <------------------------- (here) histogram of events caused bad sequence acquire: 0 acquired: 0 contended: 0 release: 0 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230327225711.245738-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events intel: Update ivybridge and ivytownIan Rogers
Update to versions 24 and 23 respectively. Adds the event BR_MISP_EXEC.INDIRECT. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230328234142.1080045-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf bench numa: Fix type of loop iterator in do_work, it should be 'long'Andreas Herrmann
'j' is of type int and start/end are of type 'long'. Thus 'j' might become negative and cause segfault in access_data(). Fix it by using 'long' for 'j' as well. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Andreas Herrmann <aherrmann@suse.de> Link: https://lore.kernel.org/r/20230330074202.14052-1-aherrmann@suse.de Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf symbol: Remove unused branch_callstackAdrian Hunter
branch_callstack was added by commit 8b7bad58efb7 ("perf callchain: Support handling complete branch stacks as histograms") but never used. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230330131833.12864-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf top: Add --branch-history optionAdrian Hunter
Add --branch-history option, to act the same as that option does for perf report. Example: $ cat tcallf.c volatile a = 10000, b = 100000, c; __attribute__((noinline)) f2() { c = a / b; } __attribute__((noinline)) f1() { f2(); f2(); } main() { while (1) f1(); } $ gcc -w -g -o tcallf tcallf.c $ ./tcallf & [1] 29409 $ perf top -e cycles:u -t $(pidof tcallf) --stdio --no-children --branch-history PerfTop: 3819 irqs/sec kernel: 0.0% exact: 0.0% lost: 0/0 drop: 0/0 [4000Hz cycles:u], (target_tid: 29409) -------------------------------------------------------------------------------------------------------------------- 49.01% tcallf.c:5 [.] f2 tcallf | |--24.91%--f2 tcallf.c:4 | | | |--17.14%--f1 tcallf.c:11 (cycles:1) | | f1 tcallf.c:11 | | f2 tcallf.c:6 (cycles:3) | | f2 tcallf.c:4 | | f1 tcallf.c:10 (cycles:2) | | f1 tcallf.c:9 | | main tcallf.c:16 (cycles:1) | | main tcallf.c:16 | | main tcallf.c:16 (cycles:1) | | main tcallf.c:16 | | f1 tcallf.c:12 (cycles:1) | | f1 tcallf.c:12 | | f2 tcallf.c:6 (cycles:3) | | f2 tcallf.c:4 | | f1 tcallf.c:11 (cycles:1 iter:1 avg_cycles:12) | | f1 tcallf.c:11 | | f2 tcallf.c:6 (cycles:3 iter:1 avg_cycles:12) | | f2 tcallf.c:4 | | f1 tcallf.c:10 (cycles:2 iter:1 avg_cycles:12) | | | --7.78%--f1 tcallf.c:10 (cycles:2) | f1 tcallf.c:9 | main tcallf.c:16 (cycles:1) | main tcallf.c:16 | main tcallf.c:16 (cycles:1) | main tcallf.c:16 | f1 tcallf.c:12 (cycles:1) | f1 tcallf.c:12 | f2 tcallf.c:6 (cycles:3) | f2 tcallf.c:4 | f1 tcallf.c:11 (cycles:1) | f1 tcallf.c:11 | f2 tcallf.c:6 (cycles:3) | f2 tcallf.c:4 | f1 tcallf.c:10 (cycles:2 iter:1 avg_cycles:12) | f1 tcallf.c:9 | main tcallf.c:16 (cycles:1 iter:1 avg_cycles:12) | main tcallf.c:16 | main tcallf.c:16 (cycles:1 iter:1 avg_cycles:12) ... $ pkill tcallf [1]+ Terminated ./tcallf Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230330131833.12864-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf build: Conditionally define NDEBUGIan Rogers
When a build is done without DEBUG=1 then define NDEBUG. This will compile out asserts and other debug code. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230330183827.1412303-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf block-range: Move debug code behind ifndef NDEBUGIan Rogers
Make good on a comment and avoid a unused-but-set-variable warning. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230330183827.1412303-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf bench: Avoid NDEBUG warningIan Rogers
With NDEBUG set the asserts are compiled out. This yields "unused-but-set-variable" variables. Move these variables behind NDEBUG to avoid the warning. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230330183827.1412303-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events: Update Alderlake for E-Core TMA v2.3Ian Rogers
https://github.com/intel/perfmon/pull/65 Generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py The PR notes state: - E-Core TMA version 2.3. - FP_UOPS changed to FPDIV_Uops - Added BR_MISP breakdown stats - Frontend_Bandwidth/Latency changed to Fetch_Bandwidth/Latency - Load_Store_Bound changed to Memory_Bound - Icache changed to ICache_Misses - ITLB changed to ITLB_Misses - Store_Fwd changed to Store_Fwd_Blk Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230329162318.1227114-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf symbol: Add command line support for addr2line pathIan Rogers
Allow addr2line to be set either on the command line or via the perfconfig file. This doesn't currently work with llvm-addr2line as the addr2line code emits two things: 1) the address to decode, 2) a bogus ',' value. The expectation is the bogus value will generate: ?? ??:0 that terminates the addr2line reading. However, the output from llvm-addr2line is a single line with just the input ',' locking up the addr2line reading that is expecting a second line. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf annotate: Allow objdump to be set in perfconfigIan Rogers
Allow the setting of the objdump command in the perfconfig. Update man page for this new option. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf annotate: Own objdump_path and disassembler_style stringsIan Rogers
Make struct annotation_options own the strings objdump_path and disassembler_style, freeing them on exit. Add missing strdup for disassembler_style when read from a config file. Committer notes: Converted free(obj->member) to zfree(&obj->member) in annotation_options__exit() Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf annotate: Add init/exit to annotation_options remove defaultIan Rogers
The annotation__default_options global variable was used to initialize annotation_options. Switch to the init/exit pattern as later changes will give ownership over strings and this will be necessary to avoid memory leaks. Committer note: Fix the GTK2=1 build, hist_entry__gtk_annotate() needs to receive a 'struct annotation_options' pointer. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf report: Additional config warningsIan Rogers
If the default_sort_order isn't correctly strdup-ed warn and return an error. Debug warn if no option is matched. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf annotate: Delete session for debug buildsIan Rogers
Use the debug build indicator as the guide to free the session. This implements a behavior described in a comment, which is consequentially removed. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf tools: Avoid warning in do_realloc_array_as_needed()Adrian Hunter
do_realloc_array_as_needed() used memcpy() of zero size with a NULL pointer. Check the size first to avoid sanitize warning. Discovered using EXTRA_CFLAGS="-fsanitize=undefined -fsanitize=address". Reported-by: kernel test robot <yujie.liu@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/oe-lkp/202303061424.6ad43294-yujie.liu@intel.com Link: https://lore.kernel.org/r/20230316194156.8320-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf symbols: Fix unaligned access in get_x86_64_plt_disp()Adrian Hunter
Use memcpy() to avoid unaligned access. Discovered using EXTRA_CFLAGS="-fsanitize=undefined -fsanitize=address". Fixes: ce4c8e7966f317ef ("perf symbols: Get symbols for .plt.got for x86-64") Reported-by: kernel test robot <yujie.liu@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/oe-lkp/202303061424.6ad43294-yujie.liu@intel.com Link: https://lore.kernel.org/r/20230316194156.8320-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf symbols: Fix use-after-free in get_plt_got_name()Adrian Hunter
Fix use-after-free in get_plt_got_name(). Discovered using EXTRA_CFLAGS="-fsanitize=undefined -fsanitize=address". Fixes: ce4c8e7966f317ef ("perf symbols: Get symbols for .plt.got for x86-64") Reported-by: kernel test robot <yujie.liu@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/oe-lkp/202303061424.6ad43294-yujie.liu@intel.com Link: https://lore.kernel.org/r/20230316194156.8320-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events power9: Remove UTF-8 characters from JSON filesKajol Jain
Commit 3c22ba5243040c13 ("perf vendor events powerpc: Update POWER9 events") added and updated power9 PMU JSON events. However some of the JSON events which are part of other.json and pipeline.json files, contains UTF-8 characters in their brief description. Having UTF-8 character could breaks the perf build on some distros. Fix this issue by removing the UTF-8 characters from other.json and pipeline.json files. Result without the fix: [command]# file -i pmu-events/arch/powerpc/power9/* pmu-events/arch/powerpc/power9/cache.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/floating-point.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/frontend.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/marked.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/memory.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/metrics.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/nest_metrics.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/other.json: application/json; charset=utf-8 pmu-events/arch/powerpc/power9/pipeline.json: application/json; charset=utf-8 pmu-events/arch/powerpc/power9/pmc.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/translation.json: application/json; charset=us-ascii [command]# Result with the fix: [command]# file -i pmu-events/arch/powerpc/power9/* pmu-events/arch/powerpc/power9/cache.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/floating-point.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/frontend.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/marked.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/memory.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/metrics.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/nest_metrics.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/other.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/pipeline.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/pmc.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/translation.json: application/json; charset=us-ascii [command]# Fixes: 3c22ba5243040c13 ("perf vendor events powerpc: Update POWER9 events") Reported-by: Arnaldo Carvalho de Melo <acme@kernel.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Disha Goel <disgoel@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/lkml/ZBxP77deq7ikTxwG@kernel.org/ Link: https://lore.kernel.org/r/20230328112908.113158-1-kjain@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf ftrace: Make system wide the default target for latency subcommandYang Jihong
If no target is specified for 'latency' subcommand, the execution fails because - 1 (invalid value) is written to set_ftrace_pid tracefs file. Make system wide the default target, which is the same as the default behavior of 'trace' subcommand. Before the fix: # perf ftrace latency -T schedule failed to set ftrace pid After the fix: # perf ftrace latency -T schedule ^C# DURATION | COUNT | GRAPH | 0 - 1 us | 0 | | 1 - 2 us | 0 | | 2 - 4 us | 0 | | 4 - 8 us | 2828 | #### | 8 - 16 us | 23953 | ######################################## | 16 - 32 us | 408 | | 32 - 64 us | 318 | | 64 - 128 us | 4 | | 128 - 256 us | 3 | | 256 - 512 us | 0 | | 512 - 1024 us | 1 | | 1 - 2 ms | 4 | | 2 - 4 ms | 0 | | 4 - 8 ms | 0 | | 8 - 16 ms | 0 | | 16 - 32 ms | 0 | | 32 - 64 ms | 0 | | 64 - 128 ms | 0 | | 128 - 256 ms | 4 | | 256 - 512 ms | 2 | | 512 - 1024 ms | 0 | | 1 - ... s | 0 | | Fixes: 53be50282269b46c ("perf ftrace: Add 'latency' subcommand") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230324032702.109964-1-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf bench syscall: Add fork syscall benchmarkTiezhu Yang
This is a follow up patch for the execve bench which is actually fork + execve, it makes sense to add the fork syscall benchmark to compare the execve part precisely. Some archs have no __NR_fork definition which is used only as a check condition to call test_fork(), let us just define it as -1 to avoid build error. Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: loongson-kernel@lists.loongnix.cn Link: https://lore.kernel.org/r/1679381821-22736-1-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf stat: Suppress warning when using cpum_cf events on s390Thomas Richter
Running command perf stat -vv -e cpu_cycles -C0 -- true displays this warning: Attempting to add event pmu 'cpum_cf' with 'cpu_cycles,' that may result in non-fatal errors Make the PMU cpum_cf selectable and avoid this warning. While at it also fix this warning for PMUs pai_crypto and pai_ext. Output before: # ./perf stat -vv -e cpu_cycles -C0 -- true Using CPUID IBM,3931,704,A01,3.7,002f Attempting to add event pmu 'cpum_cf' with 'cpu_cycles,' that may result in non-fatal errors After aliases, add event pmu 'cpum_cf' with 'event,' that may result in non-fatal errors cpu_cycles -> cpum_cf/event=0/ Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 10 size 128 config 0x1001 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3 cpu_cycles: 0: 290434 2479172 2479172: cpu_cycles: 290434 2479172 2479172 Performance counter stats for 'CPU(s) 0': 290,434 cpu_cycles 0.002465617 seconds time elapsed # Now the warning "Attempting to add event pmu 'cpum_cf' ..." does not show up anymore. Output after: # ./perf stat -vv -e cpu_cycles -C0 -- true Using CPUID IBM,3931,704,A01,3.7,002f After aliases, add event pmu 'cpum_cf' with 'event,' that may result in non-fatal errors cpu_cycles -> cpum_cf/event=0/ Control descriptor is not initialized .... Performance counter stats for 'CPU(s) 0': 357,023 cpu_cycles 0.002454995 seconds time elapsed # Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20230316074946.41110-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf tests record_offcpu.sh: Fix redirection of stderr to stdinPatrice Duroux
It's not 2&>1, the correct is 2>&1 Fixes: ade1d0307b2fb3d9 ("perf offcpu: Update offcpu test for child process") Signed-off-by: Patrice Duroux <patrice.duroux@gmail.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230303193058.21274-1-patrice.duroux@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>