summaryrefslogtreecommitdiff
path: root/tools/perf/Documentation/perf-record.txt
AgeCommit message (Collapse)Author
2023-03-15perf record: Update documentation for BPF filtersNamhyung Kim
Add more description and examples. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf record: Add BPF event filter supportNamhyung Kim
Use --filter option to set BPF filter for generic events other than the tracepoints or Intel PT. The BPF program will check the sample data and filter according to the expression. For example, the below is the typical perf record for frequency mode. The sample period started from 1 and increased gradually. $ sudo ./perf record -e cycles true $ sudo ./perf script perf-exec 2272336 546683.916875: 1 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916892: 1 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916899: 3 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916905: 17 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916911: 100 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916917: 589 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916924: 3470 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916930: 20465 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) true 2272336 546683.916940: 119873 cycles: ffffffff8283afdd perf_iterate_ctx+0x2d ([kernel.kallsyms]) true 2272336 546683.917003: 461349 cycles: ffffffff82892517 vma_interval_tree_insert+0x37 ([kernel.kallsyms]) true 2272336 546683.917237: 635778 cycles: ffffffff82a11400 security_mmap_file+0x20 ([kernel.kallsyms]) When you add a BPF filter to get samples having periods greater than 1000, the output would look like below: $ sudo ./perf record -e cycles --filter 'period > 1000' true $ sudo ./perf script perf-exec 2273949 546850.708501: 5029 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms]) perf-exec 2273949 546850.708508: 32409 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms]) perf-exec 2273949 546850.708526: 143369 cycles: ffffffff82b4cdbf xas_start+0x5f ([kernel.kallsyms]) perf-exec 2273949 546850.708600: 372650 cycles: ffffffff8286b8f7 __pagevec_lru_add+0x117 ([kernel.kallsyms]) perf-exec 2273949 546850.708791: 482953 cycles: ffffffff829190de __mod_memcg_lruvec_state+0x4e ([kernel.kallsyms]) true 2273949 546850.709036: 501985 cycles: ffffffff828add7c tlb_gather_mmu+0x4c ([kernel.kallsyms]) true 2273949 546850.709292: 503065 cycles: 7f2446d97c03 _dl_map_object_deps+0x973 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) Committer notes: Add stubs for perf_bpf_filter__prepare() and perf_bpf_filter__destroy() to tools/perf/util/python.c to keep it building. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14perf evlist: Remove group option.Ian Rogers
The group option predates grouping events using curly braces added in commit 89efb029502d7f2d ("perf tools: Add support to parse event group syntax"). The --group option was retained for legacy support (in August 2012) but keeping it adds complexity. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Eelco Chaudron <echaudro@redhat.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Shaomin Deng <dengshaomin@cdjrlc.com> Cc: Stephane Eranian <eranian@google.com> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20221213232651.1269909-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14perf record: Add remaining branch filters: "no_cycles", "no_flags" & "hw_index"Anshuman Khandual
This adds all remaining branch filters i.e "no_cycles", "no_flags" and "hw_index". While here, also updates the documentation. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20221205064443.533587-1-anshuman.khandual@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-27perf tools: Make quiet mode consistent between toolsJames Clark
Use the global quiet variable everywhere so that all tools hide warnings in quiet mode and update the documentation to reflect this. 'perf probe' claimed that errors are not printed in quiet mode but I don't see this so remove it from the docs. Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221018094137.783081-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE eventsRavi Bangoria
Currently perf sets PERF_SAMPLE_WEIGHT flag only for mem load events. Set it for combined load-store event as well which will enable recording of load latency by default on arch that does not support independent mem load event. Also document missing -W in perf-record man page. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: https://lore.kernel.org/r/20221006153946.7816-5-ravi.bangoria@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf branch: Add branch privilege information request flagAnshuman Khandual
This updates the perf tools with branch privilege information request flag i.e PERF_SAMPLE_BRANCH_PRIV_SAVE that has been added earlier in the kernel. This also updates 'perf record' documentation, branch_modes[], and generic branch privilege level enumeration as added earlier in the kernel. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220824044822.70230-8-anshuman.khandual@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04perf record: Allow multiple recording time rangesAdrian Hunter
AUX area traces can produce too much data to record successfully or analyze subsequently. Add another means to reduce data collection by allowing multiple recording time ranges. This is useful, for instance, in cases where a workload produces predictably reproducible events in specific time ranges. Today we only have perf record -D <msecs> to start at a specific region, or some complicated approach using snapshot mode and external scripts sending signals or using the fifos. But these approaches are difficult to set up compared with simply having perf do it. Extend perf record option -D/--delay option to specifying relative time stamps for start stop controlled by perf with the right time offset, for instance: perf record -e intel_pt// -D 10-20,30-40 to record 10ms to 20ms into the trace and 30ms to 40ms. Example: The example workload is: $ cat repeat-usleep.c int usleep(useconds_t usec); int usage(int ret, const char *msg) { if (msg) fprintf(stderr, "%s\n", msg); fprintf(stderr, "Usage is: repeat-usleep <microseconds>\n"); return ret; } int main(int argc, char *argv[]) { unsigned long usecs; char *end_ptr; if (argc != 2) return usage(1, "Error: Wrong number of arguments!"); errno = 0; usecs = strtoul(argv[1], &end_ptr, 0); if (errno || *end_ptr || usecs > UINT_MAX) return usage(1, "Error: Invalid argument!"); while (1) { int ret = usleep(usecs); if (ret & errno != EINTR) return usage(1, "Error: usleep() failed!"); } return 0; } $ perf record -e intel_pt//u --delay 10-20,40-70,110-160 -- ./repeat-usleep 500 Events disabled Events enabled Events disabled Events enabled Events disabled Events enabled Events disabled [ perf record: Woken up 5 times to write data ] [ perf record: Captured and wrote 0.204 MB perf.data ] Terminated A dlfilter is used to determine continuous data collection (timestamps less than 1ms apart): $ cat dlfilter-show-delays.c static __u64 start_time; static __u64 last_time; int start(void **data, void *ctx) { printf("%-17s\t%-9s\t%-6s\n", " Time", " Duration", " Delay"); return 0; } int filter_event_early(void *data, const struct perf_dlfilter_sample *sample, void *ctx) { __u64 delta; if (!sample->time) return 1; if (!last_time) goto out; delta = sample->time - last_time; if (delta < 1000000) goto out2;; printf("%17.9f\t%9.1f\t%6.1f\n", start_time / 1000000000.0, (last_time - start_time) / 1000000.0, delta / 1000000.0); out: start_time = sample->time; out2: last_time = sample->time; return 1; } int stop(void *data, void *ctx) { printf("%17.9f\t%9.1f\n", start_time / 1000000000.0, (last_time - start_time) / 1000000.0); return 0; } The result shows the times roughly match the --delay option: $ perf script --itrace=qb --dlfilter dlfilter-show-delays.so Time Duration Delay 39215.302317300 9.7 20.5 39215.332480217 30.4 40.9 39215.403837717 49.8 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220824072814.16422-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-27perf docs: Update the documentation for the save_type filterKan Liang
Update the documentation to reflect the kernel changes. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220816125612.2042397-2-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-27perf record: Fix manpage formatting of description of support to hybrid systemsAndi Kleen
The Intel hybrid description is written in a different style than the rest of the perf record man page. There were some new command line options added after it which resulted in very strange section ordering. Move the hybrid include last. Also the sub sections in the hybrid document don't fit the record manpage well (especially since it talks about all kinds of unrelated commands). I left this for now, but would be better to separate this properly in the different man pages. It would be better to use sub sections for the other sections, but these don't seem to be supported in AsciiDoc? Some of the examples are still misrendered in the manpage with an indented troff command, but I don't know how to fix that. In any case it's now better than before. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: zhengjun.xing@intel.com Link: https://lore.kernel.org/r/20220818100127.249401-1-ak@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-12perf record: Allow to specify max stack depth of fp callchainNamhyung Kim
Currently it has no interface to specify the max stack depth for perf record. Extend the command line parameter to accept a number after 'fp' to specify the depth like '--call-graph fp,32'. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220615163222.1275500-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-23perf record: Add new option to sample identifierAdrian Hunter
In preparation for recording sideband events in a virtual machine guest so that they can be injected into a host perf.data file. Add an option to always include sample type PERF_SAMPLE_IDENTIFIER. Committer testing: # perf record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.020 MB perf.data (7 samples) ] # perf evlist -v cycles: size: 128, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # # # perf record --sample-identifier sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.022 MB perf.data (7 samples) ] # perf evlist -v cycles: size: 128, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220615052511.4441-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-03perf docs: Correct typo of event_sourcesIan Rogers
The sysfs directory is called event_source. Before: $ ls -la /sys/bus/event_sources/devices/cpu/format/ ls: cannot access '/sys/bus/event_sources/devices/cpu/format/': No such file or directory $ After: $ ls -la /sys/bus/event_source/devices/cpu/format/ total 0 drwxr-xr-x. 2 root root 0 Jun 2 15:36 . drwxr-xr-x. 6 root root 0 Jun 2 15:35 .. -r--r--r--. 1 root root 4096 Jun 2 15:36 any -r--r--r--. 1 root root 4096 Jun 2 15:36 cmask -r--r--r--. 1 root root 4096 Jun 2 15:36 edge -r--r--r--. 1 root root 4096 Jun 2 15:36 event -r--r--r--. 1 root root 4096 Jun 2 15:36 frontend -r--r--r--. 1 root root 4096 Jun 2 15:36 inv -r--r--r--. 1 root root 4096 Jun 2 15:36 ldlat -r--r--r--. 1 root root 4096 Jun 2 15:36 offcore_rsp -r--r--r--. 1 root root 4096 Jun 2 15:36 pc -r--r--r--. 1 root root 4096 Jun 2 15:36 umask $ Reviewed-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Alyssa Ross <hi@alyssa.is> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joshua Martinez <joshuamart@google.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220603045744.2815559-1-irogers@google.com Reported-by: Kevin Nomura <nomurak@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-26perf record: Enable off-cpu analysis with BPFNamhyung Kim
Add --off-cpu option to enable the off-cpu profiling with BPF. It'd use a bpf_output event and rename it to "offcpu-time". Samples will be synthesized at the end of the record session using data from a BPF map which contains the aggregated off-cpu time at context switches. So it needs root privilege to get the off-cpu profiling. Each sample will have a separate user stacktrace so it will skip kernel threads. The sample ip will be set from the stacktrace and other sample data will be updated accordingly. Currently it only handles some basic sample types. The sample timestamp is set to a dummy value just not to bother with other events during the sorting. So it has a very big initial value and increase it on processing each samples. Good thing is that it can be used together with regular profiling like cpu cycles. If you don't want to that, you can use a dummy event to enable off-cpu profiling only. Example output: $ sudo perf record --off-cpu perf bench sched messaging -l 1000 $ sudo perf report --stdio --call-graph=no # Total Lost Samples: 0 # # Samples: 41K of event 'cycles' # Event count (approx.): 42137343851 ... # Samples: 1K of event 'offcpu-time' # Event count (approx.): 587990831640 # # Children Self Command Shared Object Symbol # ........ ........ ............... .................. ......................... # 81.66% 0.00% sched-messaging libc-2.33.so [.] __libc_start_main 81.66% 0.00% sched-messaging perf [.] cmd_bench 81.66% 0.00% sched-messaging perf [.] main 81.66% 0.00% sched-messaging perf [.] run_builtin 81.43% 0.00% sched-messaging perf [.] bench_sched_messaging 40.86% 40.86% sched-messaging libpthread-2.33.so [.] __read 37.66% 37.66% sched-messaging libpthread-2.33.so [.] __write 2.91% 2.91% sched-messaging libc-2.33.so [.] __poll ... As you can see it spent most of off-cpu time in read and write in bench_sched_messaging(). The --call-graph=no was added just to make the output concise here. It uses perf hooks facility to control BPF program during the record session rather than adding new BPF/off-cpu specific calls. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Blake Jones <blakejones@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220518224725.742882-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-10perf record: Extend --threads command line optionAlexey Bayduraev
Extend --threads option in perf record command line interface. The option can have a value in the form of masks that specify CPUs to be monitored with data streaming threads and its layout in system topology. The masks can be filtered using CPU mask provided via -C option. The specification value can be user defined list of masks. Masks separated by colon define CPUs to be monitored by one thread and affinity mask of that thread is separated by slash. For example: <cpus mask 1>/<affinity mask 1>:<cpu mask 2>/<affinity mask 2> specifies parallel threads layout that consists of two threads with corresponding assigned CPUs to be monitored. The specification value can be a string e.g. "cpu", "core" or "package" meaning creation of data streaming thread for every CPU or core or package to monitor distinct CPUs or CPUs grouped by core or package. The option provided with no or empty value defaults to per-cpu parallel threads layout creating data streaming thread for every CPU being monitored. Document --threads option syntax and parallel data streaming modes in Documentation/perf-record.txt. Suggested-by: Jiri Olsa <jolsa@kernel.org> Suggested-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Riccardo Mancini <rickyman7@gmail.com> Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Riccardo Mancini <rickyman7@gmail.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/079e2619be70c465317cf7c9fdaf5fa069728c32.1642440724.git.alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-10perf record: Introduce --threads command line optionAlexey Bayduraev
Provide --threads option in perf record command line interface. The option creates a data streaming thread for each CPU in the system. Document --threads option in Documentation/perf-record.txt. Reviewed-by: Riccardo Mancini <rickyman7@gmail.com> Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Riccardo Mancini <rickyman7@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/01aeae43b047f428596c4ef9f9342ab94865cedd.1642440724.git.alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-15perf record: Disable debuginfod by defaultJiri Olsa
Fedora 35 sets DEBUGINFOD_URLS by default, which might lead to unexpected stalls in perf record exit path, when we try to cache profiled binaries. # DEBUGINFOD_PROGRESS=1 ./perf record -a ^C[ perf record: Woken up 1 times to write data ] Downloading from https://debuginfod.fedoraproject.org/ 447069 Downloading from https://debuginfod.fedoraproject.org/ 1502175 Downloading \^Z Disabling DEBUGINFOD_URLS by default in perf record and adding debuginfod option and .perfconfig variable support to enable id. Default without debuginfo processing: # perf record -a Using system debuginfod setup: # perf record -a --debuginfod Using custom debuginfd url: # perf record -a --debuginfod='https://evenbetterdebuginfodserver.krava' Adding single perf_debuginfod_setup function and using it also in perf buildid-cache command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20211209200425.303561-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-07perf docs: Add info on AMD raw event encodingSandipan Das
AMD processors have events with event select codes and unit masks larger than a byte. The core PMU, for example, uses 12-bit event select codes split between bits 0-7 and 32-35 of the PERF_CTL MSRs as can be seen from /sys/bus/event_sources/devices/cpu/format/*. The Processor Programming Reference (PPR) lists the event codes as unified 12-bit hexadecimal values instead and the split between the bits is not apparent to someone who is not aware of the layout of the PERF_CTL MSRs. 8-bit event select codes continue to work as the layout matches that of the PERF_CTL MSRs i.e. bits 0-7 for event select and 8-15 for unit mask. This adds more details in the perf man pages about using /sys/bus/event_sources/devices/*/format/* for determining the correct raw event encoding scheme. E.g. the "op_cache_hit_miss.op_cache_hit" event with code 0x28f and umask 0x03 can be programmed using its symbolic name as: $ sudo perf --debug perf-event-open stat -e op_cache_hit_miss.op_cache_hit sleep 1 ------------------------------------------------------------ perf_event_attr: type 4 size 128 config 0x20000038f sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ [...] One might use a simple eventsel+umask combination based on what the current man pages say and incorrectly program the event as: $ sudo perf --debug perf-event-open stat -e r0328f sleep 1 ------------------------------------------------------------ perf_event_attr: type 4 size 128 config 0x328f sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ [...] When it should have been based on the format from sysfs: $ cat /sys/bus/event_source/devices/cpu/format/event config:0-7,32-35 $ sudo perf --debug perf-event-open stat -e r20000038f sleep 1 ------------------------------------------------------------ perf_event_attr: type 4 size 128 config 0x20000038f sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ [...] Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Sandipan Das <sandipan.das@amd.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Robert Richter <rrichter@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Link: https://lore.kernel.org/r/20211123084613.243792-1-sandipan.das@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13perf arm-spe: Update --switch-events docs in 'perf record'German Gomez
Update 'perf record' docs and ARM SPE recording options so that they are consistent. This includes supporting the --no-switch-events flag in ARM SPE as well. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: German Gomez <german.gomez@arm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20211111133625.193568-3-german.gomez@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-17perf record: Add --synth optionNamhyung Kim
Add an option to control the synthesizing behavior. --synth <no|all|task|mmap|cgroup> Fine-tune event synthesis: default=all This can be useful when we know it doesn't need some synthesis like in a specific usecase and/or when using pipe: $ perf record -a --all-cgroups --synth cgroup -o- sleep 1 | \ > perf report -i- -s cgroup Committer notes: Added a clarification to the man page entry for --synth that this is about pre-existing threads. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https //lore.kernel.org/r/20210811044658.1313391-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10perf docs: Fix accidental em-dashesAlyssa Ross
" -- " is an em dash (—) in asciidoc, so all these examples that were supposed to be producing a literal two dashes were being misrendered. Signed-off-by: Alyssa Ross <hi@alyssa.is> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210809153226.332545-1-hi@alyssa.is Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29perf Documentation: Document intel-hybrid supportJin Yao
Add some words and examples to help understanding of Intel hybrid perf support. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-27-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-01-20perf tools: Add 'ping' control commandJiri Olsa
Add a control 'ping' command to detect if perf is up and its control interface is operational. It will be used in following daemon patches to synchronize with record session - when control interface is up and running, we know that perf record is monitoring and ready to receive signals. Example session: terminal 1: # mkfifo control ack # perf record --control=fifo:control,ack terminal 2: # echo ping > control # cat ack ack Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201226232038.390883-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-01-20perf tools: Add 'stop' control commandJiri Olsa
Adding control 'stop' command to stop perf record. When it is received, perf will set the 'done' variable to 1 to stop its mmap ring buffer reading loop. Example session: terminal 1: # mkfifo control ack # perf record --control=fifo:control,ack terminal 2: # echo stop > control terminal 1: [ perf record: Woken up 7 times to write data ] [ perf record: Captured and wrote 3.214 MB perf.data (38280 samples) ] # Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201226232038.390883-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-01-20perf tools: Add 'evlist' control commandJiri Olsa
Add a new 'evlist' control command to display all the evlist events. When it is received, perf will scan and print current evlist into perf record terminal. The interface string for control file is: evlist [-v|-g|-F] The syntax follows perf evlist command: -F Show just the sample frequency used for each event. -v Show all fields. -g Show event group information. Example session: terminal 1: # mkfifo control ack # perf record --control=fifo:control,ack -e '{cycles,instructions}' terminal 2: # echo evlist > control terminal 1: cycles instructions dummy:HG terminal 2: # echo 'evlist -v' > control terminal 1: cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: \ IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, \ sample_id_all: 1, exclude_guest: 1 instructions: size: 120, config: 0x1, { sample_period, sample_freq }: 4000, \ sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, inherit: 1, freq: 1, \ sample_id_all: 1, exclude_guest: 1 dummy:HG: type: 1, size: 120, config: 0x9, { sample_period, sample_freq }: 4000, \ sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, inherit: 1, mmap: 1, \ comm: 1, freq: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, \ bpf_event: 1 terminal 2: # echo 'evlist -g' > control terminal 1: {cycles,instructions} dummy:HG terminal 2: # echo 'evlist -F' > control terminal 1: cycles: sample_freq=4000 instructions: sample_freq=4000 dummy:HG: sample_freq=4000 This new evlist command is handy to get real event names when wildcards are used. Adding evsel_fprintf.c object to python/perf.so build, because it's now evlist.c dependency. Adding PYTHON_PERF define for python/perf.so compilation, so we can use it to compile in only evsel__fprintf from evsel_fprintf.c object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201226232038.390883-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-01-20perf tools: Allow to enable/disable events via control fileJiri Olsa
Adding new control events to enable/disable specific event. The interface string for control file are: 'enable <EVENT NAME>' 'disable <EVENT NAME>' when received the command, perf will scan the current evlist for <EVENT NAME> and if found it's enabled/disabled. Example session: terminal 1: # mkfifo control ack perf.pipe # perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:*' -o - > perf.pipe terminal 2: # cat perf.pipe | perf --no-pager script -i - terminal 1: Events disabled NOTE Above message will show only after read side of the pipe ('>') is started on 'terminal 2'. The 'terminal 1's bash does not execute perf before that, hence the delyaed perf record message. terminal 3: # echo 'enable sched:sched_process_fork' > control terminal 1: event sched:sched_process_fork enabled terminal 2: bash 33349 [034] 149587.674295: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34056 bash 33349 [034] 149588.239521: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34057 terminal 3: # echo 'enable sched:sched_wakeup_new' > control terminal 1: event sched:sched_wakeup_new enabled terminal 2: bash 33349 [034] 149632.228023: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34059 bash 33349 [034] 149632.228050: sched:sched_wakeup_new: bash:34059 [120] success=1 CPU:036 bash 33349 [034] 149633.950005: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34060 bash 33349 [034] 149633.950030: sched:sched_wakeup_new: bash:34060 [120] success=1 CPU:036 Committer testing: If I use 'sched:*' and then enable all events, I can't get 'perf record' to react to further commands, so I tested it with: [root@five ~]# perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:sched_process_*' -o - > perf.pipe Events disabled Events enabled Events disabled And then it works as expected, so we need to fix this pre-existing problem. Another issue, we need to check if a event is already enabled or disabled and change the message to be clearer, i.e.: [root@five ~]# perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:sched_process_*' -o - > perf.pipe Events disabled If we receive a 'disable' command, then it should say: [root@five ~]# perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:sched_process_*' -o - > perf.pipe Events disabled Events already disabled Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201226232038.390883-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-01-20perf record: Add support for PERF_SAMPLE_CODE_PAGE_SIZEKan Liang
Adds the infrastructure to sample the code address page size. Introduce a new --code-page-size option for perf record. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Originally-by: Stephane Eranian <eranian@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210105195752.43489-4-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-28perf record: Add --buildid-mmap option to enable PERF_RECORD_MMAP2's build idJiri Olsa
Add --buildid-mmap option to enable build id in PERF_RECORD_MMAP2 events. It will only work if there's kernel support for that and it disables build id cache (implies --no-buildid). It's also possible to enable it permanently via config option in ~/.perfconfig file: [record] build-id=mmap Also added build_id bit in the verbose output for perf_event_attr: # perf record --buildid-mmap -vv ... perf_event_attr: type 1 size 120 ... build_id 1 Adding also missing text_poke bit. Committer testing: $ perf record -h build Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -B, --no-buildid do not collect buildids in perf.data -N, --no-buildid-cache do not update the buildid cache --buildid-all Record build-id of all DSOs regardless of hits --buildid-mmap Record build-id in map events $ $ perf record --buildid-mmap sleep 1 Failed: no support to record build id in mmap events, update your kernel. $ After adding the needed kernel bits in a test kernel: $ perf record -vv --buildid-mmap sleep 1 |& grep -m1 build Enabling build id in mmap2 events. $ perf evlist -v cycles:u: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, build_id: 1 $ Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201214105457.543111-16-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-17perf tools: Reformat record's control fd man textJiri Olsa
Adding available control commands in separate paragraph, so it's more readable and easier to add new commands. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201216083914.47215-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-17perf record: Support new sample type for data page sizeKan Liang
Support new sample type PERF_SAMPLE_DATA_PAGE_SIZE for page size. Add new option --data-page-size to record sample data page size. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Link: http://lore.kernel.org/lkml/20201130172803.2676-3-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-04perf record: Add 'snapshot' control commandAdrian Hunter
Add 'snapshot' control command to create an AUX area tracing snapshot the same as if sending SIGUSR2. The advantage of the FIFO is that access is governed by access to the FIFO. Example: $ mkfifo perf.control $ mkfifo perf.ack $ cat perf.ack & [1] 15235 $ sudo ~/bin/perf record --control fifo:perf.control,perf.ack -S -e intel_pt//u -- sleep 60 & [2] 15243 $ ps -e | grep perf 15244 pts/1 00:00:00 perf $ kill -USR2 15244 bash: kill: (15244) - Operation not permitted $ echo snapshot > perf.control ack $ Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20200901093758.32293-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-04perf tools: Add FIFO file names as alternative options to --controlAdrian Hunter
Enable the --control option to accept file names as an alternative to file descriptors. Example: $ mkfifo perf.control $ mkfifo perf.ack $ cat perf.ack & [1] 6808 $ perf record --control fifo:perf.control,perf.ack -- sleep 300 & [2] 6810 $ echo disable > perf.control $ Events disabled ack $ echo enable > perf.control $ Events enabled ack $ echo disable > perf.control $ Events disabled ack $ kill %2 [ perf record: Woken up 4 times to write data ] $ [ perf record: Captured and wrote 0.018 MB perf.data (7 samples) ] [1]- Done cat perf.ack [2]+ Terminated perf record --control fifo:perf.control,perf.ack -- sleep 300 $ Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20200902105707.11491-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-04perf tools: Use AsciiDoc formatting for --control option documentationAdrian Hunter
The --control option does not display well in man pages unless AsciiDoc formatting is used. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20200901093758.32293-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03perf record/stat: Explicitly call out event modifiers in the documentationKim Phillips
Event modifiers are not mentioned in the perf record or perf stat manpages. Add them to orient new users more effectively by pointing them to the perf list manpage for details. Fixes: 2055fdaf8703 ("perf list: Document precise event sampling for AMD IBS") Signed-off-by: Kim Phillips <kim.phillips@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tony Jones <tonyj@suse.de> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20200901215853.276234-1-kim.phillips@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-04perf record: Introduce --control fd:ctl-fd[,ack-fd] optionsAlexey Budankov
Introduce --control fd:ctl-fd[,ack-fd] options to pass open file descriptors numbers from command line. Extend perf-record.txt file with --control fd:ctl-fd[,ack-fd] options description. Document possible usage model introduced by --control fd:ctl-fd[,ack-fd] options by providing example bash shell script. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/8dc01e1a-3a80-3f67-5385-4bc7112b0dd3@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-04perf record: Extend -D,--delay option with -1 valueAlexey Budankov
Extend -D,--delay option with -1 to start collection with events disabled to be enabled later by 'enable' command provided via control file descriptor. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/3e7d362c-7973-ee5d-e81e-c60ea22432c3@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29perf tools: Add optional support for libpfm4Stephane Eranian
This patch links perf with the libpfm4 library if it is available and LIBPFM4 is passed to the build. The libpfm4 library contains hardware event tables for all processors supported by perf_events. It is a helper library that helps convert from a symbolic event name to the event encoding required by the underlying kernel interface. This library is open-source and available from: http://perfmon2.sf.net. With this patch, it is possible to specify full hardware events by name. Hardware filters are also supported. Events must be specified via the --pfm-events and not -e option. Both options are active at the same time and it is possible to mix and match: $ perf stat --pfm-events inst_retired:any_p:c=1:i -e cycles .... One needs to explicitely ask for its inclusion by using the LIBPFM4 make command line option, ie its opt-in rather than opt-out of feature detection and build support. Signed-off-by: Stephane Eranian <eranian@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andriin@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Igor Lubashev <ilubashe@akamai.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jiwei Sun <jiwei.sun@windriver.com> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Cc: yuzhoujian <yuzhoujian@didichuxing.com> Link: http://lore.kernel.org/lkml/20200505182943.218248-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-28perf record: Respect --no-switch-eventsAdrian Hunter
Context switch events are added automatically by Intel PT and Coresight. Make it possible to suppress them. That is useful for tracing the scheduler without the disturbance that the switch event processing creates. Example: Prerequisites: $ which perf ~/bin/perf $ sudo setcap "cap_sys_rawio,cap_sys_admin,cap_sys_ptrace,cap_syslog,cap_ipc_lock=ep" ~/bin/perf $ sudo chmod +r /proc/kcore Before: $ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.938 MB perf.data ] $ perf script -D | grep PERF_RECORD_SWITCH | wc -l 572 After: $ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001 Warning: Intel Processor Trace decoding will not be possible except for kernel tracing! [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.838 MB perf.data ] $ perf script -D | grep PERF_RECORD_SWITCH | wc -l 0 $ sudo chmod go-r /proc/kcore $ sudo setcap -r ~/bin/perf Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Link: http://lore.kernel.org/lkml/20200528120859.21604-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-05perf record: Introduce --switch-output-eventArnaldo Carvalho de Melo
Now we can use it with --overwrite to have a flight recorder mode that gets snapshot requests from arbitrary events that are processed in the side band thread together with the PERF_RECORD_BPF_EVENT processing. Example: To collect scheduler events until a recvmmsg syscall happens, system wide: [root@five a]# rm -f perf.data.2020042717* [root@five a]# perf record --overwrite -e sched:*switch,syscalls:*recvmmsg --switch-output-event syscalls:sys_enter_recvmmsg [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020042717585458 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020042717590235 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020042717590398 ] ^C[ perf record: Woken up 1 times to write data ] [ perf record: Dump perf.data.2020042717590511 ] [ perf record: Captured and wrote 7.244 MB perf.data.<timestamp> ] So in the above case we had 3 snapshots, the fourth was forced by control+C: [root@five a]# ls -la total 20440 drwxr-xr-x. 2 root root 4096 Apr 27 17:59 . dr-xr-x---. 12 root root 4096 Apr 27 17:46 .. -rw-------. 1 root root 3936125 Apr 27 17:58 perf.data.2020042717585458 -rw-------. 1 root root 5074869 Apr 27 17:59 perf.data.2020042717590235 -rw-------. 1 root root 4291037 Apr 27 17:59 perf.data.2020042717590398 -rw-------. 1 root root 7617037 Apr 27 17:59 perf.data.2020042717590511 [root@five a]# One can make this more precise by adding the switch output event to the main -e events list, as since this is done asynchronously, a few events after the signal event will appear in the snapshots, as can be seen with: [root@five a]# rm -f perf.data.20200427175* [root@five a]# perf record --overwrite -e sched:*switch,syscalls:*recvmmsg --switch-output-event syscalls:sys_enter_recvmmsg [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020042718024203 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020042718024301 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020042718024484 ] ^C[ perf record: Woken up 1 times to write data ] [ perf record: Dump perf.data.2020042718024562 ] [ perf record: Captured and wrote 7.337 MB perf.data.<timestamp> ] [root@five a]# perf script -i perf.data.2020042718024203 | tail -15 PacerThread 148586 [005] 122.830729: sched:sched_switch: prev_comm=PacerThread prev_pid=148586... swapper 0 [000] 122.833588: sched:sched_switch: prev_comm=swapper/0 prev_pid=... NetworkManager 1251 [000] 122.833619: syscalls:sys_enter_recvmmsg: fd: 0x0000001c, mmsg: 0x7ffe83054a1... swapper 0 [002] 122.833624: sched:sched_switch: prev_comm=swapper/2 prev_pid=... swapper 0 [003] 122.833624: sched:sched_switch: prev_comm=swapper/3 prev_pid=... NetworkManager 1251 [000] 122.833626: syscalls:sys_exit_recvmmsg: 0x1 kworker/3:3-eve 158946 [003] 122.833628: sched:sched_switch: prev_comm=kworker/3:3 prev_pid=15894... swapper 0 [004] 122.833641: sched:sched_switch: prev_comm=swapper/4 prev_pid=... NetworkManager 1251 [000] 122.833642: sched:sched_switch: prev_comm=NetworkManage... perf 228273 [002] 122.833645: sched:sched_switch: prev_comm=perf prev_pid=22827... swapper 0 [011] 122.833646: sched:sched_switch: prev_comm=swapper/1... swapper 0 [002] 122.833648: sched:sched_switch: prev_comm=swapper/... kworker/0:2-eve 207387 [000] 122.833648: sched:sched_switch: prev_comm=kworker/0:2 prev_pid=20738... kworker/2:3-eve 232038 [002] 122.833652: sched:sched_switch: prev_comm=kworker/2:3 prev_pid=23203... perf 235825 [003] 122.833653: sched:sched_switch: prev_comm=perf prev_pid=23582... [root@five a]# Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lore.kernel.org/lkml/20200429131106.27974-8-acme@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-23perf record: Add num-synthesize-threads optionStephane Eranian
To control degree of parallelism of the synthesize_mmap() code which is scanning /proc/PID/task/PID/maps and can be time consuming. Mimic perf top way of handling the option. If not specified will default to 1 thread, i.e. default behavior before this option. On a desktop computer the processing of /proc/PID/task/PID/maps isn't slow enough to warrant parallel processing and the thread creation has some cost - hence the default of 1. On a loaded server with >100 cores it is possible to see synthesis times in the order of seconds and in this case having the option is desirable. As the processing is a synchronization point, it is legitimate to worry if Amdahl's law will apply to this patch. Profiling with this patch in place: https://lore.kernel.org/lkml/20200415054050.31645-4-irogers@google.com/ shows: ... - 32.59% __perf_event__synthesize_threads - 32.54% __event__synthesize_thread + 22.13% perf_event__synthesize_mmap_events + 6.68% perf_event__get_comm_ids.constprop.0 + 1.49% process_synthesized_event + 1.29% __GI___readdir64 + 0.60% __opendir ... That is the processing is 1.49% of execution time and there is plenty to make parallel. This is shown in the benchmark in this patch: https://lore.kernel.org/lkml/20200415054050.31645-2-irogers@google.com/ Computing performance of multi threaded perf event synthesis by synthesizing events on CPU 0: Number of synthesis threads: 1 Average synthesis took: 127729.000 usec (+- 3372.880 usec) Average num. events: 21548.600 (+- 0.306) Average time per event 5.927 usec Number of synthesis threads: 2 Average synthesis took: 88863.500 usec (+- 385.168 usec) Average num. events: 21552.800 (+- 0.327) Average time per event 4.123 usec Number of synthesis threads: 3 Average synthesis took: 83257.400 usec (+- 348.617 usec) Average num. events: 21553.200 (+- 0.327) Average time per event 3.863 usec Number of synthesis threads: 4 Average synthesis took: 75093.000 usec (+- 422.978 usec) Average num. events: 21554.200 (+- 0.200) Average time per event 3.484 usec Number of synthesis threads: 5 Average synthesis took: 64896.600 usec (+- 353.348 usec) Average num. events: 21558.000 (+- 0.000) Average time per event 3.010 usec Number of synthesis threads: 6 Average synthesis took: 59210.200 usec (+- 342.890 usec) Average num. events: 21560.000 (+- 0.000) Average time per event 2.746 usec Number of synthesis threads: 7 Average synthesis took: 54093.900 usec (+- 306.247 usec) Average num. events: 21562.000 (+- 0.000) Average time per event 2.509 usec Number of synthesis threads: 8 Average synthesis took: 48938.700 usec (+- 341.732 usec) Average num. events: 21564.000 (+- 0.000) Average time per event 2.269 usec Where average time per synthesized event goes from 5.927 usec with 1 thread to 2.269 usec with 8. This isn't a linear speed up as not all of synthesize code has been made parallel. If the synthesis time was about 10 seconds then using 8 threads may bring this down to less than 4. Signed-off-by: Stephane Eranian <eranian@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tony Jones <tonyj@suse.de> Cc: yuzhoujian <yuzhoujian@didichuxing.com> Link: http://lore.kernel.org/lkml/20200422155038.9380-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf record: Add --all-cgroups optionNamhyung Kim
The --all-cgroups option is to enable cgroup profiling support. It tells kernel to record CGROUP events in the ring buffer so that perf report can identify task/cgroup association later. [root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.042 MB perf.data (558 samples) ] [root@seventh ~]# perf report --stdio -s cgroup_id,cgroup,pid # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 558 of event 'cycles' # Event count (approx.): 458017341 # # Overhead cgroup id (dev/inode) Cgroup Pid:Command # ........ ..................... .......... ............... # 33.15% 4/0xeffffffb /sub 9615:looper0 32.83% 4/0xf00002f5 /sub/cgrp2 9620:looper2 32.79% 4/0xf00002f4 /sub/cgrp1 9619:looper1 0.35% 4/0xf00002f5 /sub/cgrp2 9618:cgtest 0.34% 4/0xf00002f4 /sub/cgrp1 9617:cgtest 0.32% 4/0xeffffffb / 9615:looper0 0.11% 4/0xeffffffb /sub 9617:cgtest 0.10% 4/0xeffffffb /sub 9618:cgtest # # (Tip: Sample related events with: perf record -e '{cycles,instructions}:S') # [root@seventh ~]# Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-8-namhyung@kernel.org Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org [ Extracted the HAVE_FILE_HANDLE from the followup patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-25perf callchain: Update docs regarding kernel/user space unwindingTony Jones
The method of unwinding for kernel space is defined by the kernel config, not by the value of --call-graph. Improve the documentation to reflect this. Signed-off-by: Tony Jones <tonyj@suse.de> Link: http://lore.kernel.org/lkml/20200325164053.10177-1-tonyj@suse.de Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-11perf intel-pt: Add Intel PT man page referencesAdrian Hunter
Add references to Intel PT man page in man pages of associated tools. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20200311122034.3697-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22perf record: Add aux-sample-size config termAdrian Hunter
To allow individual events to be selected for AUX area sampling, add aux-sample-size config term. attr.aux_sample_size is updated by auxtrace_parse_sample_options() so that the existing validation will see the value. Any event that has a non-zero aux_sample_size will cause AUX area sampling to be configured, irrespective of the --aux-sample option. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22perf record: Add support for AUX area samplingAdrian Hunter
Add a 'perf record' option '--aux-sample' to request AUX area sampling. AUX area sampling uses an overwriting buffer much like snapshot mode, so adjust the AUX buffer mmapping accordingly. To make it easy to queue samples for decoding, synthesize an ID index. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-07perf record: Add support for limit perf output file sizeJiwei Sun
The patch adds a new option to limit the output file size, then based on it, we can create a wrapper of the perf command that uses the option to avoid exhausting the disk space by the unconscious user. In order to make the perf.data parsable, we just limit the sample data size, since the perf.data consists of many headers and sample data and other data, the actual size of the recorded file will bigger than the setting value. Testing it: # ./perf record -a -g --max-size=10M Couldn't synthesize bpf events. [ perf record: perf size limit reached (10249 KB), stopping session ] [ perf record: Woken up 32 times to write data ] [ perf record: Captured and wrote 10.133 MB perf.data (71964 samples) ] # ls -lh perf.data -rw------- 1 root root 11M Oct 22 14:32 perf.data # ./perf record -a -g --max-size=10K [ perf record: perf size limit reached (10 KB), stopping session ] Couldn't synthesize bpf events. [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 1.546 MB perf.data (69 samples) ] # ls -l perf.data -rw------- 1 root root 1626952 Oct 22 14:36 perf.data Committer notes: Fixed the build in multiple distros by using PRIu64 to print u64 struct members, fixing this: builtin-record.c: In function 'record__write': builtin-record.c:150:5: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=] rec->bytes_written >> 10); ^ CC /tmp/build/pe Signed-off-by: Jiwei Sun <jiwei.sun@windriver.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Danter <richard.danter@windriver.com> Link: http://lore.kernel.org/lkml/20191022080901.3841-1-jiwei.sun@windriver.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf record: Put a copy of kcore into the perf.data directoryAdrian Hunter
Add a new 'perf record' option '--kcore' which will put a copy of /proc/kcore, kallsyms and modules into a perf.data directory. Note, that without the --kcore option, output goes to a file as previously. The tools' -o and -i options work with either a file name or directory name. Example: $ sudo perf record --kcore uname $ sudo tree perf.data perf.data ├── kcore_dir │   ├── kallsyms │   ├── kcore │   └── modules └── data $ sudo perf script -v build id event received for vmlinux: 1eaa285996affce2d74d8e66dcea09a80c9941de build id event received for [vdso]: 8bbaf5dc62a9b644b4d4e4539737e104e4a84541 Samples for 'cycles' event do not have CPU attribute set. Skipping 'cpu' field. Using CPUID GenuineIntel-6-8E-A Using perf.data/kcore_dir/kcore for kernel data Using perf.data/kcore_dir/kallsyms for symbols perf 19058 506778.423729: 1 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) perf 19058 506778.423733: 1 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) perf 19058 506778.423734: 7 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) perf 19058 506778.423736: 117 cycles: ffffffffa2caa54a native_write_msr+0xa (vmlinux) perf 19058 506778.423738: 2092 cycles: ffffffffa2c9b7b0 native_apic_msr_write+0x0 (vmlinux) perf 19058 506778.423740: 37380 cycles: ffffffffa2f121d0 perf_event_addr_filters_exec+0x0 (vmlinux) uname 19058 506778.423751: 582673 cycles: ffffffffa303a407 propagate_protected_usage+0x147 (vmlinux) uname 19058 506778.423892: 2241841 cycles: ffffffffa2cae0c9 unwind_next_frame.part.5+0x79 (vmlinux) uname 19058 506778.424430: 2457397 cycles: ffffffffa3019232 check_memory_region+0x52 (vmlinux) Committer testing: # rm -rf perf.data* # perf record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ] # ls -l perf.data -rw-------. 1 root root 34772 Oct 21 11:08 perf.data # perf record --kcore uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ] ls[root@quaco ~]# ls -lad perf.data* drwx------. 3 root root 4096 Oct 21 11:08 perf.data -rw-------. 1 root root 34772 Oct 21 11:08 perf.data.old # perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # perf evlist -v -i perf.data/data cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lore.kernel.org/lkml/20191004083121.12182-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-14perf tools: Add aux-output config termAdrian Hunter
Expose the aux_output attribute flag to the user to configure, by adding a config term 'aux-output'. For events that support it, selection of 'aux-output' causes the generation of AUX records instead of event records. This requires that an AUX area event is also provided. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190806084606.4021-7-alexander.shishkin@linux.intel.com Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-14perf record: Add an option to take an AUX snapshot on exitAlexander Shishkin
It is sometimes useful to generate a snapshot when perf record exits; I've been using a wrapper script around the workload that would do a killall -USR2 perf when the workload exits. This patch makes it easier and also works when perf record is attached to a pre-existing task. A new snapshot option 'e' can be specified in -S to enable this behavior: root@elsewhere:~# perf record -e intel_pt// -Se sleep 1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.085 MB perf.data ] Co-developed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190806144101.62892-1-alexander.shishkin@linux.intel.com [ Fixed up !HAVE_AUXTRACE_SUPPORT build in builtin-record.c, adding 2 missing __maybe_unused ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-06-10perf record: Add support to collect callchains from kernel or user space onlyyuzhoujian
One can just record callchains in the kernel or user space with this new options. We can use it together with "--all-kernel" options. This two options is used just like print_stack(sys) or print_ustack(usr) for systemtap. Shown below is the usage of this new option combined with "--all-kernel" options: 1. Configure all used events to run in kernel space and just collect kernel callchains. $ perf record -a -g --all-kernel --kernel-callchains 2. Configure all used events to run in kernel space and just collect user callchains. $ perf record -a -g --all-kernel --user-callchains Committer notes: Improved documentation to state that asking for kernel callchains really is asking for excluding user callchains, and vice versa. Further mentioned that using both won't get both, but nothing, as both will be excluded. Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1559222962-22891-1-git-send-email-ufo19890607@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>