Diffstat (limited to 'tools/perf/Documentation/perf-report.txt')
-rw-r--r-- | tools/perf/Documentation/perf-report.txt | 54 |
1 files changed, 37 insertions, 17 deletions
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 87f864519406..3376c4710575 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -44,7 +44,7 @@ OPTIONS
 --comms=::
 	Only consider symbols in these comms. CSV that understands
 	file://filename entries. This option will affect the percentage of
-	the overhead column. See --percentage for more info.
+	the overhead and latency columns. See --percentage for more info.
 --pid=::
 	Only show events for given process ID (comma separated list).
 
@@ -54,12 +54,12 @@ OPTIONS
 --dsos=::
 	Only consider symbols in these dsos. CSV that understands
 	file://filename entries. This option will affect the percentage of
-	the overhead column. See --percentage for more info.
+	the overhead and latency columns. See --percentage for more info.
 -S::
 --symbols=::
 	Only consider these symbols. CSV that understands
 	file://filename entries. This option will affect the percentage of
-	the overhead column. See --percentage for more info.
+	the overhead and latency columns. See --percentage for more info.
 
 --symbol-filter=::
 	Only show symbols that match (partially) with this filter.
@@ -68,6 +68,21 @@ OPTIONS
 --hide-unresolved::
 	Only display entries resolved to a symbol.
 
+--parallelism::
+	Only consider these parallelism levels. Parallelism level is the number
+	of threads that actively run on CPUs at the time of sample. The flag
+	accepts single number, comma-separated list, and ranges (for example:
+	"1", "7,8", "1,64-128"). This is useful in understanding what a program
+	is doing during sequential/low-parallelism phases as compared to
+	high-parallelism phases. This option will affect the percentage of
+	the overhead and latency columns. See --percentage for more info.
+	Also see the `CPU and latency overheads' section for more details.
+
+--latency::
+	Show latency-centric profile rather than the default
+	CPU-consumption-centric profile
+	(requires perf record --latency flag).
+
 -s::
 --sort=::
 	Sort histogram entries by given key(s) - multiple keys can be specified
@@ -87,6 +102,7 @@ OPTIONS
 	entries are displayed as "[other]".
 	- cpu: cpu number the task ran at the time of sample
 	- socket: processor socket number the task ran at the time of sample
+	- parallelism: number of running threads at the time of sample
 	- srcline: filename and line number executed at the time of sample. The
 	DWARF debugging info must be provided.
 	- srcfile: file name of the source file of the samples. Requires dwarf
@@ -97,12 +113,14 @@ OPTIONS
 	- cgroup_id: ID derived from cgroup namespace device and inode numbers.
 	- cgroup: cgroup pathname in the cgroupfs.
 	- transaction: Transaction abort flags.
-	- overhead: Overhead percentage of sample
-	- overhead_sys: Overhead percentage of sample running in system mode
-	- overhead_us: Overhead percentage of sample running in user mode
-	- overhead_guest_sys: Overhead percentage of sample running in system mode
+	- overhead: CPU overhead percentage of sample.
+	- latency: latency (wall-clock) overhead percentage of sample.
+	  See the `CPU and latency overheads' section for more details.
+	- overhead_sys: CPU overhead percentage of sample running in system mode
+	- overhead_us: CPU overhead percentage of sample running in user mode
+	- overhead_guest_sys: CPU overhead percentage of sample running in system mode
 	on guest machine
-	- overhead_guest_us: Overhead percentage of sample running in user mode on
+	- overhead_guest_us: CPU overhead percentage of sample running in user mode on
 	guest machine
 	- sample: Number of sample
 	- period: Raw number of event count of sample
@@ -125,8 +143,8 @@ OPTIONS
 	- weight2: Average value of event specific weight (2nd field of weight_struct).
 	- weight3: Average value of event specific weight (3rd field of weight_struct).
 
-	By default, comm, dso and symbol keys are used.
-	(i.e. --sort comm,dso,symbol)
+	By default, overhead, comm, dso and symbol keys are used.
+	(i.e. --sort overhead,comm,dso,symbol).
 
 	If --branch-stack option is used, following sort keys are also available:
 
@@ -201,9 +219,9 @@ OPTIONS
 --fields=::
 	Specify output field - multiple keys can be specified in CSV format.
 	Following fields are available:
-	overhead, overhead_sys, overhead_us, overhead_children, sample, period,
-	weight1, weight2, weight3, ins_lat, p_stage_cyc and retire_lat. The
-	last 3 names are alias for the corresponding weights. When the weight
+	overhead, latency, overhead_sys, overhead_us, overhead_children, sample,
+	period, weight1, weight2, weight3, ins_lat, p_stage_cyc and retire_lat.
+	The last 3 names are alias for the corresponding weights. When the weight
 	fields are used, they will show the average value of the weight.
 	Also it can contain any sort key(s).
 
@@ -289,7 +307,7 @@ OPTIONS
 	Accumulate callchain of children to parent entry so that then can
 	show up in the output. The output will have a new "Children" column
 	and will be sorted on the data. It requires callchains are recorded.
-	See the `overhead calculation' section for more details. Enabled by
+	See the `Overhead calculation' section for more details. Enabled by
 	default, disable with --no-children.
 
 --max-stack::
@@ -442,9 +460,9 @@ OPTIONS
 	--call-graph option for details.
 
 --percentage::
-	Determine how to display the overhead percentage of filtered entries.
-	Filters can be applied by --comms, --dsos and/or --symbols options and
-	Zoom operations on the TUI (thread, dso, etc).
+	Determine how to display the CPU and latency overhead percentage
+	of filtered entries. Filters can be applied by --comms, --dsos, --symbols
+	and/or --parallelism options and Zoom operations on the TUI (thread, dso, etc).
 
 	"relative" means it's relative to filtered entries only so that the
 	sum of shown entries will be always 100%. "absolute" means it retains
@@ -627,6 +645,8 @@ include::itrace.txt[]
 --skip-empty::
 	Do not print 0 results in the --stat output.
 
+include::cpu-and-latency-overheads.txt[]
+
 include::callchain-overhead-calculation.txt[]
 
 SEE ALSO