summaryrefslogtreecommitdiff
path: root/tools/perf/Documentation/perf-report.txt
diff options
context:
space:
mode:
Diffstat (limited to 'tools/perf/Documentation/perf-report.txt')
-rw-r--r--tools/perf/Documentation/perf-report.txt66
1 files changed, 51 insertions, 15 deletions
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index d8b863e01fe0..3376c4710575 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -44,7 +44,7 @@ OPTIONS
--comms=::
Only consider symbols in these comms. CSV that understands
file://filename entries. This option will affect the percentage of
- the overhead column. See --percentage for more info.
+ the overhead and latency columns. See --percentage for more info.
--pid=::
Only show events for given process ID (comma separated list).
@@ -54,12 +54,12 @@ OPTIONS
--dsos=::
Only consider symbols in these dsos. CSV that understands
file://filename entries. This option will affect the percentage of
- the overhead column. See --percentage for more info.
+ the overhead and latency columns. See --percentage for more info.
-S::
--symbols=::
Only consider these symbols. CSV that understands
file://filename entries. This option will affect the percentage of
- the overhead column. See --percentage for more info.
+ the overhead and latency columns. See --percentage for more info.
--symbol-filter=::
Only show symbols that match (partially) with this filter.
@@ -68,6 +68,21 @@ OPTIONS
--hide-unresolved::
Only display entries resolved to a symbol.
+--parallelism::
+ Only consider these parallelism levels. Parallelism level is the number
+ of threads that actively run on CPUs at the time of sample. The flag
+ accepts single number, comma-separated list, and ranges (for example:
+ "1", "7,8", "1,64-128"). This is useful in understanding what a program
+ is doing during sequential/low-parallelism phases as compared to
+ high-parallelism phases. This option will affect the percentage of
+ the overhead and latency columns. See --percentage for more info.
+ Also see the `CPU and latency overheads' section for more details.
+
+--latency::
+ Show latency-centric profile rather than the default
+ CPU-consumption-centric profile
+ (requires perf record --latency flag).
+
-s::
--sort=::
Sort histogram entries by given key(s) - multiple keys can be specified
@@ -87,6 +102,7 @@ OPTIONS
entries are displayed as "[other]".
- cpu: cpu number the task ran at the time of sample
- socket: processor socket number the task ran at the time of sample
+ - parallelism: number of running threads at the time of sample
- srcline: filename and line number executed at the time of sample. The
DWARF debugging info must be provided.
- srcfile: file name of the source file of the samples. Requires dwarf
@@ -97,12 +113,14 @@ OPTIONS
- cgroup_id: ID derived from cgroup namespace device and inode numbers.
- cgroup: cgroup pathname in the cgroupfs.
- transaction: Transaction abort flags.
- - overhead: Overhead percentage of sample
- - overhead_sys: Overhead percentage of sample running in system mode
- - overhead_us: Overhead percentage of sample running in user mode
- - overhead_guest_sys: Overhead percentage of sample running in system mode
+ - overhead: CPU overhead percentage of sample.
+ - latency: latency (wall-clock) overhead percentage of sample.
+ See the `CPU and latency overheads' section for more details.
+ - overhead_sys: CPU overhead percentage of sample running in system mode
+ - overhead_us: CPU overhead percentage of sample running in user mode
+ - overhead_guest_sys: CPU overhead percentage of sample running in system mode
on guest machine
- - overhead_guest_us: Overhead percentage of sample running in user mode on
+ - overhead_guest_us: CPU overhead percentage of sample running in user mode on
guest machine
- sample: Number of sample
- period: Raw number of event count of sample
@@ -121,9 +139,12 @@ OPTIONS
- type: Data type of sample memory access.
- typeoff: Offset in the data type of sample memory access.
- symoff: Offset in the symbol.
+ - weight1: Average value of event specific weight (1st field of weight_struct).
+ - weight2: Average value of event specific weight (2nd field of weight_struct).
+ - weight3: Average value of event specific weight (3rd field of weight_struct).
- By default, comm, dso and symbol keys are used.
- (i.e. --sort comm,dso,symbol)
+ By default, overhead, comm, dso and symbol keys are used.
+ (i.e. --sort overhead,comm,dso,symbol).
If --branch-stack option is used, following sort keys are also
available:
@@ -198,7 +219,11 @@ OPTIONS
--fields=::
Specify output field - multiple keys can be specified in CSV format.
Following fields are available:
- overhead, overhead_sys, overhead_us, overhead_children, sample and period.
+ overhead, latency, overhead_sys, overhead_us, overhead_children, sample,
+ period, weight1, weight2, weight3, ins_lat, p_stage_cyc and retire_lat.
+ The last 3 names are alias for the corresponding weights. When the weight
+ fields are used, they will show the average value of the weight.
+
Also it can contain any sort key(s).
By default, every sort keys not specified in -F will be appended
@@ -282,7 +307,7 @@ OPTIONS
Accumulate callchain of children to parent entry so that then can
show up in the output. The output will have a new "Children" column
and will be sorted on the data. It requires callchains are recorded.
- See the `overhead calculation' section for more details. Enabled by
+ See the `Overhead calculation' section for more details. Enabled by
default, disable with --no-children.
--max-stack::
@@ -384,6 +409,14 @@ OPTIONS
This allows to examine the path the program took to each sample.
The data collection must have used -b (or -j) and -g.
+ Also show with some branch flags that can be:
+ - Predicted: display the average percentage of predicated branches.
+ (predicated number / total number)
+ - Abort: display the number of tsx aborted branches.
+ - Cycles: cycles in basic block.
+
+ - iterations: display the average number of iterations in callchain list.
+
--addr2line=<path>::
Path to addr2line binary.
@@ -427,9 +460,9 @@ OPTIONS
--call-graph option for details.
--percentage::
- Determine how to display the overhead percentage of filtered entries.
- Filters can be applied by --comms, --dsos and/or --symbols options and
- Zoom operations on the TUI (thread, dso, etc).
+ Determine how to display the CPU and latency overhead percentage
+ of filtered entries. Filters can be applied by --comms, --dsos, --symbols
+ and/or --parallelism options and Zoom operations on the TUI (thread, dso, etc).
"relative" means it's relative to filtered entries only so that the
sum of shown entries will be always 100%. "absolute" means it retains
@@ -607,10 +640,13 @@ include::itrace.txt[]
'Avg Cycles%' - block average sampled cycles / sum of total block average
sampled cycles
'Avg Cycles' - block average sampled cycles
+ 'Branch Counter' - block branch counter histogram (with -v showing the number)
--skip-empty::
Do not print 0 results in the --stat output.
+include::cpu-and-latency-overheads.txt[]
+
include::callchain-overhead-calculation.txt[]
SEE ALSO