summaryrefslogtreecommitdiff
path: root/tools/perf/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'tools/perf/Documentation')
-rw-r--r--tools/perf/Documentation/intel-pt.txt59
-rw-r--r--tools/perf/Documentation/perf-config.txt5
-rw-r--r--tools/perf/Documentation/perf-diff.txt5
-rw-r--r--tools/perf/Documentation/perf-list.txt3
-rw-r--r--tools/perf/Documentation/perf-record.txt16
-rw-r--r--tools/perf/Documentation/perf-report.txt11
-rw-r--r--tools/perf/Documentation/perf-stat.txt11
-rw-r--r--tools/perf/Documentation/perf-trace.txt14
-rw-r--r--tools/perf/Documentation/perf.data-directory-format.txt63
-rw-r--r--tools/perf/Documentation/perf.txt2
10 files changed, 187 insertions, 2 deletions
diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt
index e0d9e7dd4f17..2cf2d9e9d0da 100644
--- a/tools/perf/Documentation/intel-pt.txt
+++ b/tools/perf/Documentation/intel-pt.txt
@@ -434,6 +434,56 @@ pwr_evt Enable power events. The power events provide information about
"0" otherwise.
+AUX area sampling option
+------------------------
+
+To select Intel PT "sampling" the AUX area sampling option can be used:
+
+ --aux-sample
+
+Optionally it can be followed by the sample size in bytes e.g.
+
+ --aux-sample=8192
+
+In addition, the Intel PT event to sample must be defined e.g.
+
+ -e intel_pt//u
+
+Samples on other events will be created containing Intel PT data e.g. the
+following will create Intel PT samples on the branch-misses event, note the
+events must be grouped using {}:
+
+ perf record --aux-sample -e '{intel_pt//u,branch-misses:u}'
+
+An alternative to '--aux-sample' is to add the config term 'aux-sample-size' to
+events. In this case, the grouping is implied e.g.
+
+ perf record -e intel_pt//u -e branch-misses/aux-sample-size=8192/u
+
+is the same as:
+
+ perf record -e '{intel_pt//u,branch-misses/aux-sample-size=8192/u}'
+
+but allows for also using an address filter e.g.:
+
+ perf record -e intel_pt//u --filter 'filter * @/bin/ls' -e branch-misses/aux-sample-size=8192/u -- ls
+
+It is important to select a sample size that is big enough to contain at least
+one PSB packet. If not a warning will be displayed:
+
+ Intel PT sample size (%zu) may be too small for PSB period (%zu)
+
+The calculation used for that is: if sample_size <= psb_period + 256 display the
+warning. When sampling is used, psb_period defaults to 0 (2KiB).
+
+The default sample size is 4KiB.
+
+The sample size is passed in aux_sample_size in struct perf_event_attr. The
+sample size is limited by the maximum event size which is 64KiB. It is
+difficult to know how big the event might be without the trace sample attached,
+but the tool validates that the sample size is not greater than 60KiB.
+
+
new snapshot option
-------------------
@@ -487,8 +537,8 @@ their mlock limit (which defaults to 64KiB but is not multiplied by the number
of cpus).
In full-trace mode, powers of two are allowed for buffer size, with a minimum
-size of 2 pages. In snapshot mode, it is the same but the minimum size is
-1 page.
+size of 2 pages. In snapshot mode or sampling mode, it is the same but the
+minimum size is 1 page.
The mmap size and auxtrace mmap size are displayed if the -vv option is used e.g.
@@ -501,12 +551,17 @@ Intel PT modes of operation
Intel PT can be used in 2 modes:
full-trace mode
+ sample mode
snapshot mode
Full-trace mode traces continuously e.g.
perf record -e intel_pt//u uname
+Sample mode attaches a Intel PT sample to other events e.g.
+
+ perf record --aux-sample -e intel_pt//u -e branch-misses:u
+
Snapshot mode captures the available data when a signal is sent e.g.
perf record -v -e intel_pt//u -S ./loopy 1000000000 &
diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
index c599623a1f3d..c4dd23c4b478 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -561,6 +561,11 @@ trace.*::
trace.show_zeros::
Do not suppress syscall arguments that are equal to zero.
+ trace.tracepoint_beautifiers::
+ Use "libtraceevent" to use that library to augment the tracepoint arguments,
+ "libbeauty", the default, to use the same argument beautifiers used in the
+ strace-like sys_enter+sys_exit lines.
+
llvm.*::
llvm.clang-path::
Path to clang. If omit, search it from $PATH.
diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index d5cc15e651cf..f50ca0fef0a4 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -95,6 +95,11 @@ OPTIONS
diff.compute config option. See COMPARISON METHODS section for
more info.
+--cycles-hist::
+ Report a histogram and the standard deviation for cycles data.
+ It can help us to judge if the reported cycles data is noisy or
+ not. This option should be used with '-c cycles'.
+
-p::
--period::
Show period values for both compared hist entries.
diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 18ed1b0fceb3..6345db33c533 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -36,6 +36,9 @@ Enable debugging output.
Print how named events are resolved internally into perf events, and also
any extra expressions computed by perf stat.
+--deprecated::
+Print deprecated events. By default the deprecated events are hidden.
+
[[EVENT_MODIFIERS]]
EVENT MODIFIERS
---------------
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index c6f9f31b6039..b23a4012a606 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -62,6 +62,9 @@ OPTIONS
like this: name=\'CPU_CLK_UNHALTED.THREAD:cmask=0x1\'.
- 'aux-output': Generate AUX records instead of events. This requires
that an AUX area event is also provided.
+ - 'aux-sample-size': Set sample size for AUX area sampling. If the
+ '--aux-sample' option has been used, set aux-sample-size=0 to disable
+ AUX area sampling for the event.
See the linkperf:perf-list[1] man page for more parameters.
@@ -433,6 +436,12 @@ can be specified in a string that follows this option:
In Snapshot Mode trace data is captured only when signal SIGUSR2 is received
and on exit if the above 'e' option is given.
+--aux-sample[=OPTIONS]::
+Select AUX area sampling. At least one of the events selected by the -e option
+must be an AUX area event. Samples on other events will be created containing
+data from the AUX area. Optionally sample size may be specified, otherwise it
+defaults to 4KiB.
+
--proc-map-timeout::
When processing pre-existing threads /proc/XXX/mmap, it may take a long time,
because the file may be huge. A time out is needed in such cases.
@@ -571,6 +580,13 @@ config terms. For example: 'cycles/overwrite/' and 'instructions/no-overwrite/'.
Implies --tail-synthesize.
+--kcore::
+Make a copy of /proc/kcore and place it into a directory with the perf data file.
+
+--max-size=<size>::
+Limit the sample data max size, <size> is expected to be a number with
+appended unit character - B/K/M/G
+
SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 7315f155803f..8dbe2119686a 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -525,6 +525,17 @@ include::itrace.txt[]
Configure time quantum for time sort key. Default 100ms.
Accepts s, us, ms, ns units.
+--total-cycles::
+ When --total-cycles is specified, it supports sorting for all blocks by
+ 'Sampled Cycles%'. This is useful to concentrate on the globally hottest
+ blocks. In output, there are some new columns:
+
+ 'Sampled Cycles%' - block sampled cycles aggregation / total sampled cycles
+ 'Sampled Cycles' - block sampled cycles aggregation
+ 'Avg Cycles%' - block average sampled cycles / sum of total block average
+ sampled cycles
+ 'Avg Cycles' - block average sampled cycles
+
include::callchain-overhead-calculation.txt[]
SEE ALSO
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 930c51c01201..9431b8066fb4 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -217,6 +217,11 @@ core number and the number of online logical processors on that physical process
Aggregate counts per monitored threads, when monitoring threads (-t option)
or processes (-p option).
+--per-node::
+Aggregate counts per NUMA nodes for system-wide mode measurements. This
+is a useful mode to detect imbalance between NUMA nodes. To enable this
+mode, use --per-node in addition to -a. (system-wide).
+
-D msecs::
--delay msecs::
After starting the program, wait msecs before measuring. This is useful to
@@ -323,6 +328,12 @@ The output is SMI cycles%, equals to (aperf - unhalted core cycles) / aperf
Users who wants to get the actual value can apply --no-metric-only.
+--all-kernel::
+Configure all used events to run in kernel space.
+
+--all-user::
+Configure all used events to run in user space.
+
EXAMPLES
--------
diff --git a/tools/perf/Documentation/perf-trace.txt b/tools/perf/Documentation/perf-trace.txt
index 25b74fdb36fa..abc9b5d83312 100644
--- a/tools/perf/Documentation/perf-trace.txt
+++ b/tools/perf/Documentation/perf-trace.txt
@@ -42,6 +42,11 @@ OPTIONS
Prefixing with ! shows all syscalls but the ones specified. You may
need to escape it.
+--filter=<filter>::
+ Event filter. This option should follow an event selector (-e) which
+ selects tracepoint event(s).
+
+
-D msecs::
--delay msecs::
After starting the program, wait msecs before measuring. This is useful to
@@ -141,6 +146,10 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs.
Show all syscalls followed by a summary by thread with min, max, and
average times (in msec) and relative stddev.
+--errno-summary::
+ To be used with -s or -S, to show stats for the errnos experienced by
+ syscalls, using only this option will trigger --summary.
+
--tool_stats::
Show tool stats such as number of times fd->pathname was discovered thru
hooking the open syscall return + vfs_getname or via reading /proc/pid/fd, etc.
@@ -219,6 +228,11 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs.
may happen, for instance, when a thread gets migrated to a different CPU
while processing a syscall.
+--libtraceevent_print::
+ Use libtraceevent to print tracepoint arguments. By default 'perf trace' uses
+ the same beautifiers used in the strace-like enter+exit lines to augment the
+ tracepoint arguments.
+
--map-dump::
Dump BPF maps setup by events passed via -e, for instance the augmented_raw_syscalls
living in tools/perf/examples/bpf/augmented_raw_syscalls.c. For now this
diff --git a/tools/perf/Documentation/perf.data-directory-format.txt b/tools/perf/Documentation/perf.data-directory-format.txt
new file mode 100644
index 000000000000..f37fbd29112e
--- /dev/null
+++ b/tools/perf/Documentation/perf.data-directory-format.txt
@@ -0,0 +1,63 @@
+perf.data directory format
+
+DISCLAIMER This is not ABI yet and is subject to possible change
+ in following versions of perf. We will remove this
+ disclaimer once the directory format soaks in.
+
+
+This document describes the on-disk perf.data directory format.
+
+The layout is described by HEADER_DIR_FORMAT feature.
+Currently it holds only version number (0):
+
+ HEADER_DIR_FORMAT = 24
+
+ struct {
+ uint64_t version;
+ }
+
+The current only version value 0 means that:
+ - there is a single perf.data file named 'data' within the directory.
+ e.g.
+
+ $ tree -ps perf.data
+ perf.data
+ └── [-rw------- 25912] data
+
+Future versions are expected to describe different data files
+layout according to special needs.
+
+Currently the only 'perf record' option to output to a directory is
+the --kcore option which puts a copy of /proc/kcore into the directory.
+e.g.
+
+ $ sudo perf record --kcore uname
+ Linux
+ [ perf record: Woken up 1 times to write data ]
+ [ perf record: Captured and wrote 0.015 MB perf.data (9 samples) ]
+ $ sudo tree -ps perf.data
+ perf.data
+ ├── [-rw------- 23744] data
+ └── [drwx------ 4096] kcore_dir
+ ├── [-r-------- 6731125] kallsyms
+ ├── [-r-------- 40230912] kcore
+ └── [-r-------- 5419] modules
+
+ 1 directory, 4 files
+ $ sudo perf script -v
+ build id event received for vmlinux: 1eaa285996affce2d74d8e66dcea09a80c9941de
+ build id event received for [vdso]: 8bbaf5dc62a9b644b4d4e4539737e104e4a84541
+ build id event received for /lib/x86_64-linux-gnu/libc-2.28.so: 5b157f49586a3ca84d55837f97ff466767dd3445
+ Samples for 'cycles' event do not have CPU attribute set. Skipping 'cpu' field.
+ Using CPUID GenuineIntel-6-8E-A
+ Using perf.data/kcore_dir/kcore for kernel data
+ Using perf.data/kcore_dir/kallsyms for symbols
+ perf 15316 2060795.480902: 1 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
+ perf 15316 2060795.480906: 1 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
+ perf 15316 2060795.480908: 7 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
+ perf 15316 2060795.480910: 119 cycles: ffffffffa2caa54a native_write_msr+0xa (vmlinux)
+ perf 15316 2060795.480912: 2109 cycles: ffffffffa2c9b7b0 native_apic_msr_write+0x0 (vmlinux)
+ perf 15316 2060795.480914: 37606 cycles: ffffffffa2f121fe perf_event_addr_filters_exec+0x2e (vmlinux)
+ uname 15316 2060795.480924: 588287 cycles: ffffffffa303a56d page_counter_try_charge+0x6d (vmlinux)
+ uname 15316 2060795.481067: 2261945 cycles: ffffffffa301438f kmem_cache_free+0x4f (vmlinux)
+ uname 15316 2060795.481643: 2172167 cycles: 7f1a48c393c0 _IO_un_link+0x0 (/lib/x86_64-linux-gnu/libc-2.28.so)
diff --git a/tools/perf/Documentation/perf.txt b/tools/perf/Documentation/perf.txt
index 401f0ed67439..3f37ded13f8c 100644
--- a/tools/perf/Documentation/perf.txt
+++ b/tools/perf/Documentation/perf.txt
@@ -24,6 +24,8 @@ OPTIONS
data-convert - data convert command debug messages
stderr - write debug output (option -v) to stderr
in browser mode
+ perf-event-open - Print perf_event_open() arguments and
+ return value
--buildid-dir::
Setup buildid cache directory. It has higher priority than