summaryrefslogtreecommitdiff
path: root/tools/perf/Documentation/perf-script.txt
diff options
context:
space:
mode:
Diffstat (limited to 'tools/perf/Documentation/perf-script.txt')
-rw-r--r--tools/perf/Documentation/perf-script.txt196
1 files changed, 166 insertions, 30 deletions
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 9e4def08d569..03d112960632 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -79,6 +79,9 @@ OPTIONS
--dump-raw-trace=::
Display verbose dump of the trace data.
+--dump-unsorted-raw-trace=::
+ Same as --dump-raw-trace but not sorted in time order.
+
-L::
--Latency=::
Show latency attributes (irqs/preemption disabled, etc).
@@ -98,6 +101,18 @@ OPTIONS
Generate perf-script.[ext] starter script for given language,
using current perf.data.
+--dlfilter=<file>::
+ Filter sample events using the given shared object file.
+ Refer linkperf:perf-dlfilter[1]
+
+--dlarg=<arg>::
+ Pass 'arg' as an argument to the dlfilter. --dlarg may be repeated
+ to add more arguments.
+
+--list-dlfilters::
+ Display a list of available dlfilters. Use with option -v (must come
+ before option --list-dlfilters) to show long descriptions.
+
-a::
Force system-wide collection. Scripts run without a <command>
normally use -a by default, while scripts run with a <command>
@@ -115,9 +130,12 @@ OPTIONS
-F::
--fields::
Comma separated list of fields to print. Options are:
- comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
- srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
- brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc, srccode.
+ comm, tid, pid, time, cpu, event, trace, ip, sym, dso, dsoff, addr, symoff,
+ srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output,
+ brstackinsn, brstackinsnlen, brstackdisasm, brstackoff, callindent, insn, disasm,
+ insnlen, synth, phys_addr, metric, misc, srccode, ipc, data_page_size,
+ code_page_size, ins_lat, machine_pid, vcpu, cgroup, retire_lat, brcntr,
+
Field list can be prepended with the type, trace, sw or hw,
to indicate to which event type the field list applies.
e.g., -F sw:comm,tid,time,ip,sym and -F trace:time,cpu,trace
@@ -159,6 +177,12 @@ OPTIONS
the override, and the result of the above is that only S/W and H/W
events are displayed with the given fields.
+ It's possible tp add/remove fields only for specific event type:
+
+ -Fsw:-cpu,-period
+
+ removes cpu and period from software events.
+
For the 'wildcard' option if a user selected field is invalid for an
event type, a message is displayed to the user that the option is
ignored for that type. For example:
@@ -176,38 +200,61 @@ OPTIONS
At this point usage is displayed, and perf-script exits.
The flags field is synthesized and may have a value when Instruction
- Trace decoding. The flags are "bcrosyiABEx" which stand for branch,
+ Trace decoding. The flags are "bcrosyiABExghDt" which stand for branch,
call, return, conditional, system, asynchronous, interrupt,
- transaction abort, trace begin, trace end, and in transaction,
- respectively. Known combinations of flags are printed more nicely e.g.
+ transaction abort, trace begin, trace end, in transaction, VM-Entry,
+ VM-Exit, interrupt disabled and interrupt disable toggle respectively.
+ Known combinations of flags are printed more nicely e.g.
"call" for "bc", "return" for "br", "jcc" for "bo", "jmp" for "b",
"int" for "bci", "iret" for "bri", "syscall" for "bcs", "sysret" for "brs",
"async" for "by", "hw int" for "bcyi", "tx abrt" for "bA", "tr strt" for "bB",
- "tr end" for "bE". However the "x" flag will be display separately in those
- cases e.g. "jcc (x)" for a condition branch within a transaction.
+ "tr end" for "bE", "vmentry" for "bcg", "vmexit" for "bch".
+ However the "x", "D" and "t" flags will be displayed separately in those
+ cases e.g. "jcc (xD)" for a condition branch within a transaction
+ with interrupts disabled. Note, interrupts becoming disabled is "t",
+ whereas interrupts becoming enabled is "Dt".
The callindent field is synthesized and may have a value when
Instruction Trace decoding. For calls and returns, it will display the
name of the symbol indented with spaces to reflect the stack depth.
- When doing instruction trace decoding insn and insnlen give the
- instruction bytes and the instruction length of the current
- instruction.
+ When doing instruction trace decoding, insn, disasm and insnlen give the
+ instruction bytes, disassembled instructions (requires libcapstone support)
+ and the instruction length of the current instruction respectively.
The synth field is used by synthesized events which may be created when
Instruction Trace decoding.
+ The ipc (instructions per cycle) field is synthesized and may have a value when
+ Instruction Trace decoding.
+
+ The machine_pid and vcpu fields are derived from data resulting from using
+ perf inject to insert a perf.data file recorded inside a virtual machine into
+ a perf.data file recorded on the host at the same time.
+
+ The cgroup fields requires sample having the cgroup id which is saved
+ when "--all-cgroups" option is passed to 'perf record'.
+
Finally, a user may not set fields to none for all event types.
i.e., -F "" is not allowed.
The brstack output includes branch related information with raw addresses using the
- /v/v/v/v/cycles syntax in the following order:
- FROM: branch source instruction
- TO : branch target instruction
- M/P/-: M=branch target mispredicted or branch direction was mispredicted, P=target predicted or direction predicted, -=not supported
- X/- : X=branch inside a transactional region, -=not in transaction region or not supported
- A/- : A=TSX abort entry, -=not aborted region or not supported
- cycles
+ FROM/TO/EVENT/INTX/ABORT/CYCLES/TYPE/SPEC syntax in the following order:
+ FROM : branch source instruction
+ TO : branch target instruction
+ EVENT : M=branch target or direction was mispredicted
+ P=branch target or direction was predicted
+ N=branch not-taken
+ -=no event or not supported
+ INTX : X=branch inside a transactional region
+ -=branch not in transaction region or not supported
+ ABORT : A=TSX abort entry
+ -=not aborted region or not supported
+ CYCLES: the number of cycles that have elapsed since the last branch was recorded
+ TYPE : branch type: COND/UNCOND/IND/CALL/IND_CALL/RET etc.
+ -=not supported
+ SPEC : branch speculation info: SPEC_WRONG_PATH/NON_SPEC_CORRECT_PATH/SPEC_CORRECT_PATH
+ -=not supported
The brstacksym is identical to brstack, except that the FROM and TO addresses are printed in a symbolic form if possible.
@@ -215,15 +262,22 @@ OPTIONS
is printed. This is the full execution path leading to the sample. This is only supported when the
sample was recorded with perf record -b or -j any.
+ Use brstackinsnlen to print the brstackinsn lenght. For example, you
+ can’t know the next sequential instruction after an unconditional branch unless
+ you calculate that based on its length.
+
+ brstackdisasm acts like brstackinsn, but will print disassembled instructions if
+ perf is built with the capstone library.
+
The brstackoff field will print an offset into a specific dso/binary.
With the metric option perf script can compute metrics for
sampling periods, similar to perf stat. This requires
- specifying a group with multiple metrics with the :S option
+ specifying a group with multiple events defining metrics with the :S option
for perf record. perf will sample on the first event, and
- compute metrics for all the events in the group. Please note
+ print computed metrics for all the events in the group. Please note
that the metric computed is averaged over the whole sampling
- period, not just for the sample point.
+ period (since the last sample), not just for the sample point.
For sample events it's possible to display misc field with -F +misc option,
following letters are displayed for each bit:
@@ -307,6 +361,16 @@ OPTIONS
--show-round-events
Display finished round events i.e. events of type PERF_RECORD_FINISHED_ROUND.
+--show-bpf-events
+ Display bpf events i.e. events of type PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT.
+
+--show-cgroup-events
+ Display cgroup events i.e. events of type PERF_RECORD_CGROUP.
+
+--show-text-poke-events
+ Display text poke events i.e. events of type PERF_RECORD_TEXT_POKE and
+ PERF_RECORD_KSYMBOL.
+
--demangle::
Demangle symbol names to human readable form. It's enabled by default,
disable with --no-demangle.
@@ -314,6 +378,9 @@ OPTIONS
--demangle-kernel::
Demangle kernel symbol names to human readable form (for C++ kernels).
+--addr2line=<path>::
+ Path to addr2line binary.
+
--header
Show perf.data header.
@@ -349,12 +416,13 @@ include::itrace.txt[]
--time::
Only analyze samples within given time window: <start>,<stop>. Times
- have the format seconds.microseconds. If start is not given (i.e., time
+ have the format seconds.nanoseconds. If start is not given (i.e. time
string is ',x.y') then analysis starts at the beginning of the file. If
- stop time is not given (i.e, time string is 'x.y,') then analysis goes
- to end of file.
+ stop time is not given (i.e. time string is 'x.y,') then analysis goes
+ to end of file. Multiple ranges can be separated by spaces, which
+ requires the argument to be quoted e.g. --time "1234.567,1234.789 1235,"
- Also support time percent with multipe time range. Time string is
+ Also support time percent with multiple time ranges. Time string is
'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
For example:
@@ -371,9 +439,15 @@ include::itrace.txt[]
perf script --time 0%-10%,30%-40%
--max-blocks::
- Set the maximum number of program blocks to print with brstackasm for
+ Set the maximum number of program blocks to print with brstackinsn for
each sample.
+--reltime::
+ Print time stamps relative to trace start.
+
+--deltatime::
+ Print time stamps relative to previous event.
+
--per-event-dump::
Create per event files with a "perf.data.EVENT.dump" name instead of
printing to stdout, useful, for instance, for generating flamegraphs.
@@ -383,13 +457,45 @@ include::itrace.txt[]
will be printed. Each entry has function name and file/line. Enabled by
default, disable with --no-inline.
---insn-trace::
- Show instruction stream for intel_pt traces. Combine with --xed to
- show disassembly.
+--insn-trace[=<raw|disasm>]::
+ Show instruction stream in bytes (raw) or disassembled (disasm)
+ for intel_pt traces. The default is 'raw'. To use xed, combine
+ 'raw' with --xed to show disassembly done by xed.
--xed::
Run xed disassembler on output. Requires installing the xed disassembler.
+-S::
+--symbols=symbol[,symbol...]::
+ Only consider the listed symbols. Symbols are typically a name
+ but they may also be hexadecimal address.
+
+ The hexadecimal address may be the start address of a symbol or
+ any other address to filter the trace records
+
+ For example, to select the symbol noploop or the address 0x4007a0:
+ perf script --symbols=noploop,0x4007a0
+
+ Support filtering trace records by symbol name, start address of
+ symbol, any hexadecimal address and address range.
+
+ The comparison order is:
+
+ 1. symbol name comparison
+ 2. symbol start address comparison.
+ 3. any hexadecimal address comparison.
+ 4. address range comparison (see --addr-range).
+
+--addr-range::
+ Use with -S or --symbols to list traced records within address range.
+
+ For example, to list the traced records within the address range
+ [0x4007a0, 0x0x4007a9]:
+ perf script -S 0x4007a0 --addr-range 10
+
+--dsos=::
+ Only consider symbols in these DSOs.
+
--call-trace::
Show call stream for intel_pt traces. The CPUs are interleaved, but
can be filtered with -C.
@@ -401,7 +507,37 @@ include::itrace.txt[]
For itrace only show specified functions and their callees for
itrace. Multiple functions can be separated by comma.
+--switch-on EVENT_NAME::
+ Only consider events after this event is found.
+
+--switch-off EVENT_NAME::
+ Stop considering events after this event is found.
+
+--show-on-off-events::
+ Show the --switch-on/off events too.
+
+--stitch-lbr::
+ Show callgraph with stitched LBRs, which may have more complete
+ callgraph. The perf.data file must have been obtained using
+ perf record --call-graph lbr.
+ Disabled by default. In common cases with call stack overflows,
+ it can recreate better call stacks than the default lbr call stack
+ output. But this approach is not foolproof. There can be cases
+ where it creates incorrect call stacks from incorrect matches.
+ The known limitations include exception handing such as
+ setjmp/longjmp will have calls/returns not match.
+
+--merge-callchains::
+ Enable merging deferred user callchains if available. This is the
+ default behavior. If you want to see separate CALLCHAIN_DEFERRED
+ records for some reason, use --no-merge-callchains explicitly.
+
+:GMEXAMPLECMD: script
+:GMEXAMPLESUBCMD:
+include::guest-files.txt[]
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script-perl[1],
-linkperf:perf-script-python[1]
+linkperf:perf-script-python[1], linkperf:perf-intel-pt[1],
+linkperf:perf-dlfilter[1]