summaryrefslogtreecommitdiff
path: root/tools/perf/Documentation/perf-stat.txt
diff options
context:
space:
mode:
Diffstat (limited to 'tools/perf/Documentation/perf-stat.txt')
-rw-r--r--tools/perf/Documentation/perf-stat.txt26
1 files changed, 26 insertions, 0 deletions
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 5af2e432b54f..61d091670dee 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -308,6 +308,14 @@ use --per-die in addition to -a. (system-wide). The output includes the
die number and the number of online processors on that die. This is
useful to gauge the amount of aggregation.
+--per-cluster::
+Aggregate counts per processor cluster for system-wide mode measurement. This
+is a useful mode to detect imbalance between clusters. To enable this mode,
+use --per-cluster in addition to -a. (system-wide). The output includes the
+cluster number and the number of online processors on that cluster. This is
+useful to gauge the amount of aggregation. The information of cluster ID and
+related CPUs can be gotten from /sys/devices/system/cpu/cpuX/topology/cluster_{id, cpus}.
+
--per-cache::
Aggregate counts per cache instance for system-wide mode measurements. By
default, the aggregation happens for the cache level at the highest index
@@ -396,6 +404,9 @@ Aggregate counts per processor socket for system-wide mode measurements.
--per-die::
Aggregate counts per processor die for system-wide mode measurements.
+--per-cluster::
+Aggregate counts perf processor cluster for system-wide mode measurements.
+
--per-cache::
Aggregate counts per cache instance for system-wide mode measurements. By
default, the aggregation happens for the cache level at the highest index
@@ -487,6 +498,21 @@ To interpret the results it is usually needed to know on which
CPUs the workload runs on. If needed the CPUs can be forced using
taskset.
+--record-tpebs::
+Enable automatic sampling on Intel TPEBS retire_latency events (event with :R
+modifier). Without this option, perf would not capture dynamic retire_latency
+at runtime. Currently, a zero value is assigned to the retire_latency event when
+this option is not set. The TPEBS hardware feature starts from Intel Granite
+Rapids microarchitecture. This option only exists in X86_64 and is meaningful on
+Intel platforms with TPEBS feature.
+
+--tpebs-mode=[mean|min|max|last]::
+Set how retirement latency events have their sample times
+combined. The default "mean" gives the average of retirement
+latency. "min" or "max" give the smallest or largest retirment latency
+times respectively. "last" uses the last retirment latency sample's
+time.
+
--td-level::
Print the top-down statistics that equal the input level. It allows
users to print the interested top-down metrics level instead of the