diff options
Diffstat (limited to 'tools/perf/Documentation/perf-stat.txt')
-rw-r--r-- | tools/perf/Documentation/perf-stat.txt | 26 |
1 files changed, 26 insertions, 0 deletions
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt index 5af2e432b54f..61d091670dee 100644 --- a/tools/perf/Documentation/perf-stat.txt +++ b/tools/perf/Documentation/perf-stat.txt @@ -308,6 +308,14 @@ use --per-die in addition to -a. (system-wide). The output includes the die number and the number of online processors on that die. This is useful to gauge the amount of aggregation. +--per-cluster:: +Aggregate counts per processor cluster for system-wide mode measurement. This +is a useful mode to detect imbalance between clusters. To enable this mode, +use --per-cluster in addition to -a. (system-wide). The output includes the +cluster number and the number of online processors on that cluster. This is +useful to gauge the amount of aggregation. The information of cluster ID and +related CPUs can be gotten from /sys/devices/system/cpu/cpuX/topology/cluster_{id, cpus}. + --per-cache:: Aggregate counts per cache instance for system-wide mode measurements. By default, the aggregation happens for the cache level at the highest index @@ -396,6 +404,9 @@ Aggregate counts per processor socket for system-wide mode measurements. --per-die:: Aggregate counts per processor die for system-wide mode measurements. +--per-cluster:: +Aggregate counts perf processor cluster for system-wide mode measurements. + --per-cache:: Aggregate counts per cache instance for system-wide mode measurements. By default, the aggregation happens for the cache level at the highest index @@ -487,6 +498,21 @@ To interpret the results it is usually needed to know on which CPUs the workload runs on. If needed the CPUs can be forced using taskset. +--record-tpebs:: +Enable automatic sampling on Intel TPEBS retire_latency events (event with :R +modifier). Without this option, perf would not capture dynamic retire_latency +at runtime. Currently, a zero value is assigned to the retire_latency event when +this option is not set. The TPEBS hardware feature starts from Intel Granite +Rapids microarchitecture. This option only exists in X86_64 and is meaningful on +Intel platforms with TPEBS feature. + +--tpebs-mode=[mean|min|max|last]:: +Set how retirement latency events have their sample times +combined. The default "mean" gives the average of retirement +latency. "min" or "max" give the smallest or largest retirment latency +times respectively. "last" uses the last retirment latency sample's +time. + --td-level:: Print the top-down statistics that equal the input level. It allows users to print the interested top-down metrics level instead of the |