diff options
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2025-06-03 15:11:44 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2025-06-03 15:11:44 -0700 |
| commit | 0939bd2fcf337243133b0271335a2838857c319f (patch) | |
| tree | 57324f5cc62bd878f248f69e23d06ec49b197c18 /tools/lib | |
| parent | 70087d2200d4a3bd31812ab4578c9ec70ea344af (diff) | |
| parent | a913ef6fd883c05bd6538ed21ee1e773f0d750b7 (diff) | |
Merge tag 'perf-tools-for-v6.16-1-2025-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
"perf report/top/annotate TUI:
- Accept the left arrow key as a Zoom out if done on the first column
- Show if source code toggle status in title, to help spotting bugs
with the various disassemblers (capstone, llvm, objdump)
- Provide feedback on unhandled hotkeys
Build:
- Better inform when certain features are not available with warnings
in the build process and in 'perf version --build-options' or 'perf -vv'
perf record:
- Improve the --off-cpu code by synthesizing events for switch-out ->
switch-in intervals using a BPF program. This can be fine tuned
using a --off-cpu-thresh knob
perf report:
- Add 'tgid' sort key
perf mem/c2c:
- Add 'op', 'cache', 'snoop', 'dtlb' output fields
- Add support for 'ldlat' on AMD IBS (Instruction Based Sampling)
perf ftrace:
- Use process/session specific trace settings instead of messing with
the global ftrace knobs
perf trace:
- Implement syscall summary in BPF
- Support --summary-mode=cgroup
- Always print return value for syscalls returning a pid
- The rseq and set_robust_list don't return a pid, just -errno
perf lock contention:
- Symbolize zone->lock using BTF
- Add -J/--inject-delay option to estimate impact on application
performance by optimization of kernel locking behavior
perf stat:
- Improve hybrid support for the NMI watchdog warning
Symbol resolution:
- Handle 'u' and 'l' symbols in /proc/kallsyms, resolving some Rust
symbols
- Improve Rust demangler
Hardware tracing:
Intel PT:
- Fix PEBS-via-PT data_src
- Do not default to recording all switch events
- Fix pattern matching with python3 on the SQL viewer script
arm64:
- Fixups for the hip08 hha PMU
Vendor events:
- Update Intel events/metrics files for alderlake, alderlaken,
arrowlake, bonnell, broadwell, broadwellde, broadwellx,
cascadelakex, clearwaterforest, elkhartlake, emeraldrapids,
grandridge, graniterapids, haswell, haswellx, icelake, icelakex,
ivybridge, ivytown, jaketown, lunarlake, meteorlake, nehalemep,
nehalemex, rocketlake, sandybridge, sapphirerapids, sierraforest,
skylake, skylakex, snowridgex, tigerlake, westmereep-dp,
westmereep-sp, westmereep-sx
python support:
- Add support for event counts in the python binding, add a
counting.py example
perf list:
- Display the PMU name associated with a perf metric in JSON
perf test:
- Hybrid improvements for metric value validation test
- Fix LBR test by ignoring idle task
- Add AMD IBS sw filter ana d'ldlat' tests
- Add 'perf trace --summary-mode=cgroup' test
- Add tests for the various language symbol demanglers
Miscellaneous:
- Allow specifying the cpu an event will be tied using '-e
event/cpu=N/'
- Sync various headers with the kernel sources
- Add annotations to use clang's -Wthread-safety and fix some
problems it detected
- Make dump_stack() use perf's symbol resolution to provide better
backtraces
- Intel TPEBS support cleanups and fixes. TPEBS stands for Timed PEBS
(Precision Event-Based Sampling), that adds timing info, the
retirement latency of instructions
- Various memory allocation (some detected by ASAN) and reference
counting fixes
- Add a 8-byte aligned PERF_RECORD_COMPRESSED2 to replace
PERF_RECORD_COMPRESSED
- Skip unsupported event types in perf.data files, don't stop when
finding one
- Improve lookups using hashmaps and binary searches"
* tag 'perf-tools-for-v6.16-1-2025-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (206 commits)
perf callchain: Always populate the addr_location map when adding IP
perf lock contention: Reject more than 10ms delays for safety
perf trace: Set errpid to false for rseq and set_robust_list
perf symbol: Move demangling code out of symbol-elf.c
perf trace: Always print return value for syscalls returning a pid
perf script: Print PERF_AUX_FLAG_COLLISION flag
perf mem: Show absolute percent in mem_stat output
perf mem: Display sort order only if it's available
perf mem: Describe overhead calculation in brief
perf record: Fix incorrect --user-regs comments
Revert "perf thread: Ensure comm_lock held for comm_list"
perf test trace_summary: Skip --bpf-summary tests if no libbpf
perf test intel-pt: Skip jitdump test if no libelf
perf intel-tpebs: Avoid race when evlist is being deleted
perf test demangle-java: Don't segv if demangling fails
perf symbol: Fix use-after-free in filename__read_build_id
perf pmu: Avoid segv for missing name/alias_name in wildcarding
perf machine: Factor creating a "live" machine out of dwarf-unwind
perf test: Add AMD IBS sw filter test
perf mem: Count L2 HITM for c2c statistic
...
Diffstat (limited to 'tools/lib')
| -rw-r--r-- | tools/lib/perf/Documentation/libperf.txt | 1 | ||||
| -rw-r--r-- | tools/lib/perf/cpumap.c | 10 | ||||
| -rw-r--r-- | tools/lib/perf/include/perf/cpumap.h | 2 | ||||
| -rw-r--r-- | tools/lib/perf/include/perf/event.h | 12 | ||||
| -rw-r--r-- | tools/lib/perf/include/perf/threadmap.h | 1 | ||||
| -rw-r--r-- | tools/lib/perf/threadmap.c | 17 |
6 files changed, 43 insertions, 0 deletions
diff --git a/tools/lib/perf/Documentation/libperf.txt b/tools/lib/perf/Documentation/libperf.txt index 59aabdd3cabf..4072bc9b7670 100644 --- a/tools/lib/perf/Documentation/libperf.txt +++ b/tools/lib/perf/Documentation/libperf.txt @@ -210,6 +210,7 @@ SYNOPSIS struct perf_record_time_conv; struct perf_record_header_feature; struct perf_record_compressed; + struct perf_record_compressed2; -- DESCRIPTION diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c index 4454a5987570..b20a5280f2b3 100644 --- a/tools/lib/perf/cpumap.c +++ b/tools/lib/perf/cpumap.c @@ -242,6 +242,16 @@ out: return cpus; } +struct perf_cpu_map *perf_cpu_map__new_int(int cpu) +{ + struct perf_cpu_map *cpus = perf_cpu_map__alloc(1); + + if (cpus) + RC_CHK_ACCESS(cpus)->map[0].cpu = cpu; + + return cpus; +} + static int __perf_cpu_map__nr(const struct perf_cpu_map *cpus) { return RC_CHK_ACCESS(cpus)->nr; diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h index 8c1ab0f9194e..58cc5c5fa47c 100644 --- a/tools/lib/perf/include/perf/cpumap.h +++ b/tools/lib/perf/include/perf/cpumap.h @@ -37,6 +37,8 @@ LIBPERF_API struct perf_cpu_map *perf_cpu_map__new_online_cpus(void); * perf_cpu_map__new_online_cpus is returned. */ LIBPERF_API struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list); +/** perf_cpu_map__new_int - create a map with the one given cpu. */ +LIBPERF_API struct perf_cpu_map *perf_cpu_map__new_int(int cpu); LIBPERF_API struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map); LIBPERF_API int perf_cpu_map__merge(struct perf_cpu_map **orig, struct perf_cpu_map *other); diff --git a/tools/lib/perf/include/perf/event.h b/tools/lib/perf/include/perf/event.h index 37bb7771d914..09b7c643ddac 100644 --- a/tools/lib/perf/include/perf/event.h +++ b/tools/lib/perf/include/perf/event.h @@ -457,6 +457,16 @@ struct perf_record_compressed { char data[]; }; +/* + * `header.size` includes the padding we are going to add while writing the record. + * `data_size` only includes the size of `data[]` itself. + */ +struct perf_record_compressed2 { + struct perf_event_header header; + __u64 data_size; + char data[]; +}; + enum perf_user_event_type { /* above any possible kernel type */ PERF_RECORD_USER_TYPE_START = 64, PERF_RECORD_HEADER_ATTR = 64, @@ -478,6 +488,7 @@ enum perf_user_event_type { /* above any possible kernel type */ PERF_RECORD_HEADER_FEATURE = 80, PERF_RECORD_COMPRESSED = 81, PERF_RECORD_FINISHED_INIT = 82, + PERF_RECORD_COMPRESSED2 = 83, PERF_RECORD_HEADER_MAX }; @@ -518,6 +529,7 @@ union perf_event { struct perf_record_time_conv time_conv; struct perf_record_header_feature feat; struct perf_record_compressed pack; + struct perf_record_compressed2 pack2; }; #endif /* __LIBPERF_EVENT_H */ diff --git a/tools/lib/perf/include/perf/threadmap.h b/tools/lib/perf/include/perf/threadmap.h index 8b40e7777cea..44deb815b817 100644 --- a/tools/lib/perf/include/perf/threadmap.h +++ b/tools/lib/perf/include/perf/threadmap.h @@ -14,6 +14,7 @@ LIBPERF_API void perf_thread_map__set_pid(struct perf_thread_map *map, int idx, LIBPERF_API char *perf_thread_map__comm(struct perf_thread_map *map, int idx); LIBPERF_API int perf_thread_map__nr(struct perf_thread_map *threads); LIBPERF_API pid_t perf_thread_map__pid(struct perf_thread_map *map, int idx); +LIBPERF_API int perf_thread_map__idx(struct perf_thread_map *map, pid_t pid); LIBPERF_API struct perf_thread_map *perf_thread_map__get(struct perf_thread_map *map); LIBPERF_API void perf_thread_map__put(struct perf_thread_map *map); diff --git a/tools/lib/perf/threadmap.c b/tools/lib/perf/threadmap.c index 07968f3ea093..db431b036f57 100644 --- a/tools/lib/perf/threadmap.c +++ b/tools/lib/perf/threadmap.c @@ -97,5 +97,22 @@ int perf_thread_map__nr(struct perf_thread_map *threads) pid_t perf_thread_map__pid(struct perf_thread_map *map, int idx) { + if (!map) { + assert(idx == 0); + return -1; + } + return map->map[idx].pid; } + +int perf_thread_map__idx(struct perf_thread_map *threads, pid_t pid) +{ + if (!threads) + return pid == -1 ? 0 : -1; + + for (int i = 0; i < threads->nr; ++i) { + if (threads->map[i].pid == pid) + return i; + } + return -1; +} |
