path: root/tools/perf/util/parse-events.c
Age | Commit message | Author
8 days | perf tp_pmu: Factor existing tracepoint logic to new file | Ian Rogers
Start the creation of a tracepoint PMU abstraction. Tracepoint events don't follow the regular sysfs perf conventions. Eventually the new PMU abstraction will bridge the gap so tracepoint events look more like regular perf ones. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250725185202.68671-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
8 days | perf parse-events: Remove non-json software events | Ian Rogers
Remove the hard coded encodings from parse-events. This has the consequence that software events are matched using the sysfs/json priority, will be case insensitive and will be wildcarded across PMUs. As there were software and hardware types in the parsing code, the removal means software vs hardware logic can be removed and hardware assumed.

Now the perf json provides detailed descriptions of software events, remove the previous listing support that didn't contain event descriptions. When globbing is required for the "sw" option in perf list, use string PMU globbing as was done previously for the tool PMU.

The output of `perf list sw` command changed like this.

Before:

    List of pre-defined events (to be used in -e or -M):

      alignment-faults                     [Software event]
      bpf-output                           [Software event]
      cgroup-switches                      [Software event]
      context-switches OR cs               [Software event]
      cpu-clock                            [Software event]
      cpu-migrations OR migrations         [Software event]
      dummy                                [Software event]
      emulation-faults                     [Software event]
      major-faults                         [Software event]
      minor-faults                         [Software event]
      page-faults OR faults                [Software event]
      task-clock                           [Software event]

After:

    List of pre-defined events (to be used in -e or -M):

    software:
      alignment-faults
           [Number of kernel handled memory alignment faults. Unit: software]
      bpf-output
           [An event used by BPF programs to write to the perf ring buffer. Unit: software]
      cgroup-switches
           [Number of context switches to a task in a different cgroup. Unit: software]
      context-switches
           [Number of context switches [This event is an alias of cs]. Unit: software]
      cpu-clock
           [Per-CPU high-resolution timer based event. Unit: software]
      cpu-migrations
           [Number of times a process has migrated to a new CPU [This event is an alias of migrations]. Unit: software]
      cs
           [Number of context switches [This event is an alias of context-switches]. Unit: software]
      dummy
           [A placeholder event that doesn't count anything. Unit: software]
      ...

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250725185202.68671-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 days | perf parse-events: Fix missing slots for Intel topdown metric events | Ian Rogers
Topdown metric events require grouping with a slots event. In perf metrics this is currently achieved by metrics adding an unnecessary "0 * tma_info_thread_slots". New TMA metrics trigger optimizations of the metric expression that removes the event and breaks the metric due to the missing but required event. Add a pass immediately before sorting and fixing parsed events, that insert a slots event if one is missing. Update test expectations to match this. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250719030517.1990983-15-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
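The inserted pass can be pictured with a small sketch. This is not perf's code: it assumes events are plain name strings, that topdown metric events can be recognised by a `topdown-` prefix, and that the slots event is placed first in the group. The helper name `ensure_slots` is hypothetical.

```python
def ensure_slots(group):
    """Return a copy of `group` with a "slots" event added when needed.

    Assumptions of this sketch (not taken from perf): event names that
    start with "topdown-" are topdown metric events, and a missing slots
    event is inserted at the front of the group.
    """
    has_topdown = any(name.startswith("topdown-") for name in group)
    has_slots = "slots" in group
    if has_topdown and not has_slots:
        return ["slots"] + list(group)
    # Nothing to fix: no topdown metric events, or slots already present.
    return list(group)
```

Running a step like this before sorting means later stages can rely on the slots event being present, even if a metric expression was optimized so that it no longer references it.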
10 days | perf parse-events: Support user CPUs mixed with threads/processes | Ian Rogers
Counting events system-wide with a specified CPU prior to this change worked:
```
$ perf stat -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' -a sleep 1

 Performance counter stats for 'system wide':

    59,393,419,099      msr/tsc/
    33,927,965,927      msr/tsc,cpu=cpu_core/
    25,465,608,044      msr/tsc,cpu=cpu_atom/
```

However, when counting with process the counts became system wide:
```
$ perf stat -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' perf test -F 10
 10.1: Basic parsing test                  : Ok
 10.2: Parsing without PMU name            : Ok
 10.3: Parsing with PMU name               : Ok

 Performance counter stats for 'perf test -F 10':

        59,233,549      msr/tsc/
        59,227,556      msr/tsc,cpu=cpu_core/
        59,224,053      msr/tsc,cpu=cpu_atom/
```

Make the handling of CPU maps with event parsing clearer. When an event is parsed creating an evsel the cpus should be either the PMU's cpumask or user specified CPUs.

Update perf_evlist__propagate_maps so that it doesn't clobber the user specified CPUs. Try to make the behavior clearer, firstly fix up missing cpumasks. Next, perform sanity checks and adjustments from the global evlist CPU requests and for the PMU including simplifying to the "any CPU"(-1) value. Finally remove the event if the cpumask is empty.

So that events are opened with a CPU and a thread, change stat's create_perf_stat_counter to give both.

With the change things are fixed:
```
$ perf stat --no-scale -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' perf test -F 10
 10.1: Basic parsing test                  : Ok
 10.2: Parsing without PMU name            : Ok
 10.3: Parsing with PMU name               : Ok

 Performance counter stats for 'perf test -F 10':

        63,704,975      msr/tsc/
        47,060,704      msr/tsc,cpu=cpu_core/        (4.62%)
        16,640,591      msr/tsc,cpu=cpu_atom/        (2.18%)
```

However, note the "--no-scale" option is used. This is necessary as the running time for the event on the counter isn't the same as the enabled time because the thread doesn't necessarily run on the CPUs specified for the counter. All counter values are scaled with:

    scaled_value = value * time_enabled / time_running

and so without --no-scale the scaled_value becomes very large. This problem already exists on hybrid systems for the same reason. Here are 2 runs of the same code with an instructions event that counts the same on both types of core, there is no real multiplexing happening on the event:
```
$ perf stat -e instructions perf test -F 10
...
 Performance counter stats for 'perf test -F 10':

        87,896,447      cpu_atom/instructions/       (14.37%)
        98,171,964      cpu_core/instructions/       (85.63%)
...
$ perf stat --no-scale -e instructions perf test -F 10
...
 Performance counter stats for 'perf test -F 10':

        13,069,890      cpu_atom/instructions/       (19.32%)
        83,460,274      cpu_core/instructions/       (80.68%)
...
```

The scaling has inflated per-PMU instruction counts and the overall count by 2x. To fix this the kernel needs changing when a task+CPU event (or just task event on hybrid) is scheduled out. A fix could be that the state isn't inactive but off for such events, so that time_enabled counts don't accumulate on them.

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250719030517.1990983-13-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
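To make the scaling formula concrete, here is a minimal sketch of the arithmetic. The function name `scale_count`, the handling of a zero running time, and the input numbers are assumptions made for illustration; they are not taken from perf or from the runs quoted in this commit message.

```python
def scale_count(value, time_enabled, time_running):
    """Apply scaled_value = value * time_enabled / time_running.

    Returning None when the event never ran is a choice of this sketch.
    """
    if time_running == 0:
        return None
    return value * time_enabled / time_running


# Hypothetical numbers: a counter enabled for 100 time units that was
# actually scheduled on a matching CPU for only 20 of them.
print(scale_count(1_000_000, 100, 20))
```

With the counter running for only a fifth of its enabled time, the raw count is multiplied by five. This is the kind of inflation the message describes when a thread spends most of its time on CPUs the event is not counting on.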
10 days | perf parse-events: Minor __add_event refactoring | Ian Rogers
Rename cpu_list to user_cpus. If a PMU isn't given, find it early from the perf_event_attr. Make the pmu_cpus more explicitly a copy from the PMU (except when user_cpus are given). Derive the cpus from pmu_cpus and user_cpus as appropriate. Handle strdup errors on name and metric_id. Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250719030517.1990983-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 days | libperf evsel: Rename own_cpus to pmu_cpus | Ian Rogers
own_cpus is generally the cpumask from the PMU. Rename to pmu_cpus to try to make this clearer. Variable rename with no other changes. Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250719030517.1990983-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 days | perf parse-events: Allow the cpu term to be a PMU or CPU range | Ian Rogers
On hybrid systems, events like msr/tsc/ will aggregate counts across all CPUs. Often metrics only want a value like msr/tsc/ for the cores on which the metric is being computed. Listing each CPU with terms cpu=0,cpu=1.. is laborious and would need to be encoded for all variations of a CPU model.

Allow the cpumask from a PMU to be an argument to the cpu term. For example in the following the cpumask of the cstate_pkg PMU selects the CPUs to count msr/tsc/ counter upon:
```
$ cat /sys/bus/event_source/devices/cstate_pkg/cpumask
0
$ perf stat -A -e 'msr/tsc,cpu=cstate_pkg/' -a sleep 0.1

 Performance counter stats for 'system wide':

CPU0          252,621,253      msr/tsc,cpu=cstate_pkg/

       0.101184092 seconds time elapsed
```

As the cpu term is now also allowed to be a string, allow it to encode a range of CPUs (a list can't be supported as ',' is already a special token).

The "event qualifiers" section of the `perf list` man page is updated to detail the additional behavior. The man page formatting is tidied up in this section, as it was incorrectly appearing within the "parameterized events" section.

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250719030517.1990983-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
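The three forms the cpu term value can take (a single CPU, a range, or a PMU name) can be sketched as below. This is an illustration only: `resolve_cpu_term` is a hypothetical helper, the `pmu_cpumasks` dictionary stands in for reading each PMU's sysfs cpumask, and checking for a PMU name before trying a number or range is an assumption of the sketch.

```python
def resolve_cpu_term(value, pmu_cpumasks):
    """Turn a cpu term value into a set of CPU numbers.

    value        -- "3", "4-7", or a PMU name such as "cstate_pkg"
    pmu_cpumasks -- hypothetical mapping of PMU name to its set of CPUs
    """
    if value in pmu_cpumasks:
        # The term names a PMU: reuse that PMU's cpumask.
        return set(pmu_cpumasks[value])
    if "-" in value:
        # The term is an inclusive range "first-last".
        first, last = value.split("-", 1)
        return set(range(int(first), int(last) + 1))
    # Otherwise treat it as a single CPU number.
    return {int(value)}
```

For the example in the message, `resolve_cpu_term("cstate_pkg", {"cstate_pkg": {0}})` gives `{0}`, matching the single CPU0 row in the output.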
10 days | perf parse-events: Warn if a cpu term is unsupported by a CPU | Ian Rogers
Factor requested CPU warning out of evlist and into evsel. At the end of adding an event, perform the warning check. To avoid repeatedly testing if the cpu_list is empty, add a local variable.

```
$ perf stat -e cpu_atom/cycles,cpu=1/ -a true
WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cpu_atom/cycles/'

 Performance counter stats for 'system wide':

   <not supported>      cpu_atom/cycles/

       0.000781511 seconds time elapsed
```

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250719030517.1990983-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
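The check behind the warning amounts to asking whether every requested CPU is in the PMU's CPU set. The sketch below reproduces the message text quoted in this commit. The helper names `format_cpus` and `unsupported_cpu_warning` are hypothetical, and the logic is a simplified reading of the behaviour rather than the evsel code itself.

```python
def format_cpus(cpus):
    """Render CPU numbers as a compact list such as "0-2,5"."""
    cpus = sorted(set(cpus))
    parts = []
    i = 0
    while i < len(cpus):
        j = i
        # Extend j while the next CPU number is consecutive.
        while j + 1 < len(cpus) and cpus[j + 1] == cpus[j] + 1:
            j += 1
        parts.append(str(cpus[i]) if i == j else f"{cpus[i]}-{cpus[j]}")
        i = j + 1
    return ",".join(parts)


def unsupported_cpu_warning(requested, pmu_name, pmu_cpus, event_name):
    """Return a warning string, or None if all requested CPUs are supported."""
    if set(requested) <= set(pmu_cpus):
        return None
    return (f"WARNING: A requested CPU in '{format_cpus(requested)}' is not "
            f"supported by PMU '{pmu_name}' (CPUs {format_cpus(pmu_cpus)}) "
            f"for event '{event_name}'")
```

Calling `unsupported_cpu_warning([1], "cpu_atom", range(16, 28), "cpu_atom/cycles/")` yields the same line as the transcript in this commit message.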
2025-07-11 | perf parse-events: Minor tidy up of event_type helper | Ian Rogers
Add missing breakpoint and raw types. Avoid a switch, just use a lookup array. Switch the type to unsigned to avoid checking negative values. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
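The switch-to-lookup-array idea can be illustrated as follows. The index order follows the `perf_type_id` values in the perf_event UAPI header (hardware 0 through breakpoint 5), but the exact strings and the "unknown" fallback are assumptions of this sketch, not a copy of the helper.

```python
# Index = perf_type_id value; the strings are illustrative.
EVENT_TYPE_NAMES = (
    "hardware",        # PERF_TYPE_HARDWARE   = 0
    "software",        # PERF_TYPE_SOFTWARE   = 1
    "tracepoint",      # PERF_TYPE_TRACEPOINT = 2
    "hardware-cache",  # PERF_TYPE_HW_CACHE   = 3
    "raw",             # PERF_TYPE_RAW        = 4
    "breakpoint",      # PERF_TYPE_BREAKPOINT = 5
)


def event_type(type_id):
    """Map a perf type id to a name with a table lookup instead of a switch."""
    # In C an unsigned parameter makes the lower-bound test unnecessary;
    # Python integers are signed, so the sketch checks both bounds.
    if 0 <= type_id < len(EVENT_TYPE_NAMES):
        return EVENT_TYPE_NAMES[type_id]
    return "unknown"
```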
2025-06-25 | perf parse-events: Avoid scanning PMUs that can't contain events | Ian Rogers
Add perf_pmus__scan_for_event that only reads sysfs for pmus that could contain a given event. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250624231837.179536-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20 | perf parse-events: Set default GH modifier properly | Namhyung Kim
Commit 7b100989b4f6bce7 ("perf evlist: Remove __evlist__add_default") changed to use "cycles:P" as a default event. But the problem is it cannot set other default modifiers correctly.

perf kvm needs to set attr.exclude_host by default but it didn't work because of the logic in the parse_events__modifier_list(). Also the exclude_GH_default was applied only if ":u" modifier was specified - which is strange. Move it out after handling the ":GH" and check perf_host and perf_guest properly.

Before:

    $ ./perf kvm record -vv true |& grep exclude
    (nothing)

But specifying an event (without a modifier) works:

    $ ./perf kvm record -vv -e cycles true |& grep exclude
    exclude_host 1

After: it now works for both cases:

    $ ./perf kvm record -vv true |& grep exclude
    exclude_host 1

    $ ./perf kvm record -vv -e cycles true |& grep exclude
    exclude_host 1

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250606225431.2109754-1-namhyung@kernel.org
Fixes: 35c8d21371e9b342 ("perf tools: Don't set attr.exclude_guest by default")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 | perf parse-events: Add parse_uid_filter helper | Ian Rogers
Add parse_uid_filter filter as a helper to parse_filter, that constructs a uid filter string. As uid filters don't work with tracepoint filters, add a is_possible_tp_filter function so the tracepoint filter isn't attempted for tracepoint evsels. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174545.2853620-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 | perf parse-events filter: Use evsel__find_pmu | Ian Rogers
Rather than manually scanning PMUs, use evsel__find_pmu that can use the PMU set during event parsing. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174545.2853620-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-05-14 | perf parse-events: Use wildcard processing to set an event to merge into | Ian Rogers
The merge stat code fails for uncore events if they are repeated twice, for example `perf stat -e clockticks,clockticks -I 1000` as the counts of the second set of uncore events will be merged into the first counter. Reimplement the logic to have a first_wildcard_match so that merged later events correctly merge into the first wildcard event that they will be aggregated into. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Chun-Tse Shao <ctshao@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Levi Yun <yeoreum.yun@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250513215401.2315949-3-ctshao@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-14 | perf evlist: Make uniquifying counter names consistent | Ian Rogers
'perf stat' has different uniquification logic to 'perf record' and perf top. In the case of perf record and 'perf top' all hybrid event names are uniquified. 'perf stat' is more disciplined respecting name config terms, libpfm4 events, etc. 'perf stat' will uniquify hybrid events and the non-core PMU cases shouldn't apply to perf record or 'perf top'. For consistency, remove the uniquification for 'perf record' and 'perf top' and reuse the 'perf stat' uniquification, making the code more globally visible for this. Fix the detection of cross-PMU for disabling uniquify by correctly setting last_pmu. When setting uniquify on an evsel, make sure the PMUs between the 2 considered events differ otherwise the uniquify isn't adding value. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Chun-Tse Shao <ctshao@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Levi Yun <yeoreum.yun@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250513215401.2315949-2-ctshao@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-12 | perf parse-events: Add "cpu" term to set the CPU an event is recorded on | Ian Rogers
The -C option allows the CPUs for a list of events to be specified but it's not possible to set the CPU for a single event. Add a term to allow this. The term isn't a general CPU list due to ',' already being a special character in event parsing; instead, multiple cpu= terms may be provided and they will be merged/unioned together.

An example of mixing different types of events counted on different CPUs:
```
$ perf stat -A -C 0,4-5,8 -e "instructions/cpu=0/,l1d-misses/cpu=4,cpu=5/,inst_retired.any/cpu=8/,cycles" -a sleep 0.1

 Performance counter stats for 'system wide':

CPU0            6,979,225      instructions/cpu=0/              #    0.89  insn per cycle
CPU4               75,138      cpu/l1d-misses/
CPU5            1,418,939      cpu/l1d-misses/
CPU8              797,553      cpu/inst_retired.any,cpu=8/
CPU0            7,845,302      cycles
CPU4            6,546,859      cycles
CPU5          185,915,438      cycles
CPU8            2,065,668      cycles

       0.112449242 seconds time elapsed
```

Committer testing:

    root@number:~# grep -m1 "model name" /proc/cpuinfo
    model name : AMD Ryzen 9 9950X3D 16-Core Processor
    root@number:~# perf stat -A -e "instructions/cpu=0/,instructions,l1d-misses/cpu=4,cpu=5/,cycles" -a sleep 0.1

     Performance counter stats for 'system wide':

    CPU0            2,398,351      instructions/cpu=0/              #    0.44  insn per cycle
    CPU0            2,398,152      instructions                     #    0.44  insn per cycle
    CPU1            1,265,634      instructions                     #    0.49  insn per cycle
    CPU2              606,087      instructions                     #    0.50  insn per cycle
    CPU3            4,025,752      instructions                     #    0.52  insn per cycle
    CPU4            4,236,810      instructions                     #    0.53  insn per cycle
    CPU5            3,984,832      instructions                     #    0.66  insn per cycle
    CPU6              434,132      instructions                     #    0.44  insn per cycle
    CPU7               65,752      instructions                     #    0.41  insn per cycle
    CPU8              459,083      instructions                     #    0.48  insn per cycle
    CPU9            6,464,161      instructions                     #    1.31  insn per cycle
    <SNIP>
    root@number:~# perf stat -e "instructions/cpu=0/,instructions,l1d-misses/cpu=4,cpu=5/,cycles" -a sleep 0.

     Performance counter stats for 'system wide':

              144,822      instructions/cpu=0/              #    0.03  insn per cycle
            4,666,114      instructions                     #    0.93  insn per cycle
                2,583      l1d-misses
            4,993,633      cycles

           0.000868512 seconds time elapsed

    root@number:~#

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20250403194337.40202-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
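The "merged/unioned together" behaviour of repeated cpu= terms can be sketched as a set union over the terms of one event. The helper `cpus_from_terms` is hypothetical and only handles integer `cpu=` values as introduced by this commit; it is not the parser's implementation.

```python
def cpus_from_terms(terms):
    """Collect the union of all cpu=N terms in a comma-separated term list.

    `terms` is the text between the slashes of an event, for example
    "cpu=4,cpu=5". Terms other than cpu= are ignored by this sketch.
    """
    cpus = set()
    for term in terms.split(","):
        key, sep, val = term.partition("=")
        if sep and key.strip() == "cpu":
            cpus.add(int(val))
    return cpus
```

For `l1d-misses/cpu=4,cpu=5/` this gives `{4, 5}`, which lines up with the CPU4 and CPU5 rows reported for that event in the first example.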
2025-05-12 | perf parse-events: Set is_pmu_core for legacy hardware events | Ian Rogers
Also set the CPU map to all online CPU maps. This is done so the behavior of legacy hardware and hardware cache events better matches that of sysfs and JSON events during __perf_evlist__propagate_maps(). Fix missing cpumap put in "Synthesize attr update" test. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250403194337.40202-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-08 | perf parse-events: Add debug dump of evlist if reordered | Ian Rogers
Add debug verbose output to show how evsels were reordered by parse_events__sort_events_and_fix_groups(). For example:
```
$ perf record -v -e '{instructions,cycles}' true
Using CPUID GenuineIntel-6-B7-1
WARNING: events were regrouped to match PMUs
evlist after sorting/fixing: '{cpu_atom/instructions/,cpu_atom/cycles/},{cpu_core/instructions/,cpu_core/cycles/}'
```

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 | perf pmu-events: Add retirement latency to JSON events inside of perf | Ian Rogers
The updated Intel vendor events add retirement latency for graniterapids: https://lore.kernel.org/lkml/20250322063403.364981-14-irogers@google.com/ This change makes those values available within an alias/event within a PMU and saves them into the evsel at event parse time. When no TPEBS data is available the default values are substituted in for TMA metrics that are using retirement latency events - currently just those on graniterapids. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-16-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-03-11 | perf parse-events: Corrections to topdown sorting | Ian Rogers
In the case of '{instructions,slots},faults,topdown-retiring' the first event that must be grouped, slots, is ignored causing the topdown-retiring event not to be adjacent to the group it needs to be inserted into. Don't ignore the group members when computing the force_grouped_index. Make the force_grouped_index be for the leader of the group it is within and always use it first rather than a group leader index so that topdown events may be sorted from one group into another. As the PMU name comparison applies to moving events in the same group ensure the name ordering is always respected. Change the group splitting logic to not group if there are no other topdown events and to fix cases where the force group leader wasn't being grouped with the other members of its group. Reported-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Closes: https://lore.kernel.org/lkml/20250224083306.71813-2-dapeng1.mi@linux.intel.com/ Closes: https://lore.kernel.org/lkml/f7e4f7e8-748c-4ec7-9088-0e844392c11a@linux.intel.com/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20250307023906.1135613-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24 | perf parse-events: Switch tracepoints to io_dir__readdir | Ian Rogers
Avoid DIR allocations when scanning sysfs by using io_dir for the readdir implementation, that allocates about 1kb on the stack. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250222061015.303622-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-04 | perf pmu: Rename name matching for no suffix or wildcard variants | Ian Rogers
Wildcard PMU naming will match a name like pmu_1 to a PMU name like pmu_10 but not to a PMU name like pmu_2 as the suffix forms part of the match. No suffix matching will match pmu_10 to either pmu_1 or pmu_2. Add or rename matching functions on PMU to make it clearer what kind of matching is being performed. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20250201074320.746259-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
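One way to read the two matching rules is sketched below. Both functions are hypothetical simplifications: the wildcard variant treats the requested name as a prefix pattern, and the no-suffix variant strips a trailing `_<digits>` from both names before comparing. The real suffix rules in perf are more involved than this, so treat the sketch only as an illustration of why `pmu_1` and `pmu_2` behave differently under the two schemes.

```python
import fnmatch
import re


def wildcard_match(pmu_name, to_match):
    """True if pmu_name matches to_match with an implicit trailing '*'."""
    return fnmatch.fnmatchcase(pmu_name, to_match + "*")


def _strip_suffix(name):
    # Assumption: a suffix is an underscore followed by decimal digits.
    return re.sub(r"_\d+$", "", name)


def no_suffix_match(pmu_name, to_match):
    """True if the two names are equal once numeric suffixes are dropped."""
    return _strip_suffix(pmu_name) == _strip_suffix(to_match)
```

Under the wildcard rule `pmu_1` matches the PMU `pmu_10` but not `pmu_2`, because the digit is part of the pattern. Under the no-suffix rule `pmu_10` matches both `pmu_1` and `pmu_2`, because only the `pmu` stem is compared.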
2024-12-18 | perf tools: Add aux-action config term | Adrian Hunter
Add a new common config term "aux-action" to use for configuring AUX area trace pause / resume. The value is a string that will be parsed in a subsequent patch. Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241216070244.14450-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09 | perf evsel: Allow evsel__newtp without libtraceevent | Ian Rogers
Switch from reading the tracepoint format to reading the id directly for the evsel config. This avoids the need to initialize libtraceevent, plugins, etc. It is sufficient for many tracepoint commands to work like:

    $ perf stat -e sched:sched_switch true

To populate evsel->tp_format, do lazy initialization using libtraceevent in the evsel__tp_format function (the sys and name are saved in evsel__newtp_idx for this purpose). Reading the id should be indicative of the format failing to load, but if not an error is reported in evsel__tp_format. This could happen for a tracepoint with a format that fails to parse.

As tracepoints can be parsed without libtraceevent with this, remove the associated #ifdefs in parse-events.c.

By only lazily parsing the tracepoint format information it is hoped this will help improve the performance of code using tracepoints but not the format information. It also cuts down on the build and ifdef logic.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Zixian Cai <fzczx123@gmail.com>
Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
Link: https://lore.kernel.org/r/20241118225345.889810-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-10-22 | perf tools: Do not set exclude_guest for precise_ip | Namhyung Kim
It seems perf sets the exclude_guest bit because of Intel PEBS implementation which uses a virtual address. IIUC now kernel disables PEBS when it goes to the guest mode regardless of this bit so we don't need to set it explicitly. At least for the other archs/vendors.

I found the commit 1342798cc13e set the exclude_guest for precise_ip in the tool and the commit 20b279ddb38c added kernel side enforcement which was reverted by commit a706d965dcfd later.

Actually it doesn't set the exclude_guest for the default event (cycles:P) already.

    $ grep -m1 vendor /proc/cpuinfo
    vendor_id : GenuineIntel

    $ perf record -e cycles:P true
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.002 MB perf.data (9 samples) ]

    $ perf evlist -v | tr ',' '\n' | grep -e exclude -e precise
     precise_ip: 3

But having lower 'p' modifier set the bit for some reason.

    $ perf record -e cycles:pp true
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.002 MB perf.data (9 samples) ]

    $ perf evlist -v | tr ',' '\n' | grep -e exclude -e precise
     precise_ip: 2
     exclude_guest: 1

Actually AMD IBS suffers from this because it doesn't support excludes and having this bit effectively disables new features in the current implementation (due to the missing feature check).

    $ grep -m1 vendor /proc/cpuinfo
    vendor_id : AuthenticAMD

    $ perf record -W -e cycles:p -vv true 2>&1 | grep switching
    switching off PERF_FORMAT_LOST support
    switching off weight struct support
    switching off bpf_event
    switching off ksymbol
    switching off cloexec flag
    switching off mmap2
    switching off exclude_guest, exclude_host

By not setting exclude_guest, we can fix this inconsistency and the troubles.

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Atish Patra <atishp@atishpatra.org>
Cc: Mingwei Zhang <mizhang@google.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20241016062359.264929-5-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-22 | perf tools: Don't set attr.exclude_guest by default | Namhyung Kim
The exclude_guest in the event attribute is to limit profiling in the host environment. But I'm not sure why we want to set it by default because we don't care about it in most cases and I feel like it just makes new PMU implementation complicated.

Of course it's useful for perf kvm command so I added the exclude_GH_default variable to preserve the old behavior for perf kvm and other commands like perf record and stat won't set the exclude bit.

This is helpful for AMD IBS case since having exclude_guest bit will clear new feature bit due to the missing feature check logic.

    $ sysctl kernel.perf_event_paranoid
    kernel.perf_event_paranoid = 0

    $ perf record -W -e ibs_op// -vv true 2>&1 | grep switching
    switching off PERF_FORMAT_LOST support
    switching off weight struct support
    switching off bpf_event
    switching off ksymbol
    switching off cloexec flag
    switching off mmap2
    switching off exclude_guest, exclude_host

Interestingly, I found it sets the exclude_bit if "u" modifier is used. I don't know why but it's neither intuitive nor consistent. Let's remove the bit there too.

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Atish Patra <atishp@atishpatra.org>
Cc: Mingwei Zhang <mizhang@google.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20241016062359.264929-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10 perf tool_pmu: Factor tool events into their own PMU Ian Rogers
Rather than treat tool events as a special kind of event, create a tool only PMU where the events/aliases match the existing duration_time, user_time and system_time events. Remove special parsing and printing support for the tool events, but add function calls for when PMU functions are called on a tool_pmu. Move the tool PMU code in evsel into tool_pmu.c to better encapsulate the tool event behavior in that file. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241002032016.333748-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10 perf parse-events: Expose/rename config_term_name Ian Rogers
Expose config_term_name as parse_events__term_type_str so that PMUs not in pmu.c may access it. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241002032016.333748-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10 perf pmu: Allow hardcoded terms to be applied to attributes Ian Rogers
Hard coded terms like "config=10" are skipped by perf_pmu__config assuming they were already applied to a perf_event_attr by parse event's config_attr function. When doing a reverse number to name lookup in perf_pmu__name_from_config, as the hardcoded terms aren't applied the config value is incorrect leading to misses or false matches. Fix this by adding a parameter to have perf_pmu__config apply hardcoded terms too (not just in parse event's config_term_common). Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241002032016.333748-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-26 perf evsel: Remove pmu_name Ian Rogers
"evsel->pmu_name" is only ever assigned a strdup of "pmu->name", a strdup of "evsel->pmu_name" or NULL. As such, prefer to use "pmu->name" directly and even to directly compare PMUs than PMU names. For safety, add some additional NULL tests. Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> [ Fix arm-spe.c usage of pmu_name and empty PMU name ] Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: James Clark <james.clark@linaro.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: ak@linux.intel.com Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20240926144851.245903-6-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-26 perf evsel: Add alternate_hw_config and use in evsel__match Ian Rogers
There are cases where we want to match events like instructions and cycles with legacy hardware values, in particular in stat-shadow's hard coded metrics. An evsel's name isn't a good point of reference as it gets altered, strstr would be too imprecise and re-parsing the event from its name is silly. Instead, hold the legacy hardware event name, determined during parsing, in the evsel for this matching case. Inline evsel__match2 that is only used in builtin-diff. Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: James Clark <james.clark@linaro.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Yunseong Kim <yskelg@gmail.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: ak@linux.intel.com Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20240926144851.245903-2-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-11 perf pmus: Fake PMU clean up Ian Rogers
Rather than passing a fake PMU around, just pass that the fake PMU should be used - true when doing testing. Move the fake PMU into pmus.[ch] and try to abstract the PMU's properties in pmu.c, ie so there is less "if fake_pmu" in non-PMU code. Give the fake PMU a made up type number. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Xu Yang <xu.yang_2@nxp.com> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20240907050830.6752-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-04 perf parse-events: Vary default_breakpoint_len on i386 and arm64 Ian Rogers
On arm64 the breakpoint length should be 4-bytes but 8-bytes is tolerated as perf passes that as sizeof(long). Just pass the correct value. On i386 the sizeof(long) check in the kernel needs to match the kernel's long size. Check using an environment (uname checks) whether 4 or 8 bytes needs to be passed. Cache the value in a static. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yang Jihong <yangjihong@bytedance.com> Link: https://lore.kernel.org/r/20240904050606.752788-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
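The per-architecture choice described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the names `sketch_i386_bp_len` and `sketch_default_breakpoint_len` are hypothetical, and using `uname(2)` with a "64" substring test on the machine string is an assumption about how the kernel's long size might be detected.

```c
#include <assert.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: given the kernel's machine string (as reported by
 * uname), guess whether the kernel's long is 8 or 4 bytes. The "64"
 * substring test is an assumption made for this sketch.
 */
static int sketch_i386_bp_len(const char *kernel_machine)
{
	assert(kernel_machine != NULL);
	return strstr(kernel_machine, "64") ? 8 : 4;
}

/* Hypothetical stand-in for a default breakpoint length helper. */
static int sketch_default_breakpoint_len(void)
{
#if defined(__i386__)
	static int len; /* cached so uname() runs at most once */

	if (len == 0) {
		struct utsname uts;

		if (uname(&uts) == 0)
			len = sketch_i386_bp_len(uts.machine);
		else
			len = (int)sizeof(long); /* fall back to userspace long */
	}
	return len;
#elif defined(__aarch64__)
	return 4; /* arm64 breakpoints are 4 bytes */
#else
	return (int)sizeof(long); /* previous default on other architectures */
#endif
}
```

A caller would use `sketch_default_breakpoint_len()` wherever `sizeof(long)` was previously passed as the breakpoint length.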
2024-09-04 perf parse-events: Add default_breakpoint_len helper Ian Rogers
The default breakpoint length is "sizeof(long)" however this is incorrect on platforms like Aarch64 where sizeof(long) is 8 but the breakpoint length is 4. Add a helper function that can be used to determine the correct breakpoint length, in this change it just returns the existing default sizeof(long) value. Use the helper in the bp_account test so that, when modifying the event from a watchpoint to a breakpoint, the breakpoint length is appropriate for the architecture and not just sizeof(long). Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yang Jihong <yangjihong@bytedance.com> Link: https://lore.kernel.org/r/20240904050606.752788-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-03 perf parse-events: Pass cpu_list as a perf_cpu_map in __add_event() Ian Rogers
Previously the cpu_list is a string and typically no cpu_list is passed to __add_event(). Wanting to make events have their cpus distinct from the PMU means that in more occasions we want to pass a cpu_list. If we're reading this from sysfs it is easier to read a perf_cpu_map than allocate and pass around strings that will later be parsed. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Gautham Shenoy <gautham.shenoy@amd.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Link: https://lore.kernel.org/r/20240718003025.1486232-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-12 perf parse-events: Add a retirement latency modifier Ian Rogers
Retirement latency is a separate sampled count used on newer Intel CPUs. Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Link: https://lore.kernel.org/r/20240720062102.444578-2-weilin.wang@intel.com Signed-off-by: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-06-10 perf evsel: Refactor tool events Ian Rogers
Tool events unnecessarily open a dummy perf event which is useless even with `perf record` which will still open a dummy event. Change the behavior of tool events so: - duration_time - call `rdclock` on open and then report the count as a delta since the start in evsel__read_counter. This moves code out of builtin-stat making it more general purpose. - user_time/system_time - open the fd as either `/proc/pid/stat` or `/proc/stat` for cases like system wide. evsel__read_counter will read the appropriate field out of the procfs file. These values were previously supplied by wait4, if the procfs read fails then the wait4 values are used, assuming the process/thread terminated. By reading user_time and system_time this way, interval mode, per PID and per CPU can be supported although there are restrictions given what the files provide (e.g. per PID can't be combined with per CPU). Opening any of the tool events for `perf record` is changed to return invalid. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linux.dev> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240503232849.17752-1-irogers@google.com
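As a rough illustration of reading user and system time "out of the procfs file", the sketch below extracts the utime and stime fields (fields 14 and 15 per proc(5)) from one `/proc/<pid>/stat` line. The function name `sketch_parse_stat_times` is hypothetical and this is not the evsel code itself; values are in clock ticks, and converting them to time (for example via `sysconf(_SC_CLK_TCK)`) is left to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: pull utime and stime (clock ticks) out of the text
 * of a /proc/<pid>/stat line. Returns 0 on success, -1 on a parse failure.
 */
static int sketch_parse_stat_times(const char *line,
				   unsigned long long *utime,
				   unsigned long long *stime)
{
	const char *p;

	assert(line && utime && stime);

	/* The comm field may contain spaces and ')', so resume after the last ')'. */
	p = strrchr(line, ')');
	if (!p)
		return -1;

	/* Skip fields 3..13 (state through cmajflt), then read fields 14 and 15. */
	if (sscanf(p + 1,
		   "%*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %llu %llu",
		   utime, stime) != 2)
		return -1;
	return 0;
}
```

In use, a caller might `fgets()` a line from `/proc/self/stat` at open and again at read time, and report the difference between the two utime (or stime) values.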
2024-06-06 perf: parse-events: Fix compilation error while defining DEBUG_PARSER Clément Le Goffic
Compiling perf tool with 'DEBUG_PARSER=1' leads to errors: $> make -C tools/perf PARSER_DEBUG=1 NO_LIBTRACEEVENT=1 ... CC util/expr-flex.o CC util/expr.o util/parse-events.c:33:12: error: redundant redeclaration of ‘parse_events_debug’ [-Werror=redundant-decls] 33 | extern int parse_events_debug; | ^~~~~~~~~~~~~~~~~~ In file included from util/parse-events.c:18: util/parse-events-bison.h:43:12: note: previous declaration of ‘parse_events_debug’ with type ‘int’ 43 | extern int parse_events_debug; | ^~~~~~~~~~~~~~~~~~ util/expr.c:27:12: error: redundant redeclaration of ‘expr_debug’ [-Werror=redundant-decls] 27 | extern int expr_debug; | ^~~~~~~~~~ In file included from util/expr.c:11: util/expr-bison.h:43:12: note: previous declaration of ‘expr_debug’ with type ‘int’ 43 | extern int expr_debug; | ^~~~~~~~~~ cc-1: all warnings being treated as errors Remove extern declaration from the parse-events.c file as there is a conflict with the ones generated using bison and yacc tools from the file parse-events.[ly]. Signed-off-by: Clément Le Goffic <clement.legoffic@foss.st.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: John Garry <john.g.garry@oracle.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240605140453.614862-1-clement.legoffic@foss.st.com
2024-05-26 Revert "perf parse-events: Prefer sysfs/JSON hardware events over legacy" Arnaldo Carvalho de Melo
This reverts commit 617824a7f0f73e4de325cf8add58e55b28c12493. This made a simple 'perf record -e cycles:pp make -j199' stop working on the Ampere ARM64 system Linus uses to test ARM64 kernels, as discussed at length in the threads in the Link tags below. The fix provided by Ian wasn't acceptable and work to fix this will take time we don't have at this point, so lets revert this and work on it on the next devel cycle. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Bhaskar Chowdhury <unixbhaskar@gmail.com> Cc: Ethan Adams <j.ethan.adams@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tycho Andersen <tycho@tycho.pizza> Cc: Yang Jihong <yangjihong@bytedance.com> Link: https://lore.kernel.org/lkml/CAHk-=wi5Ri=yR2jBVk-4HzTzpoAWOgstr1LEvg_-OXtJvXXJOA@mail.gmail.com Link: https://lore.kernel.org/lkml/CAHk-=wiWvtFyedDNpoV7a8Fq_FpbB+F5KmWK2xPY3QoYseOf_A@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10 perf parse-events: Add new 'fake_tp' parameter for tests Dominique Martinet
The next commit will allow tracepoints starting with digits, but most systems do not have any available by default so tests should skip the actual "check if it exists in /sys/kernel/debug/tracing" step. In order to do that, add a new boolean flag specifying if we should actually "format" the probe or not. Originally-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240510-perf_digit-v4-2-db1553f3233b@codewreck.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10 perf parse-events: pass parse_state to add_tracepoint Dominique Martinet
The next patch will add another flag to parse_state that we will want to pass to evsel__newtp_idx(), so pass the whole parse_state all the way down instead of giving only the index Originally-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240510-perf_digit-v4-1-db1553f3233b@codewreck.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-03 perf test pmu: Refactor format test and exposed test APIs Ian Rogers
In tests/pmu.c, make a common utility that creates a PMU in a mkdtemp directory and uses regular PMU parsing logic to load that PMU. Formats must still be eagerly loaded as by default the PMU code assumes devices are going to be in sysfs. In util/pmu.[ch], hide perf_pmu__format_parse but add the eager argument to perf_pmu__lookup called by perf_pmus__add_test_pmu. Later patches will eagerly load other non-sysfs files when eager loading is enabled. In tests/pmu.c, rather than manually constructing a list of term arguments, just use the term parsing code from a string. Add more comments and debug logging. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20240502213507.2339733-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26 perf parse-events: Tidy the setting of the default event name Ian Rogers
Add comments. Pass ownership of the event name to save on a strdup. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-17-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26 perf parse-events: Minor grouping tidy up Ian Rogers
Add comments. Ensure leader->group_name is freed before overwriting it. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-16-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26 perf parse-event: Constify event_symbol arrays Ian Rogers
Moves 352 bytes from .data to .data.rel.ro. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26 perf parse-events: Improvements to modifier parsing Ian Rogers
Use a struct/bitmap rather than a copied string from lexer. In lexer give improved error message when too many precise flags are given or repeated modifiers. Before: $ perf stat -e 'cycles:kuk' true event syntax error: 'cycles:kuk' \___ Bad modifier ... $ perf stat -e 'cycles:pppp' true event syntax error: 'cycles:pppp' \___ Bad modifier ... $ perf stat -e '{instructions:p,cycles:pp}:pp' -a true event syntax error: '..cycles:pp}:pp' \___ Bad modifier ... After: $ perf stat -e 'cycles:kuk' true event syntax error: 'cycles:kuk' \___ Duplicate modifier 'k' (kernel) ... $ perf stat -e 'cycles:pppp' true event syntax error: 'cycles:pppp' \___ Maximum precise value is 3 ... $ perf stat -e '{instructions:p,cycles:pp}:pp' true event syntax error: '..cycles:pp}:pp' \___ Maximum combined precise value is 3, adding precision to "cycles:pp" ... Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
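The idea of recording modifiers in a struct, so that repeats and over-precise requests can be diagnosed individually, might look roughly like this. It is a minimal sketch: `struct sketch_modifiers` and `sketch_parse_modifiers` are hypothetical names, only a handful of modifiers are handled, and the "Unrecognized modifier" text is invented for the example. The duplicate and precise-limit messages follow the "After" output quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parsed-modifier record; field names are illustrative only. */
struct sketch_modifiers {
	bool user;
	bool kernel;
	bool hypervisor;
	int precise; /* number of 'p' characters seen, at most 3 */
};

/* Returns 0 on success, or -1 with a message written to err. */
static int sketch_parse_modifiers(const char *str, struct sketch_modifiers *mod,
				  char *err, size_t err_len)
{
	assert(str && mod && err);
	memset(mod, 0, sizeof(*mod));

	for (const char *p = str; *p; p++) {
		bool *flag;
		const char *desc;

		switch (*p) {
		case 'u': flag = &mod->user; desc = "user"; break;
		case 'k': flag = &mod->kernel; desc = "kernel"; break;
		case 'h': flag = &mod->hypervisor; desc = "hypervisor"; break;
		case 'p':
			/* 'p' may repeat, but only up to a precise level of 3. */
			if (++mod->precise > 3) {
				snprintf(err, err_len, "Maximum precise value is 3");
				return -1;
			}
			continue;
		default:
			snprintf(err, err_len, "Unrecognized modifier '%c'", *p);
			return -1;
		}
		if (*flag) {
			snprintf(err, err_len, "Duplicate modifier '%c' (%s)", *p, desc);
			return -1;
		}
		*flag = true;
	}
	return 0;
}
```

Because each modifier has its own field, the parser can name the exact offending character instead of reporting a generic "Bad modifier".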
2024-04-26 perf parse-events: Inline parse_events_evlist_error Ian Rogers
Inline parse_events_evlist_error that is only used in parse_events_error. Modify parse_events_error to not report a parser error unless errors haven't already been reported. Make it clearer that the latter case only happens for unrecognized input. Before: $ perf stat -e 'cycles/period=99999999999999999999/' true event syntax error: 'cycles/period=99999999999999999999/' \___ parser error event syntax error: '..les/period=99999999999999999999/' \___ Bad base 10 number "99999999999999999999" Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events $ perf stat -e 'cycles:xyz' true event syntax error: 'cycles:xyz' \___ parser error Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events After: $ perf stat -e 'cycles/period=99999999999999999999/xyz' true event syntax error: '..les/period=99999999999999999999/xyz' \___ Bad base 10 number "99999999999999999999" Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events $ perf stat -e 'cycles:xyz' true event syntax error: 'cycles:xyz' \___ Unrecognized input Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. 
use 'perf list' to list available events Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26 perf parse-events: Inline parse_events_update_lists Ian Rogers
The helper function just wraps a splice and free. Making the free inline removes a comment, so then it just wraps a splice which we can make inline too. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26 perf parse-events: Prefer sysfs/JSON hardware events over legacy Ian Rogers
It was requested that RISC-V be able to add events to the perf tool so the PMU driver didn't need to map legacy events to config encodings: https://lore.kernel.org/lkml/20240217005738.3744121-1-atishp@rivosinc.com/ This change makes the priority of events specified without a PMU the same as those specified with a PMU, namely sysfs and JSON events are checked first before using the legacy encoding. The hw_term is made more generic as a hardware_event that encodes a pair of string and int value, allowing parse_events_multi_pmu_add to fall back on a known encoding when the sysfs/JSON adding fails for core events. As this covers PE_VALUE_SYM_HW, that token is removed and related code simplified. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-10-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26 perf parse-events: Constify parse_events_add_numeric Ian Rogers
Allow the term list to be const so that other functions can pass const term lists. Add const as necessary to called functions. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>