author Peter Zijlstra <peterz@infradead.org> 2023-10-09 23:04:25 +0200
committer Peter Zijlstra <peterz@infradead.org> 2023-10-12 19:28:38 +0200
commit f06cc667f79909e9175460b167c277b7c64d3df0
tree 866f158162b5de5c9cd37df4fbbefe917f273718 /include/linux/perf_event.h
parent 8f4156d58713b058e9aeebb28ffbe5f45ae57b47
perf: Optimize perf_cgroup_switch()

Namhyung reported that bd2756811766 ("perf: Rewrite core context
handling") regresses context switch overhead when perf-cgroup is in
use together with 'slow' PMUs like uncore.

Specifically, perf_cgroup_switch()'s perf_ctx_disable() /
ctx_sched_out() etc. all iterate the full list of active PMUs for
that CPU, even if they don't have cgroup events.

Previously there was cgrp_cpuctx_list which linked the relevant PMUs
together, but that got lost in the rework. Instead of re-introducing
a similar list, let the perf_event_pmu_context iteration skip those
that do not have cgroup events. This avoids growing multiple versions
of the perf_event_pmu_context iteration.

Measured performance (on a slightly different patch):

Before)

  $ taskset -c 0 ./perf bench sched pipe -l 10000 -G AAA,BBB
  # Running 'sched/pipe' benchmark:
  # Executed 10000 pipe operations between two processes

       Total time: 0.901 [sec]

        90.128700 usecs/op
            11095 ops/sec

After)

  $ taskset -c 0 ./perf bench sched pipe -l 10000 -G AAA,BBB
  # Running 'sched/pipe' benchmark:
  # Executed 10000 pipe operations between two processes

       Total time: 0.065 [sec]

         6.560100 usecs/op
           152436 ops/sec

Fixes: bd2756811766 ("perf: Rewrite core context handling")
Reported-by: Namhyung Kim <namhyung@kernel.org>
Debugged-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20231009210425.GC6307@noisy.programming.kicks-ass.net

Diffstat (limited to 'include/linux/perf_event.h')
-rw-r--r-- include/linux/perf_event.h | 1
1 files changed, 1 insertions, 0 deletions
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index f31f962a6445..0367d748fae0 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -878,6 +878,7 @@ struct perf_event_pmu_context {
 	unsigned int			embedded : 1;
 	unsigned int			nr_events;
+	unsigned int			nr_cgroups;
 	atomic_t			refcount; /* event <-> epc */
 	struct rcu_head			rcu_head;
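
The hunk above only adds the counter; the code that consults it is not part of this header diff. As a rough user-space sketch of the idea described in the commit message (skip per-PMU contexts that have no cgroup events during a cgroup switch), something like the following could apply. The struct epc_model type, the visit_contexts() helper, and the cgroup_only flag are hypothetical names made up for illustration and are not the kernel's actual structures or functions.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, heavily simplified stand-in for
 * struct perf_event_pmu_context. Only the two counters
 * matter here; the singly linked list is an assumption.
 */
struct epc_model {
	const char *name;
	unsigned int nr_events;
	unsigned int nr_cgroups;	/* models the field this patch adds */
	struct epc_model *next;
};

/*
 * Walk every context in the list. When cgroup_only is set,
 * contexts without cgroup events are skipped, so a 'slow' PMU
 * with nr_cgroups == 0 is never touched on that path.
 * Returns the number of contexts actually visited.
 */
static size_t visit_contexts(struct epc_model *head, bool cgroup_only)
{
	size_t visited = 0;

	for (struct epc_model *epc = head; epc; epc = epc->next) {
		if (cgroup_only && !epc->nr_cgroups)
			continue;
		visited++;	/* real code would disable/schedule here */
	}
	return visited;
}
```

With one context that has cgroup events and one that does not, a cgroup-only walk visits a single context while a full walk visits both. Reusing one iteration helper with a flag, rather than keeping a second list, matches the commit message's stated goal of not growing multiple versions of the iteration.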