author     Alexei Starovoitov <ast@kernel.org>    2025-03-03 17:40:14 -0800
committer  Alexei Starovoitov <ast@kernel.org>    2025-03-15 11:48:28 -0700
commit     3a6fa573c50f31d6ab8c8c3318a68d511c79f8fb (patch)
tree       b109c094280af113cda230fce761407b94cd9127 /kernel/bpf/core.c
parent     2941e215376399d5e71eddcd720f185e28ba2dbb (diff)
parent     2fb761823eadf2fdfb6fdf146c4b94807b4bc3ba (diff)
Merge branch 'timed-may_goto'
Kumar Kartikeya Dwivedi says:
====================
Timed may_goto
This series replaces the current implementation of cond_break, which
uses the may_goto instruction and counts 8 million iterations per stack
frame, with an implementation based on sampling time locally on the
CPU. This permits a loop to run for a longer time per program
invocation. The accounting is still done per stack frame, but the count
is now used to amortize the cost of the logic that samples and checks
the time spent since the start.
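Conceptually, each may_goto site now behaves roughly as follows (a
sketch of the semantics, not the actual JIT output: the frame variable
and break_target label are illustrative, and in the real lowering the
per-frame state lives on the BPF stack with its address passed in
BPF_REG_AX):

	/* Per-stack-frame state: a count plus a timestamp. */
	struct bpf_timed_may_goto frame = {};

	if (frame.count > 0) {
		/* Fast path: just decrement the count. */
		frame.count--;
	} else {
		/* Slow path, hit once every BPF_MAX_TIMED_LOOPS
		 * iterations: sample the clock via
		 * bpf_check_timed_may_goto(), which refreshes the count
		 * or returns 0 once the time slice has expired.
		 */
		frame.count = bpf_check_timed_may_goto(&frame);
		if (frame.count == 0)
			goto break_target;	/* cond_break fires */
	}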
This is needed for expressing more complicated algorithms (spin locks,
waiting loops, etc.) in BPF programs without false positive expiration
of the loop. For instance, the plan is to make use of this for
implementing spin locks for BPF arena [0].
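As an illustration, such a waiting loop could look as follows in BPF C
(a sketch only: cond_break comes from the selftests'
bpf_experimental.h, while the flag variable, section name, and program
are made up; in the arena spin-lock use case the waited-on location
would live in the BPF arena instead):

	#include <vmlinux.h>
	#include <bpf/bpf_helpers.h>
	#include "bpf_experimental.h"

	/* Hypothetical global the loop waits on. */
	volatile int flag;

	SEC("syscall")
	int wait_for_flag(void *ctx)
	{
		/* Spin until flag becomes non-zero; cond_break terminates
		 * the loop once the time budget for this frame expires.
		 */
		while (!flag)
			cond_break;
		return 0;
	}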
Consider the following loop:
for (int i = 0;; i++) {}
Testing on a bare-metal Sapphire Rapids Intel server yields the following
table (taking an average of 25 runs).
+-----------------------------+--------------+--------------+------------------+
| Loop type | Iterations | Time (ms) | Time/iter (ns) |
+-----------------------------+--------------+--------------+------------------+
| may_goto | 8388608 | 3 | 0.36 |
| timed_may_goto (count=65535)| 589674932 | 250 | 0.42 |
| bpf_for | 8388608 | 10 | 1.19 |
+-----------------------------+--------------+--------------+------------------+
Here, count is used to amortize the cost of the time sampling and
checking logic. Naturally, these numbers represent the empty-loop
limit; with a non-trivial loop body, the time spent in the loop will be
longer. Cancellations will address the task of imposing an upper bound
on program runtime.
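As a cross-check, the timed variant's numbers are internally
consistent: the loop runs until the 250ms time slice (NSEC_PER_SEC / 4
in the patch below) expires, and

	250,000,000 ns / 589,674,932 iterations ~= 0.42 ns/iteration

which matches the last column of the table.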
For now, the implementation only supports x86.
[0]: https://lore.kernel.org/bpf/20250118162238.2621311-1-memxor@gmail.com
Changelog:
----------
v1 -> v2
v1: https://lore.kernel.org/bpf/20250302201348.940234-1-memxor@gmail.com
* Address comments from Alexei
  * Use kernel comment style for new code.
  * Remove p->count == 0 check in bpf_check_timed_may_goto.
  * Add comments on AX as argument/retval calling convention.
  * Add comments describing how the counting logic works.
  * Use BPF_EMIT_CALL instead of open-coding instruction encoding.
  * Change "if ax != 1 goto pc+X" condition to "if ax != 0 goto pc+X".
====================
Link: https://patch.msgid.link/20250304003239.2390751-1-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Diffstat (limited to 'kernel/bpf/core.c')
-rw-r--r--	kernel/bpf/core.c | 26
1 file changed, 26 insertions, 0 deletions
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index a0200fbbace9..e583c19a0291 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -3069,6 +3069,32 @@ void __weak arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp,
 {
 }
 
+bool __weak bpf_jit_supports_timed_may_goto(void)
+{
+	return false;
+}
+
+u64 __weak arch_bpf_timed_may_goto(void)
+{
+	return 0;
+}
+
+u64 bpf_check_timed_may_goto(struct bpf_timed_may_goto *p)
+{
+	u64 time = ktime_get_mono_fast_ns();
+
+	/* Populate the timestamp for this stack frame, and refresh count. */
+	if (!p->timestamp) {
+		p->timestamp = time;
+		return BPF_MAX_TIMED_LOOPS;
+	}
+	/* Check if we've exhausted our time slice, and zero count. */
+	if (time - p->timestamp >= (NSEC_PER_SEC / 4))
+		return 0;
+	/* Refresh the count for the stack frame. */
+	return BPF_MAX_TIMED_LOOPS;
+}
+
 /* for configs without MMU or 32-bit */
 __weak const struct bpf_map_ops arena_map_ops;
 __weak u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena)
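Two notes on the above, inferred from the patch rather than stated in
the cover letter: ktime_get_mono_fast_ns() is the NMI-safe monotonic
clock accessor, so the check can run from whatever context the BPF
program executes in, and the __weak definitions mean architectures
without JIT support transparently keep the old count-based cond_break.
An architecture opts in by overriding both weak hooks, roughly as
follows (a sketch; on x86 the trampoline part of this series is
implemented in assembly so it can preserve live registers around the
call):

	/* In the arch JIT: advertise support for timed may_goto. */
	bool bpf_jit_supports_timed_may_goto(void)
	{
		return true;
	}

	/* arch_bpf_timed_may_goto() is then expected to be a thin
	 * trampoline that saves clobbered registers, calls
	 * bpf_check_timed_may_goto() with the per-frame state pointer
	 * taken from BPF_REG_AX, and returns the refreshed count in AX.
	 */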