summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorkuyo chang <kuyo.chang@mediatek.com>2025-06-15 21:10:56 +0800
committerPeter Zijlstra <peterz@infradead.org>2025-08-26 10:46:01 +0200
commit421fc59cf58c64f898cafbbbbda0bc705837e7df (patch)
tree46e9d76189b1bbd7f4eb97f004a8db6d026215e7
parentbb4700adc3abec34c0a38b64f66258e4e233fc16 (diff)
sched/deadline: Fix RT task potential starvation when expiry time passed
[Symptom] The fair server mechanism, which is intended to prevent fair starvation when higher-priority tasks monopolize the CPU. Specifically, RT tasks on the runqueue may not be scheduled as expected. [Analysis] The log "sched: DL replenish lagged too much" triggered. By memory dump of dl_server: curr = 0xFFFFFF80D6A0AC00 ( dl_server = 0xFFFFFF83CD5B1470( dl_runtime = 0x02FAF080, dl_deadline = 0x3B9ACA00, dl_period = 0x3B9ACA00, dl_bw = 0xCCCC, dl_density = 0xCCCC, runtime = 0x02FAF080, deadline = 0x0000082031EB0E80, flags = 0x0, dl_throttled = 0x0, dl_yielded = 0x0, dl_non_contending = 0x0, dl_overrun = 0x0, dl_server = 0x1, dl_server_active = 0x1, dl_defer = 0x1, dl_defer_armed = 0x0, dl_defer_running = 0x1, dl_timer = ( node = ( expires = 0x000008199756E700), _softexpires = 0x000008199756E700, function = 0xFFFFFFDB9AF44D30 = dl_task_timer, base = 0xFFFFFF83CD5A12C0, state = 0x0, is_rel = 0x0, is_soft = 0x0, clock_update_flags = 0x4, clock = 0x000008204A496900, - The timer expiration time (rq->curr->dl_server->dl_timer->expires) is already in the past, indicating the timer has expired. - The timer state (rq->curr->dl_server->dl_timer->state) is 0. [Suspected Root Cause] The relevant code flow in the throttle path of update_curr_dl_se() as follows: dequeue_dl_entity(dl_se, 0); // the DL entity is dequeued if (unlikely(is_dl_boosted(dl_se) || !start_dl_timer(dl_se))) { if (dl_server(dl_se)) // timer registration fails enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH);//enqueue immediately ... } The failure of `start_dl_timer` is caused by attempting to register a timer with an expiration time that is already in the past. When this situation persists, the code repeatedly re-enqueues the DL entity without properly replenishing or restarting the timer, resulting in RT task may not be scheduled as expected. [Proposed Solution]: Instead of immediately re-enqueuing the DL entity on timer registration failure, this change ensures the DL entity is properly replenished and the timer is restarted, preventing RT potential starvation. Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers") Signed-off-by: kuyo chang <kuyo.chang@mediatek.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Closes: https://lore.kernel.org/CAMuHMdXn4z1pioTtBGMfQM0jsLviqS2jwysaWXpoLxWYoGa82w@mail.gmail.com Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Jiri Slaby <jirislaby@kernel.org> Tested-by: Diederik de Haas <didi.debian@cknow.org> Link: https://lkml.kernel.org/r/20250615131129.954975-1-kuyo.chang@mediatek.com
-rw-r--r--kernel/sched/deadline.c8
1 files changed, 5 insertions, 3 deletions
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index bb813afe5b08..88c3bd64a8a0 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1496,10 +1496,12 @@ throttle:
}
if (unlikely(is_dl_boosted(dl_se) || !start_dl_timer(dl_se))) {
- if (dl_server(dl_se))
- enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH);
- else
+ if (dl_server(dl_se)) {
+ replenish_dl_new_period(dl_se, rq);
+ start_dl_timer(dl_se);
+ } else {
enqueue_task_dl(rq, dl_task_of(dl_se), ENQUEUE_REPLENISH);
+ }
}
if (!is_leftmost(dl_se, &rq->dl))