summaryrefslogtreecommitdiff
path: root/kernel/sched/idle.c
diff options
context:
space:
mode:
authorRafael J. Wysocki <rafael.j.wysocki@intel.com>2018-03-15 23:05:50 +0100
committerRafael J. Wysocki <rafael.j.wysocki@intel.com>2018-04-05 19:01:14 +0200
commit2aaf709a518d26563b80fd7a42379d7aa7ffed4a (patch)
treef58fc91024e9511ffed664ba51dc2f246e09ad1e /kernel/sched/idle.c
parent0e7767687fdabfc58d5046e7488632bf2ecd4d0c (diff)
sched: idle: Do not stop the tick upfront in the idle loop
Push the decision whether or not to stop the tick somewhat deeper into the idle loop. Stopping the tick upfront leads to unpleasant outcomes in case the idle governor doesn't agree with the nohz code on the duration of the upcoming idle period. Specifically, if the tick has been stopped and the idle governor predicts short idle, the situation is bad regardless of whether or not the prediction is accurate. If it is accurate, the tick has been stopped unnecessarily which means excessive overhead. If it is not accurate, the CPU is likely to spend too much time in the (shallow, because short idle has been predicted) idle state selected by the governor [1]. As the first step towards addressing this problem, change the code to make the tick stopping decision inside of the loop in do_idle(). In particular, do not stop the tick in the cpu_idle_poll() code path. Also don't do that in tick_nohz_irq_exit() which doesn't really have enough information on whether or not to stop the tick. Link: https://marc.info/?l=linux-pm&m=150116085925208&w=2 # [1] Link: https://tu-dresden.de/zih/forschung/ressourcen/dateien/projekte/haec/powernightmares.pdf Suggested-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Diffstat (limited to 'kernel/sched/idle.c')
-rw-r--r--kernel/sched/idle.c9
1 files changed, 6 insertions, 3 deletions
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index c0bc111878e6..3777e83c0b5a 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -216,13 +216,13 @@ static void do_idle(void)
__current_set_polling();
tick_nohz_idle_enter();
- tick_nohz_idle_stop_tick_protected();
while (!need_resched()) {
check_pgt_cache();
rmb();
if (cpu_is_offline(cpu)) {
+ tick_nohz_idle_stop_tick_protected();
cpuhp_report_idle_dead();
arch_cpu_idle_dead();
}
@@ -236,10 +236,13 @@ static void do_idle(void)
* broadcast device expired for us, we don't want to go deep
* idle as we know that the IPI is going to arrive right away.
*/
- if (cpu_idle_force_poll || tick_check_broadcast_expired())
+ if (cpu_idle_force_poll || tick_check_broadcast_expired()) {
+ tick_nohz_idle_restart_tick();
cpu_idle_poll();
- else
+ } else {
+ tick_nohz_idle_stop_tick();
cpuidle_idle_call();
+ }
arch_cpu_idle_exit();
}