path: root/kernel/sched/clock.c
Age    Commit message    Author
2023-06-27Merge tag 'locking-core-2023-06-27' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:

 - Introduce cmpxchg128() -- aka. the demise of cmpxchg_double()

   The cmpxchg128() family of functions is basically & functionally the
   same as cmpxchg_double(), but with a saner interface.

   Instead of a 6-parameter horror that forced u128 - u64/u64-halves
   layout details on the interface and exposed users to complexity,
   fragility & bugs, use a natural 3-parameter interface with u128 types.

 - Restructure the generated atomic headers, and add kerneldoc comments
   for all of the generic atomic{,64,_long}_t operations.

   The generated definitions are much cleaner now, and come with
   documentation.

 - Implement lock_set_cmp_fn() on lockdep, for defining an ordering when
   taking multiple locks of the same type.

   This gets rid of one use of lockdep_set_novalidate_class() in the
   bcache code.

 - Fix raw_cpu_generic_try_cmpxchg() bug due to an unintended variable
   shadowing generating garbage code on Clang on certain ARM builds.

* tag 'locking-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
  locking/atomic: scripts: fix ${atomic}_dec_if_positive() kerneldoc
  percpu: Fix self-assignment of __old in raw_cpu_generic_try_cmpxchg()
  locking/atomic: treewide: delete arch_atomic_*() kerneldoc
  locking/atomic: docs: Add atomic operations to the driver basic API documentation
  locking/atomic: scripts: generate kerneldoc comments
  docs: scripts: kernel-doc: accept bitwise negation like ~@var
  locking/atomic: scripts: simplify raw_atomic*() definitions
  locking/atomic: scripts: simplify raw_atomic_long*() definitions
  locking/atomic: scripts: split pfx/name/sfx/order
  locking/atomic: scripts: restructure fallback ifdeffery
  locking/atomic: scripts: build raw_atomic_long*() directly
  locking/atomic: treewide: use raw_atomic*_<op>()
  locking/atomic: scripts: add trivial raw_atomic*_<op>()
  locking/atomic: scripts: factor out order template generation
  locking/atomic: scripts: remove leftover "${mult}"
  locking/atomic: scripts: remove bogus order parameter
  locking/atomic: xtensa: add preprocessor symbols
  locking/atomic: x86: add preprocessor symbols
  locking/atomic: sparc: add preprocessor symbols
  locking/atomic: sh: add preprocessor symbols
  ...
2023-06-05sched/clock: Provide local_clock_noinstr()Peter Zijlstra
Now that all ARCH_WANTS_NO_INSTR architectures (arm64, loongarch, s390, x86) provide sched_clock_noinstr(), use this to provide local_clock_noinstr().

This local_clock_noinstr() will be safe to use from noinstr code with the assumption that any such noinstr code is non-preemptible (it had better be, entry code will have IRQs disabled while __cpuidle must have preemption disabled).

Specifically, preempt_enable_notrace(), a common part of many a sched_clock() implementation, calls out to schedule() -- even though, per the above, it will never trigger -- which frustrates noinstr validation:

  vmlinux.o: warning: objtool: local_clock+0xb5: call to preempt_schedule_notrace_thunk() leaves .noinstr.text section

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Michael Kelley <mikelley@microsoft.com> # Hyper-V Link: https://lore.kernel.org/r/20230519102715.978624636@infradead.org
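As a rough, hedged sketch of the resulting shape (simplified; not the exact kernel/sched/clock.c code, and the helper names below are illustrative):

	/* Caller must be non-preemptible (entry code, __cpuidle). */
	noinstr u64 local_clock_noinstr_sketch(void)
	{
		return sched_clock_noinstr();	/* arch-provided, noinstr-safe */
	}

	/* The preemptible wrapper keeps using the notrace preempt helpers. */
	notrace u64 local_clock_sketch(void)
	{
		u64 now;

		preempt_disable_notrace();
		now = local_clock_noinstr_sketch();
		/* This is the call objtool flags when it ends up in noinstr text: */
		preempt_enable_notrace();

		return now;
	}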
2023-06-05locking/atomic: treewide: use raw_atomic*_<op>()Mark Rutland
Now that we have raw_atomic*_<op>() definitions, there's no need to use arch_atomic*_<op>() definitions outside of the low-level atomic definitions. Move treewide users of arch_atomic*_<op>() over to the equivalent raw_atomic*_<op>(). There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-19-mark.rutland@arm.com
2023-04-21sched/clock: Fix local_clock() before sched_clock_init()Aaron Thompson
Have local_clock() return sched_clock() if sched_clock_init() has not yet run. sched_clock_cpu() has this check but it was not included in the new noinstr implementation of local_clock().

The effect can be seen on x86 with CONFIG_PRINTK_TIME enabled, for instance. scd->clock quickly reaches the value of TICK_NSEC and that value is returned until sched_clock_init() runs.

dmesg without this patch:

  [ 0.000000] kvm-clock: ...
  [ 0.000002] kvm-clock: ...
  [ 0.000672] clocksource: ...
  [ 0.001000] tsc: ...
  [ 0.001000] e820: ...
  [ 0.001000] e820: ...
  ...
  [ 0.001000] ..TIMER: ...
  [ 0.001000] clocksource: ...
  [ 0.378956] Calibrating delay loop ...
  [ 0.379955] pid_max: ...

dmesg with this patch:

  [ 0.000000] kvm-clock: ...
  [ 0.000001] kvm-clock: ...
  [ 0.000675] clocksource: ...
  [ 0.002685] tsc: ...
  [ 0.003331] e820: ...
  [ 0.004190] e820: ...
  ...
  [ 0.421939] ..TIMER: ...
  [ 0.422842] clocksource: ...
  [ 0.424582] Calibrating delay loop ...
  [ 0.425580] pid_max: ...

Fixes: 776f22913b8e ("sched/clock: Make local_clock() noinstr")
Signed-off-by: Aaron Thompson <dev@aaront.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230413175012.2201-1-dev@aaront.org
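A minimal sketch of the reinstated check (an assumed simplification of the noinstr local_clock() path, not the exact clock.c code):

	static u64 local_clock_sketch(void)
	{
		/*
		 * Before sched_clock_init() has run, fall back to the raw
		 * sched_clock() so early printk timestamps keep advancing
		 * instead of sticking at TICK_NSEC.
		 */
		if (!static_branch_likely(&sched_clock_running))
			return sched_clock();

		return sched_clock_cpu(raw_smp_processor_id());
	}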
2023-01-31sched/clock: Make local_clock() noinstrPeter Zijlstra
With sched_clock() noinstr, provide a noinstr implementation of local_clock(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230126151323.760767043@infradead.org
2022-05-19sched/clock: Use try_cmpxchg64 in sched_clock_{local,remote}Uros Bizjak
Use try_cmpxchg64() instead of cmpxchg64(*ptr, old, new) != old in sched_clock_{local,remote}. x86 cmpxchg returns success in the ZF flag, so this change saves a compare after cmpxchg (and the related move instruction in front of cmpxchg).

Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220518184953.3446778-1-ubizjak@gmail.com
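A hedged sketch of the conversion pattern (an illustrative helper, not the actual sched_clock_local()/sched_clock_remote() bodies):

	static u64 publish_clock_old(u64 *ptr, u64 val)
	{
		u64 old = READ_ONCE(*ptr);

		/* old style: compare the value cmpxchg64() returns ourselves */
		while (cmpxchg64(ptr, old, val) != old)
			old = READ_ONCE(*ptr);

		return val;
	}

	static u64 publish_clock_new(u64 *ptr, u64 val)
	{
		u64 old = READ_ONCE(*ptr);

		/* new style: try_cmpxchg64() returns a bool and refreshes 'old' */
		while (!try_cmpxchg64(ptr, &old, val))
			;

		return val;
	}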
2022-02-23sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c ↵Ingo Molnar
files there

Collect all utility functionality source code files into a single kernel/sched/build_utility.c file, via #include-ing the .c files:

    kernel/sched/clock.c
    kernel/sched/completion.c
    kernel/sched/loadavg.c
    kernel/sched/swait.c
    kernel/sched/wait_bit.c
    kernel/sched/wait.c

    CONFIG_CPU_FREQ:                kernel/sched/cpufreq.c
    CONFIG_CPU_FREQ_GOV_SCHEDUTIL:  kernel/sched/cpufreq_schedutil.c
    CONFIG_CGROUP_CPUACCT:          kernel/sched/cpuacct.c
    CONFIG_SCHED_DEBUG:             kernel/sched/debug.c
    CONFIG_SCHEDSTATS:              kernel/sched/stats.c
    CONFIG_SMP:                     kernel/sched/cpupri.c
                                    kernel/sched/stop_task.c
                                    kernel/sched/topology.c
    CONFIG_SCHED_CORE:              kernel/sched/core_sched.c
    CONFIG_PSI:                     kernel/sched/psi.c
    CONFIG_MEMBARRIER:              kernel/sched/membarrier.c
    CONFIG_CPU_ISOLATION:           kernel/sched/isolation.c
    CONFIG_SCHED_AUTOGROUP:         kernel/sched/autogroup.c

The goal is to amortize the 60+ KLOC header bloat from over a dozen build units into a single build unit. The build time of build_utility.c also roughly matches the build time of core.c and fair.c - allowing better load-balancing of scheduler-only rebuilds.

Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Peter Zijlstra <peterz@infradead.org>
2022-02-23sched/headers: sched/clock: Mark all functions 'notrace', remove ↵Ingo Molnar
CC_FLAGS_FTRACE build asymmetry

Mark all non-init functions in kernel/sched/clock.c as 'notrace', instead of turning them all off via CC_FLAGS_FTRACE. This is going to allow the treatment of this file as any other scheduler file, and it can be #include-ed in compound compilation units as well.

Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Peter Zijlstra <peterz@infradead.org>
2021-03-22sched: Fix various typosIngo Molnar
Fix ~42 single-word typos in scheduler code comments. We have accumulated a few fun ones over the years. :-) Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: linux-kernel@vger.kernel.org
2019-11-29sched/clock: Use static_branch_likely() with sched_clock_runningZhenzhong Duan
sched_clock_running is enabled early at bootup stage and never disabled. So hint that to the compiler by using static_branch_likely() rather than static_branch_unlikely().

The branch probability mis-annotation was introduced in the original commit that converted the plain sched_clock_running flag to a static key:

  46457ea464f5 ("sched/clock: Use static key for sched_clock_running")

Steve further notes:

| Looks like the confusion was the moving of the "!":
|
| -       if (unlikely(!sched_clock_running))
| +       if (!static_branch_unlikely(&sched_clock_running))
|
| Where, it was unlikely that !sched_clock_running would be true, but
| because the "!" was moved outside the "unlikely()" it makes the test
| "likely()". That is, if we added an intermediate step, it would have
| been:
|
|        if (!likely(sched_clock_running))
|
| which would have prevented the mistake that this patch fixes.

[ mingo: Edited the changelog. ]

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bsegall@google.com Cc: dietmar.eggemann@arm.com Cc: juri.lelli@redhat.com Cc: mgorman@suse.de Cc: vincent.guittot@linaro.org Link: https://lkml.kernel.org/r/1574843848-26825-1-git-send-email-zhenzhong.duan@oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
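A hedged illustration of the annotation fix (the guard and helper are simplified, not the exact clock.c code):

	static DEFINE_STATIC_KEY_FALSE(sched_clock_running);

	static u64 read_clock_sketch(void)
	{
		/*
		 * The key is enabled early in boot and never disabled, so the
		 * "not yet running" branch is the cold one.  Because the "!"
		 * sits outside the hint, static_branch_unlikely() told the
		 * jump-label machinery the opposite; _likely() restores the
		 * intended code layout.
		 */
		if (!static_branch_likely(&sched_clock_running))
			return 0;	/* early-boot fallback */

		return sched_clock();
	}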
2019-05-21treewide: Add SPDX license identifier for missed filesThomas Gleixner
Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-30sched/clock: Disable interrupts when calling generic_sched_clock_init()Pavel Tatashin
sched_clock_init() used to be called early during boot when interrupts were still disabled. After the recent changes to utilize sched clock early the sched_clock_init() call happens when interrupts are already enabled, which triggers the following warning:

  WARNING: CPU: 0 PID: 0 at kernel/time/sched_clock.c:180 sched_clock_register+0x44/0x278
  [<c001a13c>] (warn_slowpath_null) from [<c052367c>] (sched_clock_register+0x44/0x278)
  [<c052367c>] (sched_clock_register) from [<c05238d8>] (generic_sched_clock_init+0x28/0x88)
  [<c05238d8>] (generic_sched_clock_init) from [<c0521a00>] (sched_clock_init+0x54/0x74)
  [<c0521a00>] (sched_clock_init) from [<c0519c18>] (start_kernel+0x310/0x3e4)
  [<c0519c18>] (start_kernel) from [<00000000>] ( (null))

Disable IRQs for the duration of generic_sched_clock_init().

Fixes: 857baa87b642 ("sched/clock: Enable sched clock early")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: steven.sistare@oracle.com Cc: daniel.m.jordan@oracle.com Link: https://lkml.kernel.org/r/20180730135252.24599-1-pasha.tatashin@oracle.com
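A minimal sketch of the fix as described (assuming the call site in sched_clock_init(); simplified, not the full function):

	void __init sched_clock_init_sketch(void)
	{
		unsigned long flags;

		/* sched_clock_register() expects IRQs off, so wrap the call: */
		local_irq_save(flags);
		generic_sched_clock_init();
		local_irq_restore(flags);
	}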
2018-07-20sched/clock: Close a hole in sched_clock_init()Peter Zijlstra
All data required for the 'unstable' sched_clock must be set-up _before_ enabling it -- setting sched_clock_running. This includes the __gtod_offset but also a recent scd stamp.

Make the gtod-offset update also set the scd stamp -- it requires the same two clock reads _anyway_. This doesn't hurt in the sched_clock_tick_stable() case and ensures sched_clock_init() gets everything set-up before use.

Also switch to unconditional IRQ-disable/enable because the static key stuff already requires this is not run with IRQs disabled.

Fixes: 857baa87b642 ("sched/clock: Enable sched clock early")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: steven.sistare@oracle.com Cc: daniel.m.jordan@oracle.com Cc: linux@armlinux.org.uk Cc: schwidefsky@de.ibm.com Cc: heiko.carstens@de.ibm.com Cc: john.stultz@linaro.org Cc: sboyd@codeaurora.org Cc: hpa@zytor.com Cc: douly.fnst@cn.fujitsu.com Cc: prarit@redhat.com Cc: feng.tang@intel.com Cc: pmladek@suse.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: linux-s390@vger.kernel.org Cc: boris.ostrovsky@oracle.com Cc: jgross@suse.com Cc: pbonzini@redhat.com Link: https://lkml.kernel.org/r/20180720080911.GM2494@hirez.programming.kicks-ass.net
2018-07-20sched/clock: Use static key for sched_clock_runningPavel Tatashin
sched_clock_running may be read every time sched_clock_cpu() is called. Yet, this variable is updated only twice during boot, and never changes again, therefore it is better to make it a static key. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: steven.sistare@oracle.com Cc: daniel.m.jordan@oracle.com Cc: linux@armlinux.org.uk Cc: schwidefsky@de.ibm.com Cc: heiko.carstens@de.ibm.com Cc: john.stultz@linaro.org Cc: sboyd@codeaurora.org Cc: hpa@zytor.com Cc: douly.fnst@cn.fujitsu.com Cc: prarit@redhat.com Cc: feng.tang@intel.com Cc: pmladek@suse.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: linux-s390@vger.kernel.org Cc: boris.ostrovsky@oracle.com Cc: jgross@suse.com Cc: pbonzini@redhat.com Link: https://lkml.kernel.org/r/20180719205545.16512-25-pasha.tatashin@oracle.com
2018-07-20sched/clock: Enable sched clock earlyPavel Tatashin
Allow sched_clock() to be used before sched_clock_init() is called. This provides a way to get early boot timestamps on machines with unstable clocks. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: steven.sistare@oracle.com Cc: daniel.m.jordan@oracle.com Cc: linux@armlinux.org.uk Cc: schwidefsky@de.ibm.com Cc: heiko.carstens@de.ibm.com Cc: john.stultz@linaro.org Cc: sboyd@codeaurora.org Cc: hpa@zytor.com Cc: douly.fnst@cn.fujitsu.com Cc: peterz@infradead.org Cc: prarit@redhat.com Cc: feng.tang@intel.com Cc: pmladek@suse.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: linux-s390@vger.kernel.org Cc: boris.ostrovsky@oracle.com Cc: jgross@suse.com Cc: pbonzini@redhat.com Link: https://lkml.kernel.org/r/20180719205545.16512-24-pasha.tatashin@oracle.com
2018-07-20sched/clock: Move sched clock initialization and merge with generic clockPavel Tatashin
sched_clock_postinit() initializes a generic clock on systems where no other clock is provided. This function may be called only after timekeeping_init(). Rename sched_clock_postinit() to generic_clock_init() and call it from sched_clock_init(). Move the call to sched_clock_init() until after time_init(). Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: steven.sistare@oracle.com Cc: daniel.m.jordan@oracle.com Cc: linux@armlinux.org.uk Cc: schwidefsky@de.ibm.com Cc: heiko.carstens@de.ibm.com Cc: john.stultz@linaro.org Cc: sboyd@codeaurora.org Cc: hpa@zytor.com Cc: douly.fnst@cn.fujitsu.com Cc: prarit@redhat.com Cc: feng.tang@intel.com Cc: pmladek@suse.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: linux-s390@vger.kernel.org Cc: boris.ostrovsky@oracle.com Cc: jgross@suse.com Cc: pbonzini@redhat.com Link: https://lkml.kernel.org/r/20180719205545.16512-23-pasha.tatashin@oracle.com
2018-03-04sched/headers: Simplify and clean up header usage in the schedulerIngo Molnar
Do the following cleanups and simplifications:

 - sched/sched.h already includes <asm/paravirt.h>, so no need to include it in sched/core.c again.
 - order the <linux/sched/*.h> headers alphabetically
 - add all <linux/sched/*.h> headers to kernel/sched/sched.h
 - remove all unnecessary includes from the .c files that are already included in kernel/sched/sched.h.

Finally, make all scheduler .c files use a single common header:

  #include "sched.h"

... which now contains a union of the relied upon headers. This makes the various .c files easier to read and easier to handle.

Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-03sched: Clean up and harmonize the coding style of the scheduler code baseIngo Molnar
A good number of small style inconsistencies have accumulated in the scheduler core, so do a pass over them to harmonize all these details:

 - fix speling in comments,
 - use curly braces for multi-line statements,
 - remove unnecessary parentheses from integer literals,
 - capitalize consistently,
 - remove stray newlines,
 - add comments where necessary,
 - remove invalid/unnecessary comments,
 - align structure definitions and other data types vertically,
 - add missing newlines for increased readability,
 - fix vertical tabulation where it's misaligned,
 - harmonize preprocessor conditional block labeling and vertical alignment,
 - remove line-breaks where they uglify the code,
 - add newline after local variable definitions,

No change in functionality:

  md5:
    1191fa0a890cfa8132156d2959d7e9e2  built-in.o.before.asm
    1191fa0a890cfa8132156d2959d7e9e2  built-in.o.after.asm

Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-08sched/clock, sched/cputime: Use lockdep to assert IRQs are disabled/enabledFrederic Weisbecker
Use lockdep to check that IRQs are enabled or disabled as expected. This way the sanity check only shows overhead when concurrency correctness debug code is enabled. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: David S . Miller <davem@davemloft.net> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/1509980490-4285-12-git-send-email-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-24sched/clock: Fix early boot preempt assumption in __set_sched_clock_stable()Peter Zijlstra
The more strict early boot preemption warnings found that __set_sched_clock_stable() was incorrectly assuming we'd still be running on a single CPU:

  BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
  caller is debug_smp_processor_id+0x1c/0x1e
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc2-00108-g1c3c5ea #1
  Call Trace:
   dump_stack+0x110/0x192
   check_preemption_disabled+0x10c/0x128
   ? set_debug_rodata+0x25/0x25
   debug_smp_processor_id+0x1c/0x1e
   sched_clock_init_late+0x27/0x87
   [...]

Fix it by disabling IRQs.

Reported-by: kernel test robot <xiaolong.ye@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: lkp@01.org Cc: tipbuild@zytor.com Link: http://lkml.kernel.org/r/20170524065202.v25vyu7pvba5mhpd@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15sched/clock: Print a warning recommending 'tsc=unstable'Peter Zijlstra
With our switch to stable delayed until late_initcall(), the most likely cause of hitting mark_tsc_unstable() is the watchdog. The watchdog typically only triggers when creative BIOS'es fiddle with the TSC to hide SMI latency. Since the watchdog can only detect TSC fiddling after the fact, all TSC clocks (including userspace GTOD) can already have reported funny values. The only way to fully avoid this is to manually mark the TSC unstable at boot. Suggest people do this on their broken systems. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15sched/clock: Use late_initcall() instead of sched_init_smp()Peter Zijlstra
Core2 marks its TSC unstable in ACPI Processor Idle, which is probed after sched_init_smp(). Luckily it appears both acpi_processor and intel_idle (which has a similar check) are mandatory built-in. This means we can delay switching to stable until after these drivers have run (if they were modules, this would be impossible). Delay the stable switch to late_initcall() to allow these drivers to mark TSC unstable and avoid difficult stable->unstable transitions. Reported-by: Lofstedt, Marta <marta.lofstedt@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15cpuidle: Fix idle time trackingPeter Zijlstra
Ville reported that on his Core2, which has TSC stop in idle, we would always report very short idle durations. He tracked this down to commit: e93e59ce5b85 ("cpuidle: Replace ktime_get() with local_clock()") which replaces ktime_get() with local_clock(). Add a sched_clock_idle_wakeup_event() call, which will re-sync the clock with ktime_get_ns() when TSC is unstable and no-op otherwise. Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Fixes: e93e59ce5b85 ("cpuidle: Replace ktime_get() with local_clock()") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15sched/clock: Remove watchdog touchingPeter Zijlstra
Commit: 2bacec8c318c ("sched: touch softlockup watchdog after idling") introduced the touch_softlockup_watchdog_sched() call without justification, and I feel sched_clock management is not the right place; it should only be concerned with producing semi-coherent time. If this causes watchdog thingies, we can find a better place. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15sched/clock: Remove unused argument to sched_clock_idle_wakeup_event()Peter Zijlstra
The argument to sched_clock_idle_wakeup_event() has not been used in a long time. Remove it. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15x86/tsc, sched/clock, clocksource: Use clocksource watchdog to provide ↵Peter Zijlstra
stable sync points Currently we keep sched_clock_tick() active for stable TSC in order to keep the per-CPU state semi up-to-date. The (obvious) problem is that by the time we detect TSC is borked, our per-CPU state is also borked. So hook into the clocksource watchdog and call a method after we've found it to still be stable. There's the obvious race where the TSC goes wonky between finding it stable and us running the callback, but closing that is too much work and not really worth it, since we're already detecting TSC wobbles after the fact, so we cannot, per definition, fully avoid funny clock values. And since the watchdog runs less often than the tick, this is also an optimization. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15sched/clock: Initialize all per-CPU state before switching (back) to unstablePeter Zijlstra
In preparation for not keeping the sched_clock_tick() active for stable TSC, we need to explicitly initialize all per-CPU state before switching back to unstable. Note: this patch loses the __gtod_offset calculation; it will be restored in the next one. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-27sched/clock: Fix broken stable to unstable transferPavel Tatashin
When it is determined that the clock is actually unstable, and we switch from stable to unstable, the __clear_sched_clock_stable() function is eventually called. In this function we set gtod_offset so the following holds true:

  sched_clock() + raw_offset == ktime_get_ns() + gtod_offset

But instead of getting the latest timestamps, we use the last values from scd, so instead of sched_clock() we use scd->tick_raw, and instead of ktime_get_ns() we use scd->tick_gtod.

However, later, when we use gtod_offset in sched_clock_local(), we do not add it to scd->tick_gtod to calculate the correct clock value when we determine the boundaries for min/max clocks. This can result in tick granularity sched_clock() values, so fix it.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: hpa@zytor.com Fixes: 5680d8094ffa ("sched/clock: Provide better clock continuity") Link: http://lkml.kernel.org/r/1490214265-899964-2-git-send-email-pasha.tatashin@oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-23sched/clock, x86/perf: Fix "perf test tsc"Peter Zijlstra
People reported that commit: 5680d8094ffa ("sched/clock: Provide better clock continuity") broke "perf test tsc". That commit added another offset to the reported clock value; so take that into account when computing the provided offset values. Reported-by: Adrian Hunter <adrian.hunter@intel.com> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 5680d8094ffa ("sched/clock: Provide better clock continuity") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-23sched/clock: Fix clear_sched_clock_stable() preempt wobblyPeter Zijlstra
Paul reported a problem with clear_sched_clock_stable(). Since we run all of __clear_sched_clock_stable() from workqueue context, there's a preempt problem. Solve it by only running the static_key_disable() from workqueue. Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: fweisbec@gmail.com Link: http://lkml.kernel.org/r/20170313124621.GA3328@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar
<linux/sched/nmi.h> We are going to move softlockup APIs out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. <linux/nmi.h> already includes <linux/sched.h>. Include the <linux/nmi.h> header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar
<linux/sched/clock.h> We are going to split <linux/sched/clock.h> out of <linux/sched.h>, which will have to be picked up from other headers and .c files. Create a trivial placeholder <linux/sched/clock.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-20sched/clock: Fix hotplug crashPeter Zijlstra
Mike reported that he could trigger the WARN_ON_ONCE() in set_sched_clock_stable() using hotplug.

This exposed a fundamental problem with the interface, we should never mark the TSC stable if we ever find it to be unstable. Therefore set_sched_clock_stable() is a broken interface.

The reason it existed is that not having it is a pain, it means all relevant architecture code needs to call clear_sched_clock_stable() where appropriate.

Of the three architectures that select HAVE_UNSTABLE_SCHED_CLOCK ia64 and parisc are trivial in that they never called set_sched_clock_stable(), so add an unconditional call to clear_sched_clock_stable() to them.

For x86 the story is a lot more involved, and what this patch tries to do is ensure we preserve the status quo. So even if Cyrix or Transmeta have a usable TSC they never called set_sched_clock_stable() so they now get an explicit mark unstable.

Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 9881b024b7d7 ("sched/clock: Delay switching sched_clock to stable") Link: http://lkml.kernel.org/r/20170119133633.GB6536@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14sched/clock: Provide better clock continuityPeter Zijlstra
When switching between the unstable and stable variants it is currently possible that clock discontinuities occur. And while these will mostly be 'small', attempt to do better.

As observed on my IVB-EP, the sched_clock() is ~1.5s ahead of the ktime_get_ns() based timeline at the point of switchover (sched_clock_init_late()) after SMP bringup.

Equally, when the TSC is later found to be unstable -- typically because SMM tries to hide its SMI latencies by mucking with the TSC -- we want to avoid large jumps.

Since the clocksource watchdog reports the issue after the fact we cannot exactly fix up time, but since SMI latencies are typically small (~10ns range), the discontinuity is mainly due to drift between sched_clock() and ktime_get_ns() (which on my desktop is ~79s over 24days).

I dislike this patch because it adds overhead to the good case in favour of dealing with badness. But given the widespread failure of TSC stability this is worth it.

Note that in case the TSC makes drastic jumps after SMP bringup we're still hosed. There's just not much we can do in that case without stupid overhead.

If we were to somehow expose tsc_clocksource_reliable (which is hard because this code is also used on ia64 and parisc) we could avoid some of the newly introduced overhead.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14sched/clock: Delay switching sched_clock to stablePeter Zijlstra
Currently we switch to the stable sched_clock if we guess the TSC is usable, and then switch back to the unstable path if it turns out TSC isn't stable during SMP bringup after all. Delay switching to the stable path until after SMP bringup is complete. This way we'll avoid switching during the time we detect the worst of the TSC offences. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14sched/clock: Update static_key usagePeter Zijlstra
sched_clock was still using the deprecated static_key interface. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13sched/clock: Make local_clock()/cpu_clock() inlineDaniel Lezcano
The local_clock/cpu_clock functions were changed to prevent a double identical test with sched_clock_cpu() when HAVE_UNSTABLE_SCHED_CLOCK is set. That resulted in one line functions. As these functions are in all the cases one line functions and in the hot path, it is useful to specify them as static inline in order to give a strong hint to the compiler. After verification, it appears the compiler does not inline them without this hint. Change those functions to static inline. sched_clock_cpu() is called via the inlined local_clock()/cpu_clock() functions from sched.h. So any module code including sched.h will reference sched_clock_cpu(). Thus it must be exported with the EXPORT_SYMBOL_GPL macro. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1460385514-14700-2-git-send-email-daniel.lezcano@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13sched/clock: Remove pointless test in cpu_clock/local_clockDaniel Lezcano
In case the HAVE_UNSTABLE_SCHED_CLOCK config is set, the cpu_clock() version checks if sched_clock_stable() is not set and calls sched_clock_cpu(), otherwise it calls sched_clock(). sched_clock_cpu() checks also if sched_clock_stable() is set and, if true, calls sched_clock(). sched_clock() will be called in sched_clock_cpu() if sched_clock_stable() is true. Remove the duplicate test by directly calling sched_clock_cpu() and let the static key act in this function instead. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1460385514-14700-1-git-send-email-daniel.lezcano@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
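A hedged before/after sketch of the duplicate test being dropped (simplified; not the exact cpu_clock()/local_clock() bodies):

	/* before: the caller duplicated the stability check */
	static u64 cpu_clock_before(int cpu)
	{
		if (!sched_clock_stable())
			return sched_clock_cpu(cpu);

		return sched_clock();
	}

	/* after: sched_clock_cpu() already makes that decision internally */
	static u64 cpu_clock_after(int cpu)
	{
		return sched_clock_cpu(cpu);
	}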
2016-03-02sched-clock: Migrate to use new tick dependency mask modelFrederic Weisbecker
Instead of checking sched_clock_stable from the nohz subsystem to verify its tick dependency, migrate it to the new mask in order to include it in the all-in-one check. Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com> Cc: Christoph Lameter <cl@linux.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2016-01-11Merge branch 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds
Pull workqueue update from Tejun Heo:
 "Workqueue changes for v4.5. One cleanup patch and three to improve the debuggability.

  Workqueue now has a stall detector which dumps workqueue state if any worker pool hasn't made forward progress over a certain amount of time (30s by default) and also triggers a warning if a workqueue which can be used in memory reclaim path tries to wait on something which can't be.

  These should make workqueue hangs a lot easier to debug."

* 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: simplify the apply_workqueue_attrs_locked()
  workqueue: implement lockup detector
  watchdog: introduce touch_softlockup_watchdog_sched()
  workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue
2015-12-08watchdog: introduce touch_softlockup_watchdog_sched()Tejun Heo
touch_softlockup_watchdog() is used to tell watchdog that scheduler stall is expected. One group of usage is from paths where the task may not be able to yield for a long time such as performing slow PIO to finicky device and coming out of suspend. The other is to account for scheduler and timer going idle. For scheduler softlockup detection, there's no reason to distinguish the two cases; however, workqueue lockup detector is planned and it can use the same signals from the former group while the latter would spuriously prevent detection. This patch introduces a new function touch_softlockup_watchdog_sched() and convert the latter group to call it instead. For now, it just calls touch_softlockup_watchdog() and there's no functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Ulrich Obergfell <uobergfe@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org>
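A hedged sketch of the split as the changelog describes it (not the exact kernel/watchdog.c code; names suffixed to mark them as illustrative):

	/* new entry point for scheduler/timer-idle callers */
	void touch_softlockup_watchdog_sched_sketch(void)
	{
		/* for now, no functional difference from the generic helper */
		touch_softlockup_watchdog();
	}

	/* a typical converted caller in the nohz/idle path */
	static void tick_went_idle(void)
	{
		touch_softlockup_watchdog_sched_sketch();
	}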
2015-11-23treewide: Remove old email addressPeter Zijlstra
There were still a number of references to my old Red Hat email address in the kernel source. Remove these while keeping the Red Hat copyright notices intact. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-12kernel/sched/clock.c: add another clock for use with the soft lockup watchdogCyril Bur
When the hypervisor pauses a virtualised kernel the kernel will observe a jump in timebase, which can cause spurious messages from the softlockup detector.

Whilst these messages are harmless, they are accompanied by a stack trace which causes undue concern and, more problematically, the stack trace in the guest has nothing to do with the observed problem and can only be misleading. Furthermore, on POWER8 this is completely avoidable with the introduction of the Virtual Time Base (VTB) register.

This patch (of 2):

This permits the use of arch specific clocks for which virtualised kernels can use their notion of 'running' time, not the elapsed wall time which will include host execution time.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Andrew Jones <drjones@redhat.com> Acked-by: Don Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ulrich Obergfell <uobergfe@redhat.com> Cc: chai wen <chaiw.fnst@cn.fujitsu.com> Cc: Fabian Frederick <fabf@skynet.be> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Ben Zhang <benzh@chromium.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-26time: Replace __get_cpu_var usesChristoph Lameter
Convert uses of __get_cpu_var for creating an address from a percpu offset to this_cpu_ptr. The two cases where get_cpu_var is used to actually access a percpu variable are changed to use this_cpu_read/raw_cpu_read. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
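A hedged sketch of the two conversion patterns (the per-CPU variable and struct here are illustrative stand-ins, not the real sched clock data):

	struct clock_data {
		u64 clock;
	};

	static DEFINE_PER_CPU(struct clock_data, example_scd);

	static struct clock_data *this_scd_sketch(void)
	{
		/* was: return &__get_cpu_var(example_scd); */
		return this_cpu_ptr(&example_scd);
	}

	static u64 this_clock_sketch(void)
	{
		/* plain value reads become this_cpu_read()/raw_cpu_read() */
		return this_cpu_read(example_scd.clock);
	}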
2014-04-07kernel: use macros from compiler.h instead of __attribute__((...))Gideon Israel Dsouza
To increase compiler portability there is <linux/compiler.h> which provides convenience macros for various gcc constructs. Eg: __weak for __attribute__((weak)). I've replaced all instances of gcc attributes with the right macro in the kernel subsystem. Signed-off-by: Gideon Israel Dsouza <gidisrael@gmail.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
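A hedged illustration of the substitution, using a weak clock hook as the example (function names here are illustrative, not from the patch):

	/* before: open-coded gcc attribute */
	u64 __attribute__((weak)) default_clock_old_style(void)
	{
		return 0;
	}

	/* after: the convenience macro from <linux/compiler.h> */
	u64 __weak default_clock_new_style(void)
	{
		return 0;	/* architectures override the weak default */
	}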
2014-03-11sched/clock: Prevent tracing recursion in sched_clock_cpu()Fernando Luis Vazquez Cao
Prevent tracing of preempt_disable/enable() in sched_clock_cpu(). When CONFIG_DEBUG_PREEMPT is enabled, preempt_disable/enable() are traced and this causes trace_clock() users (and probably others) to go into an infinite recursion. Systems with a stable sched_clock() are not affected. This problem is similar to that fixed by upstream commit 95ef1e52922 ("KVM guest: prevent tracing recursion with kvmclock"). Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1394083528.4524.3.camel@nexus Signed-off-by: Ingo Molnar <mingo@kernel.org>
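A hedged sketch of the recursion fix (simplified from sched_clock_cpu(); the body is a stand-in for the real per-CPU clock update):

	static u64 sched_clock_cpu_sketch(int cpu)
	{
		u64 clock;

		/*
		 * Use the notrace preempt helpers so that tracing of
		 * preempt_disable()/preempt_enable() cannot recurse back into
		 * the very clock the tracer is reading.
		 */
		preempt_disable_notrace();	/* was: preempt_disable() */
		clock = sched_clock();		/* stand-in for the scd update */
		preempt_enable_notrace();	/* was: preempt_enable() */

		return clock;
	}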
2014-01-23sched/clock: Fixup early initializationPeter Zijlstra
The code would assume sched_clock_stable() and switch to !stable later, this switch brings a discontinuity in time. The discontinuity on switching from stable to unstable was always present, but previously we would set stable/unstable before initializing TSC and usually stick to the one we start out with. So the static_key bits brought an extra switch where there previously wasn't one. Things are further complicated by the fact that we cannot use static_key as early as we usually call set_sched_clock_stable(). Fix things by tracking the stable state in a regular variable and only set the static_key to the right state on sched_clock_init(), which is ran right after late_time_init->tsc_init(). Before this we would not be using the TSC anyway. Reported-and-Tested-by: Sasha Levin <sasha.levin@oracle.com> Reported-by: dyoung@redhat.com Fixes: 35af99e646c7 ("sched/clock, x86: Use a static_key for sched_clock_stable") Cc: jacob.jun.pan@linux.intel.com Cc: Mike Galbraith <bitbucket@online.de> Cc: hpa@zytor.com Cc: paulmck@linux.vnet.ibm.com Cc: John Stultz <john.stultz@linaro.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: lenb@kernel.org Cc: rjw@rjwysocki.net Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com> Cc: rui.zhang@intel.com Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140122115918.GG3694@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13sched/clock: Fix up clear_sched_clock_stable()Peter Zijlstra
The below tells us the static_key conversion has a problem; since the exact point of clearing that flag isn't too important, delay the flip and use a workqueue to process it. [ ] TSC synchronization [CPU#0 -> CPU#22]: [ ] Measured 8 cycles TSC warp between CPUs, turning off TSC clock. [ ] [ ] ====================================================== [ ] [ INFO: possible circular locking dependency detected ] [ ] 3.13.0-rc3-01745-g848b0d0322cb-dirty #637 Not tainted [ ] ------------------------------------------------------- [ ] swapper/0/1 is trying to acquire lock: [ ] (jump_label_mutex){+.+...}, at: [<ffffffff8115a637>] jump_label_lock+0x17/0x20 [ ] [ ] but task is already holding lock: [ ] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8109408b>] cpu_hotplug_begin+0x2b/0x60 [ ] [ ] which lock already depends on the new lock. [ ] [ ] [ ] the existing dependency chain (in reverse order) is: [ ] [ ] -> #1 (cpu_hotplug.lock){+.+.+.}: [ ] [<ffffffff810def00>] lock_acquire+0x90/0x130 [ ] [<ffffffff81661f83>] mutex_lock_nested+0x63/0x3e0 [ ] [<ffffffff81093fdc>] get_online_cpus+0x3c/0x60 [ ] [<ffffffff8104cc67>] arch_jump_label_transform+0x37/0x130 [ ] [<ffffffff8115a3cf>] __jump_label_update+0x5f/0x80 [ ] [<ffffffff8115a48d>] jump_label_update+0x9d/0xb0 [ ] [<ffffffff8115aa6d>] static_key_slow_inc+0x9d/0xb0 [ ] [<ffffffff810c0f65>] sched_feat_set+0xf5/0x100 [ ] [<ffffffff810c5bdc>] set_numabalancing_state+0x2c/0x30 [ ] [<ffffffff81d12f3d>] numa_policy_init+0x1af/0x1b7 [ ] [<ffffffff81cebdf4>] start_kernel+0x35d/0x41f [ ] [<ffffffff81ceb5a5>] x86_64_start_reservations+0x2a/0x2c [ ] [<ffffffff81ceb6a2>] x86_64_start_kernel+0xfb/0xfe [ ] [ ] -> #0 (jump_label_mutex){+.+...}: [ ] [<ffffffff810de141>] __lock_acquire+0x1701/0x1eb0 [ ] [<ffffffff810def00>] lock_acquire+0x90/0x130 [ ] [<ffffffff81661f83>] mutex_lock_nested+0x63/0x3e0 [ ] [<ffffffff8115a637>] jump_label_lock+0x17/0x20 [ ] [<ffffffff8115aa3b>] static_key_slow_inc+0x6b/0xb0 [ ] [<ffffffff810ca775>] clear_sched_clock_stable+0x15/0x20 [ ] [<ffffffff810503b3>] mark_tsc_unstable+0x23/0x70 [ ] [<ffffffff810772cb>] check_tsc_sync_source+0x14b/0x150 [ ] [<ffffffff81076612>] native_cpu_up+0x3a2/0x890 [ ] [<ffffffff810941cb>] _cpu_up+0xdb/0x160 [ ] [<ffffffff810942c9>] cpu_up+0x79/0x90 [ ] [<ffffffff81d0af6b>] smp_init+0x60/0x8c [ ] [<ffffffff81cebf42>] kernel_init_freeable+0x8c/0x197 [ ] [<ffffffff8164e32e>] kernel_init+0xe/0x130 [ ] [<ffffffff8166beec>] ret_from_fork+0x7c/0xb0 [ ] [ ] other info that might help us debug this: [ ] [ ] Possible unsafe locking scenario: [ ] [ ] CPU0 CPU1 [ ] ---- ---- [ ] lock(cpu_hotplug.lock); [ ] lock(jump_label_mutex); [ ] lock(cpu_hotplug.lock); [ ] lock(jump_label_mutex); [ ] [ ] *** DEADLOCK *** [ ] [ ] 2 locks held by swapper/0/1: [ ] #0: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81094037>] cpu_maps_update_begin+0x17/0x20 [ ] #1: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8109408b>] cpu_hotplug_begin+0x2b/0x60 [ ] [ ] stack backtrace: [ ] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc3-01745-g848b0d0322cb-dirty #637 [ ] Hardware name: Supermicro X8DTN/X8DTN, BIOS 4.6.3 01/08/2010 [ ] ffffffff82c9c270 ffff880236843bb8 ffffffff8165c5f5 ffffffff82c9c270 [ ] ffff880236843bf8 ffffffff81658c02 ffff880236843c80 ffff8802368586a0 [ ] ffff880236858678 0000000000000001 0000000000000002 ffff880236858000 [ ] Call Trace: [ ] [<ffffffff8165c5f5>] dump_stack+0x4e/0x7a [ ] [<ffffffff81658c02>] print_circular_bug+0x1f9/0x207 [ ] [<ffffffff810de141>] __lock_acquire+0x1701/0x1eb0 [ ] [<ffffffff816680ff>] ? 
__atomic_notifier_call_chain+0x8f/0xb0 [ ] [<ffffffff810def00>] lock_acquire+0x90/0x130 [ ] [<ffffffff8115a637>] ? jump_label_lock+0x17/0x20 [ ] [<ffffffff8115a637>] ? jump_label_lock+0x17/0x20 [ ] [<ffffffff81661f83>] mutex_lock_nested+0x63/0x3e0 [ ] [<ffffffff8115a637>] ? jump_label_lock+0x17/0x20 [ ] [<ffffffff8115a637>] jump_label_lock+0x17/0x20 [ ] [<ffffffff8115aa3b>] static_key_slow_inc+0x6b/0xb0 [ ] [<ffffffff810ca775>] clear_sched_clock_stable+0x15/0x20 [ ] [<ffffffff810503b3>] mark_tsc_unstable+0x23/0x70 [ ] [<ffffffff810772cb>] check_tsc_sync_source+0x14b/0x150 [ ] [<ffffffff81076612>] native_cpu_up+0x3a2/0x890 [ ] [<ffffffff810941cb>] _cpu_up+0xdb/0x160 [ ] [<ffffffff810942c9>] cpu_up+0x79/0x90 [ ] [<ffffffff81d0af6b>] smp_init+0x60/0x8c [ ] [<ffffffff81cebf42>] kernel_init_freeable+0x8c/0x197 [ ] [<ffffffff8164e320>] ? rest_init+0xd0/0xd0 [ ] [<ffffffff8164e32e>] kernel_init+0xe/0x130 [ ] [<ffffffff8166beec>] ret_from_fork+0x7c/0xb0 [ ] [<ffffffff8164e320>] ? rest_init+0xd0/0xd0 [ ] ------------[ cut here ]------------ [ ] WARNING: CPU: 0 PID: 1 at /usr/src/linux-2.6/kernel/smp.c:374 smp_call_function_many+0xad/0x300() [ ] Modules linked in: [ ] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc3-01745-g848b0d0322cb-dirty #637 [ ] Hardware name: Supermicro X8DTN/X8DTN, BIOS 4.6.3 01/08/2010 [ ] 0000000000000009 ffff880236843be0 ffffffff8165c5f5 0000000000000000 [ ] ffff880236843c18 ffffffff81093d8c 0000000000000000 0000000000000000 [ ] ffffffff81ccd1a0 ffffffff810ca951 0000000000000000 ffff880236843c28 [ ] Call Trace: [ ] [<ffffffff8165c5f5>] dump_stack+0x4e/0x7a [ ] [<ffffffff81093d8c>] warn_slowpath_common+0x8c/0xc0 [ ] [<ffffffff810ca951>] ? sched_clock_tick+0x1/0xa0 [ ] [<ffffffff81093dda>] warn_slowpath_null+0x1a/0x20 [ ] [<ffffffff8110b72d>] smp_call_function_many+0xad/0x300 [ ] [<ffffffff8104f200>] ? arch_unregister_cpu+0x30/0x30 [ ] [<ffffffff8104f200>] ? arch_unregister_cpu+0x30/0x30 [ ] [<ffffffff810ca951>] ? sched_clock_tick+0x1/0xa0 [ ] [<ffffffff8110ba96>] smp_call_function+0x46/0x80 [ ] [<ffffffff8104f200>] ? arch_unregister_cpu+0x30/0x30 [ ] [<ffffffff8110bb3c>] on_each_cpu+0x3c/0xa0 [ ] [<ffffffff810ca950>] ? sched_clock_idle_sleep_event+0x20/0x20 [ ] [<ffffffff810ca951>] ? sched_clock_tick+0x1/0xa0 [ ] [<ffffffff8104f964>] text_poke_bp+0x64/0xd0 [ ] [<ffffffff810ca950>] ? sched_clock_idle_sleep_event+0x20/0x20 [ ] [<ffffffff8104ccde>] arch_jump_label_transform+0xae/0x130 [ ] [<ffffffff8115a3cf>] __jump_label_update+0x5f/0x80 [ ] [<ffffffff8115a48d>] jump_label_update+0x9d/0xb0 [ ] [<ffffffff8115aa6d>] static_key_slow_inc+0x9d/0xb0 [ ] [<ffffffff810ca775>] clear_sched_clock_stable+0x15/0x20 [ ] [<ffffffff810503b3>] mark_tsc_unstable+0x23/0x70 [ ] [<ffffffff810772cb>] check_tsc_sync_source+0x14b/0x150 [ ] [<ffffffff81076612>] native_cpu_up+0x3a2/0x890 [ ] [<ffffffff810941cb>] _cpu_up+0xdb/0x160 [ ] [<ffffffff810942c9>] cpu_up+0x79/0x90 [ ] [<ffffffff81d0af6b>] smp_init+0x60/0x8c [ ] [<ffffffff81cebf42>] kernel_init_freeable+0x8c/0x197 [ ] [<ffffffff8164e320>] ? rest_init+0xd0/0xd0 [ ] [<ffffffff8164e32e>] kernel_init+0xe/0x130 [ ] [<ffffffff8166beec>] ret_from_fork+0x7c/0xb0 [ ] [<ffffffff8164e320>] ? rest_init+0xd0/0xd0 [ ] ---[ end trace 6ff1df5620c49d26 ]--- [ ] tsc: Marking TSC unstable due to check_tsc_sync_source failed Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-v55fgqj3nnyqnngmvuu8ep6h@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13sched/clock, x86: Use a static_key for sched_clock_stablePeter Zijlstra
In order to avoid the runtime condition and variable load turn sched_clock_stable into a static_key. Also provide a shorter implementation of local_clock() and cpu_clock(int) when sched_clock_stable==1.

                        MAINLINE   PRE      POST

    sched_clock_stable: 1          1        1
    (cold) sched_clock: 329841     221876   215295
    (cold) local_clock: 301773     234692   220773
    (warm) sched_clock: 38375      25602    25659
    (warm) local_clock: 100371     33265    27242
    rdtsc:              27340      24214    24208
    sched_clock_stable: 0          0        0
    (cold) sched_clock: 382634     235941   237019
    (cold) local_clock: 396890     297017   294819
    (warm) sched_clock: 38194      25233    25609
    (warm) local_clock: 143452     71234    71232
    rdtsc:              27345      24245    24243

Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-eummbdechzz37mwmpags1gjr@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13sched/clock: Remove local_irq_disable() from the clocksPeter Zijlstra
Now that x86 no longer requires IRQs disabled for sched_clock() and ia64 never had this requirement (it doesn't seem to do cpufreq at all), we can remove the requirement of disabling IRQs.

                        MAINLINE   PRE      POST

    sched_clock_stable: 1          1        1
    (cold) sched_clock: 329841     257223   221876
    (cold) local_clock: 301773     309889   234692
    (warm) sched_clock: 38375      25280    25602
    (warm) local_clock: 100371     85268    33265
    rdtsc:              27340      24247    24214
    sched_clock_stable: 0          0        0
    (cold) sched_clock: 382634     301224   235941
    (cold) local_clock: 396890     399870   297017
    (warm) sched_clock: 38194      25630    25233
    (warm) local_clock: 143452     129629   71234
    rdtsc:              27345      24307    24245

Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-36e5kohiasnr106d077mgubp@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>