summaryrefslogtreecommitdiff
path: root/kernel/debug/debug_core.c
AgeCommit message (Collapse)Author
2018-12-30kdb: Don't back trace on a cpu that didn't round upDouglas Anderson
If you have a CPU that fails to round up and then run 'btc' you'll end up crashing in kdb becaue we dereferenced NULL. Let's add a check. It's wise to also set the task to NULL when leaving the debugger so that if we fail to round up on a later entry into the debugger we won't backtrace a stale task. Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2018-12-30kgdb: Don't round up a CPU that failed rounding up beforeDouglas Anderson
If we're using the default implementation of kgdb_roundup_cpus() that uses smp_call_function_single_async() we can end up hanging kgdb_roundup_cpus() if we try to round up a CPU that failed to round up before. Specifically smp_call_function_single_async() will try to wait on the csd lock for the CPU that we're trying to round up. If the previous round up never finished then that lock could still be held and we'll just sit there hanging. There's not a lot of use trying to round up a CPU that failed to round up before. Let's keep a flag that indicates whether the CPU started but didn't finish to round up before. If we see that flag set then we'll skip the next round up. In general we have a few goals here: - We never want to end up calling smp_call_function_single_async() when the csd is still locked. This is accomplished because flush_smp_call_function_queue() unlocks the csd _before_ invoking the callback. That means that when kgdb_nmicallback() runs we know for sure the the csd is no longer locked. Thus when we set "rounding_up = false" we know for sure that the csd is unlocked. - If there are no timeouts rounding up we should never skip a round up. NOTE #1: In general trying to continue running after failing to round up CPUs doesn't appear to be supported in the debugger. When I simulate this I find that kdb reports "Catastrophic error detected" when I try to continue. I can overrule and continue anyway, but it should be noted that we may be entering the land of dragons here. Possibly the "Catastrophic error detected" was added _because_ of the future failure to round up, but even so this is an area of the code that hasn't been strongly tested. NOTE #2: I did a bit of testing before and after this change. I introduced a 10 second hang in the kernel while holding a spinlock that I could invoke on a certain CPU with 'taskset -c 3 cat /sys/...". Before this change if I did: - Invoke hang - Enter debugger - g (which warns about Catastrophic error, g again to go anyway) - g - Enter debugger ...I'd hang the rest of the 10 seconds without getting a debugger prompt. After this change I end up in the debugger the 2nd time after only 1 second with the standard warning about 'Timed out waiting for secondary CPUs.' I'll also note that once the CPU finished waiting I could actually debug it (aka "btc" worked) I won't promise that everything works perfectly if the errant CPU comes back at just the wrong time (like as we're entering or exiting the debugger) but it certainly seems like an improvement. NOTE #3: setting 'kgdb_info[cpu].rounding_up = false' is in kgdb_nmicallback() instead of kgdb_call_nmi_hook() because some implementations override kgdb_call_nmi_hook(). It shouldn't hurt to have it in kgdb_nmicallback() in any case. NOTE #4: this logic is really only needed because there is no API call like "smp_try_call_function_single_async()" or "smp_csd_is_locked()". If such an API existed then we'd use it instead, but it seemed a bit much to add an API like this just for kgdb. Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2018-12-30kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function()Douglas Anderson
When I had lockdep turned on and dropped into kgdb I got a nice splat on my system. Specifically it hit: DEBUG_LOCKS_WARN_ON(current->hardirq_context) Specifically it looked like this: sysrq: SysRq : DEBUG ------------[ cut here ]------------ DEBUG_LOCKS_WARN_ON(current->hardirq_context) WARNING: CPU: 0 PID: 0 at .../kernel/locking/lockdep.c:2875 lockdep_hardirqs_on+0xf0/0x160 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0 #27 pstate: 604003c9 (nZCv DAIF +PAN -UAO) pc : lockdep_hardirqs_on+0xf0/0x160 ... Call trace: lockdep_hardirqs_on+0xf0/0x160 trace_hardirqs_on+0x188/0x1ac kgdb_roundup_cpus+0x14/0x3c kgdb_cpu_enter+0x53c/0x5cc kgdb_handle_exception+0x180/0x1d4 kgdb_compiled_brk_fn+0x30/0x3c brk_handler+0x134/0x178 do_debug_exception+0xfc/0x178 el1_dbg+0x18/0x78 kgdb_breakpoint+0x34/0x58 sysrq_handle_dbg+0x54/0x5c __handle_sysrq+0x114/0x21c handle_sysrq+0x30/0x3c qcom_geni_serial_isr+0x2dc/0x30c ... ... irq event stamp: ...45 hardirqs last enabled at (...44): [...] __do_softirq+0xd8/0x4e4 hardirqs last disabled at (...45): [...] el1_irq+0x74/0x130 softirqs last enabled at (...42): [...] _local_bh_enable+0x2c/0x34 softirqs last disabled at (...43): [...] irq_exit+0xa8/0x100 ---[ end trace adf21f830c46e638 ]--- Looking closely at it, it seems like a really bad idea to be calling local_irq_enable() in kgdb_roundup_cpus(). If nothing else that seems like it could violate spinlock semantics and cause a deadlock. Instead, let's use a private csd alongside smp_call_function_single_async() to round up the other CPUs. Using smp_call_function_single_async() doesn't require interrupts to be enabled so we can remove the offending bit of code. In order to avoid duplicating this across all the architectures that use the default kgdb_roundup_cpus(), we'll add a "weak" implementation to debug_core.c. Looking at all the people who previously had copies of this code, there were a few variants. I've attempted to keep the variants working like they used to. Specifically: * For arch/arc we passed NULL to kgdb_nmicallback() instead of get_irq_regs(). * For arch/mips there was a bit of extra code around kgdb_nmicallback() NOTE: In this patch we will still get into trouble if we try to round up a CPU that failed to round up before. We'll try to round it up again and potentially hang when we try to grab the csd lock. That's not new behavior but we'll still try to do better in a future patch. Suggested-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@mips.com> Cc: James Hogan <jhogan@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2018-12-30kgdb: Remove irq flags from roundupDouglas Anderson
The function kgdb_roundup_cpus() was passed a parameter that was documented as: > the flags that will be used when restoring the interrupts. There is > local_irq_save() call before kgdb_roundup_cpus(). Nobody used those flags. Anyone who wanted to temporarily turn on interrupts just did local_irq_enable() and local_irq_disable() without looking at them. So we can definitely remove the flags. Signed-off-by: Douglas Anderson <dianders@chromium.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@mips.com> Cc: James Hogan <jhogan@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar
<linux/sched/nmi.h> We are going to move softlockup APIs out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. <linux/nmi.h> already includes <linux/sched.h>. Include the <linux/nmi.h> header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02mm/vmacache, sched/headers: Introduce 'struct vmacache' and move it from ↵Ingo Molnar
<linux/sched.h> to <linux/mm_types> The <linux/sched.h> header includes various vmacache related defines, which are arguably misplaced. Move them to mm_types.h and minimize the sched.h impact by putting all task vmacache state into a new 'struct vmacache' structure. No change in functionality. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-14kernel/debug/debug_core.c: more properly delay for secondary CPUsDouglas Anderson
We've got a delay loop waiting for secondary CPUs. That loop uses loops_per_jiffy. However, loops_per_jiffy doesn't actually mean how many tight loops make up a jiffy on all architectures. It is quite common to see things like this in the boot log: Calibrating delay loop (skipped), value calculated using timer frequency.. 48.00 BogoMIPS (lpj=24000) In my case I was seeing lots of cases where other CPUs timed out entering the debugger only to print their stack crawls shortly after the kdb> prompt was written. Elsewhere in kgdb we already use udelay(), so that should be safe enough to use to implement our timeout. We'll delay 1 ms for 1000 times, which should give us a full second of delay (just like the old code wanted) but allow us to notice that we're done every 1 ms. [akpm@linux-foundation.org: simplifications, per Daniel] Link: http://lkml.kernel.org/r/1477091361-2039-1-git-send-email-dianders@chromium.org Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Brian Norris <briannorris@chromium.org> Cc: <stable@vger.kernel.org> [4.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-19debug: prevent entering debug mode on panic/exception.Colin Cross
On non-developer devices, kgdb prevents the device from rebooting after a panic. Incase of panics and exceptions, to allow the device to reboot, prevent entering debug mode to avoid getting stuck waiting for the user to interact with debugger. To avoid entering the debugger on panic/exception without any extra configuration, panic_timeout is being used which can be set via /proc/sys/kernel/panic at run time and CONFIG_PANIC_TIMEOUT sets the default value. Setting panic_timeout indicates that the user requested machine to perform unattended reboot after panic. We dont want to get stuck waiting for the user input incase of panic. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: kgdb-bugreport@lists.sourceforge.net Cc: linux-kernel@vger.kernel.org Cc: Android Kernel Team <kernel-team@android.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Signed-off-by: Colin Cross <ccross@android.com> [Kiran: Added context to commit message. panic_timeout is used instead of break_on_panic and break_on_exception to honor CONFIG_PANIC_TIMEOUT Modified the commit as per community feedback] Signed-off-by: Kiran Raparthy <kiran.kumar@linaro.org> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2015-02-19kdb: Fix off by one error in kdb_cpu()Jason Wessel
There was a follow on replacement patch against the prior "kgdb: Timeout if secondary CPUs ignore the roundup". See: https://lkml.org/lkml/2015/1/7/442 This patch is the delta vs the patch that was committed upstream: * Fix an off-by-one error in kdb_cpu(). * Replace NR_CPUS with CONFIG_NR_CPUS to tell checkpatch that we really want a static limit. * Removed the "KGDB: " prefix from the pr_crit() in debug_core.c (kgdb-next contains a patch which introduced pr_fmt() to this file to the tag will now be applied automatically). Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2014-11-11kernel/debug/debug_core.c: Logging clean-upFabian Frederick
-Convert printk( to pr_foo() -Add pr_fmt -Coalesce formats Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2014-11-11kgdb: timeout if secondary CPUs ignore the roundupDaniel Thompson
Currently if an active CPU fails to respond to a roundup request the CPU that requested the roundup will become stuck. This needlessly reduces the robustness of the debugger. This patch introduces a timeout allowing the system state to be examined even when the system contains unresponsive processors. It also modifies kdb's cpu command to make it censor attempts to switch to unresponsive processors and to report their state as (D)ead. Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Cc: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2014-04-18arch: Mass conversion of smp_mb__*()Peter Zijlstra
Mostly scripted conversion of the smp_mb__* barriers. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-arch@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-07mm: per-thread vma cachingDavidlohr Bueso
This patch is a continuation of efforts trying to optimize find_vma(), avoiding potentially expensive rbtree walks to locate a vma upon faults. The original approach (https://lkml.org/lkml/2013/11/1/410), where the largest vma was also cached, ended up being too specific and random, thus further comparison with other approaches were needed. There are two things to consider when dealing with this, the cache hit rate and the latency of find_vma(). Improving the hit-rate does not necessarily translate in finding the vma any faster, as the overhead of any fancy caching schemes can be too high to consider. We currently cache the last used vma for the whole address space, which provides a nice optimization, reducing the total cycles in find_vma() by up to 250%, for workloads with good locality. On the other hand, this simple scheme is pretty much useless for workloads with poor locality. Analyzing ebizzy runs shows that, no matter how many threads are running, the mmap_cache hit rate is less than 2%, and in many situations below 1%. The proposed approach is to replace this scheme with a small per-thread cache, maximizing hit rates at a very low maintenance cost. Invalidations are performed by simply bumping up a 32-bit sequence number. The only expensive operation is in the rare case of a seq number overflow, where all caches that share the same address space are flushed. Upon a miss, the proposed replacement policy is based on the page number that contains the virtual address in question. Concretely, the following results are seen on an 80 core, 8 socket x86-64 box: 1) System bootup: Most programs are single threaded, so the per-thread scheme does improve ~50% hit rate by just adding a few more slots to the cache. +----------------+----------+------------------+ | caching scheme | hit-rate | cycles (billion) | +----------------+----------+------------------+ | baseline | 50.61% | 19.90 | | patched | 73.45% | 13.58 | +----------------+----------+------------------+ 2) Kernel build: This one is already pretty good with the current approach as we're dealing with good locality. +----------------+----------+------------------+ | caching scheme | hit-rate | cycles (billion) | +----------------+----------+------------------+ | baseline | 75.28% | 11.03 | | patched | 88.09% | 9.31 | +----------------+----------+------------------+ 3) Oracle 11g Data Mining (4k pages): Similar to the kernel build workload. +----------------+----------+------------------+ | caching scheme | hit-rate | cycles (billion) | +----------------+----------+------------------+ | baseline | 70.66% | 17.14 | | patched | 91.15% | 12.57 | +----------------+----------+------------------+ 4) Ebizzy: There's a fair amount of variation from run to run, but this approach always shows nearly perfect hit rates, while baseline is just about non-existent. The amounts of cycles can fluctuate between anywhere from ~60 to ~116 for the baseline scheme, but this approach reduces it considerably. For instance, with 80 threads: +----------------+----------+------------------+ | caching scheme | hit-rate | cycles (billion) | +----------------+----------+------------------+ | baseline | 1.06% | 91.54 | | patched | 99.97% | 14.18 | +----------------+----------+------------------+ [akpm@linux-foundation.org: fix nommu build, per Davidlohr] [akpm@linux-foundation.org: document vmacache_valid() logic] [akpm@linux-foundation.org: attempt to untangle header files] [akpm@linux-foundation.org: add vmacache_find() BUG_ON] [hughd@google.com: add vmacache_valid_mm() (from Oleg)] [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: adjust and enhance comments] Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Tested-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-26KGDB: make kgdb_breakpoint() as noinlineVijaya Kumar K
The function kgdb_breakpoint() sets up break point at compile time by calling arch_kgdb_breakpoint(); Though this call is surrounded by wmb() barrier, the compile can still re-order the break point, because this scheduling barrier is not a code motion barrier in gcc. Making kgdb_breakpoint() as noinline solves this problem of code reording around break point instruction and also avoids problem of being called as inline function from other places More details about discussion on this can be found here http://comments.gmane.org/gmane.linux.ports.arm.kernel/269732 Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-25kgdb/kdb: Fix no KDB config problemMike Travis
Some code added to the debug_core module had KDB dependencies that it shouldn't have. Move the KDB dependent REASON back to the caller to remove the dependency in the debug core code. Update the call from the UV NMI handler to conform to the new interface. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Russ Anderson <rja@sgi.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Link: http://lkml.kernel.org/r/20140114162551.318251993@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-03kdb: Add support for external NMI handler to call KGDB/KDBMike Travis
This patch adds a kgdb_nmicallin() interface that can be used by external NMI handlers to call the KGDB/KDB handler. The primary need for this is for those types of NMI interrupts where all the CPUs have already received the NMI signal. Therefore no send_IPI(NMI) is required, and in fact it will cause a 2nd unhandled NMI to occur. This generates the "Dazed and Confuzed" messages. Since all the CPUs are getting the NMI at roughly the same time, it's not guaranteed that the first CPU that hits the NMI handler will manage to enter KGDB and set the dbg_master_lock before the slaves start entering. The new argument "send_ready" was added for KGDB to signal the NMI handler to release the slave CPUs for entry into KGDB. Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Jason Wessel <jason.wessel@windriver.com> Reviewed-by: Dimitri Sivanich <sivanich@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Link: http://lkml.kernel.org/r/20131002151417.928886849@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-30kgdb/sysrq: fix inconstistent help message of sysrq keyzhangwei(Jovi)
Currently help message of /proc/sysrq-trigger highlight its upper-case characters, like below: SysRq : HELP : loglevel(0-9) reBoot Crash terminate-all-tasks(E) memory-full-oom-kill(F) kill-all-tasks(I) ... this would confuse user trigger sysrq by upper-case character, which is inconsistent with the real lower-case character registed key. This inconsistent help message will also lead more confused when 26 upper-case letters put into use in future. This patch fix kgdb sysrq key: "debug(g)" Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-04kgdb: remove #include <linux/serial_8250.h> from kgdb.hGreg Kroah-Hartman
There's no reason kgdb.h itself needs to include the 8250 serial port header file. So push it down to the _very_ limited number of individual drivers that need the values in that file, and fix up the places where people really wanted serial_core.h and platform_device.h. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-13Merge tag 'for_linus-3.7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb Pull KGDB/KDB fixes and cleanups from Jason Wessel: "Cleanups - Clean up compile warnings in kgdboc.c and x86/kernel/kgdb.c - Add module event hooks for simplified debugging with gdb Fixes - Fix kdb to stop paging with 'q' on bta and dmesg - Fix for data that scrolls off the vga console due to line wrapping when using the kdb pager New - The debug core registers for kernel module events which allows a kernel aware gdb to automatically load symbols and break on entry to a kernel module - Allow kgdboc=kdb to setup kdb on the vga console" * tag 'for_linus-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb: tty/console: fix warnings in drivers/tty/serial/kgdboc.c kdb,vt_console: Fix missed data due to pager overruns kdb: Fix dmesg/bta scroll to quit with 'q' kgdboc: Accept either kbd or kdb to activate the vga + keyboard kdb shell kgdb,x86: fix warning about unused variable mips,kgdb: fix recursive page fault with CONFIG_KPROBES kgdb: Add module event hooks
2012-10-12kgdb: Add module event hooksJason Wessel
Allow gdb to auto load kernel modules when it is attached, which makes it trivially easy to debug module init functions or pre-set breakpoints in a kernel module that has not loaded yet. Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2012-09-26kernel/debug: Mask KGDB NMI upon entryAnton Vorontsov
The new arch callback should manage NMIs that usually cause KGDB to enter. That is, not all NMIs should be enabled/disabled, but only those that issue kgdb_handle_exception(). We must mask it as serial-line interrupt can be used as an NMI, so if the original KGDB-entry cause was say a breakpoint, then every input to KDB console will cause KGDB to reenter, which we don't want. Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-04-04Merge tag 'for_linus-3.4-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb Pull KGDB/KDB regression fixes from Jason Wessel: - Fix a Smatch warning that appeared in the 3.4 merge window - Fix kgdb test suite with SMP for all archs without HW single stepping - Fix kgdb sw breakpoints with CONFIG_DEBUG_RODATA=y limitations on x86 - Fix oops on kgdb test suite with CONFIG_DEBUG_RODATA - Fix kgdb test suite with SMP for all archs with HW single stepping * tag 'for_linus-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb: x86,kgdb: Fix DEBUG_RODATA limitation using text_poke() kgdb,debug_core: pass the breakpoint struct instead of address and memory kgdbts: (2 of 2) fix single step awareness to work correctly with SMP kgdbts: (1 of 2) fix single step awareness to work correctly with SMP kgdbts: Fix kernel oops with CONFIG_DEBUG_RODATA kdb: Fix smatch warning on dbg_io_ops->is_console
2012-03-29kgdb,debug_core: pass the breakpoint struct instead of address and memoryJason Wessel
There is extra state information that needs to be exposed in the kgdb_bpt structure for tracking how a breakpoint was installed. The debug_core only uses the the probe_kernel_write() to install breakpoints, but this is not enough for all the archs. Some arch such as x86 need to use text_poke() in order to install a breakpoint into a read only page. Passing the kgdb_bpt structure to kgdb_arch_set_breakpoint() and kgdb_arch_remove_breakpoint() allows other archs to set the type variable which indicates how the breakpoint was installed. Cc: stable@vger.kernel.org # >= 2.6.36 Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2012-03-28Merge tag 'split-asm_system_h-for-linus-20120328' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system Pull "Disintegrate and delete asm/system.h" from David Howells: "Here are a bunch of patches to disintegrate asm/system.h into a set of separate bits to relieve the problem of circular inclusion dependencies. I've built all the working defconfigs from all the arches that I can and made sure that they don't break. The reason for these patches is that I recently encountered a circular dependency problem that came about when I produced some patches to optimise get_order() by rewriting it to use ilog2(). This uses bitops - and on the SH arch asm/bitops.h drags in asm-generic/get_order.h by a circuituous route involving asm/system.h. The main difficulty seems to be asm/system.h. It holds a number of low level bits with no/few dependencies that are commonly used (eg. memory barriers) and a number of bits with more dependencies that aren't used in many places (eg. switch_to()). These patches break asm/system.h up into the following core pieces: (1) asm/barrier.h Move memory barriers here. This already done for MIPS and Alpha. (2) asm/switch_to.h Move switch_to() and related stuff here. (3) asm/exec.h Move arch_align_stack() here. Other process execution related bits could perhaps go here from asm/processor.h. (4) asm/cmpxchg.h Move xchg() and cmpxchg() here as they're full word atomic ops and frequently used by atomic_xchg() and atomic_cmpxchg(). (5) asm/bug.h Move die() and related bits. (6) asm/auxvec.h Move AT_VECTOR_SIZE_ARCH here. Other arch headers are created as needed on a per-arch basis." Fixed up some conflicts from other header file cleanups and moving code around that has happened in the meantime, so David's testing is somewhat weakened by that. We'll find out anything that got broken and fix it.. * tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits) Delete all instances of asm/system.h Remove all #inclusions of asm/system.h Add #includes needed to permit the removal of asm/system.h Move all declarations of free_initmem() to linux/mm.h Disintegrate asm/system.h for OpenRISC Split arch_align_stack() out from asm-generic/system.h Split the switch_to() wrapper out of asm-generic/system.h Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h Create asm-generic/barrier.h Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h Disintegrate asm/system.h for Xtensa Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt] Disintegrate asm/system.h for Tile Disintegrate asm/system.h for Sparc Disintegrate asm/system.h for SH Disintegrate asm/system.h for Score Disintegrate asm/system.h for S390 Disintegrate asm/system.h for PowerPC Disintegrate asm/system.h for PA-RISC Disintegrate asm/system.h for MN10300 ...
2012-03-28Remove all #inclusions of asm/system.hDavid Howells
Remove all #inclusions of asm/system.h preparatory to splitting and killing it. Performed with the following command: perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *` Signed-off-by: David Howells <dhowells@redhat.com>
2012-03-22kgdb,debug_core: add the ability to control the reboot notifierJason Wessel
Sometimes it is desirable to stop the kernel debugger before allowing a system to reboot either with kdb or kgdb. This patch adds the ability to turn the reboot notifier on and off or enter the debugger and stop kernel execution before rebooting. It is possible to change the setting after booting the kernel with the following: echo 1 > /sys/module/debug_core/parameters/kgdbreboot It is also possible to change this setting using kdb / kgdb to manipulate the variable directly. Using KDB: mm kgdbreboot 1 Using gdb: set kgdbreboot=1 Reported-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2012-03-22kgdb,debug-core,gdbstub: Hook the reboot notifier for debugger detachJason Wessel
The gdbstub and kdb should get detached if the system is rebooting. Calling gdbstub_exit() will set the proper debug core state and send a message to any debugger that is connected to correctly detach. An attached debugger will receive the exit code from include/linux/reboot.h based on SYS_HALT, SYS_REBOOT, etc... Reported-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2011-07-26atomic: use <linux/atomic.h>Arun Sharma
This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-31Fix common misspellingsLucas De Marchi
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2010-10-29debug_core,x86,blackfin: Clean up hw debug disable APIDongdong Deng
The kgdb_disable_hw_debug() was an architecture specific function for disabling all hardware breakpoints on a per cpu basis when entering the debug core. This patch will remove the weak function kdbg_disable_hw_debug() and change it into a call back which lives with the rest of hw breakpoint call backs in struct kgdb_arch. Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-10-22kdb,debug_core: adjust master cpu switch logic against new debug_core lockingJason Wessel
The kdb shell needs to enforce switching back to the original CPU that took the exception before restoring normal kernel execution. Resuming from a different CPU than what took the original exception will cause problems with spin locks that are freed from the a different processor than had taken the lock. The special logic in dbg_cpu_switch() can go away entirely with because the state of what cpus want to be masters or slaves will remain unchanged between entry and exit of the debug_core exception context. Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-10-22debug_core: refactor locking for master/slave cpusJason Wessel
For quite some time there have been problems with memory barriers and various races with NMI on multi processor systems using the kernel debugger. The algorithm for entering the kernel debug core and resuming kernel execution was racy and had several known edge case problems with attempting to debug something on a heavily loaded system using breakpoints that are hit repeatedly and quickly. The prior "locking" design entry worked as follows: * The atomic counter kgdb_active was used with atomic exchange in order to elect a master cpu out of all the cpus that may have taken a debug exception. * The master cpu increments all elements of passive_cpu_wait[]. * The master cpu issues the round up cpus message. * Each "slave cpu" that enters the debug core increments its own element in cpu_in_kgdb[]. * Each "slave cpu" spins on passive_cpu_wait[] until it becomes 0. * The master cpu debugs the system. The new scheme removes the two arrays of atomic counters and replaces them with 2 single counters. One counter is used to count the number of cpus waiting to become a master cpu (because one or more hit an exception). The second counter is use to indicate how many cpus have entered as slave cpus. The new entry logic works as follows: * One or more cpus enters via kgdb_handle_exception() and increments the masters_in_kgdb. Each cpu attempts to get the spin lock called dbg_master_lock. * The master cpu sets kgdb_active to the current cpu. * The master cpu takes the spinlock dbg_slave_lock. * The master cpu asks to round up all the other cpus. * Each slave cpu that is not already in kgdb_handle_exception() will enter and increment slaves_in_kgdb. Each slave will now spin try_locking on dbg_slave_lock. * The master cpu waits for the sum of masters_in_kgdb and slaves_in_kgdb to be equal to the sum of the online cpus. * The master cpu debugs the system. In the new design the kgdb_active can only be changed while holding dbg_master_lock. Stress testing has not turned up any further entry/exit races that existed in the prior locking design. The prior locking design suffered from atomic variables not being truly atomic (in the capacity as used by kgdb) along with memory barrier races. Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Acked-by: Dongdong Deng <dongdong.deng@windriver.com>
2010-10-22debug_core: disable hw_breakpoints on all cores in kgdb_cpu_enter()Dongdong Deng
The slave cpus do not have the hw breakpoints disabled upon entry to the debug_core and as a result could cause unrecoverable recursive faults on badly placed breakpoints, or get out of sync with the arch specific hw breakpoint operations. This patch addresses the problem by invoking kgdb_disable_hw_debug() earlier in kgdb_enter_cpu for each cpu that enters the debug core. The hw breakpoint dis/enable flow should be: master_debug_cpu slave_debug_cpu \ / kgdb_cpu_enter | kgdb_disable_hw_debug --> uninstall pre-enabled hw_breakpoint | do add/rm dis/enable operates to hw_breakpoints on master_debug_cpu.. | correct_hw_break --> correct/install the enabled hw_breakpoint | leave_kgdb Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-10-22debug_core: stop rcu warnings on kernel resumeJason Wessel
When returning from the kernel debugger reset the rcu jiffies_stall value to prevent the rcu stall detector from sending NMI events which invoke a stack dump for each cpu in the system. Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-10-22debug_core: move all watch dog syncs to a single functionJason Wessel
Move the various clock and watch dog syncs to a single function in advance of adding another sync for the rcu stall detector. Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-08-19Input: sysrq - drop tty argument from sysrq ops handlersDmitry Torokhov
Noone is using tty argument so let's get rid of it. Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Acked-by: Jason Wessel <jason.wessel@windriver.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2010-08-05Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb: debug_core,kdb: fix crash when arch does not have single step kgdb,x86: use macro HBP_NUM to replace magic number 4 kgdb,mips: remove unused kgdb_cpu_doing_single_step operations mm,kdb,kgdb: Add a debug reference for the kdb kmap usage KGDB: Remove set but unused newPC ftrace,kdb: Allow dumping a specific cpu's buffer with ftdump ftrace,kdb: Extend kdb to be able to dump the ftrace buffer kgdb,powerpc: Replace hardcoded offset by BREAK_INSTR_SIZE arm,kgdb: Add ability to trap into debugger on notify_die gdbstub: do not directly use dbg_reg_def[] in gdb_cmd_reg_set() gdbstub: Implement gdbserial 'p' and 'P' packets kgdb,arm: Individual register get/set for arm kgdb,mips: Individual register get/set for mips kgdb,x86: Individual register get/set for x86 kgdb,kdb: individual register set and and get API gdbstub: Optimize kgdb's "thread:" response for the gdb serial protocol kgdb: remove custom hex_to_bin()implementation
2010-08-05debug_core,kdb: fix crash when arch does not have single stepJason Wessel
When an arch such as mips and microblaze does not implement either HW or software single stepping the debug core should re-enter kdb. The kdb code will properly ignore the single step operation. Attempting to single step the kernel without software or hardware support causes unpredictable kernel crashes. Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-08-04Merge branch 'master' into for-nextJiri Kosina
2010-07-21debug_core,kdb: fix kgdb_connected bit set in the wrong placeJason Wessel
Immediately following an exit from the kdb shell the kgdb_connected variable should be set to zero, unless there are breakpoints planted. If the kgdb_connected variable is not zeroed out with kdb, it is impossible to turn off kdb. This patch is merely a work around for now, the real fix will check for the breakpoints. Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-07-19update email addressPavel Machek
pavel@suse.cz no longer works, replace it with working address. Signed-off-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-05-20x86, kgdb, init: Add early and late debug statesJason Wessel
The kernel debugger can operate well before mm_init(), but the x86 hardware breakpoint code which uses the perf api requires that the kernel allocators are initialized. This means the kernel debug core needs to provide an optional arch specific call back to allow the initialization functions to run after the kernel has been further initialized. The kdb shell already had a similar restriction with an early initialization and late initialization. The kdb_init() was moved into the debug core's version of the late init which is called dbg_late_init(); CC: kgdb-bugreport@lists.sourceforge.net Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20kdb,debug_core: Allow the debug core to receive a panic notificationJason Wessel
It is highly desirable to trap into kdb on panic. The debug core will attempt to register as the first in line for the panic notifier. CC: Ingo Molnar <mingo@elte.hu> CC: Andrew Morton <akpm@linux-foundation.org> CC: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20debug_core,kdb: Allow the debug core to process a recursive debug entryJason Wessel
This allows kdb to debug a crash with in the kms code with a single level recursive re-entry. Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20kgdb: Add the ability to schedule a breakpoint via a taskletJason Wessel
Some kgdb I/O modules require the ability to create a breakpoint tasklet, such as kgdboc and external modules such as kgdboe. The breakpoint tasklet is used as an asynchronous entry point into the debugger which will have a different function scope than the current execution path where it might not be safe to have an inline breakpoint. This is true of some of the kgdb I/O drivers which share code with kgdb and rest of the kernel users. Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20x86,kgdb: Add low level debug hookJason Wessel
The only way the debugger can handle a trap in inside rcu_lock, notify_die, or atomic_notifier_call_chain without a triple fault is to have a low level "first opportunity handler" in the int3 exception handler. Generally this will be something the vast majority of folks will not need, but for those who need it, it is added as a kernel .config option called KGDB_LOW_LEVEL_TRAP. CC: Ingo Molnar <mingo@elte.hu> CC: Thomas Gleixner <tglx@linutronix.de> CC: H. Peter Anvin <hpa@zytor.com> CC: x86@kernel.org Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20kgdb: remove post_primary_code referencesJason Wessel
Remove all the references to the kgdb_post_primary_code. This function serves no useful purpose because you can obtain the same information from the "struct kgdb_state *ks" from with in the debugger, if for some reason you want the data. Also remove the unintentional duplicate assignment for ks->ex_vector. Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20kgdb: gdb "monitor" -> kdb passthroughJason Wessel
One of the driving forces behind integrating another front end (kdb) to the debug core is to allow front end commands to be accessible via gdb's monitor command. It is true that you could write gdb macros to get certain data, but you may want to just use gdb to access the commands that are available in the kdb front end. This patch implements the Rcmd gdb stub packet. In gdb you access this with the "monitor" command. For instance you could type "monitor help", "monitor lsmod" or "monitor ps A" etc... There is no error checking or command restrictions on what you can and cannot access at this point. Doing something like trying to set breakpoints with the monitor command is going to cause nothing but problems. Perhaps in the future only the commands that are actually known to work with the gdb monitor command will be available. Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20kgdb,8250,pl011: Return immediately from console pollJason Wessel
The design of the kdb shell requires that every device that can provide input to kdb have a polling routine that exits immediately if there is no character available. This is required in order to get the page scrolling mechanism working. Changing the kernel debugger I/O API to require all polling character routines to exit immediately if there is no data allows the kernel debugger to process multiple input channels. NO_POLL_CHAR will be the return code to the polling routine when ever there is no character available. CC: linux-serial@vger.kernel.org Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20kgdb: core changes to support kdbJason Wessel
These are the minimum changes to the kgdb core in order to enable an API to connect a new front end (kdb) to the debug core. This patch introduces the dbg_kdb_mode variable controls where the user level I/O is routed. It will be routed to the gdbstub (kgdb) or to the kdb front end which is a simple shell available over the kgdboc connection. You can switch back and forth between kdb or the gdb stub mode of operation dynamically. From gdb stub mode you can blindly type "$3#33", or from the kdb mode you can enter "kgdb" to switch to the gdb stub. The logic in the debug core depends on kdb to look for the typical gdb connection sequences and return immediately with KGDB_PASS_EVENT if a gdb serial command sequence is detected. That should allow a reasonably seamless transition between kdb -> gdb without leaving the kernel exception state. The two gdb serial queries that kdb is responsible for detecting are the "?" and "qSupported" packets. CC: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Acked-by: Martin Hicks <mort@sgi.com>