summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/dumpstack_64.c
AgeCommit message (Collapse)Author
2019-11-05x86/dumpstack/64: Don't evaluate exception stacks before setupThomas Gleixner
Cyrill reported the following crash: BUG: unable to handle page fault for address: 0000000000001ff0 #PF: supervisor read access in kernel mode RIP: 0010:get_stack_info+0xb3/0x148 It turns out that if the stack tracer is invoked before the exception stack mappings are initialized in_exception_stack() can erroneously classify an invalid address as an address inside of an exception stack: begin = this_cpu_read(cea_exception_stacks); <- 0 end = begin + sizeof(exception stacks); i.e. any address between 0 and end will be considered as exception stack address and the subsequent code will then try to derefence the resulting stack frame at a non mapped address. end = begin + (unsigned long)ep->size; ==> end = 0x2000 regs = (struct pt_regs *)end - 1; ==> regs = 0x2000 - sizeof(struct pt_regs *) = 0x1ff0 info->next_sp = (unsigned long *)regs->sp; ==> Crashes due to accessing 0x1ff0 Prevent this by checking the validity of the cea_exception_stack base address and bailing out if it is zero. Fixes: afcd21dad88b ("x86/dumpstack/64: Use cpu_entry_area instead of orig_ist") Reported-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Cyrill Gorcunov <gorcunov@gmail.com> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1910231950590.1852@nanos.tec.linutronix.de
2019-04-17x86/irq/64: Rename irq_stack_ptr to hardirq_stack_ptrThomas Gleixner
Preparatory patch to share code with 32bit. No functional changes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: "Chang S. Bae" <chang.seok.bae@intel.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolai Stange <nstange@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190414160145.912584074@linutronix.de
2019-04-17x86/dumpstack/64: Speedup in_exception_stack()Thomas Gleixner
The current implementation of in_exception_stack() iterates over the exception stacks array. Most of the time this is an useless exercise, but even for the actual use cases (perf and ftrace) it takes at least 2 iterations to get to the NMI stack. As the exception stacks and the guard pages are page aligned the loop can be avoided completely. Add a initial check whether the stack pointer is inside the full exception stack area and leave early if not. Create a lookup table which describes the stack area. The table index is the page offset from the beginning of the exception stacks. So for any given stack pointer the page offset is computed and a lookup in the description table is performed. If it is inside a guard page, return. If not, use the descriptor to fill in the info structure. The table is filled at compile time and for the !KASAN case the interesting page descriptors exactly fit into a single cache line. Just the last guard page descriptor is in the next cacheline, but that should not be accessed in the regular case. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190414160145.543320386@linutronix.de
2019-04-17x86/exceptions: Split debug IST stackThomas Gleixner
The debug IST stack is actually two separate debug stacks to handle #DB recursion. This is required because the CPU starts always at top of stack on exception entry, which means on #DB recursion the second #DB would overwrite the stack of the first. The low level entry code therefore adjusts the top of stack on entry so a secondary #DB starts from a different stack page. But the stack pages are adjacent without a guard page between them. Split the debug stack into 3 stacks which are separated by guard pages. The 3rd stack is never mapped into the cpu_entry_area and is only there to catch triple #DB nesting: --- top of DB_stack <- Initial stack --- end of DB_stack guard page --- top of DB1_stack <- Top of stack after entering first #DB --- end of DB1_stack guard page --- top of DB2_stack <- Top of stack after entering second #DB --- end of DB2_stack guard page If DB2 would not act as the final guard hole, a second #DB would point the top of #DB stack to the stack below #DB1 which would be valid and not catch the not so desired triple nesting. The backing store does not allocate any memory for DB2 and its guard page as it is not going to be mapped into the cpu_entry_area. - Adjust the low level entry code so it adjusts top of #DB with the offset between the stacks instead of exception stack size. - Make the dumpstack code aware of the new stacks. - Adjust the in_debug_stack() implementation and move it into the NMI code where it belongs. As this is NMI hotpath code, it just checks the full area between top of DB_stack and bottom of DB1_stack without checking for the guard page. That's correct because the NMI cannot hit a stackpointer pointing to the guard page between DB and DB1 stack. Even if it would, then the NMI operation still is unaffected, but the resume of the debug exception on the topmost DB stack will crash by touching the guard page. [ bp: Make exception_stack_names static const char * const ] Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: "Chang S. Bae" <chang.seok.bae@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: linux-doc@vger.kernel.org Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qian Cai <cai@lca.pw> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190414160145.439944544@linutronix.de
2019-04-17x86/dumpstack/64: Use cpu_entry_area instead of orig_istThomas Gleixner
The orig_ist[] array is a shadow copy of the IST array in the TSS. The reason why it exists is that older kernels used two TSS variants with different pointers into the debug stack. orig_ist[] contains the real starting points. There is no point anymore to do so because the same information can be retrieved using the base address of the cpu entry area mapping and the offsets of the various exception stacks. No functional change. Preparation for removing orig_ist. Cc: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190414160144.974900463@linutronix.de
2019-04-17x86/exceptions: Make IST index zero basedThomas Gleixner
The defines for the exception stack (IST) array in the TSS are using the SDM convention IST1 - IST7. That causes all sorts of code to subtract 1 for array indices related to IST. That's confusing at best and does not provide any value. Make the indices zero based and fixup the usage sites. The only code which needs to adjust the 0 based index is the interrupt descriptor setup which needs to add 1 now. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: "Chang S. Bae" <chang.seok.bae@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: linux-doc@vger.kernel.org Cc: Nicolai Stange <nstange@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qian Cai <cai@lca.pw> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190414160144.331772825@linutronix.de
2019-04-17x86/dumpstack: Fix off-by-one errors in stack identificationAndy Lutomirski
The get_stack_info() function is off-by-one when checking whether an address is on a IRQ stack or a IST stack. This prevents an overflowed IRQ or IST stack from being dumped properly. [ tglx: Do the same for 32-bit ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190414160143.785651055@linutronix.de
2018-03-08x86/dumpstack: Unify show_regs()Borislav Petkov
The 32-bit version uses KERN_EMERG and commit b0f4c4b32c8e ("bugs, x86: Fix printk levels for panic, softlockups and stack dumps") changed the 64-bit version to KERN_DEFAULT. The same justification in that commit that those messages do not belong in the terminal, holds true for 32-bit also, so make it so. Make code_bytes static, while at it. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Link: https://lkml.kernel.org/r/20180306094920.16917-4-bp@alien8.de
2017-12-22x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stackDave Hansen
If the kernel oopses while on the trampoline stack, it will print "<SYSENTER>" even if SYSENTER is not involved. That is rather confusing. The "SYSENTER" stack is used for a lot more than SYSENTER now. Give it a better string to display in stack dumps, and rename the kernel code to match. Also move the 32-bit code over to the new naming even though it still uses the entry stack only for SYSENTER. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-17x86/dumpstack: Add get_stack_info() support for the SYSENTER stackAndy Lutomirski
get_stack_info() doesn't currently know about the SYSENTER stack, so unwinding will fail if we entered the kernel on the SYSENTER stack and haven't fully switched off. Teach get_stack_info() about the SYSENTER stack. With future patches applied that run part of the entry code on the SYSENTER stack and introduce an intentional BUG(), I would get: PANIC: double fault, error_code: 0x0 ... RIP: 0010:do_error_trap+0x33/0x1c0 ... Call Trace: Code: ... With this patch, I get: PANIC: double fault, error_code: 0x0 ... Call Trace: <SYSENTER> ? async_page_fault+0x36/0x60 ? invalid_op+0x22/0x40 ? async_page_fault+0x36/0x60 ? sync_regs+0x3c/0x40 ? sync_regs+0x2e/0x40 ? error_entry+0x6c/0xd0 ? async_page_fault+0x36/0x60 </SYSENTER> Code: ... which is a lot more informative. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.392711508@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-18x86/dumpstack: Fix interrupt and exception stack boundary checksJosh Poimboeuf
On x86_64, the double fault exception stack is located immediately after the interrupt stack in memory. This causes confusion in the unwinder when it tries to unwind through an empty interrupt stack, where the stack pointer points to the address bordering the two stacks. The unwinder incorrectly thinks it's running on the double fault stack. Fix this kind of stack border confusion by never considering the beginning address of an exception or interrupt stack to be part of the stack. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Fixes: 5fe599e02e41 ("x86/dumpstack: Add support for unwinding empty IRQ stacks") Link: http://lkml.kernel.org/r/bcc142160a5104de5c354c21c394c93a0173943f.1499786555.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-27x86/debug: Implement __WARN() using UD0Peter Zijlstra
By using "UD0" for WARN()s we remove the function call and its possible __FILE__ and __LINE__ immediate arguments from the instruction stream. Total image size will not change much, what we win in the instruction stream we'll lose because of the __bug_table entries. Still, saves on I$ footprint and the total image size does go down a bit. text data filename 10702123 4530992 defconfig-build/vmlinux.orig 10682460 4530992 defconfig-build/vmlinux.patched (UML didn't seem to use GENERIC_BUG at all, so remove it) Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Weinberger <richard.weinberger@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar
<linux/sched/debug.h> We are going to split <linux/sched/debug.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/debug.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-21x86/dumpstack: Make stack name tags more comprehensibleJosh Poimboeuf
NMI stack dumps are bracketed by the following tags: <NMI> ... <EOE> The ending tag is kind of confusing if you don't already know what "EOE" means (end of exception). The same ending tag is also used to mark the end of all other exceptions' stacks. For example: <#DF> ... <EOE> And similarly, "EOI" is used as the ending tag for interrupts: <IRQ> ... <EOI> Change the tags to be more comprehensible by making them symmetrical and more XML-esque: <NMI> ... </NMI> <#DF> ... </#DF> <IRQ> ... </IRQ> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/180196e3754572540b595bc56b947d43658979a7.1479491159.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-27x86/dumpstack: Warn on stack recursionJosh Poimboeuf
Print a warning if stack recursion is detected. Use printk_deferred_once() because the unwinder can be called with the console lock by lockdep via save_stack_trace(). Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/def18247aafaab480844484398e793f552b79bda.1477496147.git.jpoimboe@redhat.com [ Unbroke the lines. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-25x86/dumpstack: Remove raw stack dumpJosh Poimboeuf
For mostly historical reasons, the x86 oops dump shows the raw stack values: ... [registers] Stack: ffff880079af7350 ffff880079905400 0000000000000000 ffffc900008f3ae0 ffffffffa0196610 0000000000000001 00010000ffffffff 0000000087654321 0000000000000002 0000000000000000 0000000000000000 0000000000000000 Call Trace: ... This seems to be an artifact from long ago, and probably isn't needed anymore. It generally just adds noise to the dump, and it can be actively harmful because it leaks kernel addresses. Linus says: "The stack dump actually goes back to forever, and it used to be useful back in 1992 or so. But it used to be useful mainly because stacks were simpler and we didn't have very good call traces anyway. I definitely remember having used them - I just do not remember having used them in the last ten+ years. Of course, it's still true that if you can trigger an oops, you've likely already lost the security game, but since the stack dump is so useless, let's aim to just remove it and make games like the above harder." This also removes the related 'kstack=' cmdline option and the 'kstack_depth_to_print' sysctl. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/e83bd50df52d8fe88e94d2566426ae40d813bf8f.1477405374.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20x86/dumpstack: Remove dump_trace() and related callbacksJosh Poimboeuf
All previous users of dump_trace() have been converted to use the new unwind interfaces, so we can remove it and the related print_context_stack() and print_context_stack_bp() callback functions. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/5b97da3572b40b5a4d8e185cf2429308d0987a13.1474045023.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinderJosh Poimboeuf
Convert show_trace_log_lvl() to use the new unwinder. dump_trace() has been deprecated. show_trace_log_lvl() is special compared to other users of the unwinder. It's the only place where both reliable *and* unreliable addresses are needed. With frame pointers enabled, most callers of the unwinder don't want to know about unreliable addresses. But in this case, when we're dumping the stack to the console because something presumably went wrong, the unreliable addresses are useful: - They show stale data on the stack which can provide useful clues. - If something goes wrong with the unwinder, or if frame pointers are corrupt or missing, all the stack addresses still get shown. So in order to show all addresses on the stack, and at the same time figure out which addresses are reliable, we have to do the scanning and the unwinding in parallel. The scanning is done with the help of get_stack_info() to traverse the stacks. The unwinding is done separately by the new unwinder. In theory we could simplify show_trace_log_lvl() by instead pushing some of this logic into the unwind code. But then we would need some kind of "fake" frame logic in the unwinder which would add a lot of complexity and wouldn't be worth it in order to support only one user. Another benefit of this approach is that once we have a DWARF unwinder, we should be able to just plug it in with minimal impact to this code. Another change here is that callers of show_trace_log_lvl() don't need to provide the 'bp' argument. The unwinder already finds the relevant frame pointer by unwinding until it reaches the first frame after the provided stack pointer. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/703b5998604c712a1f801874b43f35d6dac52ede.1474045023.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16x86/dumpstack: Remove NULL task pointer conventionJosh Poimboeuf
show_stack_log_lvl() and friends allow a NULL pointer for the task_struct to indicate the current task. This creates confusion and can cause sneaky bugs. Instead require the caller to pass 'current' directly. This only changes the internal workings of the dumpstack code. The dump_trace() and show_stack() interfaces still allow a NULL task pointer. Those interfaces should also probably be fixed as well. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16x86/dumpstack: Pin the target stack when dumping itAndy Lutomirski
Specifically, pin the stack in save_stack_trace_tsk() and show_trace_log_lvl(). This will prevent a crash if the target task dies before or while dumping its stack once we start freeing task stacks early. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jann Horn <jann@thejh.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/cf0082cde65d1941a996d026f2b2cdbfaca17bfa.1474003868.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15x86/dumpstack: Add recursion checking for all stacksJosh Poimboeuf
in_exception_stack() has some recursion checking which makes sure the stack trace code never traverses a given exception stack more than once. This prevents an infinite loop if corruption somehow causes a stack's "next stack" pointer to point to itself (directly or indirectly). The recursion checking can be useful for other stacks in addition to the exception stack, so extend it to work for all stacks. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/95de5db4cfe111754845a5cef04e20630d01423f.1473905218.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15x86/dumpstack: Add support for unwinding empty IRQ stacksJosh Poimboeuf
When an interrupt happens in entry code while running on a software IRQ stack, and the IRQ stack was empty, regs->sp will contain the stack end address (e.g., irq_stack_ptr). If the regs are passed to dump_trace(), get_stack_info() will report STACK_TYPE_UNKNOWN, causing dump_trace() to return prematurely without trying to go to the next stack. Update the bounds checking for software interrupt stacks so that the ending address is now considered part of the stack. This means that it's now possible for the 'walk_stack' callbacks -- print_context_stack() and print_context_stack_bp() -- to be called with an empty stack. But that's fine; they're already prepared to deal with that due to their on_stack() checks. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/5a5e5de92dcf11e8dc6b6e8e50ad7639d067830b.1473905218.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15x86/dumpstack: Add get_stack_info() interfaceJosh Poimboeuf
valid_stack_ptr() is buggy: it assumes that all stacks are of size THREAD_SIZE, which is not true for exception stacks. So the walk_stack() callbacks will need to know the location of the beginning of the stack as well as the end. Another issue is that in general the various features of a stack (type, size, next stack pointer, description string) are scattered around in various places throughout the stack dump code. Encapsulate all that information in a single place with a new stack_info struct and a get_stack_info() interface. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/8164dd0db96b7e6a279fa17ae5e6dc375eecb4a9.1473905218.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15x86/dumpstack: Simplify in_exception_stack()Josh Poimboeuf
in_exception_stack() does some bad, bad things just so the unwinder can print different values for different areas of the debug exception stack. There's no need to clarify where exactly on the stack it is. Just print "#DB" and be done with it. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/e91cb410169dd576678dd427c35efb716fd0cee1.1473905218.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-14x86/dumpstack: Allow preemption in show_stack_log_lvl() and dump_trace()Josh Poimboeuf
show_stack_log_lvl() and dump_trace() are already preemption safe: - If they're running in irq or exception context, preemption is already disabled and the percpu stack pointers can be trusted. - If they're running with preemption enabled, they must be running on the task stack anyway, so it doesn't matter if they're comparing the stack pointer against a percpu stack pointer from this CPU or another one: either way it won't match. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/a0ca0b1044eca97d4f0ec7c1619cf80b3b65560d.1473371307.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-08x86/dumpstack: Remove unnecessary stack pointer argumentsJosh Poimboeuf
When calling show_stack_log_lvl() or dump_trace() with a regs argument, providing a stack pointer or frame pointer is redundant. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>d Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Brian Gerst <brgerst@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1694e2e955e3b9a73a3c3d5ba2634344014dd550.1472057064.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-08x86/dumpstack: Add get_stack_pointer() and get_frame_pointer()Josh Poimboeuf
The various functions involved in dumping the stack all do similar things with regard to getting the stack pointer and the frame pointer based on the regs and task arguments. Create helper functions to do that instead. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Brian Gerst <brgerst@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/f448914885a35f333fe04da1b97a6c2cc1f80974.1472057064.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18x86/dumpstack: Remove 64-byte gap at end of irq stackJosh Poimboeuf
There has been a 64-byte gap at the end of the irq stack for at least 12 years. It predates git history, and I can't find any good reason for it. Remove it. What's the worst that could happen? Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/14f9281c5475cc44af95945ea7546bff2e3836db.1471535549.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18x86/dumpstack: Remove extra brackets around "<EOE>"Josh Poimboeuf
When starting the dump of an exception stack, it shows "<<EOE>>" instead of "<EOE>". print_trace_stack() already adds brackets, no need to add them again. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/77f185fd5b81845869b400aa619415458df6b6cc.1471535549.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-01Merge branch 'x86-headers-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 header cleanups from Ingo Molnar: "This tree is a cleanup of the x86 tree reducing spurious uses of module.h - which should improve build performance a bit" * 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, crypto: Restore MODULE_LICENSE() to glue_helper.c so it loads x86/apic: Remove duplicated include from probe_64.c x86/ce4100: Remove duplicated include from ce4100.c x86/headers: Include spinlock_types.h in x8664_ksyms_64.c for missing spinlock_t x86/platform: Delete extraneous MODULE_* tags fromm ts5500 x86: Audit and remove any remaining unnecessary uses of module.h x86/kvm: Audit and remove any unnecessary uses of module.h x86/xen: Audit and remove any unnecessary uses of module.h x86/platform: Audit and remove any unnecessary uses of module.h x86/lib: Audit and remove any unnecessary uses of module.h x86/kernel: Audit and remove any unnecessary uses of module.h x86/mm: Audit and remove any unnecessary uses of module.h x86: Don't use module.h just for AUTHOR / LICENSE tags
2016-07-25Merge branch 'x86-debug-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 stackdump update from Ingo Molnar: "A number of stackdump enhancements" * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/dumpstack: Add show_stack_regs() and use it printk: Make the printk*once() variants return a value x86/dumpstack: Honor supplied @regs arg
2016-07-15x86/dumpstack/64: Handle faults when printing the "Stack: " part of an OOPSAndy Lutomirski
If we overflow the stack into a guard page, we'll recursively fault when trying to dump the contents of the guard page. Use probe_kernel_address() so we can recover if this happens. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/e626d47a55d7b04dcb1b4d33faa95e8505b217c8.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-14x86/kernel: Audit and remove any unnecessary uses of module.hPaul Gortmaker
Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in doing so is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each obj-y/bool instance for the presence of either and replace as needed. Build testing revealed some implicit header usage that was fixed up accordingly. Note that some bool/obj-y instances remain since module.h is the header for some exception table entry stuff, and for things like __init_or_module (code that is tossed when MODULES=n). Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160714001901.31603-4-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-08x86/dumpstack: Honor supplied @regs argAndy Lutomirski
The comment suggests that show_stack(NULL, NULL) should backtrace the current context, but the code doesn't match the comment. If regs are given, start the "Stack:" hexdump at regs->sp. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1467671487-10344-2-git-send-email-bp@alien8.de Link: http://lkml.kernel.org/r/efcd79bf4106d61f1cd258c2caa87f3a0618eeac.1466036668.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-23x86: avoid avoid passing around 'thread_info' in stack dumping codeLinus Torvalds
None of the code actually wants a thread_info, it all wants a task_struct, and it's just converting to a thread_info pointer much too early. No semantic change. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-23x86/kernel: Use kstack_end() in dumpstack_64.cAdrien Schildknecht
i386 is already using kstack_end() in dumpstack_32.c. We should also use it to make the code clearer and unify the stack printing logic some more. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Adrien Schildknecht <adrien+dev@schischi.me> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: c: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1424618638-6375-1-git-send-email-adrien+dev@schischi.me Signed-off-by: Borislav Petkov <bp@suse.de>
2015-02-23x86/kernel: Fix output of show_stack_log_lvl()Adrien Schildknecht
show_stack_log_lvl() does not set the log level after a new line, the following messages printed with pr_cont() are thus assigned to the default log level. This patch prepends the log level to the next message following a new line. print_trace_address() uses printk(log_lvl). Using printk() with just a log level is ignored and thus has no effect on the next pr_cont(). We need to prepend the log level directly into the message. Signed-off-by: Adrien Schildknecht <adrien+dev@schischi.me> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1424399661-20327-1-git-send-email-adrien+dev@schischi.me Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-23x86_64, traps: Stop using IST for #SSAndy Lutomirski
On a 32-bit kernel, this has no effect, since there are no IST stacks. On a 64-bit kernel, #SS can only happen in user code, on a failed iret to user space, a canonical violation on access via RSP or RBP, or a genuine stack segment violation in 32-bit kernel code. The first two cases don't need IST, and the latter two cases are unlikely fatal bugs, and promoting them to double faults would be fine. This fixes a bug in which the espfix64 code mishandles a stack segment violation. This saves 4k of memory per CPU and a tiny bit of code. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-02x86: Fix dumpstack_64 irq stack handlingSteven Rostedt (Red Hat)
Commit 2223f6f6eeaa "x86: Clean up dumpstack_64.c code" changed the irq_stack processing a little from what it was before. The irq_stack_end variable needed to be cleared after its first use. By setting irq_stack to the per cpu irq_stack and passing that to analyze_stack(), and then clearing it after it is processed, we can get back the original behavior. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-02x86: Fix dumpstack_64 to keep state of "used" variable in loopSteven Rostedt (Red Hat)
Commit 2223f6f6eeaa "x86: Clean up dumpstack_64.c code" moved the used variable to a local within the loop, but the in_exception_stack() depended on being non-volatile with the ability to change it. By always re-initializing the "used" variable to zero, it would cause the in_exception_stack() to return the same thing each time, and cause the dump_stack loop to go into an infinite loop. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-06x86: Clean up dumpstack_64.c codeSteven Rostedt
The dump_trace() function in dumpstack_64.c is hard to follow. The test for exception stack is processed differently than the test for irq stack, and the normal stack is outside completely. By restructuring this code to have all the stacks determined by a single function that returns an enum of the following: STACK_IS_NORMAL STACK_IS_EXCEPTION STACK_IS_IRQ STACK_IS_UNKNOWN and has the logic of each within a switch statement. This should make the code much easier to read and understand. Link: http://lkml.kernel.org/r/20110806012354.684598995@goodmis.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20140206144322.086050042@goodmis.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-30dump_stack: unify debug information printed by show_regs()Tejun Heo
show_regs() is inherently arch-dependent but it does make sense to print generic debug information and some archs already do albeit in slightly different forms. This patch introduces a generic function to print debug information from show_regs() so that different archs print out the same information and it's much easier to modify what's printed. show_regs_print_info() prints out the same debug info as dump_stack() does plus task and thread_info pointers. * Archs which didn't print debug info now do. alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r, metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc, um, xtensa * Already prints debug info. Replaced with show_regs_print_info(). The printed information is superset of what used to be there. arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86 * s390 is special in that it used to print arch-specific information along with generic debug info. Heiko and Martin think that the arch-specific extra isn't worth keeping s390 specfic implementation. Converted to use the generic version. Note that now all archs print the debug info before actual register dumps. An example BUG() dump follows. kernel BUG at /work/os/work/kernel/workqueue.c:4841! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7 Hardware name: empty empty/S3992, BIOS 080011 10/26/2007 task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000 RIP: 0010:[<ffffffff8234a07e>] [<ffffffff8234a07e>] init_workqueues+0x4/0x6 RSP: 0000:ffff88007c861ec8 EFLAGS: 00010246 RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001 RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650 0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760 Call Trace: [<ffffffff81000312>] do_one_initcall+0x122/0x170 [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8 [<ffffffff81c47760>] ? rest_init+0x140/0x140 [<ffffffff81c4776e>] kernel_init+0xe/0xf0 [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0 [<ffffffff81c47760>] ? rest_init+0x140/0x140 ... v2: Typo fix in x86-32. v3: CPU number dropped from show_regs_print_info() as dump_stack_print_info() has been updated to print it. s390 specific implementation dropped as requested by s390 maintainers. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Sam Ravnborg <sam@ravnborg.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> [tile bits] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-20x86: Move call to print_modules() out of show_regs()Jan Beulich
Printing the list of loaded modules is really unrelated to what this function is about, and is particularly unnecessary in the context of the SysRQ key handling (gets printed so far over and over). It should really be the caller of the function to decide whether this piece of information is useful (and to avoid redundantly printing it). Signed-off-by: Jan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/4FDF21A4020000780008A67F@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level>Joe Perches
Use a more current logging style: - Bare printks should have a KERN_<LEVEL> for consistency's sake - Add pr_fmt where appropriate - Neaten some macro definitions - Convert some Ok output to OK - Use "%s: ", __func__ in pr_fmt for summit - Convert some printks to pr_<level> Message output is not identical in all cases. Signed-off-by: Joe Perches <joe@perches.com> Cc: levinsasha928@gmail.com Link: http://lkml.kernel.org/r/1337655007.24226.10.camel@joe2Laptop [ merged two similar patches, tidied up the changelog ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09x86: Avoid double stack traces with show_regs()Jan Beulich
What was called show_registers() so far already showed a stack trace for kernel faults, and kernel_stack_pointer() isn't even valid to be used for faults from user mode, hence it was pointless for show_regs() to call show_trace() after show_registers(). Simply rename show_registers() to show_regs() and eliminate the old definition. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/4FAA3D3902000078000826E1@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-02-02Merge branches 'core-urgent-for-linus', 'perf-urgent-for-linus', ↵Linus Torvalds
'sched-urgent-for-linus' and 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: bugs, x86: Fix printk levels for panic, softlockups and stack dumps * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf top: Fix number of samples displayed perf tools: Fix strlen() bug in perf_event__synthesize_event_type() perf tools: Fix broken build by defining _GNU_SOURCE in Makefile x86/dumpstack: Remove unneeded check in dump_trace() perf: Fix broken interrupt rate throttling * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/rt: Fix task stack corruption under __ARCH_WANT_INTERRUPTS_ON_CTXSW sched: Fix ancient race in do_exit() sched/nohz: Fix nohz cpu idle load balancing state with cpu hotplug sched/s390: Fix compile error in sched/core.c sched: Fix rq->nr_uninterruptible update race * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/reboot: Remove VersaLogic Menlow reboot quirk x86/reboot: Skip DMI checks if reboot set by user x86: Properly parenthesize cmpxchg() macro arguments
2012-01-28x86/dumpstack: Remove unneeded check in dump_trace()Dan Carpenter
Smatch complains that we have some inconsistent NULL checking. If "task" were NULL then it would lead to a NULL dereference later. We can remove this test because earlier on in the function we have: if (!task) task = current; Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Clemens Ladisch <clemens@ladisch.de> Link: http://lkml.kernel.org/r/20120128105246.GA25092@elgon.mountain Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-26bugs, x86: Fix printk levels for panic, softlockups and stack dumpsPrarit Bhargava
rsyslog will display KERN_EMERG messages on a connected terminal. However, these messages are useless/undecipherable for a general user. For example, after a softlockup we get: Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ... kernel:Stack: Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ... kernel:Call Trace: Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ... kernel:Code: ff ff a8 08 75 25 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e0 0f 01 c9 <e8> ea 69 dd ff 4c 29 e8 48 89 c7 e8 0f bc da ff 49 89 c4 49 89 This happens because the printk levels for these messages are incorrect. Only an informational message should be displayed on a terminal. I modified the printk levels for various messages in the kernel and tested the output by using the drivers/misc/lkdtm.c kernel modules (ie, softlockups, panics, hard lockups, etc.) and confirmed that the console output was still the same and that the output to the terminals was correct. For example, in the case of a softlockup we now see the much more informative: Message from syslogd@intel-s3e37-04 at Jan 25 10:18:06 ... BUG: soft lockup - CPU4 stuck for 60s! instead of the above confusing messages. AFAICT, the messages no longer have to be KERN_EMERG. In the most important case of a panic we set console_verbose(). As for the other less severe cases the correct data is output to the console and /var/log/messages. Successfully tested by me using the drivers/misc/lkdtm.c module. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: dzickus@redhat.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1327586134-11926-1-git-send-email-prarit@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-19x86, dumpstack: Fix code bytes breakage due to missing KERN_CONTClemens Ladisch
When printing the code bytes in show_registers(), the markers around the byte at the fault address could make the printk() format string look like a valid log level and facility code. This would prevent this byte from being printed and result in a spurious newline: [ 7555.765589] Code: 8b 32 e9 94 00 00 00 81 7d 00 ff 00 00 00 0f 87 96 00 00 00 48 8b 83 c0 00 00 00 44 89 e2 44 89 e6 48 89 df 48 8b 80 d8 02 00 00 [ 7555.765683] 8b 48 28 48 89 d0 81 e2 ff 0f 00 00 48 c1 e8 0c 48 c1 e0 04 Add KERN_CONT where needed, and elsewhere in show_registers() for consistency. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Link: http://lkml.kernel.org/r/4EEFA7AE.9020407@ladisch.de Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>