summaryrefslogtreecommitdiff
path: root/kernel/trace
AgeCommit message (Collapse)Author
2019-05-15Merge tag 'trace-v5.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "The major changes in this tracing update includes: - Removal of non-DYNAMIC_FTRACE from 32bit x86 - Removal of mcount support from x86 - Emulating a call from int3 on x86_64, fixes live kernel patching - Consolidated Tracing Error logs file Minor updates: - Removal of klp_check_compiler_support() - kdb ftrace dumping output changes - Accessing and creating ftrace instances from inside the kernel - Clean up of #define if macro - Introduction of TRACE_EVENT_NOP() to disable trace events based on config options And other minor fixes and clean ups" * tag 'trace-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits) x86: Hide the int3_emulate_call/jmp functions from UML livepatch: Remove klp_check_compiler_support() ftrace/x86: Remove mcount support ftrace/x86_32: Remove support for non DYNAMIC_FTRACE tracing: Simplify "if" macro code tracing: Fix documentation about disabling options using trace_options tracing: Replace kzalloc with kcalloc tracing: Fix partial reading of trace event's id file tracing: Allow RCU to run between postponed startup tests tracing: Fix white space issues in parse_pred() function tracing: Eliminate const char[] auto variables ring-buffer: Fix mispelling of Calculate tracing: probeevent: Fix to make the type of $comm string tracing: probeevent: Do not accumulate on ret variable tracing: uprobes: Re-enable $comm support for uprobe events ftrace/x86_64: Emulate call function while updating in breakpoint handler x86_64: Allow breakpoints to emulate call instructions x86_64: Add gap to int3 to allow for call emulation tracing: kdb: Allow ftdump to skip all but the last few entries tracing: Add trace_total_entries() / trace_total_entries_cpu() ...
2019-05-11Merge tag 'gpio-v5.2-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull gpio updates from Linus Walleij: "This is the bulk of the GPIO changes for the v5.2 kernel cycle. A bit later than usual because I was ironing out my own mistakes. I'm holding some stuff back for the next kernel as a result, and this should be a healthy and well tested batch. Core changes: - The gpiolib MMIO driver has been enhanced to handle two direction registers, i.e. one register to set lines as input and one register to set lines as output. It turns out some silicon engineer thinks the ability to configure a line as input and output at the same time makes sense, this can be debated but includes a lot of analog electronics reasoning, and the registers are there and need to be handled consistently. Unsurprisingly, we enforce the lines to be either inputs or outputs in such schemes. - Send in the proper argument value to .set_config() dispatched to the pin control subsystem. Nobody used it before, now someone does, so fix it to work as expected. - The ACPI gpiolib portions can now handle pin bias setting (pull up or pull down). This has been in the ACPI spec for years and we finally have it properly integrated with Linux GPIOs. It was based on an observation from Andy Schevchenko that Thomas Petazzoni's changes to the core for biasing the PCA950x GPIO expander actually happen to fit hand-in-glove with what the ACPI core needed. Such nice synergies happen sometimes. New drivers: - A new driver for the Mellanox BlueField GPIO controller. This is using 64bit MMIO registers and can configure lines as inputs and outputs at the same time and after improving the MMIO library we handle it just fine. Interesting. - A new IXP4xx proper gpiochip driver with hierarchical interrupts should be coming in from the ARM SoC tree as well. Driver enhancements: - The PCA053x driver handles the CAT9554 GPIO expander. - The PCA053x driver handles the NXP PCAL6416 GPIO expander. - Wake-up support on PCA053x GPIO lines. - OMAP now does a nice asynchronous IRQ handling on wake-ups by letting everything wake up on edges, and this makes runtime PM work as expected too. Misc: - Several cleanups such as devres fixes. - Get rid of some languager comstructs that cause problems when compiling with LLVMs clang. - Documentation review and update" * tag 'gpio-v5.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (85 commits) gpio: Update documentation docs: gpio: convert docs to ReST and rename to *.rst gpio: sch: Remove write-only core_base gpio: pxa: Make two symbols static gpiolib: acpi: Respect pin bias setting gpiolib: acpi: Add acpi_gpio_update_gpiod_lookup_flags() helper gpiolib: acpi: Set pin value, based on bias, more accurately gpiolib: acpi: Change type of dflags gpiolib: Introduce GPIO_LOOKUP_FLAGS_DEFAULT gpiolib: Make use of enum gpio_lookup_flags consistent gpiolib: Indent entry values of enum gpio_lookup_flags gpio: pca953x: add support for pca6416 dt-bindings: gpio: pca953x: document the nxp,pca6416 gpio: pca953x: add pcal6416 to the of_device_id table gpio: gpio-omap: Remove conditional pm_runtime handling for GPIO interrupts gpio: gpio-omap: configure edge detection for level IRQs for idle wakeup tracing: stop making gpio tracing configurable gpio: pca953x: Configure wake-up path when wake-up is enabled gpio: of: Optimize quirk checks gpio: mmio: Drop bgpio_dir_inverted ...
2019-05-08tracing: Fix documentation about disabling options using trace_optionsSrivatsa S. Bhat (VMware)
To disable a tracing option using the trace_options file, the option name needs to be prefixed with 'no', and not suffixed, as the README states. Fix it. Link: http://lkml.kernel.org/r/154872690031.47356.5739053380942044586.stgit@srivatsa-ubuntu Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-08tracing: Replace kzalloc with kcallocGustavo A. R. Silva
Replace kzalloc() function with its 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a, b, gfp) This code was detected with the help of Coccinelle. Link: http://lkml.kernel.org/r/20190115043408.GA23456@embeddedor Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-08tracing: Fix partial reading of trace event's id fileElazar Leibovich
When reading only part of the id file, the ppos isn't tracked correctly. This is taken care by simple_read_from_buffer. Reading a single byte, and then the next byte would result EOF. While this seems like not a big deal, this breaks abstractions that reads information from files unbuffered. See for example https://github.com/golang/go/issues/29399 This code was mentioned as problematic in commit cd458ba9d5a5 ("tracing: Do not (ab)use trace_seq in event_id_read()") An example C code that show this bug is: #include <stdio.h> #include <stdint.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <unistd.h> int main(int argc, char **argv) { if (argc < 2) return 1; int fd = open(argv[1], O_RDONLY); char c; read(fd, &c, 1); printf("First %c\n", c); read(fd, &c, 1); printf("Second %c\n", c); } Then run with, e.g. sudo ./a.out /sys/kernel/debug/tracing/events/tcp/tcp_set_state/id You'll notice you're getting the first character twice, instead of the first two characters in the id file. Link: http://lkml.kernel.org/r/20181231115837.4932-1-elazar@lightbitslabs.com Cc: Orit Wasserman <orit.was@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: stable@vger.kernel.org Fixes: 23725aeeab10b ("ftrace: provide an id file for each event") Signed-off-by: Elazar Leibovich <elazar@lightbitslabs.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-08tracing: Allow RCU to run between postponed startup testsAnders Roxell
When building a allmodconfig kernel for arm64 and boot that in qemu, CONFIG_FTRACE_STARTUP_TEST gets enabled and that takes time so the watchdog expires and prints out a message like this: 'watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]' Depending on what the what test gets called from init_trace_selftests() it stays minutes in the loop. Rework so that function cond_resched() gets called in the init_trace_selftests loop. Link: http://lkml.kernel.org/r/20181130145622.26334-1-anders.roxell@linaro.org Co-developed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-08tracing: Fix white space issues in parse_pred() functionColin Ian King
Trivial fix to clean up an indentation issue, a whole chunk of code has an extra space in the indentation. Link: http://lkml.kernel.org/r/20181109132312.20994-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-08tracing: Eliminate const char[] auto variablesRasmus Villemoes
Automatic const char[] variables cause unnecessary code generation. For example, the this_mod variable leads to 3f04: 48 b8 5f 5f 74 68 69 73 5f 6d movabs $0x6d5f736968745f5f,%rax # __this_m 3f0e: 4c 8d 44 24 02 lea 0x2(%rsp),%r8 3f13: 48 8d 7c 24 10 lea 0x10(%rsp),%rdi 3f18: 48 89 44 24 02 mov %rax,0x2(%rsp) 3f1d: 4c 89 e9 mov %r13,%rcx 3f20: b8 65 00 00 00 mov $0x65,%eax # e 3f25: 48 c7 c2 00 00 00 00 mov $0x0,%rdx 3f28: R_X86_64_32S .rodata.str1.1+0x18d 3f2c: be 48 00 00 00 mov $0x48,%esi 3f31: c7 44 24 0a 6f 64 75 6c movl $0x6c75646f,0xa(%rsp) # odul 3f39: 66 89 44 24 0e mov %ax,0xe(%rsp) i.e., the string gets built on the stack at runtime. Similar code can be found for the other instances I'm replacing here. Putting the string in .rodata reduces the combined .text+.rodata size and saves time and stack space at runtime. The simplest fix, and what I've done for the this_mod case, is to just make the variable static. However, for the "<faulted>" case where the same string is used twice, that prevents the linker from merging those two literals, so instead use a macro - that also keeps the two instances automatically in sync (instead of only the compile-time strlen expression). Finally, for the two runs of spaces, it turns out that the "build these strings on the stack" is not the worst part of what gcc does - it turns print_func_help_header_irq() into "if (tgid) { /* print_event_info + five seq_printf calls */ } else { /* print event_info + another five seq_printf */}". Taking inspiration from a suggestion from Al Viro, use %.*s to make snprintf either stop after the first two spaces or print the whole string. As a bonus, the seq_printfs now fit on single lines (at least, they are not longer than the existing ones in the function just above), making it easier to see that the ascii art lines up. x86-64 defconfig + CONFIG_FUNCTION_TRACER: $ scripts/stackdelta /tmp/stackusage.{0,1} ./kernel/trace/ftrace.c ftrace_mod_callback 152 136 -16 ./kernel/trace/trace.c trace_default_header 56 32 -24 ./kernel/trace/trace.c tracing_mark_raw_write 96 72 -24 ./kernel/trace/trace.c tracing_mark_write 104 80 -24 bloat-o-meter add/remove: 1/0 grow/shrink: 0/4 up/down: 14/-375 (-361) Function old new delta this_mod - 14 +14 ftrace_mod_callback 577 542 -35 tracing_mark_raw_write 444 374 -70 tracing_mark_write 616 540 -76 trace_default_header 600 406 -194 Link: http://lkml.kernel.org/r/20190320081757.6037-1-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-08ring-buffer: Fix mispelling of CalculateYangtao Li
It's not "Caculate". Link: http://lkml.kernel.org/r/20181101154640.23162-1-tiny.windzz@gmail.com Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-08tracing: probeevent: Fix to make the type of $comm stringMasami Hiramatsu
Fix to make the type of $comm "string". If we set the other type to $comm argument, it shows meaningless value or wrong data. Currently probe events allow us to set string array type (e.g. ":string[2]"), or other digit types like x8 on $comm. But since clearly $comm is just a string data, it should not be fetched by other types including array. Link: http://lkml.kernel.org/r/155723736241.9149.14582064184468574539.stgit@devnote2 Cc: Andreas Ziegler <andreas.ziegler@fau.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: stable@vger.kernel.org Fixes: 533059281ee5 ("tracing: probeevent: Introduce new argument fetching code") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-08tracing: probeevent: Do not accumulate on ret variableMasami Hiramatsu
Do not accumulate strlen result on "ret" local variable, because it is accumulated on "total" local variable for array case. Link: http://lkml.kernel.org/r/155723735237.9149.3192150444705457531.stgit@devnote2 Fixes: 40b53b771806 ("tracing: probeevent: Add array type support") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-08tracing: uprobes: Re-enable $comm support for uprobe eventsMasami Hiramatsu
Since commit 533059281ee5 ("tracing: probeevent: Introduce new argument fetching code") dropped the $comm support from uprobe events, this re-enables it. For $comm support, uses strlcpy() instead of strncpy_from_user() to copy current task's comm. Because it is in the kernel space, strncpy_from_user() always fails to copy the comm. This also uses strlen() instead of strnlen_user() to measure the length of the comm. Note that this uses -ECOMM as a token value to fetch the comm string. If the user-space pointer points -ECOMM, it will be translated to task->comm. Link: http://lkml.kernel.org/r/155723734162.9149.4042756162201097965.stgit@devnote2 Fixes: 533059281ee5 ("tracing: probeevent: Introduce new argument fetching code") Reported-by: Andreas Ziegler <andreas.ziegler@fau.de> Acked-by: Andreas Ziegler <andreas.ziegler@fau.de> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: "Highlights: 1) Support AES128-CCM ciphers in kTLS, from Vakul Garg. 2) Add fib_sync_mem to control the amount of dirty memory we allow to queue up between synchronize RCU calls, from David Ahern. 3) Make flow classifier more lockless, from Vlad Buslov. 4) Add PHY downshift support to aquantia driver, from Heiner Kallweit. 5) Add SKB cache for TCP rx and tx, from Eric Dumazet. This reduces contention on SLAB spinlocks in heavy RPC workloads. 6) Partial GSO offload support in XFRM, from Boris Pismenny. 7) Add fast link down support to ethtool, from Heiner Kallweit. 8) Use siphash for IP ID generator, from Eric Dumazet. 9) Pull nexthops even further out from ipv4/ipv6 routes and FIB entries, from David Ahern. 10) Move skb->xmit_more into a per-cpu variable, from Florian Westphal. 11) Improve eBPF verifier speed and increase maximum program size, from Alexei Starovoitov. 12) Eliminate per-bucket spinlocks in rhashtable, and instead use bit spinlocks. From Neil Brown. 13) Allow tunneling with GUE encap in ipvs, from Jacky Hu. 14) Improve link partner cap detection in generic PHY code, from Heiner Kallweit. 15) Add layer 2 encap support to bpf_skb_adjust_room(), from Alan Maguire. 16) Remove SKB list implementation assumptions in SCTP, your's truly. 17) Various cleanups, optimizations, and simplifications in r8169 driver. From Heiner Kallweit. 18) Add memory accounting on TX and RX path of SCTP, from Xin Long. 19) Switch PHY drivers over to use dynamic featue detection, from Heiner Kallweit. 20) Support flow steering without masking in dpaa2-eth, from Ioana Ciocoi. 21) Implement ndo_get_devlink_port in netdevsim driver, from Jiri Pirko. 22) Increase the strict parsing of current and future netlink attributes, also export such policies to userspace. From Johannes Berg. 23) Allow DSA tag drivers to be modular, from Andrew Lunn. 24) Remove legacy DSA probing support, also from Andrew Lunn. 25) Allow ll_temac driver to be used on non-x86 platforms, from Esben Haabendal. 26) Add a generic tracepoint for TX queue timeouts to ease debugging, from Cong Wang. 27) More indirect call optimizations, from Paolo Abeni" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1763 commits) cxgb4: Fix error path in cxgb4_init_module net: phy: improve pause mode reporting in phy_print_status dt-bindings: net: Fix a typo in the phy-mode list for ethernet bindings net: macb: Change interrupt and napi enable order in open net: ll_temac: Improve error message on error IRQ net/sched: remove block pointer from common offload structure net: ethernet: support of_get_mac_address new ERR_PTR error net: usb: smsc: fix warning reported by kbuild test robot staging: octeon-ethernet: Fix of_get_mac_address ERR_PTR check net: dsa: support of_get_mac_address new ERR_PTR error net: dsa: sja1105: Fix status initialization in sja1105_get_ethtool_stats vrf: sit mtu should not be updated when vrf netdev is the link net: dsa: Fix error cleanup path in dsa_init_module l2tp: Fix possible NULL pointer dereference taprio: add null check on sched_nest to avoid potential null pointer dereference net: mvpp2: cls: fix less than zero check on a u32 variable net_sched: sch_fq: handle non connected flows net_sched: sch_fq: do not assume EDT packets are ordered net: hns3: use devm_kcalloc when allocating desc_cb net: hns3: some cleanup for struct hns3_enet_ring ...
2019-05-06Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: "The changes in here are: - text_poke() fixes and an extensive set of executability lockdowns, to (hopefully) eliminate the last residual circumstances under which we are using W|X mappings even temporarily on x86 kernels. This required a broad range of surgery in text patching facilities, module loading, trampoline handling and other bits. - tweak page fault messages to be more informative and more structured. - remove DISCONTIGMEM support on x86-32 and make SPARSEMEM the default. - reduce KASLR granularity on 5-level paging kernels from 512 GB to 1 GB. - misc other changes and updates" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) x86/mm: Initialize PGD cache during mm initialization x86/alternatives: Add comment about module removal races x86/kprobes: Use vmalloc special flag x86/ftrace: Use vmalloc special flag bpf: Use vmalloc special flag modules: Use vmalloc special flag mm/vmalloc: Add flag for freeing of special permsissions mm/hibernation: Make hibernation handle unmapped pages x86/mm/cpa: Add set_direct_map_*() functions x86/alternatives: Remove the return value of text_poke_*() x86/jump-label: Remove support for custom text poker x86/modules: Avoid breaking W^X while loading modules x86/kprobes: Set instruction page as executable x86/ftrace: Set trampoline pages as executable x86/kgdb: Avoid redundant comparison of patched code x86/alternatives: Use temporary mm for text poking x86/alternatives: Initialize temporary mm for patching fork: Provide a function for copying init_mm uprobes: Initialize uprobes earlier x86/mm: Save debug registers when loading a temporary mm ...
2019-05-06Merge branch 'core-stacktrace-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull stack trace updates from Ingo Molnar: "So Thomas looked at the stacktrace code recently and noticed a few weirdnesses, and we all know how such stories of crummy kernel code meeting German engineering perfection end: a 45-patch series to clean it all up! :-) Here's the changes in Thomas's words: 'Struct stack_trace is a sinkhole for input and output parameters which is largely pointless for most usage sites. In fact if embedded into other data structures it creates indirections and extra storage overhead for no benefit. Looking at all usage sites makes it clear that they just require an interface which is based on a storage array. That array is either on stack, global or embedded into some other data structure. Some of the stack depot usage sites are outright wrong, but fortunately the wrongness just causes more stack being used for nothing and does not have functional impact. Another oddity is the inconsistent termination of the stack trace with ULONG_MAX. It's pointless as the number of entries is what determines the length of the stored trace. In fact quite some call sites remove the ULONG_MAX marker afterwards with or without nasty comments about it. Not all architectures do that and those which do, do it inconsistenly either conditional on nr_entries == 0 or unconditionally. The following series cleans that up by: 1) Removing the ULONG_MAX termination in the architecture code 2) Removing the ULONG_MAX fixups at the call sites 3) Providing plain storage array based interfaces for stacktrace and stackdepot. 4) Cleaning up the mess at the callsites including some related cleanups. 5) Removing the struct stack_trace based interfaces This is not changing the struct stack_trace interfaces at the architecture level, but it removes the exposure to the generic code'" * 'core-stacktrace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) x86/stacktrace: Use common infrastructure stacktrace: Provide common infrastructure lib/stackdepot: Remove obsolete functions stacktrace: Remove obsolete functions livepatch: Simplify stack trace retrieval tracing: Remove the last struct stack_trace usage tracing: Simplify stack trace retrieval tracing: Make ftrace_trace_userstack() static and conditional tracing: Use percpu stack trace buffer more intelligently tracing: Simplify stacktrace retrieval in histograms lockdep: Simplify stack trace handling lockdep: Remove save argument from check_prev_add() lockdep: Remove unused trace argument from print_circular_bug() drm: Simplify stacktrace handling dm persistent data: Simplify stack trace handling dm bufio: Simplify stack trace retrieval btrfs: ref-verify: Simplify stack trace retrieval dma/debug: Simplify stracktrace retrieval fault-inject: Simplify stacktrace retrieval mm/page_owner: Simplify stack trace handling ...
2019-05-06Merge branch 'core-objtool-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: "This is a series from Peter Zijlstra that adds x86 build-time uaccess validation of SMAP to objtool, which will detect and warn about the following uaccess API usage bugs and weirdnesses: - call to %s() with UACCESS enabled - return with UACCESS enabled - return with UACCESS disabled from a UACCESS-safe function - recursive UACCESS enable - redundant UACCESS disable - UACCESS-safe disables UACCESS As it turns out not leaking uaccess permissions outside the intended uaccess functionality is hard when the interfaces are complex and when such bugs are mostly dormant. As a bonus we now also check the DF flag. We had at least one high-profile bug in that area in the early days of Linux, and the checking is fairly simple. The checks performed and warnings emitted are: - call to %s() with DF set - return with DF set - return with modified stack frame - recursive STD - redundant CLD It's all x86-only for now, but later on this can also be used for PAN on ARM and objtool is fairly cross-platform in principle. While all warnings emitted by this new checking facility that got reported to us were fixed, there might be GCC version dependent warnings that were not reported yet - which we'll address, should they trigger. The warnings are non-fatal build warnings" * 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) mm/uaccess: Use 'unsigned long' to placate UBSAN warnings on older GCC versions x86/uaccess: Dont leak the AC flag into __put_user() argument evaluation sched/x86_64: Don't save flags on context switch objtool: Add Direction Flag validation objtool: Add UACCESS validation objtool: Fix sibling call detection objtool: Rewrite alt->skip_orig objtool: Add --backtrace support objtool: Rewrite add_ignores() objtool: Handle function aliases objtool: Set insn->func for alternatives x86/uaccess, kcov: Disable stack protector x86/uaccess, ftrace: Fix ftrace_likely_update() vs. SMAP x86/uaccess, ubsan: Fix UBSAN vs. SMAP x86/uaccess, kasan: Fix KASAN vs SMAP x86/smap: Ditch __stringify() x86/uaccess: Introduce user_access_{save,restore}() x86/uaccess, signal: Fix AC=1 bloat x86/uaccess: Always inline user_access_begin() x86/uaccess, xen: Suppress SMAP warnings ...
2019-05-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Three trivial overlapping conflicts. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-02tracing: kdb: Allow ftdump to skip all but the last few entriesDouglas Anderson
The 'ftdump' command in kdb is currently a bit of a last resort, at least if you have lots of traces turned on. It's going to print a whole boatload of data out your serial port which is probably running at 115200. This could easily take many, many minutes. Usually you're most interested in what's at the _end_ of the ftrace buffer, AKA what happened most recently. That means you've got to wait the full time for the dump. The 'ftdump' command does attempt to help you a little bit by allowing you to skip a fixed number of entries. Unfortunately it provides no way for you to know how many entries you should skip. Let's do similar to python and allow you to use a negative number to indicate that you want to skip all entries except the last few. This allows you to quickly see what you want. Note that we also change the printout in ftdump to print the (positive) number of entries actually skipped since that could be helpful to know when you've specified a negative skip count. Link: http://lkml.kernel.org/r/20190319171206.97107-3-dianders@chromium.org Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-02tracing: Add trace_total_entries() / trace_total_entries_cpu()Douglas Anderson
These two new exported functions will be used in a future patch by kdb_ftdump() to quickly skip all but the last few trace entries. Link: http://lkml.kernel.org/r/20190319171206.97107-2-dianders@chromium.org Acked-by: Daniel Thompson <daniel.thompson@linaro.org> Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-02tracing: kdb: The skip_lines parameter should have been skip_entriesDouglas Anderson
The things skipped by kdb's "ftdump" command when you pass it a parameter has always been entries, not lines. The difference usually doesn't matter but when the trace buffer has multi-line entries (like a stack dump) it can matter. Let's fix this both in the help text for ftdump and also in the local variable names. Link: http://lkml.kernel.org/r/20190319171206.97107-1-dianders@chromium.org Acked-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-30bpf: Fail bpf_probe_write_user() while mm is switchedNadav Amit
When using a temporary mm, bpf_probe_write_user() should not be able to write to user memory, since user memory addresses may be used to map kernel memory. Detect these cases and fail bpf_probe_write_user() in such cases. Suggested-by: Jann Horn <jannh@google.com> Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <akpm@linux-foundation.org> Cc: <ard.biesheuvel@linaro.org> Cc: <deneen.t.dock@intel.com> Cc: <kernel-hardening@lists.openwall.com> Cc: <kristen@linux.intel.com> Cc: <linux_dti@icloud.com> Cc: <will.deacon@arm.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190426001143.4983-24-namit@vmware.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-29tracing: Remove the last struct stack_trace usageThomas Gleixner
Simplify the stack retrieval code by using the storage array based interface. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Alexander Potapenko <glider@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: linux-mm@kvack.org Cc: David Rientjes <rientjes@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: kasan-dev@googlegroups.com Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: iommu@lists.linux-foundation.org Cc: Robin Murphy <robin.murphy@arm.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: David Sterba <dsterba@suse.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: linux-btrfs@vger.kernel.org Cc: dm-devel@redhat.com Cc: Mike Snitzer <snitzer@redhat.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: intel-gfx@lists.freedesktop.org Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: dri-devel@lists.freedesktop.org Cc: David Airlie <airlied@linux.ie> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: linux-arch@vger.kernel.org Link: https://lkml.kernel.org/r/20190425094803.340000461@linutronix.de
2019-04-29tracing: Simplify stack trace retrievalThomas Gleixner
Replace the indirection through struct stack_trace by using the storage array based interfaces. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Alexander Potapenko <glider@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: linux-mm@kvack.org Cc: David Rientjes <rientjes@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: kasan-dev@googlegroups.com Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: iommu@lists.linux-foundation.org Cc: Robin Murphy <robin.murphy@arm.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: David Sterba <dsterba@suse.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: linux-btrfs@vger.kernel.org Cc: dm-devel@redhat.com Cc: Mike Snitzer <snitzer@redhat.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: intel-gfx@lists.freedesktop.org Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: dri-devel@lists.freedesktop.org Cc: David Airlie <airlied@linux.ie> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: linux-arch@vger.kernel.org Link: https://lkml.kernel.org/r/20190425094803.248604594@linutronix.de
2019-04-29tracing: Make ftrace_trace_userstack() static and conditionalThomas Gleixner
It's only used in trace.c and there is absolutely no point in compiling it in when user space stack traces are not supported. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Alexander Potapenko <glider@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: linux-mm@kvack.org Cc: David Rientjes <rientjes@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: kasan-dev@googlegroups.com Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: iommu@lists.linux-foundation.org Cc: Robin Murphy <robin.murphy@arm.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: David Sterba <dsterba@suse.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: linux-btrfs@vger.kernel.org Cc: dm-devel@redhat.com Cc: Mike Snitzer <snitzer@redhat.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: intel-gfx@lists.freedesktop.org Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: dri-devel@lists.freedesktop.org Cc: David Airlie <airlied@linux.ie> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: linux-arch@vger.kernel.org Link: https://lkml.kernel.org/r/20190425094803.162400595@linutronix.de
2019-04-29tracing: Use percpu stack trace buffer more intelligentlyThomas Gleixner
The per cpu stack trace buffer usage pattern is odd at best. The buffer has place for 512 stack trace entries on 64-bit and 1024 on 32-bit. When interrupts or exceptions nest after the per cpu buffer was acquired the stacktrace length is hardcoded to 8 entries. 512/1024 stack trace entries in kernel stacks are unrealistic so the buffer is a complete waste. Split the buffer into 4 nest levels, which are 128/256 entries per level. This allows nesting contexts (interrupts, exceptions) to utilize the cpu buffer for stack retrieval and avoids the fixed length allocation along with the conditional execution pathes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Alexander Potapenko <glider@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: linux-mm@kvack.org Cc: David Rientjes <rientjes@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: kasan-dev@googlegroups.com Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: iommu@lists.linux-foundation.org Cc: Robin Murphy <robin.murphy@arm.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: David Sterba <dsterba@suse.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: linux-btrfs@vger.kernel.org Cc: dm-devel@redhat.com Cc: Mike Snitzer <snitzer@redhat.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: intel-gfx@lists.freedesktop.org Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: dri-devel@lists.freedesktop.org Cc: David Airlie <airlied@linux.ie> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: linux-arch@vger.kernel.org Link: https://lkml.kernel.org/r/20190425094803.066064076@linutronix.de
2019-04-29tracing: Simplify stacktrace retrieval in histogramsThomas Gleixner
The indirection through struct stack_trace is not necessary at all. Use the storage array based interface. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com> Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Alexander Potapenko <glider@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: linux-mm@kvack.org Cc: David Rientjes <rientjes@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: kasan-dev@googlegroups.com Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: iommu@lists.linux-foundation.org Cc: Robin Murphy <robin.murphy@arm.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: David Sterba <dsterba@suse.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: linux-btrfs@vger.kernel.org Cc: dm-devel@redhat.com Cc: Mike Snitzer <snitzer@redhat.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: intel-gfx@lists.freedesktop.org Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: dri-devel@lists.freedesktop.org Cc: David Airlie <airlied@linux.ie> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: linux-arch@vger.kernel.org Link: https://lkml.kernel.org/r/20190425094802.979089273@linutronix.de
2019-04-29tracing: Cleanup stack trace codeThomas Gleixner
- Remove the extra array member of stack_dump_trace[] along with the ARRAY_SIZE - 1 initialization for struct stack_trace :: max_entries. Both are historical leftovers of no value. The stack tracer never exceeds the array and there is no extra storage requirement either. - Make variables which are only used in trace_stack.c static. - Simplify the enable/disable logic. - Rename stack_trace_print() as it's using the stack_trace_ namespace. Free the name up for stack trace related functions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Alexander Potapenko <glider@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: linux-mm@kvack.org Cc: David Rientjes <rientjes@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: kasan-dev@googlegroups.com Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: iommu@lists.linux-foundation.org Cc: Robin Murphy <robin.murphy@arm.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: David Sterba <dsterba@suse.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: linux-btrfs@vger.kernel.org Cc: dm-devel@redhat.com Cc: Mike Snitzer <snitzer@redhat.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: intel-gfx@lists.freedesktop.org Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: dri-devel@lists.freedesktop.org Cc: David Airlie <airlied@linux.ie> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: linux-arch@vger.kernel.org Link: https://lkml.kernel.org/r/20190425094801.230654524@linutronix.de
2019-04-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2019-04-28 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Introduce BPF socket local storage map so that BPF programs can store private data they associate with a socket (instead of e.g. separate hash table), from Martin. 2) Add support for bpftool to dump BTF types. This is done through a new `bpftool btf dump` sub-command, from Andrii. 3) Enable BPF-based flow dissector for skb-less eth_get_headlen() calls which was currently not supported since skb was used to lookup netns, from Stanislav. 4) Add an opt-in interface for tracepoints to expose a writable context for attached BPF programs, used here for NBD sockets, from Matt. 5) BPF xadd related arm64 JIT fixes and scalability improvements, from Daniel. 6) Change the skb->protocol for bpf_skb_adjust_room() helper in order to support tunnels such as sit. Add selftests as well, from Willem. 7) Various smaller misc fixes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-26bpf: add writable context for raw tracepointsMatt Mullins
This is an opt-in interface that allows a tracepoint to provide a safe buffer that can be written from a BPF_PROG_TYPE_RAW_TRACEPOINT program. The size of the buffer must be a compile-time constant, and is checked before allowing a BPF program to attach to a tracepoint that uses this feature. The pointer to this buffer will be the first argument of tracepoints that opt in; the pointer is valid and can be bpf_probe_read() by both BPF_PROG_TYPE_RAW_TRACEPOINT and BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE programs that attach to such a tracepoint, but the buffer to which it points may only be written by the latter. Signed-off-by: Matt Mullins <mmullins@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-26Merge tag 'trace-v5.1-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Three tracing fixes: - Use "nosteal" for ring buffer splice pages - Memory leak fix in error path of trace_pid_write() - Fix preempt_enable_no_resched() (use preempt_enable()) in ring buffer code" * tag 'trace-v5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: trace: Fix preempt_enable_no_resched() abuse tracing: Fix a memory leak by early error exit in trace_pid_write() tracing: Fix buffer_ref pipe ops
2019-04-26trace: Fix preempt_enable_no_resched() abusePeter Zijlstra
Unless the very next line is schedule(), or implies it, one must not use preempt_enable_no_resched(). It can cause a preemption to go missing and thereby cause arbitrary delays, breaking the PREEMPT=y invariant. Link: http://lkml.kernel.org/r/20190423200318.GY14281@hirez.programming.kicks-ass.net Cc: Waiman Long <longman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: the arch/x86 maintainers <x86@kernel.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: huang ying <huang.ying.caritas@gmail.com> Cc: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: stable@vger.kernel.org Fixes: 2c2d7329d8af ("tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp()") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-26tracing: Fix a memory leak by early error exit in trace_pid_write()Wenwen Wang
In trace_pid_write(), the buffer for trace parser is allocated through kmalloc() in trace_parser_get_init(). Later on, after the buffer is used, it is then freed through kfree() in trace_parser_put(). However, it is possible that trace_pid_write() is terminated due to unexpected errors, e.g., ENOMEM. In that case, the allocated buffer will not be freed, which is a memory leak bug. To fix this issue, free the allocated buffer when an error is encountered. Link: http://lkml.kernel.org/r/1555726979-15633-1-git-send-email-wang6495@umn.edu Fixes: f4d34a87e9c10 ("tracing: Use pid bitmap instead of a pid array for set_event_pid") Cc: stable@vger.kernel.org Signed-off-by: Wenwen Wang <wang6495@umn.edu> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-26tracing: Fix buffer_ref pipe opsJann Horn
This fixes multiple issues in buffer_pipe_buf_ops: - The ->steal() handler must not return zero unless the pipe buffer has the only reference to the page. But generic_pipe_buf_steal() assumes that every reference to the pipe is tracked by the page's refcount, which isn't true for these buffers - buffer_pipe_buf_get(), which duplicates a buffer, doesn't touch the page's refcount. Fix it by using generic_pipe_buf_nosteal(), which refuses every attempted theft. It should be easy to actually support ->steal, but the only current users of pipe_buf_steal() are the virtio console and FUSE, and they also only use it as an optimization. So it's probably not worth the effort. - The ->get() and ->release() handlers can be invoked concurrently on pipe buffers backed by the same struct buffer_ref. Make them safe against concurrency by using refcount_t. - The pointers stored in ->private were only zeroed out when the last reference to the buffer_ref was dropped. As far as I know, this shouldn't be necessary anyway, but if we do it, let's always do it. Link: http://lkml.kernel.org/r/20190404215925.253531-1-jannh@google.com Cc: Ingo Molnar <mingo@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Fixes: 73a757e63114d ("ring-buffer: Return reader page back into existing ring buffer") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Two easy cases of overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf-next 2019-04-22 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) allow stack/queue helpers from more bpf program types, from Alban. 2) allow parallel verification of root bpf programs, from Alexei. 3) introduce bpf sysctl hook for trusted root cases, from Andrey. 4) recognize var/datasec in btf deduplication, from Andrii. 5) cpumap performance optimizations, from Jesper. 6) verifier prep for alu32 optimization, from Jiong. 7) libbpf xsk cleanup, from Magnus. 8) other various fixes and cleanups. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-21function_graph: Have selftest also emulate tr->reset() as it did with tr->init()Steven Rostedt (VMware)
The function_graph boot up self test emulates the tr->init() function in order to add a wrapper around the function graph tracer entry code to test for lock ups and such. But it does not emulate the tr->reset(), and just calls the function_graph tracer tr->reset() function which will use its own fgraph_ops to unregister function tracing with. As the fgraph_ops is becoming more meaningful with the register_ftrace_graph() and unregister_ftrace_graph() functions, the two need to be the same. The emulated tr->init() uses its own fgraph_ops descriptor, which means the unregister_ftrace_graph() must use the same ftrace_ops, which the selftest currently does not do. By emulating the tr->reset() as the selftest does with the tr->init() it will be able to pass the same fgraph_ops descriptor to the unregister_ftrace_graph() as it did with the register_ftrace_graph(). Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-19kprobes: Mark ftrace mcount handler functions nokprobeMasami Hiramatsu
Mark ftrace mcount handler functions nokprobe since probing on these functions with kretprobe pushes return address incorrectly on kretprobe shadow stack. Reported-by: Francis Deslauriers <francis.deslauriers@efficios.com> Tested-by: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/155094062044.6137.6419622920568680640.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16bpf: add map helper functions push, pop, peek in more BPF programsAlban Crequy
commit f1a2e44a3aec ("bpf: add queue and stack maps") introduced new BPF helper functions: - BPF_FUNC_map_push_elem - BPF_FUNC_map_pop_elem - BPF_FUNC_map_peek_elem but they were made available only for network BPF programs. This patch makes them available for tracepoint, cgroup and lirc programs. Signed-off-by: Alban Crequy <alban@kinvolk.io> Cc: Mauricio Vasquez B <mauricio.vasquez@polito.it> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-14Merge branch 'page-refs' (page ref overflow)Linus Torvalds
Merge page ref overflow branch. Jann Horn reported that he can overflow the page ref count with sufficient memory (and a filesystem that is intentionally extremely slow). Admittedly it's not exactly easy. To have more than four billion references to a page requires a minimum of 32GB of kernel memory just for the pointers to the pages, much less any metadata to keep track of those pointers. Jann needed a total of 140GB of memory and a specially crafted filesystem that leaves all reads pending (in order to not ever free the page references and just keep adding more). Still, we have a fairly straightforward way to limit the two obvious user-controllable sources of page references: direct-IO like page references gotten through get_user_pages(), and the splice pipe page duplication. So let's just do that. * branch page-refs: fs: prevent page refcount overflow in pipe_buf_get mm: prevent get_user_pages() from overflowing page refcount mm: add 'try_get_page()' helper function mm: make page ref count overflow check tighter and more explicit
2019-04-14tracing: Remove the ULONG_MAX stack trace hackeryThomas Gleixner
No architecture terminates the stack trace with ULONG_MAX anymore. As the code checks the number of entries stored anyway there is no point in keeping all that ULONG_MAX magic around. The histogram code zeroes the storage before saving the stack, so if the trace is shorter than the maximum number of entries it can terminate the print loop if a zero entry is detected. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Alexander Potapenko <glider@google.com> Link: https://lkml.kernel.org/r/20190410103645.048761764@linutronix.de
2019-04-14fs: prevent page refcount overflow in pipe_buf_getMatthew Wilcox
Change pipe_buf_get() to return a bool indicating whether it succeeded in raising the refcount of the page (if the thing in the pipe is a page). This removes another mechanism for overflowing the page refcount. All callers converted to handle a failure. Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Matthew Wilcox <willy@infradead.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-11ftrace: Do not process STUB functions in ftrace_ops_list_func()Steven Rostedt (VMware)
The function_graph tracer has a stub function and its ops flag has the FTRACE_OPS_FL_STUB set. As the function graph does not use the ftrace_ops->func pointer but instead is called by a separate part of the ftrace trampoline. The function_graph tracer still requires to pass in a ftrace_ops that may also hold the hash of the functions to call. But there's no reason to test that hash in the function tracing portion. Instead of testing to see if we should call the stub function, just test if the ops has FTRACE_OPS_FL_STUB set, and just skip it. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-10ftrace: Remove ASSIGN_OPS_HASH() macro from ftrace.cSteven Rostedt (VMware)
The ASSIGN_OPS_HASH() macro was moved to fgraph.c where it was used, but for some reason it wasn't removed from ftrace.c, as it is no longer referenced there. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-08tracing: Add error_log to READMETom Zanussi
Add brief blurb about error_log to the 'Important files' section. Link: http://lkml.kernel.org/r/c81e60f9aded495081231a32d2d1023c4d043a7a.1554072478.git.tom.zanussi@linux.intel.com Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-08tracing: Have the error logs show up in the proper instancesSteven Rostedt (VMware)
As each instance has their own error_log file, it makes more sense that the instances show the errors of their own instead of all error_logs having the same data. Make it that the errors show up in the instance error_log file that the error happens in. If no instance trace_array is available, then NULL can be passed in which will create the error in the top level instance (the one at the top of the tracefs directory). Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-08tracing: Have histogram code pass around trace_array for error handlingSteven Rostedt (VMware)
Have the trace_array that associates the trace instance of the histogram passed around to functions so that error handling can display the error message in the proper instance. Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-08tracing: Add trace_array parameter to create_event_filter()Steven Rostedt (VMware)
Pass in the trace_array that represents the instance the filter being changed is in to create_event_filter(). This will allow for error messages that happen when writing to the filter can be displayed in the proper instance "error_log" file. Note, for calls to create_filter() (that was also modified to support create_event_filter()), that changes filters that do not exist in a instance (for perf for example), NULL may be passed in, which means that there will not be any message to log for that filter. Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-08tracing: stop making gpio tracing configurableUwe Kleine-König
gpio tracing was made configurable in 4.4-rc1 (commit ddd70280bf0e ("tracing: gpio: Add Kconfig option for enabling/disabling trace events")). Since then it is the only event type that can be compiled conditionally. Given that there is only little overhead I don't understand the reasoning and I was annoyed more than once that gpio events were not available without recompiling. So drop the Kconfig symbol and make gpio events available unconditionally. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-04-05syscalls: Remove start and number from syscall_get_arguments() argsSteven Rostedt (Red Hat)
At Linux Plumbers, Andy Lutomirski approached me and pointed out that the function call syscall_get_arguments() implemented in x86 was horribly written and not optimized for the standard case of passing in 0 and 6 for the starting index and the number of system calls to get. When looking at all the users of this function, I discovered that all instances pass in only 0 and 6 for these arguments. Instead of having this function handle different cases that are never used, simply rewrite it to return the first 6 arguments of a system call. This should help out the performance of tracing system calls by ptrace, ftrace and perf. Link: http://lkml.kernel.org/r/20161107213233.754809394@goodmis.org Cc: Oleg Nesterov <oleg@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Dave Martin <dave.martin@arm.com> Cc: "Dmitry V. Levin" <ldv@altlinux.org> Cc: x86@kernel.org Cc: linux-snps-arc@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: uclinux-h8-devel@lists.sourceforge.jp Cc: linux-hexagon@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: nios2-dev@lists.rocketboards.org Cc: openrisc@lists.librecores.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-riscv@lists.infradead.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: linux-um@lists.infradead.org Cc: linux-xtensa@linux-xtensa.org Cc: linux-arch@vger.kernel.org Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts Acked-by: Max Filippov <jcmvbkbc@gmail.com> # For xtensa changes Acked-by: Will Deacon <will.deacon@arm.com> # For the arm64 bits Reviewed-by: Thomas Gleixner <tglx@linutronix.de> # for x86 Reviewed-by: Dmitry V. Levin <ldv@altlinux.org> Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-04tracing/syscalls: Pass in hardcoded 6 into syscall_get_arguments()Steven Rostedt (Red Hat)
The only users that calls syscall_get_arguments() with a variable and not a hard coded '6' is ftrace_syscall_enter(). syscall_get_arguments() can be optimized by removing a variable input, and always grabbing 6 arguments regardless of what the system call actually uses. Change ftrace_syscall_enter() to pass the 6 args into a local stack array and copy the necessary arguments into the trace event as needed. This is needed to remove two parameters from syscall_get_arguments(). Link: http://lkml.kernel.org/r/20161107213233.627583542@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>