summaryrefslogtreecommitdiff
path: root/include/trace
AgeCommit message (Collapse)Author
2018-01-23rpcrdma: infrastructure for static trace points in rpcrdma.koChuck Lever
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-01-23rdma/ib: Add trace point macros to display human-readable valuesChuck Lever
These can be shared with all kernel ULPs, and more can easily be added as needed. Note: checkpatch.pl has some heartburn with the TRACE_DEFINE_ENUM macros and the LIST macros. These follow the same style as other header files under include/tracing/events , thus should be considered acceptable exceptions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-01-22btrfs: Remove redundant FLAG_VACANCYNikolay Borisov
Commit 9036c10208e1 ("Btrfs: update hole handling v2") added the FLAG_VACANCY to denote holes, however there was already a consistent way of flagging extents which represent hole - ->block_start = EXTENT_MAP_HOLE. And also the only place where this flag is checked is in the fiemap code, but the block_start value is also checked and every other place in the filesystem detects holes by using block_start value's. So remove the extra flag. This survived a full xfstest run. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-18Merge branch 'pm-cpufreq'Rafael J. Wysocki
* pm-cpufreq: (36 commits) cpufreq: scpi: remove arm_big_little dependency drivers: psci: remove cluster terminology and dependency on physical_package_id cpufreq: powernv: Dont assume distinct pstate values for nominal and pmin cpufreq: intel_pstate: Add Skylake servers support cpufreq: intel_pstate: Replace bxt_funcs with core_funcs cpufreq: imx6q: add 696MHz operating point for i.mx6ul ARM: dts: imx6ul: add 696MHz operating point cpufreq: stats: Change return type of cpufreq_stats_update() as void powernv-cpufreq: Treat pstates as opaque 8-bit values powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates powernv-cpufreq: Add helper to extract pstate from PMSR cpu_cooling: Remove static-power related documentation cpufreq: imx6q: switch to Use clk_bulk_get() to refine clk operations PM / OPP: Make local function ti_opp_supply_set_opp() static PM / OPP: Add ti-opp-supply driver dt-bindings: opp: Introduce ti-opp-supply bindings cpufreq: ti-cpufreq: Add support for multiple regulators cpufreq: ti-cpufreq: Convert to module_platform_driver cpufreq: Add DVFS support for Armada 37xx MAINTAINERS: add new entries for Armada 37xx cpufreq driver ...
2018-01-18Merge branch 'pm-cpufreq-thermal' into pm-cpufreqRafael J. Wysocki
* pm-cpufreq-thermal: cpu_cooling: Remove static-power related documentation cpu_cooling: Drop static-power related stuff cpu_cooling: Keep only one of_cpufreq*cooling_register() helper cpu_cooling: Remove unused cpufreq_power_cooling_register() cpu_cooling: Make of_cpufreq_power_cooling_register() parse DT
2018-01-16hrtimer: Add clock bases and hrtimer mode for softirq contextAnna-Maria Gleixner
Currently hrtimer callback functions are always executed in hard interrupt context. Users of hrtimers, which need their timer function to be executed in soft interrupt context, make use of tasklets to get the proper context. Add additional hrtimer clock bases for timers which must expire in softirq context, so the detour via the tasklet can be avoided. This is also required for RT, where the majority of hrtimer is moved into softirq hrtimer context. The selection of the expiry mode happens via a mode bit. Introduce HRTIMER_MODE_SOFT and the matching combinations with the ABS/REL/PINNED bits and update the decoding of hrtimer_mode in tracepoints. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: keescook@chromium.org Link: http://lkml.kernel.org/r/20171221104205.7269-27-anna-maria@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16tracing/hrtimer: Print the hrtimer mode in the 'hrtimer_start' tracepointAnna-Maria Gleixner
The 'hrtimer_start' tracepoint lacks the mode information. The mode is important because consecutive starts can switch from ABS to REL or from PINNED to non PINNED. Append the mode field. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: keescook@chromium.org Link: http://lkml.kernel.org/r/20171221104205.7269-10-anna-maria@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16tracing/hrtimer: Fix tracing bugs by taking all clock bases and modes into ↵Anna-Maria Gleixner
account So far only CLOCK_MONOTONIC and CLOCK_REALTIME were taken into account as well as HRTIMER_MODE_ABS/REL in the hrtimer_init tracepoint. The query for detecting the ABS or REL timer modes is not valid anymore, it got broken by the introduction of HRTIMER_MODE_PINNED. HRTIMER_MODE_PINNED is not evaluated in the hrtimer_init() call, but for the sake of completeness print all given modes. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: keescook@chromium.org Link: http://lkml.kernel.org/r/20171221104205.7269-9-anna-maria@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-08net: tracepoint: exposing sk_faimily in tracepoint inet_sock_set_stateYafang Shao
As of now, there're two sk_family are traced with sock:inet_sock_set_state, which are AF_INET and AF_INET6. So the sk_family are exposed as well. Then we can conveniently use it to do the filter. Both sk_family and sk_protocol are showed in the printk message, so we need not expose them as tracepoint arguments. Suggested-by: Brendan Gregg <brendan.d.gregg@gmail.com> Suggested-by: Song Liu <songliubraving@fb.com> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Reviewed-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-03Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Updates to use cond_resched() instead of cond_resched_rcu_qs() where feasible (currently everywhere except in kernel/rcu and in kernel/torture.c). Also a couple of fixes to avoid sending IPIs to offline CPUs. - Updates to simplify RCU's dyntick-idle handling. - Updates to remove almost all uses of smp_read_barrier_depends() and read_barrier_depends(). - Miscellaneous fixes. - Torture-test updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-02f2fs: recover directory operations by fsyncJaegeuk Kim
This fixes generic/342 which doesn't recover renamed file which was fsynced before. It will be done via another fsync on newly created file. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-01-02net: dccp: Add DCCP sendmsg trace eventMasami Hiramatsu
Add DCCP sendmsg trace event (dccp/dccp_probe) for replacing dccpprobe. User can trace this event via ftrace or perftools. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-02net: sctp: Add SCTP ACK tracking trace eventMasami Hiramatsu
Add SCTP ACK tracking trace event to trace the changes of SCTP association state in response to incoming packets. It is used for debugging SCTP congestion control algorithms, and will replace sctp_probe module. Note that this event a bit tricky. Since this consists of 2 events (sctp_probe and sctp_probe_path) so you have to enable both events as below. # cd /sys/kernel/debug/tracing # echo 1 > events/sctp/sctp_probe/enable # echo 1 > events/sctp/sctp_probe_path/enable Or, you can enable all the events under sctp. # echo 1 > events/sctp/enable Since sctp_probe_path event is always invoked from sctp_probe event, you can not see any output if you only enable sctp_probe_path. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-02net: tcp: Add trace events for TCP congestion window tracingMasami Hiramatsu
This adds an event to trace TCP stat variables with slightly intrusive trace-event. This uses ftrace/perf event log buffer to trace those state, no needs to prepare own ring-buffer, nor custom user apps. User can use ftrace to trace this event as below; # cd /sys/kernel/debug/tracing # echo 1 > events/tcp/tcp_probe/enable (run workloads) # cat trace Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-02Merge 4.15-rc6 into char-misc-nextGreg Kroah-Hartman
We want the fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
net/ipv6/ip6_gre.c is a case of parallel adds. include/trace/events/tcp.h is a little bit more tricky. The removal of in-trace-macro ifdefs in 'net' paralleled with moving show_tcp_state_name and friends over to include/trace/events/sock.h in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) IPv6 gre tunnels end up with different default features enabled depending upon whether netlink or ioctls are used to bring them up. Fix from Alexey Kodanev. 2) Fix read past end of user control message in RDS< from Avinash Repaka. 3) Missing RCU barrier in mini qdisc code, from Cong Wang. 4) Missing policy put when reusing per-cpu route entries, from Florian Westphal. 5) Handle nested PCI errors properly in bnx2x driver, from Guilherme G. Piccoli. 6) Run nested transport mode IPSEC packets via tasklet, from Herbert Xu. 7) Fix handling poll() for stream sockets in tipc, from Parthasarathy Bhuvaragan. 8) Fix two stack-out-of-bounds issues in IPSEC, from Steffen Klassert. 9) Another zerocopy ubuf handling fix, from Willem de Bruijn. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits) strparser: Call sock_owned_by_user_nocheck sock: Add sock_owned_by_user_nocheck skbuff: in skb_copy_ubufs unclone before releasing zerocopy tipc: fix hanging poll() for stream sockets sctp: Replace use of sockets_allocated with specified macro. bnx2x: Improve reliability in case of nested PCI errors tg3: Enable PHY reset in MTU change path for 5720 tg3: Add workaround to restrict 5762 MRRS to 2048 tg3: Update copyright net: fec: unmap the xmit buffer that are not transferred by DMA tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path tipc: error path leak fixes in tipc_enable_bearer() RDS: Check cmsg_len before dereferencing CMSG_DATA tcp: Avoid preprocessor directives in tracepoint macro args tipc: fix memory leak of group member when peer node is lost net: sched: fix possible null pointer deref in tcf_block_put tipc: base group replicast ack counter on number of actual receivers net_sched: fix a missing rcu barrier in mini_qdisc_pair_swap() net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround ip6_gre: fix device features for ioctl setup ...
2017-12-27net/trace: fix printk format in inet_sock_set_stateYafang Shao
There's a space character missed in the printk messages. Put the message into one line could simplify searching for the messages in the kernel source. Fixes: 563e0bb0dc74("net: tracepoint: replace tcp_set_state tracepoint with inet_sock_set_state tracepoint") Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-26tcp: Avoid preprocessor directives in tracepoint macro argsMat Martineau
Using a preprocessor directive to check for CONFIG_IPV6 in the middle of a DECLARE_EVENT_CLASS macro's arg list causes sparse to report a series of errors: ./include/trace/events/tcp.h:68:1: error: directive in argument list ./include/trace/events/tcp.h:75:1: error: directive in argument list ./include/trace/events/tcp.h:144:1: error: directive in argument list ./include/trace/events/tcp.h:151:1: error: directive in argument list ./include/trace/events/tcp.h:216:1: error: directive in argument list ./include/trace/events/tcp.h:223:1: error: directive in argument list ./include/trace/events/tcp.h:274:1: error: directive in argument list ./include/trace/events/tcp.h:281:1: error: directive in argument list Once sparse finds an error, it stops printing warnings for the file it is checking. This masks any sparse warnings that would normally be reported for the core TCP code. Instead, handle the preprocessor conditionals in a couple of auxiliary macros. This also has the benefit of reducing duplicate code. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-22Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "Here's a trio of fixes: - The runtime PM clk patches that landed this merge window forgot to runtime resume devices that may be off while recalculating and setting rates of child clks of whatever clk is changing rates. - We had a NULL pointer deref in an old clk tracepoint when clk_set_parent() is called with a NULL parent pointer. This shouldn't really happen, but it's best to avoid this regardless. - The sun9i-mmc clk driver didn't provide 'reset' support, just 'assert' and 'deassert' so the MMC driver stopped probing when the probe was changed to do a reset instead of assert/deassert pair. This implements the reset so things work again" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: sunxi: sun9i-mmc: Implement reset callback for reset controls clk: fix a panic error caused by accessing NULL pointer clk: Manage proper runtime PM state in clk_change_rate()
2017-12-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Lots of overlapping changes. Also on the net-next side the XDP state management is handled more in the generic layers so undo the 'net' nfp fix which isn't applicable in net-next. Include a necessary change by Jakub Kicinski, with log message: ==================== cls_bpf no longer takes care of offload tracking. Make sure netdevsim performs necessary checks. This fixes a warning caused by TC trying to remove a filter it has not added. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-21Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "ARM fixes: - A bug in handling of SPE state for non-vhe systems - A fix for a crash on system shutdown - Three timer fixes, introduced by the timer optimizations for v4.15 x86 fixes: - fix for a WARN that was introduced in 4.15 - fix for SMM when guest uses PCID - fixes for several bugs found by syzkaller ... and a dozen papercut fixes for the kvm_stat tool" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits) tools/kvm_stat: sort '-f help' output kvm: x86: fix RSM when PCID is non-zero KVM: Fix stack-out-of-bounds read in write_mmio KVM: arm/arm64: Fix timer enable flow KVM: arm/arm64: Properly handle arch-timer IRQs after vtimer_save_state KVM: arm/arm64: timer: Don't set irq as forwarded if no usable GIC KVM: arm/arm64: Fix HYP unmapping going off limits arm64: kvm: Prevent restoring stale PMSCR_EL1 for vcpu KVM/x86: Check input paging mode when cs.l is set tools/kvm_stat: add line for totals tools/kvm_stat: stop ignoring unhandled arguments tools/kvm_stat: suppress usage information on command line errors tools/kvm_stat: handle invalid regular expressions tools/kvm_stat: add hint on '-f help' to man page tools/kvm_stat: fix child trace events accounting tools/kvm_stat: fix extra handling of 'help' with fields filter tools/kvm_stat: fix missing field update after filter change tools/kvm_stat: fix drilldown in events-by-guests mode tools/kvm_stat: fix command line option '-g' kvm: x86: fix WARN due to uninitialized guest FPU state ...
2017-12-20net: tracepoint: replace tcp_set_state tracepoint with inet_sock_set_state ↵Yafang Shao
tracepoint As sk_state is a common field for struct sock, so the state transition tracepoint should not be a TCP specific feature. Currently it traces all AF_INET state transition, so I rename this tracepoint to inet_sock_set_state tracepoint with some minor changes and move it into trace/events/sock.h. We dont need to create a file named trace/events/inet_sock.h for this one single tracepoint. Two helpers are introduced to trace sk_state transition - void inet_sk_state_store(struct sock *sk, int newstate); - void inet_sk_set_state(struct sock *sk, int state); As trace header should not be included in other header files, so they are defined in sock.c. The protocol such as SCTP maybe compiled as a ko, hence export inet_sk_set_state(). Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-20tcp: Export to userspace the TCP state names for the trace eventsSteven Rostedt (VMware)
The TCP trace events (specifically tcp_set_state), maps emums to symbol names via __print_symbolic(). But this only works for reading trace events from the tracefs trace files. If perf or trace-cmd were to record these events, the event format file does not convert the enum names into numbers, and you get something like: __print_symbolic(REC->oldstate, { TCP_ESTABLISHED, "TCP_ESTABLISHED" }, { TCP_SYN_SENT, "TCP_SYN_SENT" }, { TCP_SYN_RECV, "TCP_SYN_RECV" }, { TCP_FIN_WAIT1, "TCP_FIN_WAIT1" }, { TCP_FIN_WAIT2, "TCP_FIN_WAIT2" }, { TCP_TIME_WAIT, "TCP_TIME_WAIT" }, { TCP_CLOSE, "TCP_CLOSE" }, { TCP_CLOSE_WAIT, "TCP_CLOSE_WAIT" }, { TCP_LAST_ACK, "TCP_LAST_ACK" }, { TCP_LISTEN, "TCP_LISTEN" }, { TCP_CLOSING, "TCP_CLOSING" }, { TCP_NEW_SYN_RECV, "TCP_NEW_SYN_RECV" }) Where trace-cmd and perf do not know the values of those enums. Use the TRACE_DEFINE_ENUM() macros that will have the trace events convert the enum strings into their values at system boot. This will allow perf and trace-cmd to see actual numbers and not enums: __print_symbolic(REC->oldstate, { 1, "TCP_ESTABLISHED" }, { 2, "TCP_SYN_SENT" }, { 3, "TCP_SYN_RECV" }, { 4, "TCP_FIN_WAIT1" }, { 5, "TCP_FIN_WAIT2" }, { 6, "TCP_TIME_WAIT" }, { 7, "TCP_CLOSE" }, { 8, "TCP_CLOSE_WAIT" }, { 9, "TCP_LAST_ACK" }, { 10, "TCP_LISTEN" }, { 11, "TCP_CLOSING" }, { 12, "TCP_NEW_SYN_RECV" }) Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-19siox: add support for tracingUwe Kleine-König
Implement tracing for SIOX. There are events for the data that is written to the bus and for data being read from it. Acked-by: Gavin Schenk <g.schenk@eckelmann.de> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-18KVM: Fix stack-out-of-bounds read in write_mmioWanpeng Li
Reported by syzkaller: BUG: KASAN: stack-out-of-bounds in write_mmio+0x11e/0x270 [kvm] Read of size 8 at addr ffff8803259df7f8 by task syz-executor/32298 CPU: 6 PID: 32298 Comm: syz-executor Tainted: G OE 4.15.0-rc2+ #18 Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016 Call Trace: dump_stack+0xab/0xe1 print_address_description+0x6b/0x290 kasan_report+0x28a/0x370 write_mmio+0x11e/0x270 [kvm] emulator_read_write_onepage+0x311/0x600 [kvm] emulator_read_write+0xef/0x240 [kvm] emulator_fix_hypercall+0x105/0x150 [kvm] em_hypercall+0x2b/0x80 [kvm] x86_emulate_insn+0x2b1/0x1640 [kvm] x86_emulate_instruction+0x39a/0xb90 [kvm] handle_exception+0x1b4/0x4d0 [kvm_intel] vcpu_enter_guest+0x15a0/0x2640 [kvm] kvm_arch_vcpu_ioctl_run+0x549/0x7d0 [kvm] kvm_vcpu_ioctl+0x479/0x880 [kvm] do_vfs_ioctl+0x142/0x9a0 SyS_ioctl+0x74/0x80 entry_SYSCALL_64_fastpath+0x23/0x9a The path of patched vmmcall will patch 3 bytes opcode 0F 01 C1(vmcall) to the guest memory, however, write_mmio tracepoint always prints 8 bytes through *(u64 *)val since kvm splits the mmio access into 8 bytes. This leaks 5 bytes from the kernel stack (CVE-2017-17741). This patch fixes it by just accessing the bytes which we operate on. Before patch: syz-executor-5567 [007] .... 51370.561696: kvm_mmio: mmio write len 3 gpa 0x10 val 0x1ffff10077c1010f After patch: syz-executor-13416 [002] .... 51302.299573: kvm_mmio: mmio write len 3 gpa 0x10 val 0xc1010f Reported-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Three sets of overlapping changes, two in the packet scheduler and one in the meson-gxl PHY driver. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-14Merge tag 'trace-v4.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Various fix-ups: - comment fixes - build fix - better memory alloction (don't use NR_CPUS) - configuration fix - build warning fix - enhanced callback parameter (to simplify users of trace hooks) - give up on stack tracing when RCU isn't watching (it's a lost cause)" * tag 'trace-v4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Have stack trace not record if RCU is not watching tracing: Pass export pointer as argument to ->write() ring-buffer: Remove unused function __rb_data_page_index() tracing: make PREEMPTIRQ_EVENTS depend on TRACING tracing: Allocate mask_str buffer dynamically tracing: always define trace_{irq,preempt}_{enable_disable} tracing: Fix code comments in trace.c
2017-12-13net: bridge: use rhashtable for fdbsNikolay Aleksandrov
Before this patch the bridge used a fixed 256 element hash table which was fine for small use cases (in my tests it starts to degrade above 1000 entries), but it wasn't enough for medium or large scale deployments. Modern setups have thousands of participants in a single bridge, even only enabling vlans and adding a few thousand vlan entries will cause a few thousand fdbs to be automatically inserted per participating port. So we need to scale the fdb table considerably to cope with modern workloads, and this patch converts it to use a rhashtable for its operations thus improving the bridge scalability. Tests show the following results (10 runs each), at up to 1000 entries rhashtable is ~3% slower, at 2000 rhashtable is 30% faster, at 3000 it is 2 times faster and at 30000 it is 50 times faster. Obviously this happens because of the properties of the two constructs and is expected, rhashtable keeps pretty much a constant time even with 10000000 entries (tested), while the fixed hash table struggles considerably even above 10000. As a side effect this also reduces the net_bridge struct size from 3248 bytes to 1344 bytes. Also note that the key struct is 8 bytes. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11Merge branches 'cond_resched.2017.12.04a', 'dyntick.2017.11.28a', ↵Paul E. McKenney
'fixes.2017.12.11a', 'srbd.2017.12.05a' and 'torture.2017.12.11a' into HEAD cond_resched.2017.12.04a: Convert cond_resched_rcu_qs() to cond_resched() dyntick.2017.11.28a: Make RCU dynticks handle interrupts from NMI fixes.2017.12.11a: Miscellaneous fixes srbd.2017.12.05a: Remove now-redundant smp_read_barrier_depends() torture.2017.12.11a: Torture-testing update
2017-12-11tracing, rcu: Hide trace event rcu_nocb_wake when not usedSteven Rostedt (VMware)
The trace event rcu_nocb_wake is only used when CONFIG_RCU_NOCB_CPU is defined. But the trace event is defined regardless. As defined trace events take up memory, it is a waste to have it defined when not used. Surround the trace event with an #ifdef to have it only defined when it is used. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-12-07cpu_cooling: Drop static-power related stuffViresh Kumar
No one has used it for the last two and half years (since it was introduced by commit c36cf0717631 (thermal: cpu_cooling: implement the power cooling device API), get rid of it. Acked-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-12-05clk: fix a panic error caused by accessing NULL pointerCai Li
In some cases the clock parent would be set NULL when doing re-parent, it will cause a NULL pointer accessing if clk_set trace event is enabled. This patch sets the parent as "none" if the input parameter is NULL. Fixes: dfc202ead312 (clk: Add tracepoints for hardware operations) Signed-off-by: Cai Li <cai.li@spreadtrum.com> Signed-off-by: Chunyan Zhang <chunyan.zhang@spreadtrum.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2017-12-04tracing: always define trace_{irq,preempt}_{enable_disable}Arnd Bergmann
We get a build error in the irqsoff tracer in some configurations: kernel/trace/trace_irqsoff.c: In function 'trace_preempt_on': kernel/trace/trace_irqsoff.c:855:2: error: implicit declaration of function 'trace_preempt_enable_rcuidle'; did you mean 'trace_irq_enable_rcuidle'? [-Werror=implicit-function-declaration] trace_preempt_enable_rcuidle(a0, a1); The problem is that trace_preempt_enable_rcuidle() has different definition based on multiple Kconfig symbols, but not all combinations have a valid definition. This changes the conditions so that we always get exactly one definition of each of the four tracing macros. I have not tried to verify that these definitions are sensible, but now we can build all randconfig combinations again. Link: http://lkml.kernel.org/r/20171019083230.2450779-1-arnd@arndb.de Fixes: d59158162e03 ("tracing: Add support for preempt and irq enable/disable events") Acked-by: Joel Fernandes <joelaf@google.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-12-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2017-12-02 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a compilation warning in xdp redirect tracepoint due to missing bpf.h include that pulls in struct bpf_map, from Xie. 2) Limit the maximum number of attachable BPF progs for a given perf event as long as uabi is not frozen yet. The hard upper limit is now 64 and therefore the same as with BPF multi-prog for cgroups. Also add related error checking for the sample BPF loader when enabling and attaching to the perf event, from Yonghong. 3) Specifically set the RLIMIT_MEMLOCK for the test_verifier_log case, so that the test case can always pass and not fail in some environments due to too low default limit, also from Yonghong. 4) Fix up a missing license header comment for kernel/bpf/offload.c, from Jakub. 5) Several fixes for bpftool, among others a crash on incorrect arguments when json output is used, error message handling fixes on unknown options and proper destruction of json writer for some exit cases, all from Quentin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-30trace/xdp: fix compile warning: 'struct bpf_map' declared inside parameter listXie XiuQi
We meet this compile warning, which caused by missing bpf.h in xdp.h. In file included from ./include/trace/events/xdp.h:10:0, from ./include/linux/bpf_trace.h:6, from drivers/net/ethernet/intel/i40e/i40e_txrx.c:29: ./include/trace/events/xdp.h:93:17: warning: ‘struct bpf_map’ declared inside parameter list will not be visible outside of this definition or declaration const struct bpf_map *map, u32 map_index), ^ ./include/linux/tracepoint.h:187:34: note: in definition of macro ‘__DECLARE_TRACE’ static inline void trace_##name(proto) \ ^~~~~ ./include/linux/tracepoint.h:352:24: note: in expansion of macro ‘PARAMS’ __DECLARE_TRACE(name, PARAMS(proto), PARAMS(args), \ ^~~~~~ ./include/linux/tracepoint.h:477:2: note: in expansion of macro ‘DECLARE_TRACE’ DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) ^~~~~~~~~~~~~ ./include/linux/tracepoint.h:477:22: note: in expansion of macro ‘PARAMS’ DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) ^~~~~~ ./include/trace/events/xdp.h:89:1: note: in expansion of macro ‘DEFINE_EVENT’ DEFINE_EVENT(xdp_redirect_template, xdp_redirect, ^~~~~~~~~~~~ ./include/trace/events/xdp.h:90:2: note: in expansion of macro ‘TP_PROTO’ TP_PROTO(const struct net_device *dev, ^~~~~~~~ ./include/trace/events/xdp.h:93:17: warning: ‘struct bpf_map’ declared inside parameter list will not be visible outside of this definition or declaration const struct bpf_map *map, u32 map_index), ^ ./include/linux/tracepoint.h:203:38: note: in definition of macro ‘__DECLARE_TRACE’ register_trace_##name(void (*probe)(data_proto), void *data) \ ^~~~~~~~~~ ./include/linux/tracepoint.h:354:4: note: in expansion of macro ‘PARAMS’ PARAMS(void *__data, proto), \ ^~~~~~ Reported-by: Huang Daode <huangdaode@hisilicon.com> Cc: Hanjun Guo <guohanjun@huawei.com> Fixes: 8d3b778ff544 ("xdp: tracepoint xdp_redirect also need a map argument") Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) The forcedeth conversion from pci_*() DMA interfaces to dma_*() ones missed one spot. From Zhu Yanjun. 2) Missing CRYPTO_SHA256 Kconfig dep in cfg80211, from Johannes Berg. 3) Fix checksum offloading in thunderx driver, from Sunil Goutham. 4) Add SPDX to vm_sockets_diag.h, from Stephen Hemminger. 5) Fix use after free of packet headers in TIPC, from Jon Maloy. 6) "sizeof(ptr)" vs "sizeof(*ptr)" bug in i40e, from Gustavo A R Silva. 7) Tunneling fixes in mlxsw driver, from Petr Machata. 8) Fix crash in fanout_demux_rollover() of AF_PACKET, from Mike Maloney. 9) Fix race in AF_PACKET bind() vs. NETDEV_UP notifier, from Eric Dumazet. 10) Fix regression in sch_sfq.c due to one of the timer_setup() conversions. From Paolo Abeni. 11) SCTP does list_for_each_entry() using wrong struct member, fix from Xin Long. 12) Don't use big endian netlink attribute read for IFLA_BOND_AD_ACTOR_SYSTEM, it is in cpu endianness. Also from Xin Long. 13) Fix mis-initialization of q->link.clock in CBQ scheduler, preventing adding filters there. From Jiri Pirko. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (67 commits) ethernet: dwmac-stm32: Fix copyright net: via: via-rhine: use %p to format void * address instead of %x net: ethernet: xilinx: Mark XILINX_LL_TEMAC broken on 64-bit myri10ge: Update MAINTAINERS net: sched: cbq: create block for q->link.block atm: suni: remove extraneous space to fix indentation atm: lanai: use %p to format kernel addresses instead of %x VSOCK: Don't set sk_state to TCP_CLOSE before testing it atm: fore200e: use %pK to format kernel addresses instead of %x ambassador: fix incorrect indentation of assignment statement vxlan: use __be32 type for the param vni in __vxlan_fdb_delete bonding: use nla_get_u64 to extract the value for IFLA_BOND_AD_ACTOR_SYSTEM sctp: use right member as the param of list_for_each_entry sch_sfq: fix null pointer dereference at timer expiration cls_bpf: don't decrement net's refcount when offload fails net/packet: fix a race in packet_bind() and packet_notifier() packet: fix crash in fanout_demux_rollover() sctp: remove extern from stream sched sctp: force the params with right types for sctp csum apis sctp: force SCTP_ERROR_INV_STRM with __u32 when calling sctp_chunk_fail ...
2017-11-28tracing, rcu: Remove no longer used trace event rcu_prep_idleSteven Rostedt (VMware)
Commit c0f4dfd4f90 ("rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks") removed the only instances of trace_rcu_prep_idle, but did not remove the TRACE_EVENT() that creates it. As defined trace events take up memory within the kernel even when they are not used, this is a waste of space. Remove the obsolete event. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-11-28rcu: Simplify rcu_eqs_{enter,exit}() non-idle task debug codePaul E. McKenney
The code that checks for non-idle non-nohz_idle-usermode tasks invoking rcu_eqs_enter() and rcu_eqs_exit() prints a considerable quantity of helpful information. However, these checks fire rarely, so the extra complexity is no longer worth it. This commit therefore replaces this debug code with simple WARN_ON_ONCE() statements. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-11-28rcu: Add ->dynticks field to rcu_dyntick trace eventPaul E. McKenney
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-11-28rcu: Shrink ->dynticks_{nmi_,}nesting from long long to longPaul E. McKenney
Because the ->dynticks_nesting field now only contains the process-based nesting level instead of a value encoding both the process nesting level and the irq "nesting" level, we no longer need a long long, even on 32-bit systems. This commit therefore changes both the ->dynticks_nesting and ->dynticks_nmi_nesting fields to long. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-11-28rcu: Add tracing to irq/NMI dyntick-idle transitionsPaul E. McKenney
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-11-26Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Misc fixes: a documentation fix, a Sparse warning fix and a debugging fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/debug: Fix task state recording/printout sched/deadline: Don't use dubious signed bitfields sched/deadline: Fix the description of runtime accounting in the documentation
2017-11-24rxrpc: Fix service endpoint expiryDavid Howells
RxRPC service endpoints expire like they're supposed to by the following means: (1) Mark dead rxrpc_net structs (with ->live) rather than twiddling the global service conn timeout, otherwise the first rxrpc_net struct to die will cause connections on all others to expire immediately from then on. (2) Mark local service endpoints for which the socket has been closed (->service_closed) so that the expiration timeout can be much shortened for service and client connections going through that endpoint. (3) rxrpc_put_service_conn() needs to schedule the reaper when the usage count reaches 1, not 0, as idle conns have a 1 count. (4) The accumulator for the earliest time we might want to schedule for should be initialised to jiffies + MAX_JIFFY_OFFSET, not ULONG_MAX as the comparison functions use signed arithmetic. (5) Simplify the expiration handling, adding the expiration value to the idle timestamp each time rather than keeping track of the time in the past before which the idle timestamp must go to be expired. This is much easier to read. (6) Ignore the timeouts if the net namespace is dead. (7) Restart the service reaper work item rather the client reaper. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-24rxrpc: Add keepalive for a callDavid Howells
We need to transmit a packet every so often to act as a keepalive for the peer (which has a timeout from the last time it received a packet) and also to prevent any intervening firewalls from closing the route. Do this by resetting a timer every time we transmit a packet. If the timer ever expires, we transmit a PING ACK packet and thereby also elicit a PING RESPONSE ACK from the other side - which prevents our last-rx timeout from expiring. The timer is set to 1/6 of the last-rx timeout so that we can detect the other side going away if it misses 6 replies in a row. This is particularly necessary for servers where the processing of the service function may take a significant amount of time. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-24rxrpc: Add a timeout for detecting lost ACKs/lost DATADavid Howells
Add an extra timeout that is set/updated when we send a DATA packet that has the request-ack flag set. This allows us to detect if we don't get an ACK in response to the latest flagged packet. The ACK packet is adjudged to have been lost if it doesn't turn up within 2*RTT of the transmission. If the timeout occurs, we schedule the sending of a PING ACK to find out the state of the other side. If a new DATA packet is ready to go sooner, we cancel the sending of the ping and set the request-ack flag on that instead. If we get back a PING-RESPONSE ACK that indicates a lower tx_top than what we had at the time of the ping transmission, we adjudge all the DATA packets sent between the response tx_top and the ping-time tx_top to have been lost and retransmit immediately. Rather than sending a PING ACK, we could just pick a DATA packet and speculatively retransmit that with request-ack set. It should result in either a REQUESTED ACK or a DUPLICATE ACK which we can then use in lieu the a PING-RESPONSE ACK mentioned above. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-24rxrpc: Fix call timeoutsDavid Howells
Fix the rxrpc call expiration timeouts and make them settable from userspace. By analogy with other rx implementations, there should be three timeouts: (1) "Normal timeout" This is set for all calls and is triggered if we haven't received any packets from the peer in a while. It is measured from the last time we received any packet on that call. This is not reset by any connection packets (such as CHALLENGE/RESPONSE packets). If a service operation takes a long time, the server should generate PING ACKs at a duration that's substantially less than the normal timeout so is to keep both sides alive. This is set at 1/6 of normal timeout. (2) "Idle timeout" This is set only for a service call and is triggered if we stop receiving the DATA packets that comprise the request data. It is measured from the last time we received a DATA packet. (3) "Hard timeout" This can be set for a call and specified the maximum lifetime of that call. It should not be specified by default. Some operations (such as volume transfer) take a long time. Allow userspace to set/change the timeouts on a call with sendmsg, using a control message: RXRPC_SET_CALL_TIMEOUTS The data to the message is a number of 32-bit words, not all of which need be given: u32 hard_timeout; /* sec from first packet */ u32 idle_timeout; /* msec from packet Rx */ u32 normal_timeout; /* msec from data Rx */ This can be set in combination with any other sendmsg() that affects a call. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-24sched/debug: Fix task state recording/printoutThomas Gleixner
The recent conversion of the task state recording to use task_state_index() broke the sched_switch tracepoint task state output. task_state_index() returns surprisingly an index (0-7) which is then printed with __print_flags() applying bitmasks. Not really working and resulting in weird states like 'prev_state=t' instead of 'prev_state=I'. Use TASK_REPORT_MAX instead of TASK_STATE_MAX to report preemption. Build a bitmask from the return value of task_state_index() and store it in entry->prev_state, which makes __print_flags() work as expected. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: stable@vger.kernel.org Fixes: efb40f588b43 ("sched/tracing: Fix trace_sched_switch task-state printing") Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1711221304180.1751@nanos Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-18Merge tag 'nfsd-4.15' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "Lots of good bugfixes, including: - fix a number of races in the NFSv4+ state code - fix some shutdown crashes in multiple-network-namespace cases - relax our 4.1 session limits; if you've an artificially low limit to the number of 4.1 clients that can mount simultaneously, try upgrading" * tag 'nfsd-4.15' of git://linux-nfs.org/~bfields/linux: (22 commits) SUNRPC: Improve ordering of transport processing nfsd: deal with revoked delegations appropriately svcrdma: Enqueue after setting XPT_CLOSE in completion handlers nfsd: use nfs->ns.inum as net ID rpc: remove some BUG()s svcrdma: Preserve CB send buffer across retransmits nfds: avoid gettimeofday for nfssvc_boot time fs, nfsd: convert nfs4_file.fi_ref from atomic_t to refcount_t fs, nfsd: convert nfs4_cntl_odstate.co_odcount from atomic_t to refcount_t fs, nfsd: convert nfs4_stid.sc_count from atomic_t to refcount_t lockd: double unregister of inetaddr notifiers nfsd4: catch some false session retries nfsd4: fix cached replies to solo SEQUENCE compounds sunrcp: make function _svc_create_xprt static SUNRPC: Fix tracepoint storage issues with svc_recv and svc_rqst_status nfsd: use ARRAY_SIZE nfsd: give out fewer session slots as limit approaches nfsd: increase DRC cache limit nfsd: remove unnecessary nofilehandle checks nfs_common: convert int to bool ...
2017-11-17Merge tag 'trace-v4.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from - allow module init functions to be traced - clean up some unused or not used by config events (saves space) - clean up of trace histogram code - add support for preempt and interrupt enabled/disable events - other various clean ups * tag 'trace-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (30 commits) tracing, thermal: Hide cpu cooling trace events when not in use tracing, thermal: Hide devfreq trace events when not in use ftrace: Kill FTRACE_OPS_FL_PER_CPU perf/ftrace: Small cleanup perf/ftrace: Fix function trace events perf/ftrace: Revert ("perf/ftrace: Fix double traces of perf on ftrace:function") tracing, dma-buf: Remove unused trace event dma_fence_annotate_wait_on tracing, memcg, vmscan: Hide trace events when not in use tracing/xen: Hide events that are not used when X86_PAE is not defined tracing: mark trace_test_buffer as __maybe_unused printk: Remove superfluous memory barriers from printk_safe ftrace: Clear hashes of stale ips of init memory tracing: Add support for preempt and irq enable/disable events tracing: Prepare to add preempt and irq trace events ftrace/kallsyms: Have /proc/kallsyms show saved mod init functions ftrace: Add freeing algorithm to free ftrace_mod_maps ftrace: Save module init functions kallsyms symbols for tracing ftrace: Allow module init functions to be traced ftrace: Add a ftrace_free_mem() function for modules to use tracing: Reimplement log2 ...