Age | Commit message (Collapse) | Author |
|
BPF C program attaches to
blk_mq_start_request()/blk_update_request() kprobe events to
calculate IO latency.
For every completed block IO event it computes the time delta
in nsec and records in a histogram map:
map[log10(delta)*10]++
User space reads this histogram map every 2 seconds and prints
it as a 'heatmap' using gray shades of text terminal. Black
spaces have many events and white spaces have very few events.
Left most space is the smallest latency, right most space is
the largest latency in the range.
Usage:
$ sudo ./tracex3
and do 'sudo dd if=/dev/sda of=/dev/null' in other terminal.
Observe IO latencies and how different activity (like 'make
kernel') affects it.
Similar experiments can be done for network transmit latencies,
syscalls, etc.
'-t' flag prints the heatmap using normal ascii characters:
$ sudo ./tracex3 -t
heatmap of IO latency
# - many events with this latency
- few events
|1us |10us |100us |1ms |10ms |100ms |1s |10s
*ooo. *O.#. # 221
. *# . # 125
.. .o#*.. # 55
. . . . .#O # 37
.# # 175
.#*. # 37
# # 199
. . *#*. # 55
*#..* # 42
# # 266
...***Oo#*OO**o#* . # 629
# # 271
. .#o* o.*o* # 221
. . o* *#O.. # 50
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1427312966-8434-9-git-send-email-ast@plumgrid.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
write() syscall
this example has two probes in one C file that attach to
different kprove events and use two different maps.
1st probe is x64 specific equivalent of dropmon. It attaches to
kfree_skb, retrevies 'ip' address of kfree_skb() caller and
counts number of packet drops at that 'ip' address. User space
prints 'location - count' map every second.
2nd probe attaches to kprobe:sys_write and computes a histogram
of different write sizes
Usage:
$ sudo tracex2
location 0xffffffff81695995 count 1
location 0xffffffff816d0da9 count 2
location 0xffffffff81695995 count 2
location 0xffffffff816d0da9 count 2
location 0xffffffff81695995 count 3
location 0xffffffff816d0da9 count 2
557145+0 records in
557145+0 records out
285258240 bytes (285 MB) copied, 1.02379 s, 279 MB/s
syscall write() stats
byte_size : count distribution
1 -> 1 : 3 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 2 | |
32 -> 63 : 3 | |
64 -> 127 : 1 | |
128 -> 255 : 1 | |
256 -> 511 : 0 | |
512 -> 1023 : 1118968 |************************************* |
Ctrl-C at any time. Kernel will auto cleanup maps and programs
$ addr2line -ape ./bld_x64/vmlinux 0xffffffff81695995
0xffffffff816d0da9 0xffffffff81695995:
./bld_x64/../net/ipv4/icmp.c:1038 0xffffffff816d0da9:
./bld_x64/../net/unix/af_unix.c:1231
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1427312966-8434-8-git-send-email-ast@plumgrid.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
tracex1_kern.c - C program compiled into BPF.
It attaches to kprobe:netif_receive_skb()
When skb->dev->name == "lo", it prints sample debug message into
trace_pipe via bpf_trace_printk() helper function.
tracex1_user.c - corresponding user space component that:
- loads BPF program via bpf() syscall
- opens kprobes:netif_receive_skb event via perf_event_open()
syscall
- attaches the program to event via ioctl(event_fd,
PERF_EVENT_IOC_SET_BPF, prog_fd);
- prints from trace_pipe
Note, this BPF program is non-portable. It must be recompiled
with current kernel headers. kprobe is not a stable ABI and
BPF+kprobe scripts may no longer be meaningful when kernel
internals change.
No matter in what way the kernel changes, neither the kprobe,
nor the BPF program can ever crash or corrupt the kernel,
assuming the kprobes, perf and BPF subsystem has no bugs.
The verifier will detect that the program is using
bpf_trace_printk() and the kernel will print 'this is a DEBUG
kernel' warning banner, which means that bpf_trace_printk()
should be used for debugging of the BPF program only.
Usage:
$ sudo tracex1
ping-19826 [000] d.s2 63103.382648: : skb ffff880466b1ca00 len 84
ping-19826 [000] d.s2 63103.382684: : skb ffff880466b1d300 len 84
ping-19826 [000] d.s2 63104.382533: : skb ffff880466b1ca00 len 84
ping-19826 [000] d.s2 63104.382594: : skb ffff880466b1d300 len 84
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1427312966-8434-7-git-send-email-ast@plumgrid.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Debugging of BPF programs needs some form of printk from the
program, so let programs call limited trace_printk() with %d %u
%x %p modifiers only.
Similar to kernel modules, during program load verifier checks
whether program is calling bpf_trace_printk() and if so, kernel
allocates trace_printk buffers and emits big 'this is debug
only' banner.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1427312966-8434-6-git-send-email-ast@plumgrid.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
bpf_ktime_get_ns() is used by programs to compute time delta
between events or as a timestamp
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1427312966-8434-5-git-send-email-ast@plumgrid.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
BPF programs, attached to kprobes, provide a safe way to execute
user-defined BPF byte-code programs without being able to crash or
hang the kernel in any way. The BPF engine makes sure that such
programs have a finite execution time and that they cannot break
out of their sandbox.
The user interface is to attach to a kprobe via the perf syscall:
struct perf_event_attr attr = {
.type = PERF_TYPE_TRACEPOINT,
.config = event_id,
...
};
event_fd = perf_event_open(&attr,...);
ioctl(event_fd, PERF_EVENT_IOC_SET_BPF, prog_fd);
'prog_fd' is a file descriptor associated with BPF program
previously loaded.
'event_id' is an ID of the kprobe created.
Closing 'event_fd':
close(event_fd);
... automatically detaches BPF program from it.
BPF programs can call in-kernel helper functions to:
- lookup/update/delete elements in maps
- probe_read - wraper of probe_kernel_read() used to access any
kernel data structures
BPF programs receive 'struct pt_regs *' as an input ('struct pt_regs' is
architecture dependent) and return 0 to ignore the event and 1 to store
kprobe event into the ring buffer.
Note, kprobes are a fundamentally _not_ a stable kernel ABI,
so BPF programs attached to kprobes must be recompiled for
every kernel version and user must supply correct LINUX_VERSION_CODE
in attr.kern_version during bpf_prog_load() call.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1427312966-8434-4-git-send-email-ast@plumgrid.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
add TRACE_EVENT_FL_KPROBE flag to differentiate kprobe type of
tracepoints, since bpf programs can only be attached to kprobe
type of PERF_TYPE_TRACEPOINT perf events.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1427312966-8434-3-git-send-email-ast@plumgrid.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Socket filter code and other subsystems with upcoming eBPF
support should not need to deal with the fact that we have
CONFIG_BPF_SYSCALL defined or not.
Having the bpf syscall as a config option is a nice thing and
I'd expect it to stay that way for expert users (I presume one
day the default setting of it might change, though), but code
making use of it should not care if it's actually enabled or
not.
Instead, hide this via header files and let the rest deal with it.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1427312966-8434-2-git-send-email-ast@plumgrid.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This WIP branch is now ready to be merged.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Fix 'perf script' pipe mode segfault, by always initializing ordered_events in
perf_session__new(). (Arnaldo Carvalho de Melo)
- Fix ppid for synthesized fork events. (David Ahern)
- Fix kernel symbol resolution of callchains in S/390 by remembering the
cpumode. (David Hildenbrand)
Infrastructure changes:
- Disable libbabeltrace check by default in the build system. (Jiri Olsa)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
As these can be obtained from the ordered_events pointer, via
container_of, reducing the cross section of ordered_samples.
These were added to ordered_samples in:
commit b7b61cbebd789a3dbca522e3fdb727fe5c95593f
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Tue Mar 3 11:58:45 2015 -0300
perf ordered_events: Shorten function signatures
By keeping pointers to machines, evlist and tool in ordered_events.
But that was more a transitional patch while moving stuff out from
perf_session.c to ordered_events.c and possibly not even needed by then,
as we could use the container_of() method and instead of having the
nr_unordered_samples stats in events_stats, we can have it in
ordered_samples.
Based-on-a-patch-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4lk0t9js82g0tfc0x1onpkjt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Even when it is not used to actually reorder events, some of its fields
are used, like session->ordered_events->tool, to shorten function
signatures where tool, for instance, was being passed, as the tool is
needed for the ordered_events code, we need it there and might as well
use it for other perf_session needs.
This fixes a problem where 'perf script' had some condition that made
session->ordered_events not to be initialized even with its
script->tool ordered_events related flags asking for it to be, which
looks like another bug and needs to be investigated further.
Always initializing session->ordered_events at least leaves the current
assumptions in place, so do it now.
Reported-by: David Ahern <dsahern@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-b1xxk0rwkz2a0gip1uufmjqg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
363b785f38 added synthesized fork events and set a thread's parent id to
itself. Since we are already processing /proc/<pid>/status the ppid can
be determined properly. Make it so.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Link: http://lkml.kernel.org/r/1427747758-18510-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rather than parsing /proc/pid/status file one line at a time, read it
into a buffer in one shot and search for all strings in one pass.
tgid conversion also simplified -- removing the isspace walk. As noted
by Arnaldo those are not needed for atoi == strtol calls.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Link: http://lkml.kernel.org/r/1427747758-18510-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Commit 2e77784bb7d8 ("perf callchain: Move cpumode resolve code to
add_callchain_ip") promised "No change in behavior.".
As this commit breaks callchains on s390x (symbols not getting resolved,
observed when profiling the kernel), this statement is wrong. The cpumode
must be kept when iterating over all ips, otherwise the default
(PERF_RECORD_MISC_USER) will be used by error.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1427703060-59883-1-git-send-email-dahi@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Disabling libbabeltrace check by default and replacing the
NO_LIBBABELTRACE make variable with LIBBABELTRACE.
Users wanting the libbabeltrace feature need to build via:
$ make LIBBABELTRACE=1
The reason for this is that the libababeltrace interface we use (version
1.3) hasn't been packaged/released yet, thus the failing feature check
only slows down build and confuses other (non CTF) developers.
Requested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20150328103030.GA8431@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"The latest and greatest fixes for ARM platform code. Worth pointing
out are:
- Lines-wise, largest is a PXA fix for dealing with interrupts on DT
that was quite broken. It's still newish code so while we could
have held this off, it seemed appropriate to include now
- Some GPIO fixes for OMAP platforms added a few lines. This was
also fixes for code recently added (this release).
- Small OMAP timer fix to behave better with partially upstreamed
platforms, which is quite welcome.
- Allwinner fixes about operating point control, reducing
overclocking in some cases for better stability.
plus a handful of other smaller fixes across the map"
* tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: juno: Fix misleading name of UART reference clock
ARM: dts: sunxi: Remove overclocked/overvoltaged OPP
ARM: dts: sun4i: a10-lime: Override and remove 1008MHz OPP setting
ARM: socfpga: dts: fix spi1 interrupt
ARM: dts: Fix gpio interrupts for dm816x
ARM: dts: dra7: remove ti,hwmod property from pcie phy
ARM: OMAP: dmtimer: disable pm runtime on remove
ARM: OMAP: dmtimer: check for pm_runtime_get_sync() failure
ARM: OMAP2+: Fix socbus family info for AM33xx devices
ARM: dts: omap3: Add missing dmas for crypto
ARM: dts: rockchip: disable gmac by default in rk3288.dtsi
MAINTAINERS: add rockchip regexp to the ARM/Rockchip entry
ARM: pxa: fix pxa interrupts handling in DT
ARM: pxa: Fix typo in zeus.c
ARM: sunxi: Have ARCH_SUNXI select RESET_CONTROLLER for clock driver usage
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into fixes
Allwinner fixes for 4.0
There's a few fixes to merge for 4.0, one to add a select in the machine
Kconfig option to fix a potential build failure, and two fixing cpufreq related
issues.
* tag 'sunxi-fixes-for-4.0' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
ARM: dts: sunxi: Remove overclocked/overvoltaged OPP
ARM: dts: sun4i: a10-lime: Override and remove 1008MHz OPP setting
ARM: sunxi: Have ARCH_SUNXI select RESET_CONTROLLER for clock driver usage
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Fixes for omaps for the -rc cycle:
- Fix a device tree based booting vs legacy booting regression for
omap3 crypto hardware by adding the missing DMA channels.
- Fix /sys/bus/soc/devices/soc0/family for am33xx devices.
- Fix two timer issues that can cause hangs if the timer related
hwmod data is missing like it often initially is for new SoCs.
- Remove pcie hwmods entry from dts as that causes runtime PM to
fail for the PHYs.
- A paper bag type dts configuration fix for dm816x GPIO
interrupts that I just noticed. This is most of the changes
diffstat wise, but as it's a basic feature for connecting
devices and things work otherwise, it should be fixed.
* tag 'fixes-v4.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: Fix gpio interrupts for dm816x
ARM: dts: dra7: remove ti,hwmod property from pcie phy
ARM: OMAP: dmtimer: disable pm runtime on remove
ARM: OMAP: dmtimer: check for pm_runtime_get_sync() failure
ARM: OMAP2+: Fix socbus family info for AM33xx devices
ARM: dts: omap3: Add missing dmas for crypto
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
git://git.rocketboards.org/linux-socfpga-next into fixes
Late fix for v4.0 on the SoCFPGA platform:
- Fix interrupt number for SPI1 interface
* tag 'socfpga_fix_for_v4.0_2' of git://git.rocketboards.org/linux-socfpga-next:
ARM: socfpga: dts: fix spi1 interrupt
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
The UART reference clock speed is 7273.8 kHz, not 72738 kHz.
Dots aren't usually used in node names even though ePAPR permits
them. However, this can easily be avoided by expressing the
frequency in Hz, not kHz.
This patch changes the name to refclk7273800hz, reflecting the
actual clock speed.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
arm: pxa: fixes for v4.0-rc5
There are only 2 fixes, one for the zeus board about the regulator changes,
where a typo prevented the zeus board from having a working can regulator,
and one regression triggered by the interrupts IRQ shift of 16 affecting all
boards.
* tag 'fixes-for-v4.0-rc5' of https://github.com/rjarzmik/linux:
ARM: pxa: fix pxa interrupts handling in DT
ARM: pxa: Fix typo in zeus.c
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
"Fix x86 syscall exit code bug that resulted in spurious non-execution
of TIF-driven user-return worklets, causing big trouble for things
like KVM that rely on user notifiers for correctness of their vcpu
model, causing crashes like double faults"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm/entry: Check for syscall exit work with IRQs disabled
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
"Two clocksource driver fixes, and an idle loop RCU warning fix"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/sun5i: Fix cpufreq interaction with sched_clock()
clocksource/drivers: Fix various !CONFIG_HAS_IOMEM build errors
timers/tick/broadcast-hrtimer: Fix suspicious RCU usage in idle loop
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
"A single sched/rt corner case fix for RLIMIT_RTIME correctness"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix RLIMIT_RTTIME when PI-boosting to RT
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Ingo Molnar:
"A perf kernel side fix for a fuzzer triggered lockup"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Fix irq_work 'tail' recursion
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fix from Ingo Molnar:
"A module unload lockdep race fix"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
lockdep: Fix the module unload key range freeing logic
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parsic fixes from Helge Deller:
"One patch from Mikulas fixes a bug on parisc by artifically
incrementing the counter in pmd_free when the kernel tries to free
the preallocated pmd.
Other than that we now prevent that syscalls gets added without
incrementing __NR_Linux_syscalls and fix the initial pmd setup code
if a default page size greater than 4k has been selected"
* 'parisc-4.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix pmd code to depend on PT_NLEVELS value, not on CONFIG_64BIT
parisc: mm: don't count preallocated pmds
parisc: Add compile-time check when adding new syscalls
|
|
Pull kvm ppc bugfixes from Marcelo Tosatti.
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: PPC: Book3S HV: Fix instruction emulation
KVM: PPC: Book3S HV: Endian fix for accessing VPA yield count
KVM: PPC: Book3S HV: Fix spinlock/mutex ordering issue in kvmppc_set_lpcr()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"We found some issues with signal handling taking down the system. I
know its late, but these are important and all marked for stable.
ARC signal handling related fixes uncovered during recent testing of
NPTL tools"
* tag 'arc-4.0-fixes-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: signal handling robustify
ARC: SA_SIGINFO ucontext regs off-by-one
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull selinux bugfix from James Morris.
Fix broken return value.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
selinux: fix sel_write_enforce broken return value
|
|
Pull watchdog fixes from Wim Van Sebroeck:
- mtk_wdt: signedness bug in mtk_wdt_start()
- imgpdc: Fix NULL pointer dereference during probe and fix the default
heartbeat
* git://www.linux-watchdog.org/linux-watchdog:
watchdog: imgpdc: Fix default heartbeat
watchdog: imgpdc: Fix probe NULL pointer dereference
watchdog: mtk_wdt: signedness bug in mtk_wdt_start()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Three trivial oneliner fixes for HD-audio.
Two are device-specific quirks while one is a generic fix for recent
Realtek codecs"
* tag 'sound-4.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Add one more node in the EAPD supporting candidate list
ALSA: hda_intel: apply the Seperate stream_tag for Sunrise Point
ALSA: hda - Add dock support for Thinkpad T450s (17aa:5036)
|
|
into for-linus
|
|
While thinking on the whole clock discussion it occurred to me we have
two distinct uses of time:
1) the tracking of event/ctx/cgroup enabled/running/stopped times
which includes the self-monitoring support in struct
perf_event_mmap_page.
2) the actual timestamps visible in the data records.
And we've been conflating them.
The first is all about tracking time deltas, nobody should really care
in what time base that happens, its all relative information, as long
as its internally consistent it works.
The second however is what people are worried about when having to
merge their data with external sources. And here we have the
discussion on MONOTONIC vs MONOTONIC_RAW etc..
Where MONOTONIC is good for correlating between machines (static
offset), MONOTNIC_RAW is required for correlating against a fixed rate
hardware clock.
This means configurability; now 1) makes that hard because it needs to
be internally consistent across groups of unrelated events; which is
why we had to have a global perf_clock().
However, for 2) it doesn't really matter, perf itself doesn't care
what it writes into the buffer.
The below patch makes the distinction between these two cases by
adding perf_event_clock() which is used for the second case. It
further makes this configurable on a per-event basis, but adds a few
sanity checks such that we cannot combine events with different clocks
in confusing ways.
And since we then have per-event configurability we might as well
retain the 'legacy' behaviour as a default.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
An upcoming patch will depend on tai_ns() and NMI-safe ktime_get_raw_fast(),
so merge timers/core here in a separate topic branch until it's all cooked
and timers/core is merged upstream.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
While looking at some fuzzer output I noticed that we do not hold any
locks on leader->ctx and therefore the sibling_list iteration is
unsafe.
Acquire the relevant ctx->mutex before calling into the pmu specific
code.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Link: http://lkml.kernel.org/r/20150225151639.GL5029@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
perf_pmu_disable() is called before pmu->add() and perf_pmu_enable() is called
afterwards. No need to call these inside of x86_pmu_add() as well.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1424281543-67335-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
the tree
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Because it was the only clock for which we didn't have a _ns()
accessor yet.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Add the NMI safe CLOCK_MONOTONIC_RAW accessor..
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150319093400.562746929@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
In preparation for more tk_fast instances, remove all hard-coded
tk_fast_mono references.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150319093400.484279927@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Introduce tkr_raw and make use of it.
base_raw -> tkr_raw.base
clock->{mult,shift} -> tkr_raw.{mult.shift}
Kill timekeeping_get_ns_raw() in favour of
timekeeping_get_ns(&tkr_raw), this removes all mono_raw special
casing.
Duplicate the updates to tkr_mono.cycle_last into tkr_raw.cycle_last,
both need the same value.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150319093400.422589590@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
In preparation of adding another tkr field, rename this one to
tkr_mono. Also rename tk_read_base::base_mono to tk_read_base::base,
since the structure is not specific to CLOCK_MONOTONIC and the mono
name got added to the tk_read_base instance.
Lots of trivial churn.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150319093400.344679419@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
On Broadwell INST_RETIRED.ALL cannot be used with any period
that doesn't have the lowest 6 bits cleared. And the period
should not be smaller than 128.
This is erratum BDM11 and BDM55:
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/5th-gen-core-family-spec-update.pdf
BDM11: When using a period < 100; we may get incorrect PEBS/PMI
interrupts and/or an invalid counter state.
BDM55: When bit0-5 of the period are !0 we may get redundant PEBS
records on overflow.
Add a new callback to enforce this, and set it for Broadwell.
How does this handle the case when an app requests a specific
period with some of the bottom bits set?
Short answer:
Any useful instruction sampling period needs to be 4-6 orders
of magnitude larger than 128, as an PMI every 128 instructions
would instantly overwhelm the system and be throttled.
So the +-64 error from this is really small compared to the
period, much smaller than normal system jitter.
Long answer (by Peterz):
IFF we guarantee perf_event_attr::sample_period >= 128.
Suppose we start out with sample_period=192; then we'll set period_left
to 192, we'll end up with left = 128 (we truncate the lower bits). We
get an interrupt, find that period_left = 64 (>0 so we return 0 and
don't get an overflow handler), up that to 128. Then we trigger again,
at n=256. Then we find period_left = -64 (<=0 so we return 1 and do get
an overflow). We increment with sample_period so we get left = 128. We
fire again, at n=384, period_left = 0 (<=0 so we return 1 and get an
overflow). And on and on.
So while the individual interrupts are 'wrong' we get then with
interval=256,128 in exactly the right ratio to average out at 192. And
this works for everything >=128.
So the num_samples*fixed_period thing is still entirely correct +- 127,
which is good enough I'd say, as you already have that error anyhow.
So no need to 'fix' the tools, al we need to do is refuse to create
INST_RETIRED:ALL events with sample_period < 128.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
[ Updated comments and changelog a bit. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1424225886-18652-3-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Add Broadwell support for Broadwell to perf.
The basic support is very similar to Haswell. We use the new cache
event list added for Haswell earlier. The only differences
are a few bits related to remote nodes. To avoid an extra,
mostly identical, table these are patched up in the initialization code.
The constraint list has one new event that needs to be handled over Haswell.
Includes code and testing from Kan Liang.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1424225886-18652-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Haswell offcore events are quite different from Sandy Bridge.
Add a new table to handle Haswell properly.
Note that the offcore bits listed in the SDM are not quite correct
(this is currently being fixed). An uptodate list of bits is
in the patch.
The basic setup is similar to Sandy Bridge. The prefetch columns
have been removed, as prefetch counting is not very reliable
on Haswell. One L1 event that is not in the event list anymore
has been also removed.
- data reads do not include code reads (comparable to earlier Sandy Bridge tables)
- data counts include speculative execution (except L1 write, dtlb, bpu)
- remote node access includes both remote memory, remote cache, remote mmio.
- prefetches are not included in the counts for consistency
(different from Sandy Bridge, which includes prefetches in the remote node)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
[ Removed the HSM30 comments; we don't have them for SNB/IVB either. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1424225886-18652-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|