|
Reflecting the fact that it now augments more than syscalls:sys_enter_SYSCALL
tracepoints that have filename strings as args. Also mention how the
extra data is handled by the now-modified 'perf trace' beautifiers,
which will use special "augmented" beautifiers when extra data is found
after the expected syscall enter/exit tracepoints.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ybskanehmdilj5fs7080nz1g@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So that we can hook to the syscalls:sys_exit_SYSCALL tracepoints in
addition to the syscalls:sys_enter_SYSCALL we hook using the
syscall_enter() helper.
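A minimal userspace sketch of the naming scheme such a pair of helpers relies on. The macro names below are hypothetical stand-ins, not the helpers shipped with perf; they only show how the syscall name can be pasted into the enter and exit tracepoint names at compile time:

```c
#include <stdio.h>

/*
 * Hypothetical stand-ins, not the real perf BPF helpers: build the
 * tracepoint name for a syscall using preprocessor stringification.
 */
#define SYSCALL_ENTER_TP(name) "syscalls:sys_enter_" #name
#define SYSCALL_EXIT_TP(name)  "syscalls:sys_exit_" #name

/* Write "<enter tracepoint> + <exit tracepoint>" into bf, for display only. */
#define DESCRIBE_SYSCALL_HOOKS(bf, size, name) \
	snprintf(bf, size, "%s + %s", SYSCALL_ENTER_TP(name), SYSCALL_EXIT_TP(name))
```

With this, DESCRIBE_SYSCALL_HOOKS(bf, sizeof(bf), openat) names both hook points for 'openat'.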
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-6qh8aph1jklyvdu7w89c0izc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
header file
In order to make libtraceevent into a proper library, all its APIs
should be defined in corresponding header files. This patch splits the
trace-seq related APIs into a separate header file: trace-seq.h
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20180828185038.2dcb2743@gandalf.local.home
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Create auxiliary trace data log files when invoked with option
--itrace=d as in:
[root@s35lp76 perf]# perf report -i perf.data.aux1 --stdio --itrace=d
perf report creates several data files in the current directory named
aux.smp.## where ## is a 2 digit hex number with leading zeros
representing the CPU number this trace data was recorded from. The file
content is binary and contains the CPU-Measurement Sampling Data Blocks
(SDBs).
The directory to save the auxiliary trace buffer can be changed using
the perf config file and command. Specify section 'auxtrace' keyword
'dumpdir' and assign it a valid directory name. If the directory does
not exist or has the wrong file type, the current directory is used.
[root@p23lp27 perf]# perf config auxtrace.dumpdir=/tmp
[root@p23lp27 perf]# perf config --user -l
auxtrace.dumpdir=/tmp
[root@p23lp27 perf]# perf report ...
[root@p23lp27 perf]# ll /tmp/aux.smp.00
-rw-r--r-- 1 root root 204800 Aug 2 13:48 /tmp/aux.smp.00
[root@p23lp27 perf]#
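The naming and fallback rules described above can be sketched in plain C. This is only an illustration under stated assumptions: the function names are hypothetical and the real code path in perf may differ:

```c
#include <stdio.h>
#include <sys/stat.h>

/* Return 1 when path names an existing directory, 0 otherwise. */
static int is_directory(const char *path)
{
	struct stat st;

	return path != NULL && stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/*
 * Hypothetical helper: build "<dir>/aux.smp.<cpu>" with the CPU number
 * as two hex digits with leading zeros. When dumpdir is missing or is
 * not a directory, fall back to the current directory.
 */
static int aux_dump_path(char *bf, size_t size, const char *dumpdir, unsigned int cpu)
{
	const char *dir = is_directory(dumpdir) ? dumpdir : ".";

	return snprintf(bf, size, "%s/aux.smp.%02x", dir, cpu);
}
```

With dumpdir set to /tmp and CPU 0 this yields /tmp/aux.smp.00, the file listed in the session above.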
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180809045650.89197-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Use an array to multiplex by sockaddr->sa_family; this way adding new
families gets a bit easier and tidier.
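A minimal sketch of that kind of table, in userspace C. The type and function names are made up for illustration and are not the ones used in 'perf trace'; the output format is modelled on the sockaddr lines shown elsewhere in this log:

```c
#include <stdio.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Prints the family specific part of a sockaddr, returns bytes printed. */
typedef int (*sockaddr_printer_t)(const struct sockaddr *sa, char *bf, size_t size);

static int print_local(const struct sockaddr *sa, char *bf, size_t size)
{
	const struct sockaddr_un *un = (const struct sockaddr_un *)sa;

	/* sun_path is not guaranteed to be NUL terminated, so bound the print. */
	return snprintf(bf, size, ", path: %.*s", (int)sizeof(un->sun_path), un->sun_path);
}

static int print_inet(const struct sockaddr *sa, char *bf, size_t size)
{
	const struct sockaddr_in *in = (const struct sockaddr_in *)sa;
	char addr[INET_ADDRSTRLEN] = "?";

	inet_ntop(AF_INET, &in->sin_addr, addr, sizeof(addr));
	return snprintf(bf, size, ", port: %d, addr: %s", ntohs(in->sin_port), addr);
}

static int print_inet6(const struct sockaddr *sa, char *bf, size_t size)
{
	const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)sa;
	char addr[INET6_ADDRSTRLEN] = "?";

	inet_ntop(AF_INET6, &in6->sin6_addr, addr, sizeof(addr));
	return snprintf(bf, size, ", port: %d, addr: %s", ntohs(in6->sin6_port), addr);
}

/* Hypothetical table: one slot per address family, indexed by sa_family. */
static const struct family_printer {
	const char	   *name;
	sockaddr_printer_t print;
} families[] = {
	[AF_LOCAL] = { "LOCAL", print_local },
	[AF_INET]  = { "INET",  print_inet  },
	[AF_INET6] = { "INET6", print_inet6 },
};

/* Format a sockaddr; families without a table entry print numerically. */
static int format_sockaddr(const struct sockaddr *sa, char *bf, size_t size)
{
	const size_t nr = sizeof(families) / sizeof(families[0]);
	unsigned int family = sa->sa_family;
	int printed;

	if (family >= nr || families[family].name == NULL)
		return snprintf(bf, size, "{ .family: %u }", family);

	printed = snprintf(bf, size, "{ .family: %s", families[family].name);
	if (families[family].print != NULL && (size_t)printed < size)
		printed += families[family].print(sa, bf + printed, size - printed);
	if ((size_t)printed < size)
		printed += snprintf(bf + printed, size - printed, " }");
	return printed;
}
```

Supporting another family then means writing one printer and adding one line to the table.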
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-v3s85ra659tc40g1s1xaqoun@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
It's a 'struct sockaddr' pointer, augment it with the same beautifier as
for 'connect' and 'bind', which all receive that pointer from userspace.
Doing it in the other direction remains to be done, hooking at the
syscalls:sys_exit_{accept4?,recvmsg} tracepoints somehow.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-k2eu68lsphnm2fthc32gq76c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
One more, to reuse the augmented_sockaddr_syscall_enter() macro
introduced from the augmentation of connect's sockaddr arg, also to get
a subset of the struct arg augmentations done using the manual method,
before switching to something automatic, using tracefs's format file or,
even better, BTF containing the syscall args structs.
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c
0.000 sshd/11479 bind(fd: 3<socket:[170336]>, umyaddr: { .family: NETLINK }, addrlen: 12)
1.752 sshd/11479 bind(fd: 3<socket:[170336]>, umyaddr: { .family: INET, port: 22, addr: 0.0.0.0 }, addrlen: 16)
1.924 sshd/11479 bind(fd: 4<socket:[170338]>, umyaddr: { .family: INET6, port: 22, addr: :: }, addrlen: 28)
^C#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-a2drqpahpmc7uwb3n3gj2plu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
From the one for 'connect', so that we can use it with sendto and others
that receive a 'struct sockaddr'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-8bdqv1q0ndcjl1nqns5r5je2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
As the first example of augmenting something other than a 'filename',
augment the 'struct sockaddr' argument for the 'connect' syscall:
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c ssh -6 fedorapeople.org
0.000 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110)
0.042 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110)
1.329 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110)
1.362 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110)
1.458 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110)
1.478 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110)
1.683 ssh/29669 connect(fd: 3<socket:[125942]>, uservaddr: { .family: INET, port: 53, addr: 192.168.43.1 }, addrlen: 16)
4.710 ssh/29669 connect(fd: 3<socket:[125942]>, uservaddr: { .family: INET6, port: 22, addr: 2610:28:3090:3001:5054:ff:fea7:9474 }, addrlen: 28)
root@fedorapeople.org: Permission denied (publickey).
#
This is still just augmenting the syscalls:sys_enter_connect part, later
we'll wire this up to augment the enter+exit combo, like in the
traditional 'perf trace' and 'strace' outputs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-s7l541cbiqb22ifio6z7dpf6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So that we don't have to define sockaddr_storage in the
augmented_syscalls.c bpf example when hooking into syscalls needing it;
the idea is to mimic the system headers. Eventually we probably need to have
sys/socket.h, etc. Start by having at least linux/socket.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-yhzarcvsjue8pgpvkjhqgioc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
I need to check the need for $KERNEL_INC_OPTIONS when building eBPF
restricted C programs, for now just give precedence to
$PERF_BPF_INC_OPTIONS so that we can get a linux/socket.h usable
in eBPF programs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-5z7qw529sdebrn9y1xxqw9hf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We need to put common stuff into a separate header in tools/perf/include/bpf/
for these augmented syscalls, but I couldn't resist adding an etcsnoop.c tool,
combining augmented syscalls + filtering, that in the future will be passed
from 'perf trace''s command line, to use in building the eBPF program to do
that specific filtering at the source, inside the kernel:
Running system wide: (hope there isn't any embarrassing stuff here... ;-) )
# perf trace -e tools/perf/examples/bpf/etcsnoop.c
0.000 sed/21878 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
1741.473 cat/21883 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
1741.892 cat/21883 openat(dfd: CWD, filename: /etc/passwd)
1748.948 sed/21886 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
1777.136 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1777.738 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1778.158 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1778.528 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1778.595 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1778.901 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1778.939 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1778.966 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1778.992 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.019 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.045 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.071 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.095 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.121 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.148 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.175 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.202 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.229 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.254 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.279 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.309 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.336 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.363 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.388 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.414 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.442 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.470 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.500 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.529 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.557 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.586 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.617 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.648 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.679 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.706 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.739 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.769 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.798 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.823 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.844 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.862 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.880 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.911 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.942 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1779.972 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1780.004 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
1780.035 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
13059.154 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13060.739 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13061.990 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13063.177 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13064.265 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13065.483 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13067.383 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13068.902 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13069.922 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13070.915 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13072.612 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13074.816 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13077.343 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13078.731 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13559.064 DNS Res~er #22/21054 open(filename: /etc/hosts, flags: CLOEXEC)
22419.522 sed/21896 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
24473.313 git/21900 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
24491.988 less/21901 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
24493.793 git/21901 openat(dfd: CWD, filename: /etc/sysless)
24565.772 sed/21924 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
25878.752 git/21928 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
26075.666 git/21928 open(filename: /etc/localtime, flags: CLOEXEC)
26075.565 less/21929 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
26076.060 less/21929 openat(dfd: CWD, filename: /etc/sysless)
26346.395 sed/21932 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
26483.583 sed/21938 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
26954.890 sed/21944 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
27016.165 gsd-color/1762 openat(dfd: CWD, filename: /etc/localtime)
27016.414 gsd-color/1762 openat(dfd: CWD, filename: /etc/localtime)
27712.313 gsd-color/2408 openat(dfd: CWD, filename: /etc/localtime)
27712.616 gsd-color/2408 openat(dfd: CWD, filename: /etc/localtime)
27829.035 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime)
27829.368 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime)
27829.584 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime)
27829.800 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime)
27830.107 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime)
27830.521 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime)
27961.516 git/21948 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
27987.568 less/21949 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
27988.948 bash/21949 openat(dfd: CWD, filename: /etc/sysless)
28043.536 sed/21972 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
28736.008 sed/21978 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
34882.664 git/21991 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
34882.664 sort/21990 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
34884.441 uniq/21992 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
35593.098 git/21997 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
35638.839 git/21997 openat(dfd: CWD, filename: /etc/gitattributes)
35702.851 sed/22000 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
36076.039 sed/22006 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
37569.049 git/22014 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
37673.712 git/22014 open(filename: /etc/localtime, flags: CLOEXEC)
37781.710 vim/22040 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
37783.667 git/22040 openat(dfd: CWD, filename: /etc/vimrc)
37792.394 git/22040 open(filename: /etc/nsswitch.conf, flags: CLOEXEC)
37792.436 git/22040 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
37792.580 git/22040 open(filename: /etc/passwd, flags: CLOEXEC)
43893.625 DNS Res~er #23/21365 open(filename: /etc/hosts, flags: CLOEXEC)
48060.409 nm-dhcp-helper/22044 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48071.745 systemd/1 openat(dfd: CWD, filename: /etc/systemd/system/dbus-org.freedesktop.nm-dispatcher.service, flags: CLOEXEC|NOFOLLOW|NOCTTY)
48082.780 nm-dispatcher/22049 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48111.418 systemd/22049 open(filename: /etc/NetworkManager/dispatcher.d, flags: CLOEXEC|DIRECTORY|NONBLOCK)
48111.904 systemd/22049 open(filename: /etc/localtime, flags: CLOEXEC)
48118.357 00-netreport/22052 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48119.668 systemd/22052 open(filename: /etc/nsswitch.conf, flags: CLOEXEC)
48119.762 systemd/22052 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48119.887 systemd/22052 open(filename: /etc/passwd, flags: CLOEXEC)
48120.025 systemd/22052 openat(dfd: CWD, filename: /etc/NetworkManager/dispatcher.d/00-netreport)
48124.144 hostname/22054 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48125.492 systemd/22052 openat(dfd: CWD, filename: /etc/init.d/functions)
48127.253 systemd/22052 openat(dfd: CWD, filename: /etc/profile.d/lang.sh)
48127.388 systemd/22052 openat(dfd: CWD, filename: /etc/locale.conf)
48137.749 cat/22056 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48143.519 04-iscsi/22058 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48144.438 04-iscsi/22058 open(filename: /etc/nsswitch.conf, flags: CLOEXEC)
48144.478 04-iscsi/22058 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48144.577 04-iscsi/22058 open(filename: /etc/passwd, flags: CLOEXEC)
48144.819 04-iscsi/22058 openat(dfd: CWD, filename: /etc/NetworkManager/dispatcher.d/04-iscsi)
48145.620 10-ifcfg-rh-ro/22059 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48146.169 systemd/22059 open(filename: /etc/nsswitch.conf, flags: CLOEXEC)
48146.207 systemd/22059 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48146.287 systemd/22059 open(filename: /etc/passwd, flags: CLOEXEC)
48146.387 systemd/22059 openat(dfd: CWD, filename: /etc/NetworkManager/dispatcher.d/10-ifcfg-rh-routes.sh)
48147.215 11-dhclient/22060 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48147.787 11-dhclient/22060 open(filename: /etc/nsswitch.conf, flags: CLOEXEC)
48147.813 11-dhclient/22060 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48147.929 11-dhclient/22060 open(filename: /etc/passwd, flags: CLOEXEC)
48148.016 11-dhclient/22060 openat(dfd: CWD, filename: /etc/NetworkManager/dispatcher.d/11-dhclient)
48148.906 grep/22063 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48151.165 11-dhclient/22060 openat(dfd: CWD, filename: /etc/sysconfig/network)
48151.560 11-dhclient/22060 open(filename: /etc/dhcp/dhclient.d/, flags: CLOEXEC|DIRECTORY|NONBLOCK)
48151.704 11-dhclient/22060 openat(dfd: CWD, filename: /etc/dhcp/dhclient.d/chrony.sh)
48153.593 20-chrony/22065 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48154.695 20-chrony/22065 open(filename: /etc/nsswitch.conf, flags: CLOEXEC)
48154.756 20-chrony/22065 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48154.914 20-chrony/22065 open(filename: /etc/passwd, flags: CLOEXEC)
48155.067 20-chrony/22065 openat(dfd: CWD, filename: /etc/NetworkManager/dispatcher.d/20-chrony)
48156.962 25-polipo/22066 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48157.824 systemd/22066 open(filename: /etc/nsswitch.conf, flags: CLOEXEC)
48157.866 systemd/22066 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
48157.981 systemd/22066 open(filename: /etc/passwd, flags: CLOEXEC)
48158.090 systemd/22066 openat(dfd: CWD, filename: /etc/NetworkManager/dispatcher.d/25-polipo)
48533.616 gsd-housekeepi/2412 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC)
87122.021 gsd-color/1762 openat(dfd: CWD, filename: /etc/localtime)
87122.146 gsd-color/1762 openat(dfd: CWD, filename: /etc/localtime)
87825.582 gsd-color/2408 openat(dfd: CWD, filename: /etc/localtime)
87825.844 gsd-color/2408 openat(dfd: CWD, filename: /etc/localtime)
87829.524 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime)
87830.531 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime)
87831.288 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime)
87832.011 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime)
87832.672 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime)
87833.276 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime)
^C#
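The filtering step can be pictured with a small userspace sketch. It assumes the filter is a plain prefix match on "/etc/", which is what the output above suggests; the function name is hypothetical, and a real eBPF program would run the check on the copied filename buffer under the verifier's constraints:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical filter: keep only pathnames that start with "/etc/".
 * In the kernel side program an event failing this test would simply
 * not be sent to the ring buffer.
 */
static bool filename_is_under_etc(const char *filename)
{
	static const char prefix[] = "/etc/";

	return filename != NULL &&
	       strncmp(filename, prefix, sizeof(prefix) - 1) == 0;
}
```

Filtering at the source means events for paths such as /proc/self/stat never reach 'perf trace' at all.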
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-0o770jvdcy04ee6vhv6v471m@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This one will need some more work: that 'statbuf' pointer requires a
beautifier in 'perf trace'.
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c
0.000 weechat/3596 stat(filename: /etc/localtime, statbuf: 0x7ffd87d11f60)
0.186 perf/29818 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_stat/format)
0.279 perf/29818 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_newstat/for)
0.670 perf/29818 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/form)
60.805 DNS Res~er #20/21308 stat(filename: /etc/resolv.conf, statbuf: 0x7ffa733fe4a0)
60.836 DNS Res~er #20/21308 open(filename: /etc/hosts, flags: CLOEXEC)
60.931 perf/29818 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/format)
607.070 DNS Res~er #21/29812 stat(filename: /etc/resolv.conf, statbuf: 0x7ffa5e1fe3f0)
607.098 DNS Res~er #21/29812 open(filename: /etc/hosts, flags: CLOEXEC)
999.336 weechat/3596 stat(filename: /etc/localtime, statbuf: 0x7ffd87d11f60)
^C#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-4lhabe7m4uzo76lnqpyfmnvk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Helping with tons of boilerplate for syscalls that only want to augment
a filename. Now supporting one such syscall is just a matter of
declaring its arguments struct + using:
augmented_filename_syscall_enter(openat);
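To give an idea of the boilerplate such a macro can hide, here is a userspace sketch. Everything in it is hypothetical (struct layouts, macro and function names) and it runs as ordinary C rather than as an eBPF program; it only shows the pattern of generating, per syscall, an augmented struct plus the function that fills it:

```c
#include <stdio.h>

/* Hypothetical args layouts, declared once per supported syscall. */
struct open_args   { const char *filename; long flags; long mode; };
struct openat_args { long dfd; const char *filename; long flags; long mode; };

/*
 * Hypothetical macro: for a syscall whose args struct has a 'filename'
 * member, generate the augmented struct (args followed by a 64 byte copy
 * of the string) and the function that fills it. snprintf() truncates
 * and always NUL terminates. The trailing struct declaration only
 * absorbs the ';' written at the point of use.
 */
#define AUGMENTED_FILENAME_ENTER_SKETCH(syscall)				\
struct augmented_enter_##syscall {						\
	struct syscall##_args args;						\
	char filename[64];							\
};										\
static inline void augment_enter_##syscall(const struct syscall##_args *args,	\
					    struct augmented_enter_##syscall *out) \
{										\
	out->args = *args;							\
	snprintf(out->filename, sizeof(out->filename), "%s", args->filename);	\
}										\
struct augmented_enter_##syscall

AUGMENTED_FILENAME_ENTER_SKETCH(open);
AUGMENTED_FILENAME_ENTER_SKETCH(openat);
```

Each extra syscall costs one args struct and one macro line, which is the point of the helper.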
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ls7ojdseu8fxw7fvj77ejpao@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Again, just changing tools/perf/examples/bpf/augmented_syscalls.c, which
is starting to have too much boilerplate; some macro will come to the
rescue.
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c
0.000 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /var/cache/app-info/yaml, mask: 16789454)
0.023 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /var/lib/app-info/xmls, mask: 16789454)
0.028 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /var/lib/app-info/yaml, mask: 16789454)
0.032 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /usr/share/app-info/yaml, mask: 16789454)
0.039 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /usr/local/share/app-info/xmls, mask: 16789454)
0.045 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /usr/local/share/app-info/yaml, mask: 16789454)
0.049 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /home/acme/.local/share/app-info/yaml, mask: 16789454)
0.056 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: , mask: 16789454)
0.010 gmain/2245 inotify_add_watch(fd: 7<anon_inode:inotify>, pathname: /home/acme/~, mask: 16789454)
0.087 perf/20116 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_inotify_add)
0.436 perf/20116 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/form)
56.042 gmain/2791 inotify_add_watch(fd: 4<anon_inode:inotify>, pathname: /var/lib/fwupd/remotes.d/lvfs-testing, mask: 16789454)
113.986 gmain/1721 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /var/lib/gdm/~, mask: 16789454)
3777.265 gsd-color/2408 openat(dfd: CWD, filename: /etc/localtime)
3777.550 gsd-color/2408 openat(dfd: CWD, filename: /etc/localtime)
^C[root@jouet perf]#
Still not combining raw_syscalls:sys_enter + raw_syscalls:sys_exit, to
get it strace-like, but that probably will come very naturally with some
more wiring up...
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ol83juin2cht9vzquynec5hz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
As described in the previous cset, all we had to do was to touch the
augmented_syscalls.c eBPF program, fire up 'perf trace' with that new
eBPF script in system wide mode and wait for 'open' syscalls, in
addition to 'openat' ones to see that it works:
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c
0.000 StreamT~s #200/16150 openat(dfd: CWD, filename: /home/acme/.mozilla/firefox/fqxhj76d.default/prefs.js, flags: CREAT|EXCL|TRUNC|WRONLY, mode: IRUSR|IWUSR)
0.065 StreamT~s #200/16150 openat(dfd: CWD, filename: /home/acme/.mozilla/firefox/fqxhj76d.default/prefs-1.js, flags: CREAT|EXCL|TRUNC|WRONLY, mode: IRUSR|IWUSR)
0.435 StreamT~s #200/16150 openat(dfd: CWD, filename: /home/acme/.mozilla/firefox/fqxhj76d.default/prefs-1.js, flags: CREAT|TRUNC|WRONLY, mode: IRUSR|IWUSR)
1.875 perf/16772 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/form)
1227.260 gnome-shell/1463 openat(dfd: CWD, filename: /proc/self/stat)
1227.397 gnome-shell/2125 openat(dfd: CWD, filename: /proc/self/stat)
7227.619 gnome-shell/1463 openat(dfd: CWD, filename: /proc/self/stat)
7227.661 gnome-shell/2125 openat(dfd: CWD, filename: /proc/self/stat)
10018.079 gnome-shell/1463 openat(dfd: CWD, filename: /proc/self/stat)
10018.514 perf/16772 openat(dfd: CWD, filename: /proc/1237/status)
10018.568 perf/16772 openat(dfd: CWD, filename: /proc/1237/status)
10022.409 gnome-shell/2125 openat(dfd: CWD, filename: /proc/self/stat)
10090.044 NetworkManager/1237 openat(dfd: CWD, filename: /proc/2125/stat)
10090.351 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
10090.407 perf/16772 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/format)
10091.763 NetworkManager/1237 openat(dfd: CWD, filename: /proc/2125/stat)
10091.812 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
10092.807 NetworkManager/1237 openat(dfd: CWD, filename: /proc/2125/stat)
10092.851 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
10094.650 NetworkManager/1237 openat(dfd: CWD, filename: /proc/1463/stat)
10094.926 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
10096.010 NetworkManager/1237 openat(dfd: CWD, filename: /proc/1463/stat)
10096.057 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
10097.056 NetworkManager/1237 openat(dfd: CWD, filename: /proc/1463/stat)
10097.099 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC)
13228.345 gnome-shell/1463 openat(dfd: CWD, filename: /proc/self/stat)
13232.734 gnome-shell/2125 openat(dfd: CWD, filename: /proc/self/stat)
15198.956 lighttpd/16748 open(filename: /proc/loadavg, mode: ISGID|IXOTH)
^C#
It even catches 'perf' itself looking at the sys_enter_open and
sys_enter_openat tracefs format dictionaries when it first finds them in
the trace... :-)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-upmogc57uatljr6el6u8537l@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is the final touch in showing how a syscall argument beautifier can
access the augmented args put in place by the
tools/perf/examples/bpf/augmented_syscalls.c eBPF script, right after
the regular raw syscall args, i.e. the up to 6 long integer values in
the syscall interface.
With this we are able to show the 'openat' syscall arg, now with up to
64 bytes, but in time this will be configurable, just like with the
'strace -s strsize' argument, from 'strace''s man page:
-s strsize Specify the maximum string size to print (the default is 32).
This actually is the maximum string to _collect_ and store in the ring
buffer, not just print.
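A small sketch of what a collection limit means in practice, in plain C. It assumes the terminating NUL counts against the 64 bytes, which would match the 63 character truncated paths visible in other outputs in this log; the function name is hypothetical:

```c
#include <stddef.h>

#define AUGMENTED_STR_MAX 64	/* from the text: up to 64 bytes for now */

/*
 * Copy at most size - 1 bytes of src into dst and always NUL terminate.
 * Returns the number of bytes stored, terminator included, i.e. what
 * would occupy space in the ring buffer.
 */
static size_t collect_string(char *dst, size_t size, const char *src)
{
	size_t i = 0;

	if (size == 0)
		return 0;

	while (i < size - 1 && src[i] != '\0') {
		dst[i] = src[i];
		i++;
	}
	dst[i] = '\0';
	return i + 1;
}
```

Anything past the limit is never collected, so no later beautifier can print it.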
Before:
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
0.000 ( ): cat/9658 openat(dfd: CWD, filename: 0x6626eda8, flags: CLOEXEC)
0.017 ( 0.007 ms): cat/9658 openat(dfd: CWD, filename: 0x6626eda8, flags: CLOEXEC) = 3
0.049 ( ): cat/9658 openat(dfd: CWD, filename: 0x66476ce0, flags: CLOEXEC)
0.051 ( 0.007 ms): cat/9658 openat(dfd: CWD, filename: 0x66476ce0, flags: CLOEXEC) = 3
0.377 ( ): cat/9658 openat(dfd: CWD, filename: 0x1e8f806b)
0.379 ( 0.005 ms): cat/9658 openat(dfd: CWD, filename: 0x1e8f806b) = 3
#
After:
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
0.000 ( ): cat/11966 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
0.006 ( 0.006 ms): cat/11966 openat(dfd: CWD, filename: 0x4bfdcda8, flags: CLOEXEC) = 3
0.034 ( ): cat/11966 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC)
0.036 ( 0.008 ms): cat/11966 openat(dfd: CWD, filename: 0x4c1e4ce0, flags: CLOEXEC) = 3
0.375 ( ): cat/11966 openat(dfd: CWD, filename: /etc/passwd)
0.377 ( 0.005 ms): cat/11966 openat(dfd: CWD, filename: 0xe87906b) = 3
#
This cset should show all the aspects of establishing a protocol between
an eBPF syscall arg augmenter program, tools/perf/examples/bpf/augmented_syscalls.c and
a 'perf trace' beautifier, the one associated with all 'char *' pointer
syscall args with names that can heuristically be associated with
filenames.
Now to wire up 'open' to show a second syscall using this scheme, all we
have to do now is to change tools/perf/examples/bpf/augmented_syscalls.c,
as 'perf trace' will notice that the perf_sample.raw_size is more than
what is expected for a particular syscall payload as defined by its
tracefs format file and will then use the augmented payload in the
'filename' syscall arg beautifier.
The same protocol will be used for structs such as 'struct sockaddr *',
'struct pollfd', etc, with additions for handling arrays.
This will all be done under the hood when 'perf trace' realizes the
system has the necessary components, and also can be done by providing
a precompiled augmented_syscalls.c eBPF ELF object.
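As a rough illustration of the payload layout this protocol implies, the
sketch below appends a bounded copy of the filename after the regular openat
enter arguments. It is a userspace model only: the struct names, the function
name and the 64-byte limit are assumptions made for illustration, not the
contents of augmented_syscalls.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the fields listed in the tracefs format file. */
struct enter_openat_args {
	unsigned long long common_tp_fields;
	long               syscall_nr;
	long               dfd;
	const char        *filename_ptr;
	long               flags;
	long               mode;
};

/* Regular payload followed by the expanded pointer contents. */
struct augmented_enter_openat_args {
	struct enter_openat_args args;
	char                     filename[64]; /* assumed limit, in the spirit of 'strace -s' */
};

/* Fill the augmented part, always NUL-terminating; returns bytes stored. */
static size_t augment_filename(struct augmented_enter_openat_args *aug,
			       const char *filename)
{
	size_t len;

	assert(aug != NULL && filename != NULL);
	len = strlen(filename);
	if (len >= sizeof(aug->filename))
		len = sizeof(aug->filename) - 1;
	memcpy(aug->filename, filename, len);
	aug->filename[len] = '\0';
	return len + 1;
}
```

The consumer can tell the two parts apart purely by size, since the regular
part has the length given by the format file.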
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-gj9kqb61wo7m3shtpzercbcr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To get us a bit more like the sys_enter + sys_exit combo:
Before:
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
0.000 ( ): openat(dfd: CWD, filename: 0x31b6dda8, flags: CLOEXEC)
0.009 ( 0.009 ms): cat/3619 openat(dfd: CWD, filename: 0x31b6dda8, flags: CLOEXEC) = 3
0.051 ( ): openat(dfd: CWD, filename: 0x31d75ce0, flags: CLOEXEC)
0.054 ( 0.010 ms): cat/3619 openat(dfd: CWD, filename: 0x31d75ce0, flags: CLOEXEC) = 3
0.539 ( ): openat(dfd: CWD, filename: 0xca71506b)
0.543 ( 0.115 ms): cat/3619 openat(dfd: CWD, filename: 0xca71506b) = 3
#
After:
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
0.000 ( ): cat/4919 openat(dfd: CWD, filename: 0xc8358da8, flags: CLOEXEC)
0.007 ( 0.005 ms): cat/4919 openat(dfd: CWD, filename: 0xc8358da8, flags: CLOEXEC) = 3
0.032 ( ): cat/4919 openat(dfd: CWD, filename: 0xc8560ce0, flags: CLOEXEC)
0.033 ( 0.006 ms): cat/4919 openat(dfd: CWD, filename: 0xc8560ce0, flags: CLOEXEC) = 3
0.301 ( ): cat/4919 openat(dfd: CWD, filename: 0x91fa306b)
0.304 ( 0.004 ms): cat/4919 openat(dfd: CWD, filename: 0x91fa306b) = 3
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-6w8ytyo5y655a1hsyfpfily6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Will be used with augmented syscalls, where we haven't transitioned
completely to combining sys_enter_FOO with sys_exit_FOO, so we'll make
it as similar as possible to the strace-like end result.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-canomaoiybkswwnhj69u9ae4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Since we copy all the payload for raw_syscalls:sys_enter plus add
expanded pointers, we can use the syscall id to get its name, etc:
# grep 'field:.* id' /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/format
field:long id; offset:8; size:8; signed:1;
#
Before:
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
0.000 ( ): __augmented_syscalls__:dfd: CWD, filename: 0xec9f9da8, flags: CLOEXEC
0.006 ( 0.006 ms): cat/2395 openat(dfd: CWD, filename: 0xec9f9da8, flags: CLOEXEC) = 3
0.041 ( ): __augmented_syscalls__:dfd: CWD, filename: 0xecc01ce0, flags: CLOEXEC
0.042 ( 0.007 ms): cat/2395 openat(dfd: CWD, filename: 0xecc01ce0, flags: CLOEXEC) = 3
0.376 ( ): __augmented_syscalls__:dfd: CWD, filename: 0xac0a806b
0.379 ( 0.006 ms): cat/2395 openat(dfd: CWD, filename: 0xac0a806b) = 3
#
After:
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
0.000 ( ): openat(dfd: CWD, filename: 0x31b6dda8, flags: CLOEXEC)
0.009 ( 0.009 ms): cat/3619 openat(dfd: CWD, filename: 0x31b6dda8, flags: CLOEXEC) = 3
0.051 ( ): openat(dfd: CWD, filename: 0x31d75ce0, flags: CLOEXEC)
0.054 ( 0.010 ms): cat/3619 openat(dfd: CWD, filename: 0x31d75ce0, flags: CLOEXEC) = 3
0.539 ( ): openat(dfd: CWD, filename: 0xca71506b)
0.543 ( 0.115 ms): cat/3619 openat(dfd: CWD, filename: 0xca71506b) = 3
#
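The 'id' lookup described in this message can be sketched as follows, using
the offset and size reported by the format file (8 and 8). The function name
is hypothetical and this is not the code added to 'perf trace'; it only shows
how the id can be read from the copied raw payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset and size of the 'id' field, as shown by the format file. */
#define SYS_ENTER_ID_OFFSET 8
#define SYS_ENTER_ID_SIZE   8

/* Returns the syscall id, or -1 if the payload is too short to hold it. */
static int64_t raw_sys_enter_id(const void *raw_data, size_t raw_size)
{
	int64_t id;

	assert(raw_data != NULL);
	if (raw_size < SYS_ENTER_ID_OFFSET + SYS_ENTER_ID_SIZE)
		return -1;
	/* memcpy avoids alignment assumptions about the raw buffer. */
	memcpy(&id, (const char *)raw_data + SYS_ENTER_ID_OFFSET, sizeof(id));
	return id;
}
```

The returned id can then be used as an index into a syscall name table.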
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-epz6y9i0eavmerc5ha98t7gn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When we attach an eBPF object to a tracepoint, if we return 1, then that
tracepoint will be stored in perf's ring buffer. In the
augmented_syscalls.c case we want to just attach and _override_ the
tracepoint payload with an augmented, extended one.
In this example, tools/perf/examples/bpf/augmented_syscalls.c, we are
attaching to the 'openat' syscall, and adding, after the
syscalls:sys_enter_openat usual payload as defined by
/sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/format, a
snapshot of its sole pointer arg:
# grep 'field:.*\*' /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/format
field:const char * filename; offset:24; size:8; signed:0;
#
For now this is not being considered, the next csets will make use of
it, but as this is overriding the syscall tracepoint enter, we don't
want that event appearing on the ring buffer, just our synthesized one.
Before:
# perf trace -e ~acme/git/perf/tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
0.000 ( ): __augmented_syscalls__:dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC
0.006 ( ): syscalls:sys_enter_openat:dfd: CWD, filename: , flags: CLOEXEC
0.007 ( 0.004 ms): cat/24044 openat(dfd: CWD, filename: 0x216dda8, flags: CLOEXEC ) = 3
0.028 ( ): __augmented_syscalls__:dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC
0.030 ( ): syscalls:sys_enter_openat:dfd: CWD, filename: , flags: CLOEXEC
0.031 ( 0.006 ms): cat/24044 openat(dfd: CWD, filename: 0x2375ce0, flags: CLOEXEC ) = 3
0.291 ( ): __augmented_syscalls__:dfd: CWD, filename: /etc/passwd
0.293 ( ): syscalls:sys_enter_openat:dfd: CWD, filename:
0.294 ( 0.004 ms): cat/24044 openat(dfd: CWD, filename: 0x637db06b ) = 3
#
After:
# perf trace -e ~acme/git/perf/tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
0.000 ( ): __augmented_syscalls__:dfd: CWD, filename: 0x9c6a1da8, flags: CLOEXEC
0.005 ( 0.015 ms): cat/27341 openat(dfd: CWD, filename: 0x9c6a1da8, flags: CLOEXEC ) = 3
0.040 ( ): __augmented_syscalls__:dfd: CWD, filename: 0x9c8a9ce0, flags: CLOEXEC
0.041 ( 0.006 ms): cat/27341 openat(dfd: CWD, filename: 0x9c8a9ce0, flags: CLOEXEC ) = 3
0.294 ( ): __augmented_syscalls__:dfd: CWD, filename: 0x482a706b
0.296 ( 0.067 ms): cat/27341 openat(dfd: CWD, filename: 0x482a706b ) = 3
#
Now let's replace that __augmented_syscalls__ name with the syscall name,
using:
# grep 'field:.*syscall_nr' /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/format
field:int __syscall_nr; offset:8; size:4; signed:1;
#
The synthesized payload has that field exactly where the syscall enter
tracepoint puts it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-og4r9k87mzp9hv7el046idmd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If the tracepoint payload is bigger than what a syscall expected from
what is in its format file in tracefs, then that will be used as
augmented args, i.e. the expansion of syscall arg pointers, with things
like a filename, structs, etc.
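A minimal sketch of that size check, assuming the expected size has already
been derived from the format file. The function and parameter names are
hypothetical, not the ones used in 'perf trace'.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return a pointer to the bytes that follow the expected payload, or NULL
 * when the sample carries nothing extra. 'expected_size' would come from
 * the syscall's tracefs format file.
 */
static const void *augmented_args(const void *raw_data, size_t raw_size,
				  size_t expected_size, size_t *augmented_size)
{
	assert(raw_data != NULL && augmented_size != NULL);
	if (raw_size <= expected_size) {
		*augmented_size = 0;
		return NULL;
	}
	*augmented_size = raw_size - expected_size;
	return (const char *)raw_data + expected_size;
}
```

A beautifier can fall back to printing the raw pointer value whenever this
returns NULL.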
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-bsbqx7xi2ot4q9bf570f7tqs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Starting with binutils 2.28, aarch64 objdump adds comments to the
disassembly output to show the alternative names of a condition code
[1].
It is assumed that commas in objdump comments could occur in other
arches now or in the future, so this fix is arch-independent.
The fix could have been done with arm64 specific jump__parse and
jump__scnprintf functions, but the jump__scnprintf instruction would
have to have its comment character be a literal, since the scnprintf
functions cannot receive a struct arch easily.
This inconvenience also applies to the generic jump__scnprintf, which is
why we add a raw_comment pointer to struct ins_operands, so the __parse
function assigns it to be re-used by its corresponding __scnprintf
function.
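The idea of remembering where the comment starts, so that commas inside it
are not mistaken for operand separators, can be sketched like this. The
function name is hypothetical, and passing the comment character as a
parameter is an assumption of this sketch rather than how the patch
structures it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Find the last comma that belongs to the operands, ignoring any that
 * appear inside a trailing objdump comment. 'comment_char' is per arch
 * (assumed to be '/' for the "//" comments shown in this message).
 * '*raw_comment' is set to the start of the comment, or NULL if none.
 */
static const char *last_operand_comma(const char *operands, char comment_char,
				      const char **raw_comment)
{
	const char *end, *p, *comma = NULL;

	assert(operands != NULL && raw_comment != NULL);
	*raw_comment = strchr(operands, comment_char);
	end = *raw_comment ? *raw_comment : operands + strlen(operands);

	for (p = operands; p < end; p++)
		if (*p == ',')
			comma = p;
	return comma;
}
```

Keeping the comment pointer alongside the parsed operands lets the printing
side reuse it without needing access to the arch description.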
Example differences in 'perf annotate --stdio2' output on an aarch64
perf.data file:
BEFORE: → b.cs ffff200008133d1c <unwind_frame+0x18c> // b.hs, dffff7ecc47b
AFTER : ↓ b.cs 18c
BEFORE: → b.cc ffff200008d8d9cc <get_alloc_profile+0x31c> // b.lo, b.ul, dffff727295b
AFTER : ↓ b.cc 31c
The branch target labels 18c and 31c also now appear in the output:
BEFORE: add x26, x29, #0x80
AFTER : 18c: add x26, x29, #0x80
BEFORE: add x21, x21, #0x8
AFTER : 31c: add x21, x21, #0x8
The Fixes: tag below is added so stable branches will get the update; it
doesn't necessarily mean that commit was broken at the time, rather it
didn't withstand the aarch64 objdump update.
Tested no difference in output for sample x86_64, power arch perf.data files.
[1] https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=bb7eff5206e4795ac79c177a80fe9f4630aaf730
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Fixes: b13bbeee5ee6 ("perf annotate: Fix branch instruction with multiple operands")
Link: http://lkml.kernel.org/r/20180827125340.a2f7e291901d17cea05daba4@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This makes sure that the SyS symbols are ignored for any powerpc system,
not just the big endian ones.
Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Fixes: fb6d59423115 ("perf probe ppc: Use the right prefix when ignoring SyS symbols on ppc")
Link: http://lkml.kernel.org/r/20180828090848.1914-1-sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Some implementations of libc do not support the 'm' width modifier as
part of the scanf string format specifier. This can cause the parsing to
fail. Since the parser never checks if the scanf parsing was
successful, this can result in a crash.
Change the comm string to be allocated as a fixed size instead of
dynamically using 'm' scanf width modifier. This can be safely done
since comm size is limited to 16 bytes by TASK_COMM_LEN within the
kernel.
This change prevents perf from crashing when linked against bionic as
well as reduces the total number of heap allocations and frees invoked
while accomplishing the same task.
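A minimal sketch of the fixed-size approach, assuming a "<pid> <comm>" line
format. The function name and buffer sizing are illustrative and not taken
from the patch; the sketch also checks the scanf result, which the text above
notes the parser did not do.

```c
#include <assert.h>
#include <stdio.h>

#define COMM_LEN 16 /* TASK_COMM_LEN in the kernel, including the NUL */

/*
 * Parse one "<pid> <comm>" line into a caller-provided buffer.
 * Returns 0 on success, -1 if the line did not match.
 */
static int parse_cmdline_entry(const char *line, int *pid,
			       char comm[COMM_LEN + 1])
{
	assert(line != NULL && pid != NULL && comm != NULL);
	/* The width limit keeps sscanf within the buffer; the result is checked. */
	if (sscanf(line, "%d %16s", pid, comm) != 2)
		return -1;
	return 0;
}
```

Because the buffer lives on the caller's stack, no allocation or free is
needed per parsed line.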
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830021950.15563-1-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In the write to the output_fd in the error condition of
record_saved_cmdline(), we are writing 8 bytes from a memory location on
the stack that contains a primitive that is only 4 bytes in size.
Change the primitive to 8 bytes in size to match the size of the write
in order to avoid reading unknown memory from the stack.
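The fix amounts to making the variable as wide as the write. A sketch under
that assumption, with a hypothetical helper name (the real code lives in
record_saved_cmdline()):

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Emit an 8-byte zero "size" record for the error path. The variable is
 * 64 bits wide so that the write never reads past it.
 */
static int write_empty_size(int fd)
{
	uint64_t size = 0;

	assert(fd >= 0);
	return write(fd, &size, sizeof(size)) == (ssize_t)sizeof(size) ? 0 : -1;
}
```

Using sizeof(size) for the length keeps the write and the variable in sync if
the type ever changes.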
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180829061954.18871-1-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We were emitting 4 lines, two of them misleading:
make: Entering directory '/home/acme/git/perf/tools/perf'
<SNIP>
INSTALL lib
INSTALL include/bpf
INSTALL lib
INSTALL examples/bpf
<SNIP>
make: Leaving directory '/home/acme/git/perf/tools/perf'
Make it more compact by showing just two lines:
make: Entering directory '/home/acme/git/perf/tools/perf'
INSTALL bpf-headers
INSTALL bpf-examples
make: Leaving directory '/home/acme/git/perf/tools/perf'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-0nvkyciqdkrgy829lony5925@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If evsel is NULL, we should return NULL to avoid a NULL pointer
dereference a bit later in the code.
Signed-off-by: Hisao Tanabe <xtanabe@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 03e0a7df3efd ("perf tools: Introduce bpf-output event")
LPU-Reference: 20180824154556.23428-1-xtanabe@gmail.com
Link: https://lkml.kernel.org/n/tip-e5plzjhx6595a5yjaf22jss3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The new syscall table support for arm64 mistakenly used the system's
asm-generic/unistd.h file when processing the
tools/arch/arm64/include/uapi/asm/unistd.h file's include directive:
#include <asm-generic/unistd.h>
See "Committer notes" section of commit 2b5882435606 "perf arm64:
Generate system call table from asm/unistd.h" for more details.
This patch removes the committer's temporary workaround, and instructs
the host compiler to search the build tree's include path for the right
copy of the unistd.h file, instead of the one on the system's
/usr/include path.
It thus fixes the committer's test that cross-builds an arm64 perf on an
x86 platform running Ubuntu 14.04.5 LTS with an old toolchain:
$ tools/perf/arch/arm64/entry/syscalls/mksyscalltbl /gcc-linaro-5.4.1-2017.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc gcc `pwd`/tools tools/arch/arm64/include/uapi/asm/unistd.h | grep bpf
[280] = "bpf",
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Fixes: 2b5882435606 ("perf arm64: Generate system call table from asm/unistd.h")
Link: http://lkml.kernel.org/r/20180806172800.bbcec3cfcc51e2facc978bf2@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We can safely enable the breakpoint back for both the fail and success
paths by checking only the bp->attr.disabled, which either holds the new
'requested' disabled state or the original breakpoint state.
Committer testing:
At the end of the series, the 'perf test' entry introduced as the first
patch now runs to completion without finding the fixed issues:
# perf test "bp modify"
62: x86 bp modify : Ok
#
In verbose mode:
# perf test -v "bp modify"
62: x86 bp modify :
--- start ---
test child forked, pid 5161
rip 5950a0, bp_1 0x5950a0
in bp_1
rip 5950a0, bp_1 0x5950a0
in bp_1
test child finished with 0
---- end ----
x86 bp modify: Ok
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Milind Chabbi <chabbi.milind@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180827091228.2878-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently we enable the breakpoint back only if the breakpoint
modification was successful. If it fails we can leave the breakpoint in
disabled state with attr->disabled == 0.
We can safely enable the breakpoint back for both the fail and success
paths by checking the bp->attr.disabled, which either holds the new
'requested' disabled state or the original breakpoint state.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Milind Chabbi <chabbi.milind@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180827091228.2878-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Once the breakpoint has been successfully modified, the attr->disabled
value is already in bp->attr.disabled, so there's no reason to set it
again. Remove that assignment.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Milind Chabbi <chabbi.milind@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180827091228.2878-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We need to change the breakpoint even if the attr with new fields has
disabled set to true.
The current code prevents the following user code from changing the
breakpoint address:
ptrace(PTRACE_POKEUSER, child, offsetof(struct user, u_debugreg[0]), addr_1)
ptrace(PTRACE_POKEUSER, child, offsetof(struct user, u_debugreg[0]), addr_2)
ptrace(PTRACE_POKEUSER, child, offsetof(struct user, u_debugreg[7]), dr7)
The first PTRACE_POKEUSER creates the breakpoint with attr.disabled set
to true:
ptrace_set_breakpoint_addr(nr = 0)
struct perf_event *bp = t->ptrace_bps[nr];
ptrace_register_breakpoint(..., disabled = true)
ptrace_fill_bp_fields(..., disabled)
register_user_hw_breakpoint
So the second PTRACE_POKEUSER will be omitted:
ptrace_set_breakpoint_addr(nr = 0)
struct perf_event *bp = t->ptrace_bps[nr];
struct perf_event_attr attr = bp->attr;
modify_user_hw_breakpoint(bp, &attr)
if (!attr->disabled)
modify_user_hw_breakpoint_check
Reported-by: Milind Chabbi <chabbi.milind@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180827091228.2878-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add two tests that target kernel breakpoint modification bugs.
The first test creates a HW breakpoint, tries to change it and checks that
it was properly changed. It targets a kernel issue that prevents a HW
breakpoint from being changed via the ptrace interface.
The first test forks; the child sets itself as a ptrace tracee and waits
in a signal for the parent to trace it, then it calls bp_1 and quits.
The parent does the following steps:
- creates a new breakpoint (id 0) for bp_2 function
- changes that breakpoint to bp_1 function
- waits for the breakpoint to hit and checks
it has proper rip of bp_1 function
This test targets a kernel issue that prevents disabled breakpoints
from being changed.
The second test mimics the first one except for a few steps
in the parent:
- creates a new breakpoint (id 0) for bp_1 function
- changes that breakpoint to bogus (-1) address
- waits for the breakpoint to hit and checks
it has proper rip of bp_1 function
This test targets a kernel issue that disables an enabled
breakpoint after an unsuccessful change.
Committer testing:
# uname -a
Linux jouet 4.18.0-rc8-00002-g1236568ee3cb #12 SMP Tue Aug 7 14:08:26 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
# perf test -v "bp modify"
62: x86 bp modify :
--- start ---
test child forked, pid 25671
in bp_1
tracee exited prematurely 2
FAILED arch/x86/tests/bp-modify.c:209 modify test 1 failed
test child finished with -1
---- end ----
x86 bp modify: FAILED!
#
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Milind Chabbi <chabbi.milind@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180827091228.2878-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The patch changes the parsing of:
callq *0x8(%rbx)
from:
0.26 │ → callq *8
to:
0.26 │ → callq *0x8(%rbx)
In this case the address is followed by a register, so one can't parse
only the address.
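One way to express that rule is sketched below. The function name and the
exact test for a register operand are assumptions of this sketch and may
differ from what the patch does in the annotate code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse the operand of an indirect call written as "*<hex>" or
 * "*<hex>(%reg)". Returns true and stores the address only when the
 * operand is a plain address; a register-relative target is left unparsed.
 */
static bool parse_indirect_call(const char *operand, unsigned long long *addr)
{
	assert(operand != NULL && addr != NULL);
	if (operand[0] != '*')
		return false;
	if (strchr(operand, '(') != NULL)
		return false;       /* e.g. *0x8(%rbx): keep the raw text */
	*addr = strtoull(operand + 1, NULL, 16);
	return true;
}
```

When the function returns false the caller can print the raw operand text,
which is what yields 'callq *0x8(%rbx)' in the output.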
Committer testing:
1) run 'perf record sleep 10'
2) before applying the patch, run:
perf annotate --stdio2 > /tmp/before
3) after applying the patch, run:
perf annotate --stdio2 > /tmp/after
4) diff /tmp/before /tmp/after:
--- /tmp/before 2018-08-28 11:16:03.238384143 -0300
+++ /tmp/after 2018-08-28 11:15:39.335341042 -0300
@@ -13274,7 +13274,7 @@
↓ jle 128
hash_value = hash_table->hash_func (key);
mov 0x8(%rsp),%rdi
- 0.91 → callq *30
+ 0.91 → callq *0x30(%r12)
mov $0x2,%r8d
cmp $0x2,%eax
node_hash = hash_table->hashes[node_index];
@@ -13848,7 +13848,7 @@
mov %r14,%rdi
sub %rbx,%r13
mov %r13,%rdx
- → callq *38
+ → callq *0x38(%r15)
cmp %rax,%r13
1.91 ↓ je 240
1b4: mov $0xffffffff,%r13d
@@ -14026,7 +14026,7 @@
mov %rcx,-0x500(%rbp)
mov %r15,%rsi
mov %r14,%rdi
- → callq *38
+ → callq *0x38(%rax)
mov -0x500(%rbp),%rcx
cmp %rax,%rcx
↓ jne 9b0
<SNIP tons of other such cases>
Signed-off-by: Martin Liška <mliska@suse.cz>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Kim Phillips <kim.phillips@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/bd1f3932-be2b-85f9-7582-111ee0a43b07@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Pull mtd fixes from Boris Brezillon:
"Raw NAND fixes:
- denali: Fix a regression caused by the nand_scan() rework
- docg4: Fix a build error when gcc decides to not inline some
functions (can be reproduced with gcc 4.1.2):
* tag 'mtd/for-4.19-rc2' of git://git.infradead.org/linux-mtd:
mtd: rawnand: denali: do not pass zero maxchips to nand_scan()
mtd: rawnand: docg4: Remove wrong __init annotations
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Fix unsupported parallel dispatch of requests
MMC host:
- atmel-mci/android-goldfish: Fixup logic of sg_copy_{from,to}_buffer
- renesas_sdhi_internal_dmac: Prevent IRQ-storm due to DMAC IRQs
- renesas_sdhi_internal_dmac: Fixup bad register offset"
* tag 'mmc-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: renesas_sdhi_internal_dmac: mask DMAC interrupts
mmc: renesas_sdhi_internal_dmac: fix #define RST_RESERVED_BITS
mmc: block: Fix unsupported parallel dispatch of requests
mmc: android-goldfish: fix bad logic of sg_copy_{from,to}_buffer conversion
mmc: atmel-mci: fix bad logic of sg_copy_{from,to}_buffer conversion
|
|
When filtering by guest (interactive commands 'p'/'g'), and the respective
guest was destroyed, detect when the guest is up again through the guest
name if possible.
I.e. when displaying events for a specific guest, it is not necessary
anymore to restart kvm_stat in case the guest is restarted.
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
For destroyed guests, kvm_stat essentially freezes with the last data
displayed. This is acceptable for users, in case they want to inspect the
final data. But it looks a bit irritating. Therefore, detect this situation
and display a respective indicator in the header.
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
When running with the DebugFS provider, removal of a guest can result in a
negative CurAvg/s, which looks rather confusing.
If so, suppress the body refresh and print a message instead.
To reproduce, have at least one guest A completely booted. Then start
another guest B (which generates a huge amount of events), then destroy B.
On the next refresh, kvm_stat should display a whole lot of negative values
in the CurAvg/s column.
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
When setting a PID filter in debugfs, we unnecessarily reset the
statistics, although there is no reason to do so. This behavior was
merely introduced with commit 9f114a03c6854f "tools/kvm_stat: add
interactive command 'r'", most likely to mimic the behavior of
the tracepoints provider in this respect. However, there are plenty
of differences between the two providers, so there is no reason not
to take advantage of the possibility to filter by PID without
resetting the statistics.
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
With pid filtering active, when a guest is removed e.g. via virsh shutdown,
successive updates produce garbage.
Therefore, we add code to detect this case and prevent further body updates.
Note that when displaying the help dialog via 'h' in this case, once we exit
we're stuck with the 'Collecting data...' message till we remove the filter.
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
When filtering by guest, kvm_stat displays garbage when the guest is
destroyed - see sample output below.
We add code to remove the invalid paths from the providers, so at least
no more garbage is displayed.
Here's a sample output to illustrate:
kvm statistics - pid 13986 (foo)
Event Total %Total CurAvg/s
diagnose_258 -2 0.0 0
deliver_program_interruption -3 0.0 0
diagnose_308 -4 0.0 0
halt_poll_invalid -91 0.0 -6
deliver_service_signal -244 0.0 -16
halt_successful_poll -250 0.1 -17
exit_pei -285 0.1 -19
exit_external_request -312 0.1 -21
diagnose_9c -328 0.1 -22
userspace_handled -713 0.1 -47
halt_attempted_poll -939 0.2 -62
deliver_emergency_signal -3126 0.6 -208
halt_wakeup -7199 1.5 -481
exit_wait_state -7379 1.5 -493
diagnose_500 -56499 11.5 -3757
exit_null -85491 17.4 -5685
diagnose_44 -133300 27.1 -8874
exit_instruction -195898 39.8 -13037
Total -492063
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Python3 returns a float for a regular division - switch to a division
operator that returns an integer.
Furthermore, filter() returns an iterator instead of an actual
list - wrap the result in yet another list, which keeps it working in
both Python 2 and 3.
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Allowing x86_emulate_instruction() to be called directly has led to
subtle bugs being introduced, e.g. not setting EMULTYPE_NO_REEXECUTE
in the emulation type. While most of the blame lies on re-execute
being opt-out, exporting x86_emulate_instruction() also exposes its
cr2 parameter, which may have contributed to commit d391f1207067
("x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO
when running nested") using x86_emulate_instruction() instead of
emulate_instruction() because "hey, I have a cr2!", which in turn
introduced its EMULTYPE_NO_REEXECUTE bug.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Lack of the kvm_ prefix gives the impression that it's a VMX or SVM
specific function, and there's no conflict that prevents adding the
kvm_ prefix.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Commit a6f177efaa58 ("KVM: Reenter guest after emulation failure if
due to access to non-mmio address") added reexecute_instruction() to
handle the scenario where two (or more) vCPUs race to write a shadowed
page, i.e. reexecute_instruction() is intended to return true if and
only if the instruction being emulated was accessing a shadowed page.
As L0 is only explicitly shadowing L1 tables, an emulation failure of
a nested VM instruction cannot be due to a race to write a shadowed
page and so should never be re-executed.
This fixes an issue where an "MMIO" emulation failure[1] in L2 is all
but guaranteed to result in an infinite loop when TDP is enabled.
Because "cr2" is actually an L2 GPA when TDP is enabled, calling
kvm_mmu_gva_to_gpa_write() to translate cr2 in the non-direct mapped
case (L2 is never direct mapped) will almost always yield UNMAPPED_GVA
and cause reexecute_instruction() to immediately return true. The
!mmio_info_in_cache() check in kvm_mmu_page_fault() doesn't catch this
case because mmio_info_in_cache() returns false for a nested MMU (the
MMIO caching currently handles L1 only, e.g. to cache nested guests'
GPAs we'd have to manually flush the cache when switching between
VMs and when L1 updated its page tables controlling the nested guest).
Way back when, commit 68be0803456b ("KVM: x86: never re-execute
instruction with enabled tdp") changed reexecute_instruction() to
always return false when using TDP under the assumption that KVM would
only get into the emulator for MMIO. Commit 95b3cf69bdf8 ("KVM: x86:
let reexecute_instruction work for tdp") effectively reverted that
behavior in order to handle the scenario where emulation failed due to
an access from L1 to the shadow page tables for L2, but it didn't
account for the case where emulation failed in L2 with TDP enabled.
All of the above logic also applies to retry_instruction(), added by
commit 1cb3f3ae5a38 ("KVM: x86: retry non-page-table writing
instructions"). An indefinite loop in retry_instruction() should be
impossible as it protects against retrying the same instruction over
and over, but it's still correct to not retry an L2 instruction in
the first place.
Fix the immediate issue by adding a check for a nested guest when
determining whether or not to allow retry in kvm_mmu_page_fault().
In addition to fixing the immediate bug, add WARN_ON_ONCE in the
retry functions since they are not designed to handle nested cases,
i.e. they need to be modified even if there is some scenario in the
future where we want to allow retrying a nested guest.
[1] This issue was encountered after commit 3a2936dedd20 ("kvm: mmu:
Don't expose private memslots to L2") changed the page fault path
to return KVM_PFN_NOSLOT when translating an L2 access to a
private memslot. Returning KVM_PFN_NOSLOT is semantically correct
when we want to hide a memslot from L2, i.e. there effectively is
no defined memory region for L2, but it has the unfortunate side
effect of making KVM think the GFN is a MMIO page, thus triggering
emulation. The failure occurred with in-development code that
deliberately exposed a private memslot to L2, which L2 accessed
with an instruction that is not emulated by KVM.
Fixes: 95b3cf69bdf8 ("KVM: x86: let reexecute_instruction work for tdp")
Fixes: 1cb3f3ae5a38 ("KVM: x86: retry non-page-table writing instructions")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Cc: Xiao Guangrong <xiaoguangrong@tencent.com>
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Effectively force kvm_mmu_page_fault() to opt-in to allowing retry to
make it more obvious when and why it allows emulation to be retried.
Previously this approach was less convenient due to retry and
re-execute behavior being controlled by separate flags that were also
inverted in their implementations (opt-in versus opt-out).
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
retry_instruction() and reexecute_instruction() are a package deal,
i.e. there is no scenario where one is allowed and the other is not.
Merge their controlling emulation type flags to enforce this in code.
Name the combined flag EMULTYPE_ALLOW_RETRY to make it abundantly
clear that we are allowing re{try,execute} to occur, as opposed to
explicitly requesting retry of a previously failed instruction.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Re-execution of an instruction after emulation decode failure is
intended to be used only when emulating shadow page accesses. Invert
the flag to make allowing re-execution opt-in since that behavior is
by far in the minority.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|