Age | Commit message (Collapse) | Author |
|
Add mmaping tests that generates prctl call on every cpu validates it
gets all the related events in ring buffer.
Committer testing:
# make -C tools/perf/lib tests
make: Entering directory '/home/acme/git/perf/tools/perf/lib'
LINK test-cpumap-a
LINK test-threadmap-a
LINK test-evlist-a
LINK test-evsel-a
LINK test-cpumap-so
LINK test-threadmap-so
LINK test-evlist-so
LINK test-evsel-so
running static:
- running test-cpumap.c...OK
- running test-threadmap.c...OK
- running test-evlist.c...OK
- running test-evsel.c...OK
running dynamic:
- running test-cpumap.c...OK
- running test-threadmap.c...OK
- running test-evlist.c...OK
- running test-evsel.c...OK
make: Leaving directory '/home/acme/git/perf/tools/perf/lib'
#
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20191017105918.20873-8-jolsa@kernel.org
[ Added _GNU_SOURCE define for sched.h to get sched_[gs]et_affinity
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add mmaping tests that generates 100 prctl calls in monitored child
process and validates it gets 100 events in ring buffer.
Committer tests:
# make -C tools/perf/lib tests
make: Entering directory '/home/acme/git/perf/tools/perf/lib'
LINK test-cpumap-a
LINK test-threadmap-a
LINK test-evlist-a
LINK test-evsel-a
LINK test-cpumap-so
LINK test-threadmap-so
LINK test-evlist-so
LINK test-evsel-so
running static:
- running test-cpumap.c...OK
- running test-threadmap.c...OK
- running test-evlist.c...OK
- running test-evsel.c...OK
running dynamic:
- running test-cpumap.c...OK
- running test-threadmap.c...OK
- running test-evlist.c...OK
- running test-evsel.c...OK
make: Leaving directory '/home/acme/git/perf/tools/perf/lib'
#
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20191017105918.20873-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Both static and dynamic tests needs to link with libapi.a, because it's
using its functions. Also include path for libapi includes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20191017105918.20873-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move the mask setup to perf_evlist__mmap_ops(), because it's the same on
both perf and libperf path.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20191017105918.20873-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move allocation of the mmap array into perf_evlist__mmap_ops::get, to
centralize the mmap allocation.
Also move nr_mmap setup to perf_evlist__mmap_ops so it's centralized and
shared by both perf and libperf mmap code.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20191017105918.20873-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add the perf_evlist__for_each_mmap() function and export it in the
perf/evlist.h header, so that the user can iterate through 'struct
perf_mmap' objects.
Add a internal perf_mmap__link() function to do the actual linking.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20191017105918.20873-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
As there are several discussions for enabling perf breakpoint signal
testing on arm64 platform: arm64 needs to rely on single-step to execute
the breakpointed instruction and then reinstall the breakpoint exception
handler. But if we hook the breakpoint with a signal, the signal
handler will do the stepping rather than the breakpointed instruction,
this causes infinite loops as below:
Kernel space | Userspace
---------------------------------|--------------------------------
| __test_function() -> hit
| breakpoint
breakpoint_handler() |
`-> user_enable_single_step() |
do_signal() |
| sig_handler() -> Step one
| instruction and
| trap to kernel
single_step_handler() |
`-> reinstall_suspended_bps() |
| __test_function() -> hit
| breakpoint again and
| repeat up flow infinitely
As Will Deacon mentioned [1]: "that we require the overflow handler to
do the stepping on arm/arm64, which is relied upon by GDB/ptrace. The
hw_breakpoint code is a complete disaster so my preference would be to
rip out the perf part and just implement something directly in ptrace,
but it's a pretty horrible job". Though Will commented this on arm
architecture, but the comment also can apply on arm64 architecture.
For complete information, I searched online and found a few years back,
Wang Nan sent one patch 'arm64: Store breakpoint single step state into
pstate' [2]; the patch tried to resolve this issue by avoiding single
stepping in signal handler and defer to enable the signal stepping when
return to __test_function(). The fixing was not merged due to the
concern for missing to handle different usage cases.
Based on the info, the most feasible way is to skip Perf breakpoint
signal testing for arm64 and this could avoid the duplicate
investigation efforts when people see the failure. This patch skips
this case on arm64 platform, which is same with arm architecture.
[1] https://lkml.org/lkml/2018/11/15/205
[2] https://lkml.org/lkml/2015/12/23/477
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brajeswar Ghosh <brajeswar.linux@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Will Deacon <will@kernel.org>
Link: http://lore.kernel.org/lkml/20191018085531.6348-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The arm architecture supports breakpoint accounting but it doesn't
support breakpoint overflow signal handling. The current code uses the
same checking helper, thus it disables both testings (bp_account and
bp_signal) for arm platform.
For handling two testings separately, this patch adds a dedicated
checking helper is_supported() for breakpoint accounting testing, thus
it allows supporting breakpoint accounting testing on arm platform; the
old helper test__bp_signal_is_supported() is only used to checking for
breakpoint overflow signal testing.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brajeswar Ghosh <brajeswar.linux@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Will Deacon <will@kernel.org>
Link: http://lore.kernel.org/lkml/20191018085531.6348-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
A few headers are not needed and were introduced by copying from other
test file. This patch removes the needless headers for the breakpoint
accounting testing.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brajeswar Ghosh <brajeswar.linux@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Will Deacon <will@kernel.org>
Link: http://lore.kernel.org/lkml/20191018085531.6348-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There are some deprecated events listed by perf list. But we can't
remove them from perf list with ease because some old scripts may use
them.
Deprecated events are old names of renamed events. When an event gets
renamed the old name is kept around for some time and marked with
Deprecated. The newer Intel event lists in the tree already have these
headers.
So we need to keep them in the event list, but provide a new option to
show them. The new option is "--deprecated".
With this patch, the deprecated events are hidden by default but they
can be displayed when option "--deprecated" is enabled.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191015025357.8708-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
With just what we need for the STUL_STRARRAY, i.e. the 'struct strarray'
pointer to be used, just like with syscall_arg_fmt->scnprintf() for the
other direction (number -> string).
With this all the strarrays that are associated with syscalls can be
used with '-e syscalls:sys_enter_SYSCALLNAME --filter', and soon will be
possible as well to use with the strace-like shorter form, with just the
syscall names, i.e. something like:
-e lseek/whence==END/
For now we have to use the longer form:
# perf trace -e syscalls:sys_enter_lseek
0.000 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR)
0.031 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR)
0.046 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR)
5003.528 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR)
5003.575 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR)
5003.593 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR)
10002.017 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR)
10002.051 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR)
10002.068 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR)
^C# perf trace -e syscalls:sys_enter_lseek --filter="whence!=CUR"
0.000 sshd/24476 syscalls:sys_enter_lseek(fd: 3, offset: 9032, whence: SET)
0.060 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 9032, whence: SET)
0.187 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 118632, whence: SET)
0.203 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 118632, whence: SET)
0.349 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 61936, whence: SET)
^C#
And for those curious about what are those lseek(DSO, offset, SET), well, its the loader:
# perf trace -e syscalls:sys_enter_lseek/max-stack=16/ --filter="whence!=CUR"
0.000 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 9032, whence: SET)
__libc_lseek64 (/usr/lib64/ld-2.29.so)
_dl_map_object (/usr/lib64/ld-2.29.so)
0.067 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 9032, whence: SET)
__libc_lseek64 (/usr/lib64/ld-2.29.so)
_dl_map_object_from_fd (/usr/lib64/ld-2.29.so)
_dl_map_object (/usr/lib64/ld-2.29.so)
0.198 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 118632, whence: SET)
__libc_lseek64 (/usr/lib64/ld-2.29.so)
_dl_map_object (/usr/lib64/ld-2.29.so)
0.219 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 118632, whence: SET)
__libc_lseek64 (/usr/lib64/ld-2.29.so)
_dl_map_object_from_fd (/usr/lib64/ld-2.29.so)
_dl_map_object (/usr/lib64/ld-2.29.so)
^C#
:-)
With this we can use strings in strarrays in filters, which allows us to
reuse all these that are in place for syscalls:
$ find tools/perf/trace/beauty/ -name "*.c" | xargs grep -w DEFINE_STRARRAY
tools/perf/trace/beauty/fcntl.c: static DEFINE_STRARRAY(fcntl_setlease, "F_");
tools/perf/trace/beauty/mmap.c: static DEFINE_STRARRAY(mmap_flags, "MAP_");
tools/perf/trace/beauty/mmap.c: static DEFINE_STRARRAY(madvise_advices, "MADV_");
tools/perf/trace/beauty/sync_file_range.c: static DEFINE_STRARRAY(sync_file_range_flags, "SYNC_FILE_RANGE_");
tools/perf/trace/beauty/socket.c: static DEFINE_STRARRAY(socket_ipproto, "IPPROTO_");
tools/perf/trace/beauty/mount_flags.c: static DEFINE_STRARRAY(mount_flags, "MS_");
tools/perf/trace/beauty/pkey_alloc.c: static DEFINE_STRARRAY(pkey_alloc_access_rights, "PKEY_");
tools/perf/trace/beauty/sockaddr.c:DEFINE_STRARRAY(socket_families, "PF_");
tools/perf/trace/beauty/tracepoints/x86_irq_vectors.c:static DEFINE_STRARRAY(x86_irq_vectors, "_VECTOR");
tools/perf/trace/beauty/tracepoints/x86_msr.c:static DEFINE_STRARRAY(x86_MSRs, "MSR_");
tools/perf/trace/beauty/prctl.c: static DEFINE_STRARRAY(prctl_options, "PR_");
tools/perf/trace/beauty/prctl.c: static DEFINE_STRARRAY(prctl_set_mm_options, "PR_SET_MM_");
tools/perf/trace/beauty/fspick.c: static DEFINE_STRARRAY(fspick_flags, "FSPICK_");
tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(ioctl_tty_cmd, "");
tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(drm_ioctl_cmds, "");
tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(sndrv_pcm_ioctl_cmds, "");
tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(sndrv_ctl_ioctl_cmds, "");
tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(kvm_ioctl_cmds, "");
tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(vhost_virtio_ioctl_cmds, "");
tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(vhost_virtio_ioctl_read_cmds, "");
tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(perf_ioctl_cmds, "");
tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(usbdevfs_ioctl_cmds, "");
tools/perf/trace/beauty/fsmount.c: static DEFINE_STRARRAY(fsmount_attr_flags, "MOUNT_ATTR_");
tools/perf/trace/beauty/renameat.c: static DEFINE_STRARRAY(rename_flags, "RENAME_");
tools/perf/trace/beauty/kcmp.c: static DEFINE_STRARRAY(kcmp_types, "KCMP_");
tools/perf/trace/beauty/move_mount.c: static DEFINE_STRARRAY(move_mount_flags, "MOVE_MOUNT_");
$
Well, some, as the mmap flags are like:
$ tools/perf/trace/beauty/mmap_flags.sh
static const char *mmap_flags[] = {
[ilog2(0x40) + 1] = "32BIT",
[ilog2(0x01) + 1] = "SHARED",
[ilog2(0x02) + 1] = "PRIVATE",
[ilog2(0x10) + 1] = "FIXED",
[ilog2(0x20) + 1] = "ANONYMOUS",
[ilog2(0x008000) + 1] = "POPULATE",
[ilog2(0x010000) + 1] = "NONBLOCK",
[ilog2(0x020000) + 1] = "STACK",
[ilog2(0x040000) + 1] = "HUGETLB",
[ilog2(0x080000) + 1] = "SYNC",
[ilog2(0x100000) + 1] = "FIXED_NOREPLACE",
[ilog2(0x0100) + 1] = "GROWSDOWN",
[ilog2(0x0800) + 1] = "DENYWRITE",
[ilog2(0x1000) + 1] = "EXECUTABLE",
[ilog2(0x2000) + 1] = "LOCKED",
[ilog2(0x4000) + 1] = "NORESERVE",
};
$
So we'll need a strarray__strtoul_flags() that will break donw the flags
into tokens separated by '|' before doing the lookup and then go on
reconstructing the value from, say:
# perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE|FIXED|DENYWRITE"
into:
# perf trace -e syscalls:sys_enter_mmap --filter="flags==0x2|0x10|0x0800"
and finally into:
# perf trace -e syscalls:sys_enter_mmap --filter="flags==0x812"
That is what we see if we don't use the augmented view obtained from:
# perf trace -e mmap
<SNIP>
211792.885 procmail/15393 mmap(addr: 0x7fcd11645000, len: 8192, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 8, off: 0xa000) = 0x7fcd11645000
<SNIP>
But plain use tracefs:
procmail-15559 [000] .... 54557.178262: sys_mmap(addr: 7f5c9bf7a000, len: 9b000, prot: 1, flags: 812, fd: 3, off: a9000)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-c6mgkjt8ujnc263eld5tb7q3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We were doing this only at the sys_exit syscall tracepoint, as for
strace-like we count the pair of sys_enter and sys_exit as one event,
but when asking specifically for a the syscalls:sys_enter_NAME
tracepoint we need to count each of those as an event.
I.e. things like:
# perf trace --max-events=4 -e syscalls:sys_enter_lseek
0.000 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR)
0.034 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR)
0.051 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR)
2307.900 sshd/30800 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libsystemd.so.0.25.0>, offset: 9032, whence: SET)
#
Were going on forever, since we only had sys_enter events.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-0ob1dky1a9ijlfrfhxyl40wr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To go from strarrays strings to its indexes.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-wta0qvo207z27huib2c4ijxq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
trace_find_next_event() was buggy and pretty much a useless helper. As
there are no more users, just remove it.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20191017210636.224045576@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Instead of calling a useless (and broken) helper function to get the
next event of a tep event array, just get the array directly and iterate
over it.
Note, the broken part was from trace_find_next_event() which after this
will no longer be used, and can be removed.
Committer notes:
This fixes a segfault when generating python scripts from perf.data
files with multiple tracepoint events, i.e. the following use case is
fixed by this patch:
# perf record -e sched:* sleep 1
[ perf record: Woken up 31 times to write data ]
[ perf record: Captured and wrote 0.031 MB perf.data (9 samples) ]
# perf script -g python
Segmentation fault (core dumped)
#
Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lkml.kernel.org/r/20191017153733.630cd5eb@gandalf.local.home
Link: http://lore.kernel.org/lkml/20191017210636.061448713@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
From the syscall_fmts->arg entries for formatting strace-like syscalls.
This is when resolving the string "whence" on a filter expression for
the syscalls:sys_enter_lseek:
Breakpoint 3, perf_evsel__syscall_arg_fmt (evsel=0xc91ed0, arg=0x7fffffff7cd0 "whence") at builtin-trace.c:3626
3626 {
(gdb) n
3628 struct syscall_arg_fmt *fmt = __evsel__syscall_arg_fmt(evsel);
(gdb) n
3630 if (evsel->tp_format == NULL || fmt == NULL)
(gdb) n
3633 for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt)
(gdb) n
3634 if (strcmp(field->name, arg) == 0)
(gdb) p field->name
$3 = 0xc945e0 "__syscall_nr"
(gdb) n
3633 for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt)
(gdb) p *fmt
$4 = {scnprintf = 0x0, strtoul = 0x0, mask_val = 0x0, parm = 0x0, name = 0x0, nr_entries = 0, show_zero = false}
(gdb) n
3634 if (strcmp(field->name, arg) == 0)
(gdb) p field->name
$5 = 0xc94690 "fd"
(gdb) n
3633 for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt)
(gdb) n
3634 if (strcmp(field->name, arg) == 0)
(gdb) n
3633 for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt)
(gdb) n
3634 if (strcmp(field->name, arg) == 0)
(gdb) p *fmt
$9 = {scnprintf = 0x489be2 <syscall_arg__scnprintf_strarray>, strtoul = 0x0, mask_val = 0x0, parm = 0xa2da80 <strarray.whences>, name = 0x0,
nr_entries = 0, show_zero = false}
(gdb) p field->name
$10 = 0xc947b0 "whence"
(gdb) p fmt->parm
$11 = (void *) 0xa2da80 <strarray.whences>
(gdb) p *(struct strarray *)fmt->parm
$12 = {offset = 0, nr_entries = 5, prefix = 0x724d37 "SEEK_", entries = 0xa2da40 <whences>}
(gdb) p (struct strarray *)fmt->parm)->entries
Junk after end of expression.
(gdb) p ((struct strarray *)fmt->parm)->entries
$13 = (const char **) 0xa2da40 <whences>
(gdb) p ((struct strarray *)fmt->parm)->entries[0]
$14 = 0x724d21 "SET"
(gdb) p ((struct strarray *)fmt->parm)->entries[1]
$15 = 0x724d25 "CUR"
(gdb) p ((struct strarray *)fmt->parm)->entries[2]
$16 = 0x724d29 "END"
(gdb) p ((struct strarray *)fmt->parm)->entries[2]
$17 = 0x724d29 "END"
(gdb) p ((struct strarray *)fmt->parm)->entries[3]
$18 = 0x724d2d "DATA"
(gdb) p ((struct strarray *)fmt->parm)->entries[4]
$19 = 0x724d32 "HOLE"
(gdb)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-lc8h9jgvbnboe0g7ic8tra1y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
For syscalls we need to cache the 'syscall_id' and 'ret' field offsets
but as well have a pointer to the syscall_fmt_arg array for the fields,
so that we can expand strings in filter expressions, so introduce
a 'struct evsel_trace' to have in evsel->priv that allows for that.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-hx8ukasuws5sz6rsar73cocv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Next step will be to have a 'struct evsel_trace' to allow for handling
the syscalls tracepoints via the strace-like code while reusing parts of
that code with the other tracepoints, where we don't have things like
the 'syscall_nr' or 'ret' ((raw_)?syscalls:sys_{enter,exit}(_SYSCALL)?)
args that we want to cache offsets and have been using evsel->priv for
that, while for the other tracepoints we'll have just an array of
'struct syscall_arg_fmt' (i.e. ->scnprint() for number->string and
->strtoul() string->number conversions and other state those functions
need).
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-fre21jbyoqxmmquxcho7oa0x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We're using evsel->priv in syscalls:sys_{enter,exit}_SYSCALL and in
raw_syscalls:sys_{enter,exit} to cache the offset of the common fields,
the multiplexor id/syscall_id in the sys_enter case and syscall_id + ret
for sys_exit.
And for the rest of the tracepoints we use it to have a syscall_arg_fmt
array to have scnprintf/strtoul for tracepoint args.
So we better clearly mark them with accessors so that we can move to
having a 'struct evsel_trace' struct for all 'perf trace' specific
evsel->priv usage.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-dcoyxfslg7atz821tz9aupjh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
expression
It was there, but as pr_debug(), make it pr_err() so that we can see it
without -v:
# trace -e syscalls:*lseek --filter="whenc==SET" sleep 1
"whenc" not found in "syscalls:sys_enter_lseek", can't set filter "whenc==SET"
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ly4rgm1bto8uwc2itpaixjob@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
scnprintf/strtoul
Ended up only being useful when filtering multiple irq_vectors
tracepoints, as we end up having a tracepoint for each of the entries,
i.e.:
This will always come with the "RESCHEDULE_VECTOR" in the 'vector' arg:
# perf trace --max-events 8 -e irq_vectors:reschedule*
0.000 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE)
0.004 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE)
0.553 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE)
0.556 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE)
1.182 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE)
1.185 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE)
1.203 :29052/29052 irq_vectors:reschedule_entry(vector: RESCHEDULE)
1.206 :29052/29052 irq_vectors:reschedule_exit(vector: RESCHEDULE)
#
While filtering that value will produce nothing:
# perf trace --max-events 8 -e irq_vectors:reschedule* --filter="vector != RESCHEDULE"
^C#
Maybe it'll be useful for those other tracepoints:
# perf list irq_vectors:vector_*
List of pre-defined events (to be used in -e):
irq_vectors:vector_activate [Tracepoint event]
irq_vectors:vector_alloc [Tracepoint event]
irq_vectors:vector_alloc_managed [Tracepoint event]
irq_vectors:vector_clear [Tracepoint event]
irq_vectors:vector_config [Tracepoint event]
irq_vectors:vector_deactivate [Tracepoint event]
irq_vectors:vector_free_moved [Tracepoint event]
irq_vectors:vector_reserve [Tracepoint event]
irq_vectors:vector_reserve_managed [Tracepoint event]
irq_vectors:vector_setup [Tracepoint event]
irq_vectors:vector_teardown [Tracepoint event]
irq_vectors:vector_update [Tracepoint event]
#
But since we have it done, keep it.
This at least served to teach me that all those irq vectors have a entry
and an exit tracepoint that I can then use just like with
raw_syscalls:sys_{enter,exit}, i.e. pair them, use just a
trace__irq_vectors_entry() + trace__irq_vectors_exit() and use the
'vector' arg as I use the 'syscall id' one for syscalls.
Then the default for 'perf trace' will include irq_vectors in addition
to syscalls.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-wer4cwbbqub3o7sa8h1j3uzb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We need to wrap this autogenerated string array with the
strarray__scnprintf() formatter and the strarray__strotul() lookup
method, do it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-bx2cjcyv6aerhyy3gvu3uwcy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In some cases, like with x86 IRQ vectors, the common part in names is at
the end, so a suffix, add a scnprintf function for that.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-agxbj6es2ke3rehwt4gkdw23@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
I.e. after running:
$ make -C tools/perf O=/tmp/build/perf
We end up with:
$ cat /tmp/build/perf/trace/beauty/generated/x86_arch_irq_vectors_array.c
static const char *x86_irq_vectors[] = {
[0x02] = "NMI",
[0x12] = "MCE",
[0x20] = "IRQ_MOVE_CLEANUP",
[0x80] = "IA32_SYSCALL",
[0xec] = "LOCAL_TIMER",
[0xed] = "HYPERV_STIMER0",
[0xee] = "HYPERV_REENLIGHTENMENT",
[0xef] = "MANAGED_IRQ_SHUTDOWN",
[0xf0] = "POSTED_INTR_NESTED",
[0xf1] = "POSTED_INTR_WAKEUP",
[0xf2] = "POSTED_INTR",
[0xf3] = "HYPERVISOR_CALLBACK",
[0xf4] = "DEFERRED_ERROR",
[0xf6] = "IRQ_WORK",
[0xf7] = "X86_PLATFORM_IPI",
[0xf8] = "REBOOT",
[0xf9] = "THRESHOLD_APIC",
[0xfa] = "THERMAL_APIC",
[0xfb] = "CALL_FUNCTION_SINGLE",
[0xfc] = "CALL_FUNCTION",
[0xfd] = "RESCHEDULE",
[0xfe] = "ERROR_APIC",
[0xff] = "SPURIOUS_APIC",
};
$
Now its just a matter of using it, associating it to tracepoint arguments named
'vector', all of which can be correctly used with this table, for int args.
At some point we should move tools/perf/trace/beauty to tools/beauty/,
so that it can be used more generally and even made available externally
like libbpf, libperf, libtraceevent, etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-0p2df4kq1afrxbck4e4ct34r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We'll wire this up with the 'vector' arg in irq_vectors:*, etc:
Just run it straight away and check what it produces:
$ tools/perf/trace/beauty/tracepoints/x86_irq_vectors.sh
static const char *x86_irq_vectors[] = {
[0x02] = "NMI",
[0x12] = "MCE",
[0x20] = "IRQ_MOVE_CLEANUP",
[0x80] = "IA32_SYSCALL",
[0xec] = "LOCAL_TIMER",
[0xed] = "HYPERV_STIMER0",
[0xee] = "HYPERV_REENLIGHTENMENT",
[0xef] = "MANAGED_IRQ_SHUTDOWN",
[0xf0] = "POSTED_INTR_NESTED",
[0xf1] = "POSTED_INTR_WAKEUP",
[0xf2] = "POSTED_INTR",
[0xf3] = "HYPERVISOR_CALLBACK",
[0xf4] = "DEFERRED_ERROR",
[0xf6] = "IRQ_WORK",
[0xf7] = "X86_PLATFORM_IPI",
[0xf8] = "REBOOT",
[0xf9] = "THRESHOLD_APIC",
[0xfa] = "THERMAL_APIC",
[0xfb] = "CALL_FUNCTION_SINGLE",
[0xfc] = "CALL_FUNCTION",
[0xfd] = "RESCHEDULE",
[0xfe] = "ERROR_APIC",
[0xff] = "SPURIOUS_APIC",
};
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-cpl1pa7kkwn0llufi5qw4li8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We'll use it to generate a table and then convert the irq_vectors:*
tracepoint 'vector' arg in things like perf trace, script, etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-z7gi058lzhnrm32slevg3xod@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add some more missing events.
A trivial typo is also fixed.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1567612484-195727-5-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add some more missing events.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1567612484-195727-4-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add some more missing events.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1567612484-195727-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The "EventName" for the DDRC precharge command event is incorrect, so
fix it.
Fixes: 57cc732479ba ("perf jevents: Add support for Hisi hip08 DDRC PMU aliasing")
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1567612484-195727-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Things like:
# grep __data_loc /sys/kernel/debug/tracing/events/sched/sched_process_exec/format
field:__data_loc char[] filename; offset:8; size:4; signed:1;
#
That, at that offset (8) and with that size(8) have an integer that
contains the real length and offset for the contents of that array.
Now this works:
# perf trace --max-events 1 -e sched:*exec -a
0.000 sed/19441 sched:sched_process_exec(filename: "/usr/bin/sync", pid: 19441 (sync), old_pid: 19441 (sync))
#
As when using the libtraceevent based beautifier:
# perf trace --libtraceevent --max-events 1 -e sched:*exec -a
0.000 sync/19463 sched:sched_process_exec(filename=/usr/bin/sync pid=19463 old_pid=19463)
#
I.e. that 'filename' is implemented as a dynamic char array.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-950p0m842fe6n7sxsdwqj5i2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When doing a system wide 'perf trace record' we need, just like in 'perf
trace' live mode, to filter out perf trace's own pid, so set up a
tracepoint filter for the raw_syscalls tracepoints right after adding
them to the argv array that is set up to then call cmd_record().
Reported-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-uysx5w8f2y5ndoln5cq370tv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Will be used directly in 'perf trace' for setting up the command line
argv array to pass to cmd_record, as this was how 'perf trace record'
was implemented, following the model used in 'perf kvm record', 'perf
sched record', etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-w3cuwjs63lxf5zpryy3145uv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To be used with -S or -s, using just this new option implies -s,
examples:
# perf trace --errno-summary sleep 1
Summary of events:
sleep (10793), 80 events, 93.0%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
nanosleep 1 0 1000.427 1000.427 1000.427 1000.427 0.00%
mmap 8 0 0.026 0.002 0.003 0.005 9.18%
close 5 0 0.018 0.001 0.004 0.009 48.97%
mprotect 4 0 0.017 0.003 0.004 0.006 16.49%
openat 3 0 0.012 0.003 0.004 0.005 9.41%
munmap 1 0 0.010 0.010 0.010 0.010 0.00%
brk 4 0 0.005 0.001 0.001 0.002 22.77%
read 4 0 0.005 0.001 0.001 0.002 22.33%
access 1 1 0.004 0.004 0.004 0.004 0.00%
ENOENT: 1
fstat 3 0 0.004 0.001 0.001 0.002 17.18%
lseek 3 0 0.003 0.001 0.001 0.001 11.62%
arch_prctl 2 1 0.002 0.001 0.001 0.001 3.32%
EINVAL: 1
execve 1 0 0.000 0.000 0.000 0.000 0.00%
#
Works as well together with --failure and -S, i.e. collect the stats and
show just the syscalls that failed:
# perf trace --failure -S --errno-summary sleep 1
0.032 arch_prctl(option: 0x3001, arg2: 0x7fffdb11b580) = -1 EINVAL (Invalid argument)
0.045 access(filename: "/etc/ld.so.preload", mode: R) = -1 ENOENT (No such file or directory)
Summary of events:
sleep (10806), 80 events, 93.0%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
nanosleep 1 0 1000.094 1000.094 1000.094 1000.094 0.00%
mmap 8 0 0.026 0.002 0.003 0.005 9.06%
close 5 0 0.018 0.001 0.004 0.010 49.58%
mprotect 4 0 0.017 0.003 0.004 0.006 17.56%
openat 3 0 0.014 0.004 0.005 0.006 12.29%
munmap 1 0 0.010 0.010 0.010 0.010 0.00%
brk 4 0 0.005 0.001 0.001 0.002 22.75%
read 4 0 0.005 0.001 0.001 0.002 17.19%
access 1 1 0.005 0.005 0.005 0.005 0.00%
ENOENT: 1
fstat 3 0 0.004 0.001 0.001 0.002 21.66%
lseek 3 0 0.003 0.001 0.001 0.001 11.71%
arch_prctl 2 1 0.002 0.001 0.001 0.001 2.66%
EINVAL: 1
execve 1 0 0.000 0.000 0.000 0.000 0.00%
#
Suggested-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-l0mjwczkpouov7lss5zn8d9h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Just like strace has:
# trace -s sleep 1
Summary of events:
sleep (32370), 80 events, 93.0%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
nanosleep 1 0 1000.402 1000.402 1000.402 1000.402 0.00%
mmap 8 0 0.023 0.002 0.003 0.004 8.49%
close 5 0 0.015 0.001 0.003 0.009 51.39%
mprotect 4 0 0.014 0.002 0.003 0.005 16.95%
openat 3 0 0.013 0.003 0.004 0.005 14.29%
munmap 1 0 0.010 0.010 0.010 0.010 0.00%
read 4 0 0.005 0.001 0.001 0.002 16.83%
brk 4 0 0.004 0.001 0.001 0.002 20.82%
access 1 1 0.004 0.004 0.004 0.004 0.00%
fstat 3 0 0.003 0.001 0.001 0.001 12.17%
lseek 3 0 0.003 0.001 0.001 0.001 11.45%
arch_prctl 2 1 0.002 0.001 0.001 0.001 2.30%
execve 1 0 0.000 0.000 0.000 0.000 0.00%
#
# perf trace -S sleep 1
? ... [continued]: execve()) = 0
0.028 brk(brk: NULL) = 0x559f5bd96000
0.033 arch_prctl(option: 0x3001, arg2: 0x7ffda8b715a0) = -1 EINVAL (Invalid argument)
0.046 access(filename: "/etc/ld.so.preload", mode: R) = -1 ENOENT (No such file or directory)
0.055 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC) = 3
0.060 fstat(fd: 3, statbuf: 0x7ffda8b707a0) = 0
0.062 mmap(addr: NULL, len: 134346, prot: READ, flags: PRIVATE, fd: 3, off: 0) = 0x7f3aedfc4000
0.066 close(fd: 3) = 0
0.079 openat(dfd: CWD, filename: "/lib64/libc.so.6", flags: RDONLY|CLOEXEC) = 3
0.085 read(fd: 3, buf: 0x7ffda8b70948, count: 832) = 832
0.088 lseek(fd: 3, offset: 792, whence: SET) = 792
0.090 read(fd: 3, buf: 0x7ffda8b70810, count: 68) = 68
0.093 fstat(fd: 3, statbuf: 0x7ffda8b707f0) = 0
0.095 mmap(addr: NULL, len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7f3aedfc2000
0.101 lseek(fd: 3, offset: 792, whence: SET) = 792
0.103 read(fd: 3, buf: 0x7ffda8b70450, count: 68) = 68
0.105 lseek(fd: 3, offset: 864, whence: SET) = 864
0.107 read(fd: 3, buf: 0x7ffda8b70470, count: 32) = 32
0.110 mmap(addr: NULL, len: 1857472, prot: READ, flags: PRIVATE|DENYWRITE, fd: 3, off: 0) = 0x7f3aeddfc000
0.114 mprotect(start: 0x7f3aede1e000, len: 1679360, prot: NONE) = 0
0.121 mmap(addr: 0x7f3aede1e000, len: 1363968, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x22000) = 0x7f3aede1e000
0.127 mmap(addr: 0x7f3aedf6b000, len: 311296, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x16f000) = 0x7f3aedf6b000
0.131 mmap(addr: 0x7f3aedfb8000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bb000) = 0x7f3aedfb8000
0.138 mmap(addr: 0x7f3aedfbe000, len: 14272, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS) = 0x7f3aedfbe000
0.147 close(fd: 3) = 0
0.158 arch_prctl(option: SET_FS, arg2: 0x7f3aedfc3580) = 0
0.210 mprotect(start: 0x7f3aedfb8000, len: 16384, prot: READ) = 0
0.230 mprotect(start: 0x559f5b27d000, len: 4096, prot: READ) = 0
0.236 mprotect(start: 0x7f3aee00f000, len: 4096, prot: READ) = 0
0.240 munmap(addr: 0x7f3aedfc4000, len: 134346) = 0
0.300 brk(brk: NULL) = 0x559f5bd96000
0.302 brk(brk: 0x559f5bdb7000) = 0x559f5bdb7000
0.305 brk(brk: NULL) = 0x559f5bdb7000
0.310 openat(dfd: CWD, filename: "/usr/lib/locale/locale-archive", flags: RDONLY|CLOEXEC) = 3
0.315 fstat(fd: 3, statbuf: 0x7f3aedfbdac0) = 0
0.318 mmap(addr: NULL, len: 217750512, prot: READ, flags: PRIVATE, fd: 3, off: 0) = 0x7f3ae0e52000
0.325 close(fd: 3) = 0
0.358 nanosleep(rqtp: 0x7ffda8b714b0, rmtp: NULL) = 0
1000.622 close(fd: 1) = 0
1000.641 close(fd: 2) = 0
1000.664 exit_group(error_code: 0) = ?
Summary of events:
sleep (722), 80 events, 93.0%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
nanosleep 1 0 1000.194 1000.194 1000.194 1000.194 0.00%
mmap 8 0 0.025 0.002 0.003 0.005 10.17%
close 5 0 0.018 0.001 0.004 0.010 50.18%
mprotect 4 0 0.016 0.003 0.004 0.006 16.81%
openat 3 0 0.011 0.003 0.004 0.004 6.57%
munmap 1 0 0.010 0.010 0.010 0.010 0.00%
brk 4 0 0.005 0.001 0.001 0.002 20.72%
read 4 0 0.005 0.001 0.001 0.002 16.71%
access 1 1 0.005 0.005 0.005 0.005 0.00%
fstat 3 0 0.004 0.001 0.001 0.002 14.82%
lseek 3 0 0.003 0.001 0.001 0.001 11.66%
arch_prctl 2 1 0.002 0.001 0.001 0.001 3.59%
execve 1 0 0.000 0.000 0.000 0.000 0.00%
#
Works for system wide, e.g. for 1ms:
# perf trace -s -a sleep 0.001
Summary of events:
sleep (768), 94 events, 37.9%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
nanosleep 1 0 1.133 1.133 1.133 1.133 0.00%
execve 7 6 0.351 0.003 0.050 0.316 88.53%
mmap 8 0 0.024 0.002 0.003 0.004 8.86%
mprotect 4 0 0.017 0.003 0.004 0.006 16.02%
openat 3 0 0.013 0.004 0.004 0.005 8.34%
munmap 1 0 0.010 0.010 0.010 0.010 0.00%
brk 4 0 0.007 0.001 0.002 0.002 10.99%
close 5 0 0.005 0.001 0.001 0.002 11.69%
read 5 0 0.005 0.000 0.001 0.002 30.53%
access 1 1 0.004 0.004 0.004 0.004 0.00%
fstat 3 0 0.004 0.001 0.001 0.002 10.74%
lseek 3 0 0.003 0.001 0.001 0.001 10.20%
arch_prctl 2 1 0.002 0.001 0.001 0.001 3.34%
Web Content (21258), 46 events, 18.5%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
recvmsg 12 12 0.015 0.001 0.001 0.002 8.50%
futex 2 0 0.008 0.003 0.004 0.005 27.08%
poll 6 0 0.006 0.000 0.001 0.002 22.14%
read 2 0 0.006 0.002 0.003 0.003 26.08%
write 1 0 0.002 0.002 0.002 0.002 0.00%
Web Content (4365), 36 events, 14.5%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
recvmsg 10 10 0.015 0.001 0.002 0.003 11.83%
poll 5 0 0.006 0.000 0.001 0.002 28.44%
futex 2 0 0.005 0.001 0.003 0.004 48.29%
read 1 0 0.003 0.003 0.003 0.003 0.00%
Timer (21275), 14 events, 5.6%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
futex 6 1 0.240 0.000 0.040 0.149 64.58%
write 1 0 0.008 0.008 0.008 0.008 0.00%
Timer (4383), 14 events, 5.6%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
futex 6 2 0.186 0.000 0.031 0.181 96.45%
write 1 0 0.010 0.010 0.010 0.010 0.00%
Web Content (20354), 28 events, 11.3%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
recvmsg 8 8 0.010 0.001 0.001 0.002 15.24%
poll 4 0 0.004 0.000 0.001 0.002 35.68%
futex 1 0 0.003 0.003 0.003 0.003 0.00%
read 1 0 0.003 0.003 0.003 0.003 0.00%
Timer (20371), 10 events, 4.0%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
futex 4 1 0.077 0.000 0.019 0.075 95.46%
write 1 0 0.005 0.005 0.005 0.005 0.00%
[root@quaco ~]#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-k7kh2muo5oeg56yx446hnw9v@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'perf record' has supported --all-kernel / --all-user to configure all
used events to run in kernel space or run in user space. But 'perf stat'
doesn't support these options.
It would be useful to support these options in 'perf stat' too to keep
the same semantics available in both tools.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191011050545.3899-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The build of file libperf-jvmti.so succeeds but the resulting
object fails to load:
# ~/linux/tools/perf/perf record -k mono -- java \
-XX:+PreserveFramePointer \
-agentpath:/root/linux/tools/perf/libperf-jvmti.so \
hog 100000 123450
Error occurred during initialization of VM
Could not find agent library /root/linux/tools/perf/libperf-jvmti.so
in absolute path, with error:
/root/linux/tools/perf/libperf-jvmti.so: undefined symbol: _ctype
Add the missing _ctype symbol into the build script.
Fixes: 79743bc927f6 ("perf jvmti: Link against tools/lib/string.o to have weak strlcpy()")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20191008093841.59387-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Remove redirection of objdump's stderr to /dev/null to help diagnose
failures.
Fix the '--no-show-raw' flag to be '--no-show-raw-insn' which binutils
is permissive and allows, but fails with LLVM objdump.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Avoiding a pipe allows objdump command failures to surface. Move to the
caller of symbol__parse_objdump_line the call to strim that removes
leading and trailing tabs. Add a new expand_tabs function that if a tab
is present allocate a new line in which tabs are expanded. In
symbol__parse_objdump_line the line had no leading spaces, so simplify
the line_ip processing.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Simplify the objdump command by not piping the output of objdump through
grep. Instead, drop lines that match the grep pattern during the reading
loop.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Reduce duplicated logic by using the subcmd library. Ensure when errors
occur they are reported to the caller. Before this patch, if no lines
are read the error status is 0.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-3-irogers@google.com
Link: http://lore.kernel.org/lkml/20191015003418.62563-1-irogers@google.com
[ merged follow up fix for NULL termination as in the 2nd link above ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Objdump output is parsed using getline which allocates memory for the
read. Getline will realloc if the memory is too small, but currently the
line is always freed after the call.
Simplify parse_objdump_line by performing the reading in symbol__disassemble.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We received a user report that call-graph DWARF mode was enabled in
'perf record' but 'perf report' didn't unwind the callstack correctly.
The reason was, libunwind was not compiled in.
We can use 'perf -vv' to check the compiled libraries but it would be
valuable to report a warning to user directly (especially valuable for
a perf newbie).
The warning is:
Warning:
Please install libunwind development packages during the perf build.
Both TUI and stdio are supported.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191011022122.26369-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When executing the task exit testing case, perf gets stuck in an endless
loop this case and doesn't return back on Arm64 Juno board.
After digging into this issue, since Juno board has Arm's big.LITTLE
CPUs, thus the PMUs are not compatible between the big CPUs and little
CPUs. This leads to a PMU event that cannot be enabled properly when
the traced task is migrated from one variant's CPU to another variant.
Finally, the test case runs into infinite loop for cannot read out any
event data after return from polling.
Eventually, we need to work out formal solution to allow PMU events can
be freely migrated from one CPU variant to another, but this is a
difficult task and a different topic. This patch tries to fix the Perf
test case to avoid infinite loop, when the testing detects 1000 times
retrying for reading empty events, it will directly bail out and return
failure. This allows the Perf tool can continue its other test cases.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20191011091942.29841-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When fail to mmap events in task exit case, it misses to set 'err' to
-1; thus the testing will not report failure for it.
This patch sets 'err' to -1 when fails to mmap events, thus Perf tool
can report correct result.
Fixes: d723a55096b8 ("perf test: Add test case for checking number of EXIT events")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20191011091942.29841-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In the earlier fix for the memory overrun of id arrays I managed to typo
the wrong event in the fix.
Of course we need to close the current event in the loop, not the
original failing event.
The same test case as in the original patch still passes.
Fixes: 7834fa948beb ("perf evlist: Fix access of freed id arrays")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191011182140.8353-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
My earlier patch to just enable --reltime with --time was a little too
optimistic. The --time parsing would accept absolute time, which is
very confusing to the user.
Support relative time in --time parsing too. This only works with recent
perf record that records the first sample time. Otherwise we error out.
Fixes: 3714437d3fcc ("perf script: Allow --time with --reltime")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191011182140.8353-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
By using "make TCMALLOC=1" you can enable perf to be build for usage
with libtcmalloc.so (gperftools).
Get heap profile (tools/perf directory):
$ <install gperftools>
$ make TCMALLOC=1 DEBUG=1
$ HEAPPROFILE=/tmp/heapprof ./perf ...
$ pprof ./perf /tmp/heapprof.000*
(pprof) top
Total: 2335.5 MB
1735.1 74.3% 74.3% 1735.1 74.3% memdup
402.0 17.2% 91.5% 402.0 17.2% zalloc
140.2 6.0% 97.5% 145.8 6.2% map__new
33.6 1.4% 98.9% 33.6 1.4% symbol__new
12.4 0.5% 99.5% 12.4 0.5% alloc_event
6.2 0.3% 99.7% 6.2 0.3% nsinfo__new
5.5 0.2% 100.0% 5.5 0.2% nsinfo__copy
0.3 0.0% 100.0% 0.3 0.0% dso__new
0.1 0.0% 100.0% 0.1 0.0% do_read_string
0.0 0.0% 100.0% 0.0 0.0% __GI__IO_file_doallocate
See callstack:
$ pprof --pdf ./perf /tmp/heapprof.00* > callstack.pdf
$ pprof --web ./perf /tmp/heapprof.00*
Committer testing:
Install gperftools, on fedora:
# dnf install gperftools-devel
Then build:
$ make TCMALLOC=1 DEBUG=1 -C tools/perf O=/tmp/build/perf install-bin
Verify that it linked against the right library:
$ ldd ~/bin/perf | grep tcma
libtcmalloc.so.4 => /lib64/libtcmalloc.so.4 (0x00007fb2953a7000)
$
Run 'perf trace' system wide for 1 minute:
# HEAPPROFILE=/tmp/heapprof perf trace -a sleep 1m
<SNIP>
59985.524 ( 0.006 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafb0) = -1 EAGAIN (Resource temporarily unavailable)
59985.536 ( 0.005 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafc0) = -1 EAGAIN (Resource temporarily unavailable)
59981.956 (10.143 ms): SCTP timer/21716 ... [continued]: select()) = 0 (Timeout)
59985.549 ( ): Web Content/20354 poll(ufds: 0x7f1df38af180, nfds: 3, timeout_msecs: 4294967295) ...
0.926 (59999.481 ms): sleep/29764 ... [continued]: nanosleep()) = 0
59992.133 ( ): SCTP timer/21716 select(tvp: 0x7ff5bf7fee80) ...
60000.477 ( 0.009 ms): sleep/29764 close(fd: 1) = 0
60000.493 ( 0.005 ms): sleep/29764 close(fd: 2) = 0
60000.514 ( ): sleep/29764 exit_group() = ?
Dumping heap profile to /tmp/heapprof.0001.heap (Exiting, 3 MB in use)
[root@quaco ~]#
Install pprof:
# dnf install pprof
And run it:
# pprof ~/bin/perf /tmp/heapprof.0001.heap
Using local file /root/bin/perf.
Using local file /tmp/heapprof.0001.heap.
Welcome to pprof! For help, type 'help'.
(pprof) top
Total: 4.0 MB
1.7 42.0% 42.0% 2.2 54.1% map__new
0.9 23.3% 65.3% 0.9 23.3% zalloc
0.5 11.4% 76.7% 0.5 11.4% dso__new
0.2 5.6% 82.3% 0.3 8.5% trace__sys_enter
0.2 4.9% 87.2% 0.2 4.9% __GI___strdup
0.2 3.8% 91.0% 0.2 3.8% new_term
0.1 2.2% 93.2% 0.4 10.1% __perf_pmu__new_alias
0.0 1.0% 94.3% 0.0 1.2% event_read_fields
0.0 0.8% 95.1% 0.0 0.8% nsinfo__new
0.0 0.7% 95.8% 0.1 3.2% trace__read_syscall_info
(pprof)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191013151427.11941-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf trace:
Arnaldo Carvalho de Melo:
- Reuse the strace-like syscall_arg_fmt->scnprintf() beautification routines
(convert integer arguments into strings, like open flags, etc) in tracepoint
arguments.
For now the type based scnprintf routines (pid_t, umode_t, etc) and the
ones based in well known arg name based ("fd", etc) gets associated with
tracepoint args of that type.
A tracepoint only arg, "msr", for the msr:{write,read}_msr gets added as
an initial step.
- Introduce syscall_arg_fmt->strtoul() methods to be the reverse operation
of ->scnprintf(), i.e. to go from a string to an integer.
- Implement --filter, just like in 'perf record', that affects the tracepoint
events specied thus far in the command line, use the ->strtoul() methods
to allow strings in tables associated with beautifiers to the integers
the in-kernel tracepoint (eBPF later) filters expect, e.g.:
# perf trace --max-events 1 -e sched:*ipi --filter="cpu==1 || cpu==2"
0.000 as/24630 sched:sched_wake_idle_without_ipi(cpu: 1)
#
# perf trace --max-events 1 --max-stack=32 -e msr:* --filter="msr==IA32_TSC_DEADLINE"
207.000 cc1/19963 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 5442316760822)
do_trace_write_msr ([kernel.kallsyms])
do_trace_write_msr ([kernel.kallsyms])
lapic_next_deadline ([kernel.kallsyms])
clockevents_program_event ([kernel.kallsyms])
hrtimer_interrupt ([kernel.kallsyms])
smp_apic_timer_interrupt ([kernel.kallsyms])
apic_timer_interrupt ([kernel.kallsyms])
[0x6ff66c] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x7047c3] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x707708] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
execute_one_pass (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x4f3d37] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x4f3d49] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
execute_pass_list (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
cgraph_node::expand (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x2625b4] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
symbol_table::finalize_compilation_unit (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x5ae8b9] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
toplev::main (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
main (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x26b6a] (/usr/lib/x86_64-linux-gnu/libc-2.29.so)
#
# perf trace --max-events 8 -e msr:* --filter="msr==IA32_SPEC_CTRL"
0.000 :13281/13281 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
0.063 migration/3/25 msr:write_msr(msr: IA32_SPEC_CTRL)
0.217 kworker/u16:1-/4826 msr:write_msr(msr: IA32_SPEC_CTRL)
0.687 rcu_sched/11 msr:write_msr(msr: IA32_SPEC_CTRL)
0.696 :13280/13280 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
0.305 :13281/13281 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
0.355 :13274/13274 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
2.743 kworker/u16:0-/6711 msr:write_msr(msr: IA32_SPEC_CTRL)
#
# perf trace --max-events 8 --cpu 1 -e msr:* --filter="msr!=IA32_SPEC_CTRL && msr!=IA32_TSC_DEADLINE && msr != FS_BASE"
0.000 mtr-packet/30819 msr:write_msr(msr: 0x830, val: 68719479037)
0.096 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
238.925 mtr-packet/30819 msr:write_msr(msr: 0x830, val: 8589936893)
511.010 :0/0 msr:write_msr(msr: 0x830, val: 68719479037)
1005.052 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
1235.131 CPU 0/KVM/3750 msr:write_msr(msr: 0x830, val: 4294969595)
1235.195 CPU 0/KVM/3750 msr:read_msr(msr: IA32_SYSENTER_ESP, val: -2199023037952)
1235.201 CPU 0/KVM/3750 msr:read_msr(msr: IA32_APICBASE, val: 4276096000)
#
- Default to not using libtraceevent and its plugins for beautifying
tracepoint arguments, since now we're reusing the strace-like beatufiers.
Use --libtraceevent_print (using just --libtrace is unambiguous and can
be used as a short hand) to go back to those beautifiers.
This will help in the transition, as can be seen in some of the sched tracepoints
that still need some work in the libbeauty based mode:
# trace --no-inherit -e msr:*,*sleep,sched:* sleep 1
0.000 ( ): sched:sched_waking(comm: "trace", pid: 3319 (trace), prio: 120, success: 1)
0.006 ( ): sched:sched_wakeup(comm: "trace", pid: 3319 (trace), prio: 120, success: 1)
0.348 ( ): sched:sched_process_exec(filename: 140212596720100, pid: 3319 (sleep), old_pid: 3319 (sleep))
0.490 ( ): msr:write_msr(msr: FS_BASE, val: 139631189321088)
0.670 ( ): nanosleep(rqtp: 0x7ffc52c23bc0) ...
0.674 ( ): sched:sched_stat_runtime(comm: "sleep", pid: 3319 (sleep), runtime: 659259, vruntime: 78942418342)
0.675 ( ): sched:sched_switch(prev_comm: "sleep", prev_pid: 3319 (sleep), prev_prio: 120, prev_state: 1, next_comm: "swapper/0", next_prio: 120)
1001.059 ( ): sched:sched_waking(comm: "sleep", pid: 3319 (sleep), prio: 120, success: 1)
1001.098 ( ): sched:sched_wakeup(comm: "sleep", pid: 3319 (sleep), prio: 120, success: 1)
0.670 (1000.504 ms): ... [continued]: nanosleep()) = 0
1001.456 ( ): sched:sched_process_exit(comm: "sleep", pid: 3319 (sleep), prio: 120)
# trace --libtrace --no-inherit -e msr:*,*sleep,sched:* sleep 1
# trace --libtrace --no-inherit -e msr:*,*sleep,sched:* sleep 1
0.000 ( ): sched:sched_waking(comm=trace pid=3323 prio=120 target_cpu=000)
0.007 ( ): sched:sched_wakeup(comm=trace pid=3323 prio=120 target_cpu=000)
0.382 ( ): sched:sched_process_exec(filename=/usr/bin/sleep pid=3323 old_pid=3323)
0.525 ( ): msr:write_msr(c0000100, value 7f5d508a0580)
0.713 ( ): nanosleep(rqtp: 0x7fff487fb4a0) ...
0.717 ( ): sched:sched_stat_runtime(comm=sleep pid=3323 runtime=617722 [ns] vruntime=78957731636 [ns])
0.719 ( ): sched:sched_switch(prev_comm=sleep prev_pid=3323 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120)
1001.117 ( ): sched:sched_waking(comm=sleep pid=3323 prio=120 target_cpu=000)
1001.157 ( ): sched:sched_wakeup(comm=sleep pid=3323 prio=120 target_cpu=000)
0.713 (1000.522 ms): ... [continued]: nanosleep()) = 0
1001.538 ( ): sched:sched_process_exit(comm=sleep pid=3323 prio=120)
#
- Make -v (verbose) mode be honoured for .perfconfig based trace.add_events,
to help in diagnosing problems with building eBPF events (-e source.c).
- When using eBPF syscall payload augmentation do not show strace-like
syscalls when all the user specified was some tracepoint event, bringing
the behaviour in line with that of when not using eBPF augmentation.
Intel PT:
exported-sql-viewer GUI:
Adrian Hunter:
- Add LookupModel, HBoxLayout, VBoxLayout, global time range calculations
so as to add a time chart by CPU.
perf script:
Andi Kleen:
- Allow --time (to specify a time span of interest) with --reltime
perf diff:
Jin Yao:
- Report noise for cycles diff, i.e. a histogram + stddev.
(timestamps relative to start).
perf annotate:
Arnaldo Carvalho de Melo:
- Initialize env->cpuid when running in live mode (perf top), as it
is used in some of the per arch annotation init routines.
samples bpf:
Björn Töpel:
- Fixup fallout of using tools/perf/perf-sys. from outside tools/perf.
Core:
Ian Rogers:
- Avoid 'sample_reg_masks' being const + weak, as this breaks with some
compilers that constant-propagate from the weak symbol.
libperf:
- First part of moving the perf_mmap class from tools/perf to libperf.
- Propagate CFLAGS to libperf from the tools/perf Makefile.
Vendor events:
John Garry:
- Add entry in MAINTAINERS with reviewers for the for perf tool arm64
pmu-events files.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
|