Graph Graph
Enhance the call graph to display IPC information if it is available.
Committer testing:
[acme@quaco adrian.hunter]$ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db
Reports -> Context Sensitive Callgraph, then expand a few trees, then
select with the mouse and press control+C:
Call Path Object Count Time(ns) Time(%) Insn Cnt Insn Cnt(%) Cyc Cnt Cyc Cnt(%) IPC Branch Cnt Branch Cnt(%)
▼ simple-retpolin
▼ 23003:23003
▼ _start ld-2.28.so 1 218295 100.0 127746 100.0 207320 100.0 0.62 13046 100.0
▶ unknown unknown 1 3202 1.5 0 0.0 0 0.0 0 1 0.0
▶ _dl_start ld-2.28.so 1 188471 86.3 123394 96.6 180007 86.8 0.69 12529 96.0
▶ _dl_init ld-2.28.so 1 13406 6.1 3207 2.5 14868 7.2 0.22 327 2.5
▼ _start simple-retpoline 1 12899 5.9 1142 0.9 11561 5.6 0.10 184 1.4
▶ unknown unknown 1 846 6.6 0 0.0 0 0.0 0 1 0.5
▼ __libc_start_main libc-2.28.so 1 11621 90.1 1129 98.9 10350 89.5 0.11 181 98.4
▶ __cxa_atexit libc-2.28.so 1 2302 19.8 101 8.9 1817 17.6 0.06 13 7.2
▶ __libc_csu_init simple-retpoline 1 121 1.0 43 3.8 340 3.3 0.13 8 4.4
▼ _setjmp libc-2.28.so 1 74 0.6 46 4.1 206 2.0 0.22 4 2.2
▼ __sigsetjmp libc-2.28.so 1 74 100.0 46 100.0 206 100.0 0.22 3 75.0
▶ __sigjmp_save libc-2.28.so 1 0 0.0 0 0.0 0 0.0 0 1 33.3
▼ main simple-retpoline 1 44 0.4 23 2.0 126 1.2 0.18 12 6.6
▼ foo simple-retpoline 2 44 100.0 23 100.0 126 100.0 0.18 10 83.3
bar simple-retpoline 2 22 50.0 6 26.1 61 48.4 0.10 2 20.0
▶ exit libc-2.28.so 1 9029 77.7 878 77.8 7765 75.0 0.11 139 76.8
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-21-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a parameter to call graph and call tree, to determine whether IPC
information is available.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-20-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Branch reports
Enhance the "All branches" and "Selected branches" reports to display IPC
information if it is available.
Committer testing:
So, testing this I noticed that every line starts with a right-pointing
arrow, which should mean there is some tree to expand there, i.e. look
at all those ▶ symbols:
Reports -> All Branches:
Time CPU Command PID TID Branch Type In Tx Insn Cnt Cyc Cnt IPC Branch
▶ 187836112195670 7 simple-retpolin 23003 23003 trace begin No 0 0 0 0 unknown (unknown) -> 7f6f33d4f110 _start (ld-2.28.so)
▶ 187836112195987 7 simple-retpolin 23003 23003 trace end No 0 883 0 7f6f33d4f110 _start (ld-2.28.so) -> 0 unknown (unknown)
▶ 187836112199189 7 simple-retpolin 23003 23003 trace begin No 0 0 0 0 unknown (unknown) -> 7f6f33d4f110 _start (ld-2.28.so)
▶ 187836112199189 7 simple-retpolin 23003 23003 call No 0 0 0 7f6f33d4f113 _start+0x3 (ld-2.28.so) -> 7f6f33d4ff50 _dl_start (ld-2.28.so)
▶ 187836112199544 7 simple-retpolin 23003 23003 trace end No 17 996 0.02 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so) -> 0 unknown (unknown)
▶ 187836112200939 7 simple-retpolin 23003 23003 trace begin No 0 0 0 0 unknown (unknown) -> 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so)
▶ 187836112201229 7 simple-retpolin 23003 23003 trace end No 1 816 0.00 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so) -> 0 unknown (unknown)
▶ 187836112203500 7 simple-retpolin 23003 23003 trace begin No 0 0 0 0 unknown (unknown) -> 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so)
But if you click on it, the ▶ disappears and a further click doesn't make
it reappear. This looks like a minor bug; reported to Adrian.
Reports -> Selected Branches, then ask for branches in the ld-2.28.so
DSO:
Time CPU Command PID TID Branch Type In Tx Insn Cnt Cyc Cnt IPC Branch
▶ 187836112195987 7 simple-retpolin 23003 23003 trace end No 0 883 0 7f6f33d4f110 _start (ld-2.28.so) -> 0 unknown (unknown)
▶ 187836112199189 7 simple-retpolin 23003 23003 trace begin No 0 0 0 0 unknown (unknown) -> 7f6f33d4f110 _start (ld-2.28.so)
▶ 187836112199189 7 simple-retpolin 23003 23003 call No 0 0 0 7f6f33d4f113 _start+0x3 (ld-2.28.so) -> 7f6f33d4ff50 _dl_start (ld-2.28.so)
▶ 187836112199544 7 simple-retpolin 23003 23003 trace end No 17 996 0.02 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so) -> 0 unknown (unknown)
▶ 187836112200939 7 simple-retpolin 23003 23003 trace begin No 0 0 0 0 unknown (unknown) -> 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so)
▶ 187836112201229 7 simple-retpolin 23003 23003 trace end No 1 816 0.00 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so) -> 0 unknown (unknown)
▶ 187836112203500 7 simple-retpolin 23003 23003 trace begin No 0 0 0 0 unknown (unknown) -> 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so)
▶ 187836112203528 7 simple-retpolin 23003 23003 unconditional jump No 0 0 0 7f6f33d4ffe7 _dl_start+0x97 (ld-2.28.so) -> 7f6f33d5000b _dl_start+0xbb (ld-2.28.so)
▶ 187836112203528 7 simple-retpolin 23003 23003 conditional jump No 0 0 0 7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so)
▶ 187836112203528 7 simple-retpolin 23003 23003 conditional jump No 0 0 0 7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so)
▶ 187836112203539 7 simple-retpolin 23003 23003 conditional jump No 0 0 0 7f6f33d50025 _dl_start+0xd5 (ld-2.28.so) -> 7f6f33d50210 _dl_start+0x2c0 (ld-2.28.so)
▶ 187836112203539 7 simple-retpolin 23003 23003 conditional jump No 0 0 0 7f6f33d5021a _dl_start+0x2ca (ld-2.28.so) -> 7f6f33d50360 _dl_start+0x410 (ld-2.28.so)
▶ 187836112203539 7 simple-retpolin 23003 23003 unconditional jump No 0 0 0 7f6f33d50377 _dl_start+0x427 (ld-2.28.so) -> 7f6f33d4ffff _dl_start+0xaf (ld-2.28.so)
▶ 187836112203539 7 simple-retpolin 23003 23003 conditional jump No 0 0 0 7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so)
▶ 187836112203562 7 simple-retpolin 23003 23003 conditional jump No 0 0 0 7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so)
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-19-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Export cycle and instruction counts on samples and calls tables.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-18-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Export cycle and instruction counts on samples and calls tables.
Committer testing:
First run some workload collecting intel_pt with the 'cyc' term just for
userspace:
[root@quaco adrian.hunter]# perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.035 MB simple-retpoline.perf.data ]
[root@quaco adrian.hunter]#
Then use the export-to-sqlite.py script to check that the changes in this
cset don't break it and that the changes in the db schema are the ones
expected:
[root@quaco adrian.hunter]# perf script -i simple-retpoline.perf.data --itrace=be -s ~acme/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls
2019-05-31 11:50:46.942710 Creating database ...
2019-05-31 11:50:46.949663 Writing records...
2019-05-31 11:50:47.224033 Adding indexes
2019-05-31 11:50:47.231599 Done
[root@quaco adrian.hunter]#
Now let's use the db:
[root@quaco adrian.hunter]# sqlite3 simple-retpoline.db
SQLite version 3.26.0 2018-12-01 12:34:55
Enter ".help" for usage hints.
sqlite> .schema samples
CREATE TABLE samples (id integer NOT NULL PRIMARY KEY,evsel_id bigint,machine_id bigint,thread_id bigint,comm_id bigint,dso_id bigint,symbol_id bigint,sym_offset bigint,ip bigint,time bigint,cpu integer,to_dso_id bigint,to_symbol_id bigint,to_sym_offset bigint,to_ip bigint,branch_type integer,in_tx boolean,call_path_id bigint,insn_count bigint,cyc_count bigint);
sqlite>
Cool, the 'insn_count' and 'cyc_count' columns are there, now let's see if
we can use them in a query:
sqlite> select insn_count,cyc_count from samples where cyc_count > 1500 and insn_count < 10;
6|1507
sqlite> select insn_count,cyc_count from samples where cyc_count > 1500;
118|2210
140|1516
3783|1861
132|1521
6|1507
sqlite>
Seems to work :-)
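The same check can be scripted. What follows is a minimal sketch, assuming Python's standard sqlite3 module. It builds an in-memory table that models only the two new columns (the real samples table has many more) and fills it with the five rows printed above, so it illustrates the query rather than reading an actual export. The helper name ipc_rows is hypothetical.

```python
import sqlite3

# In-memory stand-in for the exported database. Only the two new columns
# are modelled here; this is an assumption made to keep the sketch short.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples (id integer NOT NULL PRIMARY KEY, "
    "insn_count bigint, cyc_count bigint)"
)
# The five (insn_count, cyc_count) pairs shown in the query output above.
rows = [(118, 2210), (140, 1516), (3783, 1861), (132, 1521), (6, 1507)]
conn.executemany(
    "INSERT INTO samples (insn_count, cyc_count) VALUES (?, ?)", rows
)


def ipc_rows(conn, min_cycles):
    """Return (insn_count, cyc_count, ipc) for samples above min_cycles."""
    query = (
        "SELECT insn_count, cyc_count, "
        "CAST(insn_count AS REAL) / cyc_count "
        "FROM samples WHERE cyc_count > ? ORDER BY id"
    )
    return conn.execute(query, (min_cycles,)).fetchall()


for insn, cyc, ipc in ipc_rows(conn, 1500):
    print(f"{insn:6d} {cyc:6d} {ipc:.2f}")
```

Pointing sqlite3.connect() at a real exported file instead of ":memory:" should allow the same SELECT to run unchanged, provided the export contains the insn_count and cyc_count columns.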
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-17-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Export cycle and instruction counts on samples and call-returns.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add brief documentation to explain how the database export maintains
backward and forward compatibility.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-15-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Cycle and instruction counts are added to the stack. The IPC of a
function, including all the functions it calls, is also recorded.
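One way to picture this is a stack whose entries remember the running totals at call time, so the difference at return time covers the function and everything it called. The sketch below is an illustration under that assumption, not the perf thread-stack code; the class and method names are hypothetical.

```python
class ThreadStackSketch:
    """Toy call stack that records inclusive instruction and cycle counts."""

    def __init__(self):
        self.insn_count = 0  # running instruction total for the thread
        self.cyc_count = 0   # running cycle total for the thread
        self.stack = []      # (name, insn total at entry, cyc total at entry)

    def account(self, insn, cyc):
        """Add newly decoded instructions and cycles to the running totals."""
        self.insn_count += insn
        self.cyc_count += cyc

    def call(self, name):
        """Push a function, remembering the totals at the time of the call."""
        self.stack.append((name, self.insn_count, self.cyc_count))

    def ret(self):
        """Pop a function and return (name, insn, cyc, ipc), callees included."""
        name, insn_at_entry, cyc_at_entry = self.stack.pop()
        insn = self.insn_count - insn_at_entry
        cyc = self.cyc_count - cyc_at_entry
        ipc = insn / cyc if cyc else None  # no IPC without a cycle count
        return name, insn, cyc, ipc


ts = ThreadStackSketch()
ts.call("main")
ts.account(5, 20)
ts.call("foo")
ts.account(6, 61)
print(ts.ret())  # counts for foo alone
ts.account(1, 4)
print(ts.ret())  # counts for main, including foo
```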
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add brief documentation about instructions-per-cycle (IPC) information
derived from Intel PT.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-13-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When CYC packets are not available, it is still possible to count cycles
using TSC/TMA/MTC timestamps.
As the timestamp increments in TSC ticks, convert to CPU cycles using
the current core-to-bus ratio.
Do not accumulate cycles when control flow packet generation is not
enabled, nor when time has been "lost", typically due to mwait, which is
indicated by a TSC/TMA packet that is not part of PSB+.
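The conversion can be sketched as follows. This assumes that the TSC ticks at the bus clock multiplied by the maximum non-turbo ratio, and that the core runs at the bus clock multiplied by the current core-to-bus ratio. The function names and the tuple layout are hypothetical and do not come from the decoder.

```python
def tsc_to_cycles(tsc_delta, core_to_bus_ratio, max_non_turbo_ratio):
    """Estimate the CPU cycles elapsed over tsc_delta TSC ticks.

    Assumption: TSC frequency = bus clock * max_non_turbo_ratio and
    core frequency = bus clock * core_to_bus_ratio.
    """
    if max_non_turbo_ratio <= 0:
        raise ValueError("max_non_turbo_ratio must be positive")
    return tsc_delta * core_to_bus_ratio / max_non_turbo_ratio


def accumulate_cycles(intervals, max_non_turbo_ratio):
    """Sum estimated cycles over timestamp intervals.

    intervals: iterable of (tsc_delta, core_to_bus_ratio, enabled, lost)
    tuples. Intervals where control flow packet generation was disabled,
    or where time was lost (e.g. mwait), contribute nothing.
    """
    total = 0.0
    for tsc_delta, ratio, enabled, lost in intervals:
        if not enabled or lost:
            continue
        total += tsc_to_cycles(tsc_delta, ratio, max_non_turbo_ratio)
    return total


intervals = [
    (1000, 30, True, False),   # counted: core running faster than the TSC
    (500, 30, False, False),   # skipped: tracing disabled
    (400, 10, True, True),     # skipped: time lost
    (200, 20, True, False),    # counted: core running at the TSC rate
]
print(accumulate_cycles(intervals, max_non_turbo_ratio=20))
```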
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To make it easier to add new code for different TIP cases, separate each
case.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In preparation for using MTC packets to count cycles, record whether
decoding is between PSB and PSBEND packets.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add field 'ipc' to display instructions-per-cycle.
Example:
perf record -e intel_pt/cyc/u ls
perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid
ls 2670177.697113434: 7f0dfdbcd090 _start+0x0 mov %rsp, %rdi IPC: 0.00 (1/877)
ls 2670177.697113434: 7f0dfdbcd093 _start+0x3 callq 0x7f0dfdbce030
ls 2670177.697113434: 7f0dfdbce030 _dl_start+0x0 pushq %rbp
ls 2670177.697113434: 7f0dfdbce031 _dl_start+0x1 mov %rsp, %rbp
ls 2670177.697113434: 7f0dfdbce034 _dl_start+0x4 pushq %r15
ls 2670177.697113434: 7f0dfdbce036 _dl_start+0x6 pushq %r14
ls 2670177.697113434: 7f0dfdbce038 _dl_start+0x8 pushq %r13
ls 2670177.697113434: 7f0dfdbce03a _dl_start+0xa pushq %r12
ls 2670177.697113434: 7f0dfdbce03c _dl_start+0xc mov %rdi, %r12
ls 2670177.697113434: 7f0dfdbce03f _dl_start+0xf pushq %rbx
ls 2670177.697113434: 7f0dfdbce040 _dl_start+0x10 sub $0x38, %rsp
ls 2670177.697113434: 7f0dfdbce044 _dl_start+0x14 rdtsc
ls 2670177.697113434: 7f0dfdbce046 _dl_start+0x16 mov %eax, %eax
ls 2670177.697113434: 7f0dfdbce048 _dl_start+0x18 shl $0x20, %rdx
ls 2670177.697113434: 7f0dfdbce04c _dl_start+0x1c or %rax, %rdx
ls 2670177.697114471: 7f0dfdbce04f _dl_start+0x1f movq 0x27e22(%rip), %rax IPC: 0.00 (15/1685)
ls 2670177.697116177: 7f0dfdbce056 _dl_start+0x26 movq %rdx, 0x27683(%rip) IPC: 0.00 (1/881)
Note, the IPC values are low due to page faults at the beginning of
execution. The additional cycles are due to the time to enter the
kernel, not the actual kernel page fault handler.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Copy the incremental instruction count and cycle count onto 'instructions'
and 'branches' samples.
Because Intel PT does not update the cycle count on every branch or
instruction, the incremental values will often be zero.
When there are values, they will be the number of instructions and
number of cycles since the last update, and thus represent the average
IPC since the last IPC value.
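As a small illustration of how a consumer might treat these incremental values, the sketch below skips samples with no cycle update and computes the average IPC for the rest. The function name is hypothetical; the input pairs are the 'Insn Cnt' and 'Cyc Cnt' values from the first seven rows of the "All Branches" report quoted earlier in this log.

```python
def ipc_updates(samples):
    """Yield (index, ipc) for samples that carry a cycle count update.

    samples: iterable of (insn_cnt, cyc_cnt) incremental counts, i.e.
    instructions and cycles since the previous update.
    """
    for index, (insn_cnt, cyc_cnt) in enumerate(samples):
        if cyc_cnt == 0:
            continue  # no update on this sample, so no IPC to report
        yield index, insn_cnt / cyc_cnt


branch_samples = [(0, 0), (0, 883), (0, 0), (0, 0), (17, 996), (0, 0), (1, 816)]
for index, ipc in ipc_updates(branch_samples):
    print(f"sample {index}: IPC {ipc:.2f}")
```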
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add counts of instructions and cycles, in order to represent
instructions-per-cycle (IPC).
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In preparation for providing instructions-per-cycle (IPC) information,
accumulate cycle count from CYC packets.
Although CYC packets are optional (requires config term 'cyc' to enable
cycle-accurate mode when recording), the simplest way to count cycles is
with CYC packets.
The first complication is that cycles must be counted only when also
counting instructions.
That means when control flow packet generation is enabled i.e. between
TIP.PGE and TIP.PGD packets.
Also, sampling the cycle count follows the same rules as sampling the
timestamp, that is, not before the instruction to which the decoder is
walking is reached.
In addition, the cycle count is not accurate for any but the first
branch of a TNT packet.
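The first complication can be sketched with a tiny state machine. The packet stream here is a hypothetical list of (kind, payload) tuples rather than the decoder's real data structures, and the sketch deliberately ignores the sampling and TNT accuracy rules described above.

```python
def count_cycles(packets):
    """Accumulate CYC payloads only while control flow tracing is enabled.

    packets: iterable of (kind, payload) tuples, a simplified and
    hypothetical view of a decoded Intel PT packet stream.
    """
    enabled = False
    total = 0
    for kind, payload in packets:
        if kind == "TIP.PGE":
            enabled = True    # packet generation enabled: start counting
        elif kind == "TIP.PGD":
            enabled = False   # packet generation disabled: stop counting
        elif kind == "CYC" and enabled:
            total += payload
    return total


stream = [
    ("CYC", 5),        # ignored: before TIP.PGE
    ("TIP.PGE", 0),
    ("CYC", 10),
    ("TNT", 0),
    ("CYC", 7),
    ("TIP.PGD", 0),
    ("CYC", 100),      # ignored: after TIP.PGD
]
print(count_cycles(stream))  # prints 17
```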
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To eliminate some duplication and make the code more understandable,
factor out intel_pt_update_sample_time.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When DWARF stacks were requested and the user also specified a register
set using the --user-regs option, the full register context was still
being captured on samples:
$ perf record -g --call-graph dwarf,1024 --user-regs=IP,SP,BP -- stack_test2.g.O3
188143843893585 0x6b48 [0x4f8]: PERF_RECORD_SAMPLE(IP, 0x4002): 23828/23828: 0x401236 period: 1363819 addr: 0x7ffedbdd51ac
... FP chain: nr:0
... user regs: mask 0xff0fff ABI 64-bit
.... AX 0x53b
.... BX 0x7ffedbdd3cc0
.... CX 0xffffffff
.... DX 0x33d3a
.... SI 0x7f09b74c38d0
.... DI 0x0
.... BP 0x401260
.... SP 0x7ffedbdd3cc0
.... IP 0x401236
.... FLAGS 0x20a
.... CS 0x33
.... SS 0x2b
.... R8 0x7f09b74c3800
.... R9 0x7f09b74c2da0
.... R10 0xfffffffffffff3ce
.... R11 0x246
.... R12 0x401070
.... R13 0x7ffedbdd5db0
.... R14 0x0
.... R15 0x0
... ustack: size 1024, offset 0xe0
. data_src: 0x5080021
... thread: stack_test2.g.O:23828
...... dso: /root/abudanko/stacks/stack_test2.g.O3
I.e. the --user-regs=IP,SP,BP was being ignored, being overridden by the
needs of --call-graph=dwarf.
After applying the change in this patch the sample data contains the
user-specified registers, while making sure that at least the minimal set
of registers needed for DWARF unwinding (DWARF_MINIMAL_REGS) is
requested.
The user is warned that DWARF unwinding may not work if extra registers
end up being needed.
-g call-graph dwarf,K                          full_regs
--user-regs=user_regs                          user_regs
-g call-graph dwarf,K --user-regs=user_regs    user_regs + DWARF_MINIMAL_REGS
$ perf record -g --call-graph dwarf,1024 --user-regs=BP -- ls
WARNING: The use of --call-graph=dwarf may require all the user registers, specifying a subset with --user-regs may render DWARF unwinding unreliable, so the minimal registers set (IP, SP) is explicitly forced.
arch COPYING Documentation include Kbuild lbuild MAINTAINERS modules.builtin Module.symvers perf.data.old scripts System.map virt
block CREDITS drivers init Kconfig lib Makefile modules.builtin.modinfo net README security tools vmlinux
certs crypto fs ipc kernel LICENSES mm modules.order perf.data samples sound usr vmlinux.o
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ]
188368474305373 0x5e40 [0x470]: PERF_RECORD_SAMPLE(IP, 0x4002): 23839/23839: 0x401236 period: 1260507 addr: 0x7ffd3d85e96c
... FP chain: nr:0
... user regs: mask 0x1c0 ABI 64-bit
.... BP 0x401260
.... SP 0x7ffd3d85cc20
.... IP 0x401236
... ustack: size 1024, offset 0x58
. data_src: 0x5080021
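The three cases in the table above can be modelled with plain bit masks. The sketch below assumes the x86 perf register numbering (AX at bit 0 up to IP at bit 8) and takes the full mask from the first sample dump; the function names are hypothetical and this is not the evsel.c code. Returning 0 when neither option is given is also an assumption.

```python
# Assumed x86 perf register bit positions (AX=0 ... IP=8).
PERF_REG_X86 = {
    "AX": 0, "BX": 1, "CX": 2, "DX": 3, "SI": 4,
    "DI": 5, "BP": 6, "SP": 7, "IP": 8,
}
# Mask printed by the first sample dump above, used as the "full" set.
FULL_REGS_MASK = 0xFF0FFF


def regs_mask(names):
    """Build a register bit mask from a list of register names."""
    mask = 0
    for name in names:
        mask |= 1 << PERF_REG_X86[name.upper()]
    return mask


# Minimal set named in the warning text: IP and SP.
DWARF_MINIMAL_REGS = regs_mask(["IP", "SP"])


def sample_regs_user(user_regs=None, dwarf=False):
    """Pick the user register mask following the table above."""
    if user_regs is None:
        return FULL_REGS_MASK if dwarf else 0
    mask = regs_mask(user_regs)
    if dwarf:
        mask |= DWARF_MINIMAL_REGS  # force the minimal unwinding set
    return mask


print(hex(sample_regs_user(["BP"], dwarf=True)))  # prints 0x1c0
```

With --user-regs=BP and DWARF call graphs this yields 0x1c0, which matches the "user regs: mask 0x1c0" line in the sample dump above.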
Committer notes:
Detected build failures on arches where PERF_REGS_ is not available,
such as debian:experimental-x-{mips,mips64,mipsel} and fedora 24 and 30
for ARC uClibc and glibc. Reported to Alexey, who provided a patch moving
DWARF_MINIMAL_REGS from evsel.c to util/perf_regs.h, where it is
guarded by a HAVE_PERF_REGS_SUPPORT ifdef.
Committer testing:
# perf record --user-regs=bp,ax -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.955 MB perf.data (1773 samples) ]
# perf script -F+uregs | grep AX: | head -5
perf 1719 [000] 181.272398: 1 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00
perf 1719 [000] 181.272402: 1 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00
perf 1719 [000] 181.272403: 8 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00
perf 1719 [000] 181.272405: 181 cycles: ffffffffba06a7c6 native_write_msr+0x6 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00
perf 1719 [000] 181.272406: 4405 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00
# perf record --call-graph=dwarf --user-regs=bp,ax -a sleep 1
WARNING: The use of --call-graph=dwarf may require all the user registers, specifying a subset with --user-regs may render DWARF unwinding unreliable, so the minimal registers set (IP, SP) is explicitly forced.
[ perf record: Woken up 55 times to write data ]
[ perf record: Captured and wrote 24.184 MB perf.data (2841 samples) ]
[root@quaco ~]# perf script --hide-call-graph -F+uregs | grep AX: | head -5
perf 1729 [000] 211.268006: 1 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db
perf 1729 [000] 211.268014: 1 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db
perf 1729 [000] 211.268017: 5 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db
perf 1729 [000] 211.268020: 48 cycles: ffffffffba06a7c6 native_write_msr+0x6 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db
perf 1729 [000] 211.268024: 490 cycles: ffffffffba00e471 intel_bts_enable_local+0x21 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db
#
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/e7fd37b1-af22-0d94-a0dc-5895e803bbfe@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Variable 'err' is defined but never used in function symsrc__init(),
remove it and directly return -1 at the end of the function.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190530093801.20510-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We forgot to update the perf.data file format document for the
HEADER_DIR_FORMAT header. Do it now, based on the comments in the patch
introducing it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Chong Jiang <chongjiang@chromium.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Simon Que <sque@chromium.org>
Fixes: 258031c017c3 ("perf header: Add DIR_FORMAT feature to describe directory data")
Link: https://lkml.kernel.org/n/tip-jbrzb7ijb5al33gi8br6f9rr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We forgot to update the perf.data file format document for the
HEADER_CLOCKID header. Do it now, based on the comments in the patch
introducing it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Chong Jiang <chongjiang@chromium.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Simon Que <sque@chromium.org>
Fixes: cf7905165fee ("perf record: Encode -k clockid frequency into Perf trace")
Link: https://lkml.kernel.org/n/tip-slhnjp06027j3ae17qqetzxj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We forgot to update the perf.data file format document for the
HEADER_MEM_TOPOLOGY header. Do it now, based on the comments in the patch
introducing it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Chong Jiang <chongjiang@chromium.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Simon Que <sque@chromium.org>
Fixes: e2091cedd51b ("perf tools: Add MEM_TOPOLOGY feature to perf data file")
Link: https://lkml.kernel.org/n/tip-h5lcm1nbe9ztxwm61gmadd56@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch adds descriptions of HEADER_BPF_PROG_INFO and HEADER_BPF_BTF to
perf.data-file-format.txt.
Requested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 606f972b1361 ("perf bpf: Save bpf_prog_info information as headers to perf.data")
Link: http://lkml.kernel.org/r/20190521064406.2498925-1-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If CONFIG_FUNCTION_GRAPH_TRACER is enabled, function
arch_counter_get_cntvct() is marked as notrace. However, function
__arch_counter_get_cntvct is marked as inline. If
CONFIG_OPTIMIZE_INLINING is set, that will make the two functions
traceable, which they shouldn't be.
Rework so that functions __arch_counter_get_* are marked with
__always_inline so they will be inlined even if CONFIG_OPTIMIZE_INLINING
is turned on.
Fixes: 0ea415390cd3 ("clocksource/arm_arch_timer: Use arch_timer_read_counter to access stable counters")
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
asm/smp.h is included by linux/smp.h and some drivers, in particular
irqchip drivers can access cpu_logical_map[] in order to perform SMP
affinity tasks. Make arm64 consistent with other architectures here.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
In commit 06a916feca2b ("arm64: Expose SVE2 features for
userspace"), new hwcaps are added that are detected via fields in
the SVE-specific ID register ID_AA64ZFR0_EL1.
In order to check compatibility of secondary cpus with the hwcaps
established at boot, the cpufeatures code uses
__read_sysreg_by_encoding() to read this ID register based on the
sys_reg field of the arm64_elf_hwcaps[] table.
This leads to a kernel splat if an hwcap uses an ID register that
__read_sysreg_by_encoding() doesn't explicitly handle, as now
happens when exercising cpu hotplug on an SVE2-capable platform.
So fix it by adding the required case in there.
Fixes: 06a916feca2b ("arm64: Expose SVE2 features for userspace")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
test_lirc_mode2_user is included in test_lirc_mode2.sh test and should
not be run directly.
Fixes: 6bdd533cee9a ("bpf: add selftest for lirc_mode2 type program")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This patch fixes the chipmunk-like voice that manifests randomly when
using the integrated mic of the Logitech Webcam HD C270.
The issue was solved initially for this device by commit 2394d67e446b
("USB: add RESET_RESUME for webcams shown to be quirky") but it was then
reintroduced by e387ef5c47dd ("usb: Add USB_QUIRK_RESET_RESUME for all
Logitech UVC webcams"). This patch is to have the fix back.
Signed-off-by: Marco Zatta <marco@zatta.me>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There is one more Realtek card reader that requires ums-realtek to work
correctly.
Add the device ID to support it.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
drm-intel-fixes
gvt-fixes-2019-06-05
- Fix i915 guest debug build for register command access (Weinan)
- Fix guest ring state after execution for hangcheck (Xiaolin)
- klocwork static check fixes (Alek)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190605084903.GX9684@zhen-hp.sh.intel.com
|
|
unmap_udmabuf fails to actually unmap the scatterlist, leaving dangling
mappings around.
Fixes: fbb0de795078 ("Add udmabuf misc device")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190604202331.17482-1-l.stach@pengutronix.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
The MIPS GIC contains a block of registers used to map local interrupts
to a particular CPU interrupt pin. Since these registers are found at a
consecutive range of addresses we access them using an index, via the
(read|write)_gic_v[lo]_map accessor functions. We currently use values
from enum mips_gic_local_interrupt as those indices.
Unfortunately whilst enum mips_gic_local_interrupt provides the correct
offsets for bits in the pending & mask registers, the ordering of the
map registers is subtly different... Compared with the ordering of
pending & mask bits, the map registers move the FDC from the end of the
list to index 3 after the timer interrupt. As a result the performance
counter & software interrupts are therefore at indices 4-6 rather than
indices 3-5.
Notably this causes problems with performance counter interrupts being
incorrectly mapped on some systems, and presumably will also cause
problems for FDC interrupts.
Introduce a function to map from enum mips_gic_local_interrupt to the
index of the corresponding map register, and use it to ensure we access
the map registers for the correct interrupts.
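The remapping described above can be sketched as follows. The enum member names and the values 0 to 2 are assumptions chosen to mirror the ordering the text describes (performance counter and software interrupts at 3-5, FDC last); the function name is hypothetical and this is not the driver code.

```python
from enum import IntEnum


class GicLocalInt(IntEnum):
    """Bit order in the pending and mask registers (names are assumptions)."""
    WD = 0
    COMPARE = 1
    TIMER = 2
    PERFCTR = 3
    SWINT0 = 4
    SWINT1 = 5
    FDC = 6


def local_int_to_map_index(intr):
    """Return the map register index for a local interrupt."""
    intr = GicLocalInt(intr)
    if intr == GicLocalInt.FDC:
        return 3  # FDC sits right after the timer in the map registers
    if intr >= GicLocalInt.PERFCTR:
        return int(intr) + 1  # perf counter and software interrupts shift up
    return int(intr)  # watchdog, compare and timer keep their positions


for intr in GicLocalInt:
    print(intr.name, local_int_to_map_index(intr))
```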
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: a0dc5cb5e31b ("irqchip: mips-gic: Simplify gic_local_irq_domain_map()")
Fixes: da61fcf9d62a ("irqchip: mips-gic: Use irq_cpu_online to (un)mask all-VP(E) IRQs")
Reported-and-tested-by: Archer Yan <ayan@wavecomp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
irq_create_fwspec_mapping() can fail, returning 0 as parent_virq. In this
case vint_desc is going to be NULL in ti_sci_inta_alloc_irq(), which will
cause a NULL pointer dereference.
Also note that irq_create_fwspec_mapping() returns 'unsigned int' so the
check '<=' was wrong.
Use -EINVAL if irq_create_fwspec_mapping() returned with 0.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
The csky,mpintc can deliver an external irq to one cpu or to all cpus, but
it cannot deliver an external irq to a group of cpus with cpu_mask. So
we only use auto deliver mode when affinity mask_val is equal to
cpu_present_mask.
There is no limitation to only two cpus in an SMP system.
Signed-off-by: Guo Ren <ren_guo@c-sky.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
As Eric noted, the current wrapper for ptype func hook inside
__netif_receive_skb_list_ptype() has no chance of avoiding the indirect
call: we enter such code path only for protocols other than ipv4 and
ipv6.
Instead we can wrap the list_func invocation.
v1 -> v2:
- use the correct fix tag
Fixes: f5737cbadb7d ("net: use indirect calls helpers for ptype hook")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are some NICs, such as hinic, with NETIF_F_IP_CSUM and NETIF_F_TSO
on but NETIF_F_HW_CSUM off. The ipvlan device features will then have
NETIF_F_TSO on with NETIF_F_HW_CSUM and NETIF_F_IP_CSUM both off, as
IPVLAN_FEATURES only cares about NETIF_F_HW_CSUM. So TSO will be
disabled in netdev_fix_features.
For example:
Features for enp129s0f0:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: on
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: on
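The interaction can be modelled with a few bit flags. In this sketch the bit positions are arbitrary illustrative values, not the kernel's, and both masks are hypothetical: the first passes only NETIF_F_HW_CSUM and NETIF_F_TSO, the second also lets NETIF_F_IP_CSUM through. The fix_features() helper models only the rule that TSO is dropped when neither checksum feature is set.

```python
# Illustrative bit positions only; these are not the kernel's values.
NETIF_F_IP_CSUM = 1 << 0
NETIF_F_HW_CSUM = 1 << 1
NETIF_F_TSO = 1 << 2


def fix_features(features):
    """Drop TSO when neither HW_CSUM nor IP_CSUM is available."""
    has_csum = features & (NETIF_F_HW_CSUM | NETIF_F_IP_CSUM)
    if (features & NETIF_F_TSO) and not has_csum:
        features &= ~NETIF_F_TSO
    return features


def upper_features(lower_features, mask):
    """Features an upper device ends up with: masked, then fixed up."""
    return fix_features(lower_features & mask)


# A lower device like the one described: IP_CSUM and TSO on, HW_CSUM off.
lower = NETIF_F_IP_CSUM | NETIF_F_TSO

narrow_mask = NETIF_F_HW_CSUM | NETIF_F_TSO      # ignores IP_CSUM
wide_mask = narrow_mask | NETIF_F_IP_CSUM        # hypothetical wider mask

print(bool(upper_features(lower, narrow_mask) & NETIF_F_TSO))  # TSO lost
print(bool(upper_features(lower, wide_mask) & NETIF_F_TSO))    # TSO kept
```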
Fixes: a188222b6ed2 ("net: Rename NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We need to drop the "ctrl_info->sync_request_sem" lock before returning.
Fixes: 6c223761eb54 ("smartpqi: initial commit of Microsemi smartpqi driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
struct ufs_dev_cmd is the main container that supports device management
commands. In the case of a read descriptor request, we assume that the
proper space was allocated in dev_cmd to hold the returning descriptor.
This is no longer true, as there are flows that don't use dev_cmd for
device management requests, and it was wrong in the first place.
Fixes: d44a5f98bb49 (ufs: query descriptor API)
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Acked-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
By default, packets received in another VRF should not be passed to an
unbound socket in the default VRF. This patch updates the IPv4 UDP
multicast logic to match the unicast VRF logic (in compute_score()),
as well as the IPv6 mcast logic (in __udp_v6_is_mcast_sock()).
The particular case I noticed was DHCP discover packets going
to the 255.255.255.255 address, which are handled by
__udp4_lib_mcast_deliver(). The previous code meant that running
multiple different DHCP server or relay agent instances across VRFs
did not work correctly - any server/relay agent in the default VRF
received DHCP discover packets for all other VRFs.
Fixes: 6da5b0f027a8 ("net: ensure unbound datagram socket to be chosen when not in a VRF")
Signed-off-by: Tim Beale <timbeale@catalyst.net.nz>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jakub Kicinski says:
====================
net/tls: redo the RX resync locking
Take two of making sure we don't use a NULL netdev pointer
for RX resync. This time using a bit and an open coded
wait loop.
v2:
- fix build warning (DaveM).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 38030d7cb779 ("net/tls: avoid NULL-deref on resync during device removal")
tried to fix a potential NULL-dereference by taking the
context rwsem. Unfortunately the RX resync may get called
from soft IRQ, so we can't use the rwsem to protect from
the device disappearing. Because we are guaranteed there
can be only one resync at a time (it's called from strparser)
use a bit to indicate resync is busy and make device
removal wait for the bit to get cleared.
Note that there is a leftover "flags" field in struct
tls_context already.
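The busy-bit scheme can be sketched in userspace with C11 atomics.
This is a model under stated assumptions, not the kernel code: struct
ctx, RX_SYNC_RUNNING, rx_resync() and device_down() are hypothetical
names, and the wait loop spins where kernel code would sleep.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RX_SYNC_RUNNING 1u	/* hypothetical flag bit */

struct ctx {
	atomic_uint flags;
	_Atomic(void *) netdev;
};

/* Resync path: mark busy, then look at the device pointer. */
static bool rx_resync(struct ctx *c)
{
	bool done = false;

	/* Only one resync may run at a time. */
	if (atomic_fetch_or(&c->flags, RX_SYNC_RUNNING) & RX_SYNC_RUNNING)
		return false;

	if (atomic_load(&c->netdev) != NULL)
		done = true;	/* ... the device would be used here ... */

	atomic_fetch_and(&c->flags, ~RX_SYNC_RUNNING);
	return done;
}

/* Removal path: hide the device first, then wait out a running resync. */
static void device_down(struct ctx *c)
{
	atomic_store(&c->netdev, NULL);
	while (atomic_load(&c->flags) & RX_SYNC_RUNNING)
		;	/* open coded wait loop */
}
```

Because the resync sets the bit before reading the pointer, and the
removal clears the pointer before testing the bit, a resync that saw a
valid device is always waited for.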
Fixes: 4799ac81e52a ("tls: Add rx inline crypto offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 38030d7cb77963ba84cdbe034806e2b81245339f.
Unfortunately the RX resync may get called from soft IRQ,
so we can't take the rwsem to protect from the device
disappearing.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Cc: "Ed L. Cashin" <ed.cashin@acm.org>
Cc: linux-block@vger.kernel.org
Cc: Omar Sandoval <osandov@osandov.com>
Acked-by: Justin Sanders <justin@coraid.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The hardware values for link speed are held in the sja1105_speed_t enum.
However they do not increase in the order that sja1105_get_speed_cfg was
iterating over them (basically from SJA1105_SPEED_AUTO - 0 - to
SJA1105_SPEED_1000MBPS - 1 - skipping the other two).
Another bug is that the code in sja1105_adjust_port_config relies on the
fact that an invalid link speed is detected by sja1105_get_speed_cfg and
returned as -EINVAL. However storing this into an enum that only has
positive members will cast it into an unsigned value, and it will miss
the negative check.
So take the simplest approach and remove the sja1105_get_speed_cfg
function and replace it with a simple switch-case statement.
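A minimal sketch of such a switch-case mapping is shown below. It is
not the driver code: the enum and function names are hypothetical, and
only the values for AUTO (0) and 1000MBPS (1) come from the text above;
the other two values are assumptions made for the example.

```c
#include <errno.h>

enum speed_cfg {
	SPEED_AUTO     = 0,
	SPEED_1000MBPS = 1,
	SPEED_100MBPS  = 2,	/* assumed value */
	SPEED_10MBPS   = 3,	/* assumed value */
};

/*
 * The error is returned as a plain int and the hardware value goes
 * through a pointer, so -EINVAL is never stored in the enum.
 */
static int speed_to_cfg(int speed_mbps, enum speed_cfg *cfg)
{
	switch (speed_mbps) {
	case 0:
		*cfg = SPEED_AUTO;
		break;
	case 10:
		*cfg = SPEED_10MBPS;
		break;
	case 100:
		*cfg = SPEED_100MBPS;
		break;
	case 1000:
		*cfg = SPEED_1000MBPS;
		break;
	default:
		return -EINVAL;
	}
	return 0;
}
```

Keeping the return code separate from the enum means the caller's
"< 0" check works regardless of how the compiler chooses the enum's
underlying type.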
Fixes: 8aa9ebccae87 ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Avoid reducing the support mask as a result of the interface type
selected for SFP modules, or when setting the link settings through
ethtool - this should only change when the supported link modes of
the hardware combination change.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The following error occurs for the `make ARCH=arm64 checkstack` case:
aarch64-linux-gnu-objdump -d vmlinux $(find . -name '*.ko') | \
perl ./scripts/checkstack.pl arm64
wrong or unknown architecture "arm64"
As suggested by Masahiro Yamada, fix the above error using regular
expressions in the same way it was fixed for the `ARCH=x86` case via
commit fda9f9903be6 ("scripts/checkstack.pl: automatically handle
32-bit and 64-bit mode for ARCH=x86").
Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: George G. Davis <george_davis@mentor.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
The buildtar script might want to invoke a make, so tell the parent
make to pass the jobserver token pipe to the subcommand by prefixing
the command with a +.
This addresses the issue seen here:
/bin/sh ../scripts/package/buildtar tar-pkg
make[3]: warning: jobserver unavailable: using -j1. Add '+' to parent make rule.
See https://www.gnu.org/software/make/manual/html_node/Job-Slots.html
for more information.
Signed-off-by: Trevor Bourget <tgb.kernel@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
Adding an SPDX license identifier is pretty safe; however, here is one
exception.
Since commit ec8f24b7faaf ("treewide: Add SPDX license identifier -
Makefile/Kconfig"), "make testconfig" would not pass.
When Kconfig detects a circular file inclusion, it displays error
messages with a file name and a line number prefixed to each line.
The unit test checks if Kconfig emits the error messages correctly
(this also checks the line number correctness).
Now that the test input has the SPDX license identifier at the very top,
the line numbers in the expected stderr should be incremented by 1.
Fixes: ec8f24b7faaf ("treewide: Add SPDX license identifier - Makefile/Kconfig")
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
Current implementation of kselftest-merge only finds config files that
are one level deep using `$(srctree)/tools/testing/selftests/*/config`.
Often, config files are added in nested directories, and do not get
picked up by kselftest-merge.
Use `find` to catch all config files under
`$(srctree)/tools/testing/selftests` instead.
Signed-off-by: Dan Rue <dan.rue@linaro.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
The WRITE ZEROES command has no data transfer, so we need to initialize
(nvmet_req *req)->data_len to 0x0. While (nvmet_req *req)->transfer_len
is initialized in nvmet_req_init(), data_len is not initialized
anywhere, which might cause random failures with status code
NVME_SC_SGL_INVALID_DATA | NVME_SC_DNR. This is because
nvmet_req_execute() checks:
    if (unlikely(req->data_len != req->transfer_len)) {
        req->error_loc = offsetof(struct nvme_common_command, dptr);
        nvmet_req_complete(req, NVME_SC_SGL_INVALID_DATA | NVME_SC_DNR);
    } else
        req->execute(req);
This patch makes sure req->data_len does not hold a random value by
initializing it to 0x0 when preparing the command in
nvmet_bdev_parse_io_cmd().
nvmet_file_parse_io_cmd(), which handles file-backed I/O, already
initializes the data_len field to 0x0.
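The failure mode can be reproduced with a small model of the check.
This is an illustrative sketch only: struct req, the STATUS_* codes and
both functions are hypothetical simplifications, not the nvmet
definitions.

```c
#include <stdint.h>

enum { STATUS_OK = 0, STATUS_SGL_INVALID = 1 };	/* hypothetical codes */

struct req {
	uint32_t data_len;	/* expected by the command handler */
	uint32_t transfer_len;	/* set up by the transport, 0 for no data */
};

/* Model of the length check quoted above. */
static int req_execute(const struct req *req)
{
	if (req->data_len != req->transfer_len)
		return STATUS_SGL_INVALID;
	return STATUS_OK;
}

/* Model of the fix: a command without data transfer sets data_len to 0. */
static void parse_write_zeroes(struct req *req)
{
	req->data_len = 0;
}
```

A request whose data_len still holds leftover garbage fails the check
even though transfer_len is 0; after parse_write_zeroes() the two
lengths agree and the command is executed.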
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com>
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|