summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)Author
2017-12-27perf stat: Update or print per-thread statsJin Yao
If the stats pointer in stat_config structure is not null, it will update the per-thread stats or print the per-thread stats on this buffer. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-9-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf stat: Allocate shadow stats buffer for threadsJin Yao
After perf_evlist__create_maps() being executed, we can get all threads from /proc. And via thread_map__nr(), we can also get the number of threads. With the number of threads, the patch allocates a buffer which will record the shadow stats for these threads. The buffer pointer is saved in stat_config. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-8-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf stat: Remove a set of shadow stats static variablesJin Yao
In previous patches, we have reconstructed the code and let it not access the static variables directly. This patch removes these static variables. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-7-git-send-email-yao.jin@linux.intel.com [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf stat: Print per-thread shadow statsJin Yao
The function perf_stat__print_shadow_stats() is called to print the shadow stats on a set of static variables. But the static variables are the limitations to support per-thread shadow stats. This patch lets the perf_stat__print_shadow_stats() support to print the shadow stats from a input parameter 'st'. It will not directly get value from static variable. Instead, it now uses runtime_stat_avg() and runtime_stat_n() to get and compute the values. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-6-git-send-email-yao.jin@linux.intel.com [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf stat: Update per-thread shadow statsJin Yao
The functions perf_stat__update_shadow_stats() is called to update the shadow stats on a set of static variables. But the static variables are the limitations to be extended to support per-thread shadow stats. This patch lets the perf_stat__update_shadow_stats() support to update the shadow stats on a input parameter 'st' and uses update_runtime_stat() to update the stats. It will not directly update the static variables as before. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-5-git-send-email-yao.jin@linux.intel.com [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf stat: Create the runtime_stat init/exit functionJin Yao
It mainly initializes and releases the rblist which is defined in struct runtime_stat. For the original rblist 'runtime_saved_values', it's still kept there for keeping the patch bisectable. The rblist 'runtime_saved_values' will be removed in later patch at switching time. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-4-git-send-email-yao.jin@linux.intel.com [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf stat: Extend rbtree to support per-thread shadow statsJin Yao
Previously the rbtree was used to link generic metrics. This patches adds new ctx/type/stat into rbtree keys because we will use this rbtree to maintain shadow metrics to replace original a couple of static arrays for supporting per-thread shadow stats. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf stat: Define a structure for per-thread shadow statsJin Yao
Perf has a set of static variables to record the runtime shadow metrics stats. While if we want to record the runtime shadow stats for per-thread, it will be the limitation. This patch creates a structure and the next patches will use this structure to update the runtime shadow stats for per-thread. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-18Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-18tools arch s390: Do not include header files from the kernel sourcesArnaldo Carvalho de Melo
Long ago we decided to be verbotten including files in the kernel git sources from tools/ living source code, to avoid disturbing kernel development (and perf's and other tools/) when, say, a kernel hacker adds something, tests everything but tools/ and have tools/ build broken. This got broken recently by s/390, fix it by copying arch/s390/include/uapi/asm/perf_regs.h to tools/arch/s390/include/uapi/asm/, making this one be used by means of <asm/perf_regs.h> and updating tools/perf/check_headers.sh to make sure we are notified when the original changes, so that we can check if anything is needed on the tooling side. This would have been caught by the 'tarkpg' test entry in: $ make -C tools/perf build-test When run on a s/390 build system or container. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: f704ef44602f ("s390/perf: add support for perf_regs and libdw") Link: https://lkml.kernel.org/n/tip-n57139ic0v9uffx8wdqi3d8a@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-18perf jvmti: Generate correct debug information for inlined codeBen Gainey
tools/perf/jvmti is broken in so far as it generates incorrect debug information. Specifically it attributes all debug lines to the original method being output even in the case that some code is being inlined from elsewhere. This patch fixes the issue. To test (from within linux/tools/perf): export JDIR=/usr/lib/jvm/java-8-openjdk-amd64/ make cat << __EOF > Test.java public class Test { private StringBuilder b = new StringBuilder(); private void loop(int i, String... args) { for (String a : args) b.append(a); long hc = b.hashCode() * System.nanoTime(); b = new StringBuilder(); b.append(hc); System.out.printf("Iteration %d = %d\n", i, hc); } public void run(String... args) { for (int i = 0; i < 10000; ++i) { loop(i, args); } } public static void main(String... args) { Test t = new Test(); t.run(args); } } __EOF $JDIR/bin/javac Test.java ./perf record -F 10000 -g -k mono $JDIR/bin/java -agentpath:`pwd`/libperf-jvmti.so Test ./perf inject --jit -i perf.data -o perf.data.jitted ./perf annotate -i perf.data.jitted --stdio | grep Test\.java: | sort -u Before this patch, Test.java line numbers get reported that are greater than the number of lines in the Test.java file. They come from the source file of the inlined function, e.g. java/lang/String.java:1085. For further validation one can examine those lines in the JDK source distribution and confirm that they map to inlined functions called by Test.java. After this patch, the filename of the inlined function is output rather than the incorrect original source filename. Signed-off-by: Ben Gainey <ben.gainey@arm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Colin King <colin.king@canonical.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 598b7c6919c7 ("perf jit: add source line info support") Link: http://lkml.kernel.org/r/20171122182541.d25599a3eb1ada3480d142fa@arm.com Signed-off-by: Kim Phillips <kim.phillips@arm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-18perf tools: Fix up build in hardened environmentsJiri Olsa
On Fedora systems the perl and python CFLAGS/LDFLAGS include the hardened specs from redhat-rpm-config package. We apply them only for perl/python objects, which makes them not compatible with the rest of the objects and the build fails with: /usr/bin/ld: perf-in.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -f +PIC /usr/bin/ld: libperf.a(libperf-in.o): relocation R_X86_64_32S against `.text' can not be used when making a shared object; recompile w +ith -fPIC /usr/bin/ld: final link failed: Nonrepresentable section on output collect2: error: ld returned 1 exit status make[2]: *** [Makefile.perf:507: perf] Error 1 make[1]: *** [Makefile.perf:210: sub-make] Error 2 make: *** [Makefile:69: all] Error 2 Mainly it's caused by perl/python objects being compiled with: -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 which prevent the final link impossible, because it will check for 'proper' objects with following option: -specs=/usr/lib/rpm/redhat/redhat-hardened-ld Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20171204082437.GC30564@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-18perf tools: Use shell function for perl cflags retrievalJiri Olsa
Using the shell function for perl CFLAGS retrieval instead of back quotes (``). Both execute shell with the command, but the latter is more explicit and seems to be the preferred way. Also we don't have any other use of the back quotes in perf Makefiles. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171108102739.30338-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-18Merge tag 'v4.15-rc4' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-15Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: - fix the s2ram regression related to confusion around segment register restoration, plus related cleanups that make the code more robust - a guess-unwinder Kconfig dependency fix - an isoimage build target fix for certain tool chain combinations - instruction decoder opcode map fixes+updates, and the syncing of the kernel decoder headers to the objtool headers - a kmmio tracing fix - two 5-level paging related fixes - a topology enumeration fix on certain SMP systems" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Resync objtool's instruction decoder source code copy with the kernel's latest version x86/decoder: Fix and update the opcodes map x86/power: Make restore_processor_context() sane x86/power/32: Move SYSENTER MSR restoration to fix_processor_context() x86/power/64: Use struct desc_ptr for the IDT in struct saved_context x86/unwinder/guess: Prevent using CONFIG_UNWINDER_GUESS=y with CONFIG_STACKDEPOT=y x86/build: Don't verify mtools configuration file for isoimage x86/mm/kmmio: Fix mmiotrace for page unaligned addresses x86/boot/compressed/64: Print error if 5-level paging is not supported x86/boot/compressed/64: Detect and handle 5-level paging at boot-time x86/smpboot: Do not use smp_num_siblings in __max_logical_packages calculation
2017-12-15x86/decoder: Fix and update the opcodes mapRandy Dunlap
Update x86-opcode-map.txt based on the October 2017 Intel SDM publication. Fix INVPID to INVVPID. Add UD0 and UD1 instruction opcodes. Also sync the objtool and perf tooling copies of this file. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/aac062d7-c0f6-96e3-5c92-ed299e2bd3da@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-12Merge tag 'v4.15-rc3' into perf/core, to refresh the treeIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-12tools/perf: Convert ACCESS_ONCE() to READ_ONCE()Mark Rutland
Recently there was a treewide conversion of ACCESS_ONCE() to {READ,WRITE}_ONCE(), but a new use was introduced concurrently by commit: 1695849735752d2a ("perf mmap: Move perf_mmap and methods to separate mmap.[ch] files") Let's convert this over to READ_ONCE() so that we can remove the ACCESS_ONCE() definitions in subsequent patches. Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: apw@canonical.com Link: http://lkml.kernel.org/r/20171127103824.36526-2-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) CAN fixes from Martin Kelly (cancel URBs properly in all the CAN usb drivers). 2) Revert returning -EEXIST from __dev_alloc_name() as this propagates to userspace and broke some apps. From Johannes Berg. 3) Fix conn memory leaks and crashes in TIPC, from Jon Malloc and Cong Wang. 4) Gianfar MAC can't do EEE so don't advertise it by default, from Claudiu Manoil. 5) Relax strict netlink attribute validation, but emit a warning. From David Ahern. 6) Fix regression in checksum offload of thunderx driver, from Florian Westphal. 7) Fix UAPI bpf issues on s390, from Hendrik Brueckner. 8) New card support in iwlwifi, from Ihab Zhaika. 9) BBR congestion control bug fixes from Neal Cardwell. 10) Fix port stats in nfp driver, from Pieter Jansen van Vuuren. 11) Fix leaks in qualcomm rmnet, from Subash Abhinov Kasiviswanathan. 12) Fix DMA API handling in sh_eth driver, from Thomas Petazzoni. 13) Fix spurious netpoll warnings in bnxt_en, from Calvin Owens. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (67 commits) net: mvpp2: fix the RSS table entry offset tcp: evaluate packet losses upon RTT change tcp: fix off-by-one bug in RACK tcp: always evaluate losses in RACK upon undo tcp: correctly test congestion state in RACK bnxt_en: Fix sources of spurious netpoll warnings tcp_bbr: reset long-term bandwidth sampling on loss recovery undo tcp_bbr: reset full pipe detection on loss recovery undo tcp_bbr: record "full bw reached" decision in new full_bw_reached bit sfc: pass valid pointers from efx_enqueue_unwind gianfar: Disable EEE autoneg by default tcp: invalidate rate samples during SACK reneging can: peak/pcie_fd: fix potential bug in restarting tx queue can: usb_8dev: cancel urb on -EPIPE and -EPROTO can: kvaser_usb: cancel urb on -EPIPE and -EPROTO can: esd_usb2: cancel urb on -EPIPE and -EPROTO can: ems_usb: cancel urb on -EPIPE and -EPROTO can: mcba_usb: cancel urb on -EPROTO usbnet: fix alignment for frames with no ethernet header tcp: use current time in tcp_rcv_space_adjust() ...
2017-12-06Merge branch 'perf/urgent' into perf/core, to pick up fixes and to refresh ↵Ingo Molnar
to v4.15 Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06tooling/headers: Synchronize updated s390 and x86 UAPI headersIngo Molnar
There were two trivial updates to these upstream UAPI headers: arch/s390/include/uapi/asm/kvm.h arch/s390/include/uapi/asm/kvm_perf.h arch/x86/lib/x86-opcode-map.txt Synchronize them with their tooling copies. (The x86 opcode map includes a new instruction pattern now.) Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06Merge branch 'linus' into perf/urgent, to synchronize UAPI headersIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-05perf tools: Rename 'backward' to 'overwrite' in evlist, mmap and recordWang Nan
Remove the backward/forward concept to make it uniform with user interface (the '--overwrite' option). Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Mengting Zhang <zhangmengting@huawei.com> Link: http://lkml.kernel.org/r/20171204165107.95327-4-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf mmap: Don't discard prev in backward modeWang Nan
'perf record' can switch its output data file. The new output should only store the data after switching. However, in overwrite backward mode, the new output still can have data from before switching. That also brings extra overhead. At the end of mmap_read(), the position of the processed ring buffer is saved in md->prev. Next mmap_read should be end in md->prev if it is not overwriten. That avoids processing duplicate data. However, md->prev is discarded. So next the mmap_read() has to process whole valid ring buffer, which probably includes old processed data. Avoid calling backward_rb_find_range() when md->prev is still available. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Kan Liang <kan.liang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mengting Zhang <zhangmengting@huawei.com> Link: http://lkml.kernel.org/r/20171204165107.95327-3-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf mmap: Fix perf backward recordingWang Nan
'perf record' backward recording doesn't work as we expected: it never overwrites when ring buffer gets full. Test: Run a busy python printing task background like this: while True: print 123 send SIGUSR2 to perf to capture snapshot, then: # ./perf record --overwrite -e raw_syscalls:sys_enter -e raw_syscalls:sys_exit --exclude-perf -a --switch-output [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2017110101520743 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2017110101521251 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2017110101521692 ] ^C[ perf record: Woken up 1 times to write data ] [ perf record: Dump perf.data.2017110101521936 ] [ perf record: Captured and wrote 0.826 MB perf.data.<timestamp> ] # ./perf script -i ./perf.data.2017110101520743 | head -n3 perf 2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0) perf 2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, ffffffff, 3df, 100, 0) python 2545 [000] 12449.310800: raw_syscalls:sys_exit: NR 1 = 4 # ./perf script -i ./perf.data.2017110101521251 | head -n3 perf 2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0) perf 2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, ffffffff, 3df, 100, 0) python 2545 [000] 12449.310800: raw_syscalls:sys_exit: NR 1 = 4 # ./perf script -i ./perf.data.2017110101521692 | head -n3 perf 2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0) perf 2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, ffffffff, 3df, 100, 0) python 2545 [000] 12449.310800: raw_syscalls:sys_exit: NR 1 = 4 Timestamps never change, but my background task is a dead loop, can easily overwhelm the ring buffer. This patch fixes it by forcing unsetting PROT_WRITE for a backward ring buffer, so all backward ring buffers become overwrite ring buffers. Test result: # ./perf record --overwrite -e raw_syscalls:sys_enter -e raw_syscalls:sys_exit --exclude-perf -a --switch-output [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2017110101285323 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2017110101290053 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2017110101290446 ] ^C[ perf record: Woken up 1 times to write data ] [ perf record: Dump perf.data.2017110101290837 ] [ perf record: Captured and wrote 0.826 MB perf.data.<timestamp> ] # ./perf script -i ./perf.data.2017110101285323 | head -n3 python 2545 [000] 11064.268083: raw_syscalls:sys_exit: NR 1 = 4 python 2545 [000] 11064.268084: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0) python 2545 [000] 11064.268086: raw_syscalls:sys_exit: NR 1 = 4 # ./perf script -i ./perf.data.2017110101290 | head -n3 failed to open ./perf.data.2017110101290: No such file or directory # ./perf script -i ./perf.data.2017110101290053 | head -n3 python 2545 [000] 11071.564062: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0) python 2545 [000] 11071.564064: raw_syscalls:sys_exit: NR 1 = 4 python 2545 [000] 11071.564066: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0) # ./perf script -i ./perf.data.2017110101290 | head -n3 perf.data.2017110101290053 perf.data.2017110101290446 perf.data.2017110101290837 # ./perf script -i ./perf.data.2017110101290446 | head -n3 sshd 1321 [000] 11075.499473: raw_syscalls:sys_exit: NR 14 = 0 sshd 1321 [000] 11075.499474: raw_syscalls:sys_enter: NR 14 (2, 7ffe98899490, 0, 8, 0, 3000) sshd 1321 [000] 11075.499474: raw_syscalls:sys_exit: NR 14 = 0 # ./perf script -i ./perf.data.2017110101290837 | head -n3 python 2545 [000] 11079.280844: raw_syscalls:sys_exit: NR 1 = 4 python 2545 [000] 11079.280847: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0) python 2545 [000] 11079.280850: raw_syscalls:sys_exit: NR 1 = 4 Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Mengting Zhang <zhangmengting@huawei.com> Link: http://lkml.kernel.org/r/20171204165107.95327-2-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf report: Set browser mode right before setup_browser()Seokho Song
There are codes that print messages to the screen between assignment of the use_browser variable and setup_browser(). But since the GUI browser is not initialized during that period, all messages fail to show if the user passed the --gtk option to perf as GTK is not initialized yet. Reorder the code to assign use_browser variable right before setup_browser() is called. Signed-off-by: Seokho Song <0xdevssh@gmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20171204160244.6332-1-0xdevssh@gmail.com Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf vendor events: Use more flexible pattern matching for CPU ↵William Cohen
identification for mapfile.csv The powerpc cpuid information includes chip revision information. Changes between chip revisions are usually minor bug fixes and usually do not affect the operation of the performance monitoring hardware. The original mapfile.csv matching requires enumerating every possible cpuid string. When a new minor chip revision is produced a new entry has to be added to the mapfile.csv and the code recompiled to allow perf to have the implementation specific perf events for this new minor revision. For users of various distibutions of Linux having to wait for a new release of the kernel's perf tool to be built with these trivial patches is inconvenient. Using regular expressions rather than exactly string matching of the entire cpuid string allows developers to write mapfile.csv files that do not require patches and recompiles for each of these minor version changes. If special cases need to be made for some particular versions, they can be placed earlier in the mapfile.csv file before the more general matches. Signed-off-by: William Cohen <wcohen@redhat.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shriya <shriyak@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20171204145728.16792-1-wcohen@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf c2c: Add a tip about cacheline eventsSangwon Hong
Signed-off-by: Sangwon Hong <qpakzk@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1512188201-14109-1-git-send-email-qpakzk@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf mmap: Remove overwrite and check_messup from mmap readWang Nan
All perf_mmap__read_forward() read from read-write ring buffer, so no need check_messup. Reading from backward ring buffer doesn't require check_messup because it never mess up. Cleanup arguments lists. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/20171203020044.81680-6-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf mmap: Remove overwrite from arguments list of perf_mmap__pushWang Nan
'overwrite' argument is always 'false'. Remove it from arguments list of perf_mmap__push(). Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/20171203020044.81680-5-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf evlist: Remove evlist->overwriteWang Nan
evlist->overwrite is set to false in all users. It can be removed. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/20171203020044.81680-4-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf evlist: Remove 'overwrite' parameter from perf_evlist__mmap_exWang Nan
All users of perf_evlist__mmap_ex set !overwrite. Remove it from its arguments list. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/20171203020044.81680-3-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf evlist: Remove 'overwrite' parameter from perf_evlist__mmapWang Nan
Now all perf_evlist__mmap's users doesn't set 'overwrite'. Remove it from arguments list. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/20171203020044.81680-2-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf tools: Fix up build in hardnened environmentsJiri Olsa
On Fedora systems the perl and python CFLAGS/LDFLAGS include the hardened specs from redhat-rpm-config package. We apply them only for perl/python objects, which makes them not compatible with the rest of the objects and the build fails with: /usr/bin/ld: perf-in.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -f +PIC /usr/bin/ld: libperf.a(libperf-in.o): relocation R_X86_64_32S against `.text' can not be used when making a shared object; recompile w +ith -fPIC /usr/bin/ld: final link failed: Nonrepresentable section on output collect2: error: ld returned 1 exit status make[2]: *** [Makefile.perf:507: perf] Error 1 make[1]: *** [Makefile.perf:210: sub-make] Error 2 make: *** [Makefile:69: all] Error 2 Mainly it's caused by perl/python objects being compiled with: -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 which prevent the final link impossible, because it will check for 'proper' objects with following option: -specs=/usr/lib/rpm/redhat/redhat-hardened-ld Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20171204082437.GC30564@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf pmu: Add check for valid cpuid in perf_pmu__find_map()Ganapatrao Kulkarni
On some platforms(arm/arm64) which uses cpus map to get corresponding cpuid string, cpuid can be NULL for PMUs other than CORE PMUs. Adding check for NULL cpuid in function perf_pmu__find_map to avoid segmentation fault. Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ganapatrao Kulkarni <gklkml16@gmail.com> Cc: Jayachandran C <jnair@caviumnetworks.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@cavium.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20171016183222.25750-6-ganapatrao.kulkarni@cavium.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf vendor events arm64: Add ThunderX2 implementation defined pmu core eventsGanapatrao Kulkarni
This is not a full event list, but a short list of useful events. Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ganapatrao Kulkarni <gklkml16@gmail.com> Cc: Jayachandran C <jnair@caviumnetworks.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@cavium.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20171016183222.25750-5-ganapatrao.kulkarni@cavium.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf pmu: Add helper function is_pmu_core to detect PMU CORE devicesGanapatrao Kulkarni
On some platforms, PMU core devices sysfs name is not cpu. Adding function is_pmu_core to detect PMU core devices using core device specific hints in sysfs. For arm64 platforms, all core devices have file "cpus" in sysfs. Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Tested-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Will Deacon <will.deacon@arm.com> Link: https://lkml.kernel.org/n/tip-y1woxt1k2pqqwpprhonnft2s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf s390: add regs_query_register_offset()Hendrik Brueckner
The regs_query_register_offset() helper function converts register name like "%r0" to an offset of a register in user_pt_regs It is required by the BPF prologue generator. The user_pt_regs structure was recently added to "asm/ptrace.h". Hence, update tools/perf/check-headers.sh to keep the header file in sync with kernel changes. Suggested-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-and-tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-05perf tools arm64: Add support for get_cpuid_str function.Ganapatrao Kulkarni
The get_cpuid_str function returns the MIDR string of the first online cpu from the range of cpus associated with the PMU CORE device. Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ganapatrao Kulkarni <gklkml16@gmail.com> Cc: Jayachandran C <jnair@caviumnetworks.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@cavium.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20171016183222.25750-3-ganapatrao.kulkarni@cavium.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf pmu: Pass pmu as a parameter to get_cpuid_str()Ganapatrao Kulkarni
The cpuid string will not be same on all CPUs on heterogeneous platforms like ARM's big.LITTLE, adding provision(using pmu->cpus) to find cpuid string from associated CPUs of PMU CORE device. Also optimise arguments to function pmu_add_cpu_aliases. Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jayachandran C <jnair@caviumnetworks.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@cavium.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: http://lkml.kernel.org/r/20171016183222.25750-2-ganapatrao.kulkarni@cavium.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf s390: Always build with -fPICHendrik Brueckner
On s390, object files must be compiled with position-indepedent code in order to be incrementally linked or linked to shared libraries. Therefore, add -fPIC to the CFLAGS for s390 to ensure each object file is built properly. Reported-by: Jonathan Hermann <jonathan.hermann@de.ibm.com> Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: linux s390 list <linux-s390@vger.kernel.org> LPU-Reference: 1512031765-9382-1-git-send-email-brueckner@linux.vnet.ibm.com Link: https://lkml.kernel.org/n/tip-a8wga8hrl0d0r84cal96fmgv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf thread_map: Add method to map all threads in the systemArnaldo Carvalho de Melo
Reusing the thread_map__new_by_uid() proc scanning already in place to return a map with all threads in the system. Based-on-a-patch-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/n/tip-khh28q0wwqbqtrk32bfe07hd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf stat: Add rbtree node_delete opJin Yao
In current stat-shadow.c, the rbtree deleting is ignored. The patch adds the implementation to node_delete method of rblist. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512125856-22056-5-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf rblist: Create rblist__exit() functionJin Yao
Currently we have a rblist__delete() which is used to delete a rblist. While rblist__delete() will free the pointer of rblist at the end. It's an inconvenience for the user to delete a rblist which is not allocated by something like malloc(). For example, the rblist is embedded in a larger data structure. This patch creates a new function rblist__exit() which is similar to rblist__delete() but it will not free the pointer of rblist. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512125856-22056-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf annotate: Fix objdump comment parsing for Intel mov dissassemblyThomas Richter
The command 'perf annotate' parses the output of objdump and also investigates the comments produced by objdump. For example the output of objdump produces (on x86): 23eee: 4c 8b 3d 13 01 21 00 mov 0x210113(%rip),%r15 # 234008 <stderr@@GLIBC_2.2.5+0x9a8> and the function mov__parse() is called to investigate the complete line. Mov__parse() breaks this line into several parts and finally calls function comment__symbol() to parse the data after the comment character '#'. Comment__symbol() expects a hexadecimal address followed by a symbol in '<' and '>' brackets. However the 2nd parameter given to function comment__symbol() always points to the comment character '#'. The address parsing always returns 0 because the character '#' is not a digit and strtoull() fails without being noticed. Fix this by advancing the second parameter to function comment__symbol() by one byte before invocation and add an error check after strtoull() has been called. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Fixes: 6de783b6f50f ("perf annotate: Resolve symbols using objdump comment") Link: http://lkml.kernel.org/r/20171128075632.72182-1-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf annotate: Fix unnecessary memory allocation for s390xThomas Richter
This patch fixes a bug introduced with commit d9f8dfa9baf9 ("perf annotate s390: Implement jump types for perf annotate"). 'perf annotate' displays annotated assembler output by reading output of command objdump and parsing the disassembled lines. For each shown mnemonic this function sequence is executed: disasm_line__new() | +--> disasm_line__init_ins() | +--> ins__find() | +--> arch->associate_instruction_ops() The s390x specific function assigned to function pointer associate_instruction_ops refers to function s390__associate_ins_ops(). This function checks for supported mnemonics and assigns a NULL pointer to unsupported mnemonics. However even the NULL pointer is added to the architecture dependend instruction array. This leads to an extremely large architecture instruction array (due to array resize logic in function arch__grow_instructions()). Depending on the objdump output being parsed the array can end up with several ten-thousand elements. This patch checks if a mnemonic is supported and only adds supported ones into the architecture instruction array. The array does not contain elements with NULL pointers anymore. Before the patch (With some debug printf output): [root@s35lp76 perf]# time ./perf annotate --stdio > /tmp/xxxbb real 8m49.679s user 7m13.008s sys 0m1.649s [root@s35lp76 perf]# fgrep '__ins__find sorted:1 nr_instructions:' /tmp/xxxbb | tail -1 __ins__find sorted:1 nr_instructions:87433 ins:0x341583c0 [root@s35lp76 perf]# The number of different s390x branch/jump/call/return instructions entered into the array is 87433. After the patch (With some printf debug output:) [root@s35lp76 perf]# time ./perf annotate --stdio > /tmp/xxxaa real 1m24.553s user 0m0.587s sys 0m1.530s [root@s35lp76 perf]# fgrep '__ins__find sorted:1 nr_instructions:' /tmp/xxxaa | tail -1 __ins__find sorted:1 nr_instructions:56 ins:0x3f406570 [root@s35lp76 perf]# The number of different s390x branch/jump/call/return instructions entered into the array is 56 which is sensible. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20171124094637.55558-1-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf bench futex: Sync waker threadsJames Yang
Waker threads in the futex wake-parallel benchmark are started by a loop using pthread_create(). However, there is no synchronization for when the waker threads wake the waiting threads. Comparison of the waker threads' measurement timestamps show they are not all running concurrently because older waker threads finish their task before newer waker threads even start. This patch uses a barrier to better synchronize the waker threads. Signed-off-by: James Yang <james.yang@arm.com Cc: Kim Phillips <kim.phillips@arm.com> Link: http://lkml.kernel.org/r/20171127042101.3659-4-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> [ Disable the wake-parallel test for systems without pthread_barrier_t ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05tools build feature: Check if pthread_barrier_t is availableArnaldo Carvalho de Melo
As 'perf bench futex wake-parallel" will use this, which is not available in older systems such as versions of the android NDK used in my container build tests (r12b and r15c at the moment). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: James Yang <james.yang@arm.com Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-1i7iv54in4wj08lwo55b0pzv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-30perf bench futex: Use cpumapsDavidlohr Bueso
It was reported that the whole futex bench breaks when dealing with non-contiguously numbered cpus. $ echo 0 | sudo tee /sys/devices/system/cpu/cpu3/online $ ./perf bench futex all perf: pthread_create: Operation not permitted Run summary [PID 14934]: 7 threads, each .... James had implemented an approach with cpumaps that use an in house flavor. Instead of re-inventing the wheel, I've redone the patch such that we use the perf's util/cpumap.c interface instead. Applies to all futex benchmarks. Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Originally-from: James Yang <james.yang@arm.com> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Kim Phillips <kim.phillips@arm.com> Link: http://lkml.kernel.org/r/20171127042101.3659-2-dave@stgolabs.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-29perf intel-pt: Improve build messages for files that differ from the kernelAdrian Hunter
Print file names of files that differ. For example, instead of: Warning: Intel PT: x86 instruction decoder differs from kernel print: Warning: Intel PT: x86 instruction decoder header at 'tools/perf/util/intel-pt-decoder/inat.h' differs from latest version at 'arch/x86/include/asm/inat.h' Reported-by: Ingo Molnar <mingo@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/r/1511253326-22308-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>