summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2019-04-01tools build: Implement libzstd feature check, LIBZSTD_DIR and NO_LIBZSTD definesAlexey Budankov
Implement libzstd feature check, NO_LIBZSTD and LIBZSTD_DIR defines to override Zstd library sources or disable the feature from the command line: $ make -C tools/perf LIBZSTD_DIR=/path/to/zstd/sources/ clean all $ make -C tools/perf NO_LIBZSTD=1 clean all Auto detection feature status is reported just before compilation starts. If your system has some version of the zstd library preinstalled then the build system finds and uses it during the build. If you still prefer to compile with some other version of zstd library you have capability to refer the compilation to that version using LIBZSTD_DIR define. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/9b4cd8b0-10a3-1f1e-8d6b-5922a7ca216b@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01tools lib traceevent: Rename input arguments and local variables of ↵Tzvetomir Stoyanov
libtraceevent from pevent to tep "pevent" to "tep" renaming of: - all "pevent" input arguments of libtraceevent internal functions. - all local "pevent" variables of libtraceevent. This makes the implementation consistent with the chosen naming convention, tep (trace event parser), and will avoid any confusion with the old pevent name Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/linux-trace-devel/20190401132111.13727-5-tstoyanov@vmware.com Link: http://lkml.kernel.org/r/20190401164344.944953447@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01perf tools, tools lib traceevent: Rename "pevent" member of struct ↵Tzvetomir Stoyanov
tep_event_filter to "tep" The member "pevent" of the struct tep_event_filter is renamed to "tep". This makes the struct consistent with the chosen naming convention: tep (trace event parser), instead of the old pevent. Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/linux-trace-devel/20190401132111.13727-4-tstoyanov@vmware.com Link: http://lkml.kernel.org/r/20190401164344.785896189@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01perf tools, tools lib traceevent: Rename "pevent" member of struct tep_event ↵Tzvetomir Stoyanov
to "tep" The member "pevent" of the struct tep_event is renamed to "tep". This makes the struct consistent with the chosen naming convention: tep (trace event parser), instead of the old pevent. Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/linux-trace-devel/20190401132111.13727-3-tstoyanov@vmware.com Link: http://lkml.kernel.org/r/20190401164344.627724996@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01tools lib traceevent: Rename input arguments of libtraceevent APIs from ↵Tzvetomir Stoyanov
pevent to tep Input arguments of libtraceevent APIs are renamed from "struct tep_handle *pevent" to "struct tep_handle *tep". This makes the API consistent with the chosen naming convention: tep (trace event parser), instead of the old pevent. Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/linux-trace-devel/20190401132111.13727-2-tstoyanov@vmware.com Link: http://lkml.kernel.org/r/20190401164344.465573837@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01tools tools, tools lib traceevent: Make traceevent APIs more consistentTzvetomir Stoyanov
Rename some traceevent APIs for consistency: tep_pid_is_registered() to tep_is_pid_registered() tep_file_bigendian() to tep_is_file_bigendian() to make the names and return values consistent with other tep_is_... APIs tep_data_lat_fmt() to tep_data_latency_format() to make the name more descriptive tep_host_bigendian() to tep_is_bigendian() tep_set_host_bigendian() to tep_set_local_bigendian() tep_is_host_bigendian() to tep_is_local_bigendian() "host" can be confused with VMs, and "local" is about the local machine. All tep_is_..._bigendian(struct tep_handle *tep) APIs return the saved data in the tep handle, while tep_is_bigendian() returns the running machine's endianness. All tep_is_... functions are modified to return bool value, instead of int. Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190327141946.4353-2-tstoyanov@vmware.com Link: http://lkml.kernel.org/r/20190401164344.288624897@goodmis.org [ Removed some extra parenthesis around return statements ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01tools lib traceevent: Remove call to exit() from tep_filter_add_filter_str()Tzvetomir Stoyanov
This patch removes call to exit() from tep_filter_add_filter_str(). A library function should not force the application to exit. In the current implementation tep_filter_add_filter_str() calls exit() when a special "test_filters" mode is set, used only for debugging purposes. When this mode is set and a filter is added - its string is printed to the console and exit() is called. This patch changes the logic - when in "test_filters" mode, the filter string is still printed, but the exit() is not called. It is up to the application to track when "test_filters" mode is set and to call exit, if needed. Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190326154328.28718-9-tstoyanov@vmware.com Link: http://lkml.kernel.org/r/20190401164344.121717482@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01tools lib traceevent: Remove tep filter trivial APIsTzvetomir Stoyanov
This patch removes trivial filter tep APIs: enum tep_filter_trivial_type tep_filter_event_has_trivial() tep_update_trivial() tep_filter_clear_trivial() Trivial filters is an optimization, used only in the first version of KernelShark. The API is deprecated, the next KernelShark release does not use it. Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190326154328.28718-4-tstoyanov@vmware.com Link: http://lkml.kernel.org/r/20190401164343.968458918@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01tools lib traceevent: Removed unneeded !! and return parenthesisSteven Rostedt (VMware)
As return is not a function we do not need parenthesis around the return value. Also, a function returning bool does not need to add !!. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com> Link: http://lkml.kernel.org/r/20190401164343.817886725@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01tools lib traceevent: Implement new traceevent APIs for accessing struct ↵Tzvetomir Stoyanov
tep_handler fields As struct tep_handler definition is not exposed as part of libtraceevent API, its fields cannot be accessed directly by the library users. This patch implements new APIs, which can be used to access the struct tep_handler fields: tep_get_event() - retrieves an event pointer at a specific index tep_get_first_event() - is modified to use tep_get_event() tep_clear_flag() - clears a tep handle flag tep_test_flag() - test if a given flag is set tep_get_header_timestamp_size() - returns the size of the timestamp stored in the header. tep_get_cpus() - returns the number of CPUs tep_is_old_format() - returns true if data was created by an older kernel with the old data format tep_set_print_raw() - have the output print in the raw format tep_set_test_filters() - debugging utility for testing tep filters Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/linux-trace-devel/20190325145017.30246-4-tstoyanov@vmware.com Link: http://lkml.kernel.org/r/20190401164343.679629539@goodmis.org [ Renamed some newly added "pevent" to "tep" ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01tools lib traceevent: Coding style fixesTzvetomir Stoyanov
Fixed few coding style problems in event-parse-api.c Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/linux-trace-devel/20190325145017.30246-3-tstoyanov@vmware.com Link: http://lkml.kernel.org/r/20190401164343.537086316@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01tools lib traceevent: Change description of few APIsTzvetomir Stoyanov
APIs descriptions should describe the purpose of the function, its parameters and return value. While working on man pages implementation, I noticed mismatches in the descriptions of few APIs. This patch changes the description of these APIs, making them consistent with the man pages: - tep_print_num_field() - tep_print_func_field() - tep_get_header_page_size() - tep_get_long_size() - tep_set_long_size() - tep_get_page_size() - tep_set_page_size() Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/linux-trace-devel/20190325145017.30246-2-tstoyanov@vmware.com Link: http://lkml.kernel.org/r/20190401164343.396759247@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01tools lib traceevent: Add more debugging to see various internal ring buffer ↵Steven Rostedt (Red Hat)
entries When trace-cmd report --debug is set, show the internal ring buffer entries like time-extends and padding. This requires adding new kbuffer API to retrieve these items. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com> Link: http://lkml.kernel.org/r/20190401164343.257591565@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01tools lib traceevent: Implement a new API, tep_list_events_copy()Tzvetomir Stoyanov
Existing API tep_list_events() is not thread safe, it uses the internal array sort_events to keep cache of the sorted events and reuses it. This patch implements a new API, tep_list_events_copy(), which allocates new sorted array each time it is called. It could be used when a sorted events functionality is needed in thread safe use cases. It is up to the caller to free the array. Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/linux-trace-devel/20181218133013.31094-1-tstoyanov@vmware.com Link: http://lkml.kernel.org/r/20190401164343.117437443@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01tools lib traceevent: Add mono clocks to be parsed in secondsSteven Rostedt (VMware)
The mono clocks can display in seconds instead of whole numbers: trace-cmd-521 [001] 99176715281005: sched_waking: comm=kworker/u16:2 pid=32118 prio=120 target_cpu=002 trace-cmd-521 [001] 99176715286349: sched_wake_idle_without_ipi: cpu=2 trace-cmd-521 [001] 99176715288047: sched_wakeup: kworker/u16:2:32118 [120] success=1 CPU:002 trace-cmd-521 [001] 99176715290022: sched_waking: comm=trace-cmd pid=523 prio=120 target_cpu=000 trace-cmd-521 [001] 99176715292332: sched_wake_idle_without_ipi: cpu=0 trace-cmd-521 [001] 99176715292855: sched_wakeup: trace-cmd:523 [120] success=1 CPU:000 trace-cmd-521 [001] 99176715300697: sched_stat_runtime: comm=trace-cmd pid=521 runtime=80233 [ns] vruntime=66705540554 [ns Break it up in seconds: trace-cmd-521 [001] 99176.715281: sched_waking: comm=kworker/u16:2 pid=32118 prio=120 target_cpu=002 trace-cmd-521 [001] 99176.715286: sched_wake_idle_without_ipi: cpu=2 trace-cmd-521 [001] 99176.715288: sched_wakeup: kworker/u16:2:32118 [120] success=1 CPU:002 trace-cmd-521 [001] 99176.715290: sched_waking: comm=trace-cmd pid=523 prio=120 target_cpu=000 trace-cmd-521 [001] 99176.715292: sched_wake_idle_without_ipi: cpu=0 trace-cmd-521 [001] 99176.715293: sched_wakeup: trace-cmd:523 [120] success=1 CPU:000 trace-cmd-521 [001] 99176.715301: sched_stat_runtime: comm=trace-cmd pid=521 runtime=80233 [ns] vruntime=66705540554 [ns] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com> Link: http://lkml.kernel.org/r/20190401164342.976554023@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01tools lib traceevent: Handle trace_printk() "%px"Steven Rostedt (VMware)
With security updates, %p in the kernel is hashed to protect true kernel locations. But trace_printk() is not allowed in production systems, and when a pointer is used, there are many times that the actual address is needed. "%px" produces the real address. But libtraceevent does not know how to handle that extension. Add it. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com> Link: http://lkml.kernel.org/r/20190401164342.837312153@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01perf list: Output tool eventsAndi Kleen
Add support in 'perf list' to output tool internal events, currently only 'duration_time'. Committer testing: $ perf list dur* List of pre-defined events (to be used in -e): duration_time [Tool event] Metric Groups: $ perf list sw List of pre-defined events (to be used in -e): alignment-faults [Software event] bpf-output [Software event] context-switches OR cs [Software event] cpu-clock [Software event] cpu-migrations OR migrations [Software event] dummy [Software event] emulation-faults [Software event] major-faults [Software event] minor-faults [Software event] page-faults OR faults [Software event] task-clock [Software event] duration_time [Tool event] $ perf list | grep duration duration_time [Tool event] [L1D miss outstandings duration in cycles] page walk duration are excluded in Skylake] load. EPT page walk duration are excluded in Skylake] page walk duration are excluded in Skylake] store. EPT page walk duration are excluded in Skylake] (instruction fetch) request. EPT page walk duration are excluded in instruction fetch request. EPT page walk duration are excluded in $ Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190326221823.11518-5-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01perf evsel: Support printing evsel name for 'duration_time'Andi Kleen
Implement printing the correct name for duration_time Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190326221823.11518-4-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01perf stat: Implement duration_time as a proper eventAndi Kleen
The perf metric expression use 'duration_time' internally to normalize events. Normal 'perf stat' without -x also prints the duration time. But when using -x, the interval is not output anywhere, which is inconvenient for any post processing which often wants to normalize values to time. So implement 'duration_time' as a proper perf event that can be specified explicitely with -e. The previous implementation of 'duration_time' only worked for metric processing. This adds the concept of a tool event that is handled by the tool. On the kernel level it is still mapped to the dummy software event, but the values are not read anymore, but instead computed by the tool. Add proper plumbing to handle this in the event parser, and display it in 'perf stat'. We don't want 'duration_time' to be added up, so it's only printed for the first CPU. % perf stat -e duration_time,cycles true Performance counter stats for 'true': 555,476 ns duration_time 771,958 cycles 0.000555476 seconds time elapsed 0.000644000 seconds user 0.000000000 seconds sys Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190326221823.11518-3-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01perf stat: Revert checks for duration_timeAndi Kleen
This reverts e864c5ca145e ("perf stat: Hide internal duration_time counter") but doing it manually since the code has now moved to a different file. The next patch will properly implement duration_time as a full event, so no need to hide it anymore. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190326221823.11518-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01perf list: Fix s390 counter long description for L1D_RO_EXCL_WRITESThomas Richter
Command # perf list --long-desc pmu lists the long description of the available counters. For counter named L1D_RO_EXCL_WRITES on machine types 3906 and 3907 the long description contains the counter number 'Counter:128 Name:' prefix. This is wrong. The fix changes the description text and removes this prefix. Output before: [root@m35lp76 perf]# ./perf list --long-desc pmu ... L1D_ONDRAWER_L4_SOURCED_WRITES [A directory write to the Level-1 Data cache directory where the returned cache line was sourced from On-Drawer Level-4 cache] L1D_RO_EXCL_WRITES [Counter:128 Name:L1D_RO_EXCL_WRITES A directory write to the Level-1 Data cache where the line was originally in a Read-Only state in the cache but has been updated to be in the Exclusive state that allows stores to the cache line] ... Output after: [root@m35lp76 perf]# ./perf list --long-desc pmu ... L1D_ONDRAWER_L4_SOURCED_WRITES [A directory write to the Level-1 Data cache directory where the returned cache line was sourced from On-Drawer Level-4 cache] L1D_RO_EXCL_WRITES [L1D_RO_EXCL_WRITES A directory write to the Level-1 Data cache where the line was originally in a Read-Only state in the cache but has been updated to be in the Exclusive state that allows stores to the cache line] ... Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Fixes: 109d59b900e7 ("perf vendor events s390: Add JSON files for IBM z14") Link: http://lkml.kernel.org/r/20190329133337.60255-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01perf tools: Add header defining used namespace struct to event.hArnaldo Carvalho de Melo
When adding the 'struct namespaces_event' to event.h, referencing the 'struct perf_ns_link_info' type, we forgot to add the header where it is defined, getting that definition only by sheer luck. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: f3b3614a284d ("perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info") Link: https://lkml.kernel.org/n/tip-qkrld0v7boc9uabjbd8csxux@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01perf trace beauty renameat: No need to include linux/fs.hArnaldo Carvalho de Melo
There is no use for what is in that file, as everything is built by the tools/perf/trace/beauty/rename_flags.sh script from the copied kernel headers, the end result being: $ cat /tmp/build/perf/trace/beauty/generated/rename_flags_array.c static const char *rename_flags[] = { [0 + 1] = "NOREPLACE", [1 + 1] = "EXCHANGE", [2 + 1] = "WHITEOUT", }; $ I.e. no use of any defines from uapi/linux/fs.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-lgugmfa8z4bpw5zsbuoitllb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01perf augmented_raw_syscalls: Use a PERCPU_ARRAY map to copy more string bytesArnaldo Carvalho de Melo
The previous method, copying to the BPF stack limited us in how many bytes we could copy from strings, use a PERCPU_ARRAY map like devised by the sysdig guys[1] to copy more bytes: Before: # trace --no-inherit -e openat touch `python -c "print "$s" 'a' * 2000"` touch: cannot touch 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa': File name too long openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", O_CREAT|O_NOCTTY|O_NONBLOCK|O_WRONLY, S_IRUGO|S_IWUGO) = -1 ENAMETOOLONG (File name too long) openat(AT_FDCWD, "/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/usr/share/locale/en_US.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/usr/share/locale/en_US.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory) <SNIP some openat calls> # After: [root@quaco acme]# trace --no-inherit -e openat touch `python -c "print "$s" 'a' * 2000"` <STRIP what is the same as in the 'before' part> openat(AT_FDCWD, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", O_CREAT|O_NOCTTY|O_NONBLOC) = -1 ENAMETOOLONG (File name too long) <STRIP what is the same as in the 'before' part> If we leave something like 'perf trace -e string' to trace all syscalls with a string, and then do some 'perf top', to get some annotation for the augmented_raw_syscalls.o BPF program we get: │ → callq *ffffffffc45576d1 ▒ │ augmented_args->filename.size = probe_read_str(&augmented_args->filename.value, ▒ 0.05 │ mov %eax,0x40(%r13) Looking with pahole, expanding types, asking for hex offsets and sizes, and use of BTF type information to see what is at that 0x40 offset from %r13: # pahole -F btf -C augmented_args_filename --expand_types --hex /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o struct augmented_args_filename { struct syscall_enter_args { long long unsigned int common_tp_fields; /* 0 0x8 */ long int syscall_nr; /* 0x8 0x8 */ long unsigned int args[6]; /* 0x10 0x30 */ } args; /* 0 0x40 */ /* --- cacheline 1 boundary (64 bytes) --- */ struct augmented_filename { unsigned int size; /* 0x40 0x4 */ int reserved; /* 0x44 0x4 */ char value[4096]; /* 0x48 0x1000 */ } filename; /* 0x40 0x1008 */ /* size: 4168, cachelines: 66, members: 2 */ /* last cacheline: 8 bytes */ }; # Then looking if PATH_MAX leaves some signature in the tests: │ if (augmented_args->filename.size < sizeof(augmented_args->filename.value)) { ▒ │ cmp $0xfff,%rdi 0xfff == 4095 sizeof(augmented_args->filename.value) == PATH_MAX == 4096 [1] https://sysdig.com/blog/the-art-of-writing-ebpf-programs-a-primer/ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrii Nakryiko <andriin@fb.com> Cc: Daniel Borkmann <borkmann@iogearbox.net> Cc: Gianluca Borello <g.borello@gmail.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> cc: Martin Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Yonghong Song <yhs@fb.com> Link: https://lkml.kernel.org/n/tip-76gce2d2ghzq537ubwhjkone@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01perf augmented_raw_syscalls: Copy strings from all syscalls with 1st or 2nd ↵Arnaldo Carvalho de Melo
string arg Gets the augmented_raw_syscalls a bit more useful as-is, add a comment stating that the intent is to have all this in a map populated by userspace via the 'syscalls' BPF map, that right now has only a flag stating if the syscall is filtered or not. With it: # grep -B1 augmented_raw ~/.perfconfig [trace] add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o # # perf trace -e string weechat/6001 stat("/etc/localtime", 0x7ffe22c23d10) = 0 gnome-shell/1943 openat(AT_FDCWD, "/proc/self/stat", O_RDONLY) = 81 weechat/6001 stat("/etc/localtime", 0x7ffe22c23d10) = 0 gmain/2475 inotify_add_watch(20<anon_inode:inotify>, "/home/acme/.config/firewall", 16789454) = -1 ENOENT (No such file or directory) gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "", 16789454) = -1 ENOENT (No such file or directory) gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/var/cache/app-info/yaml", 16789454) = -1 ENOENT (No such file or directory) gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/var/lib/app-info/xmls", 16789454) = -1 ENOENT (No such file or directory) gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/var/lib/app-info/yaml", 16789454) = -1 ENOENT (No such file or directory) gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/usr/share/app-info/yaml", 16789454) = -1 ENOENT (No such file or directory) gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/usr/local/share/app-info/xmls", 16789454) = -1 ENOENT (No such file or directory) gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/usr/local/share/app-info/yaml", 16789454) = -1 ENOENT (No such file or directory) gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/home/acme/.local/share/app-info/yaml", 16789454) = -1 ENOENT (No such file or directory) gmain/1121 inotify_add_watch(12<anon_inode:inotify>, "/etc/NetworkManager/VPN", 16789454) = -1 ENOENT (No such file or directory) weechat/6001 stat("/etc/localtime", 0x7ffe22c23d10) = 0 gmain/2050 inotify_add_watch(8<anon_inode:inotify>, "/home/acme/~", 16789454) = -1 ENOENT (No such file or directory) gmain/2521 inotify_add_watch(6<anon_inode:inotify>, "/var/lib/fwupd/remotes.d/lvfs-testing", 16789454) = -1 ENOENT (No such file or directory) weechat/6001 stat("/etc/localtime", 0x7ffe22c23d10) = 0 DOM Worker/22714 ... [continued]: openat()) = 257 FS Broker 3982/3990 openat(AT_FDCWD, "/dev/urandom", O_RDONLY|O_CLOEXEC|O_NOCTTY) = 187 DOMCacheThread/16652 mkdir("/home/acme/.mozilla/firefox/ina67tev.default/storage/default/https+++web.whatsapp.com/cache/morgue/192", S_IRUGO|S_IXUGO|S_IWUSR) = -1 EEXIST (File exists) ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-a1hxffoy8t43e0wq6bzhp23u@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01perf trace: Add 'string' event alias to select syscalls with string argsArnaldo Carvalho de Melo
Will be used in conjunction with the change to augmented_raw_syscalls.c in the next cset that adds all syscalls with a first or second arg string. With just what we have in the syscall tracepoints we get: # perf trace -e string ls > /dev/null ? ( ): ls/22382 ... [continued]: execve()) = 0 0.043 ( 0.004 ms): ls/22382 access(filename: 0x51ad420, mode: R) = -1 ENOENT (No such file or directory) 0.051 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x51aa8b3, flags: RDONLY|CLOEXEC) = 3 0.071 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x51b4d00, flags: RDONLY|CLOEXEC) = 3 0.138 ( 0.009 ms): ls/22382 openat(dfd: CWD, filename: 0x51684d0, flags: RDONLY|CLOEXEC) = 3 0.192 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x51689c0, flags: RDONLY|CLOEXEC) = 3 0.255 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x5168eb0, flags: RDONLY|CLOEXEC) = 3 0.342 ( 0.003 ms): ls/22382 openat(dfd: CWD, filename: 0x51693a0, flags: RDONLY|CLOEXEC) = 3 0.380 ( 0.003 ms): ls/22382 openat(dfd: CWD, filename: 0x5169950, flags: RDONLY|CLOEXEC) = 3 0.670 ( 0.011 ms): ls/22382 statfs(pathname: 0x515c783, buf: 0x7fff54d75b70) = 0 0.683 ( 0.005 ms): ls/22382 statfs(pathname: 0x515c783, buf: 0x7fff54d75a60) = 0 0.725 ( 0.004 ms): ls/22382 access(filename: 0x515c7ab) = 0 0.744 ( 0.005 ms): ls/22382 openat(dfd: CWD, filename: 0x50fba20, flags: RDONLY|CLOEXEC) = 3 0.793 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x9e3e8390, flags: RDONLY|CLOEXEC|DIRECTORY|NONBLOCK) = 3 0.921 ( 0.006 ms): ls/22382 openat(dfd: CWD, filename: 0x50f7d90) = 3 # If we put the vfs_getname probe point in place: # perf probe 'vfs_getname=getname_flags:73 pathname=result->name:string' Added new events: probe:vfs_getname (on getname_flags:73 with pathname=result->name:string) probe:vfs_getname_1 (on getname_flags:73 with pathname=result->name:string) You can now use it in all perf tools, such as: perf record -e probe:vfs_getname_1 -aR sleep 1 # perf trace -e string ls > /dev/null ? ( ): ls/22440 ... [continued]: execve()) = 0 0.048 ( 0.008 ms): ls/22440 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT (No such file or directory) 0.061 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: RDONLY|CLOEXEC) = 3 0.092 ( 0.008 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libselinux.so.1, flags: RDONLY|CLOEXEC) = 3 0.165 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libcap.so.2, flags: RDONLY|CLOEXEC) = 3 0.216 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: RDONLY|CLOEXEC) = 3 0.282 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libpcre2-8.so.0, flags: RDONLY|CLOEXEC) = 3 0.340 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libdl.so.2, flags: RDONLY|CLOEXEC) = 3 0.383 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libpthread.so.0, flags: RDONLY|CLOEXEC) = 3 0.697 ( 0.021 ms): ls/22440 statfs(pathname: /sys/fs/selinux, buf: 0x7ffee7dc9010) = 0 0.720 ( 0.007 ms): ls/22440 statfs(pathname: /sys/fs/selinux, buf: 0x7ffee7dc8f00) = 0 0.757 ( 0.007 ms): ls/22440 access(filename: /etc/selinux/config) = 0 0.779 ( 0.009 ms): ls/22440 openat(dfd: CWD, filename: /usr/lib/locale/locale-archive, flags: RDONLY|CLOEXEC) = 3 0.830 ( 0.006 ms): ls/22440 openat(dfd: CWD, filename: ., flags: RDONLY|CLOEXEC|DIRECTORY|NONBLOCK) = 3 0.958 ( 0.010 ms): ls/22440 openat(dfd: CWD, filename: /usr/lib64/gconv/gconv-modules.cache) = 3 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-6fh1myvn7ulf4xwq9iz3o776@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28perf pmu: Fix parser error for uncore event aliasKan Liang
Perf fails to parse uncore event alias, for example: # perf stat -e unc_m_clockticks -a --no-merge sleep 1 event syntax error: 'unc_m_clockticks' \___ parser error Current code assumes that the event alias is from one specific PMU. To find the PMU, perf strcmps the PMU name of event alias with the real PMU name on the system. However, the uncore event alias may be from multiple PMUs with common prefix. The PMU name of uncore event alias is the common prefix. For example, UNC_M_CLOCKTICKS is clock event for iMC, which include 6 PMUs with the same prefix "uncore_imc" on a skylake server. The real PMU names on the system for iMC are uncore_imc_0 ... uncore_imc_5. The strncmp is used to only check the common prefix for uncore event alias. With the patch: # perf stat -e unc_m_clockticks -a --no-merge sleep 1 Performance counter stats for 'system wide': 723,594,722 unc_m_clockticks [uncore_imc_5] 724,001,954 unc_m_clockticks [uncore_imc_3] 724,042,655 unc_m_clockticks [uncore_imc_1] 724,161,001 unc_m_clockticks [uncore_imc_4] 724,293,713 unc_m_clockticks [uncore_imc_2] 724,340,901 unc_m_clockticks [uncore_imc_0] 1.002090060 seconds time elapsed Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: stable@vger.kernel.org Fixes: ea1fa48c055f ("perf stat: Handle different PMU names with common prefix") Link: http://lkml.kernel.org/r/1552672814-156173-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28perf scripts python: exported-sql-viewer.py: Fix python3 supportAdrian Hunter
Unlike python2, python3 strings are not compatible with byte strings. That results in disassembly not working for the branches reports. Fixup those places overlooked in the port to python3. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Fixes: beda0e725e5f ("perf script python: Add Python3 support to exported-sql-viewer.py") Link: http://lkml.kernel.org/r/20190327072826.19168-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28perf scripts python: exported-sql-viewer.py: Fix never-ending loopAdrian Hunter
pyside version 1 fails to handle python3 large integers in some cases, resulting in Qt getting into a never-ending loop. This affects: samples Table samples_view Table All branches Report Selected branches Report Add workarounds for those cases. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Fixes: beda0e725e5f ("perf script python: Add Python3 support to exported-sql-viewer.py") Link: http://lkml.kernel.org/r/20190327072826.19168-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28perf machine: Update kernel map address and re-order properlyWei Li
Since commit 1fb87b8e9599 ("perf machine: Don't search for active kernel start in __machine__create_kernel_maps"), the __machine__create_kernel_maps() just create a map what start and end are both zero. Though the address will be updated later, the order of map in the rbtree may be incorrect. The commit ee05d21791db ("perf machine: Set main kernel end address properly") fixed the logic in machine__create_kernel_maps(), but it's still wrong in function machine__process_kernel_mmap_event(). To reproduce this issue, we need an environment which the module address is before the kernel text segment. I tested it on an aarch64 machine with kernel 4.19.25: [root@localhost hulk]# grep _stext /proc/kallsyms ffff000008081000 T _stext [root@localhost hulk]# grep _etext /proc/kallsyms ffff000009780000 R _etext [root@localhost hulk]# tail /proc/modules hisi_sas_v2_hw 77824 0 - Live 0xffff00000191d000 nvme_core 126976 7 nvme, Live 0xffff0000018b6000 mdio 20480 1 ixgbe, Live 0xffff0000018ab000 hisi_sas_main 106496 1 hisi_sas_v2_hw, Live 0xffff000001861000 hns_mdio 20480 2 - Live 0xffff000001822000 hnae 28672 3 hns_dsaf,hns_enet_drv, Live 0xffff000001815000 dm_mirror 40960 0 - Live 0xffff000001804000 dm_region_hash 32768 1 dm_mirror, Live 0xffff0000017f5000 dm_log 32768 2 dm_mirror,dm_region_hash, Live 0xffff0000017e7000 dm_mod 315392 17 dm_mirror,dm_log, Live 0xffff000001780000 [root@localhost hulk]# Before fix: [root@localhost bin]# perf record sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ] [root@localhost bin]# perf buildid-list -i perf.data 4c4e46c971ca935f781e603a09b52a92e8bdfee8 [vdso] [root@localhost bin]# perf buildid-list -i perf.data -H 0000000000000000000000000000000000000000 /proc/kcore [root@localhost bin]# After fix: [root@localhost tools]# ./perf/perf record sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ] [root@localhost tools]# ./perf/perf buildid-list -i perf.data 28a6c690262896dbd1b5e1011ed81623e6db0610 [kernel.kallsyms] 106c14ce6e4acea3453e484dc604d66666f08a2f [vdso] [root@localhost tools]# ./perf/perf buildid-list -i perf.data -H 28a6c690262896dbd1b5e1011ed81623e6db0610 /proc/kcore Signed-off-by: Wei Li <liwei391@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Li Bin <huawei.libin@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190228092003.34071-1-liwei391@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28tools headers uapi: Sync powerpc's asm/kvm.h copy with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes in: 2b57ecd0208f ("KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char()") That don't cause any changes in the tools. This silences this perf build warning: Warning: Kernel ABI header at 'tools/arch/powerpc/include/uapi/asm/kvm.h' differs from latest version at 'arch/powerpc/include/uapi/asm/kvm.h' diff -u tools/arch/powerpc/include/uapi/asm/kvm.h arch/powerpc/include/uapi/asm/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Link: https://lkml.kernel.org/n/tip-4pb7ywp9536hub2pnj4hu6i4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28tools headers: Update x86's syscall_64.tbl and uapi/asm-generic/unistdArnaldo Carvalho de Melo
To pick up the changes introduced in the following csets: 2b188cc1bb85 ("Add io_uring IO interface") edafccee56ff ("io_uring: add support for pre-mapped user IO buffers") 3eb39f47934f ("signal: add pidfd_send_signal() syscall") This makes 'perf trace' to become aware of these new syscalls, so that one can use them like 'perf trace -e ui_uring*,*signal' to do a system wide strace-like session looking at those syscalls, for instance. For example: # perf trace -s io_uring-cp ~acme/isos/RHEL-x86_64-dvd1.iso ~/bla Summary of events: io_uring-cp (383), 1208866 events, 100.0% syscall calls total min avg max stddev (msec) (msec) (msec) (msec) (%) -------------- ------ -------- ------ ------- ------- ------ io_uring_enter 605780 2955.615 0.000 0.005 33.804 1.94% openat 4 459.446 0.004 114.861 459.435 100.00% munmap 4 0.073 0.009 0.018 0.042 44.03% mmap 10 0.054 0.002 0.005 0.026 43.24% brk 28 0.038 0.001 0.001 0.003 7.51% io_uring_setup 1 0.030 0.030 0.030 0.030 0.00% mprotect 4 0.014 0.002 0.004 0.005 14.32% close 5 0.012 0.001 0.002 0.004 28.87% fstat 3 0.006 0.001 0.002 0.003 35.83% read 4 0.004 0.001 0.001 0.002 13.58% access 1 0.003 0.003 0.003 0.003 0.00% lseek 3 0.002 0.001 0.001 0.001 9.00% arch_prctl 2 0.002 0.001 0.001 0.001 0.69% execve 1 0.000 0.000 0.000 0.000 0.00% # # perf trace -e io_uring* -s io_uring-cp ~acme/isos/RHEL-x86_64-dvd1.iso ~/bla Summary of events: io_uring-cp (390), 1191250 events, 100.0% syscall calls total min avg max stddev (msec) (msec) (msec) (msec) (%) -------------- ------ -------- ------ ------ ------ ------ io_uring_enter 597093 2706.060 0.001 0.005 14.761 1.10% io_uring_setup 1 0.038 0.038 0.038 0.038 0.00% # More work needed to make the tools/perf/examples/bpf/augmented_raw_syscalls.c BPF program to copy the 'struct io_uring_params' arguments to perf's ring buffer so that 'perf trace' can use the BTF info put in place by pahole's conversion of the kernel DWARF and then auto-beautify those arguments. This patch produces the expected change in the generated syscalls table for x86_64: --- /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before 2019-03-26 13:37:46.679057774 -0300 +++ /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c 2019-03-26 13:38:12.755990383 -0300 @@ -334,5 +334,9 @@ static const char *syscalltbl_x86_64[] = [332] = "statx", [333] = "io_pgetevents", [334] = "rseq", + [424] = "pidfd_send_signal", + [425] = "io_uring_setup", + [426] = "io_uring_enter", + [427] = "io_uring_register", }; -#define SYSCALLTBL_x86_64_MAX_ID 334 +#define SYSCALLTBL_x86_64_MAX_ID 427 This silences these perf build warnings: Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h' diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl' diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Christian Brauner <christian@brauner.io> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Link: https://lkml.kernel.org/n/tip-p0ars3otuc52x5iznf21shhw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28tools headers uapi: Update drm/i915_drm.hArnaldo Carvalho de Melo
To get the changes in: e46c2e99f600 ("drm/i915: Expose RPCS (SSEU) configuration to userspace (Gen11 only)") That don't cause changes in the generated perf binaries. To silence this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h' diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://lkml.kernel.org/n/tip-h6bspm1nomjnpr90333rrx7q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28tools arch x86: Sync asm/cpufeatures.h with the kernel sourcesArnaldo Carvalho de Melo
To get the changes from: 52f64909409c ("x86: Add TSX Force Abort CPUID/MSR") That don't cause any changes in the generated perf binaries. And silence this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h' diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/n/tip-zv8kw8vnb1zppflncpwfsv2w@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28tools headers uapi: Sync linux/fcntl.h to get the F_SEAL_FUTURE_WRITE additionArnaldo Carvalho de Melo
To get the changes in: ab3948f58ff8 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd") And silence this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/fcntl.h' differs from latest version at 'include/uapi/linux/fcntl.h' diff -u tools/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-lvfx5cgf0xzmdi9mcjva1ttl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28tools headers uapi: Sync asm-generic/mman-common.h and linux/mman.hArnaldo Carvalho de Melo
To deal with the move of some defines from asm-generic/mmap-common.h to linux/mman.h done in: 746c9398f5ac ("arch: move common mmap flags to linux/mman.h") The generated mmap_flags array stays the same: $ tools/perf/trace/beauty/mmap_flags.sh static const char *mmap_flags[] = { [ilog2(0x40) + 1] = "32BIT", [ilog2(0x01) + 1] = "SHARED", [ilog2(0x02) + 1] = "PRIVATE", [ilog2(0x10) + 1] = "FIXED", [ilog2(0x20) + 1] = "ANONYMOUS", [ilog2(0x100000) + 1] = "FIXED_NOREPLACE", [ilog2(0x0100) + 1] = "GROWSDOWN", [ilog2(0x0800) + 1] = "DENYWRITE", [ilog2(0x1000) + 1] = "EXECUTABLE", [ilog2(0x2000) + 1] = "LOCKED", [ilog2(0x4000) + 1] = "NORESERVE", [ilog2(0x8000) + 1] = "POPULATE", [ilog2(0x10000) + 1] = "NONBLOCK", [ilog2(0x20000) + 1] = "STACK", [ilog2(0x40000) + 1] = "HUGETLB", [ilog2(0x80000) + 1] = "SYNC", }; $ And to have the system's sys/mman.h find the definition of MAP_SHARED and MAP_PRIVATE, make sure they are defined in the tools/ mman-common.h in a way that keeps it the same as the kernel's, need for keeping the Android's NDK cross build working. This silences these perf build warnings: Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/mman-common.h' differs from latest version at 'include/uapi/asm-generic/mman-common.h' diff -u tools/include/uapi/asm-generic/mman-common.h include/uapi/asm-generic/mman-common.h Warning: Kernel ABI header at 'tools/include/uapi/linux/mman.h' differs from latest version at 'include/uapi/linux/mman.h' diff -u tools/include/uapi/linux/mman.h include/uapi/linux/mman.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-h80ycpc6pedg9s5z2rwpy6ws@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28perf evsel: Fix max perf_event_attr.precise_ip detectionJiri Olsa
After a discussion with Andi, move the perf_event_attr.precise_ip detection for maximum precise config (via :P modifier or for default cycles event) to perf_evsel__open(). The current detection in perf_event_attr__set_max_precise_ip() is tricky, because precise_ip config is specific for given event and it currently checks only hw cycles. We now check for valid precise_ip value right after failing sys_perf_event_open() for specific event, before any of the perf_event_attr fallback code gets executed. This way we get the proper config in perf_event_attr together with allowed precise_ip settings. We can see that code activity with -vv, like: $ perf record -vv ls ... ------------------------------------------------------------ perf_event_attr: size 112 { sample_period, sample_freq } 4000 ... precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ksymbol 1 ------------------------------------------------------------ sys_perf_event_open: pid 9926 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open failed, error -95 decreasing precise_ip by one (2) ------------------------------------------------------------ perf_event_attr: size 112 { sample_period, sample_freq } 4000 ... precise_ip 2 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ksymbol 1 ------------------------------------------------------------ sys_perf_event_open: pid 9926 cpu 0 group_fd -1 flags 0x8 = 4 ... Suggested-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/n/tip-dkvxxbeg7lu74155d4jhlmc9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28perf intel-pt: Fix TSC slipAdrian Hunter
A TSC packet can slip past MTC packets so that the timestamp appears to go backwards. One estimate is that can be up to about 40 CPU cycles, which is certainly less than 0x1000 TSC ticks, but accept slippage an order of magnitude more to be on the safe side. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Fixes: 79b58424b821c ("perf tools: Add Intel PT support for decoding MTC packets") Link: http://lkml.kernel.org/r/20190325135135.18348-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28perf cs-etm: Add missing case valueSolomon Tan
The following error was thrown when compiling `tools/perf` using OpenCSD v0.11.1. This patch fixes said error. CC util/intel-pt-decoder/intel-pt-log.o CC util/cs-etm-decoder/cs-etm-decoder.o util/cs-etm-decoder/cs-etm-decoder.c: In function ‘cs_etm_decoder__buffer_range’: util/cs-etm-decoder/cs-etm-decoder.c:370:2: error: enumeration value ‘OCSD_INSTR_WFI_WFE’ not handled in switch [-Werror=switch-enum] switch (elem->last_i_type) { ^~~~~~ CC util/intel-pt-decoder/intel-pt-decoder.o cc1: all warnings being treated as errors Because `OCSD_INSTR_WFI_WFE` case was added only in v0.11.0, the minimum required OpenCSD library version for this patch is no longer v0.10.0. Signed-off-by: Solomon Tan <solomonbobstoner@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: Suzuki K Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190322052255.GA4809@w-OptiPlex-7050 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "Fixes here and there, a couple new device IDs, as usual: 1) Fix BQL race in dpaa2-eth driver, from Ioana Ciornei. 2) Fix 64-bit division in iwlwifi, from Arnd Bergmann. 3) Fix documentation for some eBPF helpers, from Quentin Monnet. 4) Some UAPI bpf header sync with tools, also from Quentin Monnet. 5) Set descriptor ownership bit at the right time for jumbo frames in stmmac driver, from Aaro Koskinen. 6) Set IFF_UP properly in tun driver, from Eric Dumazet. 7) Fix load/store doubleword instruction generation in powerpc eBPF JIT, from Naveen N. Rao. 8) nla_nest_start() return value checks all over, from Kangjie Lu. 9) Fix asoc_id handling in SCTP after the SCTP_*_ASSOC changes this merge window. From Marcelo Ricardo Leitner and Xin Long. 10) Fix memory corruption with large MTUs in stmmac, from Aaro Koskinen. 11) Do not use ipv4 header for ipv6 flows in TCP and DCCP, from Eric Dumazet. 12) Fix topology subscription cancellation in tipc, from Erik Hugne. 13) Memory leak in genetlink error path, from Yue Haibing. 14) Valid control actions properly in packet scheduler, from Davide Caratti. 15) Even if we get EEXIST, we still need to rehash if a shrink was delayed. From Herbert Xu. 16) Fix interrupt mask handling in interrupt handler of r8169, from Heiner Kallweit. 17) Fix leak in ehea driver, from Wen Yang" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (168 commits) dpaa2-eth: fix race condition with bql frame accounting chelsio: use BUG() instead of BUG_ON(1) net: devlink: skip info_get op call if it is not defined in dumpit net: phy: bcm54xx: Encode link speed and activity into LEDs tipc: change to check tipc_own_id to return in tipc_net_stop net: usb: aqc111: Extend HWID table by QNAP device net: sched: Kconfig: update reference link for PIE net: dsa: qca8k: extend slave-bus implementations net: dsa: qca8k: remove leftover phy accessors dt-bindings: net: dsa: qca8k: support internal mdio-bus dt-bindings: net: dsa: qca8k: fix example net: phy: don't clear BMCR in genphy_soft_reset bpf, libbpf: clarify bump in libbpf version info bpf, libbpf: fix version info and add it to shared object rxrpc: avoid clang -Wuninitialized warning tipc: tipc clang warning net: sched: fix cleanup NULL pointer exception in act_mirr r8169: fix cable re-plugging issue net: ethernet: ti: fix possible object reference leak net: ibm: fix possible object reference leak ...
2019-03-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf 2019-03-24 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) libbpf verision fix up from Daniel. 2) fix liveness propagation from Jakub. 3) fix verbose print of refcounted regs from Martin. 4) fix for large map allocations from Martynas. 5) fix use after free in sanitize_ptr_alu from Xu. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-24bpf, libbpf: clarify bump in libbpf version infoDaniel Borkmann
The current documentation suggests that we would need to bump the libbpf version on every change. Lets clarify this a bit more and reflect what we do today in practice, that is, bumping it once per development cycle. Fixes: 76d1b894c515 ("libbpf: Document API and ABI conventions") Reported-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-03-24bpf, libbpf: fix version info and add it to shared objectDaniel Borkmann
Even though libbpf's versioning script for the linker (libbpf.map) is pointing to 0.0.2, the BPF_EXTRAVERSION in the Makefile has not been updated along with it and is therefore still on 0.0.1. While fixing up, I also noticed that the generated shared object versioning information is missing, typical convention is to have a linker name (libbpf.so), soname (libbpf.so.0) and real name (libbpf.so.0.0.2) for library management. This is based upon the LIBBPF_VERSION as well. The build will then produce the following bpf libraries: # ll libbpf* libbpf.a libbpf.so -> libbpf.so.0.0.2 libbpf.so.0 -> libbpf.so.0.0.2 libbpf.so.0.0.2 # readelf -d libbpf.so.0.0.2 | grep SONAME 0x000000000000000e (SONAME) Library soname: [libbpf.so.0] And install them accordingly: # rm -rf /tmp/bld; mkdir /tmp/bld; make -j$(nproc) O=/tmp/bld install Auto-detecting system features: ... libelf: [ on ] ... bpf: [ on ] CC /tmp/bld/libbpf.o CC /tmp/bld/bpf.o CC /tmp/bld/nlattr.o CC /tmp/bld/btf.o CC /tmp/bld/libbpf_errno.o CC /tmp/bld/str_error.o CC /tmp/bld/netlink.o CC /tmp/bld/bpf_prog_linfo.o CC /tmp/bld/libbpf_probes.o CC /tmp/bld/xsk.o LD /tmp/bld/libbpf-in.o LINK /tmp/bld/libbpf.a LINK /tmp/bld/libbpf.so.0.0.2 LINK /tmp/bld/test_libbpf INSTALL /tmp/bld/libbpf.a INSTALL /tmp/bld/libbpf.so.0.0.2 # ll /usr/local/lib64/libbpf.* /usr/local/lib64/libbpf.a /usr/local/lib64/libbpf.so -> libbpf.so.0.0.2 /usr/local/lib64/libbpf.so.0 -> libbpf.so.0.0.2 /usr/local/lib64/libbpf.so.0.0.2 Fixes: 1bf4b05810fe ("tools: bpftool: add probes for eBPF program types") Fixes: 1b76c13e4b36 ("bpf tools: Introduce 'bpf' library and add bpf feature check") Reported-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-03-24Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Thomas Gleixner: "A larger set of perf updates. Not all of them are strictly fixes, but that's solely the tip maintainers fault as they let the timely -rc1 pull request fall through the cracks for various reasons including travel. So I'm sending this nevertheless because rebasing and distangling fixes and updates would be a mess and risky as well. As of tomorrow, a strict fixes separation is happening again. Sorry for the slip-up. Kernel: - Handle RECORD_MMAP vs. RECORD_MMAP2 correctly so different consumers of the mmap event get what they requested. Tools: - A larger set of updates to perf record/report/scripts vs. time stamp handling - More Python3 fixups - A pile of memory leak plumbing - perf BPF improvements and fixes - Finalize the perf.data directory storage" [ Note: the kernel part is strictly a fix, the updates are purely to tooling - Linus ] * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits) perf bpf: Show more BPF program info in print_bpf_prog_info() perf bpf: Extract logic to create program names from perf_event__synthesize_one_bpf_prog() perf tools: Save bpf_prog_info and BTF of new BPF programs perf evlist: Introduce side band thread perf annotate: Enable annotation of BPF programs perf build: Check what binutils's 'disassembler()' signature to use perf bpf: Process PERF_BPF_EVENT_PROG_LOAD for annotation perf symbols: Introduce DSO_BINARY_TYPE__BPF_PROG_INFO perf feature detection: Add -lopcodes to feature-libbfd perf top: Add option --no-bpf-event perf bpf: Save BTF information as headers to perf.data perf bpf: Save BTF in a rbtree in perf_env perf bpf: Save bpf_prog_info information as headers to perf.data perf bpf: Save bpf_prog_info in a rbtree in perf_env perf bpf: Make synthesize_bpf_events() receive perf_session pointer instead of perf_tool perf bpf: Synthesize bpf events with bpf_program__get_prog_info_linear() bpftool: use bpf_program__get_prog_info_linear() in prog.c:do_dump() tools lib bpf: Introduce bpf_program__get_prog_info_linear() perf record: Replace option --bpf-event with --no-bpf-event perf tests: Fix a memory leak in test__perf_evsel__tp_sched_test() ...
2019-03-24Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core fixes from Thomas Gleixner: "Two small fixes: - Move the large objtool_file struct off the stack so objtool works in setups with a tight stack limit. - Make a few variables static in the watchdog core code" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: watchdog/core: Make variables static objtool: Move objtool_file struct off the stack
2019-03-22Merge tag 'perf-core-for-mingo-5.1-20190321' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core improvements and fixes from Arnaldo: BPF: Song Liu: - Add support for annotating BPF programs, using the PERF_RECORD_BPF_EVENT and PERF_RECORD_KSYMBOL recently added to the kernel and plugging binutils's libopcodes disassembly of BPF programs with the existing annotation interfaces in 'perf annotate', 'perf report' and 'perf top' various output formats (--stdio, --stdio2, --tui). perf list: Andi Kleen: - Filter metrics when using substring search. perf record: Andi Kleen: - Allow to limit number of reported perf.data files - Clarify help for --switch-output. perf report: Andi Kleen - Indicate JITed code better. - Show all sort keys in help output. perf script: Andi Kleen: - Support relative time. perf stat: Andi Kleen: - Improve scaling. General: Changbin Du: - Fix some mostly error path memory and reference count leaks found using gcc's ASan and UBSan. Vendor events: Mamatha Inamdar: - Remove P8 HW events which are not supported. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2019-03-22Merge tag 'perf-core-for-mingo-5.1-20190311' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core improvements and fixes from Arnaldo: kernel: Stephane Eranian : - Restore mmap record type correctly when handling PERF_RECORD_MMAP2 events, as the same template is used for all the threads interested in mmap events, some may want just PERF_RECORD_MMAP, while some may want the extra info in MMAP2 records. perf probe: Adrian Hunter: - Fix getting the kernel map, because since changes related to x86 PTI entry trampolines handling, there are more than one kernel map. perf script: Andi Kleen: - Support insn output for normal samples, i.e.: perf script -F ip,sym,insn --xed Will fetch the sample IP from the thread address space and feed it to Intel's XED disassembler, producing lines such as: ffffffffa4068804 native_write_msr wrmsr ffffffffa415b95e __hrtimer_next_event_base movq 0x18(%rax), %rdx That match 'perf annotate's output. - Make the --cpu filter apply to PERF_RECORD_COMM/FORK/... events, in addition to PERF_RECORD_SAMPLE. perf report: - Add a new --samples option to save a small random number of samples per hist entry, using a reservoir technique to select a representative number of samples. Then allow browsing the samples using 'perf script' as part of the hist entry context menu. This automatically adds the right filters, so only the thread or CPU of the sample is displayed. Then we use less' search functionality to directly jump to the time stamp of the selected sample. It uses different menus for assembler and source display. Assembler needs xed installed and source needs debuginfo. - Fix the UI browser scripts pop up menu when there are many scripts available. perf report: Andi Kleen: - Add 'time' sort option. E.g.: % perf report --sort time,overhead,symbol --time-quantum 1ms --stdio ... 0.67% 277061.87300 [.] _dl_start 0.50% 277061.87300 [.] f1 0.50% 277061.87300 [.] f2 0.33% 277061.87300 [.] main 0.29% 277061.87300 [.] _dl_lookup_symbol_x 0.29% 277061.87300 [.] dl_main 0.29% 277061.87300 [.] do_lookup_x 0.17% 277061.87300 [.] _dl_debug_initialize 0.17% 277061.87300 [.] _dl_init_paths 0.08% 277061.87300 [.] check_match 0.04% 277061.87300 [.] _dl_count_modids 1.33% 277061.87400 [.] f1 1.33% 277061.87400 [.] f2 1.33% 277061.87400 [.] main 1.17% 277061.87500 [.] main 1.08% 277061.87500 [.] f1 1.08% 277061.87500 [.] f2 1.00% 277061.87600 [.] main 0.83% 277061.87600 [.] f1 0.83% 277061.87600 [.] f2 1.00% 277061.87700 [.] main tools headers: Arnaldo Carvalho de Melo: - Update x86's syscall_64.tbl, no change in tools/perf behaviour. - Sync copies asm-generic/unistd.h and linux/in with the kernel sources. perf data: Jiri Olsa: - Prep work to support having perf.data stored as a directory, with one file per CPU, that ultimately will allow having one ring buffer reading thread per CPU. Vendor events: Martin Liška: - perf PMU events for AMD Family 17h. perf script python: Tony Jones: - Add python3 support for the remaining Intel PT related scripts, with these we should have a clean build of perf with python3 while still supporting the build with python2. libbpf: Arnaldo Carvalho de Melo: - Fix the build on uCLibc, adding the missing stdarg.h since we use va_list in one typedef. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-21bpf: verifier: propagate liveness on all framesJakub Kicinski
Commit 7640ead93924 ("bpf: verifier: make sure callees don't prune with caller differences") connected up parentage chains of all frames of the stack. It didn't, however, ensure propagate_liveness() propagates all liveness information along those chains. This means pruning happening in the callee may generate explored states with incomplete liveness for the chains in lower frames of the stack. The included selftest is similar to the prior one from commit 7640ead93924 ("bpf: verifier: make sure callees don't prune with caller differences"), where callee would prune regardless of the difference in r8 state. Now we also initialize r9 to 0 or 1 based on a result from get_random(). r9 is never read so the walk with r9 = 0 gets pruned (correctly) after the walk with r9 = 1 completes. The selftest is so arranged that the pruning will happen in the callee. Since callee does not propagate read marks of r8, the explored state at the pruning point prior to the callee will now ignore r8. Propagate liveness on all frames of the stack when pruning. Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-03-21net/sched: act_vlan: validate the control action inside init()Davide Caratti
the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action vlan pop pass index 90 # tc actions replace action vlan \ > pop goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action vlan had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: vlan pop goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000007974f067 P4D 800000007974f067 PUD 79638067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.0.0-rc4.gotochain_crash+ #536 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffff982dfdb83be0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff982dfc55db00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff982df97099c0 RDI: ffff982dfc55db00 RBP: ffff982dfdb83c80 R08: ffff982df983fec8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff982df5aacd00 R13: ffff982df5aacd08 R14: 0000000000000001 R15: ffff982df97099c0 FS: 0000000000000000(0000) GS:ffff982dfdb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000796d0005 CR4: 00000000001606e0 Call Trace: <IRQ> tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ip6_finish_output2+0x369/0x590 ip6_finish_output2+0x369/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.35+0x79/0xc0 mld_sendpack+0x16f/0x220 mld_ifc_timer_expire+0x195/0x2c0 ? igmp6_timer_handler+0x70/0x70 call_timer_fn+0x2b/0x130 run_timer_softirq+0x3e8/0x440 ? enqueue_hrtimer+0x39/0x90 __do_softirq+0xe3/0x2f5 irq_exit+0xf0/0x100 smp_apic_timer_interrupt+0x6c/0x130 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:native_safe_halt+0x2/0x10 Code: 7b ff ff ff 7f f3 c3 65 48 8b 04 25 00 5c 01 00 f0 80 48 02 20 48 8b 00 a8 08 74 8b eb c1 90 90 90 90 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90 RSP: 0018:ffffa4714038feb8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: ffffffff840184f0 RBX: 0000000000000003 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000001e57d3f387 RBP: 0000000000000003 R08: 001125d9ca39e1eb R09: 0000000000000000 R10: 000000000000027d R11: 000000000009f400 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ? __sched_text_end+0x1/0x1 default_idle+0x1c/0x140 do_idle+0x1c4/0x280 cpu_startup_entry+0x19/0x20 start_secondary+0x1a7/0x200 secondary_startup_64+0xa4/0xb0 Modules linked in: act_vlan veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic mbcache crct10dif_pclmul jbd2 snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper joydev snd_timer virtio_balloon snd pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt virtio_net fb_sys_fops virtio_blk ttm net_failover virtio_console failover ata_piix drm libata crc32c_intel virtio_pci serio_raw virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_vlan_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-21net/sched: act_tunnel_key: validate the control action inside init()Davide Caratti
the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action tunnel_key set src_ip 10.10.10.1 dst_ip 20.20.2 dst_port 3128 \ > nocsum id 1 pass index 90 # tc actions replace action tunnel_key \ > set src_ip 10.10.10.1 dst_ip 20.20.2 dst_port 3128 nocsum id 1 \ > goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action tunnel_key had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: tunnel_key set src_ip 10.10.10.1 dst_ip 20.20.2.0 key_id 1 dst_port 3128 nocsum goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000002aba4067 P4D 800000002aba4067 PUD 795f9067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.0.0-rc4.gotochain_crash+ #536 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffff9346bdb83be0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff9346bb795c00 RCX: 0000000000000002 RDX: 0000000000000000 RSI: ffff93466c881700 RDI: 0000000000000246 RBP: ffff9346bdb83c80 R08: ffff9346b3e1e0c8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9346b978f000 R13: ffff9346b978f008 R14: 0000000000000001 R15: ffff93466dceeb40 FS: 0000000000000000(0000) GS:ffff9346bdb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000007a6c2002 CR4: 00000000001606e0 Call Trace: <IRQ> tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ip6_finish_output2+0x369/0x590 ip6_finish_output2+0x369/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.35+0x79/0xc0 mld_sendpack+0x16f/0x220 mld_ifc_timer_expire+0x195/0x2c0 ? igmp6_timer_handler+0x70/0x70 call_timer_fn+0x2b/0x130 run_timer_softirq+0x3e8/0x440 ? tick_sched_timer+0x37/0x70 __do_softirq+0xe3/0x2f5 irq_exit+0xf0/0x100 smp_apic_timer_interrupt+0x6c/0x130 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:native_safe_halt+0x2/0x10 Code: 55 ff ff ff 7f f3 c3 65 48 8b 04 25 00 5c 01 00 f0 80 48 02 20 48 8b 00 a8 08 74 8b eb c1 90 90 90 90 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90 RSP: 0018:ffffa48a8038feb8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: ffffffffaa8184f0 RBX: 0000000000000003 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000087 RDI: 0000000000000003 RBP: 0000000000000003 R08: 0011251c6fcfac49 R09: ffff9346b995be00 R10: ffffa48a805e7ce8 R11: 00000000024c38dd R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ? __sched_text_end+0x1/0x1 default_idle+0x1c/0x140 do_idle+0x1c4/0x280 cpu_startup_entry+0x19/0x20 start_secondary+0x1a7/0x200 secondary_startup_64+0xa4/0xb0 Modules linked in: act_tunnel_key veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul crc32_pclmul snd_hda_codec_generic ghash_clmulni_intel mbcache snd_hda_intel jbd2 snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper joydev snd_timer snd pcspkr virtio_balloon soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect virtio_net sysimgblt fb_sys_fops ttm net_failover virtio_console virtio_blk failover drm serio_raw crc32c_intel ata_piix virtio_pci floppy virtio_ring libata virtio dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_tunnel_key_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>