summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-11-06perf env: Add perf_env__numa_node()Jiri Olsa
To speed up cpu to node lookup, add perf_env__numa_node(), that creates cpu array on the first lookup, that holds numa nodes for each stored cpu. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Joe Mario <jmario@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20190904073415.723-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf vendor events intel: Update all the Intel JSON metrics from TMAM 3.6.Haiyan Song
New Metrics: - DSB_Switches: fraction of cycles CPU was stalled due to switches from DSB to MITE pipeline [all] - L2_Evictions_{Silent|NonSilent}_PKI: L2 {silent|non silent} ecivtions rate per Kilo instruction [SKX+] - IpFarBranch - Instructions per Far Branch Other Enhancements & fixes: - KBLR/CFL & CLX move to separate columns (no column sharing via if #model) - Re-organized/renamed Metric Group Signed-off-by: Haiyan Song <haiyanx.song@intel.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Link: http://lore.kernel.org/lkml/20191030082308.10919-1-haiyanx.song@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf vendor events intel: Update CascadelakeX events to v1.05Haiyan Song
Update CascadelakeX events to v1.05. Other changes: remove duplicated and without description events. Signed-off-by: Haiyan Song <haiyanx.song@intel.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Link: http://lore.kernel.org/lkml/20191030082308.10919-1-haiyanx.song@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf tools: Splice events onto evlist even on errorIan Rogers
If event parsing fails the event list is leaked, instead splice the list onto the out result and let the caller cleanup. An example input for parse_events found by libFuzzer that reproduces this memory leak is 'm{'. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: clang-built-linux@googlegroups.com Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/20191025180827.191916-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06libsubcmd: Use -O0 with DEBUG=1James Clark
When a 'make DEBUG=1' build is done, the command parser is still built with -O6 and is hard to step through, fix it making it use -O0 in that case. Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: nd <nd@arm.com> Link: http://lore.kernel.org/lkml/20191028113340.4282-1-james.clark@arm.com [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06libsubcmd: Move EXTRA_FLAGS to the end to allow overriding existing flagsJames Clark
Move EXTRA_WARNINGS and EXTRA_FLAGS to the end of the compilation line, otherwise they cannot be used to override the default values. Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: nd <nd@arm.com> Link: http://lore.kernel.org/lkml/20191028113340.4282-1-james.clark@arm.com [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf map_groups: Introduce for_each_entry() and for_each_entry_safe() iteratorsArnaldo Carvalho de Melo
To reduce boilerplate, providing a more compact form to iterate over the maps in a map_group. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-gc3go6fmdn30twusg91t2q56@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf maps: Add for_each_entry()/_safe() iteratorsArnaldo Carvalho de Melo
To reduce boilerplate, provide a more compact form using an idiom present in other trees of data structures. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-59gmq4kg1r68ou1wknyjl78x@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06Merge branch 'net-bcmgenet-restore-internal-EPHY-support'David S. Miller
Doug Berger says: ==================== net: bcmgenet: restore internal EPHY support (part 2) This is a follow up to my previous submission (see [1]). The first commit provides what is intended to be a complete solution for the issues that can result from insufficient clocking of the MAC during reset of its state machines. It should be backported to the stable releases. It is intended to replace the partial solution of commit 1f515486275a ("net: bcmgenet: soft reset 40nm EPHYs before MAC init") which is reverted by the second commit of this series and should not be back- ported as noted in [2]. The third commit corrects a timing hazard with a polled PHY that can occur when the MAC resumes and also when a v3 internal EPHY is reset by the change in commit 25382b991d25 ("net: bcmgenet: reset 40nm EPHY on energy detect"). It is expected that commit 25382b991d25 be back- ported to stable first before backporting this commit. [1] https://lkml.org/lkml/2019/10/16/1706 [2] https://lkml.org/lkml/2019/10/31/749 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06net: bcmgenet: reapply manual settings to the PHYDoug Berger
The phy_init_hw() function may reset the PHY to a configuration that does not match manual network settings stored in the phydev structure. If the phy state machine is polled rather than event driven this can create a timing hazard where the phy state machine might alter the settings stored in the phydev structure from the value read from the BMCR. This commit follows invocations of phy_init_hw() by the bcmgenet driver with invocations of the genphy_config_aneg() function to ensure that the BMCR is written to match the settings held in the phydev structure. This prevents the risk of manual settings being accidentally altered. Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06Revert "net: bcmgenet: soft reset 40nm EPHYs before MAC init"Doug Berger
This reverts commit 1f515486275a08a17a2c806b844cca18f7de5b34. This commit improved the chances of the umac resetting cleanly by ensuring that the PHY was restored to its normal operation prior to resetting the umac. However, there were still cases when the PHY might not be driving a Tx clock to the umac during this window (e.g. when the PHY detects no link). The previous commit now ensures that the unimac receives clocks from the MAC during its reset window so this commit is no longer needed. This commit also has an unintended negative impact on the MDIO performance of the UniMAC MDIO interface because it is used before the MDIO interrupts are reenabled, so it should be removed. Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06net: bcmgenet: use RGMII loopback for MAC resetDoug Berger
As noted in commit 28c2d1a7a0bf ("net: bcmgenet: enable loopback during UniMAC sw_reset") the UniMAC must be clocked while sw_reset is asserted for its state machines to reset cleanly. The transmit and receive clocks used by the UniMAC are derived from the signals used on its PHY interface. The bcmgenet MAC can be configured to work with different PHY interfaces including MII, GMII, RGMII, and Reverse MII on internal and external interfaces. Unfortunately for the UniMAC, when configured for MII the Tx clock is always driven from the PHY which places it outside of the direct control of the MAC. The earlier commit enabled a local loopback mode within the UniMAC so that the receive clock would be derived from the transmit clock which addressed the observed issue with an external GPHY disabling it's Rx clock. However, when a Tx clock is not available this loopback is insufficient. This commit implements a workaround that leverages the fact that the MAC can reliably generate all of its necessary clocking by enterring the external GPHY RGMII interface mode with the UniMAC in local loopback during the sw_reset interval. Unfortunately, this has the undesirable side efect of the RGMII GTXCLK signal being driven during the same window. In most configurations this is a benign side effect as the signal is either not routed to a pin or is already expected to drive the pin. The one exception is when an external MII PHY is expected to drive the same pin with its TX_CLK output creating output driver contention. This commit exploits the IEEE 802.3 clause 22 standard defined isolate mode to force an external MII PHY to present a high impedance on its TX_CLK output during the window to prevent any contention at the pin. The MII interface is used internally with the 40nm internal EPHY which agressively disables its clocks for power savings leading to incomplete resets of the UniMAC and many instabilities observed over the years. The workaround of this commit is expected to put an end to those problems. Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06perf map: Allow map__next() to receive a NULL argArnaldo Carvalho de Melo
Just like free(), return NULL in that case, will simplify the for_each_entry_safe() iterators. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-pbde2ucn49khnrebclys9pny@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf map: Check if the map still has some refcounts on exitArnaldo Carvalho de Melo
We were checking just if it was still on some rb tree, but that is not the only way that this map can still have references, map->refcnt is there exactly for this, use it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-hany65tbeavsax7n3xvwl9pc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf dso: Add dso__data_write_cache_addr()Adrian Hunter
Add functions to write into the dso file data cache, but not change the file itself. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Link: http://lore.kernel.org/lkml/20191025130000.13032-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf dso: Refactor dso_cache__read()Adrian Hunter
Refactor dso_cache__read() to separate populating the cache from copying data from it. This is preparation for adding a cache "write" that will update the data in the cache. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Link: http://lore.kernel.org/lkml/20191025130000.13032-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf auxtrace: Add auxtrace_cache__remove()Adrian Hunter
Add auxtrace_cache__remove(). Intel PT uses an auxtrace_cache to store the results of code-walking, so that the same block of instructions does not have to be decoded repeatedly. However, when there are text poke events, the associated cache entries need to be removed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Link: http://lore.kernel.org/lkml/20191025130000.13032-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf probe: Fix to show ranges of variables in functions without entry_pcMasami Hiramatsu
Fix to show ranges of variables (--range and --vars option) in functions which DIE has only ranges but no entry_pc attribute. Without this fix: # perf probe --range -V clear_tasks_mm_cpumask Available variables at clear_tasks_mm_cpumask @<clear_tasks_mm_cpumask+0> (No matched variables) With this fix: # perf probe --range -V clear_tasks_mm_cpumask Available variables at clear_tasks_mm_cpumask @<clear_tasks_mm_cpumask+0> [VAL] int cpu @<clear_tasks_mm_cpumask+[0-35,317-317,2052-2059]> Committer testing: Before: [root@quaco ~]# perf probe --range -V clear_tasks_mm_cpumask Available variables at clear_tasks_mm_cpumask @<clear_tasks_mm_cpumask+0> (No matched variables) [root@quaco ~]# After: [root@quaco ~]# perf probe --range -V clear_tasks_mm_cpumask Available variables at clear_tasks_mm_cpumask @<clear_tasks_mm_cpumask+0> [VAL] int cpu @<clear_tasks_mm_cpumask+[0-23,23-105,105-106,106-106,1843-1850,1850-1862]> [root@quaco ~]# Using it: [root@quaco ~]# perf probe clear_tasks_mm_cpumask cpu Added new event: probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask with cpu) You can now use it in all perf tools, such as: perf record -e probe:clear_tasks_mm_cpumask -aR sleep 1 [root@quaco ~]# perf probe -l probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@kernel/cpu.c with cpu) [root@quaco ~]# [root@quaco ~]# perf trace -e probe:*cpumask ^C[root@quaco ~]# Fixes: 349e8d261131 ("perf probe: Add --range option to show a variable's location range") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/157199323018.8075.8179744380479673672.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf probe: Fix to show inlined function callsite without entry_pcMasami Hiramatsu
Fix 'perf probe --line' option to show inlined function callsite lines even if the function DIE has only ranges. Without this: # perf probe -L amd_put_event_constraints ... 2 { 3 if (amd_has_nb(cpuc) && amd_is_nb_event(&event->hw)) __amd_put_nb_event_constraints(cpuc, event); 5 } With this patch: # perf probe -L amd_put_event_constraints ... 2 { 3 if (amd_has_nb(cpuc) && amd_is_nb_event(&event->hw)) 4 __amd_put_nb_event_constraints(cpuc, event); 5 } Committer testing: Before: [root@quaco ~]# perf probe -L amd_put_event_constraints <amd_put_event_constraints@/usr/src/debug/kernel-5.2.fc30/linux-5.2.18-200.fc30.x86_64/arch/x86/events/amd/core.c:0> 0 static void amd_put_event_constraints(struct cpu_hw_events *cpuc, struct perf_event *event) 2 { 3 if (amd_has_nb(cpuc) && amd_is_nb_event(&event->hw)) __amd_put_nb_event_constraints(cpuc, event); 5 } PMU_FORMAT_ATTR(event, "config:0-7,32-35"); PMU_FORMAT_ATTR(umask, "config:8-15" ); [root@quaco ~]# After: [root@quaco ~]# perf probe -L amd_put_event_constraints <amd_put_event_constraints@/usr/src/debug/kernel-5.2.fc30/linux-5.2.18-200.fc30.x86_64/arch/x86/events/amd/core.c:0> 0 static void amd_put_event_constraints(struct cpu_hw_events *cpuc, struct perf_event *event) 2 { 3 if (amd_has_nb(cpuc) && amd_is_nb_event(&event->hw)) 4 __amd_put_nb_event_constraints(cpuc, event); 5 } PMU_FORMAT_ATTR(event, "config:0-7,32-35"); PMU_FORMAT_ATTR(umask, "config:8-15" ); [root@quaco ~]# perf probe amd_put_event_constraints:4 Added new event: probe:amd_put_event_constraints (on amd_put_event_constraints:4) You can now use it in all perf tools, such as: perf record -e probe:amd_put_event_constraints -aR sleep 1 [root@quaco ~]# [root@quaco ~]# perf probe -l probe:amd_put_event_constraints (on amd_put_event_constraints:4@arch/x86/events/amd/core.c) probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@kernel/cpu.c) [root@quaco ~]# Using it: [root@quaco ~]# perf trace -e probe:* ^C[root@quaco ~]# Ok, Intel system here... :-) Fixes: 4cc9cec636e7 ("perf probe: Introduce lines walker interface") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/157199322107.8075.12659099000567865708.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf probe: Fix to list probe event with correct line numberMasami Hiramatsu
Since debuginfo__find_probe_point() uses dwarf_entrypc() for finding the entry address of the function on which a probe is, it will fail when the function DIE has only ranges attribute. To fix this issue, use die_entrypc() instead of dwarf_entrypc(). Without this fix, perf probe -l shows incorrect offset: # perf probe -l probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask+18446744071579263632@work/linux/linux/kernel/cpu.c) probe:clear_tasks_mm_cpumask_1 (on clear_tasks_mm_cpumask+18446744071579263752@work/linux/linux/kernel/cpu.c) With this: # perf probe -l probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@work/linux/linux/kernel/cpu.c) probe:clear_tasks_mm_cpumask_1 (on clear_tasks_mm_cpumask:21@work/linux/linux/kernel/cpu.c) Committer testing: Before: [root@quaco ~]# perf probe -l probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask+18446744071579765152@kernel/cpu.c) [root@quaco ~]# After: [root@quaco ~]# perf probe -l probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@kernel/cpu.c) [root@quaco ~]# Fixes: 1d46ea2a6a40 ("perf probe: Fix listing incorrect line number with inline function") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/157199321227.8075.14655572419136993015.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf probe: Fix to probe an inline function which has no entry pcMasami Hiramatsu
Fix perf probe to probe an inlne function which has no entry pc or low pc but only has ranges attribute. This seems very rare case, but I could find a few examples, as same as probe_point_search_cb(), use die_entrypc() to get the entry address in probe_point_inline_cb() too. Without this patch: # perf probe -D __amd_put_nb_event_constraints Failed to get entry address of __amd_put_nb_event_constraints. Probe point '__amd_put_nb_event_constraints' not found. Error: Failed to add events. With this patch: # perf probe -D __amd_put_nb_event_constraints p:probe/__amd_put_nb_event_constraints amd_put_event_constraints+43 Committer testing: Before: [root@quaco ~]# perf probe -D __amd_put_nb_event_constraints Failed to get entry address of __amd_put_nb_event_constraints. Probe point '__amd_put_nb_event_constraints' not found. Error: Failed to add events. [root@quaco ~]# After: [root@quaco ~]# perf probe -D __amd_put_nb_event_constraints p:probe/__amd_put_nb_event_constraints _text+33789 [root@quaco ~]# Fixes: 4ea42b181434 ("perf: Add perf probe subcommand, a kprobe-event setup helper") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/157199320336.8075.16189530425277588587.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf probe: Fix to probe a function which has no entry pcMasami Hiramatsu
Fix 'perf probe' to probe a function which has no entry pc or low pc but only has ranges attribute. probe_point_search_cb() uses dwarf_entrypc() to get the probe address, but that doesn't work for the function DIE which has only ranges attribute. Use die_entrypc() instead. Without this fix: # perf probe -k ../build-x86_64/vmlinux -D clear_tasks_mm_cpumask:0 Probe point 'clear_tasks_mm_cpumask' not found. Error: Failed to add events. With this: # perf probe -k ../build-x86_64/vmlinux -D clear_tasks_mm_cpumask:0 p:probe/clear_tasks_mm_cpumask clear_tasks_mm_cpumask+0 Committer testing: Before: [root@quaco ~]# perf probe clear_tasks_mm_cpumask:0 Probe point 'clear_tasks_mm_cpumask' not found. Error: Failed to add events. [root@quaco ~]# After: [root@quaco ~]# perf probe clear_tasks_mm_cpumask:0 Added new event: probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask) You can now use it in all perf tools, such as: perf record -e probe:clear_tasks_mm_cpumask -aR sleep 1 [root@quaco ~]# Using it with 'perf trace': [root@quaco ~]# perf trace -e probe:clear_tasks_mm_cpumask Doesn't seem to be used in x86_64: $ find . -name "*.c" | xargs grep clear_tasks_mm_cpumask ./kernel/cpu.c: * clear_tasks_mm_cpumask - Safely clear tasks' mm_cpumask for a CPU ./kernel/cpu.c:void clear_tasks_mm_cpumask(int cpu) ./arch/xtensa/kernel/smp.c: clear_tasks_mm_cpumask(cpu); ./arch/csky/kernel/smp.c: clear_tasks_mm_cpumask(cpu); ./arch/sh/kernel/smp.c: clear_tasks_mm_cpumask(cpu); ./arch/arm/kernel/smp.c: clear_tasks_mm_cpumask(cpu); ./arch/powerpc/mm/nohash/mmu_context.c: clear_tasks_mm_cpumask(cpu); $ find . -name "*.h" | xargs grep clear_tasks_mm_cpumask ./include/linux/cpu.h:void clear_tasks_mm_cpumask(int cpu); $ find . -name "*.S" | xargs grep clear_tasks_mm_cpumask $ Fixes: e1ecbbc3fa83 ("perf probe: Fix to handle optimized not-inlined functions") Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/157199319438.8075.4695576954550638618.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf probe: Fix wrong address verificationMasami Hiramatsu
Since there are some DIE which has only ranges instead of the combination of entrypc/highpc, address verification must use dwarf_haspc() instead of dwarf_entrypc/dwarf_highpc. Also, the ranges only DIE will have a partial code in different section (e.g. unlikely code will be in text.unlikely as "FUNC.cold" symbol). In that case, we can not use dwarf_entrypc() or die_entrypc(), because the offset from original DIE can be a minus value. Instead, this simply gets the symbol and offset from symtab. Without this patch; # perf probe -D clear_tasks_mm_cpumask:1 Failed to get entry address of clear_tasks_mm_cpumask Error: Failed to add events. And with this patch: # perf probe -D clear_tasks_mm_cpumask:1 p:probe/clear_tasks_mm_cpumask clear_tasks_mm_cpumask+0 p:probe/clear_tasks_mm_cpumask_1 clear_tasks_mm_cpumask+5 p:probe/clear_tasks_mm_cpumask_2 clear_tasks_mm_cpumask+8 p:probe/clear_tasks_mm_cpumask_3 clear_tasks_mm_cpumask+16 p:probe/clear_tasks_mm_cpumask_4 clear_tasks_mm_cpumask+82 Committer testing: I managed to reproduce the above: [root@quaco ~]# perf probe -D clear_tasks_mm_cpumask:1 p:probe/clear_tasks_mm_cpumask _text+919968 p:probe/clear_tasks_mm_cpumask_1 _text+919973 p:probe/clear_tasks_mm_cpumask_2 _text+919976 [root@quaco ~]# But then when trying to actually put the probe in place, it fails if I use :0 as the offset: [root@quaco ~]# perf probe -L clear_tasks_mm_cpumask | head -5 <clear_tasks_mm_cpumask@/usr/src/debug/kernel-5.2.fc30/linux-5.2.18-200.fc30.x86_64/kernel/cpu.c:0> 0 void clear_tasks_mm_cpumask(int cpu) 1 { 2 struct task_struct *p; [root@quaco ~]# perf probe clear_tasks_mm_cpumask:0 Probe point 'clear_tasks_mm_cpumask' not found. Error: Failed to add events. [root@quaco The next patch is needed to fix this case. Fixes: 576b523721b7 ("perf probe: Fix probing symbols with optimization suffix") Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/157199318513.8075.10463906803299647907.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf jevents: Fix resource leak in process_mapfile() and main()Yunfeng Ye
There are memory leaks and file descriptor resource leaks in process_mapfile() and main(). Fix this by adding free(), fclose() and free_arch_std_events() on the error paths. Fixes: 80eeb67fe577 ("perf jevents: Program to convert JSON file") Fixes: 3f056b66647b ("perf jevents: Make build fail on JSON parse error") Fixes: e9d32c1bf0cd ("perf vendor events: Add support for arch standard events") Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Feilong Lin <linfeilong@huawei.com> Cc: Hu Shiyuan <hushiyuan@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Luke Mujica <lukemujica@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zenghui Yu <yuzenghui@huawei.com> Link: http://lore.kernel.org/lkml/d7907042-ec9c-2bef-25b4-810e14602f89@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf probe: Fix to show function entry line as probe-ableMasami Hiramatsu
Fix die_walk_lines() to list the function entry line correctly. Since the dwarf_entrypc() does not return the entry pc if the DIE has only range attribute, __die_walk_funclines() fails to list the declaration line (entry line) in that case. To solve this issue, this introduces die_entrypc() which correctly returns the entry PC (the first address range) even if the DIE has only range attribute. With this fix die_walk_lines() shows the function entry line is able to probe correctly. Fixes: 4cc9cec636e7 ("perf probe: Introduce lines walker interface") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/157190837419.1859.4619125803596816752.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf probe: Walk function lines in lexical blocksMasami Hiramatsu
Since some inlined functions are in lexical blocks of given function, we have to recursively walk through the DIE tree. Without this fix, perf-probe -L can miss the inlined functions which is in a lexical block (like if (..) { func() } case.) However, even though, to walk the lines in a given function, we don't need to follow the children DIE of inlined functions because those do not have any lines in the specified function. We need to walk though whole trees only if we walk all lines in a given file, because an inlined function can include another inlined function in the same file. Fixes: b0e9cb2802d4 ("perf probe: Fix to search nested inlined functions in CU") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/157190836514.1859.15996864849678136353.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf probe: Fix to find range-only function instanceMasami Hiramatsu
Fix die_is_func_instance() to find range-only function instance. In some case, a function instance can be made without any low PC or entry PC, but only with address ranges by optimization. (e.g. cold text partially in "text.unlikely" section) To find such function instance, we have to check the range attribute too. Fixes: e1ecbbc3fa83 ("perf probe: Fix to handle optimized not-inlined functions") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/157190835669.1859.8368628035930950596.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf kvm: Use evlist layer api when possibleIgor Lubashev
No need for layer violations when a proper evlist api is available. Signed-off-by: Igor Lubashev <ilubashe@akamai.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/1571795693-23558-4-git-send-email-ilubashe@akamai.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf tests: Fix a typoLeo Yan
Correct typo in comment: s/suck/stuck. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reported-by: Will Deacon <will@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191023083324.12093-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf tools: Avoid a malloc() for array eventsIan Rogers
Use realloc() rather than malloc()+memcpy() to possibly avoid a memory allocation when appending array elements. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: clang-built-linux@googlegroups.com Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/20191023005337.196160-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf tools: Move ALLOC_LIST into a functionIan Rogers
Having a YYABORT in a macro makes it hard to free memory for components of a rule. Separate the logic out. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: clang-built-linux@googlegroups.com Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/20191023005337.196160-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf evsel: Avoid close(-1)Andi Kleen
In some weak fallback cases close can be called a lot with -1. Check for this case and avoid calling close then. This is mainly to shut up valgrind which complains about this case. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20191020175202.32456-3-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf evsel: Always preserve errno while cleaning up perf_event_open failuresAndi Kleen
In some cases when perf_event_open fails, it may do some closes to clean up. In special cases these closes can fail too, which overwrites the errno of the perf_event_open, which is then incorrectly reported. Save/restore errno around closes. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20191020175202.32456-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf cs-etm: Fix definition of macro TO_CS_QUEUE_NRLeo Yan
Macro TO_CS_QUEUE_NR definition has a typo, which uses 'trace_id_chan' as its parameter, this doesn't match with its definition body which uses 'trace_chan_id'. So renames the parameter to 'trace_chan_id'. It's luck to have a local variable 'trace_chan_id' in the function cs_etm__setup_queue(), even we wrongly define the macro TO_CS_QUEUE_NR, the local variable 'trace_chan_id' is used rather than the macro's parameter 'trace_id_chan'; so the compiler doesn't complain for this before. After renaming the parameter, it leads to a compiling error due cs_etm__setup_queue() has no variable 'trace_id_chan'. This patch uses the variable 'trace_chan_id' for the macro so that fixes the compiling error. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight ml <coresight@lists.linaro.org> Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20191021074808.25795-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf llvm: Make .o saving a debug message, not an info oneArnaldo Carvalho de Melo
Its a bit annoying to have that message, better make it a debug one. I.e. now this message will only appear when using '-v': [root@quaco tracebuffer]# trace -e bristot.c LLVM: dumping bristot.o ^C[root@quaco tracebuffer]# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-o7jd4i7s66kosec5torubqps@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf record: Put a copy of kcore into the perf.data directoryAdrian Hunter
Add a new 'perf record' option '--kcore' which will put a copy of /proc/kcore, kallsyms and modules into a perf.data directory. Note, that without the --kcore option, output goes to a file as previously. The tools' -o and -i options work with either a file name or directory name. Example: $ sudo perf record --kcore uname $ sudo tree perf.data perf.data ├── kcore_dir │   ├── kallsyms │   ├── kcore │   └── modules └── data $ sudo perf script -v build id event received for vmlinux: 1eaa285996affce2d74d8e66dcea09a80c9941de build id event received for [vdso]: 8bbaf5dc62a9b644b4d4e4539737e104e4a84541 Samples for 'cycles' event do not have CPU attribute set. Skipping 'cpu' field. Using CPUID GenuineIntel-6-8E-A Using perf.data/kcore_dir/kcore for kernel data Using perf.data/kcore_dir/kallsyms for symbols perf 19058 506778.423729: 1 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) perf 19058 506778.423733: 1 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) perf 19058 506778.423734: 7 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) perf 19058 506778.423736: 117 cycles: ffffffffa2caa54a native_write_msr+0xa (vmlinux) perf 19058 506778.423738: 2092 cycles: ffffffffa2c9b7b0 native_apic_msr_write+0x0 (vmlinux) perf 19058 506778.423740: 37380 cycles: ffffffffa2f121d0 perf_event_addr_filters_exec+0x0 (vmlinux) uname 19058 506778.423751: 582673 cycles: ffffffffa303a407 propagate_protected_usage+0x147 (vmlinux) uname 19058 506778.423892: 2241841 cycles: ffffffffa2cae0c9 unwind_next_frame.part.5+0x79 (vmlinux) uname 19058 506778.424430: 2457397 cycles: ffffffffa3019232 check_memory_region+0x52 (vmlinux) Committer testing: # rm -rf perf.data* # perf record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ] # ls -l perf.data -rw-------. 1 root root 34772 Oct 21 11:08 perf.data # perf record --kcore uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ] ls[root@quaco ~]# ls -lad perf.data* drwx------. 3 root root 4096 Oct 21 11:08 perf.data -rw-------. 1 root root 34772 Oct 21 11:08 perf.data.old # perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # perf evlist -v -i perf.data/data cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lore.kernel.org/lkml/20191004083121.12182-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf data: Support single perf.data file directoryAdrian Hunter
Support directory output that contains a regular perf.data file, named "data". By default the directory is named perf.data i.e. perf.data └── data Most of the infrastructure to support a directory is already there. This patch makes the changes needed to support the format above. Presently there is no 'perf record' option to output a directory. This is preparation for adding support for putting a copy of /proc/kcore in the directory. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191004083121.12182-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf session: Fix indent in perf_session__new()"Jiri Olsa
Fix up indentation. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: http://lore.kernel.org/lkml/20191007112027.GD6919@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf data: Rename directory "header" file to "data"Adrian Hunter
In preparation to support a single file directory format, rename "header" to "data" because "header" is a mis-leading name when there is only 1 file. Note, in the multi-file case, the "header" file also contains data. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191004083121.12182-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf data: Move perf_dir_version into data.hAdrian Hunter
perf_dir_version belongs to struct perf_data which is declared in data.h. To allow its use in inline perf_data functions, move perf_dir_version to data.h Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191004083121.12182-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf data: Correctly identify directory data filesAdrian Hunter
In order to rename the "header" file to "data" without conflicting, correctly identify the non-header files as starting with "data." Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191004083121.12182-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06Merge tag 'perf-urgent-for-mingo-5.4-20191105' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf fixes from Arnaldo Carvalho de Melo: perf report/top: Jiri Olsa: - Fix time sorting for big numbers, i.e.: perf report -s time -F time,overhead --stdio was failing because the sort comparision routine was returning 'int' while that particular -s key was int64_t, fix it. perf scripting engines: Steven Rostedt (VMware): - Iterate on tep event arrays directly, fixing a bug when generating python/perl source code from a perf.data file with more than one tracepoint event. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06drm/atomic: fix self-refresh helpers crtc state dereferenceRob Clark
drm_self_refresh_helper_update_avg_times() was incorrectly accessing the new incoming state after drm_atomic_helper_commit_hw_done(). But this state might have already been superceeded by an !nonblock atomic update resulting in dereferencing an already free'd crtc_state. TODO I *think* this will more or less do the right thing.. althought I'm not 100% sure if, for example, we enter psr in a nonblock commit, and then leave psr in a !nonblock commit that overtakes the completion of the nonblock commit. Not sure if this sort of scenario can happen in practice. But not crashing is better than crashing, so I guess we should either take this patch or rever the self-refresh helpers until Sean can figure out a better solution. Fixes: d4da4e33341c ("drm: Measure Self Refresh Entry/Exit times to avoid thrashing") Cc: Sean Paul <seanpaul@chromium.org> Signed-off-by: Rob Clark <robdclark@chromium.org> [seanpaul fixed up some checkpatch warns] Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20191104173737.142558-1-robdclark@gmail.com
2019-11-06RDMA/hns: Correct the value of srq_desc_sizeWenpeng Liang
srq_desc_size should be rounded up to pow of two before used, or related calculation may cause allocating wrong size of memory for srq buffer. Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode") Link: https://lore.kernel.org/r/1572575610-52530-3-git-send-email-liweihang@hisilicon.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-06RDMA/hns: Correct the value of HNS_ROCE_HEM_CHUNK_LENSirong Wang
Size of pointer to buf field of struct hns_roce_hem_chunk should be considered when calculating HNS_ROCE_HEM_CHUNK_LEN, or sg table size will be larger than expected when allocating hem. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Link: https://lore.kernel.org/r/1572575610-52530-2-git-send-email-liweihang@hisilicon.com Signed-off-by: Sirong Wang <wangsirong@huawei.com> Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-06configfs: calculate the depth of parent itemHonggang Li
When create symbolic link, create_link should calculate the depth of the parent item. However, both the first and second parameters of configfs_get_target_path had been set to the target. Broken symbolic link created. $ targetcli ls / o- / ............................................................. [...] o- backstores .................................................. [...] | o- block ...................................... [Storage Objects: 0] | o- fileio ..................................... [Storage Objects: 2] | | o- vdev0 .......... [/dev/ramdisk1 (16.0MiB) write-thru activated] | | | o- alua ....................................... [ALUA Groups: 1] | | | o- default_tg_pt_gp ........... [ALUA state: Active/optimized] | | o- vdev1 .......... [/dev/ramdisk2 (16.0MiB) write-thru activated] | | o- alua ....................................... [ALUA Groups: 1] | | o- default_tg_pt_gp ........... [ALUA state: Active/optimized] | o- pscsi ...................................... [Storage Objects: 0] | o- ramdisk .................................... [Storage Objects: 0] o- iscsi ................................................ [Targets: 0] o- loopback ............................................. [Targets: 0] o- srpt ................................................. [Targets: 2] | o- ib.e89a8f91cb3200000000000000000000 ............... [no-gen-acls] | | o- acls ................................................ [ACLs: 2] | | | o- ib.e89a8f91cb3200000000000000000000 ........ [Mapped LUNs: 2] | | | | o- mapped_lun0 ............................. [BROKEN LUN LINK] | | | | o- mapped_lun1 ............................. [BROKEN LUN LINK] | | | o- ib.e89a8f91cb3300000000000000000000 ........ [Mapped LUNs: 2] | | | o- mapped_lun0 ............................. [BROKEN LUN LINK] | | | o- mapped_lun1 ............................. [BROKEN LUN LINK] | | o- luns ................................................ [LUNs: 2] | | o- lun0 ...... [fileio/vdev0 (/dev/ramdisk1) (default_tg_pt_gp)] | | o- lun1 ...... [fileio/vdev1 (/dev/ramdisk2) (default_tg_pt_gp)] | o- ib.e89a8f91cb3300000000000000000000 ............... [no-gen-acls] | o- acls ................................................ [ACLs: 0] | o- luns ................................................ [LUNs: 0] o- vhost ................................................ [Targets: 0] Fixes: e9c03af21cc7 ("configfs: calculate the symlink target only once") Signed-off-by: Honggang Li <honli@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-11-06IB/hfi1: TID RDMA WRITE should not return IB_WC_RNR_RETRY_EXC_ERRKaike Wan
Normal RDMA WRITE request never returns IB_WC_RNR_RETRY_EXC_ERR to ULPs because it does not need post receive buffer on the responder side. Consequently, as an enhancement to normal RDMA WRITE request inside the hfi1 driver, TID RDMA WRITE request should not return such an error status to ULPs, although it does receive RNR NAKs from the responder when TID resources are not available. This behavior is violated when qp->s_rnr_retry_cnt is set in current hfi1 implementation. This patch enforces these semantics by avoiding any reaction to the updates of the RNR QP attributes. Fixes: 3c6cb20a0d17 ("IB/hfi1: Add TID RDMA WRITE functionality into RDMA verbs") Link: https://lore.kernel.org/r/20191025195842.106825.71532.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-06IB/hfi1: Calculate flow weight based on QP MTU for TID RDMAKaike Wan
For a TID RDMA WRITE request, a QP on the responder side could be put into a queue when a hardware flow is not available. A RNR NAK will be returned to the requester with a RNR timeout value based on the position of the QP in the queue. The tid_rdma_flow_wt variable is used to calculate the timeout value and is determined by using a MTU of 4096 at the module loading time. This could reduce the timeout value by half from the desired value, leading to excessive RNR retries. This patch fixes the issue by calculating the flow weight with the real MTU assigned to the QP. Fixes: 07b923701e38 ("IB/hfi1: Add functions to receive TID RDMA WRITE request") Link: https://lore.kernel.org/r/20191025195836.106825.77769.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-06IB/hfi1: Ensure r_tid_ack is valid before building TID RDMA ACK packetKaike Wan
The index r_tid_ack is used to indicate the next TID RDMA WRITE request to acknowledge in the ring s_ack_queue[] on the responder side and should be set to a valid index other than its initial value before r_tid_tail is advanced to the next TID RDMA WRITE request and particularly before a TID RDMA ACK is built. Otherwise, a NULL pointer dereference may result: BUG: unable to handle kernel paging request at ffff9a32d27abff8 IP: [<ffffffffc0d87ea6>] hfi1_make_tid_rdma_pkt+0x476/0xcb0 [hfi1] PGD 2749032067 PUD 0 Oops: 0000 1 SMP Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) ib_ipoib(OE) hfi1(OE) rdmavt(OE) nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ib_isert iscsi_target_mod target_core_mod ib_ucm dm_mirror dm_region_hash dm_log mlx5_ib dm_mod zfs(POE) rpcrdma sunrpc rdma_ucm ib_uverbs opa_vnic ib_iser zunicode(POE) ib_umad zavl(POE) icp(POE) sb_edac intel_powerclamp coretemp rdma_cm intel_rapl iosf_mbi iw_cm libiscsi scsi_transport_iscsi kvm ib_cm iTCO_wdt mxm_wmi iTCO_vendor_support irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd zcommon(POE) znvpair(POE) pcspkr spl(OE) mei_me sg mei ioatdma lpc_ich joydev i2c_i801 shpchp ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_power_meter ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 mlx5_core drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ixgbe ahci ttm mlxfw ib_core libahci devlink mdio crct10dif_pclmul crct10dif_common drm ptp libata megaraid_sas crc32c_intel i2c_algo_bit pps_core i2c_core dca [last unloaded: rdmavt] CPU: 15 PID: 68691 Comm: kworker/15:2H Kdump: loaded Tainted: P W OE ------------ 3.10.0-862.2.3.el7_lustre.x86_64 #1 Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016 Workqueue: hfi0_0 _hfi1_do_tid_send [hfi1] task: ffff9a01f47faf70 ti: ffff9a11776a8000 task.ti: ffff9a11776a8000 RIP: 0010:[<ffffffffc0d87ea6>] [<ffffffffc0d87ea6>] hfi1_make_tid_rdma_pkt+0x476/0xcb0 [hfi1] RSP: 0018:ffff9a11776abd08 EFLAGS: 00010002 RAX: ffff9a32d27abfc0 RBX: ffff99f2d27aa000 RCX: 00000000ffffffff RDX: 0000000000000000 RSI: 0000000000000220 RDI: ffff99f2ffc05300 RBP: ffff9a11776abd88 R08: 000000000001c310 R09: ffffffffc0d87ad4 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9a117a423c00 R13: ffff9a117a423c00 R14: ffff9a03500c0000 R15: ffff9a117a423cb8 FS: 0000000000000000(0000) GS:ffff9a117e9c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff9a32d27abff8 CR3: 0000002748a0e000 CR4: 00000000001607e0 Call Trace: [<ffffffffc0d88874>] _hfi1_do_tid_send+0x194/0x320 [hfi1] [<ffffffffaf0b2dff>] process_one_work+0x17f/0x440 [<ffffffffaf0b3ac6>] worker_thread+0x126/0x3c0 [<ffffffffaf0b39a0>] ? manage_workers.isra.24+0x2a0/0x2a0 [<ffffffffaf0bae31>] kthread+0xd1/0xe0 [<ffffffffaf0bad60>] ? insert_kthread_work+0x40/0x40 [<ffffffffaf71f5f7>] ret_from_fork_nospec_begin+0x21/0x21 [<ffffffffaf0bad60>] ? insert_kthread_work+0x40/0x40 hfi1 0000:05:00.0: hfi1_0: reserved_op: opcode 0xf2, slot 2, rsv_used 1, rsv_ops 1 Code: 00 00 41 8b 8d d8 02 00 00 89 c8 48 89 45 b0 48 c1 65 b0 06 48 8b 83 a0 01 00 00 48 01 45 b0 48 8b 45 b0 41 80 bd 10 03 00 00 00 <48> 8b 50 38 4c 8d 7a 50 74 45 8b b2 d0 00 00 00 85 f6 0f 85 72 RIP [<ffffffffc0d87ea6>] hfi1_make_tid_rdma_pkt+0x476/0xcb0 [hfi1] RSP <ffff9a11776abd08> CR2: ffff9a32d27abff8 This problem can happen if a RESYNC request is received before r_tid_ack is modified. This patch fixes the issue by making sure that r_tid_ack is set to a valid value before a TID RDMA ACK is built. Functions are defined to simplify the code. Fixes: 07b923701e38 ("IB/hfi1: Add functions to receive TID RDMA WRITE request") Fixes: 7cf0ad679de4 ("IB/hfi1: Add a function to receive TID RDMA RESYNC packet") Link: https://lore.kernel.org/r/20191025195830.106825.44022.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-06IB/hfi1: Ensure full Gen3 speed in a Gen4 systemJames Erwin
If an hfi1 card is inserted in a Gen4 systems, the driver will avoid the gen3 speed bump and the card will operate at half speed. This is because the driver avoids the gen3 speed bump when the parent bus speed isn't identical to gen3, 8.0GT/s. This is not compatible with gen4 and newer speeds. Fix by relaxing the test to explicitly look for the lower capability speeds which inherently allows for gen4 and all future speeds. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20191101192059.106248.1699.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: James Erwin <james.erwin@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>