summaryrefslogtreecommitdiff
path: root/arch/sparc/kernel/perf_event.c
AgeCommit message (Collapse)Author
2024-02-16sparc: Fix typosBjorn Helgaas
Fix typos, most reported by "codespell arch/sparc". Only touches comments, no code changes. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: sparclinux@vger.kernel.org Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andreas Larsson <andreas@gaisler.com> Link: https://lore.kernel.org/r/20240103231605.1801364-9-helgaas@kernel.org
2019-06-02sparc: perf: fix updated event period in response to PERF_EVENT_IOC_PERIODYoung Xiao
The PERF_EVENT_IOC_PERIOD ioctl command can be used to change the sample period of a running perf_event. Consequently, when calculating the next event period, the new period will only be considered after the previous one has overflowed. This patch changes the calculation of the remaining event ticks so that they are offset if the period has changed. See commit 3581fe0ef37c ("ARM: 7556/1: perf: fix updated event period in response to PERF_EVENT_IOC_PERIOD") for details. Signed-off-by: Young Xiao <92siuyang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-22sparc64: Use ftrace_graph_get_ret_stack() instead of curr_ret_stackSteven Rostedt (VMware)
The structure of the ret_stack array on the task struct is going to change, and accessing it directly via the curr_ret_stack index will no longer give the ret_stack entry that holds the return address. To access that, architectures must now use ftrace_graph_get_ret_stack() to get the associated ret_stack that matches the saved return address. Cc: sparclinux@vger.kernel.org Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-29sparc64: Remvoe set_fs() from perf_callchain_user().David S. Miller
Ever since commit 88b0193d9418 ("perf/callchain: Force USER_DS when invoking perf_callchain_user()") the caller does this for us. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12sparc: Throttle perf events properly.David S. Miller
Like x86 and arm, call perf_sample_event_took() in perf event NMI interrupt handler. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12sparc: Fix single-pcr perf event counter management.David S. Miller
It is important to clear the hw->state value for non-stopped events when they are added into the PMU. Otherwise when the event is scheduled out, we won't read the counter because HES_UPTODATE is still set. This breaks 'perf stat' and similar use cases, causing all the events to show zero. This worked for multi-pcr because we make explicit sparc_pmu_start() calls in calculate_multiple_pcrs(). calculate_single_pcr() doesn't do this because the idea there is to accumulate all of the counter settings into the single pcr value. So we have to add explicit hw->state handling there. Like x86, we use the PERF_HES_ARCH bit to track truly stopped events so that we don't accidently start them on a reload. Related to all of this, sparc_pmu_start() is missing a userpage update so add it. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-16perf: Fix sibling iterationPeter Zijlstra
Mark noticed that the change to sibling_list changed some iteration semantics; because previously we used group_list as list entry, sibling events would always have an empty sibling_list. But because we now use sibling_list for both list head and list entry, siblings will report as having siblings. Fix this with a custom for_each_sibling_event() iterator. Fixes: 8343aae66167 ("perf/core: Remove perf_event::group_entry") Reported-by: Mark Rutland <mark.rutland@arm.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: vincent.weaver@maine.edu Cc: alexander.shishkin@linux.intel.com Cc: torvalds@linux-foundation.org Cc: alexey.budankov@linux.intel.com Cc: valery.cherepennikov@intel.com Cc: eranian@google.com Cc: acme@redhat.com Cc: linux-tip-commits@vger.kernel.org Cc: davidcc@google.com Cc: kan.liang@intel.com Cc: Dmitry.Prohorov@intel.com Cc: jolsa@redhat.com Link: https://lkml.kernel.org/r/20180315170129.GX4043@hirez.programming.kicks-ass.net
2018-03-12perf/core: Remove perf_event::group_entryPeter Zijlstra
Now that all the grouping is done with RB trees, we no longer need group_entry and can replace the whole thing with sibling_list. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Dmitri Prokhorov <Dmitry.Prohorov@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valery Cherepennikov <valery.cherepennikov@intel.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-16perf core: Add a 'nr' field to perf_event_callchain_contextArnaldo Carvalho de Melo
We will use it to count how many addresses are in the entry->ip[] array, excluding PERF_CONTEXT_{KERNEL,USER,etc} entries, so that we can really return the number of entries specified by the user via the relevant sysctl, kernel.perf_event_max_contexts, or via the per event perf_event_attr.sample_max_stack knob. This way we keep the perf_sample->ip_callchain->nr meaning, that is the number of entries, be it real addresses or PERF_CONTEXT_ entries, while honouring the max_stack knobs, i.e. the end result will be max_stack entries if we have at least that many entries in a given stack trace. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-s8teto51tdqvlfhefndtat9r@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16perf core: Pass max stack as a perf_callchain_entry contextArnaldo Carvalho de Melo
This makes perf_callchain_{user,kernel}() receive the max stack as context for the perf_callchain_entry, instead of accessing the global sysctl_perf_event_max_stack. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-27perf core: Allow setting up max frame stack depth via sysctlArnaldo Carvalho de Melo
The default remains 127, which is good for most cases, and not even hit most of the time, but then for some cases, as reported by Brendan, 1024+ deep frames are appearing on the radar for things like groovy, ruby. And in some workloads putting a _lower_ cap on this may make sense. One that is per event still needs to be put in place tho. The new file is: # cat /proc/sys/kernel/perf_event_max_stack 127 Chaging it: # echo 256 > /proc/sys/kernel/perf_event_max_stack # cat /proc/sys/kernel/perf_event_max_stack 256 But as soon as there is some event using callchains we get: # echo 512 > /proc/sys/kernel/perf_event_max_stack -bash: echo: write error: Device or resource busy # Because we only allocate the callchain percpu data structures when there is a user, which allows for changing the max easily, its just a matter of having no callchain users at that point. Reported-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/20160426002928.GB16708@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-24sparc64: Perf should save/restore fault infoRob Gardner
There have been several reports of random processes being killed with a bus error or segfault during userspace stack walking in perf. One of the root causes of this problem is an asynchronous modification to thread_info fault_address and fault_code, which stems from a perf counter interrupt arriving during kernel processing of a "benign" fault, such as a TSB miss. Since perf_callchain_user() invokes copy_from_user() to read user stacks, a fault is not only possible, but probable. Validity checks on the stack address merely cover up the problem and reduce its frequency. The solution here is to save and restore fault_address and fault_code in perf_callchain_user() so that the benign fault handler is not disturbed by a perf interrupt. Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24sparc64: Ensure perf can access user stacksRob Gardner
When an interrupt (such as a perf counter interrupt) is delivered while executing in user space, the trap entry code puts ASI_AIUS in %asi so that copy_from_user() and copy_to_user() will access the correct memory. But if a perf counter interrupt is delivered while the cpu is already executing in kernel space, then the trap entry code will put ASI_P in %asi, and this will prevent copy_from_user() from reading any useful stack data in either of the perf_callchain_user_X functions, and thus no user callgraph data will be collected for this sample period. An additional problem is that a fault is guaranteed to occur, and though it will be silently covered up, it wastes time and could perturb state. In perf_callchain_user(), we ensure that %asi contains ASI_AIUS because we know for a fact that the subsequent calls to copy_from_user() are intended to read the user's stack. [ Use get_fs()/set_fs() -DaveM ] Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-23treewide: Remove old email addressPeter Zijlstra
There were still a number of references to my old Red Hat email address in the kernel source. Remove these while keeping the Red Hat copyright notices intact. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-13perf/core: Drop PERF_EVENT_TXNSukadev Bhattiprolu
We currently use PERF_EVENT_TXN flag to determine if we are in the middle of a transaction. If in a transaction, we defer the schedulability checks from pmu->add() operation to the pmu->commit() operation. Now that we have "transaction types" (PERF_PMU_TXN_ADD, PERF_PMU_TXN_READ) we can use the type to determine if we are in a transaction and drop the PERF_EVENT_TXN flag. When PERF_EVENT_TXN is dropped, the cpuhw->group_flag on some architectures becomes unused, so drop that field as well. This is an extension of the Powerpc patch from Peter Zijlstra to s390, Sparc and x86 architectures. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1441336073-22750-11-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-13perf/core: Add a 'flags' parameter to the PMU transactional interfacesSukadev Bhattiprolu
Currently, the PMU interface allows reading only one counter at a time. But some PMUs like the 24x7 counters in Power, support reading several counters at once. To leveage this functionality, extend the transaction interface to support a "transaction type". The first type, PERF_PMU_TXN_ADD, refers to the existing transactions, i.e. used to _schedule_ all the events on the PMU as a group. A second transaction type, PERF_PMU_TXN_READ, will be used in a follow-on patch, by the 24x7 counters to read several counters at once. Extend the transaction interfaces to the PMU to accept a 'txn_flags' parameter and use this parameter to ignore any transactions that are not of type PERF_PMU_TXN_ADD. Thanks to Peter Zijlstra for his input. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> [peterz: s390 compile fix] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1441336073-22750-3-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-13sparc, perf/sparc: Remove unnecessary assignmentSukadev Bhattiprolu
In ->commit_txn() 'cpuc' is already initialized when it is declared, so we can remove the duplicate assignment. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: sparclinux@vger.kernel.org Link: http://lkml.kernel.org/r/1441336073-22750-2-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-25sparc64: perf: Use UREG_FP rather than UREG_I6David Ahern
perf walks userspace callchains by following frame pointers. Use the UREG_FP macro to make it clearer that the %fp is being used. Signed-off-by: David Ahern <david.ahern@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-25sparc64: perf: Add sanity checking on addresses in user stackDavid Ahern
Processes are getting killed (sigbus or segv) while walking userspace callchains when using perf. In some instances I have seen ufp = 0x7ff which does not seem like a proper stack address. This patch adds a function to run validity checks against the address before attempting the copy_from_user. The checks are copied from the x86 version as a start point with the addition of a 4-byte alignment check. Signed-off-by: David Ahern <david.ahern@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-25sparc: perf: Disable pagefaults while walking userspace stacksDavid Ahern
Page faults generated walking userspace stacks can call schedule to switch out the task. When collecting callchains for scheduler tracepoints this causes a deadlock as the tracepoints can be hit with the runqueue lock held: [ 8138.159054] WARNING: CPU: 758 PID: 12488 at /opt/dahern/linux.git/arch/sparc/kernel/nmi.c:80 perfctr_irq+0x1f8/0x2b4() [ 8138.203152] Watchdog detected hard LOCKUP on cpu 758 [ 8138.410969] CPU: 758 PID: 12488 Comm: perf Not tainted 4.0.0-rc6+ #6 [ 8138.437146] Call Trace: [ 8138.447193] [000000000045cdd4] warn_slowpath_common+0x7c/0xa0 [ 8138.471238] [000000000045ce90] warn_slowpath_fmt+0x30/0x40 [ 8138.494189] [0000000000983e38] perfctr_irq+0x1f8/0x2b4 [ 8138.515716] [00000000004209f4] tl0_irq15+0x14/0x20 [ 8138.535791] [00000000009839ec] _raw_spin_trylock_bh+0x68/0x108 [ 8138.560180] [0000000000980018] __schedule+0xcc/0x710 [ 8138.580981] [00000000009806dc] preempt_schedule_common+0x10/0x3c [ 8138.606082] [000000000098077c] _cond_resched+0x34/0x44 [ 8138.627603] [0000000000565990] kmem_cache_alloc_node+0x24/0x1a0 [ 8138.652345] [0000000000450b60] tsb_grow+0xac/0x488 [ 8138.672429] [0000000000985040] do_sparc64_fault+0x4dc/0x6e4 [ 8138.695736] [0000000000407c2c] sparc64_realfault_common+0x10/0x20 [ 8138.721202] [00000000006f2e24] NG4copy_from_user+0xa4/0x3c0 [ 8138.744510] [000000000044f900] perf_callchain_user+0x5c/0x6c [ 8138.768182] [0000000000517b5c] perf_callchain+0x16c/0x19c [ 8138.790774] [0000000000515f84] perf_prepare_sample+0x68/0x218 [ 8138.814801] ---[ end trace 42ca6294b1ff7573 ]--- As with PowerPC (b59a1bfcc240, "powerpc/perf: Disable pagefaults during callchain stack read") disable pagefaults while walking userspace stacks. Signed-off-by: David Ahern <david.ahern@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-21sparc64: Use M7 PMC write on all chips T4 and onward.David S. Miller
They both work equally well, and the M7 implementation is simpler and cheaper (less register writes). With help from David Ahern. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-19sparc: perf: Add support M7 processorDavid Ahern
The M7 processor has a different hypervisor group id and different PCR fast trap values. PIC read/write functions and PCR bit fields are the same as the T4 so those are reused. Signed-off-by: David Ahern <david.ahern@oracle.com> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-19sparc: perf: Make counting mode actually workDavid Ahern
Currently perf-stat (aka, counting mode) does not work: $ perf stat ls ... Performance counter stats for 'ls': 1.585665 task-clock (msec) # 0.580 CPUs utilized 24 context-switches # 0.015 M/sec 0 cpu-migrations # 0.000 K/sec 86 page-faults # 0.054 M/sec <not supported> cycles <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend <not supported> instructions <not supported> branches <not supported> branch-misses 0.002735100 seconds time elapsed The reason is that state is never reset (stays with PERF_HES_UPTODATE set). Add a call to sparc_pmu_enable_event during the added_event handling. Clean up the encoding since pmu_start calls sparc_pmu_enable_event which does the same. Passing PERF_EF_RELOAD to sparc_pmu_start means the call to sparc_perf_event_set_period can be removed as well. With this patch: $ perf stat ls ... Performance counter stats for 'ls': 1.552890 task-clock (msec) # 0.552 CPUs utilized 24 context-switches # 0.015 M/sec 0 cpu-migrations # 0.000 K/sec 86 page-faults # 0.055 M/sec 5,748,997 cycles # 3.702 GHz <not supported> stalled-cycles-frontend:HG <not supported> stalled-cycles-backend:HG 1,684,362 instructions:HG # 0.29 insns per cycle 295,133 branches:HG # 190.054 M/sec 28,007 branch-misses:HG # 9.49% of all branches 0.002815665 seconds time elapsed Signed-off-by: David Ahern <david.ahern@oracle.com> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-19sparc: perf: Remove redundant perf_pmu_{en|dis}able callsDavid Ahern
perf_pmu_disable is called by core perf code before pmu->del and the enable function is called by core perf code afterwards. No need to call again within sparc_pmu_del. Ditto for pmu->add and sparc_pmu_add. Signed-off-by: David Ahern <david.ahern@oracle.com> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-15Merge branch 'for-3.18-consistent-ops' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu consistent-ops changes from Tejun Heo: "Way back, before the current percpu allocator was implemented, static and dynamic percpu memory areas were allocated and handled separately and had their own accessors. The distinction has been gone for many years now; however, the now duplicate two sets of accessors remained with the pointer based ones - this_cpu_*() - evolving various other operations over time. During the process, we also accumulated other inconsistent operations. This pull request contains Christoph's patches to clean up the duplicate accessor situation. __get_cpu_var() uses are replaced with with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr(). Unfortunately, the former sometimes is tricky thanks to C being a bit messy with the distinction between lvalues and pointers, which led to a rather ugly solution for cpumask_var_t involving the introduction of this_cpu_cpumask_var_ptr(). This converts most of the uses but not all. Christoph will follow up with the remaining conversions in this merge window and hopefully remove the obsolete accessors" * 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits) irqchip: Properly fetch the per cpu offset percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write. percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t Revert "powerpc: Replace __get_cpu_var uses" percpu: Remove __this_cpu_ptr clocksource: Replace __this_cpu_ptr with raw_cpu_ptr sparc: Replace __get_cpu_var uses avr32: Replace __get_cpu_var with __this_cpu_write blackfin: Replace __get_cpu_var uses tile: Use this_cpu_ptr() for hardware counters tile: Replace __get_cpu_var uses powerpc: Replace __get_cpu_var uses alpha: Replace __get_cpu_var ia64: Replace __get_cpu_var uses s390: cio driver &__get_cpu_var replacements s390: Replace __get_cpu_var uses mips: Replace __get_cpu_var uses MIPS: Replace __get_cpu_var uses in FPU emulator. arm: Replace __this_cpu_ptr with raw_cpu_ptr ...
2014-09-16sparc64: T5 PMUbob picco
The T5 (niagara5) has different PCR related HV fast trap values and a new HV API Group. This patch utilizes these and shares when possible with niagara4. We use the same sparc_pmu niagara4_pmu. Should there be new effort to obtain the MCU perf statistics then this would have to be changed. Cc: sparclinux@vger.kernel.org Signed-off-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26sparc: Replace __get_cpu_var usesChristoph Lameter
__get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : #define __get_cpu_var(var) (*this_cpu_ptr(&(var))) __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and less registers are used when code is generated. At the end of the patch set all uses of __get_cpu_var have been removed so the macro is removed too. The patch set includes passes over all arches as well. Once these operations are used throughout then specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by f.e. using a global register that may be set to the per cpu base. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int *x = &__get_cpu_var(y); Converts to int *x = this_cpu_ptr(&y); 2. Same as #1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int *x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, y); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(&x, this_cpu_ptr(&y), sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to __this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) Cc: sparclinux@vger.kernel.org Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-08-11sparc64: Fix pcr_ops initialization and usage bugs.David S. Miller
Christopher reports that perf_event_print_debug() can crash in uniprocessor builds. The crash is due to pcr_ops being NULL. This happens because pcr_arch_init() is only invoked by smp_cpus_done() which only executes in SMP builds. init_hw_perf_events() is closely intertwined with pcr_ops being setup properly, therefore: 1) Call pcr_arch_init() early on from init_hw_perf_events(), instead of from smp_cpus_done(). 2) Do not hook up a PMU type if pcr_ops is NULL after pcr_arch_init(). 3) Move init_hw_perf_events to a later initcall so that it we will be sure to invoke pcr_arch_init() after all cpus are brought up. Finally, guard the one naked sequence of pcr_ops dereferences in __global_pmu_self() with an appropriate NULL check. Reported-by: Christopher Alexander Tobias Schulze <cat.schulze@alice-dsl.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-18sparc64: fix sparse warnings in perf_event.cSam Ravnborg
Fix following sparse warnings: kernel/perf_event.c:113:1: warning: symbol 'cpu_hw_events' was not declared. Should it be static? kernel/perf_event.c:1156:6: warning: symbol 'perf_event_grab_pmc' was not declared. Should it be static? kernel/perf_event.c:1172:6: warning: symbol 'perf_event_release_pmc' was not declared. Should it be static? kernel/perf_event.c:1672:12: warning: symbol 'init_hw_perf_events' was not declared. Should it be static? kernel/perf_event.c:1749:52: warning: incorrect type in argument 2 (different address spaces) kernel/perf_event.c:1772:60: warning: incorrect type in argument 2 (different address spaces) kernel/perf_event.c:1779:60: warning: incorrect type in argument 2 (different address spaces) Define the functions static as they are not used outside this file. Fix it so copy_from_user are supplied with pointers annotated _user Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-10-26sparc64: Make montmul/montsqr/mpmul usable in 32-bit threads.David S. Miller
The Montgomery Multiply, Montgomery Square, and Multiple-Precision Multiply instructions work by loading a combination of the floating point and multiple register windows worth of integer registers with the inputs. These values are 64-bit. But for 32-bit userland processes we only save the low 32-bits of each integer register during a register spill. This is because the register window save area is in the user stack and has a fixed layout. Therefore, the only way to use these instruction in 32-bit mode is to perform the following sequence: 1) Load the top-32bits of a choosen integer register with a sentinel, say "-1". This will be in the outer-most register window. The idea is that we're trying to see if the outer-most register window gets spilled, and thus the 64-bit values were truncated. 2) Load all the inputs for the montmul/montsqr/mpmul instruction, down to the inner-most register window. 3) Execute the opcode. 4) Traverse back up to the outer-most register window. 5) Check the sentinel, if it's still "-1" store the results. Otherwise retry the entire sequence. This retry is extremely troublesome. If you're just unlucky and an interrupt or other trap happens, it'll push that outer-most window to the stack and clear the sentinel when we restore it. We could retry forever and never make forward progress if interrupts arrive at a fast enough rate (consider perf events as one example). So we have do limited retries and fallback to software which is extremely non-deterministic. Luckily it's very straightforward to provide a mechanism to let 32-bit applications use a 64-bit stack. Stacks in 64-bit mode are biased by 2047 bytes, which means that the lowest bit is set in the actual %sp register value. So if we see bit zero set in a 32-bit application's stack we treat it like a 64-bit stack. Runtime detection of such a facility is tricky, and cumbersome at best. For example, just trying to use a biased stack and seeing if it works is hard to recover from (the signal handler will need to use an alt stack, plus something along the lines of longjmp). Therefore, we add a system call to report a bitmask of arch specific features like this in a cheap and less hairy way. With help from Andy Polyakov. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-10-16sparc64: Fix bit twiddling in sparc_pmu_enable_event().David S. Miller
There was a serious disconnect in the logic happening in sparc_pmu_disable_event() vs. sparc_pmu_enable_event(). Event disable is implemented by programming a NOP event into the PCR. However, event enable was not reversing this operation. Instead, it was setting the User/Priv/Hypervisor trace enable bits. That's not sparc_pmu_enable_event()'s job, that's what sparc_pmu_enable() and sparc_pmu_disable() do . The intent of sparc_pmu_enable_event() is clear, since it first clear out the event type encoding field. So fix this by OR'ing in the event encoding rather than the trace enable bits. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-10-14sparc64: Like x86 we should check current->mm during perf backtrace generation.David S. Miller
If the MM is not active, only report the top-level PC. Do not try to access the address space. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-18sparc64: Update generic comments in perf event code to match reality.David S. Miller
Describe how we support two types of PMU setups, one with a single control register and two counters stored in a single register, and another with one control register per counter and each counter living in it's own register. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-18sparc64: Add SPARC-T4 perf event support.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-18sparc64: Support perf event encoding for multi-PCR PMUs.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-18sparc64: Make sparc_pmu_{enable,disable}_event() multi-pcr aware.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-18sparc64: Rework sparc_pmu_enable() so that the side effects are clearer.David S. Miller
When cpuc->n_events is zero, we actually don't do anything and we just write the cpuc->pcr[0] value as-is without any modifications. The "pcr = 0;" assignment there was just useless and confusing. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-18sparc64: Prepare perf event layer for handling multiple PCR registers.David S. Miller
Make the per-cpu pcr save area an array instead of one u64. Describe how many PCR and PIC registers the chip has in the sparc_pmu descriptor. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-18sparc64: Specify user and supervisor trace PCR bits in sparc_pmu.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-18sparc64: Abstract PMC read/write behind sparc_pmu.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-18sparc64: Allow max hw perf events to be variable.David S. Miller
Now specified in sparc_pmu descriptor. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-18sparc64: Add perf_event abstractions for orthogonal PMUs.David S. Miller
Starting with SPARC-T4 we have a seperate PCR control register for each performance counter, and there are absolutely no restrictions on what events can run on which counters. Add flags that we can use to elide the conflict and dependency logic used to handle older chips. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-18sparc64: Abstract away PIC register accesses.David S. Miller
And, like for the PCR, allow indexing of different PIC register numbers. This also removes all of the non-__KERNEL__ bits from asm/perfctr.h, nothing kernel side should include it any more. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-18sparc64: Add 'reg_num' argument to pcr_ops methods.David S. Miller
SPARC-T4 and later have multiple PCR registers, one for each PIC counter. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-09perf: Pass last sampling period to perf_sample_data_init()Robert Richter
We always need to pass the last sample period to perf_sample_data_init(), otherwise the event distribution will be wrong. Thus, modifiyng the function interface with the required period as argument. So basically a pattern like this: perf_sample_data_init(&data, ~0ULL); data.period = event->hw.last_period; will now be like that: perf_sample_data_init(&data, ~0ULL, event->hw.last_period); Avoids unininitialized data.period and simplifies code. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-3-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-28Disintegrate asm/system.h for SparcDavid Howells
Disintegrate asm/system.h for Sparc. Signed-off-by: David Howells <dhowells@redhat.com> cc: sparclinux@vger.kernel.org
2012-03-05perf: Disable PERF_SAMPLE_BRANCH_* when not supportedStephane Eranian
PERF_SAMPLE_BRANCH_* is disabled for: - SW events (sw counters, tracepoints) - HW breakpoints - ALL but Intel x86 architecture - AMD64 processors Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-10-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-27sparc: Detect and handle UltraSPARC-T3 cpu types.David S. Miller
The cpu compatible string we look for is "SPARC-T3". As far as memset/memcpy optimizations go, we treat this chip the same as Niagara-T2/T2+. Use cache initializing stores for memset, and use perfetch, FPU block loads, cache initializing stores, and block stores for copies. We use the Niagara-T2 perf support, since T3 is a close relative in this regard. Later we'll add support for the new events T3 can report, plus enable T3's new "sample" mode. For now I haven't added any new ELF hwcap flags. We probably need to add a couple, for example: T2 and T3 both support the population count instruction in hardware. T3 supports VIS3 instructions, including support (finally) for partitioned shift. One can also now move directly between float and integer registers. T3 supports instructions meant to help with Galois Field and other HPC calculations, such as XOR multiply. Also there are "OP and negate" instructions, for example "fnmul" which is multiply-and-negate. T3 recognizes the transactional memory opcodes, however since transactional memory isn't supported: 1) 'commit' behaves as a NOP and 2) 'chkpt' always branches 3) 'rdcps' returns all zeros and 4) 'wrcps' behaves as a NOP. So we'll need about 3 new elf capability flags in the end to represent all of these things. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-26atomic: use <linux/atomic.h>Arun Sharma
This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>