summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2019-04-29perf/x86: Make perf callchains work without CONFIG_FRAME_POINTERKairui Song
Currently perf callchain doesn't work well with ORC unwinder when sampling from trace point. We'll get useless in kernel callchain like this: perf 6429 [000] 22.498450: kmem:mm_page_alloc: page=0x176a17 pfn=1534487 order=0 migratetype=0 gfp_flags=GFP_KERNEL ffffffffbe23e32e __alloc_pages_nodemask+0x22e (/lib/modules/5.1.0-rc3+/build/vmlinux) 7efdf7f7d3e8 __poll+0x18 (/usr/lib64/libc-2.28.so) 5651468729c1 [unknown] (/usr/bin/perf) 5651467ee82a main+0x69a (/usr/bin/perf) 7efdf7eaf413 __libc_start_main+0xf3 (/usr/lib64/libc-2.28.so) 5541f689495641d7 [unknown] ([unknown]) The root cause is that, for trace point events, it doesn't provide a real snapshot of the hardware registers. Instead perf tries to get required caller's registers and compose a fake register snapshot which suppose to contain enough information for start a unwinding. However without CONFIG_FRAME_POINTER, if failed to get caller's BP as the frame pointer, so current frame pointer is returned instead. We get a invalid register combination which confuse the unwinder, and end the stacktrace early. So in such case just don't try dump BP, and let the unwinder start directly when the register is not a real snapshot. Use SP as the skip mark, unwinder will skip all the frames until it meet the frame of the trace point caller. Tested with frame pointer unwinder and ORC unwinder, this makes perf callchain get the full kernel space stacktrace again like this: perf 6503 [000] 1567.570191: kmem:mm_page_alloc: page=0x16c904 pfn=1493252 order=0 migratetype=0 gfp_flags=GFP_KERNEL ffffffffb523e2ae __alloc_pages_nodemask+0x22e (/lib/modules/5.1.0-rc3+/build/vmlinux) ffffffffb52383bd __get_free_pages+0xd (/lib/modules/5.1.0-rc3+/build/vmlinux) ffffffffb52fd28a __pollwait+0x8a (/lib/modules/5.1.0-rc3+/build/vmlinux) ffffffffb521426f perf_poll+0x2f (/lib/modules/5.1.0-rc3+/build/vmlinux) ffffffffb52fe3e2 do_sys_poll+0x252 (/lib/modules/5.1.0-rc3+/build/vmlinux) ffffffffb52ff027 __x64_sys_poll+0x37 (/lib/modules/5.1.0-rc3+/build/vmlinux) ffffffffb500418b do_syscall_64+0x5b (/lib/modules/5.1.0-rc3+/build/vmlinux) ffffffffb5a0008c entry_SYSCALL_64_after_hwframe+0x44 (/lib/modules/5.1.0-rc3+/build/vmlinux) 7f71e92d03e8 __poll+0x18 (/usr/lib64/libc-2.28.so) 55a22960d9c1 [unknown] (/usr/bin/perf) 55a22958982a main+0x69a (/usr/bin/perf) 7f71e9202413 __libc_start_main+0xf3 (/usr/lib64/libc-2.28.so) 5541f689495641d7 [unknown] ([unknown]) Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Young <dyoung@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190422162652.15483-1-kasong@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16perf/x86/intel: Add Tremont core PMU supportKan Liang
Add perf core PMU support for Intel Tremont CPU. The init code is based on Goldmont plus. The generic purpose counter 0 and fixed counter 0 have less skid. Force :ppp events on generic purpose counter 0. Force instruction:ppp on generic purpose counter 0 and fixed counter 0. Updates LLC cache event table and OFFCORE_RESPONSE mask. Adaptive PEBS, which is already enabled on ICL, is also supported on Tremont. No extra code required. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lkml.kernel.org/r/1554922629-126287-3-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16perf/x86/intel/uncore: Add Intel Icelake uncore supportKan Liang
Add Intel Icelake uncore support: - The init code is based on Skylake - Add new PCI id for IMC - New MSR address for CBOX - Get CBOX# from CNL_UNC_CBO_CONFIG MSR directly - Create a new PMU for fixed clocktick counter Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lkml.kernel.org/r/20190402194509.2832-13-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16perf/x86/msr: Add Icelake supportKan Liang
Icelake is the same as the existing Skylake parts. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lkml.kernel.org/r/20190402194509.2832-12-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16perf/x86/intel/rapl: Add Icelake supportKan Liang
Icelake support the same RAPL counters as Skylake. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lkml.kernel.org/r/20190402194509.2832-11-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16perf/x86/intel/cstate: Add Icelake supportKan Liang
Icelake uses the same C-state residency events as Sandy Bridge. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lkml.kernel.org/r/20190402194509.2832-10-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16perf/x86/intel: Add Icelake supportKan Liang
Add Icelake core PMU perf code, including constraint tables and the main enable code. Icelake expanded the generic counters to always 8 even with HT on, but a range of events cannot be scheduled on the extra 4 counters. Add new constraint ranges to describe this to the scheduler. The number of constraints that need to be checked is larger now than with earlier CPUs. At some point we may need a new data structure to look them up more efficiently than with linear search. So far it still seems to be acceptable however. Icelake added a new fixed counter SLOTS. Full support for it is added later in the patch series. The cache events table is identical to Skylake. Compare to PEBS instruction event on generic counter, fixed counter 0 has less skid. Force instruction:ppp always in fixed counter 0. Originally-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lkml.kernel.org/r/20190402194509.2832-9-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16perf/x86: Support constraint rangesPeter Zijlstra
Icelake extended the general counters to 8, even when SMT is enabled. However only a (large) subset of the events can be used on all 8 counters. The events that can or cannot be used on all counters are organized in ranges. A lot of scheduler constraints are required to handle all this. To avoid blowing up the tables add event code ranges to the constraint tables, and a new inline function to match them. Originally-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # developer hat on Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # maintainer hat on Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lkml.kernel.org/r/20190402194509.2832-8-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16perf/x86/lbr: Avoid reading the LBRs when adaptive PEBS handles themAndi Kleen
With adaptive PEBS the CPU can directly supply the LBR information, so we don't need to read it again. But the LBRs still need to be enabled. Add a special count to the cpuc that distinguishes these two cases, and avoid reading the LBRs unnecessarily when PEBS is active. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lkml.kernel.org/r/20190402194509.2832-7-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16perf/x86/intel: Support adaptive PEBS v4Kan Liang
Adaptive PEBS is a new way to report PEBS sampling information. Instead of a fixed size record for all PEBS events it allows to configure the PEBS record to only include the information needed. Events can then opt in to use such an extended record, or stay with a basic record which only contains the IP. The major new feature is to support LBRs in PEBS record. Besides normal LBR, this allows (much faster) large PEBS, while still supporting callstacks through callstack LBR. So essentially a lot of profiling can now be done without frequent interrupts, dropping the overhead significantly. The main requirement still is to use a period, and not use frequency mode, because frequency mode requires reevaluating the frequency on each overflow. The floating point state (XMM) is also supported, which allows efficient profiling of FP function arguments. Introduce specific drain function to handle variable length records. Use a new callback to parse the new record format, and also handle the STATUS field now being at a different offset. Add code to set up the configuration register. Since there is only a single register, all events either get the full super set of all events, or only the basic record. Originally-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lkml.kernel.org/r/20190402194509.2832-6-kan.liang@linux.intel.com [ Renamed GPRS => GP. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16perf/x86/intel/ds: Extract code of event update in short periodKan Liang
The drain_pebs() could be called twice in a short period for auto-reload event in pmu::read(). The intel_pmu_save_and_restart_reload() should be called to update the event->count. This case should also be handled on Icelake. Extract the code for later reuse. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lkml.kernel.org/r/20190402194509.2832-5-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16perf/x86/intel: Extract memory code PEBS parser for reuseAndi Kleen
Extract some code related to memory profiling from the PEBS record parser into separate functions. It can be reused by the upcoming adaptive PEBS parser. No functional changes. Rename intel_hsw_weight to intel_get_tsx_weight, and intel_hsw_transaction to intel_get_tsx_transaction. Because the input is not the hsw pebs format anymore. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lkml.kernel.org/r/20190402194509.2832-4-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16perf/x86: Support outputting XMM registersKan Liang
Starting from Icelake, XMM registers can be collected in PEBS record. But current code only output the pt_regs. Add a new struct x86_perf_regs for both pt_regs and xmm_regs. The xmm_regs will be used later to keep a pointer to PEBS record which has XMM information. XMM registers are 128 bit. To simplify the code, they are handled like two different registers, which means setting two bits in the register bitmap. This also allows only sampling the lower 64bit bits in XMM. The index of XMM registers starts from 32. There are 16 XMM registers. So all reserved space for regs are used. Remove REG_RESERVED. Add PERF_REG_X86_XMM_MAX, which stands for the max number of all x86 regs including both GPRs and XMM. Add REG_NOSUPPORT for 32bit to exclude unsupported registers. Previous platforms can not collect XMM information in PEBS record. Adding pebs_no_xmm_regs to indicate the unsupported platforms. The common code still validates the supported registers. However, it cannot check model specific registers, e.g. XMM. Add extra check in x86_pmu_hw_config() to reject invalid config of regs_user and regs_intr. The regs_user never supports XMM collection. The regs_intr only supports XMM collection when sampling PEBS event on icelake and later platforms. Originally-by: Andi Kleen <ak@linux.intel.com> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lkml.kernel.org/r/20190402194509.2832-3-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16perf/x86/intel: Force resched when TFA sysctl is modifiedStephane Eranian
This patch provides guarantee to the sysadmin that when TFA is disabled, no PMU event is using PMC3 when the echo command returns. Vice-Versa, when TFA is enabled, PMU can use PMC3 immediately (to eliminate possible multiplexing). $ perf stat -a -I 1000 --no-merge -e branches,branches,branches,branches 1.000123979 125,768,725,208 branches 1.000562520 125,631,000,456 branches 1.000942898 125,487,114,291 branches 1.001333316 125,323,363,620 branches 2.004721306 125,514,968,546 branches 2.005114560 125,511,110,861 branches 2.005482722 125,510,132,724 branches 2.005851245 125,508,967,086 branches 3.006323475 125,166,570,648 branches 3.006709247 125,165,650,056 branches 3.007086605 125,164,639,142 branches 3.007459298 125,164,402,912 branches 4.007922698 125,045,577,140 branches 4.008310775 125,046,804,324 branches 4.008670814 125,048,265,111 branches 4.009039251 125,048,677,611 branches 5.009503373 125,122,240,217 branches 5.009897067 125,122,450,517 branches Then on another connection, sysadmin does: $ echo 1 >/sys/devices/cpu/allow_tsx_force_abort Then perf stat adjusts the events immediately: 5.010286029 125,121,393,483 branches 5.010646308 125,120,556,786 branches 6.011113588 124,963,351,832 branches 6.011510331 124,964,267,566 branches 6.011889913 124,964,829,130 branches 6.012262996 124,965,841,156 branches 7.012708299 124,419,832,234 branches [79.69%] 7.012847908 124,416,363,853 branches [79.73%] 7.013225462 124,400,723,712 branches [79.73%] 7.013598191 124,376,154,434 branches [79.70%] 8.014089834 124,250,862,693 branches [74.98%] 8.014481363 124,267,539,139 branches [74.94%] 8.014856006 124,259,519,786 branches [74.98%] 8.014980848 124,225,457,969 branches [75.04%] 9.015464576 124,204,235,423 branches [75.03%] 9.015858587 124,204,988,490 branches [75.04%] 9.016243680 124,220,092,486 branches [74.99%] 9.016620104 124,231,260,146 branches [74.94%] And vice-versa if the syadmin does: $ echo 0 >/sys/devices/cpu/allow_tsx_force_abort Events are again spread over the 4 counters: 10.017096277 124,276,230,565 branches [74.96%] 10.017237209 124,228,062,171 branches [75.03%] 10.017478637 124,178,780,626 branches [75.03%] 10.017853402 124,198,316,177 branches [75.03%] 11.018334423 124,602,418,933 branches [85.40%] 11.018722584 124,602,921,320 branches [85.42%] 11.019095621 124,603,956,093 branches [85.42%] 11.019467742 124,595,273,783 branches [85.42%] 12.019945736 125,110,114,864 branches 12.020330764 125,109,334,472 branches 12.020688740 125,109,818,865 branches 12.021054020 125,108,594,014 branches 13.021516774 125,109,164,018 branches 13.021903640 125,108,794,510 branches 13.022270770 125,107,756,978 branches 13.022630819 125,109,380,471 branches 14.023114989 125,133,140,817 branches 14.023501880 125,133,785,858 branches 14.023868339 125,133,852,700 branches Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: kan.liang@intel.com Cc: nelson.dsouza@intel.com Cc: tonyj@suse.com Link: https://lkml.kernel.org/r/20190408173252.37932-3-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16perf/x86: Fix incorrect PEBS_REGSKan Liang
PEBS_REGS used as mask for the supported registers for large PEBS. However, the mask cannot filter the sample_regs_user/sample_regs_intr correctly. (1ULL << PERF_REG_X86_*) should be used to replace PERF_REG_X86_*, which is only the index. Rename PEBS_REGS to PEBS_GP_REGS, because the mask is only for general purpose registers. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Fixes: 2fe1bc1f501d ("perf/x86: Enable free running PEBS for REGS_USER/INTR") Link: https://lkml.kernel.org/r/20190402194509.2832-2-kan.liang@linux.intel.com [ Renamed it to PEBS_GP_REGS - as 'GPRS' is used elsewhere ;-) ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16Merge branch 'linus' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-13Merge tag 'powerpc-5.1-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "A minor build fix for 64-bit FLATMEM configs. A fix for a boot failure on 32-bit powermacs. My commit to fix CLOCK_MONOTONIC across Y2038 broke the 32-bit VDSO on 64-bit kernels, ie. compat mode, which is only used on big endian. The rewrite of the SLB code we merged in 4.20 missed the fact that the 0x380 exception is also used with the Radix MMU to report out of range accesses. This could lead to an oops if userspace tried to read from addresses outside the user or kernel range. Thanks to: Aneesh Kumar K.V, Christophe Leroy, Larry Finger, Nicholas Piggin" * tag 'powerpc-5.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Define MAX_PHYSMEM_BITS for all 64-bit configs powerpc/64s/radix: Fix radix segment exception handling powerpc/vdso32: fix CLOCK_MONOTONIC on PPC64 powerpc/32: Fix early boot failure with RTAS built-in
2019-04-13Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "The main thing is a fix to our FUTEX_WAKE_OP implementation which was unbelievably broken, but did actually work for the one scenario that GLIBC used to use. Summary: - Fix stack unwinding so we ignore user stacks - Fix ftrace module PLT trampoline initialisation checks - Fix terminally broken implementation of FUTEX_WAKE_OP atomics" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value arm64: backtrace: Don't bother trying to unwind the userspace stack arm64/ftrace: fix inadvertent BUG() in trampoline check
2019-04-12Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Fix typos in user-visible resctrl parameters, and also fix assembly constraint bugs that might result in miscompilation" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Use stricter assembly constraints in bitops x86/resctrl: Fix typos in the mba_sc mount option
2019-04-12Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Six kernel side fixes: three related to NMI handling on AMD systems, a race fix, a kexec initialization fix and a PEBS sampling fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix perf_event_disable_inatomic() race x86/perf/amd: Remove need to check "running" bit in NMI handler x86/perf/amd: Resolve NMI latency issues for active PMCs x86/perf/amd: Resolve race condition when disabling PMC perf/x86/intel: Initialize TFA MSR perf/x86/intel: Fix handling of wakeup_events for multi-entry PEBS
2019-04-12Merge tag 'dma-mapping-5.1-1' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping fixes from Christoph Hellwig: "Fix a sparc64 sun4v_pci regression introduced in this merged window, and a dma-debug stracktrace regression from the big refactor last merge window" * tag 'dma-mapping-5.1-1' of git://git.infradead.org/users/hch/dma-mapping: dma-debug: only skip one stackframe entry sparc64/pci_sun4v: fix ATU checks for large DMA masks
2019-04-12arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result valueWill Deacon
Rather embarrassingly, our futex() FUTEX_WAKE_OP implementation doesn't explicitly set the return value on the non-faulting path and instead leaves it holding the result of the underlying atomic operation. This means that any FUTEX_WAKE_OP atomic operation which computes a non-zero value will be reported as having failed. Regrettably, I wrote the buggy code back in 2011 and it was upstreamed as part of the initial arm64 support in 2012. The reasons we appear to get away with this are: 1. FUTEX_WAKE_OP is rarely used and therefore doesn't appear to get exercised by futex() test applications 2. If the result of the atomic operation is zero, the system call behaves correctly 3. Prior to version 2.25, the only operation used by GLIBC set the futex to zero, and therefore worked as expected. From 2.25 onwards, FUTEX_WAKE_OP is not used by GLIBC at all. Fix the implementation by ensuring that the return value is either 0 to indicate that the atomic operation completed successfully, or -EFAULT if we encountered a fault when accessing the user mapping. Cc: <stable@kernel.org> Fixes: 6170a97460db ("arm64: Atomic operations") Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-10sparc64/pci_sun4v: fix ATU checks for large DMA masksChristoph Hellwig
Now that we allow drivers to always need to set larger than required DMA masks we need to be a little more careful in the sun4v PCI iommu driver to chose when to select the ATU support - a larger DMA mask can be set even when the platform does not support ATU, so we always have to check if it is avaiable before using it. Add a little helper for that and use it in all the places where we make ATU usage decisions based on the DMA mask. Fixes: 24132a419c68 ("sparc64/pci_sun4v: allow large DMA masks") Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Meelis Roos <mroos@linux.ee> Acked-by: David S. Miller <davem@davemloft.net>
2019-04-10x86/perf/amd: Remove need to check "running" bit in NMI handlerLendacky, Thomas
Spurious interrupt support was added to perf in the following commit, almost a decade ago: 63e6be6d98e1 ("perf, x86: Catch spurious interrupts after disabling counters") The two previous patches (resolving the race condition when disabling a PMC and NMI latency mitigation) allow for the removal of this older spurious interrupt support. Currently in x86_pmu_stop(), the bit for the PMC in the active_mask bitmap is cleared before disabling the PMC, which sets up a race condition. This race condition was mitigated by introducing the running bitmap. That race condition can be eliminated by first disabling the PMC, waiting for PMC reset on overflow and then clearing the bit for the PMC in the active_mask bitmap. The NMI handler will not re-enable a disabled counter. If x86_pmu_stop() is called from the perf NMI handler, the NMI latency mitigation support will guard against any unhandled NMI messages. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> # 4.14.x- Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: https://lkml.kernel.org/r/Message-ID: Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-10powerpc/mm: Define MAX_PHYSMEM_BITS for all 64-bit configsMichael Ellerman
The recent commit 8bc086899816 ("powerpc/mm: Only define MAX_PHYSMEM_BITS in SPARSEMEM configurations") removed our definition of MAX_PHYSMEM_BITS when SPARSEMEM is disabled. This inadvertently broke some 64-bit FLATMEM using configs with eg: arch/powerpc/include/asm/book3s/64/mmu-hash.h:584:6: error: "MAX_PHYSMEM_BITS" is not defined, evaluates to 0 #if (MAX_PHYSMEM_BITS > MAX_EA_BITS_PER_CONTEXT) ^~~~~~~~~~~~~~~~ Fix it by making sure we define MAX_PHYSMEM_BITS for all 64-bit configs regardless of SPARSEMEM. Fixes: 8bc086899816 ("powerpc/mm: Only define MAX_PHYSMEM_BITS in SPARSEMEM configurations") Reported-by: Andreas Schwab <schwab@linux-m68k.org> Reported-by: Hugh Dickins <hughd@google.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-04-09Merge tag 'mips_fixes_5.1_2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Paul Burton: "A few minor MIPS fixes: - Provide struct pt_regs * from get_irq_regs() to kgdb_nmicallback() when handling an IPI triggered by kgdb_roundup_cpus(), matching the behavior of other architectures & resolving kgdb issues for SMP systems. - Defer a pointer dereference until after a NULL check in the irq_shutdown callback for SGI IP27 HUB interrupts. - A defconfig update for the MSCC Ocelot to enable some necessary drivers" * tag 'mips_fixes_5.1_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: generic: Add switchdev, pinctrl and fit to ocelot_defconfig MIPS: SGI-IP27: Fix use of unchecked pointer in shutdown_bridge_irq MIPS: KGDB: fix kgdb support for SMP platforms.
2019-04-08Merge tag 'xtensa-20190408' of git://github.com/jcmvbkbc/linux-xtensaLinus Torvalds
Pull xtensa fixes from Max Filippov: - fix syscall number passed to trace_sys_exit - fix syscall number initialization in start_thread - fix level interpretation in the return_address - fix format string warning in init_pmd * tag 'xtensa-20190408' of git://github.com/jcmvbkbc/linux-xtensa: xtensa: fix format string warning in init_pmd xtensa: fix return_address xtensa: fix initialization of pt_regs::syscall in start_thread xtensa: use actual syscall number in do_syscall_trace_leave
2019-04-08arm64: backtrace: Don't bother trying to unwind the userspace stackWill Deacon
Calling dump_backtrace() with a pt_regs argument corresponding to userspace doesn't make any sense and our unwinder will simply print "Call trace:" before unwinding the stack looking for user frames. Rather than go through this song and dance, just return early if we're passed a user register state. Cc: <stable@vger.kernel.org> Fixes: 1149aad10b1e ("arm64: Add dump_backtrace() in show_regs") Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-08arm64/ftrace: fix inadvertent BUG() in trampoline checkArd Biesheuvel
The ftrace trampoline code (which deals with modules loaded out of BL range of the core kernel) uses plt_entries_equal() to check whether the per-module trampoline equals a zero buffer, to decide whether the trampoline has already been initialized. This triggers a BUG() in the opcode manipulation code, since we end up checking the ADRP offset of a 0x0 opcode, which is not an ADRP instruction. So instead, add a helper to check whether a PLT is initialized, and call that from the frace code. Cc: <stable@vger.kernel.org> # v5.0 Fixes: bdb85cd1d206 ("arm64/module: switch to ADRP/ADD sequences for PLT entries") Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-08powerpc/64s/radix: Fix radix segment exception handlingNicholas Piggin
Commit 48e7b76957 ("powerpc/64s/hash: Convert SLB miss handlers to C") broke the radix-mode segment exception handler. In radix mode, this is exception is not an SLB miss, rather it signals that the EA is outside the range translated by any page table. The commit lost the radix feature alternate code patch, which can cause faults to some EAs to kernel BUG at arch/powerpc/mm/slb.c:639! The original radix code would send faults to slb_miss_large_addr, which would end up faulting due to slb_addr_limit being 0. This patch sends radix directly to do_bad_slb_fault, which is a bit clearer. Fixes: 48e7b7695745 ("powerpc/64s/hash: Convert SLB miss handlers to C") Cc: stable@vger.kernel.org # v4.20+ Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-04-07Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Olof Johansson: "A collection of fixes from the last few weeks. Most of them are smaller tweaks and fixes to DT and hardware descriptions for boards. Some of the more significant ones are: - eMMC and RGMII stability tweaks for rk3288 - DDC fixes for Rock PI 4 - Audio fixes for two TI am335x eval boards - D_CAN clock fix for am335x - Compilation fixes for clang - !HOTPLUG_CPU compilation fix for one of the new platforms this release (milbeaut) - A revert of a gpio fix for nomadik that instead was fixed in the gpio subsystem - Whitespace fix for the DT JSON schema (no tabs allowed)" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (25 commits) ARM: milbeaut: fix build with !CONFIG_HOTPLUG_CPU ARM: iop: don't use using 64-bit DMA masks ARM: orion: don't use using 64-bit DMA masks Revert "ARM: dts: nomadik: Fix polarity of SPI CS" dt-bindings: cpu: Fix JSON schema arm/mach-at91/pm : fix possible object reference leak ARM: dts: at91: Fix typo in ISC_D0 on PC9 ARM: dts: Fix dcan clkctrl clock for am3 reset: meson-audio-arb: Fix missing .owner setting of reset_controller_dev dt-bindings: reset: meson-g12a: Add missing USB2 PHY resets ARM: dts: rockchip: Remove #address/#size-cells from rk3288-veyron gpio-keys ARM: dts: rockchip: Remove #address/#size-cells from rk3288 mipi_dsi ARM: dts: rockchip: Fix gpu opp node names for rk3288 ARM: dts: am335x-evmsk: Correct the regulators for the audio codec ARM: dts: am335x-evm: Correct the regulators for the audio codec ARM: OMAP2+: add missing of_node_put after of_device_is_available ARM: OMAP1: ams-delta: Fix broken GPIO ID allocation arm64: dts: stratix10: add the sysmgr-syscon property from the gmac's arm64: dts: rockchip: fix rk3328 sdmmc0 write errors arm64: dts: rockchip: fix rk3328 rgmii high tx error rate ...
2019-04-07ARM: milbeaut: fix build with !CONFIG_HOTPLUG_CPUArnd Bergmann
When HOTPLUG_CPU is disabled, some fields in the smp operations are not available or needed: arch/arm/mach-milbeaut/platsmp.c:90:3: error: field designator 'cpu_die' does not refer to any field in type 'struct smp_operations' .cpu_die = m10v_cpu_die, ^ arch/arm/mach-milbeaut/platsmp.c:91:3: error: field designator 'cpu_kill' does not refer to any field in type 'struct smp_operations' .cpu_kill = m10v_cpu_kill, ^ Hide them in an #ifdef like the other platforms do. Fixes: 9fb29c734f9e ("ARM: milbeaut: Add basic support for Milbeaut m10v SoC") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Olof Johansson <olof@lixom.net>
2019-04-07ARM: iop: don't use using 64-bit DMA masksArnd Bergmann
clang warns about statically defined DMA masks from the DMA_BIT_MASK macro with length 64: arch/arm/mach-iop13xx/setup.c:303:35: error: shift count >= width of type [-Werror,-Wshift-count-overflow] static u64 iop13xx_adma_dmamask = DMA_BIT_MASK(64); ^~~~~~~~~~~~~~~~ include/linux/dma-mapping.h:141:54: note: expanded from macro 'DMA_BIT_MASK' #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1)) ^ ~~~ The ones in iop shouldn't really be 64 bit masks, so changing them to what the driver can support avoids the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Olof Johansson <olof@lixom.net>
2019-04-07ARM: orion: don't use using 64-bit DMA masksArnd Bergmann
clang warns about statically defined DMA masks from the DMA_BIT_MASK macro with length 64: arch/arm/plat-orion/common.c:625:29: error: shift count >= width of type [-Werror,-Wshift-count-overflow] .coherent_dma_mask = DMA_BIT_MASK(64), ^~~~~~~~~~~~~~~~ include/linux/dma-mapping.h:141:54: note: expanded from macro 'DMA_BIT_MASK' #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1)) The ones in orion shouldn't really be 64 bit masks, so changing them to what the driver can support avoids the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Olof Johansson <olof@lixom.net>
2019-04-07Revert "ARM: dts: nomadik: Fix polarity of SPI CS"Olof Johansson
This reverts commit fa9463564e77067df81b0b8dec91adbbbc47bfb4. Per Linus Walleij: Dear ARM SoC maintainers, can you please revert this patch. It was the wrong solution to the wrong problem, and I must have acted in stress. Andrey fixed the real bug in a proper way in these commits: commit e5545c94e43b8f6599ffc01df8d1aedf18ee912a "gpio: of: Check propname before applying "cs-gpios" quirks" commit 7ce40277bf848391705011ba37eac2e377cbd9e6 "gpio: of: Check for "spi-cs-high" in child instead of parent node" Signed-off-by: Olof Johansson <olof@lixom.net>
2019-04-07Merge tag 'omap-for-v5.1/fixes-signed' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes Fixes for omaps for v5.1-rc cycle Few small fixes for omap variants: - Fix ams-delta gpio IDs - Add missing of_node_put for omapdss platform init code - Fix unconfigured audio regulators for two am335x boards - Fix use of wrong offset for am335x d_can clocks * tag 'omap-for-v5.1/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: Fix dcan clkctrl clock for am3 ARM: dts: am335x-evmsk: Correct the regulators for the audio codec ARM: dts: am335x-evm: Correct the regulators for the audio codec ARM: OMAP2+: add missing of_node_put after of_device_is_available ARM: OMAP1: ams-delta: Fix broken GPIO ID allocation Signed-off-by: Olof Johansson <olof@lixom.net>
2019-04-07Merge tag 'at91-5.1-fixes' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/fixes AT91 fixes for 5.1 - fix a typo in sama5d2 pinmuxing which concerns the ISC data 0 signal - fix a kobject reference leak * tag 'at91-5.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux: arm/mach-at91/pm : fix possible object reference leak ARM: dts: at91: Fix typo in ISC_D0 on PC9 Signed-off-by: Olof Johansson <olof@lixom.net>
2019-04-07Merge tag 'v5.1-rockchip-dtfixes-1' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes Fixes for dtc warnings, fixes for ethernet transfers on rk3328, sd-card related fixes on both rk3328 ans rk3288-tinker and a regulator fix on rock64 and making ddc actually work on the Rock PI 4 due to missing the ddc bus. * tag 'v5.1-rockchip-dtfixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: ARM: dts: rockchip: Remove #address/#size-cells from rk3288-veyron gpio-keys ARM: dts: rockchip: Remove #address/#size-cells from rk3288 mipi_dsi ARM: dts: rockchip: Fix gpu opp node names for rk3288 arm64: dts: rockchip: fix rk3328 sdmmc0 write errors arm64: dts: rockchip: fix rk3328 rgmii high tx error rate ARM: dts: rockchip: Fix SD card detection on rk3288-tinker arm64: dts: rockchip: Fix vcc_host1_5v GPIO polarity on rk3328-rock64 ARM: dts: rockchip: fix rk3288 cpu opp node reference arm64: dts: rockchip: add DDC bus on Rock Pi 4 arm64: dts: rockchip: fix rk3328-roc-cc gmac2io tx/rx_delay Signed-off-by: Olof Johansson <olof@lixom.net>
2019-04-07Merge tag 'stratix10_fix_for_v5.1' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into arm/fixes arm64: dts: stratix10: fix emac loading warning - Add missing "altr,sysmgr-syscon" property to all gmac nodes * tag 'stratix10_fix_for_v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux: arm64: dts: stratix10: add the sysmgr-syscon property from the gmac's Signed-off-by: Olof Johansson <olof@lixom.net>
2019-04-08powerpc/vdso32: fix CLOCK_MONOTONIC on PPC64Christophe Leroy
Commit b5b4453e7912 ("powerpc/vdso64: Fix CLOCK_MONOTONIC inconsistencies across Y2038") changed the type of wtom_clock_sec to s64 on PPC64. Therefore, VDSO32 needs to read it with a 4 bytes shift in order to retrieve the lower part of it. Fixes: b5b4453e7912 ("powerpc/vdso64: Fix CLOCK_MONOTONIC inconsistencies across Y2038") Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-04-07Merge tag 'for-linus-5.1b-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "One minor fix and a small cleanup for the xen privcmd driver" * tag 'for-linus-5.1b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Prevent buffer overflow in privcmd ioctl xen: use struct_size() helper in kzalloc()
2019-04-06parisc: Detect QEMU earlier in boot processHelge Deller
While adding LASI support to QEMU, I noticed that the QEMU detection in the kernel happens much too late. For example, when a LASI chip is found by the kernel, it registers the LASI LED driver as well. But when we run on QEMU it makes sense to avoid spending unnecessary CPU cycles, so we need to access the running_on_QEMU flag earlier than before. This patch now makes the QEMU detection the fist task of the Linux kernel by moving it to where the kernel enters the C-coding. Fixes: 310d82784fb4 ("parisc: qemu idle sleep support") Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v4.14+
2019-04-06parisc: also set iaoq_b in instruction_pointer_set()Sven Schnelle
When setting the instruction pointer on PA-RISC we also need to set the back of the instruction queue to the new offset, otherwise we will execute on instruction from the new location, and jumping back to the old location stored in iaoq_b. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de> Fixes: 75ebedf1d263 ("parisc: Add HAVE_REGS_AND_STACK_ACCESS_API feature") Cc: stable@vger.kernel.org # 4.19+
2019-04-06parisc: regs_return_value() should return gpr28Sven Schnelle
While working on kretprobes for PA-RISC I was wondering while the kprobes sanity test always fails on kretprobes. This is caused by returning gpr20 instead of gpr28. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # 4.14+
2019-04-06x86/asm: Use stricter assembly constraints in bitopsAlexander Potapenko
There's a number of problems with how arch/x86/include/asm/bitops.h is currently using assembly constraints for the memory region bitops are modifying: 1) Use memory clobber in bitops that touch arbitrary memory Certain bit operations that read/write bits take a base pointer and an arbitrarily large offset to address the bit relative to that base. Inline assembly constraints aren't expressive enough to tell the compiler that the assembly directive is going to touch a specific memory location of unknown size, therefore we have to use the "memory" clobber to indicate that the assembly is going to access memory locations other than those listed in the inputs/outputs. To indicate that BTR/BTS instructions don't necessarily touch the first sizeof(long) bytes of the argument, we also move the address to assembly inputs. This particular change leads to size increase of 124 kernel functions in a defconfig build. For some of them the diff is in NOP operations, other end up re-reading values from memory and may potentially slow down the execution. But without these clobbers the compiler is free to cache the contents of the bitmaps and use them as if they weren't changed by the inline assembly. 2) Use byte-sized arguments for operations touching single bytes. Passing a long value to ANDB/ORB/XORB instructions makes the compiler treat sizeof(long) bytes as being clobbered, which isn't the case. This may theoretically lead to worse code in the case of heavy optimization. Practical impact: I've built a defconfig kernel and looked through some of the functions generated by GCC 7.3.0 with and without this clobber, and didn't spot any miscompilations. However there is a (trivial) theoretical case where this code leads to miscompilation: https://lkml.org/lkml/2019/3/28/393 using just GCC 8.3.0 with -O2. It isn't hard to imagine someone writes such a function in the kernel someday. So the primary motivation is to fix an existing misuse of the asm directive, which happens to work in certain configurations now, but isn't guaranteed to work under different circumstances. [ --mingo: Added -stable tag because defconfig only builds a fraction of the kernel and the trivial testcase looks normal enough to be used in existing or in-development code. ] Signed-off-by: Alexander Potapenko <glider@google.com> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: James Y Knight <jyknight@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20190402112813.193378-1-glider@google.com [ Edited the changelog, tidied up one of the defines. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-05Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "14 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: kernel/sysctl.c: fix out-of-bounds access when setting file-max mm/util.c: fix strndup_user() comment sh: fix multiple function definition build errors MAINTAINERS: add maintainer and replacing reviewer ARM/NUVOTON NPCM MAINTAINERS: fix bad pattern in ARM/NUVOTON NPCM mm: writeback: use exact memcg dirty counts psi: clarify the units used in pressure files mm/huge_memory.c: fix modifying of page protection by insert_pfn_pmd() hugetlbfs: fix memory leak for resv_map mm: fix vm_fault_t cast in VM_FAULT_GET_HINDEX() lib/lzo: fix bugs for very short or empty input include/linux/bitrev.h: fix constant bitrev kmemleak: powerpc: skip scanning holes in the .bss section lib/string.c: implement a basic bcmp
2019-04-05sh: fix multiple function definition build errorsRandy Dunlap
Many of the sh CPU-types have their own plat_irq_setup() and arch_init_clk_ops() functions, so these same (empty) functions in arch/sh/boards/of-generic.c are not needed and cause build errors. If there is some case where these empty functions are needed, they can be retained by marking them as "__weak" while at the same time making builds that do not need them succeed. Fixes these build errors: arch/sh/boards/of-generic.o: In function `plat_irq_setup': (.init.text+0x134): multiple definition of `plat_irq_setup' arch/sh/kernel/cpu/sh2/setup-sh7619.o:(.init.text+0x30): first defined here arch/sh/boards/of-generic.o: In function `arch_init_clk_ops': (.init.text+0x118): multiple definition of `arch_init_clk_ops' arch/sh/kernel/cpu/sh2/clock-sh7619.o:(.init.text+0x0): first defined here Link: http://lkml.kernel.org/r/9ee4e0c5-f100-86a2-bd4d-1d3287ceab31@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kbuild test robot <lkp@intel.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-05kmemleak: powerpc: skip scanning holes in the .bss sectionCatalin Marinas
Commit 2d4f567103ff ("KVM: PPC: Introduce kvm_tmp framework") adds kvm_tmp[] into the .bss section and then free the rest of unused spaces back to the page allocator. kernel_init kvm_guest_init kvm_free_tmp free_reserved_area free_unref_page free_unref_page_prepare With DEBUG_PAGEALLOC=y, it will unmap those pages from kernel. As the result, kmemleak scan will trigger a panic when it scans the .bss section with unmapped pages. This patch creates dedicated kmemleak objects for the .data, .bss and potentially .data..ro_after_init sections to allow partial freeing via the kmemleak_free_part() in the powerpc kvm_free_tmp() function. Link: http://lkml.kernel.org/r/20190321171917.62049-1-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Qian Cai <cai@lca.pw> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Tested-by: Qian Cai <cai@lca.pw> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Avi Kivity <avi@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-05Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "x86 fixes for overflows and other nastiness" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: nVMX: fix x2APIC VTPR read intercept KVM: x86: nVMX: close leak of L0's x2APIC MSRs (CVE-2019-3887) KVM: SVM: prevent DBG_DECRYPT and DBG_ENCRYPT overflow kvm: svm: fix potential get_num_contig_pages overflow