path: root/arch/arm64/kernel/perf_event.c
2023-03-27  arm64: perf: Move PMUv3 driver to drivers/perf  (Marc Zyngier)
Having the ARM PMUv3 driver sitting in arch/arm64/kernel is getting in the way of being able to use perf on ARMv8 cores running a 32bit kernel, such as 32bit KVM guests. This patch moves it into drivers/perf/arm_pmuv3.c, with an include file in include/linux/perf/arm_pmuv3.h. The only thing left in arch/arm64 is some mundane perf stuff. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Zaid Al-Bassam <zalbassam@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230317195027.3746949-2-zalbassam@google.com Signed-off-by: Will Deacon <will@kernel.org>
2023-02-16  arm64: perf: reject CHAIN events at creation time  (Mark Rutland)
Currently it's possible for a user to open CHAIN events arbitrarily, which we previously tried to rule out in commit:

  ca2b497253ad01c8 ("arm64: perf: Reject stand-alone CHAIN events for PMUv3")

That commit allowed the events to be opened, but prevented them from being scheduled by using an arm_pmu::filter_match hook to reject the relevant events. The CHAIN event filtering in the arm_pmu::filter_match hook was silently removed in commit:

  bd27568117664b8b ("perf: Rewrite core context handling")

As a result, it's now possible for users to open CHAIN events, and for these to be installed arbitrarily.

Fix this by rejecting CHAIN events at creation time. This avoids the creation of events which will never count, and doesn't require using the dynamic filtering.

Attempting to open a CHAIN event (0x1e) will now be rejected:

| # ./perf stat -e armv8_pmuv3/config=0x1e/ ls
| perf
|
| Performance counter stats for 'ls':
|
|   <not supported>      armv8_pmuv3/config=0x1e/
|
|       0.002197470 seconds time elapsed
|
|       0.000000000 seconds user
|       0.002294000 seconds sys

Other events (e.g. CPU_CYCLES / 0x11) will open as usual:

| # ./perf stat -e armv8_pmuv3/config=0x11/ ls
| perf
|
| Performance counter stats for 'ls':
|
|           2538761      armv8_pmuv3/config=0x11/
|
|       0.002227330 seconds time elapsed
|
|       0.002369000 seconds user
|       0.000000000 seconds sys

Fixes: bd2756811766 ("perf: Rewrite core context handling")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230216141240.3833272-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
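For illustration, a minimal sketch of a creation-time rejection in the event-mapping path; the helper name and surrounding context are assumptions, only the CHAIN event number (0x1e) comes from the message above:

  /* Sketch: refuse to create stand-alone CHAIN events at perf_event_open()
   * time. CHAIN (0x1e) is only meaningful as the upper half of a 64-bit
   * counter pair that the driver manages internally. */
  static int armv8pmu_map_event_sketch(struct perf_event *event)
  {
          unsigned int hw_event = event->attr.config & ARMV8_PMU_EVTYPE_EVENT;

          if (hw_event == ARMV8_PMUV3_PERFCTR_CHAIN)
                  return -EINVAL;         /* rejected at creation time */

          return hw_event;
  }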
2023-02-16  arm_pmu: fix event CPU filtering  (Mark Rutland)
Janne reports that perf has been broken on Apple M1 as of commit: bd27568117664b8b ("perf: Rewrite core context handling") That commit replaced the pmu::filter_match() callback with pmu::filter(), whose return value has the opposite polarity, with true implying events should be ignored rather than scheduled. While an attempt was made to update the logic in armv8pmu_filter() and armpmu_filter() accordingly, the return value remains inverted in a couple of cases: * If the arm_pmu does not have an arm_pmu::filter() callback, armpmu_filter() will always return whether the CPU is supported rather than whether the CPU is not supported. As a result, the perf core will not schedule events on supported CPUs, resulting in a loss of events. Additionally, the perf core will attempt to schedule events on unsupported CPUs, but this will be rejected by armpmu_add(), which may result in a loss of events from other PMUs on those unsupported CPUs. * If the arm_pmu does have an arm_pmu::filter() callback, and armpmu_filter() is called on a CPU which is not supported by the arm_pmu, armpmu_filter() will return false rather than true. As a result, the perf core will attempt to schedule events on unsupported CPUs, but this will be rejected by armpmu_add(), which may result in a loss of events from other PMUs on those unsupported CPUs. This means a loss of events can be seen with any arm_pmu driver, but with the ARMv8 PMUv3 driver (which is the only arm_pmu driver with an arm_pmu::filter() callback) the event loss will be more limited and may go unnoticed, which is how this issue evaded testing so far. Fix the CPU filtering by performing this consistently in armpmu_filter(), and remove the redundant arm_pmu::filter() callback and armv8pmu_filter() implementation. Commit bd2756811766 also silently removed the CHAIN event filtering from armv8pmu_filter(), which will be addressed by a separate patch without using the filter callback. Fixes: bd2756811766 ("perf: Rewrite core context handling") Reported-by: Janne Grunau <j@jannau.net> Link: https://lore.kernel.org/asahi/20230215-arm_pmu_m1_regression-v1-1-f5a266577c8d@jannau.net/ Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Asahi Lina <lina@asahilina.net> Cc: Eric Curtin <ecurtin@redhat.com> Tested-by: Janne Grunau <j@jannau.net> Link: https://lore.kernel.org/r/20230216141240.3833272-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
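A sketch of the corrected filter polarity, assuming the arm_pmu layout in drivers/perf/arm_pmu.c; under the new pmu::filter() contract, returning true means "ignore this event on this CPU":

  static bool armpmu_filter(struct pmu *pmu, int cpu)
  {
          struct arm_pmu *armpmu = to_arm_pmu(pmu);

          /* Filter out (return true for) CPUs this PMU does not support. */
          return !cpumask_test_cpu(cpu, &armpmu->supported_cpus);
  }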
2022-12-12  Merge tag 'perf-core-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds)
Pull perf events updates from Ingo Molnar:

 - Thoroughly rewrite the data structures that implement perf task
   context handling, with the goal of fixing various quirks and
   unfeatures both in already merged, and in upcoming proposed code.

   The old data structure is the per task and per cpu
   perf_event_contexts:

     task_struct::perf_events_ctxp[] <-> perf_event_context <-> perf_cpu_context
          ^                                 |   ^     |   ^
          `---------------------------------'   |     `--> pmu ---'
                                                 v           ^
                                            perf_event ------'

   In this new design this is replaced with a single task context and a
   single CPU context, plus intermediate data-structures:

     task_struct::perf_event_ctxp -> perf_event_context <- perf_cpu_context
          ^                           |   ^ ^
          `---------------------------'   | |
                                          | |    perf_cpu_pmu_context <--.
                                          | `----.    ^                  |
                                          |      |    |                  |
                                          |      v    v                  |
                                          | ,--> perf_event_pmu_context  |
                                          | |                            |
                                          | |                            |
                                          v v                            |
                                     perf_event ---> pmu ----------------'

   [ See commit bd2756811766 for more details. ]

   This rewrite was developed by Peter Zijlstra and Ravi Bangoria.

 - Optimize perf_tp_event()

 - Update the Intel uncore PMU driver, extending it with UPI topology
   discovery on various hardware models.

 - Misc fixes & cleanups

* tag 'perf-core-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  perf/x86/intel/uncore: Fix reference count leak in __uncore_imc_init_box()
  perf/x86/intel/uncore: Fix reference count leak in snr_uncore_mmio_map()
  perf/x86/intel/uncore: Fix reference count leak in hswep_has_limit_sbox()
  perf/x86/intel/uncore: Fix reference count leak in sad_cfg_iio_topology()
  perf/x86/intel/uncore: Make set_mapping() procedure void
  perf/x86/intel/uncore: Update sysfs-devices-mapping file
  perf/x86/intel/uncore: Enable UPI topology discovery for Sapphire Rapids
  perf/x86/intel/uncore: Enable UPI topology discovery for Icelake Server
  perf/x86/intel/uncore: Get UPI NodeID and GroupID
  perf/x86/intel/uncore: Enable UPI topology discovery for Skylake Server
  perf/x86/intel/uncore: Generalize get_topology() for SKX PMUs
  perf/x86/intel/uncore: Disable I/O stacks to PMU mapping on ICX-D
  perf/x86/intel/uncore: Clear attr_update properly
  perf/x86/intel/uncore: Introduce UPI topology type
  perf/x86/intel/uncore: Generalize IIO topology support
  perf/core: Don't allow grouping events from different hw pmus
  perf/amd/ibs: Make IBS a core pmu
  perf: Fix function pointer case
  perf/x86/amd: Remove the repeated declaration
  perf: Fix possible memleak in pmu_dev_alloc()
  ...
2022-11-29  arm64/perf: Replace PMU version number '0' with ID_AA64DFR0_EL1_PMUVer_NI  (Anshuman Khandual)
__armv8pmu_probe_pmu() returns early if the detected PMU is either not implemented or implementation defined. The ID_AA64DFR0_EL1_PMUVer value extracted when the PMU is not implemented is '0', which can be replaced with ID_AA64DFR0_EL1_PMUVer_NI, defined as '0b0000'. Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-perf-users@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20221128025449.39085-1-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
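A sketch of the probe-time check after this change; 'dfr0' is assumed to hold the ID_AA64DFR0_EL1 value read earlier, and the surrounding function body is not reproduced:

  /* Bail out if the PMU is not implemented (0b0000) or IMP DEF (0b1111). */
  pmuver = cpuid_feature_extract_unsigned_field(dfr0,
                                                ID_AA64DFR0_EL1_PMUVer_SHIFT);
  if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF ||
      pmuver == ID_AA64DFR0_EL1_PMUVer_NI)
          return;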
2022-10-27  perf: Rewrite core context handling  (Peter Zijlstra)
There have been various issues and limitations with the way perf uses (task) contexts to track events. Most notable is the single hardware PMU task context, which has resulted in a number of yucky things (both proposed and merged).

Notably:
 - HW breakpoint PMU
 - ARM big.little PMU / Intel ADL PMU
 - Intel Branch Monitoring PMU
 - AMD IBS PMU
 - S390 cpum_cf PMU
 - PowerPC trace_imc PMU

*Current design:*

Currently we have a per task and per cpu perf_event_contexts:

  task_struct::perf_events_ctxp[] <-> perf_event_context <-> perf_cpu_context
       ^                                 |   ^     |   ^
       `---------------------------------'   |     `--> pmu ---'
                                              v           ^
                                         perf_event ------'

Each task has an array of pointers to a perf_event_context. Each perf_event_context has a direct relation to a PMU and a group of events for that PMU. The task related perf_event_context's have a pointer back to that task.

Each PMU has a per-cpu pointer to a per-cpu perf_cpu_context, which includes a perf_event_context, which again has a direct relation to that PMU, and a group of events for that PMU. The perf_cpu_context also tracks which task context is currently associated with that CPU and includes a few other things like the hrtimer for rotation etc.

Each perf_event is then associated with its PMU and one perf_event_context.

*Proposed design:*

The new design proposed by this patch reduces this to a single task context and a single CPU context, but adds some intermediate data-structures:

  task_struct::perf_event_ctxp -> perf_event_context <- perf_cpu_context
       ^                           |   ^ ^
       `---------------------------'   | |
                                        | |    perf_cpu_pmu_context <--.
                                        | `----.    ^                  |
                                        |      |    |                  |
                                        |      v    v                  |
                                        | ,--> perf_event_pmu_context  |
                                        | |                            |
                                        | |                            |
                                        v v                            |
                                   perf_event ---> pmu ----------------'

With the new design, perf_event_context will hold all events for all pmus in the (respective pinned/flexible) rbtrees. This can be achieved by adding pmu to the rbtree key:

  {cpu, pmu, cgroup, group_index}

Each perf_event_context carries a list of perf_event_pmu_context which is used to hold per-pmu-per-context state. For example, it keeps track of currently active events for that pmu, a pmu specific task_ctx_data, a flag to tell whether rotation is required or not etc.

Additionally, perf_cpu_pmu_context is used to hold per-pmu-per-cpu state like hrtimer details to drive the event rotation, a pointer to the perf_event_pmu_context of the currently running task and some other ancillary information.

Each perf_event is associated with its pmu, perf_event_context and perf_event_pmu_context.

Further optimizations to the current implementation are possible. For example, ctx_resched() can be optimized to reschedule only single pmu events.

Much thanks to Ravi for picking this up and pushing it towards completion.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221008062424.313-1-ravi.bangoria@amd.com
2022-09-16  arm64/sysreg: Use feature numbering for PMU and SPE revisions  (Mark Brown)
Currently the kernel refers to the versions of the PMU and SPE features by the version of the architecture where those features were updated, but the ARM refers to them using the FEAT_ names for the features. To improve consistency, to help with updating for newer features, and since v9 will make our current naming scheme a bit more confusing, update the macros identifying features to use the FEAT_ based scheme. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220910163354.860255-4-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16  arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names  (Mark Brown)
Normally we include the full register name in the defines for fields within registers but this has not been followed for ID registers. In preparation for automatic generation of defines add the _EL1s into the defines for ID_AA64DFR0_EL1 to follow the convention. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220910163354.860255-3-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16  arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture  (Mark Brown)
The naming scheme the architecture uses for the fields in ID_AA64DFR0_EL1 does not align well with kernel conventions, using as it does a lot of MixedCase in various arrangements. In preparation for automatically generating the defines for this register rename the defines used to match what is in the architecture. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220910163354.860255-2-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-03-08  arm64: perf: Expose some Armv9 common events under sysfs  (Shaokun Zhang)
Armv9[1] has introduced some common architectural events (0x400C-0x400F) and common microarchitectural events (0x4010-0x401B), which can be detected via PMCEID0_EL0 bits 44 to 59, so expose these common events under sysfs. [1] https://developer.arm.com/documentation/ddi0608/ba Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20220303085419.64085-1-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
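As a worked example of the bit mapping: common events 0x4000-0x401F are advertised in bits [63:32] of PMCEID0_EL0, so 0x400C..0x401B land in bits 44..59 as stated above. The helper below is purely illustrative:

  static bool armv9_common_event_supported(u64 pmceid0, unsigned int event)
  {
          if (event < 0x4000 || event > 0x401f)
                  return false;

          return pmceid0 & BIT_ULL(32 + (event - 0x4000));
  }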
2022-01-04  arm64: perf: Don't register user access sysctl handler multiple times  (Will Deacon)
Commit e2012600810c ("arm64: perf: Add userspace counter access disable switch") introduced a new 'perf_user_access' sysctl file to enable and disable direct userspace access to the PMU counters. Sadly, Geert reports that on his big.LITTLE SoC ('Renesas Salvator-XS w/ R-Car H3'), the file is created for each PMU type probed, resulting in a splat during boot:

| hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available
| sysctl duplicate entry: /kernel//perf_user_access
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.16.0-rc3-arm64-renesas-00003-ge2012600810c #1420
| Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT)
| Call trace:
|  dump_backtrace+0x0/0x190
|  show_stack+0x14/0x20
|  dump_stack_lvl+0x88/0xb0
|  dump_stack+0x14/0x2c
|  __register_sysctl_table+0x384/0x818
|  register_sysctl+0x20/0x28
|  armv8_pmu_init.constprop.0+0x118/0x150
|  armv8_a57_pmu_init+0x1c/0x28
|  arm_pmu_device_probe+0x1b4/0x558
|  armv8_pmu_device_probe+0x18/0x20
|  platform_probe+0x64/0xd0
| hw perfevents: enabled with armv8_cortex_a57 PMU driver, 7 counters available

Introduce a state variable to track creation of the sysctl file and ensure that it is only created once.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: e2012600810c ("arm64: perf: Add userspace counter access disable switch")
Link: https://lore.kernel.org/r/CAMuHMdVcDxR9sGzc5pcnORiotonERBgc6dsXZXMd6wTvLGA9iw@mail.gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
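A minimal sketch of a "register once" guard; the flag name is an assumption, and the in-tree fix may use a different primitive to make the check race-free:

  static bool armv8pmu_sysctl_registered;

  static void armv8pmu_register_sysctl_table(void)
  {
          if (armv8pmu_sysctl_registered)
                  return;

          register_sysctl("kernel", armv8_pmu_sysctl_table);
          armv8pmu_sysctl_registered = true;
  }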
2021-12-14  Merge branch 'for-next/perf-cpu' into for-next/perf  (Will Deacon)
* for-next/perf-cpu:
  arm64: perf: Support new DT compatibles
  arm64: perf: Simplify registration boilerplate
  arm64: perf: Support Denver and Carmel PMUs
2021-12-14  arm64: perf: Support new DT compatibles  (Robin Murphy)
Wire up the new DT compatibles so we can present appropriate PMU names to userspace for the latest and greatest CPUs. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/62d14ba12d847ec7f1fba7cb0b3b881b437e1cc5.1639490264.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14  arm64: perf: Simplify registration boilerplate  (Robin Murphy)
With the trend for per-core events moving to userspace JSON, registering names for PMUv3 implementations is increasingly a pure boilerplate exercise. Let's wrap things a step further so we can generate the basic PMUv3 init function with a macro invocation, and reduce further new additions to just 2 lines each. Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/b79477ea3b97f685d00511d4ecd2f686184dca34.1639490264.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
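A sketch of what such a macro can look like; the exact callee (armv8_pmu_init_nogroups()) and event-map argument are assumptions based on the driver's existing init helpers:

  #define PMUV3_INIT_SIMPLE(name)                                       \
  static int name##_pmu_init(struct arm_pmu *cpu_pmu)                   \
  {                                                                     \
          return armv8_pmu_init_nogroups(cpu_pmu, #name,                \
                                         armv8_pmuv3_map_event);        \
  }

  PMUV3_INIT_SIMPLE(armv8_cortex_a78)

The second of the two lines per new CPU would then be the matching of_device_id entry pointing at the generated init function.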
2021-12-14  arm64: perf: Support Denver and Carmel PMUs  (Thierry Reding)
Add support for the NVIDIA Denver and Carmel PMUs using the generic PMUv3 event map for now. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thierry Reding <treding@nvidia.com> [ rm: reorder entries alphabetically ] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/5f0f69d47acca78a9e479501aa4d8b429e23cf11.1639490264.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14  arm64: perf: Enable PMU counter userspace access for perf event  (Rob Herring)
Arm PMUs can support direct userspace access of counters which allows for low overhead (i.e. no syscall) self-monitoring of tasks. The same feature exists on x86 called 'rdpmc'. Unlike x86, userspace access will only be enabled for thread bound events. This could be extended if needed, but simplifies the implementation and reduces the chances for any information leaks (which the x86 implementation suffers from).

PMU EL0 access will be enabled when an event with userspace access is part of the thread's context. This includes when the event is not scheduled on the PMU. There's some additional overhead clearing dirty counters when access is enabled in order to prevent leaking disabled counter data from other tasks.

Unlike x86, enabling of userspace access must be requested with a new attr bit: config1:1. If the user requests userspace access with 64-bit counters, then the event open will fail if the h/w doesn't support 64-bit counters. Chaining is not supported with userspace access. The modes for config1 are as follows:

  config1 = 0 : user access disabled and always 32-bit
  config1 = 1 : user access disabled and always 64-bit (using chaining if needed)
  config1 = 2 : user access enabled and always 32-bit
  config1 = 3 : user access enabled and always 64-bit

Based on work by Raphael Gault <raphael.gault@arm.com>, but has been completely re-written.

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211208201124.310740-5-robh@kernel.org
[will: Made armv8pmu_proc_user_access_handler() static]
Signed-off-by: Will Deacon <will@kernel.org>
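A self-contained userspace sketch of requesting user access via config1 (mode 2: user access enabled, 32-bit) and checking whether the kernel granted it; the event choice and the decision to map only the metadata page are illustrative:

  #include <linux/perf_event.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  int main(void)
  {
          struct perf_event_attr attr = {
                  .type           = PERF_TYPE_HARDWARE,
                  .size           = sizeof(attr),
                  .config         = PERF_COUNT_HW_CPU_CYCLES,
                  .config1        = 0x2,  /* bit 1: user access, bit 0 clear: 32-bit */
                  .exclude_kernel = 1,
          };
          struct perf_event_mmap_page *pc;
          int fd;

          /* Thread-bound event: pid == 0, cpu == -1. */
          fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
          if (fd < 0)
                  return 1;

          /* Map the metadata page to read the access capability and index. */
          pc = mmap(NULL, sysconf(_SC_PAGESIZE), PROT_READ, MAP_SHARED, fd, 0);
          if (pc != MAP_FAILED)
                  printf("cap_user_rdpmc=%u index=%u\n",
                         pc->cap_user_rdpmc, pc->index);

          close(fd);
          return 0;
  }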
2021-12-14  arm64: perf: Add userspace counter access disable switch  (Rob Herring)
Like x86, some users may want to disable userspace PMU counter access altogether. Add a sysctl 'perf_user_access' file to control userspace counter access. The default is '0', which means access is disabled. Writing '1' enables access. Note that x86 supports globally enabling user access by writing '2' to /sys/bus/event_source/devices/cpu/rdpmc. As there's no existing userspace support to worry about, this shouldn't be necessary for Arm. It could be added later if the need arises. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-perf-users@vger.kernel.org Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20211208201124.310740-4-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-08-11  arm64/perf: Replace '0xf' instances with ID_AA64DFR0_PMUVER_IMP_DEF  (Anshuman Khandual)
ID_AA64DFR0_PMUVER_IMP_DEF, indicating an "implementation defined" PMU, never actually gets used although there are '0xf' instances scattered all around. Use the symbolic name instead of the raw hex constant. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/1628652427-24695-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-11  arm64: perf: Simplify EVENT ATTR macro in perf_event.c  (Qi Liu)
Use common macro PMU_EVENT_ATTR_ID to simplify ARMV8_EVENT_ATTR Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Qi Liu <liuqi115@huawei.com> Link: https://lore.kernel.org/r/1623220863-58233-8-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-03  arm64: perf: Add more support on caps under sysfs  (Shaokun Zhang)
Armv8.7 has introduced BUS_SLOTS and BUS_WIDTH fields in the PMMIR_EL1 register; add two entries in caps for bus_slots and bus_width under sysfs. They will return the true slots and width if the information is available, otherwise they will return 0. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/1622704502-63951-1-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-01  arm64: perf: Convert snprintf to sysfs_emit  (Tian Tao)
Use sysfs_emit() instead of snprintf() to avoid buffer overruns: sysfs_emit() strictly checks whether buf is NULL or not page-size aligned, and returns an error in that case. Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Link: https://lore.kernel.org/r/1621497585-30887-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-04-01  arm64: perf: Remove redundant initialization in perf_event.c  (Qi Liu)
The initialization of 'value' in armv8pmu_read_hw_counter() and armv8pmu_read_counter() is redundant, as the values are overwritten soon after. So we can remove them. Signed-off-by: Qi Liu <liuqi115@huawei.com> Link: https://lore.kernel.org/r/1617275801-1980-1-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-10  arm64: perf: Fix 64-bit event counter read truncation  (Rob Herring)
Commit 0fdf1bb75953 ("arm64: perf: Avoid PMXEV* indirection") changed armv8pmu_read_evcntr() to return a u32 instead of u64. The result is silent truncation of the event counter when using 64-bit counters. Given the offending commit appears to have passed thru several folks, it seems likely this was a bad rebase after v8.5 PMU 64-bit counters landed. Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: <stable@vger.kernel.org> Fixes: 0fdf1bb75953 ("arm64: perf: Avoid PMXEV* indirection") Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20210310004412.1450128-1-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
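The shape of the fix, as a sketch; the counter-index conversion macro follows the driver's existing naming and is an assumption here:

  /* Return the full 64-bit value; the previous u32 return type silently
   * truncated 64-bit (chained or ARMv8.5) counters. */
  static inline u64 armv8pmu_read_evcntr(int idx)
  {
          u32 counter = ARMV8_IDX_TO_COUNTER(idx);

          return read_pmevcntrn(counter);
  }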
2021-02-12  Merge branch 'for-next/perf' into for-next/core  (Will Deacon)
Perf and PMU updates including support for Cortex-A78 and the v8.3 SPE extensions.

* for-next/perf:
  drivers/perf: Replace spin_lock_irqsave to spin_lock
  dt-bindings: arm: add Cortex-A78 binding
  arm64: perf: add support for Cortex-A78
  arm64: perf: Constify static attribute_group structs
  drivers/perf: Prevent forced unbinding of ARM_DMC620_PMU drivers
  perf/arm-cmn: Move IRQs when migrating context
  perf/arm-cmn: Fix PMU instance naming
  perf: Constify static struct attribute_group
  perf: hisi: Constify static struct attribute_group
  perf/imx_ddr: Constify static struct attribute_group
  perf: qcom: Constify static struct attribute_group
  drivers/perf: Add support for ARMv8.3-SPE
2021-02-04  arm64: improve whitespace  (Zhiyuan Dai)
In a few places we don't have whitespace between macro parameters, which makes them hard to read. This patch adds whitespace to clearly separate the parameters. In a few places we have unnecessary whitespace around unary operators, which is confusing. This patch removes the unnecessary whitespace. Signed-off-by: Zhiyuan Dai <daizhiyuan@phytium.com.cn> Link: https://lore.kernel.org/r/1612403029-5011-1-git-send-email-daizhiyuan@phytium.com.cn Signed-off-by: Will Deacon <will@kernel.org>
2021-02-03  arm64: perf: add support for Cortex-A78  (Seiya Wang)
Add support for Cortex-A78 using generic PMUv3 for now. Signed-off-by: Seiya Wang <seiya.wang@mediatek.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210203055348.4935-2-seiya.wang@mediatek.com Signed-off-by: Will Deacon <will@kernel.org>
2021-02-02  arm64: perf: Constify static attribute_group structs  (Rikard Falkeborn)
The only usage of these is to put their addresses in an array of pointers to const attribute_group structs. Make them const to allow the compiler to put them in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Signed-off-by: Will Deacon <will@kernel.org>
2021-01-13  Revert "arm64: Enable perf events based hard lockup detector"  (Will Deacon)
This reverts commit 367c820ef08082e68df8a3bc12e62393af21e4b5.

lockup_detector_init() makes heavy use of per-cpu variables and must be called with preemption disabled. Usually, it's handled early during boot in kernel_init_freeable(), before SMP has been initialised. Since we do not know whether or not our PMU interrupt can be signalled as an NMI until considerably later in the boot process, the Arm PMU driver attempts to re-initialise the lockup detector off the back of a device_initcall(). Unfortunately, this is called from preemptible context and results in the following splat:

| BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
| caller is debug_smp_processor_id+0x20/0x2c
| CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.10.0+ #276
| Hardware name: linux,dummy-virt (DT)
| Call trace:
|  dump_backtrace+0x0/0x3c0
|  show_stack+0x20/0x6c
|  dump_stack+0x2f0/0x42c
|  check_preemption_disabled+0x1cc/0x1dc
|  debug_smp_processor_id+0x20/0x2c
|  hardlockup_detector_event_create+0x34/0x18c
|  hardlockup_detector_perf_init+0x2c/0x134
|  watchdog_nmi_probe+0x18/0x24
|  lockup_detector_init+0x44/0xa8
|  armv8_pmu_driver_init+0x54/0x78
|  do_one_initcall+0x184/0x43c
|  kernel_init_freeable+0x368/0x380
|  kernel_init+0x1c/0x1cc
|  ret_from_fork+0x10/0x30

Rather than bodge this with raw_smp_processor_id() or randomly disabling preemption, simply revert the culprit for now until we figure out how to do this properly.

Reported-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20201221162249.3119-1-lecopzer.chen@mediatek.com
Link: https://lore.kernel.org/r/20210112221855.10666-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-25  arm64: Enable perf events based hard lockup detector  (Sumit Garg)
With the recent feature added to enable perf events to use pseudo NMIs as interrupts on platforms which support GICv3 or later, it's now possible to enable the hard lockup detector (or NMI watchdog) on arm64 platforms. So enable the corresponding support. One thing to note here is that normally the lockup detector is initialized just after the early initcalls, but the PMU on arm64 comes up much later as a device_initcall(). So we need to re-initialize lockup detection once the PMU has been initialized. Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Acked-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/1602060704-10921-1-git-send-email-sumit.garg@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28  arm64: perf: Defer irq_work to IPI_IRQ_WORK  (Julien Thierry)
When handling events, armv8pmu_handle_irq() calls perf_event_overflow(), and subsequently calls irq_work_run() to handle any work queued by perf_event_overflow(). As perf_event_overflow() raises IPI_IRQ_WORK when queuing the work, this isn't strictly necessary and the work could be handled as part of the IPI_IRQ_WORK handler. In the common case the IPI handler will run immediately after the PMU IRQ handler, and where the PE is heavily loaded with interrupts other handlers may run first, widening the window where some counters are disabled. In practice this window is unlikely to be a significant issue, and removing the call to irq_work_run() would make the PMU IRQ handler NMI safe in addition to making it simpler, so let's do that. [Alexandru E.: Reworded commit message] Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200924110706.254996-5-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28  arm64: perf: Remove PMU locking  (Julien Thierry)
The PMU is disabled and enabled, and the counters are programmed from contexts where interrupts or preemption is disabled. The functions to toggle the PMU and to program the PMU counters access the registers directly and don't access data modified by the interrupt handler. That, and the fact that they're always called from non-preemptible contexts, means that we don't need to disable interrupts or use a spinlock. [Alexandru E.: Explained why locking is not needed, removed WARN_ONs] Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox) Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200924110706.254996-4-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28  arm64: perf: Avoid PMXEV* indirection  (Mark Rutland)
Currently we access the counter registers and their respective type registers indirectly. This requires us to write to PMSELR, issue an ISB, then access the relevant PMXEV* registers. This is unfortunate, because: * Under virtualization, accessing one register requires two traps to the hypervisor, even though we could access the register directly with a single trap. * We have to issue an ISB which we could otherwise avoid the cost of. * When we use NMIs, the NMI handler will have to save/restore the select register in case the code it preempted was attempting to access a counter or its type register. We can avoid these issues by directly accessing the relevant registers. This patch adds helpers to do so. In armv8pmu_enable_event() we still need the ISB to prevent the PE from reordering the write to PMINTENSET_EL1 register. If the interrupt is enabled before we disable the counter and the new event is configured, we might get an interrupt triggered by the previously programmed event overflowing, but which we wrongly attribute to the event that we are enabling. Execute an ISB after we disable the counter. In the process, remove the comment that refers to the ARMv7 PMU. [Julien T.: Don't inline read/write functions to avoid big code-size increase, remove unused read_pmevtypern function, fix counter index issue.] [Alexandru E.: Removed comment, removed trailing semicolons in macros, added ISB] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox) Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200924110706.254996-3-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
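A sketch contrasting the two access patterns for event counter n (abridged; in the driver the per-counter cases are generated by a macro):

  /* Indirect (old): select via PMSELR, synchronise, then access PMXEVCNTR.
   * Under virtualisation this is two traps instead of one. */
  static u64 read_evcntr_indirect(u32 counter)
  {
          write_sysreg(counter, pmselr_el0);
          isb();
          return read_sysreg(pmxevcntr_el0);
  }

  /* Direct (new): touch the banked register for that counter, no PMSELR, no ISB. */
  static u64 read_evcntr_direct(u32 counter)
  {
          switch (counter) {
          case 0: return read_sysreg(pmevcntr0_el0);
          case 1: return read_sysreg(pmevcntr1_el0);
          /* ... cases 2-30 generated by a helper macro ... */
          default: return 0;
          }
  }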
2020-09-28  arm64: perf: Add missing ISB in armv8pmu_enable_counter()  (Alexandru Elisei)
Writes to the PMXEVTYPER_EL0 register are not self-synchronising. In armv8pmu_enable_event(), the PE can reorder configuring the event type after we have enabled the counter and the interrupt. This can lead to an interrupt being asserted because of the previous event type that we were counting using the same counter, not the one that we've just configured. The same rationale applies to writes to the PMINTENSET_EL1 register. The PE can reorder enabling the interrupt at any point in the future after we have enabled the event. Prevent both situations from happening by adding an ISB just before we enable the event counter. Fixes: 030896885ade ("arm64: Performance counters support") Reported-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox) Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200924110706.254996-2-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
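A sketch of the resulting ordering; the barrier placement mirrors what the subject line describes, and the surrounding code is abridged:

  static inline void armv8pmu_enable_counter(u32 mask)
  {
          /*
           * Make sure the event type and interrupt-enable configuration is
           * visible before the counter starts counting, so a stale,
           * previously-programmed event cannot raise the interrupt.
           */
          isb();
          write_sysreg(mask, pmcntenset_el0);
  }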
2020-09-28  arm64: perf: Add support caps under sysfs  (Shaokun Zhang)
ARMv8.4-PMU introduces the PMMIR_EL1 register, and some new PMU events, like STALL_SLOT etc, are related to it. Let's add a caps directory to /sys/bus/event_source/devices/armv8_pmuv3_0/ and support 'slots' from the PMMIR_EL1 register in this entry. User programs can get the slots from sysfs directly: /sys/bus/event_source/devices/armv8_pmuv3_0/caps/slots is exposed under sysfs. If both ARMv8.4-PMU and the STALL_SLOT event are implemented, it returns the slots from PMMIR_EL1, otherwise it returns 0. Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/1600754025-53535-1-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
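A sketch of the caps read-out, assuming the PMMIR_EL1 value is cached in the arm_pmu at probe time and the SLOTS field mask follows the driver's naming; both names are assumptions of this sketch:

  static ssize_t slots_show(struct device *dev, struct device_attribute *attr,
                            char *page)
  {
          struct pmu *pmu = dev_get_drvdata(dev);
          struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu);
          u32 slots = cpu_pmu->reg_pmmir & ARMV8_PMU_SLOTS_MASK;

          /* reg_pmmir is 0 unless ARMv8.4-PMU and STALL_SLOT are implemented. */
          return snprintf(page, PAGE_SIZE, "0x%08x\n", slots);
  }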
2020-09-07  arm64: perf: Remove unnecessary event_idx check  (Qi Liu)
event_idx is obtained from armv8pmu_get_event_idx(), and this idx must be between ARMV8_IDX_CYCLE_COUNTER and cpu_pmu->num_events. So it's unnecessary to do this check. Let's remove it. Signed-off-by: Qi Liu <liuqi115@huawei.com> Link: https://lore.kernel.org/r/1599213458-28394-1-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-07  arm64: perf: Add general hardware LLC events for PMUv3  (Leo Yan)
This patch is to add the general hardware last level cache (LLC) events for PMUv3: one event is for LLC access and another is for LLC miss. With this change, the perf tool can support last level cache profiling; below is an example to demonstrate the usage on Arm64:

  $ perf stat -e LLC-load-misses -e LLC-loads -- \
        perf bench mem memcpy -s 1024MB -l 100 -f default
  [...]

  Performance counter stats for 'perf bench mem memcpy -s 1024MB -l 100 -f default':

       35,534,262      LLC-load-misses      # 2.16% of all LL-cache hits
    1,643,946,443      LLC-loads
  [...]

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20200811053505.21223-1-leo.yan@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
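A sketch of the corresponding cache-map entries; the table fragment is abridged to the two additions this change describes:

  /* In armv8_pmuv3_perf_cache_map[]: wire generic LL-cache read access/miss
   * to the architectural LL_CACHE_RD (0x36) / LL_CACHE_MISS_RD (0x37) events. */
  [C(LL)][C(OP_READ)][C(RESULT_ACCESS)] = ARMV8_PMUV3_PERFCTR_LL_CACHE_RD,
  [C(LL)][C(OP_READ)][C(RESULT_MISS)]   = ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS_RD,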
2020-07-21  arm64: perf: Expose some new events via sysfs  (Shaokun Zhang)
Some new PMU events can be detected by PMCEID1_EL0, but they can't be listed. Let's expose these through sysfs. Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/1595328573-12751-2-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-20  arm64: perf: Add cap_user_time_short  (Peter Zijlstra)
This completes the ARM64 cap_user_time support. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Link: https://lore.kernel.org/r/20200716051130.4359-7-leo.yan@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-20  arm64: perf: Only advertise cap_user_time for arch_timer  (Peter Zijlstra)
When sched_clock is running on anything other than arch_timer, don't advertise cap_user_time*. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Link: https://lore.kernel.org/r/20200716051130.4359-5-leo.yan@linaro.org Requested-by: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-20  arm64: perf: Implement correct cap_user_time  (Peter Zijlstra)
As reported by Leo, the existing implementation is broken when the clock and counter don't intersect at 0. Use the sched_clock's struct clock_read_data information to correctly implement cap_user_time and cap_user_time_zero. Note that the ARM64 counter is architecturally only guaranteed to be 56 bits wide (implementations are allowed to be wider) and the existing perf ABI cannot deal with wrap-around. This implementation should also be faster than the old one, seeing how we don't need to recompute mult and shift all the time. [leoyan: Use mul_u64_u32_shr() to convert cyc to ns to avoid overflow] Reported-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Link: https://lore.kernel.org/r/20200716051130.4359-4-leo.yan@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-20  arm64: perf: Correct the event index in sysfs  (Shaokun Zhang)
When the PMU event ID is equal to or greater than 0x4000, it is reduced by 0x4000, so the value shown in sysfs is not the raw event number. Let's correct it and expose the raw event ID.

Before this patch:

  cat /sys/bus/event_source/devices/armv8_pmuv3_0/events/sample_feed
  event=0x001

After this patch:

  cat /sys/bus/event_source/devices/armv8_pmuv3_0/events/sample_feed
  event=0x4001

Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1592487344-30555-3-git-send-email-zhangshaokun@hisilicon.com
[will: fixed formatting of 'if' condition]
Signed-off-by: Will Deacon <will@kernel.org>
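A sketch of the show routine once the attribute keeps the raw number; the field names follow the generic perf_pmu_events_attr helper, and the exact hunk is not reproduced here:

  static ssize_t armv8pmu_events_sysfs_show(struct device *dev,
                                            struct device_attribute *attr,
                                            char *page)
  {
          struct perf_pmu_events_attr *pmu_attr;

          pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr);

          /* Print the raw event number, including 0x4000-based common events. */
          return sprintf(page, "event=0x%04llx\n", pmu_attr->id);
  }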
2020-03-17  arm64: perf: Add support for ARMv8.5-PMU 64-bit counters  (Andrew Murray)
At present ARMv8 event counters are limited to 32-bits, though by using the CHAIN event it's possible to combine adjacent counters to achieve 64-bits. The perf config1:0 bit can be set to use such a configuration. With the introduction of ARMv8.5-PMU support, all event counters can now be used as 64-bit counters. Let's enable 64-bit event counters where support exists. Unless the user sets config1:0 we will adjust the counter value such that it overflows upon 32-bit overflow. This follows the same behaviour as the cycle counter which has always been (and remains) 64-bits. Signed-off-by: Andrew Murray <andrew.murray@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> [Mark: fix ID field names, compare with 8.5 value] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
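A simplified sketch of the user-facing knob and the 32-bit overflow bias described above; the real driver additionally checks whether the counter in use is actually 64 bits wide, which this sketch omits:

  /* config1 bit 0 requests a 64-bit counter. */
  static inline bool armv8pmu_event_is_64bit(struct perf_event *event)
  {
          return event->attr.config1 & 0x1;
  }

  /* When a 64-bit hardware counter backs a 32-bit-semantics event, bias the
   * programmed value so the overflow interrupt still fires at 32 bits. */
  static u64 armv8pmu_bias_long_counter_sketch(struct perf_event *event, u64 value)
  {
          if (!armv8pmu_event_is_64bit(event))
                  value |= GENMASK_ULL(63, 32);

          return value;
  }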
2020-03-17  arm64: perf: Clean up enable/disable calls  (Robin Murphy)
Reading this code bordered on painful, what with all the repetition and pointless return values. More fundamentally, dribbling the hardware enables and disables in one bit at a time incurs needless system register overhead for chained events and on reset. We already use bitmask values for the KVM hooks, so consolidate all the register accesses to match, and make a reasonable saving in both source and object code. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-03-02  arm64: perf: Support new DT compatibles  (Robin Murphy)
Add support for matching the new PMUs. For now, this just wires them up as generic PMUv3 such that people writing DTs for new SoCs can do the right thing, and at least have architectural and raw events be usable. We can come back and fill in event maps for sysfs and/or perf tools at a later date. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-03-02  arm64: perf: Refactor PMU init callbacks  (Robin Murphy)
The PMU init callbacks are already drowning in boilerplate, so before doubling the number of supported PMU models, give it a sensible refactor to significantly reduce the bloat, both in source and object code. Although nobody uses non-default sysfs attributes today, there's minimal impact to preserving the notion that maybe, some day, somebody might, so we may as well keep up appearances. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-11-01  arm64: perf: Simplify the ARMv8 PMUv3 event attributes  (Shaokun Zhang)
For each PMU event, there is an ARMV8_EVENT_ATTR(xx, XX) and an &armv8_event_attr_xx.attr.attr entry. Let's redefine ARMV8_EVENT_ATTR to simplify armv8_pmuv3_event_attrs. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> [will: Dropped unnecessary array syntax] Signed-off-by: Will Deacon <will@kernel.org>
2019-08-20  arm64: perf_event: Add missing header needed for smp_processor_id()  (Raphael Gault)
In perf_event.c we use smp_processor_id(), but we haven't included <linux/smp.h> where it is defined, and rely on this being pulled in via a transitive include. Let's make this more robust by including <linux/smp.h> explicitly. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Raphael Gault <raphael.gault@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-07-23  arm64: perf: Remove unused macro  (Shaokun Zhang)
ARMV8_EVENT_ATTR_RESOLVE became unused after commit <4b1a9e6934ec> ("arm64/perf: Filter common events based on PMCEIDn_EL0"). Remove it. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-06-19  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234  (Thomas Gleixner)
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 503 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-17  Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm  (Linus Torvalds)
Pull KVM updates from Paolo Bonzini:

 "ARM:
   - support for SVE and Pointer Authentication in guests
   - PMU improvements

  POWER:
   - support for direct access to the POWER9 XIVE interrupt controller
   - memory and performance optimizations

  x86:
   - support for accessing memory not backed by struct page
   - fixes and refactoring

  Generic:
   - dirty page tracking improvements"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (155 commits)
  kvm: fix compilation on aarch64
  Revert "KVM: nVMX: Expose RDPMC-exiting only when guest supports PMU"
  kvm: x86: Fix L1TF mitigation for shadow MMU
  KVM: nVMX: Disable intercept for FS/GS base MSRs in vmcs02 when possible
  KVM: PPC: Book3S: Remove useless checks in 'release' method of KVM device
  KVM: PPC: Book3S HV: XIVE: Fix spelling mistake "acessing" -> "accessing"
  KVM: PPC: Book3S HV: Make sure to load LPID for radix VCPUs
  kvm: nVMX: Set nested_run_pending in vmx_set_nested_state after checks complete
  tests: kvm: Add tests for KVM_SET_NESTED_STATE
  KVM: nVMX: KVM_SET_NESTED_STATE - Tear down old EVMCS state before setting new state
  tests: kvm: Add tests for KVM_CAP_MAX_VCPUS and KVM_CAP_MAX_CPU_ID
  tests: kvm: Add tests to .gitignore
  KVM: Introduce KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
  KVM: Fix kvm_clear_dirty_log_protect off-by-(minus-)one
  KVM: Fix the bitmap range to copy during clear dirty
  KVM: arm64: Fix ptrauth ID register masking logic
  KVM: x86: use direct accessors for RIP and RSP
  KVM: VMX: Use accessors for GPRs outside of dedicated caching logic
  KVM: x86: Omit caching logic for always-available GPRs
  kvm, x86: Properly check whether a pfn is an MMIO or not
  ...