summaryrefslogtreecommitdiff
path: root/arch/arm64/kernel
AgeCommit message (Collapse)Author
2020-07-20arm64: perf: Add cap_user_time_shortPeter Zijlstra
This completes the ARM64 cap_user_time support. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Link: https://lore.kernel.org/r/20200716051130.4359-7-leo.yan@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-20arm64: perf: Only advertise cap_user_time for arch_timerPeter Zijlstra
When sched_clock is running on anything other than arch_timer, don't advertise cap_user_time*. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Link: https://lore.kernel.org/r/20200716051130.4359-5-leo.yan@linaro.org Requested-by: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-20arm64: perf: Implement correct cap_user_timePeter Zijlstra
As reported by Leo; the existing implementation is broken when the clock and counter don't intersect at 0. Use the sched_clock's struct clock_read_data information to correctly implement cap_user_time and cap_user_time_zero. Note that the ARM64 counter is architecturally only guaranteed to be 56bit wide (implementations are allowed to be wider) and the existing perf ABI cannot deal with wrap-around. This implementation should also be faster than the old; seeing how we don't need to recompute mult and shift all the time. [leoyan: Use mul_u64_u32_shr() to convert cyc to ns to avoid overflow] Reported-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Link: https://lore.kernel.org/r/20200716051130.4359-4-leo.yan@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-20arm64: perf: Correct the event index in sysfsShaokun Zhang
When PMU event ID is equal or greater than 0x4000, it will be reduced by 0x4000 and it is not the raw number in the sysfs. Let's correct it and obtain the raw event ID. Before this patch: cat /sys/bus/event_source/devices/armv8_pmuv3_0/events/sample_feed event=0x001 After this patch: cat /sys/bus/event_source/devices/armv8_pmuv3_0/events/sample_feed event=0x4001 Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/1592487344-30555-3-git-send-email-zhangshaokun@hisilicon.com [will: fixed formatting of 'if' condition] Signed-off-by: Will Deacon <will@kernel.org>
2020-07-17Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into master Pull arm64 fixes from Will Deacon: "A batch of arm64 fixes. Although the diffstat is a bit larger than we'd usually have at this stage, a decent amount of it is the addition of comments describing our syscall tracing behaviour, and also a sweep across all the modular arm64 PMU drivers to make them rebust against unloading and unbinding. There are a couple of minor things kicking around at the moment (CPU errata and module PLTs for very large modules), but I'm not expecting any significant changes now for us in 5.8. - Fix kernel text addresses for relocatable images booting using EFI and with KASLR disabled so that they match the vmlinux ELF binary. - Fix unloading and unbinding of PMU driver modules. - Fix generic mmiowb() when writeX() is called from preemptible context (reported by the riscv folks). - Fix ptrace hardware single-step interactions with signal handlers, system calls and reverse debugging. - Fix reporting of 64-bit x0 register for 32-bit tasks via 'perf_regs'. - Add comments describing syscall entry/exit tracing ABI" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: drivers/perf: Prevent forced unbinding of PMU drivers asm-generic/mmiowb: Allow mmiowb_set_pending() when preemptible() arm64: Use test_tsk_thread_flag() for checking TIF_SINGLESTEP arm64: ptrace: Use NO_SYSCALL instead of -1 in syscall_trace_enter() arm64: syscall: Expand the comment about ptrace and syscall(-1) arm64: ptrace: Add a comment describing our syscall entry/exit trap ABI arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return arm64: ptrace: Override SPSR.SS when single-stepping is enabled arm64: ptrace: Consistently use pseudo-singlestep exceptions drivers/perf: Fix kernel panic when rmmod PMU modules during perf sampling efi/libstub/arm64: Retain 2MB kernel Image alignment if !KASLR
2020-07-16arm64: Use test_tsk_thread_flag() for checking TIF_SINGLESTEPWill Deacon
Rather than open-code test_tsk_thread_flag() at each callsite, simply replace the couple of offenders with calls to test_tsk_thread_flag() directly. Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16arm64: ptrace: Use NO_SYSCALL instead of -1 in syscall_trace_enter()Will Deacon
Setting a system call number of -1 is special, as it indicates that the current system call should be skipped. Use NO_SYSCALL instead of -1 when checking for this scenario, which is different from the -1 returned due to a seccomp failure. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Keno Fischer <keno@juliacomputing.com> Cc: Luis Machado <luis.machado@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16arm64: syscall: Expand the comment about ptrace and syscall(-1)Will Deacon
If a task executes syscall(-1), we intercept this early and force x0 to be -ENOSYS so that we don't need to distinguish this scenario from one where the scno is -1 because a tracer wants to skip the system call using ptrace. With the return value set, the return path is the same as the skip case. Although there is a one-line comment noting this in el0_svc_common(), it misses out most of the detail. Expand the comment to describe a bit more about what is going on. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Keno Fischer <keno@juliacomputing.com> Cc: Luis Machado <luis.machado@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16arm64: ptrace: Add a comment describing our syscall entry/exit trap ABIWill Deacon
Our tracehook logic for syscall entry/exit raises a SIGTRAP back to the tracer following a ptrace request such as PTRACE_SYSCALL. As part of this procedure, we clobber the reported value of one of the tracee's general purpose registers (x7 for native tasks, r12 for compat) to indicate whether the stop occurred on syscall entry or exit. This is a slightly unfortunate ABI, as it prevents the tracer from accessing the real register value and is at odds with other similar stops such as seccomp traps. Since we're stuck with this ABI, expand the comment in our tracehook logic to acknowledge the issue and describe the behaviour in more detail. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Luis Machado <luis.machado@linaro.org> Reported-by: Keno Fischer <keno@juliacomputing.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16arm64: compat: Ensure upper 32 bits of x0 are zero on syscall returnWill Deacon
Although we zero the upper bits of x0 on entry to the kernel from an AArch32 task, we do not clear them on the exception return path and can therefore expose 64-bit sign extended syscall return values to userspace via interfaces such as the 'perf_regs' ABI, which deal exclusively with 64-bit registers. Explicitly clear the upper 32 bits of x0 on return from a compat system call. Cc: <stable@vger.kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Keno Fischer <keno@juliacomputing.com> Cc: Luis Machado <luis.machado@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16arm64: ptrace: Override SPSR.SS when single-stepping is enabledWill Deacon
Luis reports that, when reverse debugging with GDB, single-step does not function as expected on arm64: | I've noticed, under very specific conditions, that a PTRACE_SINGLESTEP | request by GDB won't execute the underlying instruction. As a consequence, | the PC doesn't move, but we return a SIGTRAP just like we would for a | regular successful PTRACE_SINGLESTEP request. The underlying problem is that when the CPU register state is restored as part of a reverse step, the SPSR.SS bit is cleared and so the hardware single-step state can transition to the "active-pending" state, causing an unexpected step exception to be taken immediately if a step operation is attempted. In hindsight, we probably shouldn't have exposed SPSR.SS in the pstate accessible by the GPR regset, but it's a bit late for that now. Instead, simply prevent userspace from configuring the bit to a value which is inconsistent with the TIF_SINGLESTEP state for the task being traced. Cc: <stable@vger.kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Keno Fischer <keno@juliacomputing.com> Link: https://lore.kernel.org/r/1eed6d69-d53d-9657-1fc9-c089be07f98c@linaro.org Reported-by: Luis Machado <luis.machado@linaro.org> Tested-by: Luis Machado <luis.machado@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16arm64: ptrace: Consistently use pseudo-singlestep exceptionsWill Deacon
Although the arm64 single-step state machine can be fast-forwarded in cases where we wish to generate a SIGTRAP without actually executing an instruction, this has two major limitations outside of simply skipping an instruction due to emulation. 1. Stepping out of a ptrace signal stop into a signal handler where SIGTRAP is blocked. Fast-forwarding the stepping state machine in this case will result in a forced SIGTRAP, with the handler reset to SIG_DFL. 2. The hardware implicitly fast-forwards the state machine when executing an SVC instruction for issuing a system call. This can interact badly with subsequent ptrace stops signalled during the execution of the system call (e.g. SYSCALL_EXIT or seccomp traps), as they may corrupt the stepping state by updating the PSTATE for the tracee. Resolve both of these issues by injecting a pseudo-singlestep exception on entry to a signal handler and also on return to userspace following a system call. Cc: <stable@vger.kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Tested-by: Luis Machado <luis.machado@linaro.org> Reported-by: Keno Fischer <keno@juliacomputing.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-15arm64: tlb: Detect the ARMv8.4 TLBI RANGE featureZhenyu Ye
ARMv8.4-TLBI provides TLBI invalidation instruction that apply to a range of input addresses. This patch detect this feature. Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com> Link: https://lore.kernel.org/r/20200715071945.897-2-yezhenyu2@huawei.com [catalin.marinas@arm.com: some renaming for consistency] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-14arm64: stacktrace: Move export for save_stack_trace_tsk()Mark Brown
Due to refactoring way back in bb53c820c5b0f1 ("arm64: stacktrace: avoid listing stacktrace functions in stacktrace") the EXPORT_SYMBOL_GPL() for save_stack_trace_tsk() is at the end of __save_stack_trace() rather than the function it exports. Move it to the expected location. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200710182402.50473-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-14arm64/acpi: disallow writeable AML opregion mapping for EFI code regionsArd Biesheuvel
Given that the contents of EFI runtime code and data regions are provided by the firmware, as well as the DSDT, it is not unimaginable that AML code exists today that accesses EFI runtime code regions using a SystemMemory OpRegion. There is nothing fundamentally wrong with that, but since we take great care to ensure that executable code is never mapped writeable and executable at the same time, we should not permit AML to create writable mapping. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Link: https://lore.kernel.org/r/20200626155832.2323789-3-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-14arm64/acpi: disallow AML memory opregions to access kernel memoryArd Biesheuvel
AML uses SystemMemory opregions to allow AML handlers to access MMIO registers of, e.g., GPIO controllers, or access reserved regions of memory that are owned by the firmware. Currently, we also allow AML access to memory that is owned by the kernel and mapped via the linear region, which does not seem to be supported by a valid use case, and exposes the kernel's internal state to AML methods that may be buggy and exploitable. On arm64, ACPI support requires booting in EFI mode, and so we can cross reference the requested region against the EFI memory map, rather than just do a minimal check on the first page. So let's only permit regions to be remapped by the ACPI core if - they don't appear in the EFI memory map at all (which is the case for most MMIO), or - they are covered by a single region in the EFI memory map, which is not of a type that describes memory that is given to the kernel at boot. Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Link: https://lore.kernel.org/r/20200626155832.2323789-2-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-10Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "An unfortunately large collection of arm64 fixes for -rc5. Some of this is absolutely trivial, but the alternatives, vDSO and CPU errata workaround fixes are significant. At least people are finding and fixing these things, I suppose. - Fix workaround for CPU erratum #1418040 to disable the compat vDSO - Fix Oops when single-stepping with KGDB - Fix memory attributes for hypervisor device mappings at EL2 - Fix memory leak in PSCI and remove useless variable assignment - Fix up some comments and asm labels in our entry code - Fix broken register table formatting in our generated html docs - Fix missing NULL sentinel in CPU errata workaround list - Fix patching of branches in alternative instruction sections" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/alternatives: don't patch up internal branches arm64: Add missing sentinel to erratum_1463225 arm64: Documentation: Fix broken table in generated HTML arm64: kgdb: Fix single-step exception handling oops arm64: entry: Tidy up block comments and label numbers arm64: Rework ARM_ERRATUM_1414080 handling arm64: arch_timer: Disable the compat vdso for cores affected by ARM64_WORKAROUND_1418040 arm64: arch_timer: Allow an workaround descriptor to disable compat vdso arm64: Introduce a way to disable the 32bit vdso arm64: entry: Fix the typo in the comment of el1_dbg() drivers/firmware/psci: Assign @err directly in hotplug_tests() drivers/firmware/psci: Fix memory leakage in alloc_init_cpu_groups() KVM: arm64: Fix definition of PAGE_HYP_DEVICE
2020-07-09arm64/alternatives: don't patch up internal branchesArd Biesheuvel
Commit f7b93d42945c ("arm64/alternatives: use subsections for replacement sequences") moved the alternatives replacement sequences into subsections, in order to keep the as close as possible to the code that they replace. Unfortunately, this broke the logic in branch_insn_requires_update, which assumed that any branch into kernel executable code was a branch that required updating, which is no longer the case now that the code sequences that are patched in are in the same section as the patch site itself. So the only way to discriminate branches that require updating and ones that don't is to check whether the branch targets the replacement sequence itself, and so we can drop the call to kernel_text_address() entirely. Fixes: f7b93d42945c ("arm64/alternatives: use subsections for replacement sequences") Reported-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20200709125953.30918-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-09arm64: Add missing sentinel to erratum_1463225Florian Fainelli
When the erratum_1463225 array was introduced a sentinel at the end was missing thus causing a KASAN: global-out-of-bounds in is_affected_midr_range_list on arm64 error. Fixes: a9e821b89daa ("arm64: Add KRYO4XX gold CPU cores to erratum list 1463225 and 1418040") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Link: https://lore.kernel.org/linux-arm-kernel/CA+G9fYs3EavpU89-rTQfqQ9GgxAMgMAk7jiiVrfP0yxj5s+Q6g@mail.gmail.com/ Link: https://lore.kernel.org/r/20200709051345.14544-1-f.fainelli@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08arm64: kgdb: Fix single-step exception handling oopsWei Li
After entering kdb due to breakpoint, when we execute 'ss' or 'go' (will delay installing breakpoints, do single-step first), it won't work correctly, and it will enter kdb due to oops. It's because the reason gotten in kdb_stub() is not as expected, and it seems that the ex_vector for single-step should be 0, like what arch powerpc/sh/parisc has implemented. Before the patch: Entering kdb (current=0xffff8000119e2dc0, pid 0) on processor 0 due to Keyboard Entry [0]kdb> bp printk Instruction(i) BP #0 at 0xffff8000101486cc (printk) is enabled addr at ffff8000101486cc, hardtype=0 installed=0 [0]kdb> g / # echo h > /proc/sysrq-trigger Entering kdb (current=0xffff0000fa878040, pid 266) on processor 3 due to Breakpoint @ 0xffff8000101486cc [3]kdb> ss Entering kdb (current=0xffff0000fa878040, pid 266) on processor 3 Oops: (null) due to oops @ 0xffff800010082ab8 CPU: 3 PID: 266 Comm: sh Not tainted 5.7.0-rc4-13839-gf0e5ad491718 #6 Hardware name: linux,dummy-virt (DT) pstate: 00000085 (nzcv daIf -PAN -UAO) pc : el1_irq+0x78/0x180 lr : __handle_sysrq+0x80/0x190 sp : ffff800015003bf0 x29: ffff800015003d20 x28: ffff0000fa878040 x27: 0000000000000000 x26: ffff80001126b1f0 x25: ffff800011b6a0d8 x24: 0000000000000000 x23: 0000000080200005 x22: ffff8000101486cc x21: ffff800015003d30 x20: 0000ffffffffffff x19: ffff8000119f2000 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 x8 : ffff800015003e50 x7 : 0000000000000002 x6 : 00000000380b9990 x5 : ffff8000106e99e8 x4 : ffff0000fadd83c0 x3 : 0000ffffffffffff x2 : ffff800011b6a0d8 x1 : ffff800011b6a000 x0 : ffff80001130c9d8 Call trace: el1_irq+0x78/0x180 printk+0x0/0x84 write_sysrq_trigger+0xb0/0x118 proc_reg_write+0xb4/0xe0 __vfs_write+0x18/0x40 vfs_write+0xb0/0x1b8 ksys_write+0x64/0xf0 __arm64_sys_write+0x14/0x20 el0_svc_common.constprop.2+0xb0/0x168 do_el0_svc+0x20/0x98 el0_sync_handler+0xec/0x1a8 el0_sync+0x140/0x180 [3]kdb> After the patch: Entering kdb (current=0xffff8000119e2dc0, pid 0) on processor 0 due to Keyboard Entry [0]kdb> bp printk Instruction(i) BP #0 at 0xffff8000101486cc (printk) is enabled addr at ffff8000101486cc, hardtype=0 installed=0 [0]kdb> g / # echo h > /proc/sysrq-trigger Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to Breakpoint @ 0xffff8000101486cc [0]kdb> g Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to Breakpoint @ 0xffff8000101486cc [0]kdb> ss Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to SS trap @ 0xffff800010082ab8 [0]kdb> Fixes: 44679a4f142b ("arm64: KGDB: Add step debugging support") Signed-off-by: Wei Li <liwei391@huawei.com> Tested-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20200509214159.19680-2-liwei391@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08arm64: entry: Tidy up block comments and label numbersWill Deacon
Continually butchering our entry code with CPU errata workarounds has led to it looking a little scruffy. Consistently used /* */ comment style for multi-line block comments and ensure that small numeric labels use consecutive integers. No functional change, but the state of things was irritating. Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08arm64: Rework ARM_ERRATUM_1414080 handlingMarc Zyngier
The current handling of erratum 1414080 has the side effect that cntkctl_el1 can get changed for both 32 and 64bit tasks. This isn't a problem so far, but if we ever need to mitigate another of these errata on the 64bit side, we'd better keep the messing with cntkctl_el1 local to 32bit tasks. For that, make sure that on entering the kernel from a 32bit tasks, userspace access to cntvct gets enabled, and disabled returning to userspace, while it never gets changed for 64bit tasks. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200706163802.1836732-5-maz@kernel.org [will: removed branch instructions per Mark's review comments] Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08arm64: entry: Fix the typo in the comment of el1_dbg()Kevin Hao
The function name should be local_daif_mask(). Signed-off-by: Kevin Hao <haokexin@gmail.com> Acked-by: Mark Rutlamd <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200417103212.45812-2-haokexin@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-07arm64/cpufeature: Validate feature bits spacing in arm64_ftr_regs[]Anshuman Khandual
arm64_feature_bits for a register in arm64_ftr_regs[] are in a descending order as per their shift values. Validate that these features bits are defined correctly and do not overlap with each other. This check protects against any inadvertent erroneous changes to the register definitions. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/1594131793-9498-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-07KVM: arm64: Make struct kvm_regs userspace-onlyMarc Zyngier
struct kvm_regs is used by userspace to indicate which register gets accessed by the {GET,SET}_ONE_REG API. But as we're about to refactor the layout of the in-kernel register structures, we need the kernel to move away from it. Let's make kvm_regs userspace only, and let the kernel map it to its own internal representation. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07Merge branch 'kvm-arm64/ttl-for-arm64' into HEADMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07arm64: Detect the ARMv8.4 TTL featureMarc Zyngier
In order to reduce the cost of TLB invalidation, the ARMv8.4 TTL feature allows TLBs to be issued with a level allowing for quicker invalidation. Let's detect the feature for now. Further patches will implement its actual usage. Reviewed-by : Suzuki K Polose <suzuki.poulose@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-05KVM: arm64: Compile remaining hyp/ files for both VHE/nVHEDavid Brazdil
The following files in hyp/ contain only code shared by VHE/nVHE: vgic-v3-sr.c, aarch32.c, vgic-v2-cpuif-proxy.c, entry.S, fpsimd.S Compile them under both configurations. Deletions in image-vars.h reflect eliminated dependencies of nVHE code on the rest of the kernel. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200625131420.71444-14-dbrazdil@google.com
2020-07-05KVM: arm64: Duplicate hyp/timer-sr.c for VHE/nVHEDavid Brazdil
timer-sr.c contains a HVC handler for setting CNTVOFF_EL2 and two helper functions for controlling access to physical counter. The former is used by both VHE/nVHE and is duplicated, the latter are used only by nVHE and moved to nvhe/timer-sr.c. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200625131420.71444-13-dbrazdil@google.com
2020-07-05KVM: arm64: Split hyp/sysreg-sr.c to VHE/nVHEDavid Brazdil
sysreg-sr.c contains KVM's code for saving/restoring system registers, with some code shared between VHE/nVHE. These common routines are moved to a header file, VHE-specific code is moved to vhe/sysreg-sr.c and nVHE-specific code to nvhe/sysreg-sr.c. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200625131420.71444-12-dbrazdil@google.com
2020-07-05KVM: arm64: Split hyp/debug-sr.c to VHE/nVHEDavid Brazdil
debug-sr.c contains KVM's code for context-switching debug registers, with some code shared between VHE/nVHE. These common routines are moved to a header file, VHE-specific code is moved to vhe/debug-sr.c and nVHE-specific code to nvhe/debug-sr.c. Functions are slightly refactored to move code hidden behind `has_vhe()` checks to the corresponding .c files. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200625131420.71444-11-dbrazdil@google.com
2020-07-05KVM: arm64: Split hyp/switch.c to VHE/nVHEDavid Brazdil
switch.c implements context-switching for KVM, with large parts shared between VHE/nVHE. These common routines are moved to a header file, VHE-specific code is moved to vhe/switch.c and nVHE-specific code is moved to nvhe/switch.c. Previously __kvm_vcpu_run needed a different symbol name for VHE/nVHE. This is cleaned up and the caller in arm.c simplified. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200625131420.71444-10-dbrazdil@google.com
2020-07-05KVM: arm64: Duplicate hyp/tlb.c for VHE/nVHEDavid Brazdil
tlb.c contains code for flushing the TLB, with code shared between VHE/nVHE. Because common code is small, duplicate tlb.c and specialize each copy for VHE/nVHE. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200625131420.71444-9-dbrazdil@google.com
2020-07-05KVM: arm64: Move hyp-init.S to nVHEAndrew Scull
hyp-init.S contains the identity mapped initialisation code for the non-VHE code that runs at EL2. It is only used for non-VHE. Adjust code that calls into this to use the prefixed symbol name. Signed-off-by: Andrew Scull <ascull@google.com> Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200625131420.71444-8-dbrazdil@google.com
2020-07-05KVM: arm64: Build hyp-entry.S separately for VHE/nVHEDavid Brazdil
hyp-entry.S contains implementation of KVM hyp vectors. This code is mostly shared between VHE/nVHE, therefore compile it under both VHE and nVHE build rules. nVHE-specific host HVC handler is hidden behind __KVM_NVHE_HYPERVISOR__. Adjust code which selects which KVM hyp vecs to install to choose the correct VHE/nVHE symbol. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200625131420.71444-7-dbrazdil@google.com
2020-07-05KVM: arm64: Handle calls to prefixed hyp functionsAndrew Scull
Once hyp functions are moved to a hyp object, they will have prefixed symbols. This change declares and gets the address of the prefixed version for calls to the hyp functions. To aid migration, the hyp functions that have not yet moved have their prefixed versions aliased to their non-prefixed version. This begins with all the hyp functions being listed and will reduce to none of them once the migration is complete. Signed-off-by: Andrew Scull <ascull@google.com> [David: Extracted kvm_call_hyp nVHE branches into own helper macros, added comments around symbol aliases.] Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200625131420.71444-6-dbrazdil@google.com
2020-07-05KVM: arm64: Add build rules for separate VHE/nVHE object filesDavid Brazdil
Add new folders arch/arm64/kvm/hyp/{vhe,nvhe} and Makefiles for building code that runs in EL2 under VHE/nVHE KVM, repsectivelly. Add an include folder for hyp-specific header files which will include code common to VHE/nVHE. Build nVHE code with -D__KVM_NVHE_HYPERVISOR__, VHE code with -D__KVM_VHE_HYPERVISOR__. Under nVHE compile each source file into a `.hyp.tmp.o` object first, then prefix all its symbols with "__kvm_nvhe_" using `objcopy` and produce a `.hyp.o`. Suffixes were chosen so that it would be possible for VHE and nVHE to share some source files, but compiled with different CFLAGS. The nVHE ELF symbol prefix is added to kallsyms.c as ignored. EL2-only symbols will never appear in EL1 stack traces. Due to symbol prefixing, add a section in image-vars.h for aliases of symbols that are defined in nVHE EL2 and accessed by kernel in EL1 or vice versa. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200625131420.71444-4-dbrazdil@google.com
2020-07-04Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "Nothing earth-shattering, really - some CPU errata workarounds (one day they'll get it right, ha!) and a fix for a boot failure with very large kernel images where the alternative patching gets confused when patching relative branches using veneers. - Fix alternative patching for very large kernel images and modules - Hook up existing CPU errata workarounds for Qualcomm Kryo CPUs" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Add KRYO4XX silver CPU cores to erratum list 1530923 and 1024718 arm64: Add KRYO4XX gold CPU cores to erratum list 1463225 and 1418040 arm64: Add MIDR value for KRYO4XX gold CPU cores arm64/alternatives: use subsections for replacement sequences
2020-07-04arch: rename copy_thread_tls() back to copy_thread()Christian Brauner
Now that HAVE_COPY_THREAD_TLS has been removed, rename copy_thread_tls() back simply copy_thread(). It's a simpler name, and doesn't imply that only tls is copied here. This finishes an outstanding chunk of internal process creation work since we've added clone3(). Cc: linux-arch@vger.kernel.org Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>A Acked-by: Stafford Horne <shorne@gmail.com> Acked-by: Greentime Hu <green.hu@gmail.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>A Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-07-03vmalloc: fix the owner argument for the new __vmalloc_node_range callersChristoph Hellwig
Fix the recently added new __vmalloc_node_range callers to pass the correct values as the owner for display in /proc/vmallocinfo. Fixes: 800e26b81311 ("x86/hyperv: allocate the hypercall page with only read and execute bits") Fixes: 10d5e97c1bf8 ("arm64: use PAGE_KERNEL_ROX directly in alloc_insn_page") Fixes: 7a0e27b2a0ce ("mm: remove vmalloc_exec") Reported-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20200627075649.2455097-1-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-03arm64/cpufeature: Replace all open bits shift encodings with macrosAnshuman Khandual
There are many open bits shift encodings for various CPU ID registers that are scattered across cpufeature. This replaces them with register specific sensible macro definitions. This should not have any functional change. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/1593748297-1965-5-git-send-email-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-03arm64/cpufeature: Add remaining feature bits in ID_AA64MMFR2 registerAnshuman Khandual
Enable EVT, BBM, TTL, IDS, ST, NV and CCIDX features bits in ID_AA64MMFR2 register as per ARM DDI 0487F.a specification. Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/1593748297-1965-4-git-send-email-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-03arm64/cpufeature: Add remaining feature bits in ID_AA64MMFR1 registerAnshuman Khandual
Enable ETS, TWED, XNX and SPECSEI features bits in ID_AA64MMFR1 register as per ARM DDI 0487F.a specification. Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/1593748297-1965-3-git-send-email-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-03arm64/cpufeature: Add remaining feature bits in ID_AA64MMFR0 registerAnshuman Khandual
Enable EVC, FGT, EXS features bits in ID_AA64MMFR0 register as per ARM DDI 0487F.a specification. Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/1593748297-1965-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-03arm64: Add KRYO4XX silver CPU cores to erratum list 1530923 and 1024718Sai Prakash Ranjan
KRYO4XX silver/LITTLE CPU cores with revision r1p0 are affected by erratum 1530923 and 1024718, so add them to the respective list. The variant and revision bits are implementation defined and are different from the their Cortex CPU counterparts on which they are based on, i.e., r1p0 is equivalent to rdpe. Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Link: https://lore.kernel.org/r/7013e8a3f857ca7e82863cc9e34a614293d7f80c.1593539394.git.saiprakash.ranjan@codeaurora.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-03arm64: Add KRYO4XX gold CPU cores to erratum list 1463225 and 1418040Sai Prakash Ranjan
KRYO4XX gold/big CPU core revisions r0p0 to r3p1 are affected by erratum 1463225 and 1418040, so add them to the respective list. The variant and revision bits are implementation defined and are different from the their Cortex CPU counterparts on which they are based on, i.e., (r0p0 to r3p1) is equivalent to (rcpe to rfpf). Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Link: https://lore.kernel.org/r/83780e80c6377c12ca51b5d53186b61241685e49.1593539394.git.saiprakash.ranjan@codeaurora.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-02arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfoBhupesh Sharma
TCR_EL1.TxSZ, which controls the VA space size, is configured by a single kernel image to support either 48-bit or 52-bit VA space. If the ARMv8.2-LVA optional feature is present and we are running with a 64KB page size, then it is possible to use 52-bits of address space for both userspace and kernel addresses. However, any kernel binary that supports 52-bit must also be able to fall back to 48-bit at early boot time if the hardware feature is not present. Since TCR_EL1.T1SZ indicates the size of the memory region addressed by TTBR1_EL1, export the same in vmcoreinfo. User-space utilities like makedumpfile and crash-utility need to read this value from vmcoreinfo for determining if a virtual address lies in the linear map range. While at it also add documentation for TCR_EL1.T1SZ variable being added to vmcoreinfo. It indicates the size offset of the memory region addressed by TTBR1_EL1. Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Tested-by: John Donnelly <john.p.donnelly@oracle.com> Tested-by: Kamlakant Patel <kamlakantp@marvell.com> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Reviewed-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Steve Capper <steve.capper@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Dave Anderson <anderson@redhat.com> Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: kexec@lists.infradead.org Link: https://lore.kernel.org/r/1589395957-24628-3-git-send-email-bhsharma@redhat.com [catalin.marinas@arm.com: removed vabits_actual from the commit log] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-02arm64/panic: Unify all three existing notifier blocksAnshuman Khandual
Currently there are three different registered panic notifier blocks. This unifies all of them into a single one i.e arm64_panic_block, hence reducing code duplication and required calling sequence during panic. This preserves the existing dump sequence. While here, just use device_initcall() directly instead of __initcall() which has been a legacy alias for the earlier. This replacement is a pure cleanup with no functional implications. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Steve Capper <steve.capper@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/1593405511-7625-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-02arm64/alternatives: use subsections for replacement sequencesArd Biesheuvel
When building very large kernels, the logic that emits replacement sequences for alternatives fails when relative branches are present in the code that is emitted into the .altinstr_replacement section and patched in at the original site and fixed up. The reason is that the linker will insert veneers if relative branches go out of range, and due to the relative distance of the .altinstr_replacement from the .text section where its branch targets usually live, veneers may be emitted at the end of the .altinstr_replacement section, with the relative branches in the sequence pointed at the veneers instead of the actual target. The alternatives patching logic will attempt to fix up the branch to point to its original target, which will be the veneer in this case, but given that the patch site is likely to be far away as well, it will be out of range and so patching will fail. There are other cases where these veneers are problematic, e.g., when the target of the branch is in .text while the patch site is in .init.text, in which case putting the replacement sequence inside .text may not help either. So let's use subsections to emit the replacement code as closely as possible to the patch site, to ensure that veneers are only likely to be emitted if they are required at the patch site as well, in which case they will be in range for the replacement sequence both before and after it is transported to the patch site. This will prevent alternative sequences in non-init code from being released from memory after boot, but this is tolerable given that the entire section is only 512 KB on an allyesconfig build (which weighs in at 500+ MB for the entire Image). Also, note that modules today carry the replacement sequences in non-init sections as well, and any of those that target init code will be emitted into init sections after this change. This fixes an early crash when booting an allyesconfig kernel on a system where any of the alternatives sequences containing relative branches are activated at boot (e.g., ARM64_HAS_PAN on TX2) Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Dave P Martin <dave.martin@arm.com> Link: https://lore.kernel.org/r/20200630081921.13443-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-02arm64/module: Optimize module load time by optimizing PLT countingSaravana Kannan
When loading a module, module_frob_arch_sections() tries to figure out the number of PLTs that'll be needed to handle all the RELAs. While doing this, it tries to dedupe PLT allocations for multiple R_AARCH64_CALL26 relocations to the same symbol. It does the same for R_AARCH64_JUMP26 relocations. To make checks for duplicates easier/faster, it sorts the relocation list by type, symbol and addend. That way, to check for a duplicate relocation, it just needs to compare with the previous entry. However, sorting the entire relocation array is unnecessary and expensive (O(n log n)) because there are a lot of other relocation types that don't need deduping or can't be deduped. So this commit partitions the array into entries that need deduping and those that don't. And then sorts just the part that needs deduping. And when CONFIG_RANDOMIZE_BASE is disabled, the sorting is skipped entirely because PLTs are not allocated for R_AARCH64_CALL26 and R_AARCH64_JUMP26 if it's disabled. This gives significant reduction in module load time for modules with large number of relocations with no measurable impact on modules with a small number of relocations. In my test setup with CONFIG_RANDOMIZE_BASE enabled, these were the results for a few downstream modules: Module Size (MB) wlan 14 video codec 3.8 drm 1.8 IPA 2.5 audio 1.2 gpu 1.8 Without this patch: Module Number of entries sorted Module load time (ms) wlan 243739 283 video codec 74029 138 drm 53837 67 IPA 42800 90 audio 21326 27 gpu 20967 32 Total time to load all these module: 637 ms With this patch: Module Number of entries sorted Module load time (ms) wlan 22454 61 video codec 10150 47 drm 13014 40 IPA 8097 63 audio 4606 16 gpu 6527 20 Total time to load all these modules: 247 Time saved during boot for just these 6 modules: 390 ms Signed-off-by: Saravana Kannan <saravanak@google.com> Acked-by: Will Deacon <will@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Link: https://lore.kernel.org/r/20200623011803.91232-1-saravanak@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>