path: root/arch/arm64/kernel/probes/decode-insn.c
2024-11-14  Merge branch 'for-next/mops' into for-next/core  (Catalin Marinas)

* for-next/mops:
  : More FEAT_MOPS (memcpy instructions) uses - in-kernel routines
  arm64: mops: Document requirements for hypervisors
  arm64: lib: Use MOPS for copy_page() and clear_page()
  arm64: lib: Use MOPS for memcpy() routines
  arm64: mops: Document booting requirement for HCR_EL2.MCE2
  arm64: mops: Handle MOPS exceptions from EL1
  arm64: probes: Disable kprobes/uprobes on MOPS instructions

# Conflicts:
#	arch/arm64/kernel/entry-common.c
2024-10-17  arm64: probes: Disable kprobes/uprobes on MOPS instructions  (Kristina Martsenko)

FEAT_MOPS instructions require that all three instructions (prologue, main and epilogue) appear consecutively in memory. Placing a kprobe/uprobe on one of them doesn't work as only a single instruction gets executed out-of-line or simulated. So don't allow placing a probe on a MOPS instruction.

Fixes: b7564127ffcb ("arm64: mops: detect and enable FEAT_MOPS")
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20240930161051.3777828-2-kristina.martsenko@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
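As a rough illustration (not the literal patch; the predicate name aarch64_insn_is_mops() and the function shape are assumptions), the change amounts to an early bail-out in the instruction decoder:

  /*
   * Sketch: a probe must never split the MOPS prologue/main/epilogue triple,
   * so any FEAT_MOPS instruction is rejected up front.
   */
  static enum probe_insn probe_decode_mops_check(u32 insn)
  {
          if (aarch64_insn_is_mops(insn))         /* assumed predicate name */
                  return INSN_REJECTED;           /* cannot be stepped or simulated alone */

          return INSN_GOOD;                       /* continue with the normal decode paths */
  }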
2024-10-15  arm64: insn: Simulate nop instruction for better uprobe performance  (Liao Chang)

v2->v1:
1. Remove the simulation of STP and the related bits.
2. Use arm64_skip_faulting_instruction for the single-stepping or FEAT_BTI scenario.

As Andrii pointed out, the uprobe/uretprobe selftest bench ran into a counterintuitive result: the nop and push variants are much slower than the ret variant [0]. The root cause lies in arch_probe_analyse_insn(), which excludes 'nop' and 'stp' from the list of emulatable instructions. This forces the kernel to return to userspace and execute them out-of-line, then trap back into the kernel to run the uprobe callback functions, leading to a significant performance overhead compared to the 'ret' variant, which is already emulated.

Typically a uprobe is installed on 'nop' for USDT, or at a function entry which starts with the instruction 'stp x29, x30, [sp, #imm]!' to push lr and fp onto the stack, regardless of whether the binary is kernel or userspace. In order to improve the performance of handling uprobes for these common use cases, this patch supports emulation of the arm64 equivalents of 'nop' and 'push'. The benchmark results below show that the performance gain from emulation is significant.

On Kunpeng916 (Hi1616), 4 NUMA nodes, 64 Arm64 cores @ 2.4GHz:

xol (1 cpus)
------------
uprobe-nop:      0.916 ± 0.001M/s (0.916M/prod)
uprobe-push:     0.908 ± 0.001M/s (0.908M/prod)
uprobe-ret:      1.855 ± 0.000M/s (1.855M/prod)
uretprobe-nop:   0.640 ± 0.000M/s (0.640M/prod)
uretprobe-push:  0.633 ± 0.001M/s (0.633M/prod)
uretprobe-ret:   0.978 ± 0.003M/s (0.978M/prod)

emulation (1 cpus)
------------------
uprobe-nop:      1.862 ± 0.002M/s (1.862M/prod)
uprobe-push:     1.743 ± 0.006M/s (1.743M/prod)
uprobe-ret:      1.840 ± 0.001M/s (1.840M/prod)
uretprobe-nop:   0.964 ± 0.004M/s (0.964M/prod)
uretprobe-push:  0.936 ± 0.004M/s (0.936M/prod)
uretprobe-ret:   0.940 ± 0.001M/s (0.940M/prod)

As shown above, the performance gap between the 'nop'/'push' and 'ret' variants has been significantly reduced. Because emulating the 'push' instruction needs to access userspace memory, it spends more cycles than the others.

As Mark suggested [1], it is painful to emulate the correct atomicity and ordering properties of STP, especially when it interacts with MTE, POE, etc., so this patch focuses only on the simulation of 'nop'. The simulation of STP and the related changes will be addressed in a separate patch.

[0] https://lore.kernel.org/all/CAEf4BzaO4eG6hr2hzXYpn+7Uer4chS0R99zLn02ezZ5YruVuQw@mail.gmail.com/
[1] https://lore.kernel.org/all/Zr3RN4zxF5XPgjEB@J2N7QTR9R3/

CC: Andrii Nakryiko <andrii.nakryiko@gmail.com>
CC: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Liao Chang <liaochang1@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240909071114.1150053-1-liaochang1@huawei.com
[catalin.marinas@arm.com: small tweaks following MarkR's comments]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
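The NOP emulation itself is tiny; a sketch along these lines (simulate_nop() and arm64_skip_faulting_instruction() are the names mentioned above, the exact body is assumed):

  /*
   * Sketch: a NOP has no architectural effect, so "emulating" it is just
   * advancing the PC in the saved register state past the probed instruction,
   * instead of bouncing through an out-of-line single-step.
   */
  static void simulate_nop(u32 opcode, long addr, struct pt_regs *regs)
  {
          arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE);
  }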
2024-10-15  arm64: probes: Remove probe_opcode_t  (Mark Rutland)

The probe_opcode_t typedef for u32 isn't necessary, and is a source of confusion as it is easily confused with kprobe_opcode_t, which is a typedef for __le32. The typedef is only used within arch/arm64, and all of arm64's common insn code uses u32 for the endian-agnostic value of an instruction, so it'd be clearer to use u32 consistently. Remove probe_opcode_t and use u32 directly.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20241008155851.801546-7-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-10-15  arm64: probes: Cleanup kprobes endianness conversions  (Mark Rutland)

The core kprobes code uses kprobe_opcode_t for the in-memory representation of an instruction, using 'kprobe_opcode_t *' for XOL slots. As arm64 instructions are always little-endian 32-bit values, kprobe_opcode_t should be __le32, but at the moment it is typedef'd to u32. Today there is no functional issue as we convert values via cpu_to_le32() and le32_to_cpu() where necessary, but these conversions are inconsistent with the types used, causing sparse warnings:

| CHECK   arch/arm64/kernel/probes/kprobes.c
| arch/arm64/kernel/probes/kprobes.c:102:21: warning: cast to restricted __le32
| CHECK   arch/arm64/kernel/probes/decode-insn.c
| arch/arm64/kernel/probes/decode-insn.c:122:46: warning: cast to restricted __le32
| arch/arm64/kernel/probes/decode-insn.c:124:50: warning: cast to restricted __le32
| arch/arm64/kernel/probes/decode-insn.c:136:31: warning: cast to restricted __le32

Improve this by making kprobe_opcode_t a typedef for __le32 and consistently using this for pointers to executable instructions. With this change we can rely on the type system to tell us where conversions are necessary. Since kprobe::opcode is changed from u32 to __le32, the existing le32_to_cpu() conversion moves from the point where it is initialized (in arch_prepare_kprobe()) to the points where it is consumed, when passed to a handler or text patching function. As kprobe::opcode isn't altered or consumed elsewhere, this shouldn't result in a functional change.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20241008155851.801546-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
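In effect the conversion moves to the point of use; a sketch of the idea (the helper shape is assumed, not the literal diff):

  typedef __le32 kprobe_opcode_t;       /* instructions in memory are always LE */

  /* Sketch: read the LE value from memory, convert only where it is consumed. */
  static u32 probed_insn_value(const kprobe_opcode_t *addr)
  {
          return le32_to_cpu(*addr);    /* endian-agnostic u32 for decode/simulation */
  }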
2024-10-09  arm64: probes: Remove broken LDR (literal) uprobe support  (Mark Rutland)

The simulate_ldr_literal() and simulate_ldrsw_literal() functions are unsafe to use for uprobes. Both functions were originally written for use with kprobes, and access memory with plain C accesses. When uprobes was added, these were reused unmodified even though they cannot safely access user memory. There are three key problems:

1) The plain C accesses do not have corresponding extable entries, and thus if they encounter a fault the kernel will treat these as unintentional accesses to user memory, resulting in a BUG() which will kill the kernel thread, and likely lead to further issues (e.g. lockup or panic()).

2) The plain C accesses are subject to HW PAN and SW PAN, and so when either is in use, any attempt to simulate an access to user memory will fault. Thus neither simulate_ldr_literal() nor simulate_ldrsw_literal() can do anything useful when simulating a user instruction on any system with HW PAN or SW PAN.

3) The plain C accesses are privileged, as they run in kernel context, and in practice can access a small range of kernel virtual addresses. The instructions they simulate have a range of +/-1MiB, and since the simulated instruction must itself be a user instruction in the TTBR0 address range, these can address the final 1MiB of the TTBR1 address range by wrapping downwards from an address in the first 1MiB of the TTBR0 address range.

   In contemporary kernels the last 8MiB of the TTBR1 address range is reserved, and accesses to this will always fault, meaning this is no worse than (1). Historically, it was theoretically possible for the linear map or vmemmap to spill into the final 8MiB of the TTBR1 address range, but in practice this is extremely unlikely to occur as this would require either:

   * Having enough physical memory to fill the entire linear map all the way to the final 1MiB of the TTBR1 address range.

   * Getting unlucky with KASLR randomization of the linear map such that the populated region happens to overlap with the last 1MiB of the TTBR1 address range.

   ... and in either case if we were to spill into the final page there would be larger problems as the final page would alias with error pointers.

Practically speaking, (1) and (2) are the big issues. Given there have been no reports of problems since the broken code was introduced, it appears that no-one is relying on probing these instructions with uprobes.

Avoid these issues by not allowing uprobes on LDR (literal) and LDRSW (literal), limiting the use of simulate_ldr_literal() and simulate_ldrsw_literal() to kprobes. Attempts to place uprobes on LDR (literal) and LDRSW (literal) will be rejected as arm_probe_decode_insn() will return INSN_REJECTED. In future we can consider introducing working uprobes support for these instructions, but this will require more significant work.

Fixes: 9842ceae9fa8 ("arm64: Add uprobe support")
Cc: stable@vger.kernel.org
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20241008155851.801546-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
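In decoder terms this splits the literal-load handling between the shared path and the kprobe-only path; a hedged sketch (the aarch64_insn_is_ldr_lit()/aarch64_insn_is_ldrsw_lit() predicate names are assumed, and the function below is illustrative, not the actual code):

  /*
   * Sketch: the shared, uprobe-facing decode rejects literal loads outright,
   * because simulate_ldr_literal()/simulate_ldrsw_literal() use plain C
   * accesses that are only valid against kernel text. The kprobe-only decode
   * path keeps those simulation handlers.
   */
  static bool uprobe_rejects_literal_load(u32 insn)
  {
          return aarch64_insn_is_ldr_lit(insn) || aarch64_insn_is_ldrsw_lit(insn);
  }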
2022-11-15  arm64: insn: simplify insn group identification  (Mark Rutland)

The only code which needs to check for an entire instruction group is the aarch64_insn_is_steppable() helper function used by kprobes, which must not be instrumented, and only needs to check for the "Branch, exception generation and system instructions" class. Currently we have an out-of-line helper in insn.c which must be marked as __kprobes, which indexes a table with some bits extracted from the instruction. In aarch64_insn_is_steppable() we then need to compare the result with an expected enum value.

It would be simpler to have a predicate for this, as with the other aarch64_insn_is_*() helpers, which would be always inlined to prevent inadvertent instrumentation, and would permit better code generation.

This patch adds a predicate function for this instruction group using the existing __AARCH64_INSN_FUNCS() helpers, and removes the existing out-of-line helper. As the only class we currently care about is the branch+exception+sys class, I have only added helpers for this, and left the other classes unimplemented for now.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20221114135928.3000571-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
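The resulting predicate is essentially a mask/value test on the major encoding class; a sketch assuming the "branch, exception generation and system" class is selected by bits [28:26] == 0b101 (the in-tree helper is generated via __AARCH64_INSN_FUNCS(), so its exact form differs):

  /*
   * Sketch: an always-inline predicate avoids an out-of-line, __kprobes-marked
   * helper and lets the compiler fold the test into the caller.
   */
  static __always_inline bool insn_is_class_branch_sys(u32 insn)
  {
          /* bits [28:26] == 0b101: branch, exception generation and system class */
          return (insn & 0x1c000000) == 0x14000000;
  }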
2020-09-14  arm64: kprobe: clarify the comment of steppable hint instructions  (Amit Daniel Kachhap)

The existing comment about steppable hint instructions is incomplete and only describes NOP instructions as steppable. As the function aarch64_insn_is_steppable_hint() allows all white-listed hint instructions to be probed, the comment is updated to reflect this.

Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Dave Martin <dave.martin@arm.com>
Link: https://lore.kernel.org/r/20200914083656.21428-7-amit.kachhap@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-14  arm64: kprobe: add checks for ARMv8.3-PAuth combined instructions  (Amit Daniel Kachhap)

Currently the ARMv8.3-PAuth combined branch instructions (braa, retaa etc.) are not simulated for out-of-line execution with a handler. Hence a uprobe on such an instruction leads to kernel warnings in a loop, as these instructions are not explicitly checked and fall into the INSN_GOOD category. Other combined instructions like LDRAA and LDRAB can be probed.

The issue with the combined branch instructions is fixed by adding group definitions of all such instructions and rejecting their probes. The instruction groups added are br_auth (braa, brab, braaz and brabz), blr_auth (blraa, blrab, blraaz and blrabz), ret_auth (retaa and retab) and eret_auth (eretaa and eretab).

Warning log:

WARNING: CPU: 0 PID: 156 at arch/arm64/kernel/probes/uprobes.c:182 uprobe_single_step_handler+0x34/0x50
Modules linked in:
CPU: 0 PID: 156 Comm: func Not tainted 5.9.0-rc3 #188
Hardware name: Foundation-v8A (DT)
pstate: 804003c9 (Nzcv DAIF +PAN -UAO BTYPE=--)
pc : uprobe_single_step_handler+0x34/0x50
lr : single_step_handler+0x70/0xf8
sp : ffff800012af3e30
x29: ffff800012af3e30 x28: ffff000878723b00 x27: 0000000000000000
x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
x23: 0000000060001000 x22: 00000000cb000022 x21: ffff800012065ce8
x20: ffff800012af3ec0 x19: ffff800012068d50 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000 x9 : ffff800010085c90
x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffff80001205a9c8
x5 : ffff80001205a000 x4 : ffff80001233db80 x3 : ffff8000100a7a60
x2 : 0020000000000003 x1 : 0000fffffffff008 x0 : ffff800012af3ec0
Call trace:
 uprobe_single_step_handler+0x34/0x50
 single_step_handler+0x70/0xf8
 do_debug_exception+0xb8/0x130
 el0_sync_handler+0x138/0x1b8
 el0_sync+0x158/0x180

Fixes: 74afda4016a7 ("arm64: compile the kernel with ptrauth return address signing")
Fixes: 04ca3204fa09 ("arm64: enable pointer authentication")
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20200914083656.21428-2-amit.kachhap@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
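Conceptually the fix boils down to new group predicates plus an explicit rejection; a sketch (the aarch64_insn_is_*_auth() names mirror the groups listed above; the encodings and the call site are assumed/omitted):

  /*
   * Sketch: the PAuth combined branches can neither be stepped out-of-line nor
   * simulated yet, so the decoder must reject them instead of letting them
   * fall through as INSN_GOOD.
   */
  static bool insn_is_pauth_combined_branch(u32 insn)
  {
          return aarch64_insn_is_br_auth(insn)  || aarch64_insn_is_blr_auth(insn) ||
                 aarch64_insn_is_ret_auth(insn) || aarch64_insn_is_eret_auth(insn);
  }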
2020-05-04  arm64: insn: Provide a better name for aarch64_insn_is_nop()  (Mark Brown)

The current aarch64_insn_is_nop() has exactly one caller which uses it solely to identify if the instruction is a HINT that can safely be stepped, requiring us to list things that aren't NOPs and make things more confusing than they need to be. Rename the function to reflect the actual usage and make things more clear.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20200504131326.18290-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2019-05-30  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174  (Thomas Gleixner)

Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 655 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070034.575739538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-07  arm64: fix error: conflicting types for 'kprobe_fault_handler'  (Pratyush Anand)

When CONFIG_KPROBES is disabled but CONFIG_UPROBE_EVENT is enabled, we get the following compilation error:

In file included from .../arch/arm64/kernel/probes/decode-insn.c:20:0:
.../arch/arm64/include/asm/kprobes.h:52:5: error: conflicting types for 'kprobe_fault_handler'
 int kprobe_fault_handler(struct pt_regs *regs, unsigned int fsr);
     ^~~~~~~~~~~~~~~~~~~~
In file included from .../arch/arm64/kernel/probes/decode-insn.c:17:0:
.../include/linux/kprobes.h:398:90: note: previous definition of 'kprobe_fault_handler' was here
 static inline int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
                                                                         ^
.../scripts/Makefile.build:290: recipe for target 'arch/arm64/kernel/probes/decode-insn.o' failed

<asm/kprobes.h> is already included from <linux/kprobes.h> under #ifdef CONFIG_KPROBES. So, this patch fixes the error by removing the direct include from decode-insn.c.

Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
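The fix itself is only an include-list change in decode-insn.c; roughly (the surrounding includes are abridged and assumed):

  #include <linux/kernel.h>
  #include <linux/kprobes.h>    /* already pulls in <asm/kprobes.h> when CONFIG_KPROBES=y */
  /*
   * The direct #include <asm/kprobes.h> is dropped: with CONFIG_KPROBES
   * disabled it redeclares kprobe_fault_handler() with a conflicting type.
   */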
2016-11-07  arm64: kprobe: protect/rename few definitions to be reused by uprobe  (Pratyush Anand)

The decode-insn code has to be reused by the arm64 uprobe implementation as well. Therefore, this patch protects some portions of the kprobe code and renames a few others, so that the decode-insn functionality can be reused by uprobes even when CONFIG_KPROBES is not defined.

kprobe_opcode_t and struct arch_specific_insn are also defined by linux/kprobes.h when CONFIG_KPROBES is not defined. So, protect these definitions in asm/probes.h.

linux/kprobes.h already includes asm/kprobes.h. Therefore, remove the inclusion of asm/kprobes.h from decode-insn.c.

Some definitions like kprobe_insn and kprobes_handler_t etc. can be re-used by uprobes, so it would be better to remove the 'k' from their names. struct arch_specific_insn is specific to kprobes; therefore, introduce a new struct arch_probe_insn which will be common to both kprobes and uprobes, so that the decode-insn code can be shared. Modify the kprobe code accordingly.

The function arm_probe_decode_insn() will be needed by uprobes as well, so make it global.

Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
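The shared decode state ends up looking roughly like the sketch below (the field names are assumed from the pre-existing kprobe structure; the kprobe-specific struct then wraps it):

  /*
   * Sketch: state shared by kprobe and uprobe decoding, usable even when
   * CONFIG_KPROBES is not defined.
   */
  struct arch_probe_insn {
          probe_opcode_t          *insn;          /* out-of-line instruction slot */
          pstate_check_t          *pstate_cc;     /* condition check for cond. branches */
          probes_handler_t        *handler;       /* simulation handler, if any */
          unsigned long           restore;        /* PC to restore after stepping */
  };

  /* kprobes keep their own wrapper around the shared state */
  struct arch_specific_insn {
          struct arch_probe_insn  api;
  };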
2016-09-15  arm64: Improve kprobes test for atomic sequence  (David A. Long)

Kprobes searches backwards a finite number of instructions to determine if there is an attempt to probe a load/store exclusive sequence. It stops when it hits the maximum number of instructions or a load or store exclusive. However this means it can run up past the beginning of the function and start looking at literal constants. This has been shown to cause a false positive and block insertion of the probe.

To fix this, further limit the backwards search to stop if it hits a symbol address from kallsyms. The presumption is that this is the entry point to this code (particularly for the common case of placing probes at the beginning of functions). This also improves efficiency by not searching code that is not part of the function.

There may be some possibility that the label might not denote the entry path to the probed instruction, but the likelihood seems low, and this is just another example of how the kprobes user really needs to be careful about what they are doing.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: David A. Long <dave.long@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
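A hedged sketch of the tightened limit (constant and helper names assumed; the in-tree code differs in detail): the scan end becomes the nearer of the fixed backstop and the symbol start reported by kallsyms.

  /*
   * Sketch: compute how far back the atomic-sequence scan may go. Preferring
   * the symbol's entry point keeps the scan from wandering into literals that
   * precede the function.
   */
  static u32 *atomic_scan_end(u32 *addr, unsigned int max_insns)
  {
          unsigned long size = 0, offset = 0;

          if (!kallsyms_lookup_size_offset((unsigned long)addr, &size, &offset))
                  return NULL;                            /* no symbol: skip the scan entirely */

          if (offset < max_insns * sizeof(u32))
                  return addr - (offset / sizeof(u32));   /* stop at the entry point */

          return addr - max_insns;                        /* fixed backstop */
  }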
2016-07-19  arm64: kprobes instruction simulation support  (Sandeepa Prabhu)

Kprobes needs simulation of instructions that cannot be stepped from a different memory location, e.g. those instructions that use PC-relative addressing. In simulation, the behaviour of the instruction is implemented using a copy of pt_regs.

The following instruction categories are simulated:
 - All branching instructions (conditional, register, and immediate)
 - Literal access instructions (load-literal, adr/adrp)

Conditional execution is limited to branching instructions in ARM v8. If the conditions in PSTATE do not match the condition fields of the opcode, the instruction is effectively a NOP.

Thanks to Will Cohen for assorted suggested changes.

Signed-off-by: Sandeepa Prabhu <sandeepa.s.prabhu@gmail.com>
Signed-off-by: William Cohen <wcohen@redhat.com>
Signed-off-by: David A. Long <dave.long@linaro.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
[catalin.marinas@arm.com: removed linux/module.h include]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
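As a concrete flavour of simulation on the saved pt_regs, a hedged sketch of handling ADR (the field extraction follows the ADR encoding; the kernel's real helper and its decoding differ in detail):

  /*
   * Sketch: ADR computes Xd = PC + signed 21-bit immediate. Simulation applies
   * the effect to the saved pt_regs instead of executing the instruction from
   * an out-of-line slot.
   */
  static void simulate_adr_sketch(u32 opcode, long addr, struct pt_regs *regs)
  {
          long imm = (((opcode >> 5) & 0x7ffff) << 2) | ((opcode >> 29) & 0x3);

          imm = sign_extend64(imm, 20);                   /* immhi:immlo, 21 bits signed */
          pt_regs_write_reg(regs, opcode & 0x1f, addr + imm);
          instruction_pointer_set(regs, addr + 4);        /* step past the probed insn */
  }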
2016-07-19  arm64: Kprobes with single stepping support  (Sandeepa Prabhu)

Add support for basic kernel probes (kprobes) and jump probes (jprobes) for ARM64.

Kprobes utilizes the software breakpoint and single step debug exceptions supported on ARM v8. A software breakpoint is placed at the probe address to trap kernel execution into the kprobe handler.

ARM v8 supports enabling single stepping before the break exception return (ERET), with the next PC in the exception return address (ELR_EL1). The kprobe handler prepares an executable memory slot for out-of-line execution with a copy of the original instruction being probed, and enables single stepping. The PC is set to the out-of-line slot address before the ERET. With this scheme, the instruction is executed with the exact same register context except for the PC (and DAIF) registers.

The debug mask (PSTATE.D) is enabled only when single stepping a recursive kprobe, e.g. during kprobe re-entry, so that the probed instruction can be single stepped within the kprobe handler -exception- context. The recursion depth of kprobes is always 2, i.e. upon probe re-entry, any further re-entry is prevented by not calling handlers, and the case is counted as a missed kprobe.

Single stepping from the out-of-line (XOL) slot has a drawback for PC-relative accesses like branching and symbolic literal access, as the offset from the new PC (slot address) may not be ensured to fit in the immediate value of the opcode. Such instructions need simulation, so reject probing them.

Instructions generating exceptions or CPU mode changes are rejected for probing. Exclusive load/store instructions are rejected too. Additionally, the code is checked to see if it is inside an exclusive load/store sequence (code from Pratyush).

System instructions are mostly enabled for stepping, except MSR/MRS accesses to the "DAIF" flags in PSTATE, which are not safe for probing.

This also changes arch/arm64/include/asm/ptrace.h to use include/asm-generic/ptrace.h.

Thanks to Steve Capper and Pratyush Anand for several suggested changes.

Signed-off-by: Sandeepa Prabhu <sandeepa.s.prabhu@gmail.com>
Signed-off-by: David A. Long <dave.long@linaro.org>
Signed-off-by: Pratyush Anand <panand@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
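Taken together, the decode policy described above gives every candidate instruction a three-way verdict; a sketch (the INSN_* values are the ones used by the probes code, the helpers below are illustrative only and insn_can_be_simulated() is hypothetical):

  /*
   * Sketch: classification used by the probe decoder.
   *   INSN_GOOD         - safe to single step from an out-of-line (XOL) slot
   *   INSN_GOOD_NO_SLOT - simulated on pt_regs, no XOL slot needed
   *   INSN_REJECTED     - unsafe to probe (exclusives, DAIF MSR/MRS, ...)
   */
  static enum probe_insn classify_insn_sketch(u32 insn)
  {
          if (aarch64_insn_is_steppable(insn))    /* local helper in decode-insn.c */
                  return INSN_GOOD;

          if (insn_can_be_simulated(insn))        /* hypothetical: PC-relative insns etc. */
                  return INSN_GOOD_NO_SLOT;

          return INSN_REJECTED;
  }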