From 27d8fa207835fa5c7cd6f969c6cc94d1123951ee Mon Sep 17 00:00:00 2001
From: Catalin Marinas
Date: Wed, 15 Jun 2022 14:22:38 +0100
Subject: Revert "arm64: Initialize jump labels before setup_machine_fdt()"

This reverts commit 73e2d827a501d48dceeb5b9b267a4cd283d6b1ae.

The reverted patch was needed as a fix after commit f5bda35fba61
("random: use static branch for crng_ready()"). However, this was
already fixed by 60e5b2886b92 ("random: do not use jump labels before
they are initialized"), and hence it is no longer necessary to
initialise jump labels before setup_machine_fdt().

Signed-off-by: Catalin Marinas
---
 arch/arm64/kernel/setup.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index cf3a759f10d4..fea3223704b6 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -303,14 +303,13 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
 	early_fixmap_init();
 	early_ioremap_init();
 
+	setup_machine_fdt(__fdt_pointer);
+
 	/*
 	 * Initialise the static keys early as they may be enabled by the
-	 * cpufeature code, early parameters, and DT setup.
+	 * cpufeature code and early parameters.
 	 */
 	jump_label_init();
-
-	setup_machine_fdt(__fdt_pointer);
-
 	parse_early_param();
 
 	/*
-- cgit

From 3eefdf9d1e406f3da47470b2854347009ffcb6fa Mon Sep 17 00:00:00 2001
From: Mark Rutland
Date: Tue, 14 Jun 2022 09:09:42 +0100
Subject: arm64: ftrace: fix branch range checks

The branch range checks in ftrace_make_call() and ftrace_make_nop() are
incorrect, erroneously permitting a forwards branch of 128M and
erroneously rejecting a backwards branch of 128M.

This is because both functions calculate the offset backwards,
calculating the offset *from* the target *to* the branch, rather than
the other way around as the later comparisons expect.

If an out-of-range branch were erroneously permitted, this would later
be rejected by aarch64_insn_gen_branch_imm() as branch_imm_common()
checks the bounds correctly, resulting in warnings and the placement of
a BRK instruction. Note that this can only happen for a forwards branch
of exactly 128M, and so the caller would need to be exactly 128M bytes
below the relevant ftrace trampoline.

If an in-range branch were erroneously rejected, then:

* For modules when CONFIG_ARM64_MODULE_PLTS=y, this would result in the
  use of a PLT entry, which is benign.

  Note that this is the common case, as this is selected by
  CONFIG_RANDOMIZE_BASE (and therefore RANDOMIZE_MODULE_REGION_FULL),
  which distributions typically select. This is also selected by
  CONFIG_ARM64_ERRATUM_843419.

* For modules when CONFIG_ARM64_MODULE_PLTS=n, this would result in
  internal ftrace failures.

* For core kernel text, this would result in internal ftrace failures.

  Note that for this to happen, the kernel text would need to be at
  least 128M bytes in size, and typical configurations are smaller than
  this.

Fix this by calculating the offset *from* the branch *to* the target in
both functions.
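Viewed in isolation, the corrected check looks like the sketch below
(illustrative only, with standalone types rather than the kernel's):
the offset is taken *from* the branch at 'pc' *to* the target 'addr',
and compared against the asymmetric range a 'BL' instruction's signed
26-bit word offset can encode.

  #include <stdbool.h>
  #include <stdint.h>

  #define SZ_128M (128 * 1024 * 1024L)

  /*
   * A 'BL' encodes a signed 26-bit word offset, i.e. byte offsets in
   * [-SZ_128M, SZ_128M - 4], so the half-open interval below is exact
   * for 4-byte-aligned instruction addresses.
   */
  static bool bl_in_range(uint64_t pc, uint64_t addr)
  {
      int64_t offset = (int64_t)addr - (int64_t)pc;

      return offset >= -SZ_128M && offset < SZ_128M;
  }

Because negating an offset maps the half-open interval [-128M, 128M)
onto (-128M, 128M], the old reversed calculation only misclassified the
two +/-128M boundary cases, which is why the bug was benign in most
configurations.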
Ivanov" Reviewed-by: Chengming Zhou Reviewed-by: Ard Biesheuvel Link: https://lore.kernel.org/r/20220614080944.1349146-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas --- arch/arm64/kernel/ftrace.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c index f447c4a36f69..e1c88234b882 100644 --- a/arch/arm64/kernel/ftrace.c +++ b/arch/arm64/kernel/ftrace.c @@ -84,7 +84,7 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) { unsigned long pc = rec->ip; u32 old, new; - long offset = (long)pc - (long)addr; + long offset = (long)addr - (long)pc; if (offset < -SZ_128M || offset >= SZ_128M) { struct module *mod; @@ -183,7 +183,7 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec, unsigned long pc = rec->ip; bool validate = true; u32 old = 0, new; - long offset = (long)pc - (long)addr; + long offset = (long)addr - (long)pc; if (offset < -SZ_128M || offset >= SZ_128M) { u32 replaced; -- cgit From a6253579977e4c6f7818eeb05bf2bc65678a7187 Mon Sep 17 00:00:00 2001 From: Mark Rutland Date: Tue, 14 Jun 2022 09:09:43 +0100 Subject: arm64: ftrace: consistently handle PLTs. Sometimes it is necessary to use a PLT entry to call an ftrace trampoline. This is handled by ftrace_make_call() and ftrace_make_nop(), with each having *almost* identical logic, but this is not handled by ftrace_modify_call() since its introduction in commit: 3b23e4991fb66f6d ("arm64: implement ftrace with regs") Due to this, if we ever were to call ftrace_modify_call() for a callsite which requires a PLT entry for a trampoline, then either: a) If the old addr requires a trampoline, ftrace_modify_call() will use an out-of-range address to generate the 'old' branch instruction. This will result in warnings from aarch64_insn_gen_branch_imm() and ftrace_modify_code(), and no instructions will be modified. As ftrace_modify_call() will return an error, this will result in subsequent internal ftrace errors. b) If the old addr does not require a trampoline, but the new addr does, ftrace_modify_call() will use an out-of-range address to generate the 'new' branch instruction. This will result in warnings from aarch64_insn_gen_branch_imm(), and ftrace_modify_code() will replace the 'old' branch with a BRK. This will result in a kernel panic when this BRK is later executed. Practically speaking, case (a) is vastly more likely than case (b), and typically this will result in internal ftrace errors that don't necessarily affect the rest of the system. This can be demonstrated with an out-of-tree test module which triggers ftrace_modify_call(), e.g. 
| # insmod test_ftrace.ko
| test_ftrace: Function test_function raw=0xffffb3749399201c, callsite=0xffffb37493992024
| branch_imm_common: offset out of range
| branch_imm_common: offset out of range
| ------------[ ftrace bug ]------------
| ftrace failed to modify
| [<ffffb37493992024>] test_function+0x8/0x38 [test_ftrace]
|  actual: 1d:00:00:94
| Updating ftrace call site to call a different ftrace function
| ftrace record flags: e0000002
|  (2) R
|  expected tramp: ffffb374ae42ed54
| ------------[ cut here ]------------
| WARNING: CPU: 0 PID: 165 at kernel/trace/ftrace.c:2085 ftrace_bug+0x280/0x2b0
| Modules linked in: test_ftrace(+)
| CPU: 0 PID: 165 Comm: insmod Not tainted 5.19.0-rc2-00002-g4d9ead8b45ce #13
| Hardware name: linux,dummy-virt (DT)
| pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : ftrace_bug+0x280/0x2b0
| lr : ftrace_bug+0x280/0x2b0
| sp : ffff80000839ba00
| x29: ffff80000839ba00 x28: 0000000000000000 x27: ffff80000839bcf0
| x26: ffffb37493994180 x25: ffffb374b0991c28 x24: ffffb374b0d70000
| x23: 00000000ffffffea x22: ffffb374afcc33b0 x21: ffffb374b08f9cc8
| x20: ffff572b8462c000 x19: ffffb374b08f9000 x18: ffffffffffffffff
| x17: 6c6c6163202c6331 x16: ffffb374ae5ad110 x15: ffffb374b0d51ee4
| x14: 0000000000000000 x13: 3435646532346561 x12: 3437336266666666
| x11: 203a706d61727420 x10: 6465746365707865 x9 : ffffb374ae5149e8
| x8 : 336266666666203a x7 : 706d617274206465 x6 : 00000000fffff167
| x5 : ffff572bffbc4a08 x4 : 00000000fffff167 x3 : 0000000000000000
| x2 : 0000000000000000 x1 : ffff572b84461e00 x0 : 0000000000000022
| Call trace:
|  ftrace_bug+0x280/0x2b0
|  ftrace_replace_code+0x98/0xa0
|  ftrace_modify_all_code+0xe0/0x144
|  arch_ftrace_update_code+0x14/0x20
|  ftrace_startup+0xf8/0x1b0
|  register_ftrace_function+0x38/0x90
|  test_ftrace_init+0xd0/0x1000 [test_ftrace]
|  do_one_initcall+0x50/0x2b0
|  do_init_module+0x50/0x1f0
|  load_module+0x17c8/0x1d64
|  __do_sys_finit_module+0xa8/0x100
|  __arm64_sys_finit_module+0x2c/0x3c
|  invoke_syscall+0x50/0x120
|  el0_svc_common.constprop.0+0xdc/0x100
|  do_el0_svc+0x3c/0xd0
|  el0_svc+0x34/0xb0
|  el0t_64_sync_handler+0xbc/0x140
|  el0t_64_sync+0x18c/0x190
| ---[ end trace 0000000000000000 ]---

We can solve this by consistently determining whether to use a PLT
entry for an address. Note that since (the earlier) commit:

  f1a54ae9af0da4d7 ("arm64: module/ftrace: intialize PLT at load time")

... we can consistently determine the PLT address that a given callsite
will use, and therefore ftrace_make_nop() does not need to skip
validation when a PLT is in use.

This patch factors the existing logic out of ftrace_make_call() and
ftrace_make_nop() into a common ftrace_find_callable_addr() helper
function, which is used by ftrace_make_call(), ftrace_make_nop(), and
ftrace_modify_call().

In ftrace_make_nop() the patching is consistently validated by
ftrace_modify_code() as we can always determine what the old
instruction should have been.
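For reference, a minimal out-of-tree module along the lines of the one
used for the demonstration above might look as follows. This is a
hypothetical sketch, not the exact module: the names merely mirror the
log output, the handler is a stub, and actually reaching
ftrace_modify_call() additionally requires the callsite to switch
trampolines (e.g. when ftrace_ops with different flags attach to the
same function).

  #include <linux/ftrace.h>
  #include <linux/module.h>

  /* A traced callsite in module text, potentially >128M from the kernel. */
  static noinline void test_function(void)
  {
  }

  static void test_handler(unsigned long ip, unsigned long parent_ip,
                           struct ftrace_ops *ops, struct ftrace_regs *fregs)
  {
  }

  static struct ftrace_ops test_ops = {
      .func  = test_handler,
      /* Ask for the regs trampoline, as kprobes/livepatch would. */
      .flags = FTRACE_OPS_FL_SAVE_REGS,
  };

  static int __init test_ftrace_init(void)
  {
      unsigned long ip = (unsigned long)test_function;
      int ret;

      /* The +8 callsite offset here mirrors the log above. */
      pr_info("test_ftrace: Function test_function raw=0x%lx, callsite=0x%lx\n",
              ip, ip + 8);

      /* Trace only test_function(); its callsite may need a PLT. */
      ret = ftrace_set_filter_ip(&test_ops, ip, 0, 0);
      if (ret)
          return ret;

      return register_ftrace_function(&test_ops);
  }

  static void __exit test_ftrace_exit(void)
  {
      unregister_ftrace_function(&test_ops);
  }

  module_init(test_ftrace_init);
  module_exit(test_ftrace_exit);
  MODULE_LICENSE("GPL");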
Ivanov" Reviewed-by: Chengming Zhou Reviewed-by: Ard Biesheuvel Link: https://lore.kernel.org/r/20220614080944.1349146-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas --- arch/arm64/kernel/ftrace.c | 137 ++++++++++++++++++++++----------------------- 1 file changed, 66 insertions(+), 71 deletions(-) diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c index e1c88234b882..ea5dc7c90f46 100644 --- a/arch/arm64/kernel/ftrace.c +++ b/arch/arm64/kernel/ftrace.c @@ -78,47 +78,76 @@ static struct plt_entry *get_ftrace_plt(struct module *mod, unsigned long addr) } /* - * Turn on the call to ftrace_caller() in instrumented function + * Find the address the callsite must branch to in order to reach '*addr'. + * + * Due to the limited range of 'BL' instructions, modules may be placed too far + * away to branch directly and must use a PLT. + * + * Returns true when '*addr' contains a reachable target address, or has been + * modified to contain a PLT address. Returns false otherwise. */ -int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) +static bool ftrace_find_callable_addr(struct dyn_ftrace *rec, + struct module *mod, + unsigned long *addr) { unsigned long pc = rec->ip; - u32 old, new; - long offset = (long)addr - (long)pc; + long offset = (long)*addr - (long)pc; + struct plt_entry *plt; - if (offset < -SZ_128M || offset >= SZ_128M) { - struct module *mod; - struct plt_entry *plt; + /* + * When the target is within range of the 'BL' instruction, use 'addr' + * as-is and branch to that directly. + */ + if (offset >= -SZ_128M && offset < SZ_128M) + return true; - if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS)) - return -EINVAL; + /* + * When the target is outside of the range of a 'BL' instruction, we + * must use a PLT to reach it. We can only place PLTs for modules, and + * only when module PLT support is built-in. + */ + if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS)) + return false; - /* - * On kernels that support module PLTs, the offset between the - * branch instruction and its target may legally exceed the - * range of an ordinary relative 'bl' opcode. In this case, we - * need to branch via a trampoline in the module. - * - * NOTE: __module_text_address() must be called with preemption - * disabled, but we can rely on ftrace_lock to ensure that 'mod' - * retains its validity throughout the remainder of this code. - */ + /* + * 'mod' is only set at module load time, but if we end up + * dealing with an out-of-range condition, we can assume it + * is due to a module being loaded far away from the kernel. + * + * NOTE: __module_text_address() must be called with preemption + * disabled, but we can rely on ftrace_lock to ensure that 'mod' + * retains its validity throughout the remainder of this code. 
+	 */
+	if (!mod) {
 		preempt_disable();
 		mod = __module_text_address(pc);
 		preempt_enable();
+	}
 
-		if (WARN_ON(!mod))
-			return -EINVAL;
+	if (WARN_ON(!mod))
+		return false;
 
-		plt = get_ftrace_plt(mod, addr);
-		if (!plt) {
-			pr_err("ftrace: no module PLT for %ps\n", (void *)addr);
-			return -EINVAL;
-		}
-
-		addr = (unsigned long)plt;
+	plt = get_ftrace_plt(mod, *addr);
+	if (!plt) {
+		pr_err("ftrace: no module PLT for %ps\n", (void *)*addr);
+		return false;
 	}
+	*addr = (unsigned long)plt;
+	return true;
+}
+
+/*
+ * Turn on the call to ftrace_caller() in instrumented function
+ */
+int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
+{
+	unsigned long pc = rec->ip;
+	u32 old, new;
+
+	if (!ftrace_find_callable_addr(rec, NULL, &addr))
+		return -EINVAL;
+
 	old = aarch64_insn_gen_nop();
 	new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
@@ -132,6 +161,11 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
 	unsigned long pc = rec->ip;
 	u32 old, new;
 
+	if (!ftrace_find_callable_addr(rec, NULL, &old_addr))
+		return -EINVAL;
+	if (!ftrace_find_callable_addr(rec, NULL, &addr))
+		return -EINVAL;
+
 	old = aarch64_insn_gen_branch_imm(pc, old_addr,
 					  AARCH64_INSN_BRANCH_LINK);
 	new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
@@ -181,54 +215,15 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
 		    unsigned long addr)
 {
 	unsigned long pc = rec->ip;
-	bool validate = true;
 	u32 old = 0, new;
-	long offset = (long)addr - (long)pc;
 
-	if (offset < -SZ_128M || offset >= SZ_128M) {
-		u32 replaced;
-
-		if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
-			return -EINVAL;
-
-		/*
-		 * 'mod' is only set at module load time, but if we end up
-		 * dealing with an out-of-range condition, we can assume it
-		 * is due to a module being loaded far away from the kernel.
-		 */
-		if (!mod) {
-			preempt_disable();
-			mod = __module_text_address(pc);
-			preempt_enable();
-
-			if (WARN_ON(!mod))
-				return -EINVAL;
-		}
-
-		/*
-		 * The instruction we are about to patch may be a branch and
-		 * link instruction that was redirected via a PLT entry. In
-		 * this case, the normal validation will fail, but we can at
-		 * least check that we are dealing with a branch and link
-		 * instruction that points into the right module.
-		 */
-		if (aarch64_insn_read((void *)pc, &replaced))
-			return -EFAULT;
-
-		if (!aarch64_insn_is_bl(replaced) ||
-		    !within_module(pc + aarch64_get_branch_offset(replaced),
-				   mod))
-			return -EINVAL;
-
-		validate = false;
-	} else {
-		old = aarch64_insn_gen_branch_imm(pc, addr,
-						  AARCH64_INSN_BRANCH_LINK);
-	}
+	if (!ftrace_find_callable_addr(rec, mod, &addr))
+		return -EINVAL;
 
+	old = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
 	new = aarch64_insn_gen_nop();
 
-	return ftrace_modify_code(pc, old, new, validate);
+	return ftrace_modify_code(pc, old, new, true);
 }
 
 void arch_ftrace_update_code(int command)
-- cgit
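For context on the patch above: the module PLT entries it routes
out-of-range branches through are short trampolines emitted into the
module image. On arm64 they take roughly the following shape — a sketch
after struct plt_entry in arch/arm64/include/asm/module.h at the time,
with the encoding helpers omitted:

  /*
   * An arm64 module PLT entry: three instructions through which a 'BL'
   * can reach a target beyond its +/-128M range. x16 (IP0) is reserved
   * by the AAPCS for exactly this kind of veneer.
   */
  struct plt_entry {
      __le32 adrp; /* adrp x16, <target page>  : +/-4G reach */
      __le32 add;  /* add  x16, x16, <lo12>    */
      __le32 br;   /* br   x16                 */
  };

ftrace_find_callable_addr() rewrites '*addr' to point at such an entry,
so the patched 'BL' always stays in range while the veneer covers the
remaining distance.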
Ivanov" Reviewed-by: Chengming Zhou Reviewed-by: Ard Biesheuvel Link: https://lore.kernel.org/r/20220614080944.1349146-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas --- arch/arm64/kernel/entry-ftrace.S | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S index d42a205ef625..bd5df50e4643 100644 --- a/arch/arm64/kernel/entry-ftrace.S +++ b/arch/arm64/kernel/entry-ftrace.S @@ -102,7 +102,6 @@ SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL) * x19-x29 per the AAPCS, and we created frame records upon entry, so we need * to restore x0-x8, x29, and x30. */ -ftrace_common_return: /* Restore function arguments */ ldp x0, x1, [sp] ldp x2, x3, [sp, #S_X2] -- cgit From 3f77a1d0570e62cfce8d472319df00008bbeab38 Mon Sep 17 00:00:00 2001 From: Mark Brown Date: Wed, 15 Jun 2022 20:15:04 +0100 Subject: arm64/cpufeature: Unexport set_cpu_feature() We currently export set_cpu_feature() to modules but there are no in tree users that can be built as modules and it is hard to see cases where it would make sense for there to be any such users. Remove the export to avoid anyone else having to worry about why it is there and ensure that any users that do get added get a bit more visiblity. Signed-off-by: Mark Brown Acked-by: Suzuki K Poulose Reviewed-by: Mark Rutland Link: https://lore.kernel.org/r/20220615191504.626604-1-broonie@kernel.org Signed-off-by: Catalin Marinas --- arch/arm64/kernel/cpufeature.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 42ea2bd856c6..d76fd95376f0 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -3109,7 +3109,6 @@ void cpu_set_feature(unsigned int num) WARN_ON(num >= MAX_CPU_FEATURES); elf_hwcap |= BIT(num); } -EXPORT_SYMBOL_GPL(cpu_set_feature); bool cpu_have_feature(unsigned int num) { -- cgit From c50f11c6196f45c92ca48b16a5071615d4ae0572 Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Fri, 10 Jun 2022 16:12:27 +0100 Subject: arm64: mm: Don't invalidate FROM_DEVICE buffers at start of DMA transfer Invalidating the buffer memory in arch_sync_dma_for_device() for FROM_DEVICE transfers When using the streaming DMA API to map a buffer prior to inbound non-coherent DMA (i.e. DMA_FROM_DEVICE), we invalidate any dirty CPU cachelines so that they will not be written back during the transfer and corrupt the buffer contents written by the DMA. This, however, poses two potential problems: (1) If the DMA transfer does not write to every byte in the buffer, then the unwritten bytes will contain stale data once the transfer has completed. (2) If the buffer has a virtual alias in userspace, then stale data may be visible via this alias during the period between performing the cache invalidation and the DMA writes landing in memory. Address both of these issues by cleaning (aka writing-back) the dirty lines in arch_sync_dma_for_device(DMA_FROM_DEVICE) instead of discarding them using invalidation. 
Cc: Ard Biesheuvel
Cc: Christoph Hellwig
Cc: Robin Murphy
Cc: Russell King
Cc:
Link: https://lore.kernel.org/r/20220606152150.GA31568@willie-the-truck
Signed-off-by: Will Deacon
Reviewed-by: Ard Biesheuvel
Link: https://lore.kernel.org/r/20220610151228.4562-2-will@kernel.org
Signed-off-by: Catalin Marinas
---
 arch/arm64/mm/cache.S | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 0ea6cc25dc66..21c907987080 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -218,8 +218,6 @@ SYM_FUNC_ALIAS(__dma_flush_area, __pi___dma_flush_area)
  */
 SYM_FUNC_START(__pi___dma_map_area)
 	add	x1, x0, x1
-	cmp	w2, #DMA_FROM_DEVICE
-	b.eq	__pi_dcache_inval_poc
 	b	__pi_dcache_clean_poc
 SYM_FUNC_END(__pi___dma_map_area)
 SYM_FUNC_ALIAS(__dma_map_area, __pi___dma_map_area)
-- cgit