summaryrefslogtreecommitdiff
path: root/arch/arm64/lib
AgeCommit message (Collapse)Author
2024-05-12arm64, bpf: add internal-only MOV instruction to resolve per-CPU addrsPuranjay Mohan
Support an instruction for resolving absolute addresses of per-CPU data from their per-CPU offsets. This instruction is internal-only and users are not allowed to use them directly. They will only be used for internal inlining optimizations for now between BPF verifier and BPF JITs. Since commit 7158627686f0 ("arm64: percpu: implement optimised pcpu access using tpidr_el1"), the per-cpu offset for the CPU is stored in the tpidr_el1/2 register of that CPU. To support this BPF instruction in the ARM64 JIT, the following ARM64 instructions are emitted: mov dst, src // Move src to dst, if src != dst mrs tmp, tpidr_el1/2 // Move per-cpu offset of the current cpu in tmp. add dst, dst, tmp // Add the per cpu offset to the dst. To measure the performance improvement provided by this change, the benchmark in [1] was used: Before: glob-arr-inc : 23.597 ± 0.012M/s arr-inc : 23.173 ± 0.019M/s hash-inc : 12.186 ± 0.028M/s After: glob-arr-inc : 23.819 ± 0.034M/s arr-inc : 23.285 ± 0.017M/s hash-inc : 12.419 ± 0.011M/s [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240502151854.9810-4-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05arm64: Get rid of ARM64_HAS_NO_HW_PREFETCHMarc Zyngier
Back in 2016, it was argued that implementations lacking a HW prefetcher could be helped by sprinkling a number of PRFM instructions in strategic locations. In 2023, the one platform that presumably needed this hack is no longer in active use (let alone maintained), and an quick experiment shows dropping this hack only leads to a 0.4% drop on a full kernel compilation (tested on a MT30-GS0 48 CPU system). Given that this is pretty much in the noise department and that it may give odd ideas to other implementers, drop the hack for good. Suggested-by: Will Deacon <will@kernel.org> Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20231122133754.1240687-1-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-10-16arm64: Avoid cpus_have_const_cap() for ARM64_HAS_WFXTMark Rutland
In __delay() we use cpus_have_const_cap() to check for ARM64_HAS_WFXT, but this is not necessary and alternative_has_cap() would be preferable. For historical reasons, cpus_have_const_cap() is more complicated than it needs to be. Before cpucaps are finalized, it will perform a bitmap test of the system_cpucaps bitmap, and once cpucaps are finalized it will use an alternative branch. This used to be necessary to handle some race conditions in the window between cpucap detection and the subsequent patching of alternatives and static branches, where different branches could be out-of-sync with one another (or w.r.t. alternative sequences). Now that we use alternative branches instead of static branches, these are all patched atomically w.r.t. one another, and there are only a handful of cases that need special care in the window between cpucap detection and alternative patching. Due to the above, it would be nice to remove cpus_have_const_cap(), and migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(), or cpus_have_cap() depending on when their requirements. This will remove redundant instructions and improve code generation, and will make it easier to determine how each callsite will behave before, during, and after alternative patching. The cpus_have_const_cap() check in __delay() is an optimization to use WFIT and WFET in preference to busy-polling the counter and/or using regular WFE and relying upon the architected timer event stream. It is not necessary to apply this optimization in the window between detecting the ARM64_HAS_WFXT cpucap and patching alternatives. This patch replaces the use of cpus_have_const_cap() with alternative_has_cap_unlikely(), which will avoid generating code to test the system_cpucaps bitmap and should be better for all subsequent calls at runtime. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-09-08Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "The main one is a fix for a broken strscpy() conversion that landed in the merge window and broke early parsing of the kernel command line. - Fix an incorrect mask in the CXL PMU driver - Fix a regression in early parsing of the kernel command line - Fix an IP checksum OoB access reported by syzbot" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: csum: Fix OoB access in IP checksum code for negative lengths arm64/sysreg: Fix broken strncpy() -> strscpy() conversion perf: CXL: fix mismatched number of counters mask
2023-09-07arm64: csum: Fix OoB access in IP checksum code for negative lengthsWill Deacon
Although commit c2c24edb1d9c ("arm64: csum: Fix pathological zero-length calls") added an early return for zero-length input, syzkaller has popped up with an example of a _negative_ length which causes an undefined shift and an out-of-bounds read: | BUG: KASAN: slab-out-of-bounds in do_csum+0x44/0x254 arch/arm64/lib/csum.c:39 | Read of size 4294966928 at addr ffff0000d7ac0170 by task syz-executor412/5975 | | CPU: 0 PID: 5975 Comm: syz-executor412 Not tainted 6.4.0-rc4-syzkaller-g908f31f2a05b #0 | Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023 | Call trace: | dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:233 | show_stack+0x2c/0x44 arch/arm64/kernel/stacktrace.c:240 | __dump_stack lib/dump_stack.c:88 [inline] | dump_stack_lvl+0xd0/0x124 lib/dump_stack.c:106 | print_address_description mm/kasan/report.c:351 [inline] | print_report+0x174/0x514 mm/kasan/report.c:462 | kasan_report+0xd4/0x130 mm/kasan/report.c:572 | kasan_check_range+0x264/0x2a4 mm/kasan/generic.c:187 | __kasan_check_read+0x20/0x30 mm/kasan/shadow.c:31 | do_csum+0x44/0x254 arch/arm64/lib/csum.c:39 | csum_partial+0x30/0x58 lib/checksum.c:128 | gso_make_checksum include/linux/skbuff.h:4928 [inline] | __udp_gso_segment+0xaf4/0x1bc4 net/ipv4/udp_offload.c:332 | udp6_ufo_fragment+0x540/0xca0 net/ipv6/udp_offload.c:47 | ipv6_gso_segment+0x5cc/0x1760 net/ipv6/ip6_offload.c:119 | skb_mac_gso_segment+0x2b4/0x5b0 net/core/gro.c:141 | __skb_gso_segment+0x250/0x3d0 net/core/dev.c:3401 | skb_gso_segment include/linux/netdevice.h:4859 [inline] | validate_xmit_skb+0x364/0xdbc net/core/dev.c:3659 | validate_xmit_skb_list+0x94/0x130 net/core/dev.c:3709 | sch_direct_xmit+0xe8/0x548 net/sched/sch_generic.c:327 | __dev_xmit_skb net/core/dev.c:3805 [inline] | __dev_queue_xmit+0x147c/0x3318 net/core/dev.c:4210 | dev_queue_xmit include/linux/netdevice.h:3085 [inline] | packet_xmit+0x6c/0x318 net/packet/af_packet.c:276 | packet_snd net/packet/af_packet.c:3081 [inline] | packet_sendmsg+0x376c/0x4c98 net/packet/af_packet.c:3113 | sock_sendmsg_nosec net/socket.c:724 [inline] | sock_sendmsg net/socket.c:747 [inline] | __sys_sendto+0x3b4/0x538 net/socket.c:2144 Extend the early return to reject negative lengths as well, aligning our implementation with the generic code in lib/checksum.c Cc: Robin Murphy <robin.murphy@arm.com> Fixes: 5777eaed566a ("arm64: Implement optimised checksum routine") Reported-by: syzbot+4a9f9820bd8d302e22f7@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/000000000000e0e94c0603f8d213@google.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-18arm64: insn: Add encoders for LDRSB/LDRSH/LDRSWXu Kuohai
To support BPF sign-extend load instructions, add encoders for LDRSB/LDRSH/LDRSW. LDRSB/LDRSH/LDRSW (immediate) is encoded as follows: 3 2 2 2 2 1 0 0 0 7 6 4 2 0 5 0 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | sz|1 1 1|0|0 1|opc| imm12 | Rn | Rt | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ LDRSB/LDRSH/LDRSW (register) is encoded as follows: 3 2 2 2 2 2 1 1 1 1 0 0 0 7 6 4 2 1 6 3 2 0 5 0 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | sz|1 1 1|0|0 0|opc|1| Rm | opt |S|1 0| Rn | Rt | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ where: - sz indicates whether 8-bit, 16-bit or 32-bit data is to be loaded - opc opc[1] (bit 23) is always 1 and opc[0] == 1 indicates regsize is 32-bit. Since BPF signed load instructions always exend the sign bit to bit 63 regardless of whether it loads an 8-bit, 16-bit or 32-bit data. So only 64-bit register size is required. That is, it's sufficient to set field opc fixed to 0x2. - opt Indicates whether to sign extend the offset register Rm and the effective bits of Rm. We set opt to 0x7 (SXTX) since we'll use Rm as a sgined 64-bit value in BPF. - S Optional only when opt field is 0x3 (LSL) In short, the above fields are encoded to the values listed below. sz opc opt S LDRSB (immediate) 0x0 0x2 na na LDRSH (immediate) 0x1 0x2 na na LDRSW (immediate) 0x2 0x2 na na LDRSB (register) 0x0 0x2 0x7 0 LDRSH (register) 0x1 0x2 0x7 0 LDRSW (register) 0x2 0x2 0x7 0 Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Florent Revest <revest@chromium.org> Acked-by: Florent Revest <revest@chromium.org> Link: https://lore.kernel.org/bpf/20230815154158.717901-2-xukuohai@huaweicloud.com
2023-05-25arm64: xor-neon: mark xor_arm64_neon_*() staticArnd Bergmann
The only references to these functions are in the same file, and there is no prototype, which causes a harmless warning: arch/arm64/lib/xor-neon.c:13:6: error: no previous prototype for 'xor_arm64_neon_2' [-Werror=missing-prototypes] arch/arm64/lib/xor-neon.c:40:6: error: no previous prototype for 'xor_arm64_neon_3' [-Werror=missing-prototypes] arch/arm64/lib/xor-neon.c:76:6: error: no previous prototype for 'xor_arm64_neon_4' [-Werror=missing-prototypes] arch/arm64/lib/xor-neon.c:121:6: error: no previous prototype for 'xor_arm64_neon_5' [-Werror=missing-prototypes] Fixes: cc9f8349cb33 ("arm64: crypto: add NEON accelerated XOR implementation") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20230516160642.523862-2-arnd@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-03-27arm: uaccess: Remove memcpy_page_flushcache()Ira Weiny
Commit 21b56c847753 ("iov_iter: get rid of separate bvec and xarray callbacks") removed the calls to memcpy_page_flushcache(). Remove the unnecessary memcpy_page_flushcache() call. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Dan Williams" <dan.j.williams@intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221230-kmap-x86-v1-3-15f1ecccab50@intel.com Signed-off-by: Will Deacon <will@kernel.org>
2022-12-06Merge branch 'for-next/sysregs' into for-next/coreWill Deacon
* for-next/sysregs: (39 commits) arm64/sysreg: Remove duplicate definitions from asm/sysreg.h arm64/sysreg: Convert ID_DFR1_EL1 to automatic generation arm64/sysreg: Convert ID_DFR0_EL1 to automatic generation arm64/sysreg: Convert ID_AFR0_EL1 to automatic generation arm64/sysreg: Convert ID_MMFR5_EL1 to automatic generation arm64/sysreg: Convert MVFR2_EL1 to automatic generation arm64/sysreg: Convert MVFR1_EL1 to automatic generation arm64/sysreg: Convert MVFR0_EL1 to automatic generation arm64/sysreg: Convert ID_PFR2_EL1 to automatic generation arm64/sysreg: Convert ID_PFR1_EL1 to automatic generation arm64/sysreg: Convert ID_PFR0_EL1 to automatic generation arm64/sysreg: Convert ID_ISAR6_EL1 to automatic generation arm64/sysreg: Convert ID_ISAR5_EL1 to automatic generation arm64/sysreg: Convert ID_ISAR4_EL1 to automatic generation arm64/sysreg: Convert ID_ISAR3_EL1 to automatic generation arm64/sysreg: Convert ID_ISAR2_EL1 to automatic generation arm64/sysreg: Convert ID_ISAR1_EL1 to automatic generation arm64/sysreg: Convert ID_ISAR0_EL1 to automatic generation arm64/sysreg: Convert ID_MMFR4_EL1 to automatic generation arm64/sysreg: Convert ID_MMFR3_EL1 to automatic generation ...
2022-12-01arm64/sysreg: Remove duplicate definitions from asm/sysreg.hWill Deacon
With the new-fangled generation of asm/sysreg-defs.h, some definitions have ended up being duplicated between the two files. Remove these duplicate definitions, and consolidate the naming for GMID_EL1_BS_WIDTH. Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15arm64: insn: always inline hint generationMark Rutland
All users of aarch64_insn_gen_hint() (e.g. aarch64_insn_gen_nop()) pass a constant argument and generate a constant value. Some of those users are noinstr code (e.g. for alternatives patching). For noinstr code it is necessary to either inline these functions or to ensure the out-of-line versions are noinstr. Since in all cases these are generating a constant, make them __always_inline. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20221114135928.3000571-5-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15arm64: insn: simplify insn group identificationMark Rutland
The only code which needs to check for an entire instruction group is the aarch64_insn_is_steppable() helper function used by kprobes, which must not be instrumented, and only needs to check for the "Branch, exception generation and system instructions" class. Currently we have an out-of-line helper in insn.c which must be marked as __kprobes, which indexes a table with some bits extracted from the instruction. In aarch64_insn_is_steppable() we then need to compare the result with an expected enum value. It would be simpler to have a predicate for this, as with the other aarch64_insn_is_*() helpers, which would be always inlined to prevent inadvertent instrumentation, and would permit better code generation. This patch adds a predicate function for this instruction group using the existing __AARCH64_INSN_FUNCS() helpers, and removes the existing out-of-line helper. As the only class we currently care about is the branch+exception+sys class, I have only added helpers for this, and left the other classes unimplemented for now. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20221114135928.3000571-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15arm64: insn: always inline predicatesMark Rutland
We have a number of aarch64_insn_*() predicates which are used in code which is not instrumentation safe (e.g. alternatives patching, kprobes). Some of those are marked with __kprobes, but most are not, and are implemented out-of-line in insn.c. This patch moves the predicates to insn.h and marks them with __always_inline. This is ensures that they will respect the instrumentation requirements of their caller which they will be inlined into. At the same time, I've formatted each of the functions consistently as a list, to make them easier to read and update in future. Other than preventing unwanted instrumentation, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20221114135928.3000571-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15arm64: insn: remove aarch64_insn_gen_prefetch()Mark Rutland
There are no users of aarch64_insn_gen_prefetch(), and which encodes a PRFM (immediate) with a hard-coded offset of 0. Remove it for now; we can always restore it with tests if we need it in future. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20221114135928.3000571-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-08-03Merge tag 'net-next-6.0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking changes from Paolo Abeni: "Core: - Refactor the forward memory allocation to better cope with memory pressure with many open sockets, moving from a per socket cache to a per-CPU one - Replace rwlocks with RCU for better fairness in ping, raw sockets and IP multicast router. - Network-side support for IO uring zero-copy send. - A few skb drop reason improvements, including codegen the source file with string mapping instead of using macro magic. - Rename reference tracking helpers to a more consistent netdev_* schema. - Adapt u64_stats_t type to address load/store tearing issues. - Refine debug helper usage to reduce the log noise caused by bots. BPF: - Improve socket map performance, avoiding skb cloning on read operation. - Add support for 64 bits enum, to match types exposed by kernel. - Introduce support for sleepable uprobes program. - Introduce support for enum textual representation in libbpf. - New helpers to implement synproxy with eBPF/XDP. - Improve loop performances, inlining indirect calls when possible. - Removed all the deprecated libbpf APIs. - Implement new eBPF-based LSM flavor. - Add type match support, which allow accurate queries to the eBPF used types. - A few TCP congetsion control framework usability improvements. - Add new infrastructure to manipulate CT entries via eBPF programs. - Allow for livepatch (KLP) and BPF trampolines to attach to the same kernel function. Protocols: - Introduce per network namespace lookup tables for unix sockets, increasing scalability and reducing contention. - Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support. - Add support to forciby close TIME_WAIT TCP sockets via user-space tools. - Significant performance improvement for the TLS 1.3 receive path, both for zero-copy and not-zero-copy. - Support for changing the initial MTPCP subflow priority/backup status - Introduce virtually contingus buffers for sockets over RDMA, to cope better with memory pressure. - Extend CAN ethtool support with timestamping capabilities - Refactor CAN build infrastructure to allow building only the needed features. Driver API: - Remove devlink mutex to allow parallel commands on multiple links. - Add support for pause stats in distributed switch. - Implement devlink helpers to query and flash line cards. - New helper for phy mode to register conversion. New hardware / drivers: - Ethernet DSA driver for the rockchip mt7531 on BPI-R2 Pro. - Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch. - Ethernet DSA driver for the Microchip LAN937x switch. - Ethernet PHY driver for the Aquantia AQR113C EPHY. - CAN driver for the OBD-II ELM327 interface. - CAN driver for RZ/N1 SJA1000 CAN controller. - Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device. Drivers: - Intel Ethernet NICs: - i40e: add support for vlan pruning - i40e: add support for XDP framented packets - ice: improved vlan offload support - ice: add support for PPPoE offload - Mellanox Ethernet (mlx5) - refactor packet steering offload for performance and scalability - extend support for TC offload - refactor devlink code to clean-up the locking schema - support stacked vlans for bridge offloads - use TLS objects pool to improve connection rate - Netronome Ethernet NICs (nfp): - extend support for IPv6 fields mangling offload - add support for vepa mode in HW bridge - better support for virtio data path acceleration (VDPA) - enable TSO by default - Microsoft vNIC driver (mana) - add support for XDP redirect - Others Ethernet drivers: - bonding: add per-port priority support - microchip lan743x: extend phy support - Fungible funeth: support UDP segmentation offload and XDP xmit - Solarflare EF100: add support for virtual function representors - MediaTek SoC: add XDP support - Mellanox Ethernet/IB switch (mlxsw): - dropped support for unreleased H/W (XM router). - improved stats accuracy - unified bridge model coversion improving scalability (parts 1-6) - support for PTP in Spectrum-2 asics - Broadcom PHYs - add PTP support for BCM54210E - add support for the BCM53128 internal PHY - Marvell Ethernet switches (prestera): - implement support for multicast forwarding offload - Embedded Ethernet switches: - refactor OcteonTx MAC filter for better scalability - improve TC H/W offload for the Felix driver - refactor the Microchip ksz8 and ksz9477 drivers to share the probe code (parts 1, 2), add support for phylink mac configuration - Other WiFi: - Microchip wilc1000: diable WEP support and enable WPA3 - Atheros ath10k: encapsulation offload support Old code removal: - Neterion vxge ethernet driver: this is untouched since more than 10 years" * tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1890 commits) doc: sfp-phylink: Fix a broken reference wireguard: selftests: support UML wireguard: allowedips: don't corrupt stack when detecting overflow wireguard: selftests: update config fragments wireguard: ratelimiter: use hrtimer in selftest net/mlx5e: xsk: Discard unaligned XSK frames on striding RQ net: usb: ax88179_178a: Bind only to vendor-specific interface selftests: net: fix IOAM test skip return code net: usb: make USB_RTL8153_ECM non user configurable net: marvell: prestera: remove reduntant code octeontx2-pf: Reduce minimum mtu size to 60 net: devlink: Fix missing mutex_unlock() call net/tls: Remove redundant workqueue flush before destroy net: txgbe: Fix an error handling path in txgbe_probe() net: dsa: Fix spelling mistakes and cleanup code Documentation: devlink: add add devlink-selftests to the table of contents dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock net: ionic: fix error check for vlan flags in ionic_set_nic_features() net: ice: fix error NETIF_F_HW_VLAN_CTAG_FILTER check in ice_vsi_sync_fltr() nfp: flower: add support for tunnel offload without key ID ...
2022-07-11arm64: Add LDR (literal) instructionXu Kuohai
Add LDR (literal) instruction to load data from address relative to PC. This instruction will be used to implement long jump from bpf prog to bpf trampoline in the follow-up patch. The instruction encoding: 3 2 2 2 0 0 0 7 6 4 5 0 +-----+-------+---+-----+-------------------------------------+--------+ | 0 x | 0 1 1 | 0 | 0 0 | imm19 | Rt | +-----+-------+---+-----+-------------------------------------+--------+ for 32-bit, variant x == 0; for 64-bit, x == 1. branch_imm_common() is used to check the distance between pc and target address, since it's reused by this patch and LDR (literal) is not a branch instruction, rename it to label_imm_common(). Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/bpf/20220711150823.2128542-3-xukuohai@huawei.com
2022-07-05arm64/mte: Standardise GMID field name definitionsMark Brown
Usually our defines for bitfields in system registers do not include a SYS_ prefix but those for GMID do. In preparation for automatic generation of defines remove that prefix. No functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220704170302.2609529-9-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-05-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "S390: - ultravisor communication device driver - fix TEID on terminating storage key ops RISC-V: - Added Sv57x4 support for G-stage page table - Added range based local HFENCE functions - Added remote HFENCE functions based on VCPU requests - Added ISA extension registers in ONE_REG interface - Updated KVM RISC-V maintainers entry to cover selftests support ARM: - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes x86: - New ioctls to get/set TSC frequency for a whole VM - Allow userspace to opt out of hypercall patching - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr AMD SEV improvements: - Add KVM_EXIT_SHUTDOWN metadata for SEV-ES - V_TSC_AUX support Nested virtualization improvements for AMD: - Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE, nested vGIF) - Allow AVIC to co-exist with a nested guest running - Fixes for LBR virtualizations when a nested guest is running, and nested LBR virtualization support - PAUSE filtering for nested hypervisors Guest support: - Decoupling of vcpu_is_preempted from PV spinlocks" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (199 commits) KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest KVM: selftests: x86: Sync the new name of the test case to .gitignore Documentation: kvm: reorder ARM-specific section about KVM_SYSTEM_EVENT_SUSPEND x86, kvm: use correct GFP flags for preemption disabled KVM: LAPIC: Drop pending LAPIC timer injection when canceling the timer x86/kvm: Alloc dummy async #PF token outside of raw spinlock KVM: x86: avoid calling x86 emulator without a decoded instruction KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leak x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave) s390/uv_uapi: depend on CONFIG_S390 KVM: selftests: x86: Fix test failure on arch lbr capable platforms KVM: LAPIC: Trace LAPIC timer expiration on every vmentry KVM: s390: selftest: Test suppression indication on key prot exception KVM: s390: Don't indicate suppression on dirtying, failing memop selftests: drivers/s390x: Add uvdevice tests drivers/s390/char: Add Ultravisor io device MAINTAINERS: Update KVM RISC-V entry to cover selftests support RISC-V: KVM: Introduce ISA extension register RISC-V: KVM: Cleanup stale TLB entries when host CPU changes RISC-V: KVM: Add remote HFENCE functions based on VCPU requests ...
2022-05-25Merge tag 'net-next-5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core ---- - Support TCPv6 segmentation offload with super-segments larger than 64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP). - Generalize skb freeing deferral to per-cpu lists, instead of per-socket lists. - Add a netdev statistic for packets dropped due to L2 address mismatch (rx_otherhost_dropped). - Continue work annotating skb drop reasons. - Accept alternative netdev names (ALT_IFNAME) in more netlink requests. - Add VLAN support for AF_PACKET SOCK_RAW GSO. - Allow receiving skb mark from the socket as a cmsg. - Enable memcg accounting for veth queues, sysctl tables and IPv6. BPF --- - Add libbpf support for User Statically-Defined Tracing (USDTs). - Speed up symbol resolution for kprobes multi-link attachments. - Support storing typed pointers to referenced and unreferenced objects in BPF maps. - Add support for BPF link iterator. - Introduce access to remote CPU map elements in BPF per-cpu map. - Allow middle-of-the-road settings for the kernel.unprivileged_bpf_disabled sysctl. - Implement basic types of dynamic pointers e.g. to allow for dynamically sized ringbuf reservations without extra memory copies. Protocols --------- - Retire port only listening_hash table, add a second bind table hashed by port and address. Avoid linear list walk when binding to very popular ports (e.g. 443). - Add bridge FDB bulk flush filtering support allowing user space to remove all FDB entries matching a condition. - Introduce accept_unsolicited_na sysctl for IPv6 to implement router-side changes for RFC9131. - Support for MPTCP path manager in user space. - Add MPTCP support for fallback to regular TCP for connections that have never connected additional subflows or transmitted out-of-sequence data (partial support for RFC8684 fallback). - Avoid races in MPTCP-level window tracking, stabilize and improve throughput. - Support lockless operation of GRE tunnels with seq numbers enabled. - WiFi support for host based BSS color collision detection. - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets. - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2). - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile). - Allow matching on the number of VLAN tags via tc-flower. - Add tracepoint for tcp_set_ca_state(). Driver API ---------- - Improve error reporting from classifier and action offload. - Add support for listing line cards in switches (devlink). - Add helpers for reporting page pool statistics with ethtool -S. - Add support for reading clock cycles when using PTP virtual clocks, instead of having the driver convert to time before reporting. This makes it possible to report time from different vclocks. - Support configuring low-latency Tx descriptor push via ethtool. - Separate Clause 22 and Clause 45 MDIO accesses more explicitly. New hardware / drivers ---------------------- - Ethernet: - Marvell's Octeon NIC PCI Endpoint support (octeon_ep) - Sunplus SP7021 SoC (sp7021_emac) - Add support for Renesas RZ/V2M (in ravb) - Add support for MediaTek mt7986 switches (in mtk_eth_soc) - Ethernet PHYs: - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting) - TI DP83TD510 PHY - Microchip LAN8742/LAN88xx PHYs - WiFi: - Driver for pureLiFi X, XL, XC devices (plfxlc) - Driver for Silicon Labs devices (wfx) - Support for WCN6750 (in ath11k) - Support Realtek 8852ce devices (in rtw89) - Mobile: - MediaTek T700 modems (Intel 5G 5000 M.2 cards) - CAN: - ctucanfd: add support for CTU CAN FD open-source IP core from Czech Technical University in Prague Drivers ------- - Delete a number of old drivers still using virt_to_bus(). - Ethernet NICs: - intel: support TSO on tunnels MPLS - broadcom: support multi-buffer XDP - nfp: support VF rate limiting - sfc: use hardware tx timestamps for more than PTP - mlx5: multi-port eswitch support - hyper-v: add support for XDP_REDIRECT - atlantic: XDP support (including multi-buffer) - macb: improve real-time perf by deferring Tx processing to NAPI - High-speed Ethernet switches: - mlxsw: implement basic line card information querying - prestera: add support for traffic policing on ingress and egress - Embedded Ethernet switches: - lan966x: add support for packet DMA (FDMA) - lan966x: add support for PTP programmable pins - ti: cpsw_new: enable bc/mc storm prevention - Qualcomm 802.11ax WiFi (ath11k): - Wake-on-WLAN support for QCA6390 and WCN6855 - device recovery (firmware restart) support - support setting Specific Absorption Rate (SAR) for WCN6855 - read country code from SMBIOS for WCN6855/QCA6390 - enable keep-alive during WoWLAN suspend - implement remain-on-channel support - MediaTek WiFi (mt76): - support Wireless Ethernet Dispatch offloading packet movement between the Ethernet switch and WiFi interfaces - non-standard VHT MCS10-11 support - mt7921 AP mode support - mt7921 IPv6 NS offload support - Ethernet PHYs: - micrel: ksz9031/ksz9131: cabletest support - lan87xx: SQI support for T1 PHYs - lan937x: add interrupt support for link detection" * tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits) ptp: ocp: Add firmware header checks ptp: ocp: fix PPS source selector debugfs reporting ptp: ocp: add .init function for sma_op vector ptp: ocp: vectorize the sma accessor functions ptp: ocp: constify selectors ptp: ocp: parameterize input/output sma selectors ptp: ocp: revise firmware display ptp: ocp: add Celestica timecard PCI ids ptp: ocp: Remove #ifdefs around PCI IDs ptp: ocp: 32-bit fixups for pci start address Revert "net/smc: fix listen processing for SMC-Rv2" ath6kl: Use cc-disable-warning to disable -Wdangling-pointer selftests/bpf: Dynptr tests bpf: Add dynptr data slices bpf: Add bpf_dynptr_read and bpf_dynptr_write bpf: Dynptr support for ring buffers bpf: Add bpf_dynptr_from_mem for local dynptrs bpf: Add verifier support for dynptrs bpf: Suppress 'passing zero to PTR_ERR' warning bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack ...
2022-05-16arm64: mte: Clean up user tag accessorsRobin Murphy
Invoking user_ldst to explicitly add a post-increment of 0 is silly. Just use a normal USER() annotation and save the redundant instruction. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Tong Tiangen <tongtiangen@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220420030418.3189040-6-tongtiangen@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-20arm64: Use WFxT for __delay() when possibleMarc Zyngier
Marginally optimise __delay() by using a WFIT/WFET sequence. It probably is a win if no interrupt fires during the delay. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419182755.601427-11-maz@kernel.org
2022-04-01arm64, insn: Add ldr/str with immediate offsetXu Kuohai
This patch introduces ldr/str with immediate offset support to simplify the JIT implementation of BPF LDX/STX instructions on arm64. Although arm64 ldr/str immediate is available in pre-index, post-index and unsigned offset forms, the unsigned offset form is sufficient for BPF, so this patch only adds this type. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220321152852.2334294-2-xukuohai@huawei.com
2022-03-21Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - hwrng core now credits for low-quality RNG devices. Algorithms: - Optimisations for neon aes on arm/arm64. - Add accelerated crc32_be on arm64. - Add ffdheXYZ(dh) templates. - Disallow hmac keys < 112 bits in FIPS mode. - Add AVX assembly implementation for sm3 on x86. Drivers: - Add missing local_bh_disable calls for crypto_engine callback. - Ensure BH is disabled in crypto_engine callback path. - Fix zero length DMA mappings in ccree. - Add synchronization between mailbox accesses in octeontx2. - Add Xilinx SHA3 driver. - Add support for the TDES IP available on sama7g5 SoC in atmel" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (137 commits) crypto: xilinx - Turn SHA into a tristate and allow COMPILE_TEST MAINTAINERS: update HPRE/SEC2/TRNG driver maintainers list crypto: dh - Remove the unused function dh_safe_prime_dh_alg() hwrng: nomadik - Change clk_disable to clk_disable_unprepare crypto: arm64 - cleanup comments crypto: qat - fix initialization of pfvf rts_map_msg structures crypto: qat - fix initialization of pfvf cap_msg structures crypto: qat - remove unneeded assignment crypto: qat - disable registration of algorithms crypto: hisilicon/qm - fix memset during queues clearing crypto: xilinx: prevent probing on non-xilinx hardware crypto: marvell/octeontx - Use swap() instead of open coding it crypto: ccree - Fix use after free in cc_cipher_exit() crypto: ccp - ccp_dmaengine_unregister release dma channels crypto: octeontx2 - fix missing unlock hwrng: cavium - fix NULL but dereferenced coccicheck error crypto: cavium/nitrox - don't cast parameter in bit operations crypto: vmx - add missing dependencies MAINTAINERS: Add maintainer for Xilinx ZynqMP SHA3 driver crypto: xilinx - Add Xilinx SHA3 driver ...
2022-03-14Merge branch 'for-next/strings' into for-next/coreWill Deacon
* for-next/strings: Revert "arm64: Mitigate MTE issues with str{n}cmp()" arm64: lib: Import latest version of Arm Optimized Routines' strncmp arm64: lib: Import latest version of Arm Optimized Routines' strcmp
2022-03-14Merge branch 'for-next/linkage' into for-next/coreWill Deacon
* for-next/linkage: arm64: module: remove (NOLOAD) from linker script linkage: remove SYM_FUNC_{START,END}_ALIAS() x86: clean up symbol aliasing arm64: clean up symbol aliasing linkage: add SYM_FUNC_ALIAS{,_LOCAL,_WEAK}()
2022-03-14Merge branch 'for-next/insn' into for-next/coreWill Deacon
* for-next/insn: arm64: insn: add encoders for atomic operations arm64: move AARCH64_BREAK_FAULT into insn-def.h arm64: insn: Generate 64 bit mask immediates correctly
2022-03-07Revert "arm64: Mitigate MTE issues with str{n}cmp()"Joey Gouly
This reverts commit 59a68d4138086c015ab8241c3267eec5550fbd44. Now that the str{n}cmp functions have been updated to handle MTE properly, the workaround to use the generic functions is no longer needed. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220301101435.19327-4-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-03-07arm64: lib: Import latest version of Arm Optimized Routines' strncmpJoey Gouly
Import the latest version of the Arm Optimized Routines strncmp function based on the upstream code of string/aarch64/strncmp.S at commit 189dfefe37d5 from: https://github.com/ARM-software/optimized-routines This latest version includes MTE support. Note that for simplicity Arm have chosen to contribute this code to Linux under GPLv2 rather than the original MIT OR Apache-2.0 WITH LLVM-exception license. Arm is the sole copyright holder for this code. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220301101435.19327-3-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-03-07arm64: lib: Import latest version of Arm Optimized Routines' strcmpJoey Gouly
Import the latest version of the Arm Optimized Routines strcmp function based on the upstream code of string/aarch64/strcmp.S at commit 189dfefe37d5 from: https://github.com/ARM-software/optimized-routines This latest version includes MTE support. Note that for simplicity Arm have chosen to contribute this code to Linux under GPLv2 rather than the original MIT OR Apache-2.0 WITH LLVM-exception license. Arm is the sole copyright holder for this code. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220301101435.19327-2-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-02-22arm64: insn: add encoders for atomic operationsHou Tao
It is a preparation patch for eBPF atomic supports under arm64. eBPF needs support atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor} and atomic[64]_{xchg|cmpxchg}. The ordering semantics of eBPF atomics are the same with the implementations in linux kernel. Add three helpers to support LDCLR/LDEOR/LDSET/SWP, CAS and DMB instructions. STADD/STCLR/STEOR/STSET are simply encoded as aliases for LDADD/LDCLR/LDEOR/LDSET with XZR as the destination register, so no extra helper is added. atomic_fetch_add() and other atomic ops needs support for STLXR instruction, so extend enum aarch64_insn_ldst_type to do that. LDADD/LDEOR/LDSET/SWP and CAS instructions are only available when LSE atomics is enabled, so just return AARCH64_BREAK_FAULT directly in these newly-added helpers if CONFIG_ARM64_LSE_ATOMICS is disabled. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20220217072232.1186625-3-houtao1@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-02-22arm64: clean up symbol aliasingMark Rutland
Now that we have SYM_FUNC_ALIAS() and SYM_FUNC_ALIAS_WEAK(), use those to simplify and more consistently define function aliases across arch/arm64. Aliases are now defined in terms of a canonical function name. For position-independent functions I've made the __pi_<func> name the canonical name, and defined other alises in terms of this. The SYM_FUNC_{START,END}_PI(func) macros obscure the __pi_<func> name, and make this hard to seatch for. The SYM_FUNC_START_WEAK_PI() macro also obscures the fact that the __pi_<func> fymbol is global and the <func> symbol is weak. For clarity, I have removed these macros and used SYM_FUNC_{START,END}() directly with the __pi_<func> name. For example: SYM_FUNC_START_WEAK_PI(func) ... asm insns ... SYM_FUNC_END_PI(func) EXPORT_SYMBOL(func) ... becomes: SYM_FUNC_START(__pi_func) ... asm insns ... SYM_FUNC_END(__pi_func) SYM_FUNC_ALIAS_WEAK(func, __pi_func) EXPORT_SYMBOL(func) For clarity, where there are multiple annotations such as EXPORT_SYMBOL(), I've tried to keep annotations grouped by symbol. For example, where a function has a name and an alias which are both exported, this is organised as: SYM_FUNC_START(func) ... asm insns ... SYM_FUNC_END(func) EXPORT_SYMBOL(func) SYM_FUNC_ALIAS(alias, func) EXPORT_SYMBOL(alias) For consistency with the other string functions, I've defined strrchr as a position-independent function, as it can safely be used as such even though we have no users today. As we no longer use SYM_FUNC_{START,END}_ALIAS(), our local copies are removed. The common versions will be removed by a subsequent patch. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Mark Brown <broonie@kernel.org> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220216162229.1076788-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-02-15arm64: mte: Define the number of bytes for storing the tags in a pageCatalin Marinas
Rather than explicitly calculating the number of bytes for a compact tag storage format corresponding to a page, just add a MTE_PAGE_TAG_STORAGE macro. With the current MTE implementation of 4 bits per tag, we store 2 tags in a byte. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Luis Machado <luis.machado@linaro.org> Link: https://lore.kernel.org/r/20220131165456.2160675-4-catalin.marinas@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-02-15arm64: insn: Generate 64 bit mask immediates correctlyJames Morse
When the insn framework is used to encode an AND/ORR/EOR instruction, aarch64_encode_immediate() is used to pick the immr imms values. If the immediate is a 64bit mask, with bit 63 set, and zeros in any of the upper 32 bits, the immr value is incorrectly calculated meaning the wrong mask is generated. For example, 0x8000000000000001 should have an immr of 1, but 32 is used, meaning the resulting mask is 0x0000000300000000. It would appear eBPF is unable to hit these cases, as build_insn()'s imm value is a s32, so when used with BPF_ALU64, the sign-extended u64 immediate would always have all-1s or all-0s in the upper 32 bits. KVM does not generate a va_mask with any of the top bits set as these VA wouldn't be usable with TTBR0_EL2. This happens because the rotation is calculated from fls(~imm), which takes an unsigned int, but the immediate may be 64bit. Use fls64() so the 64bit mask doesn't get truncated to a u32. Signed-off-by: James Morse <james.morse@arm.com> Brown-paper-bag-for: Marc Zyngier <maz@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220127162127.2391947-4-james.morse@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-02-11lib/xor: make xor prototypes more friendly to compiler vectorizationArd Biesheuvel
Modern compilers are perfectly capable of extracting parallelism from the XOR routines, provided that the prototypes reflect the nature of the input accurately, in particular, the fact that the input vectors are expected not to overlap. This is not documented explicitly, but is implied by the interchangeability of the various C routines, some of which use temporary variables while others don't: this means that these routines only behave identically for non-overlapping inputs. So let's decorate these input vectors with the __restrict modifier, which informs the compiler that there is no overlap. While at it, make the input-only vectors pointer-to-const as well. Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://github.com/ClangBuiltLinux/linux/issues/563 Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-01-31arm64: lib: accelerate crc32_beKevin Bracey
It makes no sense to leave crc32_be using the generic code while we only accelerate the little-endian ops. Even though the big-endian form doesn't fit as smoothly into the arm64, we can speed it up and avoid hitting the D cache. Tested on Cortex-A53. Without acceleration: crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64 crc32: self tests passed, processed 225944 bytes in 192240 nsec crc32c: CRC_LE_BITS = 64 crc32c: self tests passed, processed 112972 bytes in 21360 nsec With acceleration: crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64 crc32: self tests passed, processed 225944 bytes in 53480 nsec crc32c: CRC_LE_BITS = 64 crc32c: self tests passed, processed 112972 bytes in 21480 nsec Signed-off-by: Kevin Bracey <kevin@bracey.fi> Tested-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-01-05Merge branches 'for-next/misc', 'for-next/cache-ops-dzp', ↵Catalin Marinas
'for-next/stacktrace', 'for-next/xor-neon', 'for-next/kasan', 'for-next/armv8_7-fp', 'for-next/atomics', 'for-next/bti', 'for-next/sve', 'for-next/kselftest' and 'for-next/kcsan', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: (32 commits) arm64: perf: Don't register user access sysctl handler multiple times drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check perf/smmuv3: Fix unused variable warning when CONFIG_OF=n arm64: perf: Support new DT compatibles arm64: perf: Simplify registration boilerplate arm64: perf: Support Denver and Carmel PMUs drivers/perf: hisi: Add driver for HiSilicon PCIe PMU docs: perf: Add description for HiSilicon PCIe PMU driver dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings drivers: perf: Add LLC-TAD perf counter support perf/smmuv3: Synthesize IIDR from CoreSight ID registers perf/smmuv3: Add devicetree support dt-bindings: Add Arm SMMUv3 PMCG binding perf/arm-cmn: Add debugfs topology info perf/arm-cmn: Add CI-700 Support dt-bindings: perf: arm-cmn: Add CI-700 perf/arm-cmn: Support new IP features perf/arm-cmn: Demarcate CMN-600 specifics perf/arm-cmn: Move group validation data off-stack perf/arm-cmn: Optimise DTC counter accesses ... * for-next/misc: : Miscellaneous patches arm64: Use correct method to calculate nomap region boundaries arm64: Drop outdated links in comments arm64: errata: Fix exec handling in erratum 1418040 workaround arm64: Unhash early pointer print plus improve comment asm-generic: introduce io_stop_wc() and add implementation for ARM64 arm64: remove __dma_*_area() aliases docs/arm64: delete a space from tagged-address-abi arm64/fp: Add comments documenting the usage of state restore functions arm64: mm: Use asid feature macro for cheanup arm64: mm: Rename asid2idx() to ctxid2asid() arm64: kexec: reduce calls to page_address() arm64: extable: remove unused ex_handler_t definition arm64: entry: Use SDEI event constants arm64: Simplify checking for populated DT arm64/kvm: Fix bitrotted comment for SVE handling in handle_exit.c * for-next/cache-ops-dzp: : Avoid DC instructions when DCZID_EL0.DZP == 1 arm64: mte: DC {GVA,GZVA} shouldn't be used when DCZID_EL0.DZP == 1 arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 * for-next/stacktrace: : Unify the arm64 unwind code arm64: Make some stacktrace functions private arm64: Make dump_backtrace() use arch_stack_walk() arm64: Make profile_pc() use arch_stack_walk() arm64: Make return_address() use arch_stack_walk() arm64: Make __get_wchan() use arch_stack_walk() arm64: Make perf_callchain_kernel() use arch_stack_walk() arm64: Mark __switch_to() as __sched arm64: Add comment for stack_info::kr_cur arch: Make ARCH_STACKWALK independent of STACKTRACE * for-next/xor-neon: : Use SHA3 instructions to speed up XOR arm64/xor: use EOR3 instructions when available * for-next/kasan: : Log potential KASAN shadow aliases arm64: mm: log potential KASAN shadow alias arm64: mm: use die_kernel_fault() in do_mem_abort() * for-next/armv8_7-fp: : Add HWCAPS for ARMv8.7 FEAT_AFP amd FEAT_RPRES arm64: cpufeature: add HWCAP for FEAT_RPRES arm64: add ID_AA64ISAR2_EL1 sys register arm64: cpufeature: add HWCAP for FEAT_AFP * for-next/atomics: : arm64 atomics clean-ups and codegen improvements arm64: atomics: lse: define RETURN ops in terms of FETCH ops arm64: atomics: lse: improve constraints for simple ops arm64: atomics: lse: define ANDs in terms of ANDNOTs arm64: atomics lse: define SUBs in terms of ADDs arm64: atomics: format whitespace consistently * for-next/bti: : BTI clean-ups arm64: Ensure that the 'bti' macro is defined where linkage.h is included arm64: Use BTI C directly and unconditionally arm64: Unconditionally override SYM_FUNC macros arm64: Add macro version of the BTI instruction arm64: ftrace: add missing BTIs arm64: kexec: use __pa_symbol(empty_zero_page) arm64: update PAC description for kernel * for-next/sve: : SVE code clean-ups and refactoring in prepararation of Scalable Matrix Extensions arm64/sve: Minor clarification of ABI documentation arm64/sve: Generalise vector length configuration prctl() for SME arm64/sve: Make sysctl interface for SVE reusable by SME * for-next/kselftest: : arm64 kselftest additions kselftest/arm64: Add pidbench for floating point syscall cases kselftest/arm64: Add a test program to exercise the syscall ABI kselftest/arm64: Allow signal tests to trigger from a function kselftest/arm64: Parameterise ptrace vector length information * for-next/kcsan: : Enable KCSAN for arm64 arm64: Enable KCSAN
2021-12-14arm64: Use BTI C directly and unconditionallyMark Brown
Now we have a macro for BTI C that looks like a regular instruction change all the users of the current BTI_C macro to just emit a BTI C directly and remove the macro. This does mean that we now unconditionally BTI annotate all assembly functions, meaning that they are worse in this respect than code generated by the compiler. The overhead should be minimal for implementations with a reasonable HINT implementation. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20211214152714.2380849-4-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-14arm64/xor: use EOR3 instructions when availableArd Biesheuvel
Use the EOR3 instruction to implement xor_blocks() if the instruction is available, which is the case if the CPU implements the SHA-3 extension. This is about 20% faster on Apple M1 when using the 5-way version. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20211213140252.2856053-1-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-06arm64: mte: DC {GVA,GZVA} shouldn't be used when DCZID_EL0.DZP == 1Reiji Watanabe
Currently, mte_set_mem_tag_range() and mte_zero_clear_page_tags() use DC {GVA,GZVA} unconditionally. But, they should make sure that DCZID_EL0.DZP, which indicates whether or not use of those instructions is prohibited, is zero when using those instructions. Use ST{G,ZG,Z2G} instead when DCZID_EL0.DZP == 1. Fixes: 013bb59dbb7c ("arm64: mte: handle tags zeroing at page allocation time") Fixes: 3d0cca0b02ac ("kasan: speed up mte_set_mem_tag_range") Signed-off-by: Reiji Watanabe <reijiw@google.com> Link: https://lore.kernel.org/r/20211206004736.1520989-3-reijiw@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-06arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1Reiji Watanabe
Currently, clear_page() uses DC ZVA instruction unconditionally. But it should make sure that DCZID_EL0.DZP, which indicates whether or not use of DC ZVA instruction is prohibited, is zero when using the instruction. Use STNP instead when DCZID_EL0.DZP == 1. Fixes: f27bb139c387 ("arm64: Miscellaneous library functions") Signed-off-by: Reiji Watanabe <reijiw@google.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20211206004736.1520989-2-reijiw@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-11-08Merge tag 'kbuild-v5.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Remove the global -isystem compiler flag, which was made possible by the introduction of <linux/stdarg.h> - Improve the Kconfig help to print the location in the top menu level - Fix "FORCE prerequisite is missing" build warning for sparc - Add new build targets, tarzst-pkg and perf-tarzst-src-pkg, which generate a zstd-compressed tarball - Prevent gen_init_cpio tool from generating a corrupted cpio when KBUILD_BUILD_TIMESTAMP is set to 2106-02-07 or later - Misc cleanups * tag 'kbuild-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (28 commits) kbuild: use more subdir- for visiting subdirectories while cleaning sh: remove meaningless archclean line initramfs: Check timestamp to prevent broken cpio archive kbuild: split DEBUG_CFLAGS out to scripts/Makefile.debug gen_init_cpio: add static const qualifiers kbuild: Add make tarzst-pkg build option scripts: update the comments of kallsyms support sparc: Add missing "FORCE" target when using if_changed kconfig: refactor conf_touch_dep() kconfig: refactor conf_write_dep() kconfig: refactor conf_write_autoconf() kconfig: add conf_get_autoheader_name() kconfig: move sym_escape_string_value() to confdata.c kconfig: refactor listnewconfig code kconfig: refactor conf_write_symbol() kconfig: refactor conf_write_heading() kconfig: remove 'const' from the return type of sym_escape_string_value() kconfig: rename a variable in the lexer to a clearer name kconfig: narrow the scope of variables in the lexer kconfig: Create links to main menu items in search ...
2021-10-21arm64: extable: consolidate definitionsMark Rutland
In subsequent patches we'll alter the structure and usage of struct exception_table_entry. For inline assembly, we create these using the `_ASM_EXTABLE()` CPP macro defined in <asm/uaccess.h>, and for plain assembly code we use the `_asm_extable()` GAS macro defined in <asm/assembler.h>, which are largely identical save for different escaping and stringification requirements. This patch moves the common definitions to a new <asm/asm-extable.h> header, so that it's easier to keep the two in-sync, and to remove the implication that these are only used for uaccess helpers (as e.g. load_unaligned_zeropad() is only used on kernel memory, and depends upon `_ASM_EXTABLE()`. At the same time, a few minor modifications are made for clarity and in preparation for subsequent patches: * The structure creation is factored out into an `__ASM_EXTABLE_RAW()` macro. This will make it easier to support different fixup variants in subsequent patches without needing to update all users of `_ASM_EXTABLE()`, and makes it easier to see tha the CPP and GAS variants of the macros are structurally identical. For the CPP macro, the stringification of fields is left to the wrapper macro, `_ASM_EXTABLE()`, as in subsequent patches it will be necessary to stringify fields in wrapper macros to safely concatenate strings which cannot be token-pasted together in CPP. * The fields of the structure are created separately on their own lines. This will make it easier to add/remove/modify individual fields clearly. * Additional parentheses are added around the use of macro arguments in field definitions to avoid any potential problems with evaluation due to operator precedence, and to make errors upon misuse clearer. * USER() is moved into <asm/asm-uaccess.h>, as it is not required by all assembly code, and is already refered to by comments in that file. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-8-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21arm64: lib: __arch_copy_to_user(): fold fixups into bodyMark Rutland
Like other functions, __arch_copy_to_user() places its exception fixups in the `.fixup` section without any clear association with __arch_copy_to_user() itself. If we backtrace the fixup code, it will be symbolized as an offset from the nearest prior symbol, which happens to be `__entry_tramp_text_end`. Further, since the PC adjustment for the fixup is akin to a direct branch rather than a function call, __arch_copy_to_user() itself will be missing from the backtrace. This is confusing and hinders debugging. In general this pattern will also be problematic for CONFIG_LIVEPATCH, since fixups often return to their associated function, but this isn't accurately captured in the stacktrace. To solve these issues for assembly functions, we must move fixups into the body of the functions themselves, after the usual fast-path returns. This patch does so for __arch_copy_to_user(). Inline assembly will be dealt with in subsequent patches. Other than the improved backtracing, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21arm64: lib: __arch_copy_from_user(): fold fixups into bodyMark Rutland
Like other functions, __arch_copy_from_user() places its exception fixups in the `.fixup` section without any clear association with __arch_copy_from_user() itself. If we backtrace the fixup code, it will be symbolized as an offset from the nearest prior symbol, which happens to be `__entry_tramp_text_end`. Further, since the PC adjustment for the fixup is akin to a direct branch rather than a function call, __arch_copy_from_user() itself will be missing from the backtrace. This is confusing and hinders debugging. In general this pattern will also be problematic for CONFIG_LIVEPATCH, since fixups often return to their associated function, but this isn't accurately captured in the stacktrace. To solve these issues for assembly functions, we must move fixups into the body of the functions themselves, after the usual fast-path returns. This patch does so for __arch_copy_from_user(). Inline assembly will be dealt with in subsequent patches. Other than the improved backtracing, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21arm64: lib: __arch_clear_user(): fold fixups into bodyMark Rutland
Like other functions, __arch_clear_user() places its exception fixups in the `.fixup` section without any clear association with __arch_clear_user() itself. If we backtrace the fixup code, it will be symbolized as an offset from the nearest prior symbol, which happens to be `__entry_tramp_text_end`. Further, since the PC adjustment for the fixup is akin to a direct branch rather than a function call, __arch_clear_user() itself will be missing from the backtrace. This is confusing and hinders debugging. In general this pattern will also be problematic for CONFIG_LIVEPATCH, since fixups often return to their associated function, but this isn't accurately captured in the stacktrace. To solve these issues for assembly functions, we must move fixups into the body of the functions themselves, after the usual fast-path returns. This patch does so for __arch_clear_user(). Inline assembly will be dealt with in subsequent patches. Other than the improved backtracing, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-09-22isystem: delete global -isystem compile optionAlexey Dobriyan
Further isolate kernel from userspace, prevent accidental inclusion of undesireable headers, mainly float.h and stdatomic.h. nds32 keeps -isystem globally due to intrinsics used in entrenched header. -isystem is selectively reenabled for some files, again, for intrinsics. Compile tested on: hexagon-defconfig hexagon-allmodconfig alpha-allmodconfig alpha-allnoconfig alpha-defconfig arm64-allmodconfig arm64-allnoconfig arm64-defconfig arm-am200epdkit arm-aspeed_g4 arm-aspeed_g5 arm-assabet arm-at91_dt arm-axm55xx arm-badge4 arm-bcm2835 arm-cerfcube arm-clps711x arm-cm_x300 arm-cns3420vb arm-colibri_pxa270 arm-colibri_pxa300 arm-collie arm-corgi arm-davinci_all arm-dove arm-ep93xx arm-eseries_pxa arm-exynos arm-ezx arm-footbridge arm-gemini arm-h3600 arm-h5000 arm-hackkit arm-hisi arm-imote2 arm-imx_v4_v5 arm-imx_v6_v7 arm-integrator arm-iop32x arm-ixp4xx arm-jornada720 arm-keystone arm-lart arm-lpc18xx arm-lpc32xx arm-lpd270 arm-lubbock arm-magician arm-mainstone arm-milbeaut_m10v arm-mini2440 arm-mmp2 arm-moxart arm-mps2 arm-multi_v4t arm-multi_v5 arm-multi_v7 arm-mv78xx0 arm-mvebu_v5 arm-mvebu_v7 arm-mxs arm-neponset arm-netwinder arm-nhk8815 arm-omap1 arm-omap2plus arm-orion5x arm-oxnas_v6 arm-palmz72 arm-pcm027 arm-pleb arm-pxa arm-pxa168 arm-pxa255-idp arm-pxa3xx arm-pxa910 arm-qcom arm-realview arm-rpc arm-s3c2410 arm-s3c6400 arm-s5pv210 arm-sama5 arm-shannon arm-shmobile arm-simpad arm-socfpga arm-spear13xx arm-spear3xx arm-spear6xx arm-spitz arm-stm32 arm-sunxi arm-tct_hammer arm-tegra arm-trizeps4 arm-u8500 arm-versatile arm-vexpress arm-vf610m4 arm-viper arm-vt8500_v6_v7 arm-xcep arm-zeus csky-allmodconfig csky-allnoconfig csky-defconfig h8300-edosk2674 h8300-h8300h-sim h8300-h8s-sim i386-allmodconfig i386-allnoconfig i386-defconfig ia64-allmodconfig ia64-allnoconfig ia64-bigsur ia64-generic ia64-gensparse ia64-tiger ia64-zx1 m68k-amcore m68k-amiga m68k-apollo m68k-atari m68k-bvme6000 m68k-hp300 m68k-m5208evb m68k-m5249evb m68k-m5272c3 m68k-m5275evb m68k-m5307c3 m68k-m5407c3 m68k-m5475evb m68k-mac m68k-multi m68k-mvme147 m68k-mvme16x m68k-q40 m68k-stmark2 m68k-sun3 m68k-sun3x microblaze-allmodconfig microblaze-allnoconfig microblaze-mmu mips-ar7 mips-ath25 mips-ath79 mips-bcm47xx mips-bcm63xx mips-bigsur mips-bmips_be mips-bmips_stb mips-capcella mips-cavium_octeon mips-ci20 mips-cobalt mips-cu1000-neo mips-cu1830-neo mips-db1xxx mips-decstation mips-decstation_64 mips-decstation_r4k mips-e55 mips-fuloong2e mips-gcw0 mips-generic mips-gpr mips-ip22 mips-ip27 mips-ip28 mips-ip32 mips-jazz mips-jmr3927 mips-lemote2f mips-loongson1b mips-loongson1c mips-loongson2k mips-loongson3 mips-malta mips-maltaaprp mips-malta_kvm mips-malta_qemu_32r6 mips-maltasmvp mips-maltasmvp_eva mips-maltaup mips-maltaup_xpa mips-mpc30x mips-mtx1 mips-nlm_xlp mips-nlm_xlr mips-omega2p mips-pic32mzda mips-pistachio mips-qi_lb60 mips-rb532 mips-rbtx49xx mips-rm200 mips-rs90 mips-rt305x mips-sb1250_swarm mips-tb0219 mips-tb0226 mips-tb0287 mips-vocore2 mips-workpad mips-xway nds32-allmodconfig nds32-allnoconfig nds32-defconfig nios2-10m50 nios2-3c120 nios2-allmodconfig nios2-allnoconfig openrisc-allmodconfig openrisc-allnoconfig openrisc-or1klitex openrisc-or1ksim openrisc-simple_smp parisc-allnoconfig parisc-generic-32bit parisc-generic-64bit powerpc-acadia powerpc-adder875 powerpc-akebono powerpc-amigaone powerpc-arches powerpc-asp8347 powerpc-bamboo powerpc-bluestone powerpc-canyonlands powerpc-cell powerpc-chrp32 powerpc-cm5200 powerpc-currituck powerpc-ebony powerpc-eiger powerpc-ep8248e powerpc-ep88xc powerpc-fsp2 powerpc-g5 powerpc-gamecube powerpc-ge_imp3a powerpc-holly powerpc-icon powerpc-iss476-smp powerpc-katmai powerpc-kilauea powerpc-klondike powerpc-kmeter1 powerpc-ksi8560 powerpc-linkstation powerpc-lite5200b powerpc-makalu powerpc-maple powerpc-mgcoge powerpc-microwatt powerpc-motionpro powerpc-mpc512x powerpc-mpc5200 powerpc-mpc7448_hpc2 powerpc-mpc8272_ads powerpc-mpc8313_rdb powerpc-mpc8315_rdb powerpc-mpc832x_mds powerpc-mpc832x_rdb powerpc-mpc834x_itx powerpc-mpc834x_itxgp powerpc-mpc834x_mds powerpc-mpc836x_mds powerpc-mpc836x_rdk powerpc-mpc837x_mds powerpc-mpc837x_rdb powerpc-mpc83xx powerpc-mpc8540_ads powerpc-mpc8560_ads powerpc-mpc85xx_cds powerpc-mpc866_ads powerpc-mpc885_ads powerpc-mvme5100 powerpc-obs600 powerpc-pasemi powerpc-pcm030 powerpc-pmac32 powerpc-powernv powerpc-ppa8548 powerpc-ppc40x powerpc-ppc44x powerpc-ppc64 powerpc-ppc64e powerpc-ppc6xx powerpc-pq2fads powerpc-ps3 powerpc-pseries powerpc-rainier powerpc-redwood powerpc-sam440ep powerpc-sbc8548 powerpc-sequoia powerpc-skiroot powerpc-socrates powerpc-storcenter powerpc-stx_gp3 powerpc-taishan powerpc-tqm5200 powerpc-tqm8540 powerpc-tqm8541 powerpc-tqm8548 powerpc-tqm8555 powerpc-tqm8560 powerpc-tqm8xx powerpc-walnut powerpc-warp powerpc-wii powerpc-xes_mpc85xx riscv-allmodconfig riscv-allnoconfig riscv-nommu_k210 riscv-nommu_k210_sdcard riscv-nommu_virt riscv-rv32 s390-allmodconfig s390-allnoconfig s390-debug s390-zfcpdump sh-ap325rxa sh-apsh4a3a sh-apsh4ad0a sh-dreamcast sh-ecovec24 sh-ecovec24-romimage sh-edosk7705 sh-edosk7760 sh-espt sh-hp6xx sh-j2 sh-kfr2r09 sh-kfr2r09-romimage sh-landisk sh-lboxre2 sh-magicpanelr2 sh-microdev sh-migor sh-polaris sh-r7780mp sh-r7785rp sh-rsk7201 sh-rsk7203 sh-rsk7264 sh-rsk7269 sh-rts7751r2d1 sh-rts7751r2dplus sh-sdk7780 sh-sdk7786 sh-se7206 sh-se7343 sh-se7619 sh-se7705 sh-se7712 sh-se7721 sh-se7722 sh-se7724 sh-se7750 sh-se7751 sh-se7780 sh-secureedge5410 sh-sh03 sh-sh2007 sh-sh7710voipgw sh-sh7724_generic sh-sh7757lcr sh-sh7763rdp sh-sh7770_generic sh-sh7785lcr sh-sh7785lcr_32bit sh-shmin sh-shx3 sh-titan sh-ul2 sh-urquell sparc-allmodconfig sparc-allnoconfig sparc-sparc32 sparc-sparc64 um-i386-allmodconfig um-i386-allnoconfig um-i386-defconfig um-x86_64-allmodconfig um-x86_64-allnoconfig x86_64-allmodconfig x86_64-allnoconfig x86_64-defconfig xtensa-allmodconfig xtensa-allnoconfig xtensa-audio_kc705 xtensa-cadence_csp xtensa-common xtensa-generic_kc705 xtensa-iss xtensa-nommu_kc705 xtensa-smp_lx200 xtensa-virt xtensa-xip_kc705 Tested-by: Nathan Chancellor <nathan@kernel.org> # build (hexagon) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-09-21arm64: Mitigate MTE issues with str{n}cmp()Robin Murphy
As with strlen(), the patches importing the updated str{n}cmp() implementations were originally developed and tested before the advent of CONFIG_KASAN_HW_TAGS, and have subsequently revealed not to be MTE-safe. Since in-kernel MTE is still a rather niche case, let it temporarily fall back to the generic C versions for correctness until we can figure out the best fix. Fixes: 758602c04409 ("arm64: Import latest version of Cortex Strings' strcmp") Fixes: 020b199bc70d ("arm64: Import latest version of Cortex Strings' strncmp") Cc: <stable@vger.kernel.org> # 5.14.x Reported-by: Branislav Rankov <branislav.rankov@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/34dc4d12eec0adae49b0ac927df642ed10089d40.1631890770.git.robin.murphy@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-09-08arch: remove compat_alloc_user_spaceArnd Bergmann
All users of compat_alloc_user_space() and copy_in_user() have been removed from the kernel, only a few functions in sparc remain that can be changed to calling arch_copy_in_user() instead. Link: https://lkml.kernel.org/r/20210727144859.4150043-7-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-30arm64: use __func__ to get function name in pr_errJason Wang
Prefer using '"%s...", __func__' to get current function's name in a debug message. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210726122907.51529-1-wangborong@cdjrlc.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-07-15arm64: Avoid premature usercopy failureRobin Murphy
Al reminds us that the usercopy API must only return complete failure if absolutely nothing could be copied. Currently, if userspace does something silly like giving us an unaligned pointer to Device memory, or a size which overruns MTE tag bounds, we may fail to honour that requirement when faulting on a multi-byte access even though a smaller access could have succeeded. Add a mitigation to the fixup routines to fall back to a single-byte copy if we faulted on a larger access before anything has been written to the destination, to guarantee making *some* forward progress. We needn't be too concerned about the overall performance since this should only occur when callers are doing something a bit dodgy in the first place. Particularly broken userspace might still be able to trick generic_perform_write() into an infinite loop by targeting write() at an mmap() of some read-only device register where the fault-in load succeeds but any store synchronously aborts such that copy_to_user() is genuinely unable to make progress, but, well, don't do that... CC: stable@vger.kernel.org Reported-by: Chen Huang <chenhuang5@huawei.com> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/dc03d5c675731a1f24a62417dba5429ad744234e.1626098433.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>