path: root/arch/arm64/lib/csum.c
Age  Commit message  Author
2023-09-07  arm64: csum: Fix OoB access in IP checksum code for negative lengths  Will Deacon
Although commit c2c24edb1d9c ("arm64: csum: Fix pathological zero-length calls") added an early return for zero-length input, syzkaller has popped up with an example of a _negative_ length which causes an undefined shift and an out-of-bounds read:

 | BUG: KASAN: slab-out-of-bounds in do_csum+0x44/0x254 arch/arm64/lib/csum.c:39
 | Read of size 4294966928 at addr ffff0000d7ac0170 by task syz-executor412/5975
 |
 | CPU: 0 PID: 5975 Comm: syz-executor412 Not tainted 6.4.0-rc4-syzkaller-g908f31f2a05b #0
 | Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023
 | Call trace:
 |  dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:233
 |  show_stack+0x2c/0x44 arch/arm64/kernel/stacktrace.c:240
 |  __dump_stack lib/dump_stack.c:88 [inline]
 |  dump_stack_lvl+0xd0/0x124 lib/dump_stack.c:106
 |  print_address_description mm/kasan/report.c:351 [inline]
 |  print_report+0x174/0x514 mm/kasan/report.c:462
 |  kasan_report+0xd4/0x130 mm/kasan/report.c:572
 |  kasan_check_range+0x264/0x2a4 mm/kasan/generic.c:187
 |  __kasan_check_read+0x20/0x30 mm/kasan/shadow.c:31
 |  do_csum+0x44/0x254 arch/arm64/lib/csum.c:39
 |  csum_partial+0x30/0x58 lib/checksum.c:128
 |  gso_make_checksum include/linux/skbuff.h:4928 [inline]
 |  __udp_gso_segment+0xaf4/0x1bc4 net/ipv4/udp_offload.c:332
 |  udp6_ufo_fragment+0x540/0xca0 net/ipv6/udp_offload.c:47
 |  ipv6_gso_segment+0x5cc/0x1760 net/ipv6/ip6_offload.c:119
 |  skb_mac_gso_segment+0x2b4/0x5b0 net/core/gro.c:141
 |  __skb_gso_segment+0x250/0x3d0 net/core/dev.c:3401
 |  skb_gso_segment include/linux/netdevice.h:4859 [inline]
 |  validate_xmit_skb+0x364/0xdbc net/core/dev.c:3659
 |  validate_xmit_skb_list+0x94/0x130 net/core/dev.c:3709
 |  sch_direct_xmit+0xe8/0x548 net/sched/sch_generic.c:327
 |  __dev_xmit_skb net/core/dev.c:3805 [inline]
 |  __dev_queue_xmit+0x147c/0x3318 net/core/dev.c:4210
 |  dev_queue_xmit include/linux/netdevice.h:3085 [inline]
 |  packet_xmit+0x6c/0x318 net/packet/af_packet.c:276
 |  packet_snd net/packet/af_packet.c:3081 [inline]
 |  packet_sendmsg+0x376c/0x4c98 net/packet/af_packet.c:3113
 |  sock_sendmsg_nosec net/socket.c:724 [inline]
 |  sock_sendmsg net/socket.c:747 [inline]
 |  __sys_sendto+0x3b4/0x538 net/socket.c:2144

Extend the early return to reject negative lengths as well, aligning our implementation with the generic code in lib/checksum.c.

Cc: Robin Murphy <robin.murphy@arm.com>
Fixes: 5777eaed566a ("arm64: Implement optimised checksum routine")
Reported-by: syzbot+4a9f9820bd8d302e22f7@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/000000000000e0e94c0603f8d213@google.com
Signed-off-by: Will Deacon <will@kernel.org>
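The guard described in this message can be sketched in userspace C as follows. This is an illustration only, not the kernel's do_csum(): `sum_bytes` and `as_unsigned` are hypothetical names, and the byte-wise loop stands in for the real checksum. The second helper shows why a negative length is dangerous: the read size of 4294966928 in the report above is consistent with a length of -368 reinterpreted as a 32-bit unsigned value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a checksum loop; NOT the kernel's do_csum(). */
static unsigned int sum_bytes(const unsigned char *buff, int len)
{
	unsigned int sum = 0;

	/* Reject zero and negative lengths before touching buff at all. */
	if (len <= 0)
		return 0;

	/* Past the guard, a positive length implies a readable buffer. */
	assert(buff != NULL);

	for (int i = 0; i < len; i++)
		sum += buff[i];
	return sum;
}

/* How a negative int length looks once it is treated as unsigned
 * (assumes a 32-bit unsigned int, as on common Linux targets). */
static unsigned int as_unsigned(int len)
{
	return (unsigned int)len;
}
```

With only a `len == 0` check, a negative length would fall through to the length arithmetic and shifts, which is where the undefined shift and out-of-bounds read come from.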
2020-04-15  arm64: csum: Disable KASAN for do_csum()  Will Deacon
do_csum() over-reads the source buffer and therefore abuses READ_ONCE_NOCHECK() to avoid tripping up KASAN. In preparation for READ_ONCE_NOCHECK() becoming a macro, and therefore losing its '__no_sanitize_address' annotation, just annotate do_csum() explicitly and fall back to normal loads.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-03-09  arm64: csum: Optimise IPv6 header checksum  Robin Murphy
Throwing our __uint128_t idioms at csum_ipv6_magic() makes it about 1.3x-2x faster across a range of microarchitecture/compiler combinations. Not much in absolute terms, but every little helps.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
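The kind of `__uint128_t` idiom referred to here can be sketched as below. This is a hedged illustration, not the code from csum_ipv6_magic(): `add64_carry` and `fold128` are hypothetical names, and `__uint128_t` is a GCC/Clang extension available on 64-bit targets. Both helpers compute a ones' complement (end-around carry) sum of two 64-bit values; the second does it by adding a 128-bit value to its own 64-bit rotation, so the carry out of the low half lands in the high half.

```c
#include <assert.h>
#include <stdint.h>

/* Ones' complement add of two 64-bit values via a 128-bit intermediate. */
static uint64_t add64_carry(uint64_t a, uint64_t b)
{
	__uint128_t tmp = (__uint128_t)a + b;
	uint64_t carry = (uint64_t)(tmp >> 64);

	/* Two 64-bit operands can carry out at most one bit. */
	assert(carry <= 1);
	return (uint64_t)tmp + carry;
}

/* Same result for the two halves of a 128-bit value: add the value to
 * itself rotated by 64 bits, then take the high half. */
static uint64_t fold128(uint64_t hi, uint64_t lo)
{
	__uint128_t v = ((__uint128_t)hi << 64) | lo;

	v += (v >> 64) | (v << 64);
	return (uint64_t)(v >> 64);
}
```

The rotate-and-add form avoids an explicit carry test, which is the property that tends to map well onto add-with-carry style instructions.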
2020-01-17  arm64: csum: Fix pathological zero-length calls  Robin Murphy
In validating the checksumming results of the new routine, I sadly neglected to test its not-checksumming results. Thus it slipped through that the one case where @buff is already dword-aligned and @len = 0 manages to defeat the tail-masking logic and behave as if @len = 8. For a zero length it doesn't make much sense to dereference @buff anyway, so just add an early return (which has essentially zero impact on performance).

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
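A minimal sketch of shift-based tail masking shows why a zero length is a special case. It is an assumption-laden illustration rather than the routine's actual code: `mask_tail` and `valid` are hypothetical names, and little-endian byte order is assumed. The mask keeps the low `valid` bytes of the last 8-byte word by shifting the unwanted high bytes out and back.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the low `valid` bytes (1..8) of a little-endian 64-bit word. */
static uint64_t mask_tail(uint64_t data, unsigned int valid)
{
	/* valid == 0 would need a shift of 64, which is undefined in C. */
	assert(valid >= 1 && valid <= 8);

	unsigned int shift = (8 - valid) * 8;	/* 0..56 bits */

	return (data << shift) >> shift;
}
```

With zero valid bytes the shift count would be 64. On a CPU that takes shift amounts modulo 64, that becomes a shift by 0 and leaves all 8 bytes unmasked, which is consistent with the "behave as if @len = 8" symptom described above; returning early for a zero length sidesteps the case entirely.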
2020-01-16  arm64: Implement optimised checksum routine  Robin Murphy
Apparently there exist certain workloads which rely heavily on software checksumming, for which the generic do_csum() implementation becomes a significant bottleneck. Therefore let's give arm64 its own optimised version - for ease of maintenance this forgoes assembly or intrinsics, and is thus not actually arm64-specific, but does rely heavily on C idioms that translate well to the A64 ISA and the typical load/store capabilities of most ARMv8 CPU cores.

The resulting increase in checksum throughput scales nicely with buffer size, tending towards 4x for a small in-order core (Cortex-A53), and up to 6x or more for an aggressive big core (Ampere eMAG).

Reported-by: Lingyan Huang <huanglingyan2@huawei.com>
Tested-by: Lingyan Huang <huanglingyan2@huawei.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
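For orientation, the quantity any such routine has to reproduce is the 16-bit ones' complement sum of the buffer. The sketch below is a deliberately simple reference written for this note, assuming network (big-endian) word order; `csum16_ref` is a hypothetical name and this is neither the generic nor the arm64 do_csum(). It handles two bytes per step, which is the sort of loop that wider loads and carry-folding idioms aim to beat.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference 16-bit ones' complement sum over big-endian 16-bit words. */
static uint16_t csum16_ref(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;

	assert(buf != NULL || len == 0);

	/* Add one 16-bit word per iteration. */
	while (len >= 2) {
		sum += ((uint32_t)buf[0] << 8) | buf[1];
		buf += 2;
		len -= 2;
	}

	/* A trailing odd byte is padded with a zero low byte. */
	if (len)
		sum += (uint32_t)buf[0] << 8;

	/* Fold the carries back in until the sum fits in 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)sum;
}
```

For the bytes 00 01 f2 03 f4 f5 f6 f7 the words sum to 0x2ddf0, which folds to 0xddf2. A transmitted checksum would be the bitwise complement of that value.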