summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-10-30net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extensionBenoît Monin
As documented in skbuff.h, devices with NETIF_F_IPV6_CSUM capability can only checksum TCP and UDP over IPv6 if the IP header does not contains extension. This is enforced for UDP packets emitted from user-space to an IPv6 address as they go through ip6_make_skb(), which calls __ip6_append_data() where a check is done on the header size before setting CHECKSUM_PARTIAL. But the introduction of UDP encapsulation with fou6 added a code-path where it is possible to get an skb with a partial UDP checksum and an IPv6 header with extension: * fou6 adds a UDP header with a partial checksum if the inner packet does not contains a valid checksum. * ip6_tunnel adds an IPv6 header with a destination option extension header if encap_limit is non-zero (the default value is 4). The thread linked below describes in more details how to reproduce the problem with GRE-in-UDP tunnel. Add a check on the network header size in skb_csum_hwoffload_help() to make sure no IPv6 packet with extension header is handed to a network device with NETIF_F_IPV6_CSUM capability. Link: https://lore.kernel.org/netdev/26548921.1r3eYUQgxm@benoit.monin/T/#u Fixes: aa3463d65e7b ("fou: Add encap ops for IPv6 tunnels") Signed-off-by: Benoît Monin <benoit.monin@gmx.fr> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/5fbeecfc311ea182aa1d1c771725ab8b4cac515e.1729778144.git.benoit.monin@gmx.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30x86/uaccess: Avoid barrier_nospec() in 64-bit copy_from_user()Linus Torvalds
The barrier_nospec() in 64-bit copy_from_user() is slow. Instead use pointer masking to force the user pointer to all 1's for an invalid address. The kernel test robot reports a 2.6% improvement in the per_thread_ops benchmark [1]. This is a variation on a patch originally by Josh Poimboeuf [2]. Link: https://lore.kernel.org/202410281344.d02c72a2-oliver.sang@intel.com [1] Link: https://lore.kernel.org/5b887fe4c580214900e21f6c61095adf9a142735.1730166635.git.jpoimboe@kernel.org [2] Tested-and-reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-10-30Merge tag 'perf-tools-fixes-for-v6.12-2-2024-10-30' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Update more header copies with the kernel sources, including const.h, msr-index.h, arm64's cputype.h, kvm's, bits.h and unaligned.h - The return from 'write' isn't a pid, fix cut'n'paste error in 'perf trace' - Fix up the python binding build on architectures without HAVE_KVM_STAT_SUPPORT - Add some more bounds checks to augmented_raw_syscalls.bpf.c (used to collect syscall pointer arguments in 'perf trace') to make the resulting bytecode to pass the kernel BPF verifier, allowing us to go back accepting clang 12.0.1 as the minimum version required for compiling BPF sources - Add __NR_capget for x86 to fix a regression on running perf + intel PT (hw tracing) as non-root setting up the capabilities as described in https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html - Fix missing syscalltbl in non-explicitly listed architectures, noticed on ARM 32-bit, that still needs a .tbl generator for the syscall id<->name tables, should be added for v6.13 - Handle 'perf test' failure when handling broken DWARF for ASM files * tag 'perf-tools-fixes-for-v6.12-2-2024-10-30' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: perf cap: Add __NR_capget to arch/x86 unistd tools headers: Update the linux/unaligned.h copy with the kernel sources tools headers arm64: Sync arm64's cputype.h with the kernel sources tools headers: Synchronize {uapi/}linux/bits.h with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources perf python: Fix up the build on architectures without HAVE_KVM_STAT_SUPPORT perf test: Handle perftool-testsuite_probe failure due to broken DWARF tools headers UAPI: Sync kvm headers with the kernel sources perf trace: Fix non-listed archs in the syscalltbl routines perf build: Change the clang check back to 12.0.1 perf trace augmented_raw_syscalls: Add more checks to pass the verifier perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers perf trace: The return from 'write' isn't a pid tools headers UAPI: Sync linux/const.h with the kernel headers
2024-10-30Merge branch 'fixes-for-bits-iterator'Alexei Starovoitov
Hou Tao says: ==================== The patch set fixes several issues in bits iterator. Patch #1 fixes the kmemleak problem of bits iterator. Patch #2~#3 fix the overflow problem of nr_bits. Patch #4 fixes the potential stack corruption when bits iterator is used on 32-bit host. Patch #5 adds more test cases for bits iterator. Please see the individual patches for more details. And comments are always welcome. --- v4: * patch #1: add ack from Yafang * patch #3: revert code-churn like changes: (1) compute nr_bytes and nr_bits before the check of nr_words. (2) use nr_bits == 64 to check for single u64, preventing build warning on 32-bit hosts. * patch #4: use "BITS_PER_LONG == 32" instead of "!defined(CONFIG_64BIT)" v3: https://lore.kernel.org/bpf/20241025013233.804027-1-houtao@huaweicloud.com/T/#t * split the bits-iterator related patches from "Misc fixes for bpf" patch set * patch #1: use "!nr_bits || bits >= nr_bits" to stop the iteration * patch #2: add a new helper for the overflow problem * patch #3: decrease the limitation from 512 to 511 and check whether nr_bytes is too large for bpf memory allocator explicitly * patch #5: add two more test cases for bit iterator v2: http://lore.kernel.org/bpf/d49fa2f4-f743-c763-7579-c3cab4dd88cb@huaweicloud.com ==================== Link: https://lore.kernel.org/r/20241030100516.3633640-1-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30selftests/bpf: Add three test cases for bits_iterHou Tao
Add more test cases for bits iterator: (1) huge word test Verify the multiplication overflow of nr_bits in bits_iter. Without the overflow check, when nr_words is 67108865, nr_bits becomes 64, causing bpf_probe_read_kernel_common() to corrupt the stack. (2) max word test Verify correct handling of maximum nr_words value (511). (3) bad word test Verify early termination of bits iteration when bits iterator initialization fails. Also rename bits_nomem to bits_too_big to better reflect its purpose. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-6-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30bpf: Use __u64 to save the bits in bits iteratorHou Tao
On 32-bit hosts (e.g., arm32), when a bpf program passes a u64 to bpf_iter_bits_new(), bpf_iter_bits_new() will use bits_copy to store the content of the u64. However, bits_copy is only 4 bytes, leading to stack corruption. The straightforward solution would be to replace u64 with unsigned long in bpf_iter_bits_new(). However, this introduces confusion and problems for 32-bit hosts because the size of ulong in bpf program is 8 bytes, but it is treated as 4-bytes after passed to bpf_iter_bits_new(). Fix it by changing the type of both bits and bit_count from unsigned long to u64. However, the change is not enough. The main reason is that bpf_iter_bits_next() uses find_next_bit() to find the next bit and the pointer passed to find_next_bit() is an unsigned long pointer instead of a u64 pointer. For 32-bit little-endian host, it is fine but it is not the case for 32-bit big-endian host. Because under 32-bit big-endian host, the first iterated unsigned long will be the bits 32-63 of the u64 instead of the expected bits 0-31. Therefore, in addition to changing the type, swap the two unsigned longs within the u64 for 32-bit big-endian host. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-5-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30bpf: Check the validity of nr_words in bpf_iter_bits_new()Hou Tao
Check the validity of nr_words in bpf_iter_bits_new(). Without this check, when multiplication overflow occurs for nr_bits (e.g., when nr_words = 0x0400-0001, nr_bits becomes 64), stack corruption may occur due to bpf_probe_read_kernel_common(..., nr_bytes = 0x2000-0008). Fix it by limiting the maximum value of nr_words to 511. The value is derived from the current implementation of BPF memory allocator. To ensure compatibility if the BPF memory allocator's size limitation changes in the future, use the helper bpf_mem_alloc_check_size() to check whether nr_bytes is too larger. And return -E2BIG instead of -ENOMEM for oversized nr_bytes. Fixes: 4665415975b0 ("bpf: Add bits iterator") Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-4-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30bpf: Add bpf_mem_alloc_check_size() helperHou Tao
Introduce bpf_mem_alloc_check_size() to check whether the allocation size exceeds the limitation for the kmalloc-equivalent allocator. The upper limit for percpu allocation is LLIST_NODE_SZ bytes larger than non-percpu allocation, so a percpu argument is added to the helper. The helper will be used in the following patch to check whether the size parameter passed to bpf_mem_alloc() is too big. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30bpf: Free dynamically allocated bits in bpf_iter_bits_destroy()Hou Tao
bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the bits are dynamically allocated. However, the check is incorrect and may cause a kmemleak as shown below: unreferenced object 0xffff88812628c8c0 (size 32): comm "swapper/0", pid 1, jiffies 4294727320 hex dump (first 32 bytes): b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0 ..U........... f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00 .............. backtrace (crc 781e32cc): [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80 [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0 [<00000000597124d6>] __alloc.isra.0+0x89/0xb0 [<000000004ebfffcd>] alloc_bulk+0x2af/0x720 [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0 [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610 [<000000008b616eac>] bpf_global_ma_init+0x19/0x30 [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0 [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940 [<00000000b119f72f>] kernel_init+0x20/0x160 [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70 [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30 That is because nr_bits will be set as zero in bpf_iter_bits_next() after all bits have been iterated. Fix the issue by setting kit->bit to kit->nr_bits instead of setting kit->nr_bits to zero when the iteration completes in bpf_iter_bits_next(). In addition, use "!nr_bits || bits >= nr_bits" to check whether the iteration is complete and still use "nr_bits > 64" to indicate whether bits are dynamically allocated. The "!nr_bits" check is necessary because bpf_iter_bits_new() may fail before setting kit->nr_bits, and this condition will stop the iteration early instead of accessing the zeroed or freed kit->bits. Considering the initial value of kit->bits is -1 and the type of kit->nr_bits is unsigned int, change the type of kit->nr_bits to int. The potential overflow problem will be handled in the following patch. Fixes: 4665415975b0 ("bpf: Add bits iterator") Acked-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30Bluetooth: hci: fix null-ptr-deref in hci_read_supported_codecsSungwoo Kim
Fix __hci_cmd_sync_sk() to return not NULL for unknown opcodes. __hci_cmd_sync_sk() returns NULL if a command returns a status event. However, it also returns NULL where an opcode doesn't exist in the hci_cc table because hci_cmd_complete_evt() assumes status = skb->data[0] for unknown opcodes. This leads to null-ptr-deref in cmd_sync for HCI_OP_READ_LOCAL_CODECS as there is no hci_cc for HCI_OP_READ_LOCAL_CODECS, which always assumes status = skb->data[0]. KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077] CPU: 1 PID: 2000 Comm: kworker/u9:5 Not tainted 6.9.0-ga6bcb805883c-dirty #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: hci7 hci_power_on RIP: 0010:hci_read_supported_codecs+0xb9/0x870 net/bluetooth/hci_codec.c:138 Code: 08 48 89 ef e8 b8 c1 8f fd 48 8b 75 00 e9 96 00 00 00 49 89 c6 48 ba 00 00 00 00 00 fc ff df 4c 8d 60 70 4c 89 e3 48 c1 eb 03 <0f> b6 04 13 84 c0 0f 85 82 06 00 00 41 83 3c 24 02 77 0a e8 bf 78 RSP: 0018:ffff888120bafac8 EFLAGS: 00010212 RAX: 0000000000000000 RBX: 000000000000000e RCX: ffff8881173f0040 RDX: dffffc0000000000 RSI: ffffffffa58496c0 RDI: ffff88810b9ad1e4 RBP: ffff88810b9ac000 R08: ffffffffa77882a7 R09: 1ffffffff4ef1054 R10: dffffc0000000000 R11: fffffbfff4ef1055 R12: 0000000000000070 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810b9ac000 FS: 0000000000000000(0000) GS:ffff8881f6c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f6ddaa3439e CR3: 0000000139764003 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <TASK> hci_read_local_codecs_sync net/bluetooth/hci_sync.c:4546 [inline] hci_init_stage_sync net/bluetooth/hci_sync.c:3441 [inline] hci_init4_sync net/bluetooth/hci_sync.c:4706 [inline] hci_init_sync net/bluetooth/hci_sync.c:4742 [inline] hci_dev_init_sync net/bluetooth/hci_sync.c:4912 [inline] hci_dev_open_sync+0x19a9/0x2d30 net/bluetooth/hci_sync.c:4994 hci_dev_do_open net/bluetooth/hci_core.c:483 [inline] hci_power_on+0x11e/0x560 net/bluetooth/hci_core.c:1015 process_one_work kernel/workqueue.c:3267 [inline] process_scheduled_works+0x8ef/0x14f0 kernel/workqueue.c:3348 worker_thread+0x91f/0xe50 kernel/workqueue.c:3429 kthread+0x2cb/0x360 kernel/kthread.c:388 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Fixes: abfeea476c68 ("Bluetooth: hci_sync: Convert MGMT_OP_START_DISCOVERY") Signed-off-by: Sungwoo Kim <iam@sung-woo.kim> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-10-30Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two small fixes, both in drivers (ufs and scsi_debug)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Fix another deadlock during RTC update scsi: scsi_debug: Fix do_device_access() handling of unexpected SG copy length
2024-10-30ALSA: hda/realtek: Fix headset mic on TUXEDO Stellaris 16 Gen6 mb1Christoffer Sandberg
Quirk is needed to enable headset microphone on missing pin 0x19. Signed-off-by: Christoffer Sandberg <cs@tuxedo.de> Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241029151653.80726-2-wse@tuxedocomputers.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-10-30ALSA: hda/realtek: Fix headset mic on TUXEDO Gemini 17 Gen3Christoffer Sandberg
Quirk is needed to enable headset microphone on missing pin 0x19. Signed-off-by: Christoffer Sandberg <cs@tuxedo.de> Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241029151653.80726-1-wse@tuxedocomputers.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-10-30ALSA: usb-audio: Add quirks for Dell WD19 dockJan Schär
The WD19 family of docks has the same audio chipset as the WD15. This change enables jack detection on the WD19. We don't need the dell_dock_mixer_init quirk for the WD19. It is only needed because of the dell_alc4020_map quirk for the WD15 in mixer_maps.c, which disables the volume controls. Even for the WD15, this quirk was apparently only needed when the dock firmware was not updated. Signed-off-by: Jan Schär <jan@jschaer.ch> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241029221249.15661-1-jan@jschaer.ch Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-10-30Merge tag 'asoc-fix-v6.12-rc5' of ↵Takashi Iwai
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.12 The biggest set of changes here is Hans' fixes and quirks for various Baytrail based platforms with RT5640 CODECs, and there's one core fix for a missed length assignment for __counted_by() checking. Otherwise it's small device specific fixes, several of them in the DT bindings.
2024-10-30netfilter: nf_reject_ipv6: fix potential crash in nf_send_reset6()Eric Dumazet
I got a syzbot report without a repro [1] crashing in nf_send_reset6() I think the issue is that dev->hard_header_len is zero, and we attempt later to push an Ethernet header. Use LL_MAX_HEADER, as other functions in net/ipv6/netfilter/nf_reject_ipv6.c. [1] skbuff: skb_under_panic: text:ffffffff89b1d008 len:74 put:14 head:ffff88803123aa00 data:ffff88803123a9f2 tail:0x3c end:0x140 dev:syz_tun kernel BUG at net/core/skbuff.c:206 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 0 UID: 0 PID: 7373 Comm: syz.1.568 Not tainted 6.12.0-rc2-syzkaller-00631-g6d858708d465 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:skb_panic net/core/skbuff.c:206 [inline] RIP: 0010:skb_under_panic+0x14b/0x150 net/core/skbuff.c:216 Code: 0d 8d 48 c7 c6 60 a6 29 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 ba 30 38 02 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 RSP: 0018:ffffc900045269b0 EFLAGS: 00010282 RAX: 0000000000000088 RBX: dffffc0000000000 RCX: cd66dacdc5d8e800 RDX: 0000000000000000 RSI: 0000000000000200 RDI: 0000000000000000 RBP: ffff88802d39a3d0 R08: ffffffff8174afec R09: 1ffff920008a4ccc R10: dffffc0000000000 R11: fffff520008a4ccd R12: 0000000000000140 R13: ffff88803123aa00 R14: ffff88803123a9f2 R15: 000000000000003c FS: 00007fdbee5ff6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000005d322000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_push+0xe5/0x100 net/core/skbuff.c:2636 eth_header+0x38/0x1f0 net/ethernet/eth.c:83 dev_hard_header include/linux/netdevice.h:3208 [inline] nf_send_reset6+0xce6/0x1270 net/ipv6/netfilter/nf_reject_ipv6.c:358 nft_reject_inet_eval+0x3b9/0x690 net/netfilter/nft_reject_inet.c:48 expr_call_ops_eval net/netfilter/nf_tables_core.c:240 [inline] nft_do_chain+0x4ad/0x1da0 net/netfilter/nf_tables_core.c:288 nft_do_chain_inet+0x418/0x6b0 net/netfilter/nft_chain_filter.c:161 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626 nf_hook include/linux/netfilter.h:269 [inline] NF_HOOK include/linux/netfilter.h:312 [inline] br_nf_pre_routing_ipv6+0x63e/0x770 net/bridge/br_netfilter_ipv6.c:184 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_bridge_pre net/bridge/br_input.c:277 [inline] br_handle_frame+0x9fd/0x1530 net/bridge/br_input.c:424 __netif_receive_skb_core+0x13e8/0x4570 net/core/dev.c:5562 __netif_receive_skb_one_core net/core/dev.c:5666 [inline] __netif_receive_skb+0x12f/0x650 net/core/dev.c:5781 netif_receive_skb_internal net/core/dev.c:5867 [inline] netif_receive_skb+0x1e8/0x890 net/core/dev.c:5926 tun_rx_batched+0x1b7/0x8f0 drivers/net/tun.c:1550 tun_get_user+0x3056/0x47e0 drivers/net/tun.c:2007 tun_chr_write_iter+0x10d/0x1f0 drivers/net/tun.c:2053 new_sync_write fs/read_write.c:590 [inline] vfs_write+0xa6d/0xc90 fs/read_write.c:683 ksys_write+0x183/0x2b0 fs/read_write.c:736 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fdbeeb7d1ff Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 c9 8d 02 00 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 1c 8e 02 00 48 RSP: 002b:00007fdbee5ff000 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fdbeed36058 RCX: 00007fdbeeb7d1ff RDX: 000000000000008e RSI: 0000000020000040 RDI: 00000000000000c8 RBP: 00007fdbeebf12be R08: 0000000000000000 R09: 0000000000000000 R10: 000000000000008e R11: 0000000000000293 R12: 0000000000000000 R13: 0000000000000000 R14: 00007fdbeed36058 R15: 00007ffc38de06e8 </TASK> Fixes: c8d7b98bec43 ("netfilter: move nf_send_resetX() code to nf_reject_ipvX modules") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-10-30netfilter: Fix use-after-free in get_info()Dong Chenchen
ip6table_nat module unload has refcnt warning for UAF. call trace is: WARNING: CPU: 1 PID: 379 at kernel/module/main.c:853 module_put+0x6f/0x80 Modules linked in: ip6table_nat(-) CPU: 1 UID: 0 PID: 379 Comm: ip6tables Not tainted 6.12.0-rc4-00047-gc2ee9f594da8-dirty #205 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:module_put+0x6f/0x80 Call Trace: <TASK> get_info+0x128/0x180 do_ip6t_get_ctl+0x6a/0x430 nf_getsockopt+0x46/0x80 ipv6_getsockopt+0xb9/0x100 rawv6_getsockopt+0x42/0x190 do_sock_getsockopt+0xaa/0x180 __sys_getsockopt+0x70/0xc0 __x64_sys_getsockopt+0x20/0x30 do_syscall_64+0xa2/0x1a0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Concurrent execution of module unload and get_info() trigered the warning. The root cause is as follows: cpu0 cpu1 module_exit //mod->state = MODULE_STATE_GOING ip6table_nat_exit xt_unregister_template kfree(t) //removed from templ_list getinfo() t = xt_find_table_lock list_for_each_entry(tmpl, &xt_templates[af]...) if (strcmp(tmpl->name, name)) continue; //table not found try_module_get list_for_each_entry(t, &xt_net->tables[af]...) return t; //not get refcnt module_put(t->me) //uaf unregister_pernet_subsys //remove table from xt_net list While xt_table module was going away and has been removed from xt_templates list, we couldnt get refcnt of xt_table->me. Check module in xt_net->tables list re-traversal to fix it. Fixes: fdacd57c79b7 ("netfilter: x_tables: never register tables by default") Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-10-30selftests: netfilter: remove unused parameterLiu Jing
err is never used, remove it. Signed-off-by: Liu Jing <liujing@cmss.chinamobile.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-10-29bpf: disallow 40-bytes extra stack for bpf_fastcall patternsEduard Zingerman
Hou Tao reported an issue with bpf_fastcall patterns allowing extra stack space above MAX_BPF_STACK limit. This extra stack allowance is not integrated properly with the following verifier parts: - backtracking logic still assumes that stack can't exceed MAX_BPF_STACK; - bpf_verifier_env->scratched_stack_slots assumes only 64 slots are available. Here is an example of an issue with precision tracking (note stack slot -8 tracked as precise instead of -520): 0: (b7) r1 = 42 ; R1_w=42 1: (b7) r2 = 42 ; R2_w=42 2: (7b) *(u64 *)(r10 -512) = r1 ; R1_w=42 R10=fp0 fp-512_w=42 3: (7b) *(u64 *)(r10 -520) = r2 ; R2_w=42 R10=fp0 fp-520_w=42 4: (85) call bpf_get_smp_processor_id#8 ; R0_w=scalar(...) 5: (79) r2 = *(u64 *)(r10 -520) ; R2_w=42 R10=fp0 fp-520_w=42 6: (79) r1 = *(u64 *)(r10 -512) ; R1_w=42 R10=fp0 fp-512_w=42 7: (bf) r3 = r10 ; R3_w=fp0 R10=fp0 8: (0f) r3 += r2 mark_precise: frame0: last_idx 8 first_idx 0 subseq_idx -1 mark_precise: frame0: regs=r2 stack= before 7: (bf) r3 = r10 mark_precise: frame0: regs=r2 stack= before 6: (79) r1 = *(u64 *)(r10 -512) mark_precise: frame0: regs=r2 stack= before 5: (79) r2 = *(u64 *)(r10 -520) mark_precise: frame0: regs= stack=-8 before 4: (85) call bpf_get_smp_processor_id#8 mark_precise: frame0: regs= stack=-8 before 3: (7b) *(u64 *)(r10 -520) = r2 mark_precise: frame0: regs=r2 stack= before 2: (7b) *(u64 *)(r10 -512) = r1 mark_precise: frame0: regs=r2 stack= before 1: (b7) r2 = 42 9: R2_w=42 R3_w=fp42 9: (95) exit This patch disables the additional allowance for the moment. Also, two test cases are removed: - bpf_fastcall_max_stack_ok: it fails w/o additional stack allowance; - bpf_fastcall_max_stack_fail: this test is no longer necessary, stack size follows regular rules, pattern invalidation is checked by other test cases. Reported-by: Hou Tao <houtao@huaweicloud.com> Closes: https://lore.kernel.org/bpf/20241023022752.172005-1-houtao@huaweicloud.com/ Fixes: 5b5f51bff1b6 ("bpf: no_caller_saved_registers attribute for helper calls") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241029193911.1575719-1-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-29Merge tag 'cgroup-for-6.12-rc5-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - cgroup_bpf_release_fn() could saturate system_wq with cgrp->bpf.release_work which can then form a circular dependency leading to deadlocks. Fix by using a dedicated workqueue. The system_wq's max concurrency limit is being increased separately. - Fix theoretical off-by-one bug when enforcing max cgroup hierarchy depth * tag 'cgroup-for-6.12-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Fix potential overflow issue when checking max_depth cgroup/bpf: use a dedicated workqueue for cgroup bpf destruction
2024-10-29Merge tag 'sched_ext-for-6.12-rc5-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - Instances of scx_ops_bypass() could race each other leading to misbehavior. Fix by protecting the operation with a spinlock. - selftest and userspace header fixes * tag 'sched_ext-for-6.12-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Fix enq_last_no_enq_fails selftest sched_ext: Make cast_mask() inline scx: Fix raciness in scx_ops_bypass() scx: Fix exit selftest to use custom DSQ sched_ext: Fix function pointer type mismatches in BPF selftests selftests/sched_ext: add order-only dependency of runner.o on BPFOBJ
2024-10-29Merge tag 'slab-for-6.12-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fixes from Vlastimil Babka: - Fix for a slub_kunit test warning with MEM_ALLOC_PROFILING_DEBUG (Pei Xiao) - Fix for a MTE-based KASAN BUG in krealloc() (Qun-Wei Lin) * tag 'slab-for-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm: krealloc: Fix MTE false alarm in __do_krealloc slub/kunit: fix a WARNING due to unwrapped __kmalloc_cache_noprof
2024-10-29Merge tag 'mm-hotfixes-stable-2024-10-28-21-50' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "21 hotfixes. 13 are cc:stable. 13 are MM and 8 are non-MM. No particular theme here - mainly singletons, a couple of doubletons. Please see the changelogs" * tag 'mm-hotfixes-stable-2024-10-28-21-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits) mm: avoid unconditional one-tick sleep when swapcache_prepare fails mseal: update mseal.rst mm: split critical region in remap_file_pages() and invoke LSMs in between selftests/mm: fix deadlock for fork after pthread_create with atomic_bool Revert "selftests/mm: replace atomic_bool with pthread_barrier_t" Revert "selftests/mm: fix deadlock for fork after pthread_create on ARM" tools: testing: add expand-only mode VMA test mm/vma: add expand-only VMA merge mode and optimise do_brk_flags() resource,kexec: walk_system_ram_res_rev must retain resource flags nilfs2: fix kernel bug due to missing clearing of checked flag mm: numa_clear_kernel_node_hotplug: Add NUMA_NO_NODE check for node id ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow mm: shmem: fix data-race in shmem_getattr() mm: mark mas allocation in vms_abort_munmap_vmas as __GFP_NOFAIL x86/traps: move kmsan check after instrumentation_begin resource: remove dependency on SPARSEMEM from GET_FREE_REGION mm/mmap: fix race in mmap_region() with ftruncate() mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves fork: only invoke khugepaged, ksm hooks if no error fork: do not invoke uffd on fork if error occurs ...
2024-10-29Merge tag 'tpmdd-next-6.12-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm fix from Jarkko Sakkinen: "Address a significant boot-time delay issue" Link: https://bugzilla.kernel.org/show_bug.cgi?id=219229 * tag 'tpmdd-next-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm: Lazily flush the auth session tpm: Rollback tpm2_load_null() tpm: Return tpm2_sessions_init() when null key creation fails
2024-10-29Merge tag 'wireless-2024-10-29' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== wireless fixes for v6.12-rc6 Another set of fixes, mostly iwlwifi: * fix infinite loop in 6 GHz scan if more than 255 colocated APs were reported * revert removal of retry loops for now to work around issues with firmware initialization on some devices/platforms * fix SAR table issues with some BIOSes * fix race in suspend/debug collection * fix memory leak in fw recovery * fix link ID leak in AP mode for older devices * fix sending TX power constraints * fix link handling in FW restart And also the stack: * fix setting TX power from userspace with the new chanctx emulation code for old-style drivers * fix a memory corruption bug due to structure embedding * fix CQM configuration double-free when moving between net namespaces * tag 'wireless-2024-10-29' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: mac80211: ieee80211_i: Fix memory corruption bug in struct ieee80211_chanctx wifi: iwlwifi: mvm: fix 6 GHz scan construction wifi: cfg80211: clear wdev->cqm_config pointer on free mac80211: fix user-power when emulating chanctx Revert "wifi: iwlwifi: remove retry loops in start" wifi: iwlwifi: mvm: don't add default link in fw restart flow wifi: iwlwifi: mvm: Fix response handling in iwl_mvm_send_recovery_cmd() wifi: iwlwifi: mvm: SAR table alignment wifi: iwlwifi: mvm: Use the sync timepoint API in suspend wifi: iwlwifi: mvm: really send iwl_txpower_constraints_cmd wifi: iwlwifi: mvm: don't leak a link on AP removal ==================== Link: https://patch.msgid.link/20241029093926.13750-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-29net: fix crash when config small gso_max_size/gso_ipv4_max_sizeWang Liang
Config a small gso_max_size/gso_ipv4_max_size will lead to an underflow in sk_dst_gso_max_size(), which may trigger a BUG_ON crash, because sk->sk_gso_max_size would be much bigger than device limits. Call Trace: tcp_write_xmit tso_segs = tcp_init_tso_segs(skb, mss_now); tcp_set_skb_tso_segs tcp_skb_pcount_set // skb->len = 524288, mss_now = 8 // u16 tso_segs = 524288/8 = 65535 -> 0 tso_segs = DIV_ROUND_UP(skb->len, mss_now) BUG_ON(!tso_segs) Add check for the minimum value of gso_max_size and gso_ipv4_max_size. Fixes: 46e6b992c250 ("rtnetlink: allow GSO maximums to be set on device creation") Fixes: 9eefedd58ae1 ("net: add gso_ipv4_max_size and gro_ipv4_max_size per device") Signed-off-by: Wang Liang <wangliang74@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241023035213.517386-1-wangliang74@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-29selftests/bpf: Add test for trie_get_next_key()Byeonguk Jeong
Add a test for out-of-bounds write in trie_get_next_key() when a full path from root to leaf exists and bpf_map_get_next_key() is called with the leaf node. It may crashes the kernel on failure, so please run in a VM. Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/Zxx4ep78tsbeWPVM@localhost.localdomain Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-29bpf: Fix out-of-bounds write in trie_get_next_key()Byeonguk Jeong
trie_get_next_key() allocates a node stack with size trie->max_prefixlen, while it writes (trie->max_prefixlen + 1) nodes to the stack when it has full paths from the root to leaves. For example, consider a trie with max_prefixlen is 8, and the nodes with key 0x00/0, 0x00/1, 0x00/2, ... 0x00/8 inserted. Subsequent calls to trie_get_next_key with _key with .prefixlen = 8 make 9 nodes be written on the node stack with size 8. Fixes: b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map") Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com> Reviewed-by: Toke Høiland-Jørgensen <toke@kernel.org> Tested-by: Hou Tao <houtao1@huawei.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/Zxx384ZfdlFYnz6J@localhost.localdomain Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-29wcd937x codec fixesMark Brown
Merge series from Alexey Klimov <alexey.klimov@linaro.org>: This sent as RFC because of the following: - regarding the LO switch patch. I've got info about that from two persons independently hence not sure what tags to put there and who should be the author. Please let me know if that needs to be corrected. - the wcd937x pdm watchdog is a problem for audio playback and needs to be fixed. The minimal fix would be to at least increase timeout value but it will still trigger in case of plenty of dbg messages or other delay-generating things. Unfortunately, I can't test HPHL/R outputs hence the patch is only for AUX. The other options would be introducing module parameter for debugging and using HOLD_OFF bit for that or adding Kconfig option. Alexey Klimov (2): ASoC: codecs: wcd937x: add missing LO Switch control ASoC: codecs: wcd937x: relax the AUX PDM watchdog sound/soc/codecs/wcd937x.c | 12 ++++++++++-- sound/soc/codecs/wcd937x.h | 4 ++++ 2 files changed, 14 insertions(+), 2 deletions(-) -- 2.45.2
2024-10-29net: usb: qmi_wwan: add Quectel RG650VBenoît Monin
Add support for Quectel RG650V which is based on Qualcomm SDX65 chip. The composition is DIAG / NMEA / AT / AT / QMI. T: Bus=02 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=5000 MxCh= 0 D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=2c7c ProdID=0122 Rev=05.15 S: Manufacturer=Quectel S: Product=RG650V-EU S: SerialNumber=xxxxxxx C: #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=896mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=9ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=9ms I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=05(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=88(I) Atr=03(Int.) MxPS= 8 Ivl=9ms Signed-off-by: Benoît Monin <benoit.monin@gmx.fr> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241024151113.53203-1-benoit.monin@gmx.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-29net/sched: sch_api: fix xa_insert() error path in tcf_block_get_ext()Vladimir Oltean
This command: $ tc qdisc replace dev eth0 ingress_block 1 egress_block 1 clsact Error: block dev insert failed: -EBUSY. fails because user space requests the same block index to be set for both ingress and egress. [ side note, I don't think it even failed prior to commit 913b47d3424e ("net/sched: Introduce tc block netdev tracking infra"), because this is a command from an old set of notes of mine which used to work, but alas, I did not scientifically bisect this ] The problem is not that it fails, but rather, that the second time around, it fails differently (and irrecoverably): $ tc qdisc replace dev eth0 ingress_block 1 egress_block 1 clsact Error: dsa_core: Flow block cb is busy. [ another note: the extack is added by me for illustration purposes. the context of the problem is that clsact_init() obtains the same &q->ingress_block pointer as &q->egress_block, and since we call tcf_block_get_ext() on both of them, "dev" will be added to the block->ports xarray twice, thus failing the operation: once through the ingress block pointer, and once again through the egress block pointer. the problem itself is that when xa_insert() fails, we have emitted a FLOW_BLOCK_BIND command through ndo_setup_tc(), but the offload never sees a corresponding FLOW_BLOCK_UNBIND. ] Even correcting the bad user input, we still cannot recover: $ tc qdisc replace dev swp3 ingress_block 1 egress_block 2 clsact Error: dsa_core: Flow block cb is busy. Basically the only way to recover is to reboot the system, or unbind and rebind the net device driver. To fix the bug, we need to fill the correct error teardown path which was missed during code movement, and call tcf_block_offload_unbind() when xa_insert() fails. [ last note, fundamentally I blame the label naming convention in tcf_block_get_ext() for the bug. The labels should be named after what they do, not after the error path that jumps to them. This way, it is obviously wrong that two labels pointing to the same code mean something is wrong, and checking the code correctness at the goto site is also easier ] Fixes: 94e2557d086a ("net: sched: move block device tracking into tcf_block_get/put_ext()") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20241023100541.974362-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-29netdevsim: Add trailing zero to terminate the string in ↵Zichen Xie
nsim_nexthop_bucket_activity_write() This was found by a static analyzer. We should not forget the trailing zero after copy_from_user() if we will further do some string operations, sscanf() in this case. Adding a trailing zero will ensure that the function performs properly. Fixes: c6385c0b67c5 ("netdevsim: Allow reporting activity on nexthop buckets") Signed-off-by: Zichen Xie <zichenxie0106@gmail.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20241022171907.8606-1-zichenxie0106@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-29selftests/bpf: Test with a very short loopEduard Zingerman
The test added is a simplified reproducer from syzbot report [1]. If verifier does not insert checkpoint somewhere inside the loop, verification of the program would take a very long time. This would happen because mark_chain_precision() for register r7 would constantly trace jump history of the loop back, processing many iterations for each mark_chain_precision() call. [1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20241029172641.1042523-2-eddyz87@gmail.com
2024-10-29bpf: Force checkpoint when jmp history is too longEduard Zingerman
A specifically crafted program might trick verifier into growing very long jump history within a single bpf_verifier_state instance. Very long jump history makes mark_chain_precision() unreasonably slow, especially in case if verifier processes a loop. Mitigate this by forcing new state in is_state_visited() in case if current state's jump history is too long. Use same constant as in `skip_inf_loop_check`, but multiply it by arbitrarily chosen value 2 to account for jump history containing not only information about jumps, but also information about stack access. For an example of problematic program consider the code below, w/o this patch the example is processed by verifier for ~15 minutes, before failing to allocate big-enough chunk for jmp_history. 0: r7 = *(u16 *)(r1 +0);" 1: r7 += 0x1ab064b9;" 2: if r7 & 0x702000 goto 1b; 3: r7 &= 0x1ee60e;" 4: r7 += r1;" 5: if r7 s> 0x37d2 goto +0;" 6: r0 = 0;" 7: exit;" Perf profiling shows that most of the time is spent in mark_chain_precision() ~95%. The easiest way to explain why this program causes problems is to apply the following patch: diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 0c216e71cec7..4b4823961abe 100644 \--- a/include/linux/bpf.h \+++ b/include/linux/bpf.h \@@ -1926,7 +1926,7 @@ struct bpf_array { }; }; -#define BPF_COMPLEXITY_LIMIT_INSNS 1000000 /* yes. 1M insns */ +#define BPF_COMPLEXITY_LIMIT_INSNS 256 /* yes. 1M insns */ #define MAX_TAIL_CALL_CNT 33 /* Maximum number of loops for bpf_loop and bpf_iter_num. diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index f514247ba8ba..75e88be3bb3e 100644 \--- a/kernel/bpf/verifier.c \+++ b/kernel/bpf/verifier.c \@@ -18024,8 +18024,13 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) skip_inf_loop_check: if (!force_new_state && env->jmps_processed - env->prev_jmps_processed < 20 && - env->insn_processed - env->prev_insn_processed < 100) + env->insn_processed - env->prev_insn_processed < 100) { + verbose(env, "is_state_visited: suppressing checkpoint at %d, %d jmps processed, cur->jmp_history_cnt is %d\n", + env->insn_idx, + env->jmps_processed - env->prev_jmps_processed, + cur->jmp_history_cnt); add_new_state = false; + } goto miss; } /* If sl->state is a part of a loop and this loop's entry is a part of \@@ -18142,6 +18147,9 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) if (!add_new_state) return 0; + verbose(env, "is_state_visited: new checkpoint at %d, resetting env->jmps_processed\n", + env->insn_idx); + /* There were no equivalent states, remember the current one. * Technically the current state is not proven to be safe yet, * but it will either reach outer most bpf_exit (which means it's safe) And observe verification log: ... is_state_visited: new checkpoint at 5, resetting env->jmps_processed 5: R1=ctx() R7=ctx(...) 5: (65) if r7 s> 0x37d2 goto pc+0 ; R7=ctx(...) 6: (b7) r0 = 0 ; R0_w=0 7: (95) exit from 5 to 6: R1=ctx() R7=ctx(...) R10=fp0 6: R1=ctx() R7=ctx(...) R10=fp0 6: (b7) r0 = 0 ; R0_w=0 7: (95) exit is_state_visited: suppressing checkpoint at 1, 3 jmps processed, cur->jmp_history_cnt is 74 from 2 to 1: R1=ctx() R7_w=scalar(...) R10=fp0 1: R1=ctx() R7_w=scalar(...) R10=fp0 1: (07) r7 += 447767737 is_state_visited: suppressing checkpoint at 2, 3 jmps processed, cur->jmp_history_cnt is 75 2: R7_w=scalar(...) 2: (45) if r7 & 0x702000 goto pc-2 ... mark_precise 152 steps for r7 ... 2: R7_w=scalar(...) is_state_visited: suppressing checkpoint at 1, 4 jmps processed, cur->jmp_history_cnt is 75 1: (07) r7 += 447767737 is_state_visited: suppressing checkpoint at 2, 4 jmps processed, cur->jmp_history_cnt is 76 2: R7_w=scalar(...) 2: (45) if r7 & 0x702000 goto pc-2 ... BPF program is too large. Processed 257 insn The log output shows that checkpoint at label (1) is never created, because it is suppressed by `skip_inf_loop_check` logic: a. When 'if' at (2) is processed it pushes a state with insn_idx (1) onto stack and proceeds to (3); b. At (5) checkpoint is created, and this resets env->{jmps,insns}_processed. c. Verification proceeds and reaches `exit`; d. State saved at step (a) is popped from stack and is_state_visited() considers if checkpoint needs to be added, but because env->{jmps,insns}_processed had been just reset at step (b) the `skip_inf_loop_check` logic forces `add_new_state` to false. e. Verifier proceeds with current state, which slowly accumulates more and more entries in the jump history. The accumulation of entries in the jump history is a problem because of two factors: - it eventually exhausts memory available for kmalloc() allocation; - mark_chain_precision() traverses the jump history of a state, meaning that if `r7` is marked precise, verifier would iterate ever growing jump history until parent state boundary is reached. (note: the log also shows a REG INVARIANTS VIOLATION warning upon jset processing, but that's another bug to fix). With this patch applied, the example above is rejected by verifier under 1s of time, reaching 1M instructions limit. The program is a simplified reproducer from syzbot report. Previous discussion could be found at [1]. The patch does not cause any changes in verification performance, when tested on selftests from veristat.cfg and cilium programs taken from [2]. [1] https://lore.kernel.org/bpf/20241009021254.2805446-1-eddyz87@gmail.com/ [2] https://github.com/anakryiko/cilium Changelog: - v1 -> v2: - moved patch to bpf tree; - moved force_new_state variable initialization after declaration and shortened the comment. v1: https://lore.kernel.org/bpf/20241018020307.1766906-1-eddyz87@gmail.com/ Fixes: 2589726d12a1 ("bpf: introduce bounded loops") Reported-by: syzbot+7e46cdef14bf496a3ab4@syzkaller.appspotmail.com Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20241029172641.1042523-1-eddyz87@gmail.com Closes: https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/
2024-10-29net/sched: stop qdisc_tree_reduce_backlog on TC_H_ROOTPedro Tammela
In qdisc_tree_reduce_backlog, Qdiscs with major handle ffff: are assumed to be either root or ingress. This assumption is bogus since it's valid to create egress qdiscs with major handle ffff: Budimir Markovic found that for qdiscs like DRR that maintain an active class list, it will cause a UAF with a dangling class pointer. In 066a3b5b2346, the concern was to avoid iterating over the ingress qdisc since its parent is itself. The proper fix is to stop when parent TC_H_ROOT is reached because the only way to retrieve ingress is when a hierarchy which does not contain a ffff: major handle call into qdisc_lookup with TC_H_MAJ(TC_H_ROOT). In the scenario where major ffff: is an egress qdisc in any of the tree levels, the updates will also propagate to TC_H_ROOT, which then the iteration must stop. Fixes: 066a3b5b2346 ("[NET_SCHED] sch_api: fix qdisc_tree_decrease_qlen() loop") Reported-by: Budimir Markovic <markovicbudimir@gmail.com> Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> net/sched/sch_api.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241024165547.418570-1-jhs@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-29selftests: netfilter: nft_flowtable.sh: make first pass deterministicFlorian Westphal
The CI occasionaly encounters a failing test run. Example: # PASS: ipsec tunnel mode for ns1/ns2 # re-run with random mtus: -o 10966 -l 19499 -r 31322 # PASS: flow offloaded for ns1/ns2 [..] # FAIL: ipsec tunnel ... counter 1157059 exceeds expected value 878489 This script will re-exec itself, on the second run, random MTUs are chosen for the involved links. This is done so we can cover different combinations (large mtu on client, small on server, link has lowest mtu, etc). Furthermore, file size is random, even for the first run. Rework this script and always use the same file size on initial run so that at least the first round can be expected to have reproducible behavior. Second round will use random mtu/filesize. Raise the failure limit to that of the file size, this should avoid all errneous test errors. Currently, first fin will remove the offload, so if one peer is already closing remaining data is handled by classic path, which result in larger-than-expected counter and a test failure. Given packet path also counts tcp/ip headers, in case offload is completely broken this test will still fail (as expected). The test counter limit could be made more strict again in the future once flowtable can keep a connection in offloaded state until FINs in both directions were seen. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241022152324.13554-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-29gtp: allow -1 to be specified as file description from userspacePablo Neira Ayuso
Existing user space applications maintained by the Osmocom project are breaking since a recent fix that addresses incorrect error checking. Restore operation for user space programs that specify -1 as file descriptor to skip GTPv0 or GTPv1 only sockets. Fixes: defd8b3c37b0 ("gtp: fix a potential NULL pointer dereference") Reported-by: Pau Espin Pedrol <pespin@sysmocom.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Tested-by: Oliver Smith <osmith@sysmocom.de> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241022144825.66740-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-29mctp i2c: handle NULL header addressMatt Johnston
daddr can be NULL if there is no neighbour table entry present, in that case the tx packet should be dropped. saddr will usually be set by MCTP core, but check for NULL in case a packet is transmitted by a different protocol. Fixes: f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver") Cc: stable@vger.kernel.org Reported-by: Dung Cao <dung@os.amperecomputing.com> Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241022-mctp-i2c-null-dest-v3-1-e929709956c5@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-29ipv4: ip_tunnel: Fix suspicious RCU usage warning in ip_tunnel_find()Ido Schimmel
The per-netns IP tunnel hash table is protected by the RTNL mutex and ip_tunnel_find() is only called from the control path where the mutex is taken. Add a lockdep expression to hlist_for_each_entry_rcu() in ip_tunnel_find() in order to validate that the mutex is held and to silence the suspicious RCU usage warning [1]. [1] WARNING: suspicious RCU usage 6.12.0-rc3-custom-gd95d9a31aceb #139 Not tainted ----------------------------- net/ipv4/ip_tunnel.c:221 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by ip/362: #0: ffffffff86fc7cb0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x377/0xf60 stack backtrace: CPU: 12 UID: 0 PID: 362 Comm: ip Not tainted 6.12.0-rc3-custom-gd95d9a31aceb #139 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl+0xba/0x110 lockdep_rcu_suspicious.cold+0x4f/0xd6 ip_tunnel_find+0x435/0x4d0 ip_tunnel_newlink+0x517/0x7a0 ipgre_newlink+0x14c/0x170 __rtnl_newlink+0x1173/0x19c0 rtnl_newlink+0x6c/0xa0 rtnetlink_rcv_msg+0x3cc/0xf60 netlink_rcv_skb+0x171/0x450 netlink_unicast+0x539/0x7f0 netlink_sendmsg+0x8c1/0xd80 ____sys_sendmsg+0x8f9/0xc20 ___sys_sendmsg+0x197/0x1e0 __sys_sendmsg+0x122/0x1f0 do_syscall_64+0xbb/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241023123009.749764-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-29ipv4: ip_tunnel: Fix suspicious RCU usage warning in ip_tunnel_init_flow()Ido Schimmel
There are code paths from which the function is called without holding the RCU read lock, resulting in a suspicious RCU usage warning [1]. Fix by using l3mdev_master_upper_ifindex_by_index() which will acquire the RCU read lock before calling l3mdev_master_upper_ifindex_by_index_rcu(). [1] WARNING: suspicious RCU usage 6.12.0-rc3-custom-gac8f72681cf2 #141 Not tainted ----------------------------- net/core/dev.c:876 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by ip/361: #0: ffffffff86fc7cb0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x377/0xf60 stack backtrace: CPU: 3 UID: 0 PID: 361 Comm: ip Not tainted 6.12.0-rc3-custom-gac8f72681cf2 #141 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl+0xba/0x110 lockdep_rcu_suspicious.cold+0x4f/0xd6 dev_get_by_index_rcu+0x1d3/0x210 l3mdev_master_upper_ifindex_by_index_rcu+0x2b/0xf0 ip_tunnel_bind_dev+0x72f/0xa00 ip_tunnel_newlink+0x368/0x7a0 ipgre_newlink+0x14c/0x170 __rtnl_newlink+0x1173/0x19c0 rtnl_newlink+0x6c/0xa0 rtnetlink_rcv_msg+0x3cc/0xf60 netlink_rcv_skb+0x171/0x450 netlink_unicast+0x539/0x7f0 netlink_sendmsg+0x8c1/0xd80 ____sys_sendmsg+0x8f9/0xc20 ___sys_sendmsg+0x197/0x1e0 __sys_sendmsg+0x122/0x1f0 do_syscall_64+0xbb/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: db53cd3d88dc ("net: Handle l3mdev in ip_tunnel_init_flow") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20241022063822.462057-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-29bpf: fix filed access without lockJiayuan Chen
The tcp_bpf_recvmsg_parser() function, running in user context, retrieves seq_copied from tcp_sk without holding the socket lock, and stores it in a local variable seq. However, the softirq context can modify tcp_sk->seq_copied concurrently, for example, n tcp_read_sock(). As a result, the seq value is stale when it is assigned back to tcp_sk->copied_seq at the end of tcp_bpf_recvmsg_parser(), leading to incorrect behavior. Due to concurrency, the copied_seq field in tcp_bpf_recvmsg_parser() might be set to an incorrect value (less than the actual copied_seq) at the end of function: 'WRITE_ONCE(tcp->copied_seq, seq)'. This causes the 'offset' to be negative in tcp_read_sock()->tcp_recv_skb() when processing new incoming packets (sk->copied_seq - skb->seq becomes less than 0), and all subsequent packets will be dropped. Signed-off-by: Jiayuan Chen <mrpre@163.com> Link: https://lore.kernel.org/r/20241028065226.35568-1-mrpre@163.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-10-29Merge branch 'intel-wired-lan-driver-fixes-2024-10-21-igb-ice'Paolo Abeni
Jacob Keller says: ==================== Intel Wired LAN Driver Fixes 2024-10-21 (igb, ice) This series includes fixes for the ice and igb drivers. Wander fixes an issue in igb when operating on PREEMPT_RT kernels due to the PREEMPT_RT kernel switching IRQs to be threaded by default. Michal fixes the ice driver to block subfunction port creation when the PF is operating in legacy (non-switchdev) mode. Arkadiusz fixes a crash when loading the ice driver on an E810 LOM which has DPLL enabled. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> ==================== Link: https://patch.msgid.link/20241021-iwl-2024-10-21-iwl-net-fixes-v1-0-a50cb3059f55@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-29ice: fix crash on probe for DPLL enabled E810 LOMArkadiusz Kubalewski
The E810 Lan On Motherboard (LOM) design is vendor specific. Intel provides the reference design, but it is up to vendor on the final product design. For some cases, like Linux DPLL support, the static values defined in the driver does not reflect the actual LOM design. Current implementation of dpll pins is causing the crash on probe of the ice driver for such DPLL enabled E810 LOM designs: WARNING: (...) at drivers/dpll/dpll_core.c:495 dpll_pin_get+0x2c4/0x330 ... Call Trace: <TASK> ? __warn+0x83/0x130 ? dpll_pin_get+0x2c4/0x330 ? report_bug+0x1b7/0x1d0 ? handle_bug+0x42/0x70 ? exc_invalid_op+0x18/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? dpll_pin_get+0x117/0x330 ? dpll_pin_get+0x2c4/0x330 ? dpll_pin_get+0x117/0x330 ice_dpll_get_pins.isra.0+0x52/0xe0 [ice] ... The number of dpll pins enabled by LOM vendor is greater than expected and defined in the driver for Intel designed NICs, which causes the crash. Prevent the crash and allow generic pin initialization within Linux DPLL subsystem for DPLL enabled E810 LOM designs. Newly designed solution for described issue will be based on "per HW design" pin initialization. It requires pin information dynamically acquired from the firmware and is already in progress, planned for next-tree only. Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu") Reviewed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-29ice: block SF port creation in legacy modeMichal Swiatkowski
There is no support for SF in legacy mode. Reflect it in the code. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Fixes: eda69d654c7e ("ice: add basic devlink subfunctions support") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-29igb: Disable threaded IRQ for igb_msix_otherWander Lairson Costa
During testing of SR-IOV, Red Hat QE encountered an issue where the ip link up command intermittently fails for the igbvf interfaces when using the PREEMPT_RT variant. Investigation revealed that e1000_write_posted_mbx returns an error due to the lack of an ACK from e1000_poll_for_ack. The underlying issue arises from the fact that IRQs are threaded by default under PREEMPT_RT. While the exact hardware details are not available, it appears that the IRQ handled by igb_msix_other must be processed before e1000_poll_for_ack times out. However, e1000_write_posted_mbx is called with preemption disabled, leading to a scenario where the IRQ is serviced only after the failure of e1000_write_posted_mbx. To resolve this, we set IRQF_NO_THREAD for the affected interrupt, ensuring that the kernel handles it immediately, thereby preventing the aforementioned error. Reproducer: #!/bin/bash # echo 2 > /sys/class/net/ens14f0/device/sriov_numvfs ipaddr_vlan=3 nic_test=ens14f0 vf=${nic_test}v0 while true; do ip link set ${nic_test} mtu 1500 ip link set ${vf} mtu 1500 ip link set $vf up ip link set ${nic_test} vf 0 vlan ${ipaddr_vlan} ip addr add 172.30.${ipaddr_vlan}.1/24 dev ${vf} ip addr add 2021:db8:${ipaddr_vlan}::1/64 dev ${vf} if ! ip link show $vf | grep 'state UP'; then echo 'Error found' break fi ip link set $vf down done Signed-off-by: Wander Lairson Costa <wander@redhat.com> Fixes: 9d5c824399de ("igb: PCI-Express 82575 Gigabit Ethernet driver") Reported-by: Yuying Ma <yuma@redhat.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-29ASoC: codecs: wcd937x: relax the AUX PDM watchdogAlexey Klimov
On a system with wcd937x, rxmacro and Qualcomm audio DSP, which is pretty common set of devices on Qualcomm platforms, and due to the order of how DAPM widgets are powered on (they are sorted), there is a small time window when wcd937x chip is online and expects the flow of incoming data but rxmacro is not yet online. When wcd937x is programmed to receive data via AUX port then its AUX PDM watchdog is enabled in wcd937x_codec_enable_aux_pa(). If due to some reasons the rxmacro and soundwire machinery are delayed to start streaming data, then there is a chance for this AUX PDM watchdog to reset the wcd937x codec. Such event is not logged as a message and only wcd937x IRQ counter is increased however there could be a lot of other reasons for that IRQ. There is a similar opportunity for such delay during DAPM widgets power down sequence. If wcd937x codec reset happens on the start of the playback, then there will be no sound and if such reset happens at the end of a playback then it may generate additional clicks and pops noises. On qrb4210 RB2 board without any debugging bits the wcd937x resets are sometimes observed at the end of a playback though not always. With some debugging messages or with some tracing enabled the AUX PDM watchdog resets the wcd937x codec at the start of a playback and there is no sound output at all. In this patch: - TIMEOUT_SEL bit in PDM_WD_CTL2 register is set to increase the watchdog reset delay to 100ms which eliminates the AUX PDM watchdog IRQs on qrb4210 RB2 board completely and decreases the number of unwanted clicks noises; - HOLD_OFF bit postpones triggering such watchdog IRQ till wcd937x codec reset which usually happens at the end of a playback. This allows to actually output some sound in case of debugging. Cc: Adam Skladowski <a39.skl@gmail.com> Cc: Mohammad Rafi Shaik <quic_mohs@quicinc.com> Cc: Prasad Kumpatla <quic_pkumpatl@quicinc.com> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://patch.msgid.link/20241022033132.787416-3-alexey.klimov@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-29ASoC: codecs: wcd937x: add missing LO Switch controlAlexey Klimov
The wcd937x supports also AUX input but the control that sets correct soundwire port for this is missing. This control is required for audio playback, for instance, on qrb4210 RB2 board as well as on other SoCs. Reported-by: Adam Skladowski <a39.skl@gmail.com> Reported-by: Prasad Kumpatla <quic_pkumpatl@quicinc.com> Suggested-by: Adam Skladowski <a39.skl@gmail.com> Suggested-by: Prasad Kumpatla <quic_pkumpatl@quicinc.com> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Mohammad Rafi Shaik <quic_mohs@quicinc.com> Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://patch.msgid.link/20241022033132.787416-2-alexey.klimov@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-29ASoC: dt-bindings: rockchip,rk3308-codec: add port propertyDmitry Yashin
Fix DTB warnings when rk3308-codec used with audio-graph-card by documenting port property: codec@ff560000: 'port' does not match any of the regexes: 'pinctrl-[0-9]+' Signed-off-by: Dmitry Yashin <dmt.yashin@gmail.com> Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Link: https://patch.msgid.link/20241028213314.476776-2-dmt.yashin@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-29net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB dataFurong Xu
In case the non-paged data of a SKB carries protocol header and protocol payload to be transmitted on a certain platform that the DMA AXI address width is configured to 40-bit/48-bit, or the size of the non-paged data is bigger than TSO_MAX_BUFF_SIZE on a certain platform that the DMA AXI address width is configured to 32-bit, then this SKB requires at least two DMA transmit descriptors to serve it. For example, three descriptors are allocated to split one DMA buffer mapped from one piece of non-paged data: dma_desc[N + 0], dma_desc[N + 1], dma_desc[N + 2]. Then three elements of tx_q->tx_skbuff_dma[] will be allocated to hold extra information to be reused in stmmac_tx_clean(): tx_q->tx_skbuff_dma[N + 0], tx_q->tx_skbuff_dma[N + 1], tx_q->tx_skbuff_dma[N + 2]. Now we focus on tx_q->tx_skbuff_dma[entry].buf, which is the DMA buffer address returned by DMA mapping call. stmmac_tx_clean() will try to unmap the DMA buffer _ONLY_IF_ tx_q->tx_skbuff_dma[entry].buf is a valid buffer address. The expected behavior that saves DMA buffer address of this non-paged data to tx_q->tx_skbuff_dma[entry].buf is: tx_q->tx_skbuff_dma[N + 0].buf = NULL; tx_q->tx_skbuff_dma[N + 1].buf = NULL; tx_q->tx_skbuff_dma[N + 2].buf = dma_map_single(); Unfortunately, the current code misbehaves like this: tx_q->tx_skbuff_dma[N + 0].buf = dma_map_single(); tx_q->tx_skbuff_dma[N + 1].buf = NULL; tx_q->tx_skbuff_dma[N + 2].buf = NULL; On the stmmac_tx_clean() side, when dma_desc[N + 0] is closed by the DMA engine, tx_q->tx_skbuff_dma[N + 0].buf is a valid buffer address obviously, then the DMA buffer will be unmapped immediately. There may be a rare case that the DMA engine does not finish the pending dma_desc[N + 1], dma_desc[N + 2] yet. Now things will go horribly wrong, DMA is going to access a unmapped/unreferenced memory region, corrupted data will be transmited or iommu fault will be triggered :( In contrast, the for-loop that maps SKB fragments behaves perfectly as expected, and that is how the driver should do for both non-paged data and paged frags actually. This patch corrects DMA map/unmap sequences by fixing the array index for tx_q->tx_skbuff_dma[entry].buf when assigning DMA buffer address. Tested and verified on DWXGMAC CORE 3.20a Reported-by: Suraj Jaiswal <quic_jsuraj@quicinc.com> Fixes: f748be531d70 ("stmmac: support new GMAC4") Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241021061023.2162701-1-0x1207@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-29net: stmmac: dwmac4: Fix high address display by updating reg_space[] from ↵Ley Foon Tan
register values The high address will display as 0 if the driver does not set the reg_space[]. To fix this, read the high address registers and update the reg_space[] accordingly. Fixes: fbf68229ffe7 ("net: stmmac: unify registers dumps methods") Signed-off-by: Ley Foon Tan <leyfoon.tan@starfivetech.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241021054625.1791965-1-leyfoon.tan@starfivetech.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>