summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-06-02xdp: Remove unused mem_return_failed eventSteven Rostedt
The change to allow page_pool to handle its own page destruction instead of relying on XDP removed the trace_mem_return_failed() tracepoint caller, but did not remove the mem_return_failed trace event. As trace events take up memory when they are created regardless of if they are used or not, having this unused event around wastes around 5K of memory. Remove the unused event. Link: https://lore.kernel.org/all/20250529130138.544ffec4@gandalf.local.home/ Cc: netdev <netdev@vger.kernel.org> Cc: Jonathan Lemon <jonathan.lemon@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/20250529160550.1f888b15@gandalf.local.home Fixes: c3f812cea0d7 ("page_pool: do not release pool until inflight == 0.") Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-06-02ftrace: Don't allocate ftrace module map if ftrace is disabledYe Bin
If ftrace is disabled, it is meaningless to allocate a module map. Add a check in allocate_ftrace_mod_map() to not allocate if ftrace is disabled. Link: https://lore.kernel.org/20250529111955.2349189-3-yebin@huaweicloud.com Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-06-02ftrace: Fix UAF when lookup kallsym after ftrace disabledYe Bin
The following issue happens with a buggy module: BUG: unable to handle page fault for address: ffffffffc05d0218 PGD 1bd66f067 P4D 1bd66f067 PUD 1bd671067 PMD 101808067 PTE 0 Oops: Oops: 0000 [#1] SMP KASAN PTI Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS RIP: 0010:sized_strscpy+0x81/0x2f0 RSP: 0018:ffff88812d76fa08 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffffffc0601010 RCX: dffffc0000000000 RDX: 0000000000000038 RSI: dffffc0000000000 RDI: ffff88812608da2d RBP: 8080808080808080 R08: ffff88812608da2d R09: ffff88812608da68 R10: ffff88812608d82d R11: ffff88812608d810 R12: 0000000000000038 R13: ffff88812608da2d R14: ffffffffc05d0218 R15: fefefefefefefeff FS: 00007fef552de740(0000) GS:ffff8884251c7000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffc05d0218 CR3: 00000001146f0000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ftrace_mod_get_kallsym+0x1ac/0x590 update_iter_mod+0x239/0x5b0 s_next+0x5b/0xa0 seq_read_iter+0x8c9/0x1070 seq_read+0x249/0x3b0 proc_reg_read+0x1b0/0x280 vfs_read+0x17f/0x920 ksys_read+0xf3/0x1c0 do_syscall_64+0x5f/0x2e0 entry_SYSCALL_64_after_hwframe+0x76/0x7e The above issue may happen as follows: (1) Add kprobe tracepoint; (2) insmod test.ko; (3) Module triggers ftrace disabled; (4) rmmod test.ko; (5) cat /proc/kallsyms; --> Will trigger UAF as test.ko already removed; ftrace_mod_get_kallsym() ... strscpy(module_name, mod_map->mod->name, MODULE_NAME_LEN); ... The problem is when a module triggers an issue with ftrace and sets ftrace_disable. The ftrace_disable is set when an anomaly is discovered and to prevent any more damage, ftrace stops all text modification. The issue that happened was that the ftrace_disable stops more than just the text modification. When a module is loaded, its init functions can also be traced. Because kallsyms deletes the init functions after a module has loaded, ftrace saves them when the module is loaded and function tracing is enabled. This allows the output of the function trace to show the init function names instead of just their raw memory addresses. When a module is removed, ftrace_release_mod() is called, and if ftrace_disable is set, it just returns without doing anything more. The problem here is that it leaves the mod_list still around and if kallsyms is called, it will call into this code and access the module memory that has already been freed as it will return: strscpy(module_name, mod_map->mod->name, MODULE_NAME_LEN); Where the "mod" no longer exists and triggers a UAF bug. Link: https://lore.kernel.org/all/20250523135452.626d8dcd@gandalf.local.home/ Cc: stable@vger.kernel.org Fixes: aba4b5c22cba ("ftrace: Save module init functions kallsyms symbols for tracing") Link: https://lore.kernel.org/20250529111955.2349189-2-yebin@huaweicloud.com Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-06-02Merge tag 'nand/for-6.16' into mtd/nextMiquel Raynal
The SPI NAND subsystem has seen the introduction of DTR operations (the equivalent of DDR transfers), which involved quite a few preparation patches for clarifying macro names. In the raw NAND subsystem, the brcmnand driver has been "fixed" for old legacy SoCs with an update of the ->exec_op() hook, there has been the introduction of a new controller driver named Loongson-1, and the Qualcomm driver has received quite a few misc fixes as well as a new compatible. Aside from this, there is the usual load of misc improvement and fixes.
2025-06-02Merge tag 'spi-nor/for-6.16' into mtd/nextMiquel Raynal
SPI NOR changes for 6.16 Notable changes: - Cleanup some Macronix flash entries. - Add SFDP table fixups for Macronix MX25L3255E.
2025-06-02bcachefs: Run check_dirents second time if requiredKent Overstreet
If we move a key backwards, we'll need a second pass to run the rest of the fsck checks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02bcachefs: Run snapshot deletion out of system_long_wqKent Overstreet
We don't want this running out of the same workqueue, and blocking, writes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02bcachefs: Make check_key_has_snapshot saferKent Overstreet
Snapshot deletion v2 added sentinal values for deleted snapshots, so "key for deleted snapshot" - i.e. snapshot deletion missed something - is safe to repair automatically. But if we find a key for a missing snapshot we have no idea what happened, and we shouldn't delete it unless we're very sure that everything else is consistent. So hook it up to the new bch2_require_recovery_pass(), we'll now only delete if snapshots and subvolumes have recenlty been checked. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02bcachefs: BCH_RECOVERY_PASS_NO_RATELIMITKent Overstreet
Add a superblock flag to temporarily disable ratelimiting for a recovery pass. This will be used to make check_key_has_snapshot safer: we don't want to delete a key for a missing snapshot unless we know that the snapshots and subvolumes btrees are consistent, i.e. check_snapshots and check_subvols have run recently. Changing those btrees - creating/deleting a subvolume or snapshot - will set the "disable ratelimit" flag, i.e. ensuring that those passes run if check_key_has_snapshot discovers an error. We're only disabling ratelimiting in the snapshot/subvol delete paths, we're not so concerned about the create paths. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02bcachefs: bch2_require_recovery_pass()Kent Overstreet
Add a helper for requiring that a recovery pass has already run: either run it directly, if we're still in recovery, or if we're not in recovery check if it has run recently and schedule it if it hasn't. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02bcachefs: bch_err_throw()Kent Overstreet
Add a tracepoint for any time we return an error and unwind. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02bcachefs: Repair code for directory i_sizeKent Overstreet
We had a bug due due to an incomplete revert of the patch implementing directory i_size (summing up the size of the dirents), leading to completely screwy i_size values that underflow. Most userspace programs don't seem to care (e.g. du ignores it), but it turns out this broke sshfs, so needs to be repaired. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02bcachefs: Kill un-reverted directory i_size codeKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02bcachefs: Delete redundant fsck_err()Kent Overstreet
'inode_has_wrong_backpointer'; we have more specific errors for every case afterwards. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02bcachefs: Convert BUG() to errorKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02PCI: WARN (not BUG()) when we fail to assign optional resourcesIlpo Järvinen
Resource fitting/assignment code checks if there's a remainder in add_list (aka. realloc_head in the inner functions) using BUG_ON(). This problem typically results in a mere PCI device resource assignment failure which does not warrant using BUG_ON(). The machine could well come up usable even if this condition occurs because the realloc_head relates to resources which are optional anyway. Change BUG_ON() to WARN_ON_ONCE() and free the list if it's not empty. [bhelgaas: subject] Reported-by: Tudor Ambarus <tudor.ambarus@linaro.org> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/linux-pci/5f103643-5e1c-43c6-b8fe-9617d3b5447c@linaro.org Link: https://lore.kernel.org/r/20250511215223.7131-1-ilpo.jarvinen@linux.intel.com
2025-06-02PCI: Remove unused pci_printk()Ilpo Järvinen
include/linux/pci.h provides low-level pci_printk() interface that is not used since the commits fab874e12593 ("PCI/AER: Descope pci_printk() to aer_printk()") and 588021b28642 ("PCI: shpchp: Remove 'shpchp_debug' module parameter"). PCI logging should not use pci_printk() but pci_*() wrappers that follow the usual logging wrapper patterns. Remove pci_printk(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20250407101215.1376-1-ilpo.jarvinen@linux.intel.com
2025-06-02um: pass FD for memory operations when neededBenjamin Berg
Instead of always sharing the FDs with the userspace process, only hand over the FDs needed for mmap when required. The idea is that userspace might be able to force the stub into executing an mmap syscall, however, it will not be able to manipulate the control flow sufficiently to have access to an FD that would allow mapping arbitrary memory. Security wise, we need to be sure that only the expected syscalls are executed after the kernel sends FDs through the socket. This is currently not the case, as userspace can trivially jump to the rt_sigreturn syscall instruction to execute any syscall that the stub is permitted to do. With this, it can trick the kernel to send the FD, which in turn allows userspace to freely map any physical memory. As such, this is currently *not* secure. However, in principle the approach should be fine with a more strict SECCOMP filter and a careful review of the stub control flow (as userspace can prepare a stack). With some care, it is likely possible to extend the security model to SMP if desired. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20250602130052.545733-8-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-02um: Add SECCOMP support detection and initializationBenjamin Berg
This detects seccomp support, sets the global using_seccomp variable and initilizes the exec registers. The support is only enabled if the seccomp= kernel parameter is set to either "on" or "auto". With "auto" a fallback to ptrace mode will happen if initialization failed. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20250602130052.545733-7-benjamin@sipsolutions.net [extend help with Kconfig text from v2, use exit syscall instead of libc, remove unneeded mctx_offset assignment, disable on 32-bit for now] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-02um: Implement kernel side of SECCOMP based process handlingBenjamin Berg
This adds the kernel side of the seccomp based process handling. Co-authored-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20250602130052.545733-6-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-02um: Track userspace children dying in SECCOMP modeBenjamin Berg
When in seccomp mode, we would hang forever on the futex if a child has died unexpectedly. In contrast, ptrace mode will notice it and kill the corresponding thread when it fails to run it. Fix this issue using a new IRQ that is fired after a SIGCHLD and keeping an (internal) list of all MMs. In the IRQ handler, find the affected MM and set its PID to -1 as well as the futex variable to FUTEX_IN_KERN. This, together with futex returning -EINTR after the signal is sufficient to implement a race-free detection of a child dying. Note that this also enables IRQ handling while starting a userspace process. This should be safe and SECCOMP requires the IRQ in case the process does not come up properly. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20250602130052.545733-5-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-02um: Add helper functions to get/set state for SECCOMPBenjamin Berg
When not using ptrace, we need to both save and restore registers through the mcontext as provided by the host kernel to our signal handlers. Add corresponding functions to store the state to an mcontext and helpers to access the mcontext of the subprocess through the stub data. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20250602130052.545733-4-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-02um: Add stub side of SECCOMP/futex based process handlingBenjamin Berg
This adds the stub side for the new seccomp process management code. In this case we do register save/restore through the signal handler mcontext. Add special code for handling TLS, which for x86_64 means setting the FS_BASE/GS_BASE registers while for i386 it means calling the set_thread_area syscall. Co-authored-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20250602130052.545733-3-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-02um: Move faultinfo extraction into userspace routineBenjamin Berg
The segv handler is called slightly differently depending on whether PTRACE_FULL_FAULTINFO is set or not (32bit vs. 64bit). The only difference is that we don't try to pass the registers and instruction pointer to the segv handler. It would be good to either document or remove the difference, but I do not know why this difference exists. And, passing NULL can even result in a crash. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> Link: https://patch.msgid.link/20250602130052.545733-2-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-02arm64: Add override for MPAMXi Ruoyao
As the message of the commit 09e6b306f3ba ("arm64: cpufeature: discover CPU support for MPAM") already states, if a buggy firmware fails to either enable MPAM or emulate the trap as if it were disabled, the kernel will just fail to boot. While upgrading the firmware should be the best solution, we have some hardware of which the vendor have made no response 2 months after we requested a firmware update. Allow overriding it so our devices don't become some e-waste. Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Cc: Mingcong Bai <jeffbai@aosc.io> Cc: Shaopeng Tan <tan.shaopeng@fujitsu.com> Cc: Ben Horgan <ben.horgan@arm.com> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250602043723.216338-1-xry111@xry111.site Signed-off-by: Will Deacon <will@kernel.org>
2025-06-02dm-table: check BLK_FEAT_ATOMIC_WRITES inside limits_lockBenjamin Marzinski
dm_set_device_limits() should check q->limits.features for BLK_FEAT_ATOMIC_WRITES while holding q->limits_lock, like it does for the rest of the queue limits. Fixes: b7c18b17a173 ("dm-table: Set BLK_FEAT_ATOMIC_WRITES for target queue limits") Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-06-02arm64/mm: Close theoretical race where stale TLB entry remains validRyan Roberts
Commit 3ea277194daa ("mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries") describes a race that, prior to the commit, could occur between reclaim and operations such as mprotect() when using reclaim's tlbbatch mechanism. See that commit for details but the summary is: """ Nadav Amit identified a theoritical race between page reclaim and mprotect due to TLB flushes being batched outside of the PTL being held. He described the race as follows: CPU0 CPU1 ---- ---- user accesses memory using RW PTE [PTE now cached in TLB] try_to_unmap_one() ==> ptep_get_and_clear() ==> set_tlb_ubc_flush_pending() mprotect(addr, PROT_READ) ==> change_pte_range() ==> [ PTE non-present - no flush ] user writes using cached RW PTE ... try_to_unmap_flush() """ The solution was to insert flush_tlb_batched_pending() in mprotect() and friends to explcitly drain any pending reclaim TLB flushes. In the modern version of this solution, arch_flush_tlb_batched_pending() is called to do that synchronisation. arm64's tlbbatch implementation simply issues TLBIs at queue-time (arch_tlbbatch_add_pending()), eliding the trailing dsb(ish). The trailing dsb(ish) is finally issued in arch_tlbbatch_flush() at the end of the batch to wait for all the issued TLBIs to complete. Now, the Arm ARM states: """ The completion of the TLB maintenance instruction is guaranteed only by the execution of a DSB by the observer that performed the TLB maintenance instruction. The execution of a DSB by a different observer does not have this effect, even if the DSB is known to be executed after the TLB maintenance instruction is observed by that different observer. """ arch_tlbbatch_add_pending() and arch_tlbbatch_flush() conform to this requirement because they are called from the same task (either kswapd or caller of madvise(MADV_PAGEOUT)), so either they are on the same CPU or if the task was migrated, __switch_to() contains an extra dsb(ish). HOWEVER, arm64's arch_flush_tlb_batched_pending() is also implemented as a dsb(ish). But this may be running on a CPU remote from the one that issued the outstanding TLBIs. So there is no architectural gurantee of synchonization. Therefore we are still vulnerable to the theoretical race described in Commit 3ea277194daa ("mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries"). Fix this by flushing the entire mm in arch_flush_tlb_batched_pending(). This aligns with what the other arches that implement the tlbbatch feature do. Cc: <stable@vger.kernel.org> Fixes: 43b3dfdd0455 ("arm64: support batched/deferred tlb shootdown during page reclamation/migration") Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20250530152445.2430295-1-ryan.roberts@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-06-02arm64: Work around convergence issue with LLD linkerArd Biesheuvel
LLD will occasionally error out with a '__init_end does not converge' error if INIT_IDMAP_DIR_SIZE is defined in terms of _end, as this results in a circular dependency. Counter this by dimensioning the initial IDMAP page tables based on a new boundary marker 'kimage_limit', and define it such that its value should not change as a result of the initdata segment being pushed over a 64k segment boundary due to changes in INIT_IDMAP_DIR_SIZE, provided that its value doesn't change by more than 2M between linker passes. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250531123005.3866382-2-ardb+git@google.com Signed-off-by: Will Deacon <will@kernel.org>
2025-06-02arm64: Disable LLD linker ASSERT()s for the time beingArd Biesheuvel
It turns out [1] that the way LLD handles ASSERT()s in the linker script can result in spurious failures, so disable them for the newly introduced BSS symbol export checks. Since we're not aware of any issues with the existing assertions in vmlinux.lds.S, leave those alone for now so that they can continue to provide useful coverage. A linker fix [2] is due to land in version 21 of LLD. Link: https://lore.kernel.org/r/202505261019.OUlitN6m-lkp@intel.com [1] Link: https://github.com/llvm/llvm-project/commit/5859863bab7f [2] Link: https://github.com/ClangBuiltLinux/linux/issues/2094 Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505261019.OUlitN6m-lkp@intel.com/ Link: https://lore.kernel.org/r/20250529073507.2984959-2-ardb+git@google.com Signed-off-by: Will Deacon <will@kernel.org>
2025-06-02net: fix udp gso skb_segment after pull from frag_listShiming Cheng
Commit a1e40ac5b5e9 ("net: gso: fix udp gso fraglist segmentation after pull from frag_list") detected invalid geometry in frag_list skbs and redirects them from skb_segment_list to more robust skb_segment. But some packets with modified geometry can also hit bugs in that code. We don't know how many such cases exist. Addressing each one by one also requires touching the complex skb_segment code, which risks introducing bugs for other types of skbs. Instead, linearize all these packets that fail the basic invariants on gso fraglist skbs. That is more robust. If only part of the fraglist payload is pulled into head_skb, it will always cause exception when splitting skbs by skb_segment. For detailed call stack information, see below. Valid SKB_GSO_FRAGLIST skbs - consist of two or more segments - the head_skb holds the protocol headers plus first gso_size - one or more frag_list skbs hold exactly one segment - all but the last must be gso_size Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can modify fraglist skbs, breaking these invariants. In extreme cases they pull one part of data into skb linear. For UDP, this causes three payloads with lengths of (11,11,10) bytes were pulled tail to become (12,10,10) bytes. The skbs no longer meets the above SKB_GSO_FRAGLIST conditions because payload was pulled into head_skb, it needs to be linearized before pass to regular skb_segment. skb_segment+0xcd0/0xd14 __udp_gso_segment+0x334/0x5f4 udp4_ufo_fragment+0x118/0x15c inet_gso_segment+0x164/0x338 skb_mac_gso_segment+0xc4/0x13c __skb_gso_segment+0xc4/0x124 validate_xmit_skb+0x9c/0x2c0 validate_xmit_skb_list+0x4c/0x80 sch_direct_xmit+0x70/0x404 __dev_queue_xmit+0x64c/0xe5c neigh_resolve_output+0x178/0x1c4 ip_finish_output2+0x37c/0x47c __ip_finish_output+0x194/0x240 ip_finish_output+0x20/0xf4 ip_output+0x100/0x1a0 NF_HOOK+0xc4/0x16c ip_forward+0x314/0x32c ip_rcv+0x90/0x118 __netif_receive_skb+0x74/0x124 process_backlog+0xe8/0x1a4 __napi_poll+0x5c/0x1f8 net_rx_action+0x154/0x314 handle_softirqs+0x154/0x4b8 [118.376811] [C201134] rxq0_pus: [name:bug&]kernel BUG at net/core/skbuff.c:4278! [118.376829] [C201134] rxq0_pus: [name:traps&]Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP [118.470774] [C201134] rxq0_pus: [name:mrdump&]Kernel Offset: 0x178cc00000 from 0xffffffc008000000 [118.470810] [C201134] rxq0_pus: [name:mrdump&]PHYS_OFFSET: 0x40000000 [118.470827] [C201134] rxq0_pus: [name:mrdump&]pstate: 60400005 (nZCv daif +PAN -UAO) [118.470848] [C201134] rxq0_pus: [name:mrdump&]pc : [0xffffffd79598aefc] skb_segment+0xcd0/0xd14 [118.470900] [C201134] rxq0_pus: [name:mrdump&]lr : [0xffffffd79598a5e8] skb_segment+0x3bc/0xd14 [118.470928] [C201134] rxq0_pus: [name:mrdump&]sp : ffffffc008013770 Fixes: a1e40ac5b5e9 ("gso: fix udp gso fraglist segmentation after pull from frag_list") Signed-off-by: Shiming Cheng <shiming.cheng@mediatek.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-06-02um: vector: Use mac_pton() for MAC address parsingTiwei Bie
Use mac_pton() instead of custom approach. Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20250506045117.1896661-3-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-02um: vector: Clean up and modernize log messagesTiwei Bie
Use pr_*() and netdev_*() to print log messages. While at it, join split messages for easier grepping. Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20250506045117.1896661-2-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-02um: chan_kern: use raw spinlock for irqs_to_free_lockJohannes Berg
Since this is called deep in the ARCH=um IRQ infrastructure it must use a raw spinlock. It's not really part of the driver, but rather the core UML IRQ code. Link: https://patch.msgid.link/20250505103358.ae7dc659f8b4.I64ca7aece30e0b4b0b5b35ad89cdd63db197c0ce@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-02MAINTAINERS: remove obsolete file entry in TUN/TAP DRIVERLukas Bulwahn
Commit 65eaac591b75 ("um: Remove obsolete legacy network transports") removes the directory arch/um/os-Linux/drivers/, but misses to remove the file entry in TUN/TAP DRIVER referring to that directory. Remove this obsolete file entry. While at it, put the section name in capital letters. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com> Reviewed-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20250507071004.35120-1-lukas.bulwahn@redhat.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-02um: Fix tgkill compile error on old host OSesYongting Lin
tgkill is a quite old syscall since kernel 2.5.75, but unfortunately glibc doesn't support it before 2.30. Thus some systems fail to compile the latest UserMode Linux. Here is the compile error I encountered when I tried to compile UML in my system shipped with glibc-2.28. CALL scripts/checksyscalls.sh CC arch/um/os-Linux/sigio.o In file included from arch/um/os-Linux/sigio.c:17: arch/um/os-Linux/sigio.c: In function ‘write_sigio_thread’: arch/um/os-Linux/sigio.c:49:19: error: implicit declaration of function ‘tgkill’; did you mean ‘kill’? [-Werror=implicit-function-declaration] CATCH_EINTR(r = tgkill(pid, pid, SIGIO)); ^~~~~~ ./arch/um/include/shared/os.h:21:48: note: in definition of macro ‘CATCH_EINTR’ #define CATCH_EINTR(expr) while ((errno = 0, ((expr) < 0)) && (errno == EINTR)) ^~~~ cc1: some warnings being treated as errors Fix it by Replacing glibc call with raw syscall. Fixes: 33c9da5dfb18 ("um: Rewrite the sigio workaround based on epoll and tgkill") Signed-off-by: Yongting Lin <linyongting@gmail.com> Link: https://patch.msgid.link/20250527151222.40371-1-linyongting@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-02um: stop using PCI port I/OArnd Bergmann
arch/um is one of the last users of CONFIG_GENERIC_IOMAP, but upon closer look it appears that the PCI host bridge does not register any port I/O, and the absense of both custom inb/outb functions and a PCI_IOBASE constant means that actually trying to use port I/O results on a NULL pointer access. Build testing with clang confirms this by warning about this exact problem: include/asm-generic/io.h:549:31: error: performing pointer arithmetic on a null pointer has undefined behavior [-Werror,-Wnull-pointer-arithmetic] 549 | val = __raw_readb(PCI_IOBASE + addr); | ~~~~~~~~~~ ^ Remove all the Kconfig selects that refer to legacy port I/O and instead just build the normal MMIO path that is emulated by the virtio PCI host. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250509084125.1488601-1-arnd@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-02Merge tag 'kvmarm-fixes-6.16-1' of ↵Paolo Bonzini
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.16, take #1 - Make the irqbypass hooks resilient to changes in the GSI<->MSI routing, avoiding behind stale vLPI mappings being left behind. The fix is to resolve the VGIC IRQ using the host IRQ (which is stable) and nuking the vLPI mapping upon a routing change. - Close another VGIC race where vCPU creation races with VGIC creation, leading to in-flight vCPUs entering the kernel w/o private IRQs allocated. - Fix a build issue triggered by the recently added workaround for Ampere's AC04_CPU_23 erratum. - Correctly sign-extend the VA when emulating a TLBI instruction potentially targeting a VNCR mapping. - Avoid dereferencing a NULL pointer in the VGIC debug code, which can happen if the device doesn't have any mapping yet.
2025-06-02rtmutex_api: provide correct extern functionsPaolo Bonzini
Commit fb49f07ba1d9 ("locking/mutex: implement mutex_lock_killable_nest_lock") changed the set of functions that mutex.c defines when CONFIG_DEBUG_LOCK_ALLOC is set. - it removed the "extern" declaration of mutex_lock_killable_nested from include/linux/mutex.h, and replaced it with a macro since it could be treated as a special case of _mutex_lock_killable. It also removed a definition of the function in kernel/locking/mutex.c. - likewise, it replaced mutex_trylock() with the more generic mutex_trylock_nest_lock() and replaced mutex_trylock() with a macro. However, it left the old definitions in place in kernel/locking/rtmutex_api.c, which causes failures when building with CONFIG_RT_MUTEXES=y. Bring over the changes. Fixes: fb49f07ba1d9 ("locking/mutex: implement mutex_lock_killable_nest_lock") Reported-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-06-01Merge branch 'next' into for-linusDmitry Torokhov
Prepare input updates for 6.16 merge window.
2025-06-01smb: client: use ParentLeaseKey in cifs_do_createHenrique Carvalho
Implement ParentLeaseKey logic in cifs_do_create() by looking up the parent cfid, copying its lease key into the fid struct, and setting the appropriate lease flag. Fixes: f047390a097e ("CIFS: Add create lease v2 context for SMB3") Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-01smb: client: use ParentLeaseKey in open_cached_dirHenrique Carvalho
Implement ParentLeaseKey logic in open_cached_dir() by looking up the parent cfid, copying its lease key into the fid struct, and setting the appropriate lease flag. Fixes: f047390a097e ("CIFS: Add create lease v2 context for SMB3") Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-01smb: client: add ParentLeaseKey supportHenrique Carvalho
According to MS-SMB2 3.2.4.3.8, when opening a file the client must lookup its parent directory, copy that entry’s LeaseKey into ParentLeaseKey, and set SMB2_LEASE_FLAG_PARENT_LEASE_KEY_SET. Extend lease context functions to carry a parent_lease_key and lease_flags and to add them to the lease context buffer accordingly in smb3_create_lease_buf. Also add a parent_lease_key field to struct cifs_fid and lease_flags to cifs_open_parms. Only applies to the SMB 3.x dialect family. Fixes: f047390a097e ("CIFS: Add create lease v2 context for SMB3") Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-01cifs: Fix cifs_query_path_info() for Windows NT serversPali Rohár
For TRANS2 QUERY_PATH_INFO request when the path does not exist, the Windows NT SMB server returns error response STATUS_OBJECT_NAME_NOT_FOUND or ERRDOS/ERRbadfile without the SMBFLG_RESPONSE flag set. Similarly it returns STATUS_DELETE_PENDING when the file is being deleted. And looks like that any error response from TRANS2 QUERY_PATH_INFO does not have SMBFLG_RESPONSE flag set. So relax check in check_smb_hdr() for detecting if the packet is response for this special case. This change fixes stat() operation against Windows NT SMB servers and also all operations which depends on -ENOENT result from stat like creat() or mkdir(). Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-01cifs: Fix validation of SMB1 query reparse point responsePali Rohár
Validate the SMB1 query reparse point response per [MS-CIFS] section 2.2.7.2 NT_TRANSACT_IOCTL. NT_TRANSACT_IOCTL response contains one word long setup data after which is ByteCount member. So check that SetupCount is 1 before trying to read and use ByteCount member. Output setup data contains ReturnedDataLen member which is the output length of executed IOCTL command by remote system. So check that output was not truncated before transferring over network. Change MaxSetupCount of NT_TRANSACT_IOCTL request from 4 to 1 as io_rsp structure already expects one word long output setup data. This should prevent server sending incompatible structure (in case it would be extended in future, which is unlikely). Change MaxParameterCount of NT_TRANSACT_IOCTL request from 2 to 0 as NT IOCTL does not have any documented output parameters and this function does not parse any output parameters at all. Fixes: ed3e0a149b58 ("smb: client: implement ->query_reparse_point() for SMB1") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-01cifs: Correctly set SMB1 SessionKey field in Session Setup RequestPali Rohár
[MS-CIFS] specification in section 2.2.4.53.1 where is described SMB_COM_SESSION_SETUP_ANDX Request, for SessionKey field says: The client MUST set this field to be equal to the SessionKey field in the SMB_COM_NEGOTIATE Response for this SMB connection. Linux SMB client currently set this field to zero. This is working fine against Windows NT SMB servers thanks to [MS-CIFS] product behavior <94>: Windows NT Server ignores the client's SessionKey. For compatibility with [MS-CIFS], set this SessionKey field in Session Setup Request to value retrieved from Negotiate response. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-01cifs: Fix encoding of SMB1 Session Setup NTLMSSP Request in non-UNICODE modePali Rohár
SMB1 Session Setup NTLMSSP Request in non-UNICODE mode is similar to UNICODE mode, just strings are encoded in ASCII and not in UTF-16. With this change it is possible to setup SMB1 session with NTLM authentication in non-UNICODE mode with Windows SMB server. This change fixes mounting SMB1 servers with -o nounicode mount option together with -o sec=ntlmssp mount option (which is the default sec=). Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-02rtc: mt6359: Add mt6357 supportAlexandre Mergnat
The MT6357 PMIC contains the same RTC as MT6358 which allows to add support for it trivially by just complementing the list of compatibles. Signed-off-by: Alexandre Mergnat <amergnat@baylibre.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Tested-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://lore.kernel.org/r/20250428-rtc-mt6357-v1-1-31f673b0a723@baylibre.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2025-06-02rtc: test: Test date conversion for dates starting in 1900Uwe Kleine-König
While the RTC framework intends to only handle dates after 1970 for consumers, time conversion must also work for earlier dates to cover e.g. storing dates beyond an RTC's range_max. This is most relevant for the rtc-mt6397 driver that has range_min = RTC_TIMESTAMP_BEGIN_1900; range_max = mktime64(2027, 12, 31, 23, 59, 59); and so needs working support for timestamps in 1900 starting in less than three years. So shift the tested interval of timestamps to also cover years 1900 to 1970. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://lore.kernel.org/r/20250428-enable-rtc-v4-5-2b2f7e3f9349@baylibre.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2025-06-02rtc: test: Also test time and wday outcome of rtc_time64_to_tm()Uwe Kleine-König
To cover calculation of the time and wday in the rtc kunit test also check tm_hour, tm_min, tm_sec and tm_wday of the rtc_time calculated by rtc_time64_to_tm(). There are no surprises, the two tests making use of rtc_time64_to_tm_test_date_range() continue to succeed. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://lore.kernel.org/r/20250428-enable-rtc-v4-4-2b2f7e3f9349@baylibre.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2025-06-02rtc: test: Emit the seconds-since-1970 value instead of days-since-1970Uwe Kleine-König
This is easier to handle because you can just consult date(1) to convert between a seconds-since-1970 value and a date string: $ date --utc -d @3661 Thu Jan 1 01:01:01 AM UTC 1970 $ date -d "Jan 1 12:00:00 AM UTC 1900" +%s -2208988800 The intended side effect is that this prepares the test for dates before 1970. The division of a negative value by 86400 doesn't result in the desired days-since-1970 value as e.g. secs=-82739 should map to days=-1. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://lore.kernel.org/r/20250428-enable-rtc-v4-3-2b2f7e3f9349@baylibre.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>