Age | Commit message (Collapse) | Author |
|
The change to allow page_pool to handle its own page destruction instead
of relying on XDP removed the trace_mem_return_failed() tracepoint caller,
but did not remove the mem_return_failed trace event. As trace events take
up memory when they are created regardless of if they are used or not,
having this unused event around wastes around 5K of memory.
Remove the unused event.
Link: https://lore.kernel.org/all/20250529130138.544ffec4@gandalf.local.home/
Cc: netdev <netdev@vger.kernel.org>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/20250529160550.1f888b15@gandalf.local.home
Fixes: c3f812cea0d7 ("page_pool: do not release pool until inflight == 0.")
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
If ftrace is disabled, it is meaningless to allocate a module map.
Add a check in allocate_ftrace_mod_map() to not allocate if ftrace is
disabled.
Link: https://lore.kernel.org/20250529111955.2349189-3-yebin@huaweicloud.com
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
The following issue happens with a buggy module:
BUG: unable to handle page fault for address: ffffffffc05d0218
PGD 1bd66f067 P4D 1bd66f067 PUD 1bd671067 PMD 101808067 PTE 0
Oops: Oops: 0000 [#1] SMP KASAN PTI
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
RIP: 0010:sized_strscpy+0x81/0x2f0
RSP: 0018:ffff88812d76fa08 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffffffc0601010 RCX: dffffc0000000000
RDX: 0000000000000038 RSI: dffffc0000000000 RDI: ffff88812608da2d
RBP: 8080808080808080 R08: ffff88812608da2d R09: ffff88812608da68
R10: ffff88812608d82d R11: ffff88812608d810 R12: 0000000000000038
R13: ffff88812608da2d R14: ffffffffc05d0218 R15: fefefefefefefeff
FS: 00007fef552de740(0000) GS:ffff8884251c7000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffc05d0218 CR3: 00000001146f0000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
ftrace_mod_get_kallsym+0x1ac/0x590
update_iter_mod+0x239/0x5b0
s_next+0x5b/0xa0
seq_read_iter+0x8c9/0x1070
seq_read+0x249/0x3b0
proc_reg_read+0x1b0/0x280
vfs_read+0x17f/0x920
ksys_read+0xf3/0x1c0
do_syscall_64+0x5f/0x2e0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
The above issue may happen as follows:
(1) Add kprobe tracepoint;
(2) insmod test.ko;
(3) Module triggers ftrace disabled;
(4) rmmod test.ko;
(5) cat /proc/kallsyms; --> Will trigger UAF as test.ko already removed;
ftrace_mod_get_kallsym()
...
strscpy(module_name, mod_map->mod->name, MODULE_NAME_LEN);
...
The problem is when a module triggers an issue with ftrace and
sets ftrace_disable. The ftrace_disable is set when an anomaly is
discovered and to prevent any more damage, ftrace stops all text
modification. The issue that happened was that the ftrace_disable stops
more than just the text modification.
When a module is loaded, its init functions can also be traced. Because
kallsyms deletes the init functions after a module has loaded, ftrace
saves them when the module is loaded and function tracing is enabled. This
allows the output of the function trace to show the init function names
instead of just their raw memory addresses.
When a module is removed, ftrace_release_mod() is called, and if
ftrace_disable is set, it just returns without doing anything more. The
problem here is that it leaves the mod_list still around and if kallsyms
is called, it will call into this code and access the module memory that
has already been freed as it will return:
strscpy(module_name, mod_map->mod->name, MODULE_NAME_LEN);
Where the "mod" no longer exists and triggers a UAF bug.
Link: https://lore.kernel.org/all/20250523135452.626d8dcd@gandalf.local.home/
Cc: stable@vger.kernel.org
Fixes: aba4b5c22cba ("ftrace: Save module init functions kallsyms symbols for tracing")
Link: https://lore.kernel.org/20250529111955.2349189-2-yebin@huaweicloud.com
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
The SPI NAND subsystem has seen the introduction of DTR operations (the
equivalent of DDR transfers), which involved quite a few preparation
patches for clarifying macro names.
In the raw NAND subsystem, the brcmnand driver has been "fixed" for old
legacy SoCs with an update of the ->exec_op() hook, there has been the
introduction of a new controller driver named Loongson-1, and the
Qualcomm driver has received quite a few misc fixes as well as a new
compatible.
Aside from this, there is the usual load of misc improvement and fixes.
|
|
SPI NOR changes for 6.16
Notable changes:
- Cleanup some Macronix flash entries.
- Add SFDP table fixups for Macronix MX25L3255E.
|
|
If we move a key backwards, we'll need a second pass to run the rest of
the fsck checks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We don't want this running out of the same workqueue, and blocking,
writes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Snapshot deletion v2 added sentinal values for deleted snapshots, so
"key for deleted snapshot" - i.e. snapshot deletion missed something -
is safe to repair automatically.
But if we find a key for a missing snapshot we have no idea what
happened, and we shouldn't delete it unless we're very sure that
everything else is consistent.
So hook it up to the new bch2_require_recovery_pass(), we'll now only
delete if snapshots and subvolumes have recenlty been checked.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a superblock flag to temporarily disable ratelimiting for a recovery
pass.
This will be used to make check_key_has_snapshot safer: we don't want to
delete a key for a missing snapshot unless we know that the snapshots
and subvolumes btrees are consistent, i.e. check_snapshots and
check_subvols have run recently.
Changing those btrees - creating/deleting a subvolume or snapshot - will
set the "disable ratelimit" flag, i.e. ensuring that those passes run if
check_key_has_snapshot discovers an error.
We're only disabling ratelimiting in the snapshot/subvol delete paths,
we're not so concerned about the create paths.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a helper for requiring that a recovery pass has already run: either
run it directly, if we're still in recovery, or if we're not in recovery
check if it has run recently and schedule it if it hasn't.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a tracepoint for any time we return an error and unwind.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We had a bug due due to an incomplete revert of the patch implementing
directory i_size (summing up the size of the dirents), leading to
completely screwy i_size values that underflow.
Most userspace programs don't seem to care (e.g. du ignores it), but it
turns out this broke sshfs, so needs to be repaired.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
'inode_has_wrong_backpointer'; we have more specific errors for every
case afterwards.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Resource fitting/assignment code checks if there's a remainder in
add_list (aka. realloc_head in the inner functions) using BUG_ON().
This problem typically results in a mere PCI device resource assignment
failure which does not warrant using BUG_ON(). The machine could well
come up usable even if this condition occurs because the realloc_head
relates to resources which are optional anyway.
Change BUG_ON() to WARN_ON_ONCE() and free the list if it's not empty.
[bhelgaas: subject]
Reported-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/linux-pci/5f103643-5e1c-43c6-b8fe-9617d3b5447c@linaro.org
Link: https://lore.kernel.org/r/20250511215223.7131-1-ilpo.jarvinen@linux.intel.com
|
|
include/linux/pci.h provides low-level pci_printk() interface that is
not used since the commits fab874e12593 ("PCI/AER: Descope pci_printk()
to aer_printk()") and 588021b28642 ("PCI: shpchp: Remove 'shpchp_debug'
module parameter"). PCI logging should not use pci_printk() but pci_*()
wrappers that follow the usual logging wrapper patterns.
Remove pci_printk().
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20250407101215.1376-1-ilpo.jarvinen@linux.intel.com
|
|
Instead of always sharing the FDs with the userspace process, only hand
over the FDs needed for mmap when required. The idea is that userspace
might be able to force the stub into executing an mmap syscall, however,
it will not be able to manipulate the control flow sufficiently to have
access to an FD that would allow mapping arbitrary memory.
Security wise, we need to be sure that only the expected syscalls are
executed after the kernel sends FDs through the socket. This is
currently not the case, as userspace can trivially jump to the
rt_sigreturn syscall instruction to execute any syscall that the stub is
permitted to do. With this, it can trick the kernel to send the FD,
which in turn allows userspace to freely map any physical memory.
As such, this is currently *not* secure. However, in principle the
approach should be fine with a more strict SECCOMP filter and a careful
review of the stub control flow (as userspace can prepare a stack). With
some care, it is likely possible to extend the security model to SMP if
desired.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20250602130052.545733-8-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This detects seccomp support, sets the global using_seccomp variable and
initilizes the exec registers. The support is only enabled if the
seccomp= kernel parameter is set to either "on" or "auto". With "auto" a
fallback to ptrace mode will happen if initialization failed.
Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net>
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20250602130052.545733-7-benjamin@sipsolutions.net
[extend help with Kconfig text from v2, use exit syscall instead of libc,
remove unneeded mctx_offset assignment, disable on 32-bit for now]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This adds the kernel side of the seccomp based process handling.
Co-authored-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net>
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20250602130052.545733-6-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
When in seccomp mode, we would hang forever on the futex if a child has
died unexpectedly. In contrast, ptrace mode will notice it and kill the
corresponding thread when it fails to run it.
Fix this issue using a new IRQ that is fired after a SIGCHLD and keeping
an (internal) list of all MMs. In the IRQ handler, find the affected MM
and set its PID to -1 as well as the futex variable to FUTEX_IN_KERN.
This, together with futex returning -EINTR after the signal is
sufficient to implement a race-free detection of a child dying.
Note that this also enables IRQ handling while starting a userspace
process. This should be safe and SECCOMP requires the IRQ in case the
process does not come up properly.
Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net>
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20250602130052.545733-5-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
When not using ptrace, we need to both save and restore registers
through the mcontext as provided by the host kernel to our signal
handlers.
Add corresponding functions to store the state to an mcontext and
helpers to access the mcontext of the subprocess through the stub data.
Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net>
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20250602130052.545733-4-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This adds the stub side for the new seccomp process management code. In
this case we do register save/restore through the signal handler
mcontext.
Add special code for handling TLS, which for x86_64 means setting the
FS_BASE/GS_BASE registers while for i386 it means calling the
set_thread_area syscall.
Co-authored-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net>
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20250602130052.545733-3-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The segv handler is called slightly differently depending on whether
PTRACE_FULL_FAULTINFO is set or not (32bit vs. 64bit). The only
difference is that we don't try to pass the registers and instruction
pointer to the segv handler.
It would be good to either document or remove the difference, but I do
not know why this difference exists. And, passing NULL can even result
in a crash.
Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net>
Link: https://patch.msgid.link/20250602130052.545733-2-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
As the message of the commit 09e6b306f3ba ("arm64: cpufeature: discover
CPU support for MPAM") already states, if a buggy firmware fails to
either enable MPAM or emulate the trap as if it were disabled, the
kernel will just fail to boot. While upgrading the firmware should be
the best solution, we have some hardware of which the vendor have made
no response 2 months after we requested a firmware update. Allow
overriding it so our devices don't become some e-waste.
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Cc: Mingcong Bai <jeffbai@aosc.io>
Cc: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Cc: Ben Horgan <ben.horgan@arm.com>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250602043723.216338-1-xry111@xry111.site
Signed-off-by: Will Deacon <will@kernel.org>
|
|
dm_set_device_limits() should check q->limits.features for
BLK_FEAT_ATOMIC_WRITES while holding q->limits_lock, like it does for
the rest of the queue limits.
Fixes: b7c18b17a173 ("dm-table: Set BLK_FEAT_ATOMIC_WRITES for target queue limits")
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
|
|
Commit 3ea277194daa ("mm, mprotect: flush TLB if potentially racing with
a parallel reclaim leaving stale TLB entries") describes a race that,
prior to the commit, could occur between reclaim and operations such as
mprotect() when using reclaim's tlbbatch mechanism. See that commit for
details but the summary is:
"""
Nadav Amit identified a theoritical race between page reclaim and
mprotect due to TLB flushes being batched outside of the PTL being held.
He described the race as follows:
CPU0 CPU1
---- ----
user accesses memory using RW PTE
[PTE now cached in TLB]
try_to_unmap_one()
==> ptep_get_and_clear()
==> set_tlb_ubc_flush_pending()
mprotect(addr, PROT_READ)
==> change_pte_range()
==> [ PTE non-present - no flush ]
user writes using cached RW PTE
...
try_to_unmap_flush()
"""
The solution was to insert flush_tlb_batched_pending() in mprotect() and
friends to explcitly drain any pending reclaim TLB flushes. In the
modern version of this solution, arch_flush_tlb_batched_pending() is
called to do that synchronisation.
arm64's tlbbatch implementation simply issues TLBIs at queue-time
(arch_tlbbatch_add_pending()), eliding the trailing dsb(ish). The
trailing dsb(ish) is finally issued in arch_tlbbatch_flush() at the end
of the batch to wait for all the issued TLBIs to complete.
Now, the Arm ARM states:
"""
The completion of the TLB maintenance instruction is guaranteed only by
the execution of a DSB by the observer that performed the TLB
maintenance instruction. The execution of a DSB by a different observer
does not have this effect, even if the DSB is known to be executed after
the TLB maintenance instruction is observed by that different observer.
"""
arch_tlbbatch_add_pending() and arch_tlbbatch_flush() conform to this
requirement because they are called from the same task (either kswapd or
caller of madvise(MADV_PAGEOUT)), so either they are on the same CPU or
if the task was migrated, __switch_to() contains an extra dsb(ish).
HOWEVER, arm64's arch_flush_tlb_batched_pending() is also implemented as
a dsb(ish). But this may be running on a CPU remote from the one that
issued the outstanding TLBIs. So there is no architectural gurantee of
synchonization. Therefore we are still vulnerable to the theoretical
race described in Commit 3ea277194daa ("mm, mprotect: flush TLB if
potentially racing with a parallel reclaim leaving stale TLB entries").
Fix this by flushing the entire mm in arch_flush_tlb_batched_pending().
This aligns with what the other arches that implement the tlbbatch
feature do.
Cc: <stable@vger.kernel.org>
Fixes: 43b3dfdd0455 ("arm64: support batched/deferred tlb shootdown during page reclamation/migration")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Link: https://lore.kernel.org/r/20250530152445.2430295-1-ryan.roberts@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
LLD will occasionally error out with a '__init_end does not converge'
error if INIT_IDMAP_DIR_SIZE is defined in terms of _end, as this
results in a circular dependency.
Counter this by dimensioning the initial IDMAP page tables based on a
new boundary marker 'kimage_limit', and define it such that its value
should not change as a result of the initdata segment being pushed over
a 64k segment boundary due to changes in INIT_IDMAP_DIR_SIZE, provided
that its value doesn't change by more than 2M between linker passes.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250531123005.3866382-2-ardb+git@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
It turns out [1] that the way LLD handles ASSERT()s in the linker script
can result in spurious failures, so disable them for the newly
introduced BSS symbol export checks. Since we're not aware of any issues
with the existing assertions in vmlinux.lds.S, leave those alone for now
so that they can continue to provide useful coverage.
A linker fix [2] is due to land in version 21 of LLD.
Link: https://lore.kernel.org/r/202505261019.OUlitN6m-lkp@intel.com [1]
Link: https://github.com/llvm/llvm-project/commit/5859863bab7f [2]
Link: https://github.com/ClangBuiltLinux/linux/issues/2094
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505261019.OUlitN6m-lkp@intel.com/
Link: https://lore.kernel.org/r/20250529073507.2984959-2-ardb+git@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Commit a1e40ac5b5e9 ("net: gso: fix udp gso fraglist segmentation after
pull from frag_list") detected invalid geometry in frag_list skbs and
redirects them from skb_segment_list to more robust skb_segment. But some
packets with modified geometry can also hit bugs in that code. We don't
know how many such cases exist. Addressing each one by one also requires
touching the complex skb_segment code, which risks introducing bugs for
other types of skbs. Instead, linearize all these packets that fail the
basic invariants on gso fraglist skbs. That is more robust.
If only part of the fraglist payload is pulled into head_skb, it will
always cause exception when splitting skbs by skb_segment. For detailed
call stack information, see below.
Valid SKB_GSO_FRAGLIST skbs
- consist of two or more segments
- the head_skb holds the protocol headers plus first gso_size
- one or more frag_list skbs hold exactly one segment
- all but the last must be gso_size
Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
modify fraglist skbs, breaking these invariants.
In extreme cases they pull one part of data into skb linear. For UDP,
this causes three payloads with lengths of (11,11,10) bytes were
pulled tail to become (12,10,10) bytes.
The skbs no longer meets the above SKB_GSO_FRAGLIST conditions because
payload was pulled into head_skb, it needs to be linearized before pass
to regular skb_segment.
skb_segment+0xcd0/0xd14
__udp_gso_segment+0x334/0x5f4
udp4_ufo_fragment+0x118/0x15c
inet_gso_segment+0x164/0x338
skb_mac_gso_segment+0xc4/0x13c
__skb_gso_segment+0xc4/0x124
validate_xmit_skb+0x9c/0x2c0
validate_xmit_skb_list+0x4c/0x80
sch_direct_xmit+0x70/0x404
__dev_queue_xmit+0x64c/0xe5c
neigh_resolve_output+0x178/0x1c4
ip_finish_output2+0x37c/0x47c
__ip_finish_output+0x194/0x240
ip_finish_output+0x20/0xf4
ip_output+0x100/0x1a0
NF_HOOK+0xc4/0x16c
ip_forward+0x314/0x32c
ip_rcv+0x90/0x118
__netif_receive_skb+0x74/0x124
process_backlog+0xe8/0x1a4
__napi_poll+0x5c/0x1f8
net_rx_action+0x154/0x314
handle_softirqs+0x154/0x4b8
[118.376811] [C201134] rxq0_pus: [name:bug&]kernel BUG at net/core/skbuff.c:4278!
[118.376829] [C201134] rxq0_pus: [name:traps&]Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
[118.470774] [C201134] rxq0_pus: [name:mrdump&]Kernel Offset: 0x178cc00000 from 0xffffffc008000000
[118.470810] [C201134] rxq0_pus: [name:mrdump&]PHYS_OFFSET: 0x40000000
[118.470827] [C201134] rxq0_pus: [name:mrdump&]pstate: 60400005 (nZCv daif +PAN -UAO)
[118.470848] [C201134] rxq0_pus: [name:mrdump&]pc : [0xffffffd79598aefc] skb_segment+0xcd0/0xd14
[118.470900] [C201134] rxq0_pus: [name:mrdump&]lr : [0xffffffd79598a5e8] skb_segment+0x3bc/0xd14
[118.470928] [C201134] rxq0_pus: [name:mrdump&]sp : ffffffc008013770
Fixes: a1e40ac5b5e9 ("gso: fix udp gso fraglist segmentation after pull from frag_list")
Signed-off-by: Shiming Cheng <shiming.cheng@mediatek.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use mac_pton() instead of custom approach.
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Link: https://patch.msgid.link/20250506045117.1896661-3-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Use pr_*() and netdev_*() to print log messages. While at it,
join split messages for easier grepping.
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Link: https://patch.msgid.link/20250506045117.1896661-2-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Since this is called deep in the ARCH=um IRQ infrastructure
it must use a raw spinlock. It's not really part of the
driver, but rather the core UML IRQ code.
Link: https://patch.msgid.link/20250505103358.ae7dc659f8b4.I64ca7aece30e0b4b0b5b35ad89cdd63db197c0ce@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Commit 65eaac591b75 ("um: Remove obsolete legacy network transports")
removes the directory arch/um/os-Linux/drivers/, but misses to remove the
file entry in TUN/TAP DRIVER referring to that directory.
Remove this obsolete file entry. While at it, put the section name in
capital letters.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.btw@antgroup.com>
Link: https://patch.msgid.link/20250507071004.35120-1-lukas.bulwahn@redhat.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
tgkill is a quite old syscall since kernel 2.5.75, but unfortunately glibc
doesn't support it before 2.30. Thus some systems fail to compile the
latest UserMode Linux.
Here is the compile error I encountered when I tried to compile UML in
my system shipped with glibc-2.28.
CALL scripts/checksyscalls.sh
CC arch/um/os-Linux/sigio.o
In file included from arch/um/os-Linux/sigio.c:17:
arch/um/os-Linux/sigio.c: In function ‘write_sigio_thread’:
arch/um/os-Linux/sigio.c:49:19: error: implicit declaration of function ‘tgkill’; did you mean ‘kill’? [-Werror=implicit-function-declaration]
CATCH_EINTR(r = tgkill(pid, pid, SIGIO));
^~~~~~
./arch/um/include/shared/os.h:21:48: note: in definition of macro ‘CATCH_EINTR’
#define CATCH_EINTR(expr) while ((errno = 0, ((expr) < 0)) && (errno == EINTR))
^~~~
cc1: some warnings being treated as errors
Fix it by Replacing glibc call with raw syscall.
Fixes: 33c9da5dfb18 ("um: Rewrite the sigio workaround based on epoll and tgkill")
Signed-off-by: Yongting Lin <linyongting@gmail.com>
Link: https://patch.msgid.link/20250527151222.40371-1-linyongting@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
arch/um is one of the last users of CONFIG_GENERIC_IOMAP, but upon
closer look it appears that the PCI host bridge does not register
any port I/O, and the absense of both custom inb/outb functions and
a PCI_IOBASE constant means that actually trying to use port I/O
results on a NULL pointer access.
Build testing with clang confirms this by warning about this exact
problem:
include/asm-generic/io.h:549:31: error: performing pointer arithmetic on a null pointer has undefined behavior [-Werror,-Wnull-pointer-arithmetic]
549 | val = __raw_readb(PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
Remove all the Kconfig selects that refer to legacy port I/O
and instead just build the normal MMIO path that is emulated
by the virtio PCI host.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20250509084125.1488601-1-arnd@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.16, take #1
- Make the irqbypass hooks resilient to changes in the GSI<->MSI
routing, avoiding behind stale vLPI mappings being left behind. The
fix is to resolve the VGIC IRQ using the host IRQ (which is stable)
and nuking the vLPI mapping upon a routing change.
- Close another VGIC race where vCPU creation races with VGIC
creation, leading to in-flight vCPUs entering the kernel w/o private
IRQs allocated.
- Fix a build issue triggered by the recently added workaround for
Ampere's AC04_CPU_23 erratum.
- Correctly sign-extend the VA when emulating a TLBI instruction
potentially targeting a VNCR mapping.
- Avoid dereferencing a NULL pointer in the VGIC debug code, which can
happen if the device doesn't have any mapping yet.
|
|
Commit fb49f07ba1d9 ("locking/mutex: implement mutex_lock_killable_nest_lock")
changed the set of functions that mutex.c defines when CONFIG_DEBUG_LOCK_ALLOC
is set.
- it removed the "extern" declaration of mutex_lock_killable_nested from
include/linux/mutex.h, and replaced it with a macro since it could be
treated as a special case of _mutex_lock_killable. It also removed a
definition of the function in kernel/locking/mutex.c.
- likewise, it replaced mutex_trylock() with the more generic
mutex_trylock_nest_lock() and replaced mutex_trylock() with a macro.
However, it left the old definitions in place in kernel/locking/rtmutex_api.c,
which causes failures when building with CONFIG_RT_MUTEXES=y. Bring over
the changes.
Fixes: fb49f07ba1d9 ("locking/mutex: implement mutex_lock_killable_nest_lock")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Prepare input updates for 6.16 merge window.
|
|
Implement ParentLeaseKey logic in cifs_do_create() by looking up the
parent cfid, copying its lease key into the fid struct, and setting
the appropriate lease flag.
Fixes: f047390a097e ("CIFS: Add create lease v2 context for SMB3")
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Implement ParentLeaseKey logic in open_cached_dir() by looking up the
parent cfid, copying its lease key into the fid struct, and setting
the appropriate lease flag.
Fixes: f047390a097e ("CIFS: Add create lease v2 context for SMB3")
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
According to MS-SMB2 3.2.4.3.8, when opening a file the client must
lookup its parent directory, copy that entry’s LeaseKey into
ParentLeaseKey, and set SMB2_LEASE_FLAG_PARENT_LEASE_KEY_SET.
Extend lease context functions to carry a parent_lease_key and
lease_flags and to add them to the lease context buffer accordingly in
smb3_create_lease_buf. Also add a parent_lease_key field to struct
cifs_fid and lease_flags to cifs_open_parms.
Only applies to the SMB 3.x dialect family.
Fixes: f047390a097e ("CIFS: Add create lease v2 context for SMB3")
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
For TRANS2 QUERY_PATH_INFO request when the path does not exist, the
Windows NT SMB server returns error response STATUS_OBJECT_NAME_NOT_FOUND
or ERRDOS/ERRbadfile without the SMBFLG_RESPONSE flag set. Similarly it
returns STATUS_DELETE_PENDING when the file is being deleted. And looks
like that any error response from TRANS2 QUERY_PATH_INFO does not have
SMBFLG_RESPONSE flag set.
So relax check in check_smb_hdr() for detecting if the packet is response
for this special case.
This change fixes stat() operation against Windows NT SMB servers and also
all operations which depends on -ENOENT result from stat like creat() or
mkdir().
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Validate the SMB1 query reparse point response per [MS-CIFS] section
2.2.7.2 NT_TRANSACT_IOCTL.
NT_TRANSACT_IOCTL response contains one word long setup data after which is
ByteCount member. So check that SetupCount is 1 before trying to read and
use ByteCount member.
Output setup data contains ReturnedDataLen member which is the output
length of executed IOCTL command by remote system. So check that output was
not truncated before transferring over network.
Change MaxSetupCount of NT_TRANSACT_IOCTL request from 4 to 1 as io_rsp
structure already expects one word long output setup data. This should
prevent server sending incompatible structure (in case it would be extended
in future, which is unlikely).
Change MaxParameterCount of NT_TRANSACT_IOCTL request from 2 to 0 as
NT IOCTL does not have any documented output parameters and this function
does not parse any output parameters at all.
Fixes: ed3e0a149b58 ("smb: client: implement ->query_reparse_point() for SMB1")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
[MS-CIFS] specification in section 2.2.4.53.1 where is described
SMB_COM_SESSION_SETUP_ANDX Request, for SessionKey field says:
The client MUST set this field to be equal to the SessionKey field in
the SMB_COM_NEGOTIATE Response for this SMB connection.
Linux SMB client currently set this field to zero. This is working fine
against Windows NT SMB servers thanks to [MS-CIFS] product behavior <94>:
Windows NT Server ignores the client's SessionKey.
For compatibility with [MS-CIFS], set this SessionKey field in Session
Setup Request to value retrieved from Negotiate response.
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
SMB1 Session Setup NTLMSSP Request in non-UNICODE mode is similar to
UNICODE mode, just strings are encoded in ASCII and not in UTF-16.
With this change it is possible to setup SMB1 session with NTLM
authentication in non-UNICODE mode with Windows SMB server.
This change fixes mounting SMB1 servers with -o nounicode mount option
together with -o sec=ntlmssp mount option (which is the default sec=).
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The MT6357 PMIC contains the same RTC as MT6358 which allows to add
support for it trivially by just complementing the list of compatibles.
Signed-off-by: Alexandre Mergnat <amergnat@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Tested-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/20250428-rtc-mt6357-v1-1-31f673b0a723@baylibre.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
While the RTC framework intends to only handle dates after 1970 for
consumers, time conversion must also work for earlier dates to cover
e.g. storing dates beyond an RTC's range_max. This is most relevant for
the rtc-mt6397 driver that has
range_min = RTC_TIMESTAMP_BEGIN_1900;
range_max = mktime64(2027, 12, 31, 23, 59, 59);
and so needs working support for timestamps in 1900 starting in less than
three years.
So shift the tested interval of timestamps to also cover years 1900 to
1970.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Alexandre Mergnat <amergnat@baylibre.com>
Link: https://lore.kernel.org/r/20250428-enable-rtc-v4-5-2b2f7e3f9349@baylibre.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
To cover calculation of the time and wday in the rtc kunit test also check
tm_hour, tm_min, tm_sec and tm_wday of the rtc_time calculated by
rtc_time64_to_tm().
There are no surprises, the two tests making use of
rtc_time64_to_tm_test_date_range() continue to succeed.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Alexandre Mergnat <amergnat@baylibre.com>
Link: https://lore.kernel.org/r/20250428-enable-rtc-v4-4-2b2f7e3f9349@baylibre.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
This is easier to handle because you can just consult date(1) to convert
between a seconds-since-1970 value and a date string:
$ date --utc -d @3661
Thu Jan 1 01:01:01 AM UTC 1970
$ date -d "Jan 1 12:00:00 AM UTC 1900" +%s
-2208988800
The intended side effect is that this prepares the test for dates before
1970. The division of a negative value by 86400 doesn't result in the
desired days-since-1970 value as e.g. secs=-82739 should map to days=-1.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Alexandre Mergnat <amergnat@baylibre.com>
Link: https://lore.kernel.org/r/20250428-enable-rtc-v4-3-2b2f7e3f9349@baylibre.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|