summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
11 daysnscommon: simplify initializationChristian Brauner
There's a lot of information that namespace implementers don't need to know about at all. Encapsulate this all in the initialization helper. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
11 dayscgroup: split namespace into separate headerChristian Brauner
We have dedicated headers for all namespace types. Add one for the cgroup namespace as well. Now it's consistent for all namespace types and easy to figure out what to include. Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daysnscommon: move to separate fileChristian Brauner
It's really awkward spilling the ns common infrastructure into multiple headers. Move it to a separate file. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daysmnt: expose pointer to init_mnt_nsChristian Brauner
There's various scenarios where we need to know whether we are in the initial set of namespaces or not to e.g., shortcut permission checking. All namespaces expose that information. Let's do that too. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daysuts: split namespace into separate headerChristian Brauner
We have dedicated headers for all namespace types. Add one for the uts namespace as well. Now it's consistent for all namespace types. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daysnsfs: support file handlesChristian Brauner
A while ago we added support for file handles to pidfs so pidfds can be encoded and decoded as file handles. Userspace has adopted this quickly and it's proven very useful. Implement file handles for namespaces as well. A process is not always able to open /proc/self/ns/. That requires procfs to be mounted and for /proc/self/ or /proc/self/ns/ to not be overmounted. However, userspace can always derive a namespace fd from a pidfd. And that always works for a task's own namespace. There's no need to introduce unnecessary behavioral differences between /proc/self/ns/ fds, pidfd-derived namespace fds, and file-handle-derived namespace fds. So namespace file handles are always decodable if the caller is located in the namespace the file handle refers to. This also allows a task to e.g., store a set of file handles to its namespaces in a file on-disk so it can verify when it gets rexeced that they're still valid and so on. This is akin to the pidfd use-case. Or just plainly for namespace comparison reasons where a file handle to the task's own namespace can be easily compared against others. Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daysnsfs: add current_in_namespace()Christian Brauner
Add a helper to easily check whether a given namespace is the caller's current namespace. This is currently open-coded in a lot of places. Simply switch on the type and compare the results. Reviewed-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daysns: add to_<type>_ns() to respective headersChristian Brauner
Every namespace type has a container_of(ns, <ns_type>, ns) static inline function that is currently not exposed in the header. So we have a bunch of places that open-code it via container_of(). Move it to the headers so we can use it directly. Reviewed-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daystime: support ns lookupChristian Brauner
Support the generic ns lookup infrastructure to support file handles for namespaces. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daysMerge branch 'no-rebase-mnt_ns_tree_remove'Christian Brauner
Bring in the fix for removing a mount namespace from the mount namespace rbtree and list. Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daysnstree: make iterator genericChristian Brauner
Move the namespace iteration infrastructure originally introduced for mount namespaces into a generic library usable by all namespace types. Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daysns: remove ns_alloc_inum()Christian Brauner
It's now unused. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daysns: uniformly initialize ns_commonChristian Brauner
No point in cargo-culting the same code across all the different types. Use one common initializer. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daysnsfs: add nsfs.h headerChristian Brauner
And move the stuff out from proc_ns.h where it really doesn't belong. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daysns: move to_ns_common() to ns_common.hChristian Brauner
Move the helper to ns_common.h where it belongs. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
11 dayswriteback: Avoid contention on wb->list_lock when switching inodesJan Kara
There can be multiple inode switch works that are trying to switch inodes to / from the same wb. This can happen in particular if some cgroup exits which owns many (thousands) inodes and we need to switch them all. In this case several inode_switch_wbs_work_fn() instances will be just spinning on the same wb->list_lock while only one of them makes forward progress. This wastes CPU cycles and quickly leads to softlockup reports and unusable system. Instead of running several inode_switch_wbs_work_fn() instances in parallel switching to the same wb and contending on wb->list_lock, run just one work item per wb and manage a queue of isw items switching to this wb. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz>
11 daysMerge tag 'trace-rv-v6.17-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull runtime verifier fixes from Steven Rostedt: - Fix build in some RISC-V flavours Some system calls only are available for the 64bit RISC-V machines. #ifdef out the cases of clock_nanosleep and futex in the sleep monitor if they are not supported by the architecture. - Fix wrong cast, obsolete after refactoring Use container_of() to get to the rv_monitor structure from the enable_monitors_next() 'p' pointer. The assignment worked only because the list field used happened to be the first field of the structure. - Remove redundant include files Some include files were listed twice. Remove the extra ones and sort the includes. - Fix missing unlock on failure There was an error path that exited the rv_register_monitor() function without releasing a lock. Change that to goto the lock release. - Add Gabriele Monaco to be Runtime Verifier maintainer Gabriele is doing most of the work on RV as well as collecting patches. Add him to the maintainers file for Runtime Verification. * tag 'trace-rv-v6.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rv: Add Gabriele Monaco as maintainer for Runtime Verification rv: Fix missing mutex unlock in rv_register_monitor() include/linux/rv.h: remove redundant include file rv: Fix wrong type cast in enabled_monitors_next() rv: Support systems with time64-only syscalls
12 daysarm64: Enable permission change on arm64 kernel block mappingsDev Jain
This patch paves the path to enable huge mappings in vmalloc space and linear map space by default on arm64. For this we must ensure that we can handle any permission games on the kernel (init_mm) pagetable. Previously, __change_memory_common() used apply_to_page_range() which does not support changing permissions for block mappings. We move away from this by using the pagewalk API, similar to what riscv does right now. It is the responsibility of the caller to ensure that the range over which permissions are being changed falls on leaf mapping boundaries. For systems with BBML2, this will be handled in future patches by dyanmically splitting the mappings when required. Unlike apply_to_page_range(), the pagewalk API currently enforces the init_mm.mmap_lock to be held. To avoid the unnecessary bottleneck of the mmap_lock for our usecase, this patch extends this generic API to be used locklessly, so as to retain the existing behaviour for changing permissions. Apart from this reason, it is noted at [1] that KFENCE can manipulate kernel pgtable entries during softirqs. It does this by calling set_memory_valid() -> __change_memory_common(). This being a non-sleepable context, we cannot take the init_mm mmap lock. Add comments to highlight the conditions under which we can use the lockless variant - no underlying VMA, and the user having exclusive control over the range, thus guaranteeing no concurrent access. We require that the start and end of a given range do not partially overlap block mappings, or cont mappings. Return -EINVAL in case a partial block mapping is detected in any of the PGD/P4D/PUD/PMD levels; add a corresponding comment in update_range_prot() to warn that eliminating such a condition is the responsibility of the caller. Note that, the pte level callback may change permissions for a whole contpte block, and that will be done one pte at a time, as opposed to an atomic operation for the block mappings. This is fine as any access will decode either the old or the new permission until the TLBI. apply_to_page_range() currently performs all pte level callbacks while in lazy mmu mode. Since arm64 can optimize performance by batching barriers when modifying kernel pgtables in lazy mmu mode, we would like to continue to benefit from this optimisation. Unfortunately walk_kernel_page_table_range() does not use lazy mmu mode. However, since the pagewalk framework is not allocating any memory, we can safely bracket the whole operation inside lazy mmu mode ourselves. Therefore, wrap the call to walk_kernel_page_table_range() with the lazy MMU helpers. Link: https://lore.kernel.org/linux-arm-kernel/89d0ad18-4772-4d8f-ae8a-7c48d26a927e@arm.com/ [1] Signed-off-by: Dev Jain <dev.jain@arm.com> Signed-off-by: Yang Shi <yshi@os.amperecomputing.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
12 daysio_uring/msg_ring: kill alloc_cache for io_kiocb allocationsJens Axboe
A recent commit: fc582cd26e88 ("io_uring/msg_ring: ensure io_kiocb freeing is deferred for RCU") fixed an issue with not deferring freeing of io_kiocb structs that msg_ring allocates to after the current RCU grace period. But this only covers requests that don't end up in the allocation cache. If a request goes into the alloc cache, it can get reused before it is sane to do so. A recent syzbot report would seem to indicate that there's something there, however it may very well just be because of the KASAN poisoning that the alloc_cache handles manually. Rather than attempt to make the alloc_cache sane for that use case, just drop the usage of the alloc_cache for msg_ring request payload data. Fixes: 50cf5f3842af ("io_uring/msg_ring: add an alloc cache for io_kiocb entries") Link: https://lore.kernel.org/io-uring/68cc2687.050a0220.139b6.0005.GAE@google.com/ Reported-by: syzbot+baa2e0f4e02df602583e@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
12 daysMerge tag 'net-6.17-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireless. No known regressions at this point. Current release - fix to a fix: - eth: Revert "net/mlx5e: Update and set Xon/Xoff upon port speed set" - wifi: iwlwifi: pcie: fix byte count table for 7000/8000 devices - net: clear sk->sk_ino in sk_set_socket(sk, NULL), fix CRIU Previous releases - regressions: - bonding: set random address only when slaves already exist - rxrpc: fix untrusted unsigned subtract - eth: - ice: fix Rx page leak on multi-buffer frames - mlx5: don't return mlx5_link_info table when speed is unknown Previous releases - always broken: - tls: make sure to abort the stream if headers are bogus - tcp: fix null-deref when using TCP-AO with TCP_REPAIR - dpll: fix skipping last entry in clock quality level reporting - eth: qed: don't collect too many protection override GRC elements, fix memory corruption" * tag 'net-6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits) octeontx2-pf: Fix use-after-free bugs in otx2_sync_tstamp() cnic: Fix use-after-free bugs in cnic_delete_task devlink rate: Remove unnecessary 'static' from a couple places MAINTAINERS: update sundance entry net: liquidio: fix overflow in octeon_init_instr_queue() net: clear sk->sk_ino in sk_set_socket(sk, NULL) Revert "net/mlx5e: Update and set Xon/Xoff upon port speed set" selftests: tls: test skb copy under mem pressure and OOB tls: make sure to abort the stream if headers are bogus selftest: packetdrill: Add tcp_fastopen_server_reset-after-disconnect.pkt. tcp: Clear tcp_sk(sk)->fastopen_rsk in tcp_disconnect(). octeon_ep: fix VF MAC address lifecycle handling selftests: bonding: add vlan over bond testing bonding: don't set oif to bond dev when getting NS target destination net: rfkill: gpio: Fix crash due to dereferencering uninitialized pointer net/mlx5e: Add a miss level for ipsec crypto offload net/mlx5e: Harden uplink netdev access against device unbind MAINTAINERS: make the DPLL entry cover drivers doc/netlink: Fix typos in operation attributes igc: don't fail igc_probe() on LED setup error ...
12 dayscompiler_types: Add __assume macroHeiko Carstens
Make the statement attribute "assume" with a new __assume macro available. The assume attribute is used to indicate that a certain condition is assumed to be true. Compilers may or may not use this indication to generate optimized code. If this condition is violated at runtime, the behavior is undefined. Note that the clang documentation states that optimizers may react differently to this attribute, and this may even have a negative performance impact. Therefore this attribute should be used with care. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
12 daysMerge tag 'mm-hotfixes-stable-2025-09-17-21-10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "15 hotfixes. 11 are cc:stable and the remainder address post-6.16 issues or aren't considered necessary for -stable kernels. 13 of these fixes are for MM. The usual shower of singletons, plus - fixes from Hugh to address various misbehaviors in get_user_pages() - patches from SeongJae to address a quite severe issue in DAMON - another series also from SeongJae which completes some fixes for a DAMON startup issue" * tag 'mm-hotfixes-stable-2025-09-17-21-10' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: zram: fix slot write race condition nilfs2: fix CFI failure when accessing /sys/fs/nilfs2/features/* samples/damon/mtier: avoid starting DAMON before initialization samples/damon/prcl: avoid starting DAMON before initialization samples/damon/wsse: avoid starting DAMON before initialization MAINTAINERS: add Lance Yang as a THP reviewer MAINTAINERS: add Jann Horn as rmap reviewer mm/damon/sysfs: use dynamically allocated repeat mode damon_call_control mm/damon/core: introduce damon_call_control->dealloc_on_cancel mm: folio_may_be_lru_cached() unless folio_test_large() mm: revert "mm: vmscan.c: fix OOM on swap stress test" mm: revert "mm/gup: clear the LRU flag of a page before adding to LRU batch" mm/gup: local lru_add_drain() to avoid lru_add_drain_all() mm/gup: check ref_count instead of lru before migration
13 daysstddef: Introduce __TRAILING_OVERLAP()Gustavo A. R. Silva
Introduce underlying __TRAILING_OVERLAP() macro to let callers apply atributes to trailing overlapping members. For instance, the code below: | struct flex { | size_t count; | int data[]; | }; | struct { | struct flex f; | struct foo a; | struct boo b; | } __packed instance; can now be changed to the following, and preserve the __packed attribute: | __TRAILING_OVERLAP(struct flex, f, data, __packed, | struct foo a; | struct boo b; | ) instance; Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/f80c529b239ce11f0a51f714fe00ddf839e05f5e.1758115257.git.gustavoars@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
13 daysstddef: Remove token-pasting in TRAILING_OVERLAP()Gustavo A. R. Silva
Currently, TRAILING_OVERLAP() token-pastes the FAM parameter into the name of internal pdding member `__offset_to_##FAM`. This forces FAM to be a single identifier, which prevents callers from using a FAM when it's a nested member. For instance, see the following scenario: | struct flex { | size_t count; | int data[]; | }; | struct foo { | int hdr_foo; | struct flex f; | }; | struct composite { | struct foo hdr; | int data[100]; | }; In this case, it'd be useful if TRAILING_OVERLAP() could be used in the following way: | struct composite { | TRAILING_OVERLAP(struct foo, hdr, f.data, | int data[100]; | ); | }; However, this is not current possible due to the token concatenation in `__offset_to_##FAM`, which fails when FAM contains a dot. So, remove token-pasting and use the fixed internal name `__offset_to_FAM` and, with this, expand the capabilities of TRAILING_OVERLAP(). :) Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/13b3e0a69aad837b4e32ca8269b9d91bf1fbe9ef.1758115257.git.gustavoars@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
13 daysnet/mlx5e: Harden uplink netdev access against device unbindJianbo Liu
The function mlx5_uplink_netdev_get() gets the uplink netdevice pointer from mdev->mlx5e_res.uplink_netdev. However, the netdevice can be removed and its pointer cleared when unbound from the mlx5_core.eth driver. This results in a NULL pointer, causing a kernel panic. BUG: unable to handle page fault for address: 0000000000001300 at RIP: 0010:mlx5e_vport_rep_load+0x22a/0x270 [mlx5_core] Call Trace: <TASK> mlx5_esw_offloads_rep_load+0x68/0xe0 [mlx5_core] esw_offloads_enable+0x593/0x910 [mlx5_core] mlx5_eswitch_enable_locked+0x341/0x420 [mlx5_core] mlx5_devlink_eswitch_mode_set+0x17e/0x3a0 [mlx5_core] devlink_nl_eswitch_set_doit+0x60/0xd0 genl_family_rcv_msg_doit+0xe0/0x130 genl_rcv_msg+0x183/0x290 netlink_rcv_skb+0x4b/0xf0 genl_rcv+0x24/0x40 netlink_unicast+0x255/0x380 netlink_sendmsg+0x1f3/0x420 __sock_sendmsg+0x38/0x60 __sys_sendto+0x119/0x180 do_syscall_64+0x53/0x1d0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Ensure the pointer is valid before use by checking it for NULL. If it is valid, immediately call netdev_hold() to take a reference, and preventing the netdevice from being freed while it is in use. Fixes: 7a9fb35e8c3a ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1757939074-617281-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 daysfs: add an enum for number of life time hintsHans Holmberg
Add WRITE_LIFE_HINT_NR into the rw_hint enum to define the number of values write life time hints can be set to. This is useful for e.g. file systems which may want to map these values to allocation groups. Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-09-15fs: rename generic_delete_inode() and generic_drop_inode()Mateusz Guzik
generic_delete_inode() is rather misleading for what the routine is doing. inode_just_drop() should be much clearer. The new naming is inconsistent with generic_drop_inode(), so rename that one as well with inode_ as the suffix. No functional changes. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-15pidfs: validate extensible ioctlsChristian Brauner
Validate extensible ioctls stricter than we do now. Reviewed-by: Aleksa Sarai <cyphar@cyphar.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-15include/linux/rv.h: remove redundant include fileAkhilesh Patil
Remove redundant include <linux/types.h> to clean up the code. Move all unique include files inside CONFIG_RV as they are only needed when CONFIG_RV is enabled. Arrange include files alphabetically. Fixes: 24cbfe18d55a ("rv: Merge struct rv_monitor_def into struct rv_monitor") [1] Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202507312017.oyD08TL5-lkp@intel.com/ Signed-off-by: Akhilesh Patil <akhilesh@ee.iitb.ac.in> Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/r/aJneRbHGlNFg7lr9@bhairav-test.ee.iitb.ac.in Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2025-09-13mm/damon/core: introduce damon_call_control->dealloc_on_cancelSeongJae Park
Patch series "mm/damon/sysfs: fix refresh_ms control overwriting on multi-kdamonds usages". Automatic esssential DAMON/DAMOS status update feature of DAMON sysfs interface (refresh_ms) is broken [1] for multiple DAMON contexts (kdamonds) use case, since it uses a global single damon_call_control object for all created DAMON contexts. The fields of the object, particularly the list field is over-written for the contexts and it makes unexpected results including user-space hangup and kernel crashes [2]. Fix it by extending damon_call_control for the use case and updating the usage on DAMON sysfs interface to use per-context dynamically allocated damon_call_control object. This patch (of 2): When damon_call_control->repeat is set, damon_call() is executed asynchronously, and is eventually canceled when kdamond finishes. If the damon_call_control object is dynamically allocated, finding the place to deallocate the object is difficult. Introduce a new damon_call_control field, namely dealloc_on_cancel, to ask the kdamond deallocates those dynamically allocated objects when those are canceled. Link: https://lkml.kernel.org/r/20250908201513.60802-3-sj@kernel.org Link: https://lkml.kernel.org/r/20250908201513.60802-2-sj@kernel.org Fixes: d809a7c64ba8 ("mm/damon/sysfs: implement refresh_ms file internal work") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Yunjeong Mun <yunjeong.mun@sk.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13mm: folio_may_be_lru_cached() unless folio_test_large()Hugh Dickins
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a large folio is added: so collect_longterm_unpinnable_folios() just wastes effort when calling lru_add_drain[_all]() on a large folio. But although there is good reason not to batch up PMD-sized folios, we might well benefit from batching a small number of low-order mTHPs (though unclear how that "small number" limitation will be implemented). So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to insulate those particular checks from future change. Name preferred to "folio_is_batchable" because large folios can well be put on a batch: it's just the per-CPU LRU caches, drained much later, which need care. Marked for stable, to counter the increase in lru_add_drain_all()s from "mm/gup: check ref_count instead of lru before migration". Link: https://lkml.kernel.org/r/57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region") Signed-off-by: Hugh Dickins <hughd@google.com> Suggested-by: David Hildenbrand <david@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Chris Li <chrisl@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Keir Fraser <keirf@google.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Li Zhe <lizhe.67@bytedance.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Shivank Garg <shivankg@amd.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Xu <weixugc@google.com> Cc: Will Deacon <will@kernel.org> Cc: yangge <yangge1116@126.com> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-12Merge tag 'imx-fixes-6.17-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes i.MX fixes for 6.17, round 2: - Fix mach-imx Kconfig to select the correct PIT timer option (Lukas Bulwahn) - Correct thermal sensor index for i.MX8MP device tree (Peng Fan) - Fix i.MX SCMI build error by adding stub API functions (Peng Fan) * tag 'imx-fixes-6.17-2' of https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: arm64: dts: imx8mp: Correct thermal sensor index ARM: imx: Kconfig: Adjust select after renamed config option firmware: imx: Add stub functions for SCMI CPU API firmware: imx: Add stub functions for SCMI LMM API firmware: imx: Add stub functions for SCMI MISC API Link: https://lore.kernel.org/r/aMQs2zr4fYl2DYVr@dragon Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-09-11Merge tag 'net-6.17-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from CAN, netfilter and wireless. We have an IPv6 routing regression with the relevant fix still a WiP. This includes a last-minute revert to avoid more problems. Current release - new code bugs: - wifi: nl80211: completely disable per-link stats for now Previous releases - regressions: - dev_ioctl: take ops lock in hwtstamp lower paths - netfilter: - fix spurious set lookup failures - fix lockdep splat due to missing annotation - genetlink: fix genl_bind() invoking bind() after -EPERM - phy: transfer phy_config_inband() locking responsibility to phylink - can: xilinx_can: fix use-after-free of transmitted SKB - hsr: fix lock warnings - eth: - igb: fix NULL pointer dereference in ethtool loopback test - i40e: fix Jumbo Frame support after iPXE boot - macsec: sync features on RTM_NEWLINK Previous releases - always broken: - tunnels: reset the GSO metadata before reusing the skb - mptcp: make sync_socket_options propagate SOCK_KEEPOPEN - can: j1939: implement NETDEV_UNREGISTER notification hanidler - wifi: ath12k: fix WMI TLV header misalignment" * tag 'net-6.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits) Revert "net: usb: asix: ax88772: drop phylink use in PM to avoid MDIO runtime PM wakeups" hsr: hold rcu and dev lock for hsr_get_port_ndev hsr: use hsr_for_each_port_rtnl in hsr_port_get_hsr hsr: use rtnl lock when iterating over ports wifi: nl80211: completely disable per-link stats for now net: usb: asix: ax88772: drop phylink use in PM to avoid MDIO runtime PM wakeups net: ethtool: fix wrong type used in struct kernel_ethtool_ts_info MAINTAINERS: add Phil as netfilter reviewer netfilter: nf_tables: restart set lookup on base_seq change netfilter: nf_tables: make nft_set_do_lookup available unconditionally netfilter: nf_tables: place base_seq in struct net netfilter: nft_set_rbtree: continue traversal if element is inactive netfilter: nft_set_pipapo: don't check genbit from packetpath lookups netfilter: nft_set_bitmap: fix lockdep splat due to missing annotation can: rcar_can: rcar_can_resume(): fix s2ram with PSCI can: xilinx_can: xcan_write_frame(): fix use-after-free of transmitted SKB can: j1939: j1939_local_ecu_get(): undo increment when j1939_local_ecu_get() fails can: j1939: j1939_sk_bind(): call j1939_priv_put() immediately when j1939_local_ecu_get() failed can: j1939: implement NETDEV_UNREGISTER notification handler selftests: can: enable CONFIG_CAN_VCAN as a module ...
2025-09-11Merge tag 'pm-6.17-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix a nasty hibernation regression introduced during the 6.16 cycle, an issue related to energy model management occurring on Intel hybrid systems where some CPUs are offline to start with, and two regressions in the amd-pstate driver: - Restore a pm_restrict_gfp_mask() call in hibernation_snapshot() that was removed incorrectly during the 6.16 development cycle (Rafael Wysocki) - Introduce a function for registering a perf domain without triggering a system-wide CPU capacity update and make the intel_pstate driver use it to avoid reocurring unsuccessful attempts to update capacities of all CPUs in the system (Rafael Wysocki) - Fix setting of CPPC.min_perf in the active mode with performance governor in the amd-pstate driver to restore its expected behavior changed recently (Gautham Shenoy) - Avoid mistakenly setting EPP to 0 in the amd-pstate driver after system resume as a result of recent code changes (Mario Limonciello)" * tag 'pm-6.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: hibernate: Restrict GFP mask in hibernation_snapshot() PM: EM: Add function for registering a PD without capacity update cpufreq/amd-pstate: Fix a regression leading to EPP 0 after resume cpufreq/amd-pstate: Fix setting of CPPC.min_perf in active mode for performance governor
2025-09-11pmdomain: core: Restore behaviour for disabling unused PM domainsUlf Hansson
Recent changes to genpd prevents those PM domains being powered-on during initialization from being powered-off during the boot sequence. Based upon whether CONFIG_PM_CONFIG_PM_GENERIC_DOMAINS_OF is set of not, genpd relies on the sync_state mechanism or the genpd_power_off_unused() (which is a late_initcall_sync), to understand when it's okay to allow these PM domains to be powered-off. This new behaviour in genpd has lead to problems on different platforms. Let's therefore restore the behavior of genpd_power_off_unused(). Moreover, let's introduce GENPD_FLAG_NO_STAY_ON, to allow genpd OF providers to opt-out from the new behaviour. Link: https://lore.kernel.org/all/20250701114733.636510-1-ulf.hansson@linaro.org/ Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/all/20250902-rk3576-lockup-regression-v1-1-c4a0c9daeb00@collabora.com/ Reported-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com> Fixes: 0e789b491ba0 ("pmdomain: core: Leave powered-on genpds on until sync_state") Fixes: 13a4b7fb6260 ("pmdomain: core: Leave powered-on genpds on until late_initcall_sync") Tested-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-09-10Merge tag 'mm-hotfixes-stable-2025-09-10-20-00' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "20 hotfixes. 15 are cc:stable and the remainder address post-6.16 issues or aren't considered necessary for -stable kernels. 14 of these fixes are for MM. This includes - kexec fixes from Breno for a recently introduced use-uninitialized bug - DAMON fixes from Quanmin Yan to avoid div-by-zero crashes which can occur if the operator uses poorly-chosen insmod parameters and misc singleton fixes" * tag 'mm-hotfixes-stable-2025-09-10-20-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: MAINTAINERS: add tree entry to numa memblocks and emulation block mm/damon/sysfs: fix use-after-free in state_show() proc: fix type confusion in pde_set_flags() compiler-clang.h: define __SANITIZE_*__ macros only when undefined mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc() ocfs2: fix recursive semaphore deadlock in fiemap call mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory mm/mremap: fix regression in vrm->new_addr check percpu: fix race on alloc failed warning limit mm/memory-failure: fix redundant updates for already poisoned pages s390: kexec: initialize kexec_buf struct riscv: kexec: initialize kexec_buf struct arm64: kexec: initialize kexec_buf struct in load_other_segments() mm/damon/reclaim: avoid divide-by-zero in damon_reclaim_apply_parameters() mm/damon/lru_sort: avoid divide-by-zero in damon_lru_sort_apply_parameters() mm/damon/core: set quota->charged_from to jiffies at first charge window mm/hugetlb: add missing hugetlb_lock in __unmap_hugepage_range() init/main.c: fix boot time tracing crash mm/memory_hotplug: fix hwpoisoned large folio handling in do_migrate_range() mm/khugepaged: fix the address passed to notifier on testing young
2025-09-10Merge tag 'vmscape-for-linus-20250904' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull vmescape mitigation fixes from Dave Hansen: "Mitigate vmscape issue with indirect branch predictor flushes. vmscape is a vulnerability that essentially takes Spectre-v2 and attacks host userspace from a guest. It particularly affects hypervisors like QEMU. Even if a hypervisor may not have any sensitive data like disk encryption keys, guest-userspace may be able to attack the guest-kernel using the hypervisor as a confused deputy. There are many ways to mitigate vmscape using the existing Spectre-v2 defenses like IBRS variants or the IBPB flushes. This series focuses solely on IBPB because it works universally across vendors and all vulnerable processors. Further work doing vendor and model-specific optimizations can build on top of this if needed / wanted. Do the normal issue mitigation dance: - Add the CPU bug boilerplate - Add a list of vulnerable CPUs - Use IBPB to flush the branch predictors after running guests" * tag 'vmscape-for-linus-20250904' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vmscape: Add old Intel CPUs to affected list x86/vmscape: Warn when STIBP is disabled with SMT x86/bugs: Move cpu_bugs_smt_update() down x86/vmscape: Enable the mitigation x86/vmscape: Add conditional IBPB mitigation x86/vmscape: Enumerate VMSCAPE bug Documentation/hw-vuln: Add VMSCAPE documentation
2025-09-11firmware: imx: Add stub functions for SCMI CPU APIPeng Fan
To ensure successful builds when CONFIG_IMX_SCMI_CPU_DRV is not enabled, this patch adds static inline stub implementations for the following functions: - scmi_imx_cpu_start() - scmi_imx_cpu_started() - scmi_imx_cpu_reset_vector_set() These stubs return -EOPNOTSUPP to indicate that the functionality is not supported in the current configuration. This avoids potential build or link errors in code that conditionally calls these functions based on feature availability. Fixes: 1055faa5d660 ("firmware: imx: Add i.MX95 SCMI CPU driver") Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2025-09-11firmware: imx: Add stub functions for SCMI LMM APIPeng Fan
To ensure successful builds when CONFIG_IMX_SCMI_LMM_DRV is not enabled, this patch adds static inline stub implementations for the following functions: - scmi_imx_lmm_operation() - scmi_imx_lmm_info() - scmi_imx_lmm_reset_vector_set() These stubs return -EOPNOTSUPP to indicate that the functionality is not supported in the current configuration. This avoids potential build or link errors in code that conditionally calls these functions based on feature availability. Fixes: 7242bbf418f0 ("firmware: imx: Add i.MX95 SCMI LMM driver") Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2025-09-11firmware: imx: Add stub functions for SCMI MISC APIPeng Fan
To ensure successful builds when CONFIG_IMX_SCMI_MISC_DRV is not enabled, this patch adds static inline stub implementations for the following functions: - scmi_imx_misc_ctrl_get() - scmi_imx_misc_ctrl_set() These stubs return -EOPNOTSUPP to indicate that the functionality is not supported in the current configuration. This avoids potential build or link errors in code that conditionally calls these functions based on feature availability. This patch also drops the changes in commit 540c830212ed ("firmware: imx: remove duplicate scmi_imx_misc_ctrl_get()"). The original change aimed to simplify the handling of optional features by removing conditional stubs. However, the use of conditional stubs is necessary when CONFIG_IMX_SCMI_MISC_DRV is n, while consumer driver is set to y. This is not a matter of preserving legacy patterns, but rather to ensure that there is no link error whether for module or built-in. Fixes: 0b4f8a68b292 ("firmware: imx: Add i.MX95 MISC driver") Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2025-09-10net: ethtool: fix wrong type used in struct kernel_ethtool_ts_infoRussell King (Oracle)
In C, enumerated types do not have a defined size, apart from being compatible with one of the standard types. This allows an ABI / compiler to choose the type of an enum depending on the values it needs to store, and storing larger values in it can lead to undefined behaviour. The tx_type and rx_filters members of struct kernel_ethtool_ts_info are defined as enumerated types, but are bit arrays, where each bit is defined by the enumerated type. This means they typically store values in excess of the maximum value of the enumerated type, in fact (1 << max_value) and thus must not be declared using the enumated type. Fix both of these to use u32, as per the corresponding __u32 UAPI type. Fixes: 2111375b85ad ("net: Add struct kernel_ethtool_ts_info") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/E1uvMEK-00000003Amd-2pWR@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-10PM: EM: Add function for registering a PD without capacity updateRafael J. Wysocki
The intel_pstate driver manages CPU capacity changes itself and it does not need an update of the capacity of all CPUs in the system to be carried out after registering a PD. Moreover, in some configurations (for instance, an SMT-capable hybrid x86 system booted with nosmt in the kernel command line) the em_check_capacity_update() call at the end of em_dev_register_perf_domain() always fails and reschedules itself to run once again in 1 s, so effectively it runs in vain every 1 s forever. To address this, introduce a new variant of em_dev_register_perf_domain(), called em_dev_register_pd_no_update(), that does not invoke em_check_capacity_update(), and make intel_pstate use it instead of the original. Fixes: 7b010f9b9061 ("cpufreq: intel_pstate: EAS support for hybrid platforms") Closes: https://lore.kernel.org/linux-pm/40212796-734c-4140-8a85-854f72b8144d@panix.com/ Reported-by: Kenneth R. Crudup <kenny@panix.com> Tested-by: Kenneth R. Crudup <kenny@panix.com> Cc: 6.16+ <stable@vger.kernel.org> # 6.16+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-09-08compiler-clang.h: define __SANITIZE_*__ macros only when undefinedNathan Chancellor
Clang 22 recently added support for defining __SANITIZE__ macros similar to GCC [1], which causes warnings (or errors with CONFIG_WERROR=y or W=e) with the existing defines that the kernel creates to emulate this behavior with existing clang versions. In file included from <built-in>:3: In file included from include/linux/compiler_types.h:171: include/linux/compiler-clang.h:37:9: error: '__SANITIZE_THREAD__' macro redefined [-Werror,-Wmacro-redefined] 37 | #define __SANITIZE_THREAD__ | ^ <built-in>:352:9: note: previous definition is here 352 | #define __SANITIZE_THREAD__ 1 | ^ Refactor compiler-clang.h to only define the sanitizer macros when they are undefined and adjust the rest of the code to use these macros for checking if the sanitizers are enabled, clearing up the warnings and allowing the kernel to easily drop these defines when the minimum supported version of LLVM for building the kernel becomes 22.0.0 or newer. Link: https://lkml.kernel.org/r/20250902-clang-update-sanitize-defines-v1-1-cf3702ca3d92@kernel.org Link: https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444dae4fdc8a9c [1] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Justin Stitt <justinstitt@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Bill Wendling <morbo@google.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-08mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc()Uladzislau Rezki (Sony)
kasan_populate_vmalloc() and its helpers ignore the caller's gfp_mask and always allocate memory using the hardcoded GFP_KERNEL flag. This makes them inconsistent with vmalloc(), which was recently extended to support GFP_NOFS and GFP_NOIO allocations. Page table allocations performed during shadow population also ignore the external gfp_mask. To preserve the intended semantics of GFP_NOFS and GFP_NOIO, wrap the apply_to_page_range() calls into the appropriate memalloc scope. xfs calls vmalloc with GFP_NOFS, so this bug could lead to deadlock. There was a report here https://lkml.kernel.org/r/686ea951.050a0220.385921.0016.GAE@google.com This patch: - Extends kasan_populate_vmalloc() and helpers to take gfp_mask; - Passes gfp_mask down to alloc_pages_bulk() and __get_free_page(); - Enforces GFP_NOFS/NOIO semantics with memalloc_*_save()/restore() around apply_to_page_range(); - Updates vmalloc.c and percpu allocator call sites accordingly. Link: https://lkml.kernel.org/r/20250831121058.92971-1-urezki@gmail.com Fixes: 451769ebb7e7 ("mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc") Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reported-by: syzbot+3470c9ffee63e4abafeb@syzkaller.appspotmail.com Reviewed-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-08bitops: Add __attribute_const__ to generic ffs()-family implementationsKees Cook
While tracking down a problem where constant expressions used by BUILD_BUG_ON() suddenly stopped working[1], we found that an added static initializer was convincing the compiler that it couldn't track the state of the prior statically initialized value. Tracing this down found that ffs() was used in the initializer macro, but since it wasn't marked with __attribute__const__, the compiler had to assume the function might change variable states as a side-effect (which is not true for ffs(), which provides deterministic math results). Add missing __attribute_const__ annotations to generic implementations of ffs(), __ffs(), fls(), and __fls() functions. These are pure mathematical functions that always return the same result for the same input with no side effects, making them eligible for compiler optimization. Build tested with x86_64 defconfig using GCC 14.2.0, which should validate the implementations when used by ARM, ARM64, LoongArch, Microblaze, NIOS2, and SPARC32 architectures. Link: https://github.com/KSPP/linux/issues/364 [1] Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250804164417.1612371-2-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-09-08Merge tag 'vfs-6.17-rc6.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "fuse: - Prevent opening of non-regular backing files. Fuse doesn't support non-regular files anyway. - Check whether copy_file_range() returns a larger size than requested. - Prevent overflow in copy_file_range() as fuse currently only supports 32-bit sized copies. - Cache the blocksize value if the server returned a new value as inode->i_blkbits isn't modified directly anymore. - Fix i_blkbits handling for iomap partial writes. By default i_blkbits is set to PAGE_SIZE which causes iomap to mark the whole folio as uptodate even on a partial write. But fuseblk filesystems support choosing a blocksize smaller than PAGE_SIZE risking data corruption. Simply enforce PAGE_SIZE as blocksize for fuseblk's internal inode for now. - Prevent out-of-bounds acces in fuse_dev_write() when the number of bytes to be retrieved is truncated to the fc->max_pages limit. virtiofs: - Fix page faults for DAX page addresses. Misc: - Tighten file handle decoding from userns. Check that the decoded dentry itself has a valid idmapping in the user namespace. - Fix mount-notify selftests. - Fix some indentation errors. - Add an FMODE_ flag to indicate IOCB_HAS_METADATA availability. This will be moved to an FOP_* flag with a bit more rework needed for that to happen not suitable for a fix. - Don't silently ignore metadata for sync read/write. - Don't pointlessly log warning when reading coredump sysctls" * tag 'vfs-6.17-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fuse: virtio_fs: fix page fault for DAX page address selftests/fs/mount-notify: Fix compilation failure. fhandle: use more consistent rules for decoding file handle from userns fuse: Block access to folio overlimit fuse: fix fuseblk i_blkbits for iomap partial writes fuse: reflect cached blocksize if blocksize was changed fuse: prevent overflow in copy_file_range return value fuse: check if copy_file_range() returns larger than requested size fuse: do not allow mapping a non-regular backing file coredump: don't pointlessly check and spew warnings fs: fix indentation style block: don't silently ignore metadata for sync read/write fs: add a FMODE_ flag to indicate IOCB_HAS_METADATA availability Please enter a commit message to explain why this merge is necessary, especially if it merges an updated upstream into a topic branch.
2025-09-07Merge tag 'timers-urgent-2025-09-07' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Ingo Molnar: "Fix a severe slowdown regression in the timer vDSO code related to the while() loop in __iter_div_u64_rem(), when the AUX-clock is enabled" * tag 'timers-urgent-2025-09-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: vdso/vsyscall: Avoid slow division loop in auxiliary clock update
2025-09-04Merge tag 'net-6.17-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, wireless and Bluetooth. We're reverting the removal of a Sundance driver, a user has appeared. This makes the PR rather large in terms of LoC. There's a conspicuous absence of real, user-reported 6.17 issues. Slightly worried that the summer distracted people from testing. Previous releases - regressions: - ax25: properly unshare skbs in ax25_kiss_rcv() Previous releases - always broken: - phylink: disable autoneg for interfaces that have no inband, fix regression on pcs-lynx (NXP LS1088) - vxlan: fix null-deref when using nexthop objects - batman-adv: fix OOB read/write in network-coding decode - icmp: icmp_ndo_send: fix reversing address translation for replies - tcp: fix socket ref leak in TCP-AO failure handling for IPv6 - mctp: - mctp_fraq_queue should take ownership of passed skb - usb: initialise mac header in RX path, avoid WARN - wifi: mac80211: do not permit 40 MHz EHT operation on 5/6 GHz, respect device limitations - wifi: wilc1000: avoid buffer overflow in WID string configuration - wifi: mt76: - fix regressions from mt7996 MLO support rework - fix offchannel handling issues on mt7996 - fix multiple wcid linked list corruption issues - mt7921: don't disconnect when AP requests switch to a channel which requires radar detection - mt7925u: use connac3 tx aggr check in tx complete - wifi: intel: - improve validation of ACPI DSM data - cfg: restore some 1000 series configs - wifi: ath: - ath11k: a fix for GTK rekeying - ath12k: a missed WiFi7 capability (multi-link EMLSR) - eth: intel: - ice: fix races in "low latency" firmware interface for Tx timestamps - idpf: set mac type when adding and removing MAC filters - i40e: remove racy read access to some debugfs files Misc: - Revert "eth: remove the DLink/Sundance (ST201) driver" - netfilter: conntrack: helper: Replace -EEXIST by -EBUSY, avoid confusing modprobe" * tag 'net-6.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (99 commits) phy: mscc: Stop taking ts_lock for tx_queue and use its own lock selftest: net: Fix weird setsockopt() in bind_bhash.c. MAINTAINERS: add Sabrina to TLS maintainers gve: update MAINTAINERS ppp: fix memory leak in pad_compress_skb net: xilinx: axienet: Add error handling for RX metadata pointer retrieval net: atm: fix memory leak in atm_register_sysfs when device_register fail netfilter: nf_tables: Introduce NFTA_DEVICE_PREFIX selftests: netfilter: fix udpclash tool hang ax25: properly unshare skbs in ax25_kiss_rcv() mctp: return -ENOPROTOOPT for unknown getsockopt options net/smc: Remove validation of reserved bits in CLC Decline message ipv4: Fix NULL vs error pointer check in inet_blackhole_dev_init() net: thunder_bgx: decrement cleanup index before use net: thunder_bgx: add a missing of_node_put net: phylink: move PHY interrupt request to non-fail path net: lockless sock_i_ino() tools: ynl-gen: fix nested array counting wifi: wilc1000: avoid buffer overflow in WID string configuration wifi: cfg80211: sme: cap SSID length in __cfg80211_connect_result() ...
2025-09-03binfmt_elf: preserve original ELF e_flags for core dumpsSvetlana Parfenova
Some architectures, such as RISC-V, use the ELF e_flags field to encode ABI-specific information (e.g., ISA extensions, fpu support). Debuggers like GDB rely on these flags in core dumps to correctly interpret optional register sets. If the flags are missing or incorrect, GDB may warn and ignore valid data, for example: warning: Unexpected size of section '.reg2/213' in core file. This can prevent access to fpu or other architecture-specific registers even when they were dumped. Save the e_flags field during ELF binary loading (in load_elf_binary()) into the mm_struct, and later retrieve it during core dump generation (in fill_note_info()). Kconfig option CONFIG_ARCH_HAS_ELF_CORE_EFLAGS is introduced for architectures that require this behaviour. Signed-off-by: Svetlana Parfenova <svetlana.parfenova@syntacore.com> Link: https://lore.kernel.org/r/20250901135350.619485-1-svetlana.parfenova@syntacore.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-09-03vdso/vsyscall: Avoid slow division loop in auxiliary clock updateThomas Weißschuh
The call to __iter_div_u64_rem() in vdso_time_update_aux() is a wrapper around subtraction. It cannot be used to divide large numbers, as that introduces long, computationally expensive delays. A regular u64 division is also not possible in the timekeeper update path as it can be too slow. Instead of splitting the ktime_t offset into into second and subsecond components during the timekeeper update fast-path, do it together with the adjustment of tk->offs_aux in the slow-path. Equivalent to the handling of offs_boot and monotonic_to_boot. Reuse the storage of monotonic_to_boot for the new field, as it is not used by auxiliary timekeepers. Fixes: 380b84e168e5 ("vdso/vsyscall: Update auxiliary clock data in the datapage") Reported-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250825-vdso-auxclock-division-v1-1-a1d32a16a313@linutronix.de Closes: https://lore.kernel.org/lkml/aKwsNNWsHJg8IKzj@localhost/