summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2025-08-15netfs: Fix unbuffered write error handlingDavid Howells
If all the subrequests in an unbuffered write stream fail, the subrequest collector doesn't update the stream->transferred value and it retains its initial LONG_MAX value. Unfortunately, if all active streams fail, then we take the smallest value of { LONG_MAX, LONG_MAX, ... } as the value to set in wreq->transferred - which is then returned from ->write_iter(). LONG_MAX was chosen as the initial value so that all the streams can be quickly assessed by taking the smallest value of all stream->transferred - but this only works if we've set any of them. Fix this by adding a flag to indicate whether the value in stream->transferred is valid and checking that when we integrate the values. stream->transferred can then be initialised to zero. This was found by running the generic/750 xfstest against cifs with cache=none. It splices data to the target file. Once (if) it has used up all the available scratch space, the writes start failing with ENOSPC. This causes ->write_iter() to fail. However, it was returning wreq->transferred, i.e. LONG_MAX, rather than an error (because it thought the amount transferred was non-zero) and iter_file_splice_write() would then try to clean up that amount of pipe bufferage - leading to an oops when it overran. The kernel log showed: CIFS: VFS: Send error in write = -28 followed by: BUG: kernel NULL pointer dereference, address: 0000000000000008 with: RIP: 0010:iter_file_splice_write+0x3a4/0x520 do_splice+0x197/0x4e0 or: RIP: 0010:pipe_buf_release (include/linux/pipe_fs_i.h:282) iter_file_splice_write (fs/splice.c:755) Also put a warning check into splice to announce if ->write_iter() returned that it had written more than it was asked to. Fixes: 288ace2f57c9 ("netfs: New writeback implementation") Reported-by: Xiaoli Feng <fengxiaoli0714@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220445 Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/915443.1755207950@warthog.procyon.org.uk cc: Paulo Alcantara <pc@manguebit.org> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: netfs@lists.linux.dev cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: stable@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-14Merge tag 'firewire-fixes-6.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fixes from Takashi Sakamoto: "This fixes a potential call to schedule() within an RCU read-side critical section. The solution applies reference counting to ensure that handlers which may call schedule() are invoked safely outside of the critical section" * tag 'firewire-fixes-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: core: reallocate buffer for FCP address handlers when more than 4 are registered firewire: core: call FCP address handlers outside RCU read-side critical section firewire: core: call handler for exclusive regions outside RCU read-side critical section firewire: core: use reference counting to invoke address handlers safely
2025-08-14x86/vmscape: Enable the mitigationPawan Gupta
Enable the previously added mitigation for VMscape. Add the cmdline vmscape={off|ibpb|force} and sysfs reporting. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
2025-08-14Merge tag 'net-6.17-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from Netfilter and IPsec. Current release - regressions: - netfilter: nft_set_pipapo: - don't return bogus extension pointer - fix null deref for empty set Current release - new code bugs: - core: prevent deadlocks when enabling NAPIs with mixed kthread config - eth: netdevsim: Fix wild pointer access in nsim_queue_free(). Previous releases - regressions: - page_pool: allow enabling recycling late, fix false positive warning - sched: ets: use old 'nbands' while purging unused classes - xfrm: - restore GSO for SW crypto - bring back device check in validate_xmit_xfrm - tls: handle data disappearing from under the TLS ULP - ptp: prevent possible ABBA deadlock in ptp_clock_freerun() - eth: - bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE - hv_netvsc: fix panic during namespace deletion with VF Previous releases - always broken: - netfilter: fix refcount leak on table dump - vsock: do not allow binding to VMADDR_PORT_ANY - sctp: linearize cloned gso packets in sctp_rcv - eth: - hibmcge: fix the division by zero issue - microchip: fix KSZ8863 reset problem" * tag 'net-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits) net: usb: asix_devices: add phy_mask for ax88772 mdio bus net: kcm: Fix race condition in kcm_unattach() selftests: net/forwarding: test purge of active DWRR classes net/sched: ets: use old 'nbands' while purging unused classes bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE netdevsim: Fix wild pointer access in nsim_queue_free(). net: mctp: Fix bad kfree_skb in bind lookup test netfilter: nf_tables: reject duplicate device on updates ipvs: Fix estimator kthreads preferred affinity netfilter: nft_set_pipapo: fix null deref for empty set selftests: tls: test TCP stealing data from under the TLS socket tls: handle data disappearing from under the TLS ULP ptp: prevent possible ABBA deadlock in ptp_clock_freerun() ixgbe: prevent from unwanted interface name changes devlink: let driver opt out of automatic phys_port_name generation net: prevent deadlocks when enabling NAPIs with mixed kthread config net: update NAPI threaded config even for disabled NAPIs selftests: drv-net: don't assume device has only 2 queues docs: Fix name for net.ipv4.udp_child_hash_entries riscv: dts: thead: Add APB clocks for TH1520 GMACs ...
2025-08-13kcov, usb: Don't disable interrupts in kcov_remote_start_usb_softirq()Sebastian Andrzej Siewior
kcov_remote_start_usb_softirq() the begin of urb's completion callback. HCDs marked HCD_BH will invoke this function from the softirq and in_serving_softirq() will detect this properly. Root-HUB (RH) requests will not be delayed to softirq but complete immediately in IRQ context. This will confuse kcov because in_serving_softirq() will report true if the softirq is served after the hardirq and if the softirq got interrupted by the hardirq in which currently runs. This was addressed by simply disabling interrupts in kcov_remote_start_usb_softirq() which avoided the interruption by the RH while a regular completion callback was invoked. This not only changes the behaviour while kconv is enabled but also breaks PREEMPT_RT because now sleeping locks can no longer be acquired. Revert the previous fix. Address the issue by invoking kcov_remote_start_usb() only if the context is just "serving softirqs" which is identified by checking in_serving_softirq() and in_hardirq() must be false. Fixes: f85d39dd7ed89 ("kcov, usb: disable interrupts in kcov_remote_start_usb_softirq") Cc: stable <stable@kernel.org> Reported-by: Yunseong Kim <ysk@kzalloc.com> Closes: https://lore.kernel.org/all/20250725201400.1078395-2-ysk@kzalloc.com/ Tested-by: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20250811082745.ycJqBXMs@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-13locking: Fix __clear_task_blocked_on() warning from __ww_mutex_wound() pathJohn Stultz
The __clear_task_blocked_on() helper added a number of sanity checks ensuring we hold the mutex wait lock and that the task we are clearing blocked_on pointer (if set) matches the mutex. However, there is an edge case in the _ww_mutex_wound() logic where we need to clear the blocked_on pointer for the task that owns the mutex, not the task that is waiting on the mutex. For this case the sanity checks aren't valid, so handle this by allowing a NULL lock to skip the additional checks. K Prateek Nayak and Maarten Lankhorst also pointed out that in this case where we don't hold the owner's mutex wait_lock, we need to be a bit more careful using READ_ONCE/WRITE_ONCE in both the __clear_task_blocked_on() and __set_task_blocked_on() implementations to avoid accidentally tripping WARN_ONs if two instances race. So do that here as well. This issue was easier to miss, I realized, as the test-ww_mutex driver only exercises the wait-die class of ww_mutexes. I've sent a patch[1] to address this so the logic will be easier to test. [1]: https://lore.kernel.org/lkml/20250801023358.562525-2-jstultz@google.com/ Fixes: a4f0b6fef4b0 ("locking/mutex: Add p->blocked_on wrappers for correctness checks") Closes: https://lore.kernel.org/lkml/68894443.a00a0220.26d0e1.0015.GAE@google.com/ Reported-by: syzbot+602c4720aed62576cd79@syzkaller.appspotmail.com Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/r/20250805001026.2247040-1-jstultz@google.com
2025-08-12net: update NAPI threaded config even for disabled NAPIsJakub Kicinski
We have to make sure that all future NAPIs will have the right threaded state when the state is configured on the device level. We chose not to have an "unset" state for threaded, and not to wipe the NAPI config clean when channels are explicitly disabled. This means the persistent config structs "exist" even when their NAPIs are not instantiated. Differently put - the NAPI persistent state lives in the net_device (ncfg == struct napi_config): ,--- [napi 0] - [napi 1] [dev] | | `--- [ncfg 0] - [ncfg 1] so say we a device with 2 queues but only 1 enabled: ,--- [napi 0] [dev] | `--- [ncfg 0] - [ncfg 1] now we set the device to threaded=1: ,---------- [napi 0 (thr:1)] [dev(thr:1)] | `---------- [ncfg 0 (thr:1)] - [ncfg 1 (thr:?)] Since [ncfg 1] was not attached to a NAPI during configuration we skipped it. If we create a NAPI for it later it will have the old setting (presumably disabled). One could argue if this is right or not "in principle", but it's definitely not how things worked before per-NAPI config.. Fixes: 2677010e7793 ("Add support to set NAPI threaded for individual NAPI") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Joe Damato <joe@dama.to> Link: https://patch.msgid.link/20250809001205.1147153-3-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-11iosys-map: Fix undefined behavior in iosys_map_clear()Nitin Gote
The current iosys_map_clear() implementation reads the potentially uninitialized 'is_iomem' boolean field to decide which union member to clear. This causes undefined behavior when called on uninitialized structures, as 'is_iomem' may contain garbage values like 0xFF. UBSAN detects this as: UBSAN: invalid-load in include/linux/iosys-map.h:267 load of value 255 is not a valid value for type '_Bool' Fix by unconditionally clearing the entire structure with memset(), eliminating the need to read uninitialized data and ensuring all fields are set to known good values. Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14639 Fixes: 01fd30da0474 ("dma-buf: Add struct dma-buf-map for storing struct dma_buf.vaddr_ptr") Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250718105051.2709487-1-nitin.r.gote@intel.com
2025-08-11module: Rename EXPORT_SYMBOL_GPL_FOR_MODULES to EXPORT_SYMBOL_FOR_MODULESVlastimil Babka
Christoph suggested that the explicit _GPL_ can be dropped from the module namespace export macro, as it's intended for in-tree modules only. It would be possible to restrict it technically, but it was pointed out [2] that some cases of using an out-of-tree build of an in-tree module with the same name are legitimate. But in that case those also have to be GPL anyway so it's unnecessary to spell it out in the macro name. Link: https://lore.kernel.org/all/aFleJN_fE-RbSoFD@infradead.org/ [1] Link: https://lore.kernel.org/all/CAK7LNATRkZHwJGpojCnvdiaoDnP%2BaeUXgdey5sb_8muzdWTMkA@mail.gmail.com/ [2] Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Shivank Garg <shivankg@amd.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Nicolas Schier <n.schier@avm.de> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/20250808-export_modules-v4-1-426945bcc5e1@suse.cz Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11of: reserved_mem: Restructure call site for dma_contiguous_early_fixup()Oreoluwa Babatunde
Restructure the call site for dma_contiguous_early_fixup() to where the reserved_mem nodes are being parsed from the DT so that dma_mmu_remap[] is populated before dma_contiguous_remap() is called. Fixes: 8a6e02d0c00e ("of: reserved_mem: Restructure how the reserved memory regions are processed") Signed-off-by: Oreoluwa Babatunde <oreoluwa.babatunde@oss.qualcomm.com> Tested-by: William Zhang <william.zhang@broadcom.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20250806172421.2748302-1-oreoluwa.babatunde@oss.qualcomm.com
2025-08-09Merge tag 'efi-next-for-v6.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - Expose the OVMF firmware debug log via sysfs - Lower the default log level for the EFI stub to avoid corrupting any splash screens with unimportant diagnostic output * tag 'efi-next-for-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: add API doc entry for ovmf_debug_log efistub: Lower default log level efi: add ovmf debug log driver
2025-08-09Merge tag 'block-6.17-20250808' of git://git.kernel.dk/linuxLinus Torvalds
Pull more block updates from Jens Axboe: - MD pull request via Yu: - mddev null-ptr-dereference fix, by Erkun - md-cluster fail to remove the faulty disk regression fix, by Heming - minor cleanup, by Li Nan and Jinchao - mdadm lifetime regression fix reported by syzkaller, by Yu Kuai - MD pull request via Christoph - add support for getting the FDP featuee in fabrics passthru path (Nitesh Shetty) - add capability to connect to an administrative controller (Kamaljit Singh) - fix a leak on sgl setup error (Keith Busch) - initialize discovery subsys after debugfs is initialized (Mohamed Khalfella) - fix various comment typos (Bjorn Helgaas) - remove unneeded semicolons (Jiapeng Chong) - nvmet debugfs ordering issue fix - Fix UAF in the tag_set in zloop - Ensure sbitmap shallow depth covers entire set - Reduce lock roundtrips in io context lookup - Move scheduler tags alloc/free out of elevator and freeze lock, to fix some lockdep found issues - Improve robustness of queue limits checking - Fix a regression with IO priorities, if no io context exists * tag 'block-6.17-20250808' of git://git.kernel.dk/linux: (26 commits) lib/sbitmap: make sbitmap_get_shallow() internal lib/sbitmap: convert shallow_depth from one word to the whole sbitmap nvmet: exit debugfs after discovery subsystem exits block, bfq: Reorder struct bfq_iocq_bfqq_data md: make rdev_addable usable for rcu mode md/raid1: remove struct pool_info and related code md/raid1: change r1conf->r1bio_pool to a pointer type block: ensure discard_granularity is zero when discard is not supported zloop: fix KASAN use-after-free of tag set block: Fix default IO priority if there is no IO context nvme: fix various comment typos nvme-auth: remove unneeded semicolon nvme-pci: fix leak on sgl setup error nvmet: initialize discovery subsys after debugfs is initialized nvme: add capability to connect to an administrative controller nvmet: add support for FDP in fabrics passthru path md: rename recovery_cp to resync_offset md/md-cluster: handle REMOVE message earlier md: fix create on open mddev lifetime regression block: fix potential deadlock while running nr_hw_queue update ...
2025-08-09Merge tag 'gpio-updates-for-v6.17-rc1-part2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio updates from Bartosz Golaszewski: "As discussed: there's a small commit that removes the legacy GPIO line value setter callbacks as they're no longer used and a big, treewide commit that renames the new ones to the old names across all GPIO drivers at once. While at it: there are also two fixes that I picked up over the course of the merge window: - remove unused, legacy GPIO line value setters from struct gpio_chip - rename the new set callbacks back to the original names treewide - fix interrupt handling in gpio-mlxbf2 - revert a buggy immutable irqchip conversion" * tag 'gpio-updates-for-v6.17-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: treewide: rename GPIO set callbacks back to their original names gpio: remove legacy GPIO line value setter callbacks gpio: mlxbf2: use platform_get_irq_optional() Revert "gpio: pxa: Make irq_chip immutable"
2025-08-09Merge tag 'nfs-for-6.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - don't inherit NFS filesystem capabilities when crossing from one filesystem to another Bugfixes: - NFS wakeup of __nfs_lookup_revalidate() needs memory barriers - NFS improve bounds checking in nfs_fh_to_dentry() - NFS Fix allocation errors when writing to a NFS file backed loopback device - NFSv4: More listxattr fixes - SUNRPC: fix client handling of TLS alerts - pNFS block/scsi layout fix for an uninitialised pointer dereference - pNFS block/scsi layout fixes for the extent encoding, stripe mapping, and disk offset overflows - pNFS layoutcommit work around for RPC size limitations - pNFS/flexfiles avoid looping when handling fatal errors after layoutget - localio: fix various race conditions Features and cleanups: - Add NFSv4 support for retrieving the btime - NFS: Allow folio migration for the case of mode == MIGRATE_SYNC - NFS: Support using a kernel keyring to store TLS certificates - NFSv4: Speed up delegation lookup using a hash table - Assorted cleanups to remove unused variables and struct fields - Assorted new tracepoints to improve debugging" * tag 'nfs-for-6.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (44 commits) NFS/localio: nfs_uuid_put() fix the wake up after unlinking the file NFS/localio: nfs_uuid_put() fix races with nfs_open/close_local_fh() NFS/localio: nfs_close_local_fh() fix check for file closed NFSv4: Remove duplicate lookups, capability probes and fsinfo calls NFS: Fix the setting of capabilities when automounting a new filesystem sunrpc: fix client side handling of tls alerts nfs/localio: use read_seqbegin() rather than read_seqbegin_or_lock() NFS: Fixup allocation flags for nfsiod's __GFP_NORETRY NFSv4.2: another fix for listxattr NFS: Fix filehandle bounds checking in nfs_fh_to_dentry() SUNRPC: Silence warnings about parameters not being described NFS: Clean up pnfs_put_layout_hdr()/pnfs_destroy_layout_final() NFS: Fix wakeup of __nfs_lookup_revalidate() in unblock_revalidate() NFS: use a hash table for delegation lookup NFS: track active delegations per-server NFS: move the delegation_watermark module parameter NFS: cleanup nfs_inode_reclaim_delegation NFS: cleanup error handling in nfs4_server_common_setup pNFS/flexfiles: don't attempt pnfs on fatal DS errors NFS: drop __exit from nfs_exit_keyring ...
2025-08-08Merge tag 'net-6.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: Previous releases - regressions: - netlink: avoid infinite retry looping in netlink_unicast() Previous releases - always broken: - packet: fix a race in packet_set_ring() and packet_notifier() - ipv6: reject malicious packets in ipv6_gso_segment() - sched: mqprio: fix stack out-of-bounds write in tc entry parsing - net: drop UFO packets (injected via virtio) in udp_rcv_segment() - eth: mlx5: correctly set gso_segs when LRO is used, avoid false positive checksum validation errors - netpoll: prevent hanging NAPI when netcons gets enabled - phy: mscc: fix parsing of unicast frames for PTP timestamping - a number of device tree / OF reference leak fixes" * tag 'net-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits) pptp: fix pptp_xmit() error path net: ti: icssg-prueth: Fix skb handling for XDP_PASS net: Update threaded state in napi config in netif_set_threaded selftests: netdevsim: Xfail nexthop test on slow machines eth: fbnic: Lock the tx_dropped update eth: fbnic: Fix tx_dropped reporting eth: fbnic: remove the debugging trick of super high page bias net: ftgmac100: fix potential NULL pointer access in ftgmac100_phy_disconnect dt-bindings: net: Replace bouncing Alexandru Tachici emails dpll: zl3073x: ZL3073X_I2C and ZL3073X_SPI should depend on NET net/sched: mqprio: fix stack out-of-bounds write in tc entry parsing Revert "net: mdio_bus: Use devm for getting reset GPIO" selftests: net: packetdrill: xfail all problems on slow machines net/packet: fix a race in packet_set_ring() and packet_notifier() benet: fix BUG when creating VFs net: airoha: npu: Add missing MODULE_FIRMWARE macros net: devmem: fix DMA direction on unmapping ipa: fix compile-testing with qcom-mdt=m eth: fbnic: unlink NAPIs from queues on error to open net: Add locking to protect skb->dev access in ip_output ...
2025-08-07lib/sbitmap: make sbitmap_get_shallow() internalYu Kuai
Because it's only used in sbitmap.c Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20250807032413.1469456-3-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-07lib/sbitmap: convert shallow_depth from one word to the whole sbitmapYu Kuai
Currently elevators will record internal 'async_depth' to throttle asynchronous requests, and they both calculate shallow_dpeth based on sb->shift, with the respect that sb->shift is the available tags in one word. However, sb->shift is not the availbale tags in the last word, see __map_depth: if (index == sb->map_nr - 1) return sb->depth - (index << sb->shift); For consequence, if the last word is used, more tags can be get than expected, for example, assume nr_requests=256 and there are four words, in the worst case if user set nr_requests=32, then the first word is the last word, and still use bits per word, which is 64, to calculate async_depth is wrong. One the ohter hand, due to cgroup qos, bfq can allow only one request to be allocated, and set shallow_dpeth=1 will still allow the number of words request to be allocated. Fix this problems by using shallow_depth to the whole sbitmap instead of per word, also change kyber, mq-deadline and bfq to follow this, a new helper __map_depth_with_shallow() is introduced to calculate available bits in each word. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250807032413.1469456-2-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-07treewide: rename GPIO set callbacks back to their original namesBartosz Golaszewski
The conversion of all GPIO drivers to using the .set_rv() and .set_multiple_rv() callbacks from struct gpio_chip (which - unlike their predecessors - return an integer and allow the controller drivers to indicate failures to users) is now complete and the legacy ones have been removed. Rename the new callbacks back to their original names in one sweeping change. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-08-07gpio: remove legacy GPIO line value setter callbacksBartosz Golaszewski
With no more users of the legacy GPIO line value setters - .set() and .set_multiple() - we can now remove them from the kernel. Link: https://lore.kernel.org/r/20250725074651.14002-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-08-07Merge tag 'input-for-v6.17-rc0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: - updates to several drivers consuming GPIO APIs to use setters returning error codes - an infrastructure allowing to define "overlays" for touchscreens carving out regions implementing buttons and other elements from a bigger sensors and a corresponding update to st1232 driver - an update to AT/PS2 keyboard driver to map F13-F24 by default - Samsung keypad driver got a facelift - evdev input handler will now bind to all devices using EV_SYN event instead of abusing id->driver_info - two new sub-drivers implementing 1A (capacitive buttons) and 21 (forcepad button) functions in Synaptics RMI driver - support for polling mode in Goodix touchscreen driver - support for support for FocalTech FT8716 in edt-ft5x06 driver - support for MT6359 in mtk-pmic-keys driver - removal of pcf50633-input driver since platform it was used on is gone - new definitions for game controller "grip" buttons (BTN_GRIP*) and corresponding changes to xpad and hid-steam controller drivers - a new definition for "performance" key * tag 'input-for-v6.17-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (38 commits) HID: hid-steam: Use new BTN_GRIP* buttons Input: add keycode for performance mode key Input: max77693 - convert to atomic pwm operation Input: st1232 - add touch-overlay handling dt-bindings: input: touchscreen: st1232: add touch-overlay example Input: touch-overlay - add touchscreen overlay handling dt-bindings: touchscreen: add touch-overlay property Input: atkbd - correctly map F13 - F24 Input: xpad - use new BTN_GRIP* buttons Input: Add and document BTN_GRIP* Input: xpad - change buttons the D-Pad gets mapped as to BTN_DPAD_* Documentation: Fix capitalization of XBox -> Xbox Input: synaptics-rmi4 - add support for F1A dt-bindings: input: syna,rmi4: Document F1A function Input: synaptics-rmi4 - add support for Forcepads (F21) Input: mtk-pmic-keys - add support for MT6359 PMIC keys Input: remove special handling of id->driver_info when matching Input: evdev - switch matching to EV_SYN Input: samsung-keypad - use BIT() and GENMASK() where appropriate Input: samsung-keypad - use per-chip parameters ...
2025-08-07Merge tag 'vfio-v6.17-rc1-v2' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO updates from Alex Williamson: - Fix imbalance where the no-iommu/cdev device path skips too much on open, failing to increment a reference, but still decrements the reference on close. Add bounds checking to prevent such underflows (Jacob Pan) - Fill missing detach_ioas op for pds_vfio_pci, fixing probe failure when used with IOMMUFD (Brett Creeley) - Split SR-IOV VFs to separate dev_set, avoiding unnecessary serialization between VFs that appear on the same bus (Alex Williamson) - Fix a theoretical integer overflow is the mlx5-vfio-pci variant driver (Artem Sadovnikov) - Implement missing VF token checking support via vfio cdev/IOMMUFD interface (Jason Gunthorpe) - Update QAT vfio-pci variant driver to claim latest VF devices (Małgorzata Mielnik) - Add a cond_resched() call to avoid holding the CPU too long during DMA mapping operations (Keith Busch) * tag 'vfio-v6.17-rc1-v2' of https://github.com/awilliam/linux-vfio: vfio/type1: conditional rescheduling while pinning vfio/qat: add support for intel QAT 6xxx virtual functions vfio/qat: Remove myself from VFIO QAT PCI driver maintainers vfio/pci: Do vf_token checks for VFIO_DEVICE_BIND_IOMMUFD vfio/mlx5: fix possible overflow in tracking max message size vfio/pci: Separate SR-IOV VF dev_set vfio/pds: Fix missing detach_ioas op vfio: Prevent open_count decrement to negative vfio: Fix unbalanced vfio_df_close call in no-iommu mode
2025-08-06Merge tag 'ata-6.17-rc1-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fixes from Damien Le Moal: - Cleanup whitespace in messages in libata-core and the pata_pdc2027x, pata_macio drivers (Colin) - Fix ata_to_sense_error() to avoid seeing nonsensical sense data for rare cases where we fail to get sense data from the drive. The complementary fix to this is to ensure that we always return the generic "ABORTED COMMAND" sense data for a failed command for which we have no status or error fields - The recent changes to link power management (LPM) which now prevent the user from attempting to set an LPM policy through the link_power_management_policy caused some regressions in test environments because of the error that is now returned when writing to that attribute when LPM is not supported. To allow users to not trip on this, introduce the new link_power_management_supported attribute to allow simple testing of a port/device LPM support (me) * tag 'ata-6.17-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: pata_pdc2027x: Remove space before newline and abbreviations ata: pata_macio: Remove space before newline ata: libata-core: Remove space before newline ata: libata-sata: Add link_power_management_supported sysfs attribute ata: libata-scsi: Return aborted command when missing sense and result TF ata: libata-scsi: Fix ata_to_sense_error() status handling
2025-08-06Merge tag 'kbuild-v6.17-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: "This is the last pull request from me. I'm grateful to have been able to continue as a maintainer for eight years. From the next cycle, Nathan and Nicolas will maintain Kbuild. - Fix a shortcut key issue in menuconfig - Fix missing rebuild of kheaders - Sort the symbol dump generated by gendwarfsyms - Support zboot extraction in scripts/extract-vmlinux - Migrate gconfig to GTK 3 - Add TAR variable to allow overriding the default tar command - Hand over Kbuild maintainership" * tag 'kbuild-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (92 commits) MAINTAINERS: hand over Kbuild maintenance kheaders: make it possible to override TAR kbuild: userprogs: use correct linker when mixing clang and GNU ld kconfig: lxdialog: replace strcpy() with strncpy() in inputbox.c kconfig: lxdialog: replace strcpy with snprintf in print_autowrap kconfig: gconf: refactor text_insert_help() kconfig: gconf: remove unneeded variable in text_insert_msg kconfig: gconf: use hyphens in signals kconfig: gconf: replace GtkImageMenuItem with GtkMenuItem kconfig: gconf: Fix Back button behavior kconfig: gconf: fix single view to display dependent symbols correctly scripts: add zboot support to extract-vmlinux gendwarfksyms: order -T symtypes output by name gendwarfksyms: use preferred form of sizeof for allocation kconfig: qconf: confine {begin,end}Group to constructor and destructor kconfig: qconf: fix ConfigList::updateListAllforAll() kconfig: add a function to dump all menu entries in a tree-like format kconfig: gconf: show GTK version in About dialog kconfig: gconf: replace GtkHPaned and GtkVPaned with GtkPaned kconfig: gconf: replace GdkColor with GdkRGBA ...
2025-08-05vfio/pci: Do vf_token checks for VFIO_DEVICE_BIND_IOMMUFDJason Gunthorpe
This was missed during the initial implementation. The VFIO PCI encodes the vf_token inside the device name when opening the device from the group FD, something like: "0000:04:10.0 vf_token=bd8d9d2b-5a5f-4f5a-a211-f591514ba1f3" This is used to control access to a VF unless there is co-ordination with the owner of the PF. Since we no longer have a device name in the cdev path, pass the token directly through VFIO_DEVICE_BIND_IOMMUFD using an optional field indicated by VFIO_DEVICE_BIND_FLAG_TOKEN. Fixes: 5fcc26969a16 ("vfio: Add VFIO_DEVICE_BIND_IOMMUFD") Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/0-v3-bdd8716e85fe+3978a-vfio_token_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-05Merge tag 'mm-stable-2025-08-03-12-35' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: "Significant patch series in this pull request: - "mseal cleanups" (Lorenzo Stoakes) Some mseal cleaning with no intended functional change. - "Optimizations for khugepaged" (David Hildenbrand) Improve khugepaged throughput by batching PTE operations for large folios. This gain is mainly for arm64. - "x86: enable EXECMEM_ROX_CACHE for ftrace and kprobes" (Mike Rapoport) A bugfix, additional debug code and cleanups to the execmem code. - "mm/shmem, swap: bugfix and improvement of mTHP swap in" (Kairui Song) Bugfixes, cleanups and performance improvememnts to the mTHP swapin code" * tag 'mm-stable-2025-08-03-12-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (38 commits) mm: mempool: fix crash in mempool_free() for zero-minimum pools mm: correct type for vmalloc vm_flags fields mm/shmem, swap: fix major fault counting mm/shmem, swap: rework swap entry and index calculation for large swapin mm/shmem, swap: simplify swapin path and result handling mm/shmem, swap: never use swap cache and readahead for SWP_SYNCHRONOUS_IO mm/shmem, swap: tidy up swap entry splitting mm/shmem, swap: tidy up THP swapin checks mm/shmem, swap: avoid redundant Xarray lookup during swapin x86/ftrace: enable EXECMEM_ROX_CACHE for ftrace allocations x86/kprobes: enable EXECMEM_ROX_CACHE for kprobes allocations execmem: drop writable parameter from execmem_fill_trapping_insns() execmem: add fallback for failures in vmalloc(VM_ALLOW_HUGE_VMAP) execmem: move execmem_force_rw() and execmem_restore_rox() before use execmem: rework execmem_cache_free() execmem: introduce execmem_alloc_rw() execmem: drop unused execmem_update_copy() mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped mm/rmap: add anon_vma lifetime debug check mm: remove mm/io-mapping.c ...
2025-08-04Merge tag 'f2fs-for-6.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "Three main updates: folio conversion by Matthew, switch to a new mount API by Hongbo and Eric, and several sysfs entries to tune GCs for ZUFS with finer granularity by Daeho. There are also patches to address bugs and issues in the existing features such as GCs, file pinning, write-while-dio-read, contingous block allocation, and memory access violations. Enhancements: - switch to new mount API and folio conversion - add sysfs nodes to controle F2FS GCs for ZUFS - improve performance on the nat entry cache - drop inode from the donation list when the last file is closed - avoid splitting bio when reading multiple pages Bug fixes: - fix to trigger foreground gc during f2fs_map_blocks() in lfs mode - make sure zoned device GC to use FG_GC in shortage of free section - fix to calculate dirty data during has_not_enough_free_secs() - fix to update upper_p in __get_secs_required() correctly - wait for inflight dio completion, excluding pinned files read using dio - don't break allocation when crossing contiguous sections - vm_unmap_ram() may be called from an invalid context - fix to avoid out-of-boundary access in dnode page - fix to avoid panic in f2fs_evict_inode - fix to avoid UAF in f2fs_sync_inode_meta() - fix to use f2fs_is_valid_blkaddr_raw() in do_write_page() - fix UAF of f2fs_inode_info in f2fs_free_dic - fix to avoid invalid wait context issue - fix bio memleak when committing super block - handle nat.blkaddr corruption in f2fs_get_node_info() In addition, there are also clean-ups and minor bug fixes" * tag 'f2fs-for-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (109 commits) f2fs: drop inode from the donation list when the last file is closed f2fs: add gc_boost_gc_greedy sysfs node f2fs: add gc_boost_gc_multiple sysfs node f2fs: fix to trigger foreground gc during f2fs_map_blocks() in lfs mode f2fs: fix to calculate dirty data during has_not_enough_free_secs() f2fs: fix to update upper_p in __get_secs_required() correctly f2fs: directly add newly allocated pre-dirty nat entry to dirty set list f2fs: avoid redundant clean nat entry move in lru list f2fs: zone: wait for inflight dio completion, excluding pinned files read using dio f2fs: ignore valid ratio when free section count is low f2fs: don't break allocation when crossing contiguous sections f2fs: remove unnecessary tracepoint enabled check f2fs: merge the two conditions to avoid code duplication f2fs: vm_unmap_ram() may be called from an invalid context f2fs: fix to avoid out-of-boundary access in dnode page f2fs: switch to the new mount api f2fs: introduce fs_context_operation structure f2fs: separate the options parsing and options checking f2fs: Add f2fs_fs_context to record the mount options f2fs: Allow sbi to be NULL in f2fs_printk ...
2025-08-04Merge tag 'printk-for-6.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Add new "hash_pointers=[auto|always|never]" boot parameter to force the hashing even with "slab_debug" enabled - Allow to stop CPU, after losing nbcon console ownership during panic(), even without proper NMI - Allow to use the printk kthread immediately even for the 1st registered nbcon - Compiler warning removal * tag 'printk-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: nbcon: Allow reacquire during panic printk: Allow to use the printk kthread immediately even for 1st nbcon slab: Decouple slab_debug and no_hash_pointers vsprintf: Use __diag macros to disable '-Wsuggest-attribute=format' compiler-gcc.h: Introduce __diag_GCC_all
2025-08-04Merge branch 'for-6.17-hash_pointers' into for-linusPetr Mladek
2025-08-04Merge branch 'for-6.15-printf-attribute' into for-linusPetr Mladek
2025-08-03Merge tag 'ib-mfd-gpio-input-pwm-v6.17' of ↵Dmitry Torokhov
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into next Merge an immutable branch between MFD, GPIO, Input and PWM to resolve conflicts for the merge window pull request.
2025-08-03Merge tag 'rtc-6.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "Support for a new RTC in an existing driver and all the drivers exposing clocks using the common clock framework have been converted to determine_rate(). Summary: Subsystem: - Convert drivers exposing a clock from round_rate() to determine_rate() Drivers: - ds1307: oscillator stop flag handling for ds1341 - pcf85063: add support for RV8063" * tag 'rtc-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (34 commits) rtc: ds1685: Update Joshua Kinard's email address. rtc: rv3032: convert from round_rate() to determine_rate() rtc: rv3028: convert from round_rate() to determine_rate() rtc: pcf8563: convert from round_rate() to determine_rate() rtc: pcf85063: convert from round_rate() to determine_rate() rtc: nct3018y: convert from round_rate() to determine_rate() rtc: max31335: convert from round_rate() to determine_rate() rtc: m41t80: convert from round_rate() to determine_rate() rtc: hym8563: convert from round_rate() to determine_rate() rtc: ds1307: convert from round_rate() to determine_rate() rtc: rv3028: fix incorrect maximum clock rate handling rtc: pcf8563: fix incorrect maximum clock rate handling rtc: pcf85063: fix incorrect maximum clock rate handling rtc: nct3018y: fix incorrect maximum clock rate handling rtc: hym8563: fix incorrect maximum clock rate handling rtc: ds1307: fix incorrect maximum clock rate handling rtc: pcf85063: scope pcf85063_config structures rtc: Optimize calculations in rtc_time64_to_tm() dt-bindings: rtc: amlogic,a4-rtc: Add compatible string for C3 rtc: ds1307: handle oscillator stop flag (OSF) for ds1341 ...
2025-08-03Merge tag 'mm-nonmm-stable-2025-08-03-12-47' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Significant patch series in this pull request: - "squashfs: Remove page->mapping references" (Matthew Wilcox) gets us closer to being able to remove page->mapping - "relayfs: misc changes" (Jason Xing) does some maintenance and minor feature addition work in relayfs - "kdump: crashkernel reservation from CMA" (Jiri Bohac) switches us from static preallocation of the kdump crashkernel's working memory over to dynamic allocation. So the difficulty of a-priori estimation of the second kernel's needs is removed and the first kernel obtains extra memory - "generalize panic_print's dump function to be used by other kernel parts" (Feng Tang) implements some consolidation and rationalization of the various ways in which a failing kernel splats information at the operator * tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (80 commits) tools/getdelays: add backward compatibility for taskstats version kho: add test for kexec handover delaytop: enhance error logging and add PSI feature description samples: Kconfig: fix spelling mistake "instancess" -> "instances" fat: fix too many log in fat_chain_add() scripts/spelling.txt: add notifer||notifier to spelling.txt xen/xenbus: fix typo "notifer" net: mvneta: fix typo "notifer" drm/xe: fix typo "notifer" cxl: mce: fix typo "notifer" KVM: x86: fix typo "notifer" MAINTAINERS: add maintainers for delaytop ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below() ucount: fix atomic_long_inc_below() argument type kexec: enable CMA based contiguous allocation stackdepot: make max number of pools boot-time configurable lib/xxhash: remove unused functions init/Kconfig: restore CONFIG_BROKEN help text lib/raid6: update recov_rvv.c zero page usage docs: update docs after introducing delaytop ...
2025-08-03Merge tag 'trace-v6.17-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull more tracing updates from Steven Rostedt: - Remove unneeded goto out statements Over time, the logic was restructured but left a "goto out" where the out label simply did a "return ret;". Instead of jumping to this out label, simply return immediately and remove the out label. - Add guard(ring_buffer_nest) Some calls to the tracing ring buffer can happen when the ring buffer is already being written to at the same context (for example, a trace_printk() in between a ring_buffer_lock_reserve() and a ring_buffer_unlock_commit()). In order to not trigger the recursion detection, these functions use ring_buffer_nest_start() and ring_buffer_nest_end(). Create a guard() for these functions so that their use cases can be simplified and not need to use goto for the release. - Clean up the tracing code with guard() and __free() logic There were several locations that were prime candidates for using guard() and __free() helpers. Switch them over to use them. - Fix output of function argument traces for unsigned int values The function tracer with "func-args" option set will record up to 6 argument registers and then use BTF to format them for human consumption when the trace file is read. There are several arguments that are "unsigned long" and even "unsigned int" that are either and address or a mask. It is easier to understand if they were printed using hexadecimal instead of decimal. The old method just printed all non-pointer values as signed integers, which made it even worse for unsigned integers. For instance, instead of: __local_bh_disable_ip(ip=-2127311112, cnt=256) <-handle_softirqs show: __local_bh_disable_ip(ip=0xffffffff8133cef8, cnt=0x100) <-handle_softirqs" * tag 'trace-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Have unsigned int function args displayed as hexadecimal ring-buffer: Convert ring_buffer_write() to use guard(preempt_notrace) tracing: Use __free(kfree) in trace.c to remove gotos tracing: Add guard() around locks and mutexes in trace.c tracing: Add guard(ring_buffer_nest) tracing: Remove unneeded goto out logic
2025-08-03Merge tag 'modules-6.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux Pull module updates from Daniel Gomez: "This is a small set of changes for modules, primarily to extend module users to use the module data structures in combination with the already no-op stub module functions, even when support for modules is disabled in the kernel configuration. This change follows the kernel's coding style for conditional compilation and allows kunit code to drop all CONFIG_MODULES ifdefs, which is also part of the changes. This should allow others part of the kernel to do the same cleanup. The remaining changes include a fix for module name length handling which could potentially lead to the removal of an incorrect module, and various cleanups" * tag 'modules-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux: module: Rename MAX_PARAM_PREFIX_LEN to __MODULE_NAME_LEN tracing: Replace MAX_PARAM_PREFIX_LEN with MODULE_NAME_LEN module: Restore the moduleparam prefix length check module: Remove unnecessary +1 from last_unloaded_module::name size module: Prevent silent truncation of module name in delete_module(2) kunit: test: Drop CONFIG_MODULE ifdeffery module: make structure definitions always visible module: move 'struct module_use' to internal.h
2025-08-03Merge tag 'i3c/for-6.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux Pull i3c updates from Alexandre Belloni: "New driver: - Renesas I3C controller Subsystem: - use adapter timeout value for I2C transfers - don't fail if GETHDRCAP is unsupported - replace ENOTSUPP with SUSV4-compliant EOPNOTSUPP Drivers: - svc: Fix npcm845 FIFO_EMPTY quirk" * tag 'i3c/for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux: (25 commits) i3c: add missing include to internal header i3c: dw: Remove redundant pm_runtime_mark_last_busy() calls i3c: master: svc: Remove redundant pm_runtime_mark_last_busy() calls i3c: master: svc: Fix npcm845 FIFO_EMPTY quirk i3c: master: Add basic driver for the Renesas I3C controller dt-bindings: i3c: Add Renesas I3C controller i3c: Add more parameters for controllers to the header i3c: Standardize defines for specification parameters i3c: fix module_i3c_i2c_driver() with I3C=n i3c: master: cdns: Simplify handling clocks in probe() i3c: Fix i3c_device_do_priv_xfers() kernel-doc indentation i3c: master: dw: Use i3c_writel_fifo() and i3c_readl_fifo() i3c: master: cdns: Use i3c_writel_fifo() and i3c_readl_fifo() i3c: master: Add inline i3c_readl_fifo() and i3c_writel_fifo() i3c: prefix hexadecimal entries in sysfs i3c: master: cdns: replace ENOTSUPP with SUSV4-compliant EOPNOTSUPP i3c: dw: replace ENOTSUPP with SUSV4-compliant EOPNOTSUPP i3c: master: replace ENOTSUPP with SUSV4-compliant EOPNOTSUPP i3c: don't fail if GETHDRCAP is unsupported i3c: add patchwork entry to MAINTAINERS ...
2025-08-11firewire: core: use reference counting to invoke address handlers safelyTakashi Sakamoto
The lifetime of address handler has been managed by linked list and RCU. This approach was introduced in commit 35202f7d8420 ("firewire: remove global lock around address handlers, convert to RCU"). The invocations of address handler are performed within RCU read-side critical sections. In commit 57e6d9f85fff ("firewire: ohci: use workqueue to handle events of AR request/response contexts"), the invocations are in a workqueue context. The approach still imposes limitation that sleeping is not allowed within RCU read-side critical sections. However, since sleeping is not permitted within RCU read-side critical sections, this approach still has a limitation. This commit adds reference counting to decouple handler invocation from handler discovery. The linked list and RCU is used to discover the handlers, while the reference counting is used to invoke them safely. Link: https://lore.kernel.org/r/20250803122015.236493-2-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
2025-08-03rtc: ds1685: Update Joshua Kinard's email address.Joshua Kinard
I am switching my address to a personal domain, so need to update the driver's files and the entry in MAINTAINERS. Signed-off-by: Joshua Kinard <kumba@gentoo.org> Link: https://lore.kernel.org/r/20250721170051.32407-1-kumba@gentoo.org Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2025-08-02Merge tag 'pinctrl-v6.17-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control updates from Linus Walleij: "Nothing stands out, apart from maybe the interesting Eswin EIC7700, a RISC-V SoC I've never seen before. Core changes: - Open code PINCTRL_FUNCTION_DESC() instead of defining a complex macro only used in one place - Add pinmux_generic_add_pinfunction() helper and use this in a few drivers New drivers: - Amlogic S7, S7D and S6 pin control support - Eswin EIC7700 pin control support - Qualcomm PMIV0104, PM7550 and Milos pin control support Because of unhelpful numbering schemes, the Qualcomm driver now needs to start to rely on SoC codenames - STM32 HDP pin control support - Mediatek MT8189 pin control support Improvements: - Switch remaining pin control drivers over to the new GPIO set callback that provides a return value - Support RSVD (reserved) pins in the STM32 driver - Move many fixed assignments over to pinctrl_desc definitions - Handle multiple TLMM regions in the Qualcomm driver" * tag 'pinctrl-v6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (105 commits) pinctrl: mediatek: Add pinctrl driver for mt8189 dt-bindings: pinctrl: mediatek: Add support for mt8189 pinctrl: aspeed-g6: Add PCIe RC PERST pin group pinctrl: ingenic: use pinmux_generic_add_pinfunction() pinctrl: keembay: use pinmux_generic_add_pinfunction() pinctrl: mediatek: moore: use pinmux_generic_add_pinfunction() pinctrl: airoha: use pinmux_generic_add_pinfunction() pinctrl: equilibrium: use pinmux_generic_add_pinfunction() pinctrl: provide pinmux_generic_add_pinfunction() pinctrl: pinmux: open-code PINCTRL_FUNCTION_DESC() pinctrl: ma35: use new GPIO line value setter callbacks MAINTAINERS: add Clément Le Goffic as STM32 HDP maintainer pinctrl: stm32: Introduce HDP driver dt-bindings: pinctrl: stm32: Introduce HDP pinctrl: qcom: Add Milos pinctrl driver dt-bindings: pinctrl: document the Milos Top Level Mode Multiplexer pinctrl: qcom: spmi: Add PM7550 dt-bindings: pinctrl: qcom,pmic-gpio: Add PM7550 support pinctrl: qcom: spmi: Add PMIV0104 dt-bindings: pinctrl: qcom,pmic-gpio: Add PMIV0104 support ...
2025-08-02execmem: drop writable parameter from execmem_fill_trapping_insns()Mike Rapoport (Microsoft)
After update of execmem_cache_free() that made memory writable before updating it, there is no need to update read only memory, so the writable parameter to execmem_fill_trapping_insns() is not needed. Drop it. Link: https://lkml.kernel.org/r/20250713071730.4117334-7-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Daniel Gomez <da.gomez@samsung.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02execmem: introduce execmem_alloc_rw()Mike Rapoport (Microsoft)
Some callers of execmem_alloc() require the memory to be temporarily writable even when it is allocated from ROX cache. These callers use execemem_make_temp_rw() right after the call to execmem_alloc(). Wrap this sequence in execmem_alloc_rw() API. Link: https://lkml.kernel.org/r/20250713071730.4117334-3-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02execmem: drop unused execmem_update_copy()Mike Rapoport (Microsoft)
Patch series "x86: enable EXECMEM_ROX_CACHE for ftrace and kprobes", v3. These patches enable use of EXECMEM_ROX_CACHE for ftrace and kprobes allocations on x86. They also include some ground work in execmem. Since the execmem model for caching large ROX pages changed from the initial assumption that the memory that is allocated from ROX cache is always ROX to the current state where memory can be temporarily made RW and then restored to ROX, we can stop using text poking to update it. This also saves the hassle of trying lock text_mutex in execmem_cache_free() when kprobes already hold that mutex. This patch (of 8): The execmem_update_copy() that used text poking was required when memory allocated from ROX cache was always read-only. Since now its permissions can be switched to read-write there is no need in a function that updates memory with text poking. Remove it. Link: https://lkml.kernel.org/r/20250713071730.4117334-1-rppt@kernel.org Link: https://lkml.kernel.org/r/20250713071730.4117334-2-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Daniel Gomez <da.gomez@samsung.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got droppedSuren Baghdasaryan
By inducing delays in the right places, Jann Horn created a reproducer for a hard to hit UAF issue that became possible after VMAs were allowed to be recycled by adding SLAB_TYPESAFE_BY_RCU to their cache. Race description is borrowed from Jann's discovery report: lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under rcu_read_lock(). At that point, the VMA may be concurrently freed, and it can be recycled by another process. vma_start_read() then increments the vma->vm_refcnt (if it is in an acceptable range), and if this succeeds, vma_start_read() can return a recycled VMA. In this scenario where the VMA has been recycled, lock_vma_under_rcu() will then detect the mismatching ->vm_mm pointer and drop the VMA through vma_end_read(), which calls vma_refcount_put(). vma_refcount_put() drops the refcount and then calls rcuwait_wake_up() using a copy of vma->vm_mm. This is wrong: It implicitly assumes that the caller is keeping the VMA's mm alive, but in this scenario the caller has no relation to the VMA's mm, so the rcuwait_wake_up() can cause UAF. The diagram depicting the race: T1 T2 T3 == == == lock_vma_under_rcu mas_walk <VMA gets removed from mm> mmap <the same VMA is reallocated> vma_start_read __refcount_inc_not_zero_limited_acquire munmap __vma_enter_locked refcount_add_not_zero vma_end_read vma_refcount_put __refcount_dec_and_test rcuwait_wait_event <finish operation> rcuwait_wake_up [UAF] Note that rcuwait_wait_event() in T3 does not block because refcount was already dropped by T1. At this point T3 can exit and free the mm causing UAF in T1. To avoid this we move vma->vm_mm verification into vma_start_read() and grab vma->vm_mm to stabilize it before vma_refcount_put() operation. [surenb@google.com: v3] Link: https://lkml.kernel.org/r/20250729145709.2731370-1-surenb@google.com Link: https://lkml.kernel.org/r/20250728175355.2282375-1-surenb@google.com Fixes: 3104138517fc ("mm: make vma cache SLAB_TYPESAFE_BY_RCU") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: Jann Horn <jannh@google.com> Closes: https://lore.kernel.org/all/CAG48ez0-deFbVH=E3jbkWx=X3uVbd8nWeo6kbJPQ0KoUD+m2tA@mail.gmail.com/ Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02mm/rmap: add anon_vma lifetime debug checkJann Horn
If an anon folio is mapped into userspace, its anon_vma must be alive, otherwise rmap walks can hit UAF. There have been syzkaller reports a few months ago[1][2] of UAF in rmap walks that seems to indicate that there can be pages with elevated mapcount whose anon_vma has already been freed, but I think we never figured out what the cause is; and syzkaller only hit these UAFs when memory pressure randomly caused reclaim to rmap-walk the affected pages, so it of course didn't manage to create a reproducer. Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios to hopefully catch such issues more reliably. [1] https://lore.kernel.org/r/67abaeaf.050a0220.110943.0041.GAE@google.com [2] https://lore.kernel.org/r/67a76f33.050a0220.3d72c.0028.GAE@google.com Link: https://lkml.kernel.org/r/20250725-anonvma-uaf-debug-v2-1-bc3c7e5ba5b1@google.com Signed-off-by: Jann Horn <jannh@google.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Harry Yoo <harry.yoo@oracle.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02mm: remove mm/io-mapping.cLorenzo Stoakes
This is dead code, which was used from commit b739f125e4eb ("i915: use io_mapping_map_user") but reverted a month later by commit 0e4fe0c9f2f9 ("Revert "i915: use io_mapping_map_user"") back in 2021. Since then nobody has used it, so remove it. [akpm@linux-foundation.org: update Documentation/core-api/mm-api.rst, per Vlastimil] Link: https://lkml.kernel.org/r/20250725142901.81502-1-lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02mm: add get_and_clear_ptes() and clear_ptes()David Hildenbrand
Patch series "Optimizations for khugepaged", v4. If the underlying folio mapped by the ptes is large, we can process those ptes in a batch using folio_pte_batch(). For arm64 specifically, this results in a 16x reduction in the number of ptep_get() calls, since on a contig block, ptep_get() on arm64 will iterate through all 16 entries to collect a/d bits. Next, ptep_clear() will cause a TLBI for every contig block in the range via contpte_try_unfold(). Instead, use clear_ptes() to only do the TLBI at the first and last contig block of the range. For split folios, there will be no pte batching; the batch size returned by folio_pte_batch() will be 1. For pagetable split folios, the ptes will still point to the same large folio; for arm64, this results in the optimization described above, and for other arches, a minor improvement is expected due to a reduction in the number of function calls and batching atomic operations. This patch (of 3): Let's add variants to be used where "full" does not apply -- which will be the majority of cases in the future. "full" really only applies if we are about to tear down a full MM. Use get_and_clear_ptes() in existing code, clear_ptes() users will be added next. Link: https://lkml.kernel.org/r/20250724052301.23844-2-dev.jain@arm.com Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Barry Song <baohua@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mariano Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02mm/mseal: always define VM_SEALEDLorenzo Stoakes
Patch series "mseal cleanups", v4. Perform a number of cleanups to the mseal logic. Firstly, VM_SEALED is treated differently from every other VMA flag, it really doesn't make sense to do this, so we start by making this consistent with everything else. Next we place the madvise logic where it belongs - in mm/madvise.c. It really makes no sense to abstract this elsewhere. In doing so, we go to great lengths to explain very clearly the previously very confusing logic as to what sealed mappings are impacted here. In doing so, we retain existing logic regarding treatment of madvise() discard operations for a sealed, read-only MAP_PRIVATE file-backed mapping. This is something we likely need to revisit. We then abstract out and explain the 'are there are any gaps in this range in the mm?' check being performed as a prerequisite to mseal being performed. Finally, we simplify the actual mseal logic which is really quite straightforward. No functional change is intended. This patch (of 4): There is no reason to treat VM_SEALED in a special way, in each other case in which a VMA flag is unavailable due to configuration, we simply assign that flag to VM_NONE, so make VM_SEALED consistent with all other VMA flags in this respect. Additionally, use the next available bit for VM_SEALED, 42, rather than arbitrarily putting it at 63 and update the declaration to match all other VMA flags. No functional change intended. Link: https://lkml.kernel.org/r/cover.1753431105.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/aeb398a77029b6e7377cd944328bc9bbc3c90537.1753431105.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Acked-by: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jeff Xu <jeffxu@chromium.org> Cc: Kees Cook <kees@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02mm/page-flags: remove folio_start_writeback_keepwrite()Joanne Koong
Commit cd57b77197a4 ("ext4: Convert ext4_bio_write_page() to use a folio) removed set_page_writeback_keepwrite() which was the last/only caller of folio_start_writeback_keepwrite(). Link: https://lkml.kernel.org/r/20250722182230.2114587-1-joannelkoong@gmail.com Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02kexec: enable CMA based contiguous allocationAlexander Graf
When booting a new kernel with kexec_file, the kernel picks a target location that the kernel should live at, then allocates random pages, checks whether any of those patches magically happens to coincide with a target address range and if so, uses them for that range. For every page allocated this way, it then creates a page list that the relocation code - code that executes while all CPUs are off and we are just about to jump into the new kernel - copies to their final memory location. We can not put them there before, because chances are pretty good that at least some page in the target range is already in use by the currently running Linux environment. Copying is happening from a single CPU at RAM rate, which takes around 4-50 ms per 100 MiB. All of this is inefficient and error prone. To successfully kexec, we need to quiesce all devices of the outgoing kernel so they don't scribble over the new kernel's memory. We have seen cases where that does not happen properly (*cough* GIC *cough*) and hence the new kernel was corrupted. This started a month long journey to root cause failing kexecs to eventually see memory corruption, because the new kernel was corrupted severely enough that it could not emit output to tell us about the fact that it was corrupted. By allocating memory for the next kernel from a memory range that is guaranteed scribbling free, we can boot the next kernel up to a point where it is at least able to detect corruption and maybe even stop it before it becomes severe. This increases the chance for successful kexecs. Since kexec got introduced, Linux has gained the CMA framework which can perform physically contiguous memory mappings, while keeping that memory available for movable memory when it is not needed for contiguous allocations. The default CMA allocator is for DMA allocations. This patch adds logic to the kexec file loader to attempt to place the target payload at a location allocated from CMA. If successful, it uses that memory range directly instead of creating copy instructions during the hot phase. To ensure that there is a safety net in case anything goes wrong with the CMA allocation, it also adds a flag for user space to force disable CMA allocations. Using CMA allocations has two advantages: 1) Faster by 4-50 ms per 100 MiB. There is no more need to copy in the hot phase. 2) More robust. Even if by accident some page is still in use for DMA, the new kernel image will be safe from that access because it resides in a memory region that is considered allocated in the old kernel and has a chance to reinitialize that component. Link: https://lkml.kernel.org/r/20250610085327.51817-1-graf@amazon.com Signed-off-by: Alexander Graf <graf@amazon.com> Acked-by: Baoquan He <bhe@redhat.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Zhongkun He <hezhongkun.hzk@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02lib/xxhash: remove unused functionsDr. David Alan Gilbert
xxh32_digest() and xxh32_update() were added in 2017 in the original xxhash commit, but have remained unused. Remove them. Link: https://lkml.kernel.org/r/20250716133245.243363-1-linux@treblig.org Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Dave Gilbert <linux@treblig.org> Cc: Nick Terrell <terrelln@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02Merge tag 'firewire-updates-6.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire updates from Takashi Sakamoto: "This update replaces the remaining tasklet usage in the FireWire subsystem with workqueue for asynchronous packet transmission. With this change, tasklets are now fully eliminated from the subsystem. Asynchronous packet transmission is used for serial bus topology management as well as for the operation of the SBP-2 protocol driver (firewire-sbp2). To ensure reliability during low-memory conditions, the associated workqueue is created with the WQ_MEM_RECLAIM flag, allowing it to participate in memory reclaim paths. Other attributes are aligned with those used for isochronous packet handling, which was migrated to workqueues in v6.12. The workqueues are sleepable and support preemptible work items, making them more suitable for real-time workloads that benefit from timely task preemption at the system level. There remains an issue where 'schedule()' may be called within an RCU read-side critical section, due to a direct replacement of 'tasklet_disable_in_atomic()' with 'disable_work_sync()'. A proposed fix for this has been posted[1], and is currently under review and testing. It is expected to be sent upstream later" Link: https://lore.kernel.org/lkml/20250728015125.17825-1-o-takashi@sakamocchi.jp/ [1] * tag 'firewire-updates-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: ohci: reduce the size of common context structure by extracting members into AT structure firewire: core: minor code refactoring to localize table of gap count firewire: ohci: use workqueue to handle events of AT request/response contexts firewire: ohci: use workqueue to handle events of AR request/response contexts firewire: core: allocate workqueue for AR/AT request/response contexts firewire: core: use from_work() macro to expand parent structure of work_struct firewire: ohci: use from_work() macro to expand parent structure of work_struct firewire: ohci: correct code comments about bus_reset tasklet