summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-07-28Merge tag 'vfs-6.17-rc1.fallocate' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull fallocate updates from Christian Brauner: "fallocate() currently supports creating preallocated files efficiently. However, on most filesystems fallocate() will preallocate blocks in an unwriten state even if FALLOC_FL_ZERO_RANGE is specified. The extent state must later be converted to a written state when the user writes data into this range, which can trigger numerous metadata changes and journal I/O. This may leads to significant write amplification and performance degradation in synchronous write mode. At the moment, the only method to avoid this is to create an empty file and write zero data into it (for example, using 'dd' with a large block size). However, this method is slow and consumes a considerable amount of disk bandwidth. Now that more and more flash-based storage devices are available it is possible to efficiently write zeros to SSDs using the unmap write zeroes command if the devices do not write physical zeroes to the media. For example, if SCSI SSDs support the UMMAP bit or NVMe SSDs support the DEAC bit[1], the write zeroes command does not write actual data to the device, instead, NVMe converts the zeroed range to a deallocated state, which works fast and consumes almost no disk write bandwidth. This series implements the BLK_FEAT_WRITE_ZEROES_UNMAP feature and BLK_FLAG_WRITE_ZEROES_UNMAP_DISABLED flag for SCSI, NVMe and device-mapper drivers, and add the FALLOC_FL_WRITE_ZEROES and STATX_ATTR_WRITE_ZEROES_UNMAP support for ext4 and raw bdev devices. fallocate() is subsequently extended with the FALLOC_FL_WRITE_ZEROES flag. FALLOC_FL_WRITE_ZEROES zeroes a specified file range in such a way that subsequent writes to that range do not require further changes to the file mapping metadata. This flag is beneficial for subsequent pure overwriting within this range, as it can save on block allocation and, consequently, significant metadata changes" * tag 'vfs-6.17-rc1.fallocate' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: ext4: add FALLOC_FL_WRITE_ZEROES support block: add FALLOC_FL_WRITE_ZEROES support block: factor out common part in blkdev_fallocate() fs: introduce FALLOC_FL_WRITE_ZEROES to fallocate dm: clear unmap write zeroes limits when disabling write zeroes scsi: sd: set max_hw_wzeroes_unmap_sectors if device supports SD_ZERO_*_UNMAP nvmet: set WZDS and DRB if device enables unmap write zeroes operation nvme: set max_hw_wzeroes_unmap_sectors if device supports DEAC bit block: introduce max_{hw|user}_wzeroes_unmap_sectors to queue limits
2025-07-28Merge tag 'vfs-6.17-rc1.async.dir' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull async directory updates from Christian Brauner: "This contains preparatory changes for the asynchronous directory locking scheme. While the locking scheme is still very much controversial and we're still far away from landing any actual changes in that area the preparatory work that we've been upstreaming for a while now has been very useful. This is another set of minor changes and cleanups" * tag 'vfs-6.17-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: exportfs: use lookup_one_unlocked() coda: use iterate_dir() in coda_readdir() VFS: Minor fixes for porting.rst VFS: merge lookup_one_qstr_excl_raw() back into lookup_one_qstr_excl()
2025-07-28Merge tag 'vfs-6.17-rc1.nsfs' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull namespace updates from Christian Brauner: "This contains namespace updates. This time specifically for nsfs: - Userspace heavily relies on the root inode numbers for namespaces to identify the initial namespaces. That's already a hard dependency. So we cannot change that anymore. Move the initial inode numbers to a public header and align the only two namespaces that currently don't do that with all the other namespaces. - The root inode of /proc having a fixed inode number has been part of the core kernel ABI since its inception, and recently some userspace programs (mainly container runtimes) have started to explicitly depend on this behaviour. The main reason this is useful to userspace is that by checking that a suspect /proc handle has fstype PROC_SUPER_MAGIC and is PROCFS_ROOT_INO, they can then use openat2() together with RESOLVE_{NO_{XDEV,MAGICLINK},BENEATH} to ensure that there isn't a bind-mount that replaces some procfs file with a different one. This kind of attack has lead to security issues in container runtimes in the past (such as CVE-2019-19921) and libraries like libpathrs[1] use this feature of procfs to provide safe procfs handling functions" * tag 'vfs-6.17-rc1.nsfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: uapi: export PROCFS_ROOT_INO mntns: use stable inode number for initial mount ns netns: use stable inode number for initial mount ns nsfs: move root inode number to uapi
2025-07-28Merge tag 'vfs-6.17-rc1.ovl' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull overlayfs updates from Christian Brauner: "This contains overlayfs updates for this cycle. The changes for overlayfs in here are primarily focussed on preparing for some proposed changes to directory locking. Overlayfs currently will sometimes lock a directory on the upper filesystem and do a few different things while holding the lock. This is incompatible with the new potential scheme. This series narrows the region of code protected by the directory lock, taking it multiple times when necessary. This theoretically opens up the possibilty of other changes happening on the upper filesytem between the unlock and the lock. To some extent the patches guard against that by checking the dentries still have the expect parent after retaking the lock. In general, concurrent changes to the upper and lower filesystems aren't supported properly anyway" * tag 'vfs-6.17-rc1.ovl' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits) ovl: properly print correct variable ovl: rename ovl_cleanup_unlocked() to ovl_cleanup() ovl: change ovl_create_real() to receive dentry parent ovl: narrow locking in ovl_check_rename_whiteout() ovl: narrow locking in ovl_whiteout() ovl: change ovl_cleanup_and_whiteout() to take rename lock as needed ovl: narrow locking on ovl_remove_and_whiteout() ovl: change ovl_workdir_cleanup() to take dir lock as needed. ovl: narrow locking in ovl_workdir_cleanup_recurse() ovl: narrow locking in ovl_indexdir_cleanup() ovl: narrow locking in ovl_workdir_create() ovl: narrow locking in ovl_cleanup_index() ovl: narrow locking in ovl_cleanup_whiteouts() ovl: narrow locking in ovl_rename() ovl: simplify gotos in ovl_rename() ovl: narrow locking in ovl_create_over_whiteout() ovl: narrow locking in ovl_clear_empty() ovl: narrow locking in ovl_create_upper() ovl: narrow the locked region in ovl_copy_up_workdir() ovl: Call ovl_create_temp() without lock held. ...
2025-07-28Merge tag 'vfs-6.17-rc1.coredump' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull coredump updates from Christian Brauner: "This contains an extension to the coredump socket and a proper rework of the coredump code. - This extends the coredump socket to allow the coredump server to tell the kernel how to process individual coredumps. This allows for fine-grained coredump management. Userspace can decide to just let the kernel write out the coredump, or generate the coredump itself, or just reject it. * COREDUMP_KERNEL The kernel will write the coredump data to the socket. * COREDUMP_USERSPACE The kernel will not write coredump data but will indicate to the parent that a coredump has been generated. This is used when userspace generates its own coredumps. * COREDUMP_REJECT The kernel will skip generating a coredump for this task. * COREDUMP_WAIT The kernel will prevent the task from exiting until the coredump server has shutdown the socket connection. The flexible coredump socket can be enabled by using the "@@" prefix instead of the single "@" prefix for the regular coredump socket: @@/run/systemd/coredump.socket - Cleanup the coredump code properly while we have to touch it anyway. Split out each coredump mode in a separate helper so it's easy to grasp what is going on and make the code easier to follow. The core coredump function should now be very trivial to follow" * tag 'vfs-6.17-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (31 commits) cleanup: add a scoped version of CLASS() coredump: add coredump_skip() helper coredump: avoid pointless variable coredump: order auto cleanup variables at the top coredump: add coredump_cleanup() coredump: auto cleanup prepare_creds() cred: add auto cleanup method coredump: directly return coredump: auto cleanup argv coredump: add coredump_write() coredump: use a single helper for the socket coredump: move pipe specific file check into coredump_pipe() coredump: split pipe coredumping into coredump_pipe() coredump: move core_pipe_count to global variable coredump: prepare to simplify exit paths coredump: split file coredumping into coredump_file() coredump: rename do_coredump() to vfs_coredump() selftests/coredump: make sure invalid paths are rejected coredump: validate socket path in coredump_parse() coredump: don't allow ".." in coredump socket path ...
2025-07-28Merge tag 'vfs-6.17-rc1.misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc VFS updates from Christian Brauner: "This contains the usual selections of misc updates for this cycle. Features: - Add ext4 IOCB_DONTCACHE support This refactors the address_space_operations write_begin() and write_end() callbacks to take const struct kiocb * as their first argument, allowing IOCB flags such as IOCB_DONTCACHE to propagate to the filesystem's buffered I/O path. Ext4 is updated to implement handling of the IOCB_DONTCACHE flag and advertises support via the FOP_DONTCACHE file operation flag. Additionally, the i915 driver's shmem write paths are updated to bypass the legacy write_begin/write_end interface in favor of directly calling write_iter() with a constructed synchronous kiocb. Another i915 change replaces a manual write loop with kernel_write() during GEM shmem object creation. Cleanups: - don't duplicate vfs_open() in kernel_file_open() - proc_fd_getattr(): don't bother with S_ISDIR() check - fs/ecryptfs: replace snprintf with sysfs_emit in show function - vfs: Remove unnecessary list_for_each_entry_safe() from evict_inodes() - filelock: add new locks_wake_up_waiter() helper - fs: Remove three arguments from block_write_end() - VFS: change old_dir and new_dir in struct renamedata to dentrys - netfs: Remove unused declaration netfs_queue_write_request() Fixes: - eventpoll: Fix semi-unbounded recursion - eventpoll: fix sphinx documentation build warning - fs/read_write: Fix spelling typo - fs: annotate data race between poll_schedule_timeout() and pollwake() - fs/pipe: set FMODE_NOWAIT in create_pipe_files() - docs/vfs: update references to i_mutex to i_rwsem - fs/buffer: remove comment about hard sectorsize - fs/buffer: remove the min and max limit checks in __getblk_slow() - fs/libfs: don't assume blocksize <= PAGE_SIZE in generic_check_addressable - fs_context: fix parameter name in infofc() macro - fs: Prevent file descriptor table allocations exceeding INT_MAX" * tag 'vfs-6.17-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (24 commits) netfs: Remove unused declaration netfs_queue_write_request() eventpoll: fix sphinx documentation build warning ext4: support uncached buffered I/O mm/pagemap: add write_begin_get_folio() helper function fs: change write_begin/write_end interface to take struct kiocb * drm/i915: Refactor shmem_pwrite() to use kiocb and write_iter drm/i915: Use kernel_write() in shmem object create eventpoll: Fix semi-unbounded recursion vfs: Remove unnecessary list_for_each_entry_safe() from evict_inodes() fs/libfs: don't assume blocksize <= PAGE_SIZE in generic_check_addressable fs/buffer: remove the min and max limit checks in __getblk_slow() fs: Prevent file descriptor table allocations exceeding INT_MAX fs: Remove three arguments from block_write_end() fs/ecryptfs: replace snprintf with sysfs_emit in show function fs: annotate suspected data race between poll_schedule_timeout() and pollwake() docs/vfs: update references to i_mutex to i_rwsem fs/buffer: remove comment about hard sectorsize fs_context: fix parameter name in infofc() macro VFS: change old_dir and new_dir in struct renamedata to dentrys proc_fd_getattr(): don't bother with S_ISDIR() check ...
2025-07-28Merge tag 'pull-mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull vfs mount updates from Al Viro: - mount hash conflicts rudiments are gone now - we do not allow multiple mounts with the same parent/mountpoint to be hashed at the same time. - 'struct mount' changes: - mnt_umounting is gone - mnt_slave_list/mnt_slave is an hlist now - overmounts are kept track of by explicit pointer in mount - a bunch of flags moved out of mnt_flags to a new field, with only namespace_sem for protection - mnt_expiry is protected by mount_lock now (instead of namespace_sem) - MNT_LOCKED is used only for mounts that need to remain attached to their parents to prevent mountpoint exposure - no more overloading it for absolute roots - all mnt_list uses are transient now - it's used only to represent temporary sets during umount_tree() - mount refcounting change: children no longer pin parents for any mounts, whether they'd passed through umount_tree() or not - 'struct mountpoint' changes: - refcount is no more; what matters is ->m_list emptiness - instead of temporary bumping the refcount, we insert a new object (pinned_mountpoint) into ->m_list - new calling conventions for lock_mount() and friends - do_move_mount()/attach_recursive_mnt() seriously cleaned up - globals in fs/pnode.c are gone - propagate_mnt(), change_mnt_propagation() and propagate_umount() cleaned up (in the last case - pretty much completely rewritten). - freeing of emptied mnt_namespace is done in namespace_unlock(). For one thing, there are subtle ordering requirements there; for another it simplifies cleanups. - assorted cleanups - restore the machinery for long-term mounts from accumulated bitrot. This is going to get a followup come next cycle, when the change of vfs_fs_parse_string() calling conventions goes into -next * tag 'pull-mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (48 commits) statmount_mnt_basic(): simplify the logics for group id invent_group_ids(): zero ->mnt_group_id always implies !IS_MNT_SHARED() get rid of CL_SHARE_TO_SLAVE take freeing of emptied mnt_namespace to namespace_unlock() copy_tree(): don't link the mounts via mnt_list change_mnt_propagation(): move ->mnt_master assignment into MS_SLAVE case mnt_slave_list/mnt_slave: turn into hlist_head/hlist_node turn do_make_slave() into transfer_propagation() do_make_slave(): choose new master sanely change_mnt_propagation(): do_make_slave() is a no-op unless IS_MNT_SHARED() change_mnt_propagation() cleanups, step 1 propagate_mnt(): fix comment and convert to kernel-doc, while we are at it propagate_mnt(): get rid of last_dest fs/pnode.c: get rid of globals propagate_one(): fold into the sole caller propagate_one(): separate the "what should be the master for this copy" part propagate_one(): separate the "do we need secondary here?" logics propagate_mnt(): handle all peer groups in the same loop propagate_one(): get rid of dest_master mount: separate the flags accessed only under namespace_sem ...
2025-07-28Merge tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull CLASS(fd) update from Al Viro: "A missing bit of commit 66635b077624 ("assorted variants of irqfd setup: convert to CLASS(fd)") from a year ago. mshv_eventfd would've been covered by that, but it had forked slightly before that series and got merged into mainline later" * tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: mshv_eventfd: convert to CLASS(fd)
2025-07-28Merge tag 'pull-ceph-d_name-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull ceph dentry->d_name fixes from Al Viro: "Stuff that had fallen through the cracks back in February; ceph folks tested that pile and said they prefer to have it go through my tree..." * tag 'pull-ceph-d_name-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ceph: fix a race with rename() in ceph_mdsc_build_path() prep for ceph_encode_encrypted_fname() fixes [ceph] parse_longname(): strrchr() expects NUL-terminated string
2025-07-28Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull misc VFS updates from Al Viro: "VFS-related cleanups in various places (mostly of the "that really can't happen" or "there's a better way to do it" variety)" * tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: gpib: use file_inode() binder_ioctl_write_read(): simplify control flow a bit secretmem: move setting O_LARGEFILE and bumping users' count to the place where we create the file apparmor: file never has NULL f_path.mnt landlock: opened file never has a negative dentry
2025-07-28Merge tag 'pull-securityfs' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull securityfs updates from Al Viro: "Securityfs cleanups and fixes: - one extra reference is enough to pin a dentry down; no need for two. Switch to regular scheme, similar to shmem, debugfs, etc. This fixes a securityfs_recursive_remove() dentry leak, among other things. - we need to have the filesystem pinned to prevent the contents disappearing; what we do not need is pinning it for each file. Doing that only for files and directories in the root is enough. - the previous two changes allow us to get rid of the racy kludges in efi_secret_unlink(), where we can use simple_unlink() instead of securityfs_remove(). Which does not require unlocking and relocking the parent, with all deadlocks that invites. - Make securityfs_remove() take the entire subtree out, turning securityfs_recursive_remove() into its alias. Makes a lot more sense for callers and fixes a mount leak, while we are at it. - Making securityfs_remove() remove the entire subtree allows for much simpler life in most of the users - efi_secret, ima_fs, evm, ipe, tmp get cleaner. I hadn't touched apparmor use of securityfs, but I suspect that it would be useful there as well" * tag 'pull-securityfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: tpm: don't bother with removal of files in directory we'll be removing ipe: don't bother with removal of files in directory we'll be removing evm_secfs: clear securityfs interactions ima_fs: get rid of lookup-by-dentry stuff ima_fs: don't bother with removal of files in directory we'll be removing efi_secret: clean securityfs use up make securityfs_remove() remove the entire subtree fix locking in efi_secret_unlink() securityfs: pin filesystem only for objects directly in root securityfs: don't pin dentries twice, once is enough...
2025-07-28Merge tag 'pull-rpc_pipefs' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull rpc_pipefs updates from Al Viro: "Massage rpc_pipefs to use saner primitives and clean up the APIs provided to the rest of the kernel" * tag 'pull-rpc_pipefs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: rpc_create_client_dir(): return 0 or -E... rpc_create_client_dir(): don't bother with rpc_populate() rpc_new_dir(): the last argument is always NULL rpc_pipe: expand the calls of rpc_mkdir_populate() rpc_gssd_dummy_populate(): don't bother with rpc_populate() rpc_mkpipe_dentry(): switch to simple_start_creating() rpc_pipe: saner primitive for creating regular files rpc_pipe: saner primitive for creating subdirectories rpc_pipe: don't overdo directory locking rpc_mkpipe_dentry(): saner calling conventions rpc_unlink(): saner calling conventions rpc_populate(): lift cleanup into callers rpc_unlink(): use simple_recursive_removal() rpc_{rmdir_,}depopulate(): use simple_recursive_removal() instead rpc_pipe: clean failure exits in fill_super new helper: simple_start_creating()
2025-07-28Merge tag 'pull-simple_recursive_removal' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull simple_recursive_removal() update from Al Viro: "Removing subtrees of kernel filesystems is done in quite a few places; unfortunately, it's easy to get wrong. A number of open-coded attempts are out there, with varying amount of bogosities. simple_recursive_removal() had been introduced for doing that with all precautions needed; it does an equivalent of rm -rf, with sufficient locking, eviction of anything mounted on top of the subtree, etc. This series converts a bunch of open-coded instances to using that" * tag 'pull-simple_recursive_removal' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: functionfs, gadgetfs: use simple_recursive_removal() kill binderfs_remove_file() fuse_ctl: use simple_recursive_removal() pstore: switch to locked_recursive_removal() binfmt_misc: switch to locked_recursive_removal() spufs: switch to locked_recursive_removal() add locked_recursive_removal() better lockdep annotations for simple_recursive_removal() simple_recursive_removal(): saner interaction with fsnotify
2025-07-28Merge tag 'pull-dcache' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull dentry d_flags updates from Al Viro: "The current exclusion rules for dentry->d_flags stores are rather unpleasant. The basic rules are simple: - stores to dentry->d_flags are OK under dentry->d_lock - stores to dentry->d_flags are OK in the dentry constructor, before becomes potentially visible to other threads Unfortunately, there's a couple of exceptions to that, and that's where the headache comes from. The main PITA comes from d_set_d_op(); that primitive sets ->d_op of dentry and adjusts the flags that correspond to presence of individual methods. It's very easy to misuse; existing uses _are_ safe, but proof of correctness is brittle. Use in __d_alloc() is safe (we are within a constructor), but we might as well precalculate the initial value of 'd_flags' when we set the default ->d_op for given superblock and set 'd_flags' directly instead of messing with that helper. The reasons why other uses are safe are bloody convoluted; I'm not going to reproduce it here. See [1] for gory details, if you care. The critical part is using d_set_d_op() only just prior to d_splice_alias(), which makes a combination of d_splice_alias() with setting ->d_op, etc a natural replacement primitive. Better yet, if we go that way, it's easy to take setting ->d_op and modifying 'd_flags' under ->d_lock, which eliminates the headache as far as 'd_flags' exclusion rules are concerned. Other exceptions are minor and easy to deal with. What this series does: - d_set_d_op() is no longer available; instead a new primitive (d_splice_alias_ops()) is provided, equivalent to combination of d_set_d_op() and d_splice_alias(). - new field of struct super_block - 's_d_flags'. This sets the default value of 'd_flags' to be used when allocating dentries on this filesystem. - new primitive for setting 's_d_op': set_default_d_op(). This replaces stores to 's_d_op' at mount time. All in-tree filesystems converted; out-of-tree ones will get caught by the compiler ('s_d_op' is renamed, so stores to it will be caught). 's_d_flags' is set by the same primitive to match the 's_d_op'. - a lot of filesystems had sb->s_d_op->d_delete equal to always_delete_dentry; that is equivalent to setting DCACHE_DONTCACHE in 'd_flags', so such filesystems can bloody well set that bit in 's_d_flags' and drop 'd_delete()' from dentry_operations. In quite a few cases that results in empty dentry_operations, which means that we can get rid of those. - kill simple_dentry_operations - not needed anymore - massage d_alloc_parallel() to get rid of the other exception wrt 'd_flags' stores - we can set DCACHE_PAR_LOOKUP as soon as we allocate the new dentry; no need to delay that until we commit to using the sucker. As the result, 'd_flags' stores are all either under ->d_lock or done before the dentry becomes visible in any shared data structures" Link: https://lore.kernel.org/all/20250224010624.GT1977892@ZenIV/ [1] * tag 'pull-dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (21 commits) configfs: use DCACHE_DONTCACHE debugfs: use DCACHE_DONTCACHE efivarfs: use DCACHE_DONTCACHE instead of always_delete_dentry() 9p: don't bother with always_delete_dentry ramfs, hugetlbfs, mqueue: set DCACHE_DONTCACHE kill simple_dentry_operations devpts, sunrpc, hostfs: don't bother with ->d_op shmem: no dentry retention past the refcount reaching zero d_alloc_parallel(): set DCACHE_PAR_LOOKUP earlier make d_set_d_op() static simple_lookup(): just set DCACHE_DONTCACHE tracefs: Add d_delete to remove negative dentries set_default_d_op(): calculate the matching value for ->d_flags correct the set of flags forbidden at d_set_d_op() time split d_flags calculation out of d_set_d_op() new helper: set_default_d_op() fuse: no need for special dentry_operations for root dentry switch procfs from d_set_d_op() to d_splice_alias_ops() new helper: d_splice_alias_ops() procfs: kill ->proc_dops ...
2025-07-28Merge tag 'pull-headers_param' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull asm/param cleanup from Al Viro: "This massages asm/param.h to simpler and more uniform shape: - all arch/*/include/uapi/asm/param.h are either generated includes of <asm-generic/param.h> or a #define or two followed by such include - no arch/*/include/asm/param.h anywhere, generated or not - include <asm/param.h> resolves to arch/*/include/uapi/asm/param.h of the architecture in question (or that of host in case of uml) - include/asm-generic/param.h pulls uapi/asm-generic/param.h and deals with USER_HZ, CLOCKS_PER_SEC and with HZ redefinition after that" * tag 'pull-headers_param' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: loongarch, um, xtensa: get rid of generated arch/$ARCH/include/asm/param.h alpha: regularize the situation with asm/param.h xtensa: get rid uapi/asm/param.h
2025-07-28Merge tag 'nfsd-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds
Pull nfsd updates from Chuck Lever: "NFSD is finally able to offer write delegations to clients that open files with O_WRONLY, thanks to patches from Dai Ngo. We're expecting this to accelerate a few interesting corner cases. The cap on the number of operations per NFSv4 COMPOUND has been lifted. Now, clients that send COMPOUNDs containing dozens of operations (for example, a long stream of LOOKUP operations to walk a pathname in a single round trip) will no longer be rejected. This release re-enables the ability for NFSD to perform NFSv4.2 COPY operations asynchronously. This feature has been disabled to mitigate the risk of denial-of-service when too many such requests arrive. Many thanks to the contributors, reviewers, testers, and bug reporters who participated during the v6.17 development cycle" * tag 'nfsd-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (32 commits) nfsd: Drop dprintk in blocklayout xdr functions sunrpc: make svc_tcp_sendmsg() take a signed sentp pointer sunrpc: rearrange struct svc_rqst for fewer cachelines sunrpc: return better error in svcauth_gss_accept() on alloc failure sunrpc: reset rq_accept_statp when starting a new RPC sunrpc: remove SVC_SYSERR sunrpc: fix handling of unknown auth status codes NFSD: Simplify struct knfsd_fh NFSD: Access a knfsd_fh's fsid by pointer Revert "NFSD: Force all NFSv4.2 COPY requests to be synchronous" NFSD: Avoid multiple -Wflex-array-member-not-at-end warnings NFSD: Use vfs_iocb_iter_write() NFSD: Use vfs_iocb_iter_read() NFSD: Clean up kdoc for nfsd_open_local_fh() NFSD: Clean up kdoc for nfsd_file_put_local() NFSD: Remove definition for trace_nfsd_ctl_maxconn NFSD: Remove definition for trace_nfsd_file_gc_recent NFSD: Remove definitions for unused trace_nfsd_file_lru trace points NFSD: Remove definition for trace_nfsd_file_unhash_and_queue nfsd: Use correct error code when decoding extents ...
2025-07-28Merge tag 'gfs2-for-6.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - Prevent cluster nodes from trying to recover their own filesystems during a withdraw - Add two missing migrate_folio aops and an additional exhash directory consistency check (both triggered by syzbot bug reports) - Sanitize how dlm results are processed and clean up a few quirks in the glock code - Minor stuff: Get rid of the GIF_ALLOC_FAILED flag; use SECTOR_SIZE and SECTOR_SHIFT * tag 'gfs2-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: No more self recovery gfs2: Validate i_depth for exhash directories gfs2: Set .migrate_folio in gfs2_{rgrp,meta}_aops gfs2: a minor finish_xmote cleanup gfs2: simplify finish_xmote gfs2: sanitize the gdlm_ast -> finish_xmote interface gfs2: Minor do_xmote cancelation fix gfs2: Remove GIF_ALLOC_FAILED flag gfs2: Use SECTOR_SIZE and SECTOR_SHIFT
2025-07-28Merge tag 'xfs-merge-6.17' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs updates from Carlos Maiolino: "This doesn't contain any new features. It mostly is a collection of clean ups and code refactoring that I preferred to postpone to the merge window. It includes removal of several unused tracepoints, refactoring key comparing routines under the B-Trees management and cleanup of xfs journaling code" * tag 'xfs-merge-6.17' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (44 commits) xfs: don't use a xfs_log_iovec for ri_buf in log recovery xfs: don't use a xfs_log_iovec for attr_item names and values xfs: use better names for size members in xfs_log_vec xfs: cleanup the ordered item logic in xlog_cil_insert_format_items xfs: don't pass the old lv to xfs_cil_prepare_item xfs: remove unused trace event xfs_reflink_cow_enospc xfs: remove unused trace event xfs_discard_rtrelax xfs: remove unused trace event xfs_log_cil_return xfs: remove unused trace event xfs_dqreclaim_dirty fs/xfs: replace strncpy with memtostr_pad() xfs: Remove unused label in xfs_dax_notify_dev_failure xfs: improve the comments in xfs_select_zone_nowait xfs: improve the comments in xfs_max_open_zones xfs: stop passing an inode to the zone space reservation helpers xfs: rename oz_write_pointer to oz_allocated xfs: use a uint32_t to cache i_used_blocks in xfs_init_zone xfs: improve the xg_active_ref check in xfs_group_free xfs: remove the xlog_ticket_t typedef xfs: remove xrep_trans_{alloc,cancel}_hook_dummy xfs: return the allocated transaction from xchk_trans_alloc_empty ...
2025-07-28Merge tag 'erofs-for-6.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs updates from Gao Xiang: "We now support metadata compression. It can be useful for embedded use cases or archiving a large number of small files. Additionally, readdir performance has been improved by enabling readahead (note that it was already common practice for ext3/4 non-dx and f2fs directories). We may consider further improvements later to align with ext4's s_inode_readahead_blks behavior for slow devices too. The remaining commits are minor. Summary: - Add support for metadata compression - Enable readahead for directories to improve readdir performance - Minor fixes and cleanups" * tag 'erofs-for-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: support to readahead dirent blocks in erofs_readdir() erofs: implement metadata compression erofs: add on-disk definition for metadata compression erofs: fix build error with CONFIG_EROFS_FS_ZIP_ACCEL=y erofs: remove ENOATTR definition erofs: refine erofs_iomap_begin() erofs: unify meta buffers in z_erofs_fill_inode() erofs: remove need_kmap in erofs_read_metabuf() erofs: do sanity check on m->type in z_erofs_load_compact_lcluster() erofs: get rid of {get,put}_page() for ztailpacking data
2025-07-28Merge tag 'ntfs3_for_6.17' of ↵Linus Torvalds
https://github.com/Paragon-Software-Group/linux-ntfs3 Pull ntfs3 updates from Konstantin Komarov: "Added: - sanity check for file name - mark live inode as bad and avoid any operations Fixed: - handling of symlinks created in windows - creation of symlinks for relative path Changed: - cancel setting inode as bad after removing name fails - revert 'replace inode_trylock with inode_lock'" * tag 'ntfs3_for_6.17' of https://github.com/Paragon-Software-Group/linux-ntfs3: Revert "fs/ntfs3: Replace inode_trylock with inode_lock" fs/ntfs3: Exclude call make_bad_inode for live nodes. fs/ntfs3: cancle set bad inode after removing name fails fs/ntfs3: Add sanity check for file name fs/ntfs3: correctly create symlink for relative path fs/ntfs3: fix symlinks cannot be handled correctly
2025-07-28Merge tag 'for-6.17-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "A number of usability and feature updates, scattered performance improvements and fixes. Highlight of the core changes is getting closer to enabling large folios (now behind a config option). User visible changes: - update defrag ioctl, add new flag to request no compression on existing extents - restrict writes to block devices after mount - in experimental config, enable large folios for data, almost complete but not widely tested - add stats tracking duration of critical section in transaction commit to /sys/fs/btrfs/FSID/commit_stats Performance improvements: - caching of lookup results of free space bitmap (20% runtime improvement on an empty file creation benchmark) - accessors to metadata (b-tree items) simplified and optimized, minor improvement in metadata-heavy workloads - readahead on compressed data improves sequential read - the xarray for extent buffers is indexed by denser keys, leading to better packing of the nodes (50-70% reduction of leaf nodes) Notable fixes: - stricter compression mount option parsing - send properly emits fallocate command for file holes when protocol v2 is used - fix overallocation of chunks with mount option 'ssd_spread', due to interaction with size classes not finding the right chunk (workaround: manual reclaim by 'usage' balance filter) - various quota enable/disable races with rescan, more verbose notifications about inconsistent state - populate otime in tree-log during log replay - handle ENOSPC when NOCOW file is used with mmap() Core: - large data folios enabled in experimental config - improved error handling, transaction abort call sites - in zoned mode, allocate reloc block group on mount to make sure there's always one available for zone reclaim under heavy load - rework device opening, they're always open as read-only and delayed until the super block is created, allowing the restricted writes after mount - preparatory work for adding blk_holder_ops, allowing device freeze/thaw in the future Cleanups, refactoring: - type and naming unifications (int/bool, return variables) - rb-tree helper refactoring and simplifications - reorder memory allocations to less critical places - RCU string (used for device name) refactoring and API removal - replace all remaining use of strcpy()" * tag 'for-6.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (209 commits) btrfs: send: use fallocate for hole punching with send stream v2 btrfs: unfold transaction aborts when writing dirty block groups btrfs: use saner variable type and name to indicate extrefs at add_inode_ref() btrfs: don't skip remaining extrefs if dir not found during log replay btrfs: don't ignore inode missing when replaying log tree btrfs: enable large data folios for data reloc inode btrfs: output more info when btrfs_subpage_assert() failed btrfs: reloc: unconditionally invalidate the page cache for each cluster btrfs: defrag: add flag to force no-compression btrfs: fix ssd_spread overallocation btrfs: zoned: requeue to unused block group list if zone finish failed btrfs: zoned: do not remove unwritten non-data block group btrfs: remove btrfs_clear_extent_bits() btrfs: use cached state when falling back from NOCoW write to CoW write btrfs: set EXTENT_NORESERVE before range unlock in btrfs_truncate_block() btrfs: don't print relocation messages from auto reclaim btrfs: remove redundant auto reclaim log message btrfs: make btrfs_check_nocow_lock() check more than one extent btrfs: assert we can NOCOW the range in btrfs_truncate_block() btrfs: update function comment for btrfs_check_nocow_lock() ...
2025-07-28MIPS: alchemy: gpio: use new GPIO line value setter callbacks for the ↵Bartosz Golaszewski
remaining chips Previous commit missed two other places that need converting, it only came out in tests on autobuilders now. Convert the rest of the driver. Fixes: 68bdc4dc1130 ("MIPS: alchemy: gpio: use new line value setter callbacks") Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Link: https://lore.kernel.org/r/20250727082442.13182-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-07-27Linux 6.16v6.16Linus Torvalds
2025-07-27Merge tag 'timers-urgent-2025-07-27' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A single fix for the PTP systemcounter mechanism: The rework of this mechanism added a 'use_nsec' member to struct system_counterval. get_device_system_crosststamp() instantiates that struct on the stack and hands a pointer to the driver callback. Only the drivers which set use_nsec to true, initialize that field, but all others ignore it. As get_device_system_crosststamp() does not initialize the struct, the use_nsec field contains random stack content in those cases. That causes a miscalulation usually resulting in a failing range check in the best case. Initialize the structure before handing it to the drivers to cure that" * tag 'timers-urgent-2025-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timekeeping: Zero initialize system_counterval when querying time from phc drivers
2025-07-27gpiolib: enable CONFIG_GPIOLIB_LEGACY even for !GPIOLIBArnd Bergmann
A few drivers that use the legacy GPIOLIB interfaces can be enabled even when GPIOLIB is disabled entirely. With my previous patch this now causes build failures like: drivers/nfc/s3fwrn5/uart.c: In function 's3fwrn82_uart_parse_dt': drivers/nfc/s3fwrn5/uart.c:100:14: error: implicit declaration of function 'gpio_is_valid'; did you mean 'uuid_is_valid'? [-Werror=implicit-function-declaration] These did not show up in my randconfig tests because randconfig almost always has GPIOLIB selected by some other driver, and I did most of the testing with follow-up patches that address the failures properly. Move the symbol outside of the 'if CONFIG_GPIOLIB' block for the moment to avoid the build failures. It can be moved back and turned off by default once all the driver specific changes are merged. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202507261934.yIHeUuEQ-lkp@intel.com/ Fixes: 678bae2eaa81 ("gpiolib: make legacy interfaces optional") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20250726211053.2226857-1-arnd@kernel.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-07-26Merge tag 'spi-fix-v6.16-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fix from Mark Brown: "One last fix for v6.16, removing some hard coding to avoid data corruption on some NAND devices in the QPIC driver" * tag 'spi-fix-v6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-qpic-snand: don't hardcode ECC steps
2025-07-26sched/task_stack: Add missing const qualifier to end_of_stack()Kees Cook
Add missing const qualifier to the non-CONFIG_THREAD_INFO_IN_TASK version of end_of_stack() to match the CONFIG_THREAD_INFO_IN_TASK version. Fixes a warning with CONFIG_KSTACK_ERASE=y on archs that don't select THREAD_INFO_IN_TASK (such as LoongArch): error: passing 'const struct task_struct *' to parameter of type 'struct task_struct *' discards qualifiers The stackleak_task_low_bound() function correctly uses a const task parameter, but the legacy end_of_stack() prototype didn't like that. Build tested on loongarch (with CONFIG_KSTACK_ERASE=y) and m68k (with CONFIG_DEBUG_STACK_USAGE=y). Fixes: a45728fd4120 ("LoongArch: Enable HAVE_ARCH_STACKLEAK") Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://lore.kernel.org/all/20250726004313.GA3650901@ax162 Cc: Youling Tang <tangyouling@kylinos.cn> Cc: Huacai Chen <chenhuacai@loongson.cn> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-26kstack_erase: Support Clang stack depth trackingKees Cook
Wire up CONFIG_KSTACK_ERASE to Clang 21's new stack depth tracking callback[1] option. Link: https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-stack-depth [1] Acked-by: Nicolas Schier <n.schier@avm.de> Link: https://lore.kernel.org/r/20250724055029.3623499-4-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-26kstack_erase: Add -mgeneral-regs-only to silence Clang warningsKees Cook
Once CONFIG_KSTACK_ERASE is enabled with Clang on i386, the build warns: kernel/kstack_erase.c:168:2: warning: function with attribute 'no_caller_saved_registers' should only call a function with attribute 'no_caller_saved_registers' or be compiled with '-mgeneral-regs-only' [-Wexcessive-regsave] Add -mgeneral-regs-only for the kstack_erase handler, to make Clang feel better (it is effectively a no-op flag for the kernel). No binary changes encountered. Build & boot tested with Clang 21 on x86_64, and i386. Build tested with GCC 14.2.0 on x86_64, i386, arm64, and arm. Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://lore.kernel.org/all/20250726004313.GA3650901@ax162 Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-26init.h: Disable sanitizer coverage for __init and __headKees Cook
While __noinstr already contained __no_sanitize_coverage, it needs to be added to __init and __head section markings to support the Clang implementation of CONFIG_KSTACK_ERASE. This is to make sure the stack depth tracking callback is not executed in unsupported contexts. The other sanitizer coverage options (trace-pc and trace-cmp) aren't needed in __head nor __init either ("We are interested in code coverage as a function of a syscall inputs"[1]), so this is fine to disable for them as well. Link: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/kcov.c?h=v6.14#n179 [1] Acked-by: Marco Elver <elver@google.com> Link: https://lore.kernel.org/r/20250724055029.3623499-3-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-26kstack_erase: Disable kstack_erase for all of arm compressed boot codeKees Cook
When building with CONFIG_KSTACK_ERASE=y and CONFIG_ARM_ATAG_DTB_COMPAT=y, the compressed boot environment encounters an undefined symbol error: ld.lld: error: undefined symbol: __sanitizer_cov_stack_depth >>> referenced by atags_to_fdt.c:135 This occurs because the compiler instruments the atags_to_fdt() function with sanitizer coverage calls, but the minimal compressed boot environment lacks access to sanitizer runtime support. The compressed boot environment already disables stack protector with -fno-stack-protector. Similarly disable sanitizer coverage by adding $(DISABLE_KSTACK_ERASE) to the general compiler flags (and remove it from the one place it was noticed before), which contains the appropriate flags to prevent sanitizer instrumentation. This follows the same pattern used in other early boot contexts where sanitizer runtime support is unavailable. Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Closes: https://lore.kernel.org/all/CA+G9fYtBk8qnpWvoaFwymCx5s5i-5KXtPGpmf=_+UKJddCOnLA@mail.gmail.com Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://lore.kernel.org/all/20250726004313.GA3650901@ax162 Suggested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-26Merge tag 'i2c-for-6.16-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: - qup: avoid potential hang when waiting for bus idle - tegra: improve ACPI reset error handling - virtio: use interruptible wait to prevent hang during transfer * tag 'i2c-for-6.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: qup: jump out of the loop in case of timeout i2c: virtio: Avoid hang by using interruptible completion wait i2c: tegra: Fix reset error handling with ACPI
2025-07-26Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A few Allwinner clk driver fixes: - Mark Allwinner A523 MBUS clock as critical to avoid system stalls - Fix names of CSI related clocks on Allwinner V3s. This includes changes to the driver, DT bindings and DT files. - Fix parents of TCON clock on Allwinner V3s" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: sunxi-ng: v3s: Fix TCON clock parents clk: sunxi-ng: v3s: Fix CSI1 MCLK clock name clk: sunxi-ng: v3s: Fix CSI SCLK clock name clk: sunxi-ng: a523: Mark MBUS clock as critical
2025-07-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linuxLinus Torvalds
Pull ARM fixes from Russell King: - use an absolute path for asm/unified.h in KBUILD_AFLAGS to solve a regression caused by commit d5c8d6e0fa61 ("kbuild: Update assembler calls to use proper flags and language target") - fix dead code elimination binutils version check again * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: ARM: 9450/1: Fix allowing linker DCE with binutils < 2.36 ARM: 9448/1: Use an absolute path to unified.h in KBUILD_AFLAGS
2025-07-26Merge tag 'soc-fixes-6.16-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "These are two fixes that came in late, one addresses a regression on a rockchips based board, the other is for ensuring a consistent dt binding for a device added in 6.16 before the incorrect one makes it into a release" * tag 'soc-fixes-6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: arm64: dts: rockchip: Drop netdev led-triggers on NanoPi R5S arm64: dts: allwinner: a523: Rename emac0 to gmac0
2025-07-26Merge tag 'sunxi-fixes-for-6.16' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into soc/dt Allwinner fixes for 6.16 Only one fix: Correct the name of the A523's EMAC0 to GMAC0, as seen in the SoC's datasheets. The matching DT binding change is in the net tree. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-07-26Merge tag 'i2c-host-fixes-6.16-rc8' of ↵Wolfram Sang
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current i2c-host-fixes for v6.16-rc8 qup: avoid potential hang when waiting for bus idle tegra: improve ACPI reset error handling virtio: use interruptible wait to prevent hang during transfer
2025-07-25hfs: fix general protection fault in hfs_find_init()Viacheslav Dubeyko
The hfs_find_init() method can trigger the crash if tree pointer is NULL: [ 45.746290][ T9787] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000008: 0000 [#1] SMP KAI [ 45.747287][ T9787] KASAN: null-ptr-deref in range [0x0000000000000040-0x0000000000000047] [ 45.748716][ T9787] CPU: 2 UID: 0 PID: 9787 Comm: repro Not tainted 6.16.0-rc3 #10 PREEMPT(full) [ 45.750250][ T9787] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 45.751983][ T9787] RIP: 0010:hfs_find_init+0x86/0x230 [ 45.752834][ T9787] Code: c1 ea 03 80 3c 02 00 0f 85 9a 01 00 00 4c 8d 6b 40 48 c7 45 18 00 00 00 00 48 b8 00 00 00 00 00 fc [ 45.755574][ T9787] RSP: 0018:ffffc90015157668 EFLAGS: 00010202 [ 45.756432][ T9787] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff819a4d09 [ 45.757457][ T9787] RDX: 0000000000000008 RSI: ffffffff819acd3a RDI: ffffc900151576e8 [ 45.758282][ T9787] RBP: ffffc900151576d0 R08: 0000000000000005 R09: 0000000000000000 [ 45.758943][ T9787] R10: 0000000080000000 R11: 0000000000000001 R12: 0000000000000004 [ 45.759619][ T9787] R13: 0000000000000040 R14: ffff88802c50814a R15: 0000000000000000 [ 45.760293][ T9787] FS: 00007ffb72734540(0000) GS:ffff8880cec64000(0000) knlGS:0000000000000000 [ 45.761050][ T9787] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 45.761606][ T9787] CR2: 00007f9bd8225000 CR3: 000000010979a000 CR4: 00000000000006f0 [ 45.762286][ T9787] Call Trace: [ 45.762570][ T9787] <TASK> [ 45.762824][ T9787] hfs_ext_read_extent+0x190/0x9d0 [ 45.763269][ T9787] ? submit_bio_noacct_nocheck+0x2dd/0xce0 [ 45.763766][ T9787] ? __pfx_hfs_ext_read_extent+0x10/0x10 [ 45.764250][ T9787] hfs_get_block+0x55f/0x830 [ 45.764646][ T9787] block_read_full_folio+0x36d/0x850 [ 45.765105][ T9787] ? __pfx_hfs_get_block+0x10/0x10 [ 45.765541][ T9787] ? const_folio_flags+0x5b/0x100 [ 45.765972][ T9787] ? __pfx_hfs_read_folio+0x10/0x10 [ 45.766415][ T9787] filemap_read_folio+0xbe/0x290 [ 45.766840][ T9787] ? __pfx_filemap_read_folio+0x10/0x10 [ 45.767325][ T9787] ? __filemap_get_folio+0x32b/0xbf0 [ 45.767780][ T9787] do_read_cache_folio+0x263/0x5c0 [ 45.768223][ T9787] ? __pfx_hfs_read_folio+0x10/0x10 [ 45.768666][ T9787] read_cache_page+0x5b/0x160 [ 45.769070][ T9787] hfs_btree_open+0x491/0x1740 [ 45.769481][ T9787] hfs_mdb_get+0x15e2/0x1fb0 [ 45.769877][ T9787] ? __pfx_hfs_mdb_get+0x10/0x10 [ 45.770316][ T9787] ? find_held_lock+0x2b/0x80 [ 45.770731][ T9787] ? lockdep_init_map_type+0x5c/0x280 [ 45.771200][ T9787] ? lockdep_init_map_type+0x5c/0x280 [ 45.771674][ T9787] hfs_fill_super+0x38e/0x720 [ 45.772092][ T9787] ? __pfx_hfs_fill_super+0x10/0x10 [ 45.772549][ T9787] ? snprintf+0xbe/0x100 [ 45.772931][ T9787] ? __pfx_snprintf+0x10/0x10 [ 45.773350][ T9787] ? do_raw_spin_lock+0x129/0x2b0 [ 45.773796][ T9787] ? find_held_lock+0x2b/0x80 [ 45.774215][ T9787] ? set_blocksize+0x40a/0x510 [ 45.774636][ T9787] ? sb_set_blocksize+0x176/0x1d0 [ 45.775087][ T9787] ? setup_bdev_super+0x369/0x730 [ 45.775533][ T9787] get_tree_bdev_flags+0x384/0x620 [ 45.775985][ T9787] ? __pfx_hfs_fill_super+0x10/0x10 [ 45.776453][ T9787] ? __pfx_get_tree_bdev_flags+0x10/0x10 [ 45.776950][ T9787] ? bpf_lsm_capable+0x9/0x10 [ 45.777365][ T9787] ? security_capable+0x80/0x260 [ 45.777803][ T9787] vfs_get_tree+0x8e/0x340 [ 45.778203][ T9787] path_mount+0x13de/0x2010 [ 45.778604][ T9787] ? kmem_cache_free+0x2b0/0x4c0 [ 45.779052][ T9787] ? __pfx_path_mount+0x10/0x10 [ 45.779480][ T9787] ? getname_flags.part.0+0x1c5/0x550 [ 45.779954][ T9787] ? putname+0x154/0x1a0 [ 45.780335][ T9787] __x64_sys_mount+0x27b/0x300 [ 45.780758][ T9787] ? __pfx___x64_sys_mount+0x10/0x10 [ 45.781232][ T9787] do_syscall_64+0xc9/0x480 [ 45.781631][ T9787] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 45.782149][ T9787] RIP: 0033:0x7ffb7265b6ca [ 45.782539][ T9787] Code: 48 8b 0d c9 17 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 [ 45.784212][ T9787] RSP: 002b:00007ffc0c10cfb8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 [ 45.784935][ T9787] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffb7265b6ca [ 45.785626][ T9787] RDX: 0000200000000240 RSI: 0000200000000280 RDI: 00007ffc0c10d100 [ 45.786316][ T9787] RBP: 00007ffc0c10d190 R08: 00007ffc0c10d000 R09: 0000000000000000 [ 45.787011][ T9787] R10: 0000000000000048 R11: 0000000000000206 R12: 0000560246733250 [ 45.787697][ T9787] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 45.788393][ T9787] </TASK> [ 45.788665][ T9787] Modules linked in: [ 45.789058][ T9787] ---[ end trace 0000000000000000 ]--- [ 45.789554][ T9787] RIP: 0010:hfs_find_init+0x86/0x230 [ 45.790028][ T9787] Code: c1 ea 03 80 3c 02 00 0f 85 9a 01 00 00 4c 8d 6b 40 48 c7 45 18 00 00 00 00 48 b8 00 00 00 00 00 fc [ 45.792364][ T9787] RSP: 0018:ffffc90015157668 EFLAGS: 00010202 [ 45.793155][ T9787] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff819a4d09 [ 45.794123][ T9787] RDX: 0000000000000008 RSI: ffffffff819acd3a RDI: ffffc900151576e8 [ 45.795105][ T9787] RBP: ffffc900151576d0 R08: 0000000000000005 R09: 0000000000000000 [ 45.796135][ T9787] R10: 0000000080000000 R11: 0000000000000001 R12: 0000000000000004 [ 45.797114][ T9787] R13: 0000000000000040 R14: ffff88802c50814a R15: 0000000000000000 [ 45.798024][ T9787] FS: 00007ffb72734540(0000) GS:ffff8880cec64000(0000) knlGS:0000000000000000 [ 45.799019][ T9787] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 45.799822][ T9787] CR2: 00007f9bd8225000 CR3: 000000010979a000 CR4: 00000000000006f0 [ 45.800747][ T9787] Kernel panic - not syncing: Fatal exception The hfs_fill_super() calls hfs_mdb_get() method that tries to construct Extents Tree and Catalog Tree: HFS_SB(sb)->ext_tree = hfs_btree_open(sb, HFS_EXT_CNID, hfs_ext_keycmp); if (!HFS_SB(sb)->ext_tree) { pr_err("unable to open extent tree\n"); goto out; } HFS_SB(sb)->cat_tree = hfs_btree_open(sb, HFS_CAT_CNID, hfs_cat_keycmp); if (!HFS_SB(sb)->cat_tree) { pr_err("unable to open catalog tree\n"); goto out; } However, hfs_btree_open() calls read_mapping_page() that calls hfs_get_block(). And this method calls hfs_ext_read_extent(): static int hfs_ext_read_extent(struct inode *inode, u16 block) { struct hfs_find_data fd; int res; if (block >= HFS_I(inode)->cached_start && block < HFS_I(inode)->cached_start + HFS_I(inode)->cached_blocks) return 0; res = hfs_find_init(HFS_SB(inode->i_sb)->ext_tree, &fd); if (!res) { res = __hfs_ext_cache_extent(&fd, inode, block); hfs_find_exit(&fd); } return res; } The problem here that hfs_find_init() is trying to use HFS_SB(inode->i_sb)->ext_tree that is not initialized yet. It will be initailized when hfs_btree_open() finishes the execution. The patch adds checking of tree pointer in hfs_find_init() and it reworks the logic of hfs_btree_open() by reading the b-tree's header directly from the volume. The read_mapping_page() is exchanged on filemap_grab_folio() that grab the folio from mapping. Then, sb_bread() extracts the b-tree's header content and copy it into the folio. Reported-by: Wenzhi Wang <wenzhi.wang@uwaterloo.ca> Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> cc: Yangtao Li <frank.li@vivo.com> cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250710213657.108285-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-07-25hfs: fix slab-out-of-bounds in hfs_bnode_read()Viacheslav Dubeyko
This patch introduces is_bnode_offset_valid() method that checks the requested offset value. Also, it introduces check_and_correct_requested_length() method that checks and correct the requested length (if it is necessary). These methods are used in hfs_bnode_read(), hfs_bnode_write(), hfs_bnode_clear(), hfs_bnode_copy(), and hfs_bnode_move() with the goal to prevent the access out of allocated memory and triggering the crash. Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> Link: https://lore.kernel.org/r/20250703214912.244138-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-07-25hfsplus: fix slab-out-of-bounds in hfsplus_bnode_read()Viacheslav Dubeyko
The hfsplus_bnode_read() method can trigger the issue: [ 174.852007][ T9784] ================================================================== [ 174.852709][ T9784] BUG: KASAN: slab-out-of-bounds in hfsplus_bnode_read+0x2f4/0x360 [ 174.853412][ T9784] Read of size 8 at addr ffff88810b5fc6c0 by task repro/9784 [ 174.854059][ T9784] [ 174.854272][ T9784] CPU: 1 UID: 0 PID: 9784 Comm: repro Not tainted 6.16.0-rc3 #7 PREEMPT(full) [ 174.854281][ T9784] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 174.854286][ T9784] Call Trace: [ 174.854289][ T9784] <TASK> [ 174.854292][ T9784] dump_stack_lvl+0x10e/0x1f0 [ 174.854305][ T9784] print_report+0xd0/0x660 [ 174.854315][ T9784] ? __virt_addr_valid+0x81/0x610 [ 174.854323][ T9784] ? __phys_addr+0xe8/0x180 [ 174.854330][ T9784] ? hfsplus_bnode_read+0x2f4/0x360 [ 174.854337][ T9784] kasan_report+0xc6/0x100 [ 174.854346][ T9784] ? hfsplus_bnode_read+0x2f4/0x360 [ 174.854354][ T9784] hfsplus_bnode_read+0x2f4/0x360 [ 174.854362][ T9784] hfsplus_bnode_dump+0x2ec/0x380 [ 174.854370][ T9784] ? __pfx_hfsplus_bnode_dump+0x10/0x10 [ 174.854377][ T9784] ? hfsplus_bnode_write_u16+0x83/0xb0 [ 174.854385][ T9784] ? srcu_gp_start+0xd0/0x310 [ 174.854393][ T9784] ? __mark_inode_dirty+0x29e/0xe40 [ 174.854402][ T9784] hfsplus_brec_remove+0x3d2/0x4e0 [ 174.854411][ T9784] __hfsplus_delete_attr+0x290/0x3a0 [ 174.854419][ T9784] ? __pfx_hfs_find_1st_rec_by_cnid+0x10/0x10 [ 174.854427][ T9784] ? __pfx___hfsplus_delete_attr+0x10/0x10 [ 174.854436][ T9784] ? __asan_memset+0x23/0x50 [ 174.854450][ T9784] hfsplus_delete_all_attrs+0x262/0x320 [ 174.854459][ T9784] ? __pfx_hfsplus_delete_all_attrs+0x10/0x10 [ 174.854469][ T9784] ? rcu_is_watching+0x12/0xc0 [ 174.854476][ T9784] ? __mark_inode_dirty+0x29e/0xe40 [ 174.854483][ T9784] hfsplus_delete_cat+0x845/0xde0 [ 174.854493][ T9784] ? __pfx_hfsplus_delete_cat+0x10/0x10 [ 174.854507][ T9784] hfsplus_unlink+0x1ca/0x7c0 [ 174.854516][ T9784] ? __pfx_hfsplus_unlink+0x10/0x10 [ 174.854525][ T9784] ? down_write+0x148/0x200 [ 174.854532][ T9784] ? __pfx_down_write+0x10/0x10 [ 174.854540][ T9784] vfs_unlink+0x2fe/0x9b0 [ 174.854549][ T9784] do_unlinkat+0x490/0x670 [ 174.854557][ T9784] ? __pfx_do_unlinkat+0x10/0x10 [ 174.854565][ T9784] ? __might_fault+0xbc/0x130 [ 174.854576][ T9784] ? getname_flags.part.0+0x1c5/0x550 [ 174.854584][ T9784] __x64_sys_unlink+0xc5/0x110 [ 174.854592][ T9784] do_syscall_64+0xc9/0x480 [ 174.854600][ T9784] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 174.854608][ T9784] RIP: 0033:0x7f6fdf4c3167 [ 174.854614][ T9784] Code: f0 ff ff 73 01 c3 48 8b 0d 26 0d 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 08 [ 174.854622][ T9784] RSP: 002b:00007ffcb948bca8 EFLAGS: 00000206 ORIG_RAX: 0000000000000057 [ 174.854630][ T9784] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f6fdf4c3167 [ 174.854636][ T9784] RDX: 00007ffcb948bcc0 RSI: 00007ffcb948bcc0 RDI: 00007ffcb948bd50 [ 174.854641][ T9784] RBP: 00007ffcb948cd90 R08: 0000000000000001 R09: 00007ffcb948bb40 [ 174.854645][ T9784] R10: 00007f6fdf564fc0 R11: 0000000000000206 R12: 0000561e1bc9c2d0 [ 174.854650][ T9784] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 174.854658][ T9784] </TASK> [ 174.854661][ T9784] [ 174.879281][ T9784] Allocated by task 9784: [ 174.879664][ T9784] kasan_save_stack+0x20/0x40 [ 174.880082][ T9784] kasan_save_track+0x14/0x30 [ 174.880500][ T9784] __kasan_kmalloc+0xaa/0xb0 [ 174.880908][ T9784] __kmalloc_noprof+0x205/0x550 [ 174.881337][ T9784] __hfs_bnode_create+0x107/0x890 [ 174.881779][ T9784] hfsplus_bnode_find+0x2d0/0xd10 [ 174.882222][ T9784] hfsplus_brec_find+0x2b0/0x520 [ 174.882659][ T9784] hfsplus_delete_all_attrs+0x23b/0x320 [ 174.883144][ T9784] hfsplus_delete_cat+0x845/0xde0 [ 174.883595][ T9784] hfsplus_rmdir+0x106/0x1b0 [ 174.884004][ T9784] vfs_rmdir+0x206/0x690 [ 174.884379][ T9784] do_rmdir+0x2b7/0x390 [ 174.884751][ T9784] __x64_sys_rmdir+0xc5/0x110 [ 174.885167][ T9784] do_syscall_64+0xc9/0x480 [ 174.885568][ T9784] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 174.886083][ T9784] [ 174.886293][ T9784] The buggy address belongs to the object at ffff88810b5fc600 [ 174.886293][ T9784] which belongs to the cache kmalloc-192 of size 192 [ 174.887507][ T9784] The buggy address is located 40 bytes to the right of [ 174.887507][ T9784] allocated 152-byte region [ffff88810b5fc600, ffff88810b5fc698) [ 174.888766][ T9784] [ 174.888976][ T9784] The buggy address belongs to the physical page: [ 174.889533][ T9784] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10b5fc [ 174.890295][ T9784] flags: 0x57ff00000000000(node=1|zone=2|lastcpupid=0x7ff) [ 174.890927][ T9784] page_type: f5(slab) [ 174.891284][ T9784] raw: 057ff00000000000 ffff88801b4423c0 ffffea000426dc80 dead000000000002 [ 174.892032][ T9784] raw: 0000000000000000 0000000080100010 00000000f5000000 0000000000000000 [ 174.892774][ T9784] page dumped because: kasan: bad access detected [ 174.893327][ T9784] page_owner tracks the page as allocated [ 174.893825][ T9784] page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52c00(GFP_NOIO|__GFP_NOWARN|__GFP_NO1 [ 174.895373][ T9784] post_alloc_hook+0x1c0/0x230 [ 174.895801][ T9784] get_page_from_freelist+0xdeb/0x3b30 [ 174.896284][ T9784] __alloc_frozen_pages_noprof+0x25c/0x2460 [ 174.896810][ T9784] alloc_pages_mpol+0x1fb/0x550 [ 174.897242][ T9784] new_slab+0x23b/0x340 [ 174.897614][ T9784] ___slab_alloc+0xd81/0x1960 [ 174.898028][ T9784] __slab_alloc.isra.0+0x56/0xb0 [ 174.898468][ T9784] __kmalloc_noprof+0x2b0/0x550 [ 174.898896][ T9784] usb_alloc_urb+0x73/0xa0 [ 174.899289][ T9784] usb_control_msg+0x1cb/0x4a0 [ 174.899718][ T9784] usb_get_string+0xab/0x1a0 [ 174.900133][ T9784] usb_string_sub+0x107/0x3c0 [ 174.900549][ T9784] usb_string+0x307/0x670 [ 174.900933][ T9784] usb_cache_string+0x80/0x150 [ 174.901355][ T9784] usb_new_device+0x1d0/0x19d0 [ 174.901786][ T9784] register_root_hub+0x299/0x730 [ 174.902231][ T9784] page last free pid 10 tgid 10 stack trace: [ 174.902757][ T9784] __free_frozen_pages+0x80c/0x1250 [ 174.903217][ T9784] vfree.part.0+0x12b/0xab0 [ 174.903645][ T9784] delayed_vfree_work+0x93/0xd0 [ 174.904073][ T9784] process_one_work+0x9b5/0x1b80 [ 174.904519][ T9784] worker_thread+0x630/0xe60 [ 174.904927][ T9784] kthread+0x3a8/0x770 [ 174.905291][ T9784] ret_from_fork+0x517/0x6e0 [ 174.905709][ T9784] ret_from_fork_asm+0x1a/0x30 [ 174.906128][ T9784] [ 174.906338][ T9784] Memory state around the buggy address: [ 174.906828][ T9784] ffff88810b5fc580: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 174.907528][ T9784] ffff88810b5fc600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 174.908222][ T9784] >ffff88810b5fc680: 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc [ 174.908917][ T9784] ^ [ 174.909481][ T9784] ffff88810b5fc700: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 174.910432][ T9784] ffff88810b5fc780: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 174.911401][ T9784] ================================================================== The reason of the issue that code doesn't check the correctness of the requested offset and length. As a result, incorrect value of offset or/and length could result in access out of allocated memory. This patch introduces is_bnode_offset_valid() method that checks the requested offset value. Also, it introduces check_and_correct_requested_length() method that checks and correct the requested length (if it is necessary). These methods are used in hfsplus_bnode_read(), hfsplus_bnode_write(), hfsplus_bnode_clear(), hfsplus_bnode_copy(), and hfsplus_bnode_move() with the goal to prevent the access out of allocated memory and triggering the crash. Reported-by: Kun Hu <huk23@m.fudan.edu.cn> Reported-by: Jiaji Qin <jjtan24@m.fudan.edu.cn> Reported-by: Shuoran Bai <baishuoran@hrbeu.edu.cn> Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> Link: https://lore.kernel.org/r/20250703214804.244077-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-07-25hfsplus: fix slab-out-of-bounds read in hfsplus_uni2asc()Viacheslav Dubeyko
The hfsplus_readdir() method is capable to crash by calling hfsplus_uni2asc(): [ 667.121659][ T9805] ================================================================== [ 667.122651][ T9805] BUG: KASAN: slab-out-of-bounds in hfsplus_uni2asc+0x902/0xa10 [ 667.123627][ T9805] Read of size 2 at addr ffff88802592f40c by task repro/9805 [ 667.124578][ T9805] [ 667.124876][ T9805] CPU: 3 UID: 0 PID: 9805 Comm: repro Not tainted 6.16.0-rc3 #1 PREEMPT(full) [ 667.124886][ T9805] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 667.124890][ T9805] Call Trace: [ 667.124893][ T9805] <TASK> [ 667.124896][ T9805] dump_stack_lvl+0x10e/0x1f0 [ 667.124911][ T9805] print_report+0xd0/0x660 [ 667.124920][ T9805] ? __virt_addr_valid+0x81/0x610 [ 667.124928][ T9805] ? __phys_addr+0xe8/0x180 [ 667.124934][ T9805] ? hfsplus_uni2asc+0x902/0xa10 [ 667.124942][ T9805] kasan_report+0xc6/0x100 [ 667.124950][ T9805] ? hfsplus_uni2asc+0x902/0xa10 [ 667.124959][ T9805] hfsplus_uni2asc+0x902/0xa10 [ 667.124966][ T9805] ? hfsplus_bnode_read+0x14b/0x360 [ 667.124974][ T9805] hfsplus_readdir+0x845/0xfc0 [ 667.124984][ T9805] ? __pfx_hfsplus_readdir+0x10/0x10 [ 667.124994][ T9805] ? stack_trace_save+0x8e/0xc0 [ 667.125008][ T9805] ? iterate_dir+0x18b/0xb20 [ 667.125015][ T9805] ? trace_lock_acquire+0x85/0xd0 [ 667.125022][ T9805] ? lock_acquire+0x30/0x80 [ 667.125029][ T9805] ? iterate_dir+0x18b/0xb20 [ 667.125037][ T9805] ? down_read_killable+0x1ed/0x4c0 [ 667.125044][ T9805] ? putname+0x154/0x1a0 [ 667.125051][ T9805] ? __pfx_down_read_killable+0x10/0x10 [ 667.125058][ T9805] ? apparmor_file_permission+0x239/0x3e0 [ 667.125069][ T9805] iterate_dir+0x296/0xb20 [ 667.125076][ T9805] __x64_sys_getdents64+0x13c/0x2c0 [ 667.125084][ T9805] ? __pfx___x64_sys_getdents64+0x10/0x10 [ 667.125091][ T9805] ? __x64_sys_openat+0x141/0x200 [ 667.125126][ T9805] ? __pfx_filldir64+0x10/0x10 [ 667.125134][ T9805] ? do_user_addr_fault+0x7fe/0x12f0 [ 667.125143][ T9805] do_syscall_64+0xc9/0x480 [ 667.125151][ T9805] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 667.125158][ T9805] RIP: 0033:0x7fa8753b2fc9 [ 667.125164][ T9805] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 48 [ 667.125172][ T9805] RSP: 002b:00007ffe96f8e0f8 EFLAGS: 00000217 ORIG_RAX: 00000000000000d9 [ 667.125181][ T9805] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa8753b2fc9 [ 667.125185][ T9805] RDX: 0000000000000400 RSI: 00002000000063c0 RDI: 0000000000000004 [ 667.125190][ T9805] RBP: 00007ffe96f8e110 R08: 00007ffe96f8e110 R09: 00007ffe96f8e110 [ 667.125195][ T9805] R10: 0000000000000000 R11: 0000000000000217 R12: 0000556b1e3b4260 [ 667.125199][ T9805] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 667.125207][ T9805] </TASK> [ 667.125210][ T9805] [ 667.145632][ T9805] Allocated by task 9805: [ 667.145991][ T9805] kasan_save_stack+0x20/0x40 [ 667.146352][ T9805] kasan_save_track+0x14/0x30 [ 667.146717][ T9805] __kasan_kmalloc+0xaa/0xb0 [ 667.147065][ T9805] __kmalloc_noprof+0x205/0x550 [ 667.147448][ T9805] hfsplus_find_init+0x95/0x1f0 [ 667.147813][ T9805] hfsplus_readdir+0x220/0xfc0 [ 667.148174][ T9805] iterate_dir+0x296/0xb20 [ 667.148549][ T9805] __x64_sys_getdents64+0x13c/0x2c0 [ 667.148937][ T9805] do_syscall_64+0xc9/0x480 [ 667.149291][ T9805] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 667.149809][ T9805] [ 667.150030][ T9805] The buggy address belongs to the object at ffff88802592f000 [ 667.150030][ T9805] which belongs to the cache kmalloc-2k of size 2048 [ 667.151282][ T9805] The buggy address is located 0 bytes to the right of [ 667.151282][ T9805] allocated 1036-byte region [ffff88802592f000, ffff88802592f40c) [ 667.152580][ T9805] [ 667.152798][ T9805] The buggy address belongs to the physical page: [ 667.153373][ T9805] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x25928 [ 667.154157][ T9805] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 667.154916][ T9805] anon flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff) [ 667.155631][ T9805] page_type: f5(slab) [ 667.155997][ T9805] raw: 00fff00000000040 ffff88801b442f00 0000000000000000 dead000000000001 [ 667.156770][ T9805] raw: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000 [ 667.157536][ T9805] head: 00fff00000000040 ffff88801b442f00 0000000000000000 dead000000000001 [ 667.158317][ T9805] head: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000 [ 667.159088][ T9805] head: 00fff00000000003 ffffea0000964a01 00000000ffffffff 00000000ffffffff [ 667.159865][ T9805] head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008 [ 667.160643][ T9805] page dumped because: kasan: bad access detected [ 667.161216][ T9805] page_owner tracks the page as allocated [ 667.161732][ T9805] page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN9 [ 667.163566][ T9805] post_alloc_hook+0x1c0/0x230 [ 667.164003][ T9805] get_page_from_freelist+0xdeb/0x3b30 [ 667.164503][ T9805] __alloc_frozen_pages_noprof+0x25c/0x2460 [ 667.165040][ T9805] alloc_pages_mpol+0x1fb/0x550 [ 667.165489][ T9805] new_slab+0x23b/0x340 [ 667.165872][ T9805] ___slab_alloc+0xd81/0x1960 [ 667.166313][ T9805] __slab_alloc.isra.0+0x56/0xb0 [ 667.166767][ T9805] __kmalloc_cache_noprof+0x255/0x3e0 [ 667.167255][ T9805] psi_cgroup_alloc+0x52/0x2d0 [ 667.167693][ T9805] cgroup_mkdir+0x694/0x1210 [ 667.168118][ T9805] kernfs_iop_mkdir+0x111/0x190 [ 667.168568][ T9805] vfs_mkdir+0x59b/0x8d0 [ 667.168956][ T9805] do_mkdirat+0x2ed/0x3d0 [ 667.169353][ T9805] __x64_sys_mkdir+0xef/0x140 [ 667.169784][ T9805] do_syscall_64+0xc9/0x480 [ 667.170195][ T9805] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 667.170730][ T9805] page last free pid 1257 tgid 1257 stack trace: [ 667.171304][ T9805] __free_frozen_pages+0x80c/0x1250 [ 667.171770][ T9805] vfree.part.0+0x12b/0xab0 [ 667.172182][ T9805] delayed_vfree_work+0x93/0xd0 [ 667.172612][ T9805] process_one_work+0x9b5/0x1b80 [ 667.173067][ T9805] worker_thread+0x630/0xe60 [ 667.173486][ T9805] kthread+0x3a8/0x770 [ 667.173857][ T9805] ret_from_fork+0x517/0x6e0 [ 667.174278][ T9805] ret_from_fork_asm+0x1a/0x30 [ 667.174703][ T9805] [ 667.174917][ T9805] Memory state around the buggy address: [ 667.175411][ T9805] ffff88802592f300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 667.176114][ T9805] ffff88802592f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 667.176830][ T9805] >ffff88802592f400: 00 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 667.177547][ T9805] ^ [ 667.177933][ T9805] ffff88802592f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 667.178640][ T9805] ffff88802592f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 667.179350][ T9805] ================================================================== The hfsplus_uni2asc() method operates by struct hfsplus_unistr: struct hfsplus_unistr { __be16 length; hfsplus_unichr unicode[HFSPLUS_MAX_STRLEN]; } __packed; where HFSPLUS_MAX_STRLEN is 255 bytes. The issue happens if length of the structure instance has value bigger than 255 (for example, 65283). In such case, pointer on unicode buffer is going beyond of the allocated memory. The patch fixes the issue by checking the length value of hfsplus_unistr instance and using 255 value in the case if length value is bigger than HFSPLUS_MAX_STRLEN. Potential reason of such situation could be a corruption of Catalog File b-tree's node. Reported-by: Wenzhi Wang <wenzhi.wang@uwaterloo.ca> Signed-off-by: Liu Shixin <liushixin2@huawei.com> Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> cc: Yangtao Li <frank.li@vivo.com> cc: linux-fsdevel@vger.kernel.org Reviewed-by: Yangtao Li <frank.li@vivo.com> Link: https://lore.kernel.org/r/20250710230830.110500-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-07-25hfsplus: don't use BUG_ON() in hfsplus_create_attributes_file()Tetsuo Handa
When the volume header contains erroneous values that do not reflect the actual state of the filesystem, hfsplus_fill_super() assumes that the attributes file is not yet created, which later results in hitting BUG_ON() when hfsplus_create_attributes_file() is called. Replace this BUG_ON() with -EIO error with a message to suggest running fsck tool. Reported-by: syzbot <syzbot+1107451c16b9eb9d29e6@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=1107451c16b9eb9d29e6 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com> Link: https://lore.kernel.org/r/7b587d24-c8a1-4413-9b9a-00a33fbd849f@I-love.SAKURA.ne.jp Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-07-25hfsplus: don't set REQ_SYNC for hfsplus_submit_bio()Johannes Thumshirn
hfsplus_submit_bio() called by hfsplus_sync_fs() uses bdev_virt_rw() which in turn uses submit_bio_wait() to submit the BIO. But submit_bio_wait() already sets the REQ_SYNC flag on the BIO so there is no need for setting the flag in hfsplus_sync_fs() when calling hfsplus_submit_bio(). Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> Link: https://lore.kernel.org/r/20250710063553.4805-1-johannes.thumshirn@wdc.com Link: https://lore.kernel.org/r/20250710063553.4805-1-johannes.thumshirn@wdc.com Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-07-25Merge tag 'hisi-drivers-for-6.17' of https://github.com/hisilicon/linux-hisi ↵Arnd Bergmann
into soc/drivers HiSilicon driver updates for v6.17 - Print the hardware ID instead of the index in the HCCS driver * tag 'hisi-drivers-for-6.17' of https://github.com/hisilicon/linux-hisi: soc: hisilicon: kunpeng_hccs: Fix incorrect log information Link: https://lore.kernel.org/r/6879FFED.1050002@hisilicon.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-07-25Merge tag 'qcom-drivers-for-6.17-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/drivers More Qualcomm driver updates for v6.17 Fix race condition during SCM driver initialization, in relation to tzmem and waitqueue irq handling, Make the rpmh RSC driver support version 4 of the IP block. Add SM7635 family and related PMICs to the socinfo driver. Also add support for retrieving the bootloader build details. * tag 'qcom-drivers-for-6.17-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: dt-bindings: soc: qcom: qcom,pmic-glink: document Milos compatible dt-bindings: soc: qcom,aoss-qmp: document the Milos Always-On Subsystem side channel dt-bindings: firmware: qcom,scm: document Milos SCM Firmware Interface soc: qcom: socinfo: Add support to retrieve APPSBL build details soc: qcom: pmic_glink: fix OF node leak soc: qcom: spmi-pmic: add more PMIC SUBTYPE IDs soc: qcom: socinfo: Add PM7550 & PMIV0108 PMICs soc: qcom: socinfo: Add SoC IDs for SM7635 family dt-bindings: arm: qcom,ids: Add SoC IDs for SM7635 family firmware: qcom: scm: request the waitqueue irq *after* initializing SCM firmware: qcom: scm: initialize tzmem before marking SCM as available firmware: qcom: scm: take struct device as argument in SHM bridge enable firmware: qcom: scm: remove unused arguments from SHM bridge routines soc: qcom: rpmh-rsc: Add RSC version 4 support Link: https://lore.kernel.org/r/20250720030743.285440-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-07-25Merge tag 'drm-fixes-2025-07-26' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes (part 2) from Dave Airlie: "Just the follow up fixes for i915 and xe, all pretty minor. i915: - Fix DP 2.7 Gbps DP_LINK_BW value on g4x - Fix return value on intel_atomic_commit_fence_wait xe: - Fix build without debugfs" * tag 'drm-fixes-2025-07-26' of https://gitlab.freedesktop.org/drm/kernel: drm/xe: Fix build without debugfs drm/i915/display: Fix dma_fence_wait_timeout() return value handling drm/i915/dp: Fix 2.7 Gbps DP_LINK_BW value on g4x
2025-07-25dt-bindings: display: mediatek,dp: Allow DisplayPort AUX busAngeloGioacchino Del Regno
Like others, the MediaTek DisplayPort controller provides an auxiliary bus: import the common dp-aux-bus.yaml in this binding to allow specifying an aux-bus subnode. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Link: https://lore.kernel.org/r/20250724083914.61351-3-angelogioacchino.delregno@collabora.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-07-25dt-bindings: fsl: convert fsl,vf610-mscm-ir.txt to yaml formatFrank Li
Convert fsl,vf610-mscm-ir.txt to yaml format. Additional changes: - remove label at example dts. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20250724190342.1321632-1-Frank.Li@nxp.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-07-25dt-bindings: interrupt-controller: Add fsl,icoll.yamlFrank Li
Add fsl,icoll.yaml for i.MX23 and i.MX28. Also add a generic fallback compatible string "fsl,icoll" for legacy devices, which have existed for over 15 years. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20250724164624.1271661-1-Frank.Li@nxp.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-07-25dt-bindings: interrupt-controller: Add missing Xilinx INTC bindingMichal Simek
Add missing description for AMD/Xilinx interrupt controller. The binding is used by Microblaze before dt-binding even existed but never been documented properly. IP acts as primary interrupt controller on Microblaze systems or can be used as secondary interrupt controller on ARM based systems like Zynq, ZynqMP, Versal or Versal Gen 2. Also as secondary interrupt controller on Microblaze-V (Risc-V) systems. Over the years IP exists in multiple variants based on attached bus as OPB, PLB or AXI that's why generic filename is used. Property xlnx,kind-of-intr is in hex because every bit position corresponds to interrupt line. Controller support mixing edge or level interrupts together and this is the property which distinguish them. Signed-off-by: Michal Simek <michal.simek@amd.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/2b9d4a3a693f501d420da88b8418732ba9def877.1753354675.git.michal.simek@amd.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>