path: root/fs
Age  Commit message  Author
2025-07-04  bcachefs: Fix btree for nonexistent tree depth  Kent Overstreet
The fix for when we should increase tree depth in journal replay was entirely bogus. We should only increase the tree depth in journal replay when recovering from btree node scan, and then only for keys found by btree node scan. This needs additional work - we should be shooting down existing interior node pointers when recovering from scan, they shouldn't be showing up here. Fixes: b47a82ff4772 ("bcachefs: Only run 'increase_depth' for keys from btree node csan") Cc: Alan Huang <mmpgouride@gmail.com> Reported-by: syzbot+8deb6ff4415db67a9f18@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-04  bcachefs: Fix bch2_io_failures_to_text()  Kent Overstreet
This wasn't updated when we added tracking for btree validate errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-04  bcachefs: bch2_fpunch_snapshot()  Kent Overstreet
Add a new version of fpunch for operating on a snapshot ID, not a subvolume - and use it for "extent past end of inode" repair. Previously, repair would try to delete everything at once, but deleting too many extents at once can overflow the btree_trans bump allocator, as well as causing other problems - the new helper properly uses bch2_extent_trim_atomic(). Reported-and-tested-by: Edoardo Codeglia <bcachefs@404.blue> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-04  fscrypt: Don't use problematic non-inline crypto engines  Eric Biggers
Make fscrypt no longer use Crypto API drivers for non-inline crypto engines, even when the Crypto API prioritizes them over CPU-based code (which unfortunately it often does). These drivers tend to be really problematic, especially for fscrypt's workload. This commit has no effect on inline crypto engines, which are different and do work well. Specifically, exclude drivers that have CRYPTO_ALG_KERN_DRIVER_ONLY or CRYPTO_ALG_ALLOCATES_MEMORY set. (Later, CRYPTO_ALG_ASYNC should be excluded too. That's omitted for now to keep this commit backportable, since until recently some CPU-based code had CRYPTO_ALG_ASYNC set.) There are two major issues with these drivers: bugs and performance. First, these drivers tend to be buggy. They're fundamentally much more error-prone and harder to test than the CPU-based code. They often don't get tested before kernel releases, and even if they do, the crypto self-tests don't properly test these drivers. Released drivers have en/decrypted or hashed data incorrectly. These bugs cause issues for fscrypt users who often didn't even want to use these drivers, e.g.: - https://github.com/google/fscryptctl/issues/32 - https://github.com/google/fscryptctl/issues/9 - https://lore.kernel.org/r/PH0PR02MB731916ECDB6C613665863B6CFFAA2@PH0PR02MB7319.namprd02.prod.outlook.com These drivers have also similarly caused issues for dm-crypt users, including data corruption and deadlocks. Since Linux v5.10, dm-crypt has disabled most of them by excluding CRYPTO_ALG_ALLOCATES_MEMORY. Second, these drivers tend to be *much* slower than the CPU-based code. This may seem counterintuitive, but benchmarks clearly show it. There's a *lot* of overhead associated with going to a hardware driver, off the CPU, and back again. To prove this, I gathered as many systems with this type of crypto engine as I could, and I measured synchronous encryption of 4096-byte messages (which matches fscrypt's workload): Intel Emerald Rapids server: AES-256-XTS: xts-aes-vaes-avx512 16171 MB/s [CPU-based, Vector AES] qat_aes_xts 289 MB/s [Offload, Intel QuickAssist] Qualcomm SM8650 HDK: AES-256-XTS: xts-aes-ce 4301 MB/s [CPU-based, ARMv8 Crypto Extensions] xts-aes-qce 73 MB/s [Offload, Qualcomm Crypto Engine] i.MX 8M Nano LPDDR4 EVK: AES-256-XTS: xts-aes-ce 647 MB/s [CPU-based, ARMv8 Crypto Extensions] xts(ecb-aes-caam) 20 MB/s [Offload, CAAM] AES-128-CBC-ESSIV: essiv(cbc-aes-caam,sha256-lib) 23 MB/s [Offload, CAAM] STM32MP157F-DK2: AES-256-XTS: xts-aes-neonbs 13.2 MB/s [CPU-based, ARM NEON] xts(stm32-ecb-aes) 3.1 MB/s [Offload, STM32 crypto engine] AES-128-CBC-ESSIV: essiv(cbc-aes-neonbs,sha256-lib) 14.7 MB/s [CPU-based, ARM NEON] essiv(stm32-cbc-aes,sha256-lib) 3.2 MB/s [Offload, STM32 crypto engine] Adiantum: adiantum(xchacha12-arm,aes-arm,nhpoly1305-neon) 52.8 MB/s [CPU-based, ARM scalar + NEON] So, there was no case in which the crypto engine was even *close* to being faster. On the first three, which have AES instructions in the CPU, the CPU was 30 to 55 times faster (!). Even on STM32MP157F-DK2 which has a Cortex-A7 CPU that doesn't have AES instructions, AES was over 4 times faster on the CPU. And Adiantum encryption, which is what actually should be used on CPUs like that, was over 17 times faster. Other justifications that have been given for these non-inline crypto engines (almost always coming from the hardware vendors, not actual users) don't seem very plausible either: - The crypto engine throughput could be improved by processing multiple requests concurrently. 
Currently irrelevant to fscrypt, since it doesn't do that. This would also be complex, and unhelpful in many cases. 2 of the 4 engines I tested even had only one queue. - Some of the engines, e.g. STM32, support hardware keys. Also currently irrelevant to fscrypt, since it doesn't support these. Interestingly, the STM32 driver itself doesn't support this either. - Free up CPU for other tasks and/or reduce energy usage. Not very plausible considering the "short" message length, driver overhead, and scheduling overhead. There's just very little time for the CPU to do something else like run another task or enter low-power state, before the message finishes and it's time to process the next one. - Some of these engines resist power analysis and electromagnetic attacks, while the CPU-based crypto generally does not. In theory, this sounds great. In practice, if this benefit requires the use of an off-CPU offload that massively regresses performance and has a low-quality, buggy driver, the price for this hardening (which is not relevant to most fscrypt users, and tends to be incomplete) is just too high. Inline crypto engines are much more promising here, as are on-CPU solutions like RISC-V High Assurance Cryptography. Fixes: b30ab0e03407 ("ext4 crypto: add ext4 encryption facilities") Cc: stable@vger.kernel.org Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250704070322.20692-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
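The mask-based exclusion described above can be sketched with the kernel crypto API's allocation call; this is a hedged illustration of the general pattern rather than the exact fscrypt patch. crypto_alloc_skcipher() takes a type and a mask, and passing a flag in the mask while leaving it clear in the type skips any implementation that has that flag set (the same mechanism dm-crypt already uses for CRYPTO_ALG_ALLOCATES_MEMORY):

    #include <crypto/skcipher.h>

    static struct crypto_skcipher *alloc_cpu_only_xts(void)
    {
            /* Request xts(aes), but skip implementations flagged as
             * kernel-driver-only or as allocating memory per request. */
            return crypto_alloc_skcipher("xts(aes)", 0,
                                         CRYPTO_ALG_KERN_DRIVER_ONLY |
                                         CRYPTO_ALG_ALLOCATES_MEMORY);
    }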
2025-07-04  Merge tag 'bcachefs-2025-07-03' of git://evilpiepirate.org/bcachefs  Linus Torvalds
Pull bcachefs fixes from Kent Overstreet: "The 'opts.casefold_disabled' patch is non critical, but would be a 6.15 backport; it's to address the casefolding + overlayfs incompatibility that was discovered late. It's late because I was hoping that this would be addressed on the overlayfs side (and will be in 6.17), but user reports keep coming in on this one (lots of people are using docker these days)" * tag 'bcachefs-2025-07-03' of git://evilpiepirate.org/bcachefs: bcachefs: opts.casefold_disabled bcachefs: Work around deadlock to btree node rewrites in journal replay bcachefs: Fix incorrect transaction restart handling bcachefs: fix btree_trans_peek_prev_journal() bcachefs: mark invalid_btree_id autofix
2025-07-04  Merge tag 'vfs-6.16-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs  Linus Torvalds
Pull vfs fixes from Christian Brauner: - Fix a regression caused by the anonymous inode rework. Making them regular files causes various places in the kernel to tip over starting with io_uring. Revert to the former status quo and port our assertion to be based on checking the inode so we don't lose the valuable VFS_*_ON_*() assertions that have already helped discover weird behavior or outright bugs. - Fix the upper bound calculation in fuse_fill_write_pages() - Fix priority inversion issues in the eventpoll code - Make secretmem use anon_inode_make_secure_inode() to avoid bypassing the LSM layer - Fix a netfs hang due to missing case in final DIO read result collection - Fix a double put of the netfs_io_request struct - Provide some helpers to abstract out NETFS_RREQ_IN_PROGRESS flag wrangling - Fix infinite looping in netfs_wait_for_pause/request() - Fix a netfs ref leak on an extra subrequest inserted into a request's list of subreqs - Fix various cifs RPC callbacks to set NETFS_SREQ_NEED_RETRY if a subrequest fails retriably - Fix a cifs warning in the workqueue code when reconnecting a channel - Fix the updating of i_size in netfs to avoid a race between testing if we should have extended the file with a DIO write and changing i_size - Merge the places in netfs that update i_size on write - Fix coredump socket selftests * tag 'vfs-6.16-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: anon_inode: rework assertions netfs: Update tracepoints in a number of ways netfs: Renumber the NETFS_RREQ_* flags to make traces easier to read netfs: Merge i_size update functions netfs: Fix i_size updating smb: client: set missing retry flag in cifs_writev_callback() smb: client: set missing retry flag in cifs_readv_callback() smb: client: set missing retry flag in smb2_writev_callback() netfs: Fix ref leak on inserted extra subreq in write retry netfs: Fix looping in wait functions netfs: Provide helpers to perform NETFS_RREQ_IN_PROGRESS flag wangling netfs: Fix double put of request netfs: Fix hang due to missing case in final DIO read result collection eventpoll: Fix priority inversion problem fuse: fix fuse_fill_write_pages() upper bound calculation fs: export anon_inode_make_secure_inode() and fix secretmem LSM bypass selftests/coredump: Fix "socket_detect_userspace_client" test failure
2025-07-04  tree-wide: s/struct fileattr/struct file_kattr/g  Christian Brauner
Now that we expose struct file_attr as our uapi struct, rename the internal struct to struct file_kattr to clearly communicate that it is a kernel-internal struct. This is similar to struct mount_{k}attr and others. Link: https://lore.kernel.org/20250703-restlaufzeit-baurecht-9ed44552b481@brauner Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-03  fix proc_sys_compare() handling of in-lookup dentries  Al Viro
There's one case where ->d_compare() can be called for an in-lookup dentry; usually that's nothing special from ->d_compare() point of view, but... proc_sys_compare() is weird. The thing is, /proc/sys subdirectories can look different for different processes. Up to and including having the same name resolve to different dentries - all of them hashed. The way it's done is ->d_compare() refusing to admit a match unless this dentry is supposed to be visible to this caller. The information needed to discriminate between them is stored in the inode; it is set during proc_sys_lookup(), and until that has done d_splice_alias() we really can't tell who that dentry should be visible to. Normally there are no negative dentries in /proc/sys; we can run into a dying dentry in RCU dcache lookup, but those can be safely rejected. However, ->d_compare() is also called for in-lookup dentries, before they get positive - or hashed, for that matter. In case of a match we will wait until the dentry leaves in-lookup state and repeat ->d_compare() afterwards. In other words, the right behaviour is to treat the name match as sufficient for in-lookup dentries; if the dentry is not for us, we'll see that when we recheck once proc_sys_lookup() is done with it. While we are at it, fix the misspelled READ_ONCE and WRITE_ONCE there. Fixes: d9171b934526 ("parallel lookups machinery, part 4 (and last)") Reported-by: NeilBrown <neilb@brown.name> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-03  smb: client: fix native SMB symlink traversal  Paulo Alcantara
We've seen customers having shares mounted in paths like /??/C:/ or /??/UNC/foo.example.com/share in order to get their native SMB symlinks successfully followed from different mounts. After commit 12b466eb52d9 ("cifs: Fix creating and resolving absolute NT-style symlinks"), the client would then convert absolute paths from "/??/C:/" to "/mnt/c/" by default. The absolute paths would vary depending on the value of symlinkroot= mount option. Fix this by restoring old behavior of not trying to convert absolute paths by default. Only do this if symlinkroot= was _explicitly_ set. Before patch: $ mount.cifs //w22-fs0/test2 /mnt/1 -o vers=3.1.1,username=xxx,password=yyy $ ls -l /mnt/1/symlink2 lrwxr-xr-x 1 root root 15 Jun 20 14:22 /mnt/1/symlink2 -> /mnt/c/testfile $ mkdir -p /??/C:; echo foo > //??/C:/testfile $ cat /mnt/1/symlink2 cat: /mnt/1/symlink2: No such file or directory After patch: $ mount.cifs //w22-fs0/test2 /mnt/1 -o vers=3.1.1,username=xxx,password=yyy $ ls -l /mnt/1/symlink2 lrwxr-xr-x 1 root root 15 Jun 20 14:22 /mnt/1/symlink2 -> '/??/C:/testfile' $ mkdir -p /??/C:; echo foo > //??/C:/testfile $ cat /mnt/1/symlink2 foo Cc: linux-cifs@vger.kernel.org Reported-by: Pierguido Lambri <plambri@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Stefan Metzmacher <metze@samba.org> Fixes: 12b466eb52d9 ("cifs: Fix creating and resolving absolute NT-style symlinks") Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-07-03  smb: client: fix race condition in negotiate timeout by using more precise timing  Wang Zhaolong
When the SMB server reboots and the client immediately accesses the mount point, a race condition can occur that causes operations to fail with "Host is down" error. Reproduction steps: # Mount SMB share mount -t cifs //192.168.245.109/TEST /mnt/ -o xxxx ls /mnt # Reboot server ssh root@192.168.245.109 reboot ssh root@192.168.245.109 /path/to/cifs_server_setup.sh ssh root@192.168.245.109 systemctl stop firewalld # Immediate access fails ls /mnt ls: cannot access '/mnt': Host is down # But works if there is a delay The issue is caused by a race condition between negotiate and reconnect. The 20-second negotiate timeout mechanism can interfere with the normal recovery process when both are triggered simultaneously. ls cifsd --------------------------------------------------- cifs_getattr cifs_revalidate_dentry cifs_get_inode_info cifs_get_fattr smb2_query_path_info smb2_compound_op SMB2_open_init smb2_reconnect cifs_negotiate_protocol smb2_negotiate cifs_send_recv smb_send_rqst wait_for_response cifs_demultiplex_thread cifs_read_from_socket cifs_readv_from_socket server_unresponsive cifs_reconnect __cifs_reconnect cifs_abort_connection mid->mid_state = MID_RETRY_NEEDED cifs_wake_up_task cifs_sync_mid_result // case MID_RETRY_NEEDED rc = -EAGAIN; // In smb2_negotiate() rc = -EHOSTDOWN; The server_unresponsive() timeout triggers cifs_reconnect(), which aborts ongoing mid requests and causes the ls command to receive -EAGAIN, leading to -EHOSTDOWN. Fix this by introducing a dedicated `neg_start` field that precisely tracks when the negotiate process begins. The timeout check now uses this accurate timestamp instead of `lstrp`, ensuring that: 1. Timeout is only triggered after negotiate has actually run for 20s 2. The mechanism doesn't interfere with concurrent recovery processes 3. Uninitialized timestamps (value 0) don't trigger false timeouts Fixes: 7ccc1465465d ("smb: client: fix hang in wait_for_response() for negproto") Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com> Signed-off-by: Steve French <stfrench@microsoft.com>
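A minimal sketch of the timeout logic described above, assuming a jiffies-valued neg_start field on the server structure; names and exact placement are illustrative, not necessarily the actual patch:

    /* recorded when negotiation starts, e.g. in cifs_negotiate_protocol() */
    server->neg_start = jiffies;

    /* in the unresponsiveness check: only time out a negotiation that has
     * actually been running for 20 seconds */
    if (server->tcpStatus == CifsInNegotiate &&
        server->neg_start &&
        time_after(jiffies, server->neg_start + 20 * HZ)) {
            cifs_reconnect(server, false);
            return true;
    }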
2025-07-03  Merge tag 'for-6.16-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux  Linus Torvalds
Pull btrfs fixes from David Sterba: - tree-log fixes: - fixes of log tracking of directories and subvolumes - fix iteration and error handling of inode references during log replay - fix free space tree rebuild (reported by syzbot) * tag 'for-6.16-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: use btrfs_record_snapshot_destroy() during rmdir btrfs: propagate last_unlink_trans earlier when doing a rmdir btrfs: record new subvolume in parent dir earlier to avoid dir logging races btrfs: fix inode lookup error handling during log replay btrfs: fix iteration of extrefs during log replay btrfs: fix missing error handling when searching for inode refs during log replay btrfs: fix failure to rebuild free space tree using multiple transactions
2025-07-03  Merge tag 'xfs-fixes-6.16-rc5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux  Linus Torvalds
Pull xfs fixes from Carlos Maiolino: - Fix umount hang with unflushable inodes (and add new tracepoint used for debugging this) - Fix ABBA deadlock in xfs_reclaim_inode() vs xfs_ifree_cluster() - Fix dquot buffer pin deadlock * tag 'xfs-fixes-6.16-rc5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: add FALLOC_FL_ALLOCATE_RANGE to supported flags mask xfs: fix unmount hang with unflushable inodes stuck in the AIL xfs: factor out stale buffer item completion xfs: rearrange code in xfs_buf_item.c xfs: add tracepoints for stale pinned inode state debug xfs: avoid dquot buffer pin deadlock xfs: catch stale AGF/AGF metadata xfs: xfs_ifree_cluster vs xfs_iflush_shutdown_abort deadlock xfs: actually use the xfs_growfs_check_rtgeom tracepoint xfs: Improve error handling in xfs_mru_cache_create() xfs: move xfs_submit_zoned_bio a bit xfs: use xfs_readonly_buftarg in xfs_remount_rw xfs: remove NULL pointer checks in xfs_mru_cache_insert xfs: check for shutdown before going to sleep in xfs_select_zone
2025-07-02  rpc_mkpipe_dentry(): saner calling conventions  Al Viro
Instead of returning a dentry or ERR_PTR(-E...), return 0 and store dentry into pipe->dentry on success and return -E... on failure. Callers are happier that way... NOTE: dummy rpc_pipe is getting ->dentry set; we never access that, since we 1) never call rpc_unlink() for it (dentry is taken out by ->kill_sb()) 2) never call rpc_queue_upcall() for it (writing to that sucker fails; no downcalls are ever submitted, so no replies are going to arrive) IOW, having that ->dentry set (and left dangling) is harmless, if ugly; cleaner solution will take more massage. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
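A hedged caller-side sketch of the new convention, assuming the function now returns 0 or -E... and publishes the result in pipe->dentry as described:

    int err;

    err = rpc_mkpipe_dentry(parent, name, private, pipe);
    if (err)
            return err;
    /* on success the created dentry is available as pipe->dentry */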
2025-07-02  rpc_unlink(): saner calling conventions  Al Viro
1) pass it pipe instead of pipe->dentry 2) zero pipe->dentry afterwards 3) it always returns 0; why bother? Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02  new helper: simple_start_creating()  Al Viro
Set things up for kernel-initiated creation of an object in a tree-in-dcache filesystem. With respect to locking it's an equivalent of filename_create() - we either get a negative dentry with locked parent, or ERR_PTR() and no locks taken. tracefs and debugfs had that open-coded as part of their object creation machinery; switched to calling the new helper. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
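A hedged usage sketch, assuming a signature along the lines of struct dentry *simple_start_creating(struct dentry *parent, const char *name); per the description, success hands back a negative dentry with the parent locked, and failure returns ERR_PTR() with no locks held:

    struct dentry *dentry;

    dentry = simple_start_creating(parent, "my_node");
    if (IS_ERR(dentry))
            return PTR_ERR(dentry);          /* no locks held on error */

    /* ... create the inode and d_instantiate(dentry, inode) ... */

    inode_unlock(d_inode(parent));           /* caller drops the parent lock */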
2025-07-02  fuse_ctl: use simple_recursive_removal()  Al Viro
easier that way - no need to keep that array of dentry references, etc. Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02  pstore: switch to locked_recursive_removal()  Al Viro
rather than playing with manual d_invalidate() Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02  binfmt_misc: switch to locked_recursive_removal()  Al Viro
... fixing a mount leak, strictly speaking. Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02  add locked_recursive_removal()  Al Viro
simple_recursive_removal() assumes that parent is not locked and locks it when it finally gets to removing the victim itself. Usually that's what we want, but there are places where the parent is *already* locked and we need it to stay that way. In those cases simple_recursive_removal() would, of course, deadlock, so we have to play racy games with unlocking/relocking the parent around the call or open-code the entire thing. A better solution is to provide a variant that expects to be called with the parent already locked by the caller. Parent should be locked with I_MUTEX_PARENT, to avoid false positives from lockdep. Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
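A hedged usage sketch of the variant described above, assuming it mirrors simple_recursive_removal()'s (dentry, callback) signature; the point is that the parent lock is taken by the caller, with the I_MUTEX_PARENT class, and stays held across the call:

    inode_lock_nested(d_inode(parent), I_MUTEX_PARENT);

    /* ... other work that needs the parent held ... */

    locked_recursive_removal(victim, NULL);  /* parent remains locked throughout */

    inode_unlock(d_inode(parent));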
2025-07-02  better lockdep annotations for simple_recursive_removal()  Al Viro
We want a class that nests outside of I_MUTEX_NORMAL (for the sake of callbacks that might want to lock the victim) and inside I_MUTEX_PARENT (so that a variant of that could be used with parent of the victim held locked by the caller). In reality, simple_recursive_removal() * never holds two locks at once * holds the lock on parent of dentry passed to callback * is used only on the trees with fixed topology, so the depths are not changing. So the locking order is actually fine. AFAICS, the best solution is to assign I_MUTEX_CHILD to the locks grabbed by that thing. Reported-by: syzbot+169de184e9defe7fe709@syzkaller.appspotmail.com Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02  fs: introduce file_getattr and file_setattr syscalls  Andrey Albershteyn
Introduce file_getattr() and file_setattr() syscalls to manipulate inode extended attributes. The syscalls take a pair of file descriptor and pathname, and operate on the inode opened according to openat() semantics. The struct file_attr is passed to obtain/change extended attributes. This is an alternative to the FS_IOC_FSSETXATTR ioctl, with the difference that the file doesn't need to be opened as we can reference it with a path instead of an fd. By having this we can manipulate inode extended attributes not only on regular files but also on special ones. This is not possible with the FS_IOC_FSSETXATTR ioctl, as with special files we cannot call ioctl() directly on the filesystem inode using an fd. This patch adds two new syscalls which allow userspace to get/set extended inode attributes on special files by using a parent directory and a path - an *at()-like syscall. CC: linux-api@vger.kernel.org CC: linux-fsdevel@vger.kernel.org CC: linux-xfs@vger.kernel.org Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org> Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-6-c4e3bc35227b@kernel.org Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
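A hedged userspace sketch of what such a call could look like. The syscall number, argument order, and struct file_attr layout below are assumptions for illustration, not taken from the patch; consult the uapi headers of a kernel that carries it:

    #include <fcntl.h>              /* AT_FDCWD */
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <linux/types.h>

    #ifndef __NR_file_getattr
    #define __NR_file_getattr -1    /* placeholder; take the real number from <asm/unistd.h> */
    #endif

    struct file_attr {              /* assumed layout, mirroring struct fsxattr */
            __u64 fa_xflags;
            __u32 fa_extsize;
            __u32 fa_nextents;
            __u32 fa_projid;
            __u32 fa_cowextsize;
    };

    int main(void)
    {
            struct file_attr fa = { 0 };

            long ret = syscall(__NR_file_getattr, AT_FDCWD, "/dev/sda",
                               &fa, sizeof(fa), 0 /* at_flags */);
            if (ret == 0)
                    printf("xflags: 0x%llx\n", (unsigned long long)fa.fa_xflags);
            return ret ? 1 : 0;
    }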
2025-07-02  anon_inode: rework assertions  Christian Brauner
Making anonymous inodes regular files comes with a lot of risk and regression potential, as evidenced by a recent hiccup in io_uring. We're better off continuing to not have them be regular files. Since we have S_ANON_INODE we can port all of our assertions easily. Link: https://lore.kernel.org/20250702-work-fixes-v1-1-ff76ea589e33@kernel.org Fixes: cfd86ef7e8e7 ("anon_inode: use a proper mode internally") Acked-by: Jens Axboe <axboe@kernel.dk> Cc: stable@kernel.org Reported-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-02  fs: prepare for extending file_get/setattr()  Amir Goldstein
We intend to add support for more xflags to select filesystems, and we cannot rely on copy_struct_from_user() to detect this extension. In preparation for extending the API, do not allow setting xflags unknown by this kernel version. Also do not pass the read-only flags and the read-only field fsx_nextents to the filesystem. These changes should not affect existing chattr programs that use the ioctl to get fsxattr before setting the new values. Link: https://lore.kernel.org/linux-fsdevel/20250216164029.20673-4-pali@kernel.org/ Cc: Pali Rohár <pali@kernel.org> Cc: Andrey Albershteyn <aalbersh@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org> Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-5-c4e3bc35227b@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
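A hedged sketch of that policy; the mask name is illustrative rather than an actual kernel symbol: reject any xflag bit this kernel does not know about, and never forward read-only fields to the filesystem:

    /* KNOWN_FSX_XFLAGS: assumed mask of every FS_XFLAG_* bit this kernel defines */
    if (fa->fsx_xflags & ~KNOWN_FSX_XFLAGS)
            return -EOPNOTSUPP;

    /* fsx_nextents is read-only; don't let a userspace-supplied value reach
     * the filesystem */
    fa->fsx_nextents = 0;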
2025-07-02  fs: make vfs_fileattr_[get|set] return -EOPNOTSUPP  Andrey Albershteyn
Future patches will add new syscalls which use these functions. As this interface won't be used for ioctls only, EOPNOTSUPP is a more appropriate return code. This patch converts the return code from ENOIOCTLCMD to EOPNOTSUPP for vfs_fileattr_get and vfs_fileattr_set. To preserve the old behavior, translate EOPNOTSUPP back for the current users - overlayfs, ecryptfs and fs/ioctl.c. Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org> Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-4-c4e3bc35227b@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
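A hedged sketch of the ioctl-path translation mentioned above, which keeps the error seen by existing ioctl callers unchanged:

    err = vfs_fileattr_get(file->f_path.dentry, &fa);
    if (err == -EOPNOTSUPP)
            err = -ENOIOCTLCMD;     /* ioctl callers still get the old error */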
2025-07-02  bpf: Introduce bpf_cgroup_read_xattr to read xattr of cgroup's node  Song Liu
BPF programs, such as LSM and sched_ext, would benefit from tags on cgroups. One common practice to apply such tags is to set xattrs on cgroupfs folders. Introduce kfunc bpf_cgroup_read_xattr, which allows reading a cgroup's xattr. Note that we already have bpf_get_[file|dentry]_xattr. However, these two APIs are not ideal for reading cgroupfs xattrs, because: 1) These two APIs only work in sleepable contexts; 2) There is no kfunc that matches the current cgroup to a cgroupfs dentry. bpf_cgroup_read_xattr is generic and can be useful for many program types. It is also safe, because it requires a trusted or RCU-protected argument (KF_RCU). Therefore, we make it available to all program types. Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/20250623063854.1896364-3-song@kernel.org Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-02  kernfs: remove iattr_mutex  Christian Brauner
All allocations of struct kernfs_iattrs are serialized through a global mutex. Simply do a racy allocation and let the first one win. I bet most callers are under inode->i_rwsem anyway and it wouldn't be needed but let's not require that. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/20250623063854.1896364-2-song@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
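A hedged sketch of the "racy allocation, first one wins" idea (not necessarily the exact kernfs code): allocate without any global lock, publish with cmpxchg(), and if another thread won the race, free our copy and use theirs:

    struct kernfs_iattrs *attrs;

    attrs = kmem_cache_zalloc(kernfs_iattrs_cache, GFP_KERNEL);
    if (!attrs)
            return NULL;
    /* ... initialize timestamps, xattr list, etc. ... */

    if (cmpxchg(&kn->iattr, NULL, attrs) != NULL) {
            /* somebody else installed theirs first - use it instead */
            kmem_cache_free(kernfs_iattrs_cache, attrs);
            attrs = kn->iattr;
    }
    return attrs;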
2025-07-01  bcachefs: opts.casefold_disabled  Kent Overstreet
Add an option for completely disabling casefolding on a filesystem, as a workaround for overlayfs. This should only be needed as a temporary workaround, until the overlayfs fix arrives. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-01  bcachefs: Work around deadlock to btree node rewrites in journal replay  Kent Overstreet
Don't mark btree nodes for rewrites, if they are or would be degraded, if journal replay hasn't finished, to avoid a deadlock. This is because btree node rewrites generate more updates for the interior updates (alloc, backpointers), and if those updates touch new nodes and generate more rewrites - we can only have so many interior btree updates in flight before we deadlock on open_buckets. The biggest cause is that we don't use the btree write buffer for the backpointer updates - this needs some real thought on locking in order to fix. The problem with this workaround (not doing the rewrite for degraded nodes in journal replay) is that those degraded nodes persist, and we don't want that (this is a real bug when a btree node write completes with fewer replicas than we wanted and leaves a degraded node due to device _removal_, i.e. the device went away mid write). It's less of a bug here, but still a problem because we don't yet have a way of tracking degraded data - we need another index (all extents/btree nodes, by replicas entry) in order to fix this properly (re-replicate degraded data at the earliest possible time). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-01  lsm: introduce new hooks for setting/getting inode fsxattr  Andrey Albershteyn
Introduce new hooks for setting and getting filesystem extended attributes on an inode (FS_IOC_FSGETXATTR). Cc: selinux@vger.kernel.org Cc: Paul Moore <paul@paul-moore.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org> Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-2-c4e3bc35227b@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01  fs: split fileattr related helpers into separate file  Andrey Albershteyn
This patch moves functions related to file extended attribute manipulation to a separate file. Refactoring only. Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org> Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-1-c4e3bc35227b@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01  Merge tag 'nfs-for-6.16-2' of git://git.linux-nfs.org/projects/anna/linux-nfs  Linus Torvalds
Pull NFS client fixes from Anna Schumaker: - Fix loop in GSS sequence number cache - Clean up /proc/net/rpc/nfs if nfs_fs_proc_net_init() fails - Fix a race to wake on NFS_LAYOUT_DRAIN - Fix handling of NFS level errors in I/O * tag 'nfs-for-6.16-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFSv4/flexfiles: Fix handling of NFS level errors in I/O NFSv4/pNFS: Fix a race to wake on NFS_LAYOUT_DRAIN nfs: Clean up /proc/net/rpc/nfs when nfs_fs_proc_net_init() fails. sunrpc: fix loop in gss seqno cache
2025-07-01  netfs: Update tracepoints in a number of ways  David Howells
Make a number of updates to the netfs tracepoints: (1) Remove a duplicate trace from netfs_unbuffered_write_iter_locked(). (2) Move the trace in netfs_wake_rreq_flag() to after the flag is cleared so that the change appears in the trace. (3) Differentiate the use of netfs_rreq_trace_wait/woke_queue symbols. (4) Don't do so many trace emissions in the wait functions as some of them are redundant. (5) In netfs_collect_read_results(), differentiate a subreq that's being abandoned vs one that has been consumed in a regular way. (6) Add a tracepoint to indicate the call to ->ki_complete(). (7) Don't double-increment the subreq_counter when retrying a write. (8) Move the netfs_sreq_trace_io_progress tracepoint within cifs code to just MID_RESPONSE_RECEIVED and add different tracepoints for other MID states and note check failure. Signed-off-by: David Howells <dhowells@redhat.com> Co-developed-by: Paulo Alcantara <pc@manguebit.org> Signed-off-by: Paulo Alcantara <pc@manguebit.org> Link: https://lore.kernel.org/20250701163852.2171681-14-dhowells@redhat.com cc: Steve French <sfrench@samba.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-cifs@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01  netfs: Renumber the NETFS_RREQ_* flags to make traces easier to read  David Howells
Renumber the NETFS_RREQ_* flags to put the most useful status bits in the bottom nibble - and therefore the last hex digit in the trace output - making it easier to grasp the state at a glance. In particular, put the IN_PROGRESS flag in bit 0 and ALL_QUEUED at bit 1. Also make the flags field in /proc/fs/netfs/requests larger to accommodate all the flags. Also make the flags field in the netfs_sreq tracepoint larger to accommodate all the NETFS_SREQ_* flags. Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-13-dhowells@redhat.com Reviewed-by: Paulo Alcantara <pc@manguebit.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01  netfs: Merge i_size update functions  David Howells
Netfslib has two functions for updating the i_size after a write: one for buffered writes into the pagecache and one for direct/unbuffered writes. However, what needs to be done is much the same in both cases, so merge them together. This does raise one question, though: should updating the i_size after a direct write do the same estimated update of i_blocks as is done for buffered writes? Also get rid of the cleanup function pointer from netfs_io_request as it's only used for direct write to update i_size; instead do the i_size setting directly from write collection. Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-12-dhowells@redhat.com cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01  netfs: Fix i_size updating  David Howells
Fix the updating of i_size, particularly in regard to the completion of DIO writes and especially async DIO writes by using a lock. The bug is triggered occasionally by the generic/207 xfstest as it chucks a bunch of AIO DIO writes at the filesystem and then checks that fstat() returns a reasonable st_size as each completes. The problem is that netfs is trying to do "if new_size > inode->i_size, update inode->i_size" sort of thing but without a lock around it. This can be seen with cifs, but shouldn't be seen with kafs because kafs serialises modification ops on the client whereas cifs sends the requests to the server as they're generated and lets the server order them. Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-11-dhowells@redhat.com Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
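A minimal sketch of the locked compare-and-update being described, using the standard i_lock/i_size_write idiom (whether the patch uses i_lock or a netfs-private lock is an assumption here):

    spin_lock(&inode->i_lock);
    if (new_i_size > i_size_read(inode))
            i_size_write(inode, new_i_size);    /* only ever grow the size */
    spin_unlock(&inode->i_lock);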
2025-07-01  smb: client: set missing retry flag in cifs_writev_callback()  Paulo Alcantara
Set NETFS_SREQ_NEED_RETRY flag to tell netfslib that the subreq needs to be retried. Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-9-dhowells@redhat.com Tested-by: Steve French <sfrench@samba.org> Cc: linux-cifs@vger.kernel.org Cc: netfs@lists.linux.dev Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01  smb: client: set missing retry flag in cifs_readv_callback()  Paulo Alcantara
Set NETFS_SREQ_NEED_RETRY flag to tell netfslib that the subreq needs to be retried. Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-8-dhowells@redhat.com Tested-by: Steve French <sfrench@samba.org> Cc: linux-cifs@vger.kernel.org Cc: netfs@lists.linux.dev Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01  smb: client: set missing retry flag in smb2_writev_callback()  Paulo Alcantara
Set NETFS_SREQ_NEED_RETRY flag to tell netfslib that the subreq needs to be retried. Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-7-dhowells@redhat.com Tested-by: Steve French <sfrench@samba.org> Cc: linux-cifs@vger.kernel.org Cc: netfs@lists.linux.dev Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01  netfs: Fix ref leak on inserted extra subreq in write retry  David Howells
The write-retry algorithm will insert extra subrequests into the list if it can't get sufficient capacity to split the range that needs to be retried into the sequence of subrequests it currently has (for instance, if the cifs credit pool has fewer credits available than it did when the range was originally divided). However, the allocator furnishes each new subreq with 2 refs and then another is added for resubmission, causing one to be leaked. Fix this by replacing the ref-getting line with a neutral trace line. Fixes: 288ace2f57c9 ("netfs: New writeback implementation") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-6-dhowells@redhat.com Tested-by: Steve French <sfrench@samba.org> Reviewed-by: Paulo Alcantara <pc@manguebit.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01  netfs: Fix looping in wait functions  David Howells
netfs_wait_for_request() and netfs_wait_for_pause() can loop forever if netfs_collect_in_app() returns 2, indicating that it wants to repeat because the ALL_QUEUED flag isn't yet set and there are no subreqs left that haven't been collected. The problem is that, unless collection is offloaded (OFFLOAD_COLLECTION), we have to return to the application thread to continue and eventually set ALL_QUEUED after pausing to deal with a retry - but we never get there. Fix this by inserting checks for the IN_PROGRESS and PAUSE flags as appropriate before cycling round - and add cond_resched() for good measure. Fixes: 2b1424cd131c ("netfs: Fix wait/wake to be consistent about the waitqueue used") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-5-dhowells@redhat.com Tested-by: Steve French <sfrench@samba.org> Reviewed-by: Paulo Alcantara <pc@manguebit.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01  netfs: Provide helpers to perform NETFS_RREQ_IN_PROGRESS flag wangling  David Howells
Provide helpers to clear and test the NETFS_RREQ_IN_PROGRESS flag and to insert the appropriate barrierage. Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-4-dhowells@redhat.com Tested-by: Steve French <sfrench@samba.org> Reviewed-by: Paulo Alcantara <pc@manguebit.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01  netfs: Fix double put of request  David Howells
If a netfs request finishes during the pause loop, it will have the ref that belongs to the IN_PROGRESS flag removed at that point - however, if it then goes to the final wait loop, that will *also* put the ref because it sees that the IN_PROGRESS flag is clear and incorrectly assumes that this happened when it called the collector. In fact, since IN_PROGRESS is clear, we shouldn't call the collector again since it's done all the cleanup, such as calling ->ki_complete(). Fix this by making netfs_collect_in_app() just return, indicating that we're done if IN_PROGRESS is removed. Fixes: 2b1424cd131c ("netfs: Fix wait/wake to be consistent about the waitqueue used") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-3-dhowells@redhat.com Tested-by: Steve French <sfrench@samba.org> Reviewed-by: Paulo Alcantara <pc@manguebit.org> cc: Steve French <sfrench@samba.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-cifs@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01  netfs: Fix hang due to missing case in final DIO read result collection  David Howells
When doing a DIO read, if the subrequests we issue fail and cause the request PAUSE flag to be set to put a pause on subrequest generation, we may complete collection of the subrequests (possibly discarding them) prior to the ALL_QUEUED flags being set. In such a case, netfs_read_collection() doesn't see ALL_QUEUED being set after netfs_collect_read_results() returns and will just return to the app (the collector can be seen unpausing the generator in the trace log). The subrequest generator can then set ALL_QUEUED and the app thread reaches netfs_wait_for_request(). This causes netfs_collect_in_app() to be called to see if we're done yet, but there's a missing case here. netfs_collect_in_app() will see that a thread is active and set inactive to false, but won't see any subrequests in the read stream, and so won't set need_collect to true. The function will then just return 0, indicating that the caller should just sleep until further activity (which won't be forthcoming) occurs. Fix this by making netfs_collect_in_app() check to see if an active thread is complete - i.e. that ALL_QUEUED is set and the subrequests list is empty - and to skip the sleep return path. The collector will then be called which will clear the request IN_PROGRESS flag, allowing the app to progress. Fixes: 2b1424cd131c ("netfs: Fix wait/wake to be consistent about the waitqueue used") Reported-by: Steve French <sfrench@samba.org> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-2-dhowells@redhat.com Tested-by: Steve French <sfrench@samba.org> Reviewed-by: Paulo Alcantara <pc@manguebit.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01  eventpoll: Fix priority inversion problem  Nam Cao
The ready event list of an epoll object is protected by read-write semaphore: - The consumer (waiter) acquires the write lock and takes items. - the producer (waker) takes the read lock and adds items. The point of this design is enabling epoll to scale well with large number of producers, as multiple producers can hold the read lock at the same time. Unfortunately, this implementation may cause scheduling priority inversion problem. Suppose the consumer has higher scheduling priority than the producer. The consumer needs to acquire the write lock, but may be blocked by the producer holding the read lock. Since read-write semaphore does not support priority-boosting for the readers (even with CONFIG_PREEMPT_RT=y), we have a case of priority inversion: a higher priority consumer is blocked by a lower priority producer. This problem was reported in [1]. Furthermore, this could also cause stall problem, as described in [2]. To fix this problem, make the event list half-lockless: - The consumer acquires a mutex (ep->mtx) and takes items. - The producer locklessly adds items to the list. Performance is not the main goal of this patch, but as the producer now can add items without waiting for consumer to release the lock, performance improvement is observed using the stress test from https://github.com/rouming/test-tools/blob/master/stress-epoll.c. This is the same test that justified using read-write semaphore in the past. Testing using 12 x86_64 CPUs: Before After Diff threads events/ms events/ms 8 6932 19753 +185% 16 7820 27923 +257% 32 7648 35164 +360% 64 9677 37780 +290% 128 11166 38174 +242% Testing using 1 riscv64 CPU (averaged over 10 runs, as the numbers are noisy): Before After Diff threads events/ms events/ms 1 73 129 +77% 2 151 216 +43% 4 216 364 +69% 8 234 382 +63% 16 251 392 +56% Reported-by: Frederic Weisbecker <frederic@kernel.org> Closes: https://lore.kernel.org/linux-rt-users/20210825132754.GA895675@lothringen/ [1] Reported-by: Valentin Schneider <vschneid@redhat.com> Closes: https://lore.kernel.org/linux-rt-users/xhsmhttqvnall.mognet@vschneid.remote.csb/ [2] Signed-off-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/20250527090836.1290532-1-namcao@linutronix.de Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
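A hedged illustration of the half-lockless scheme, using the kernel's llist primitives for the sketch (the real epoll code manages its own ready list): producers push without taking any lock, while the consumer serializes on a mutex and drains the list in one shot:

    struct item {
            struct llist_node llnode;
            /* payload ... */
    };

    static LLIST_HEAD(ready_list);
    static DEFINE_MUTEX(consumer_mtx);       /* plays the role of ep->mtx */

    /* producer / waker: lockless, callable from any context */
    static void produce(struct item *it)
    {
            llist_add(&it->llnode, &ready_list);
    }

    /* consumer / waiter: takes the mutex, grabs everything at once */
    static void consume(void)
    {
            struct llist_node *batch;

            mutex_lock(&consumer_mtx);
            batch = llist_del_all(&ready_list);
            /* ... walk batch, e.g. with llist_for_each_entry_safe() ... */
            mutex_unlock(&consumer_mtx);
    }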
2025-07-01  lib/group_cpus: Let group_cpu_evenly() return the number of initialized masks  Daniel Wagner
group_cpu_evenly() might have allocated fewer groups than requested: group_cpu_evenly() __group_cpus_evenly() alloc_nodes_groups() # allocated total groups may be less than numgrps when # active total CPU number is less than numgrps In this case, the caller will do an out-of-bounds access because the caller assumes the returned masks have numgrps entries. Return the number of groups created so the caller can limit the access range accordingly. Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250617-isolcpus-queue-counters-v1-1-13923686b54b@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
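A hedged caller-side sketch, assuming the count is reported through an output parameter (the exact updated signature is an assumption based on the description):

    struct cpumask *masks;
    unsigned int nr_masks, i;

    masks = group_cpus_evenly(numgrps, &nr_masks);   /* assumed new signature */
    if (!masks)
            return -ENOMEM;

    /* iterate only over what was actually initialized, not over numgrps */
    for (i = 0; i < nr_masks; i++)
            setup_queue_affinity(i, &masks[i]);      /* illustrative helper */

    kfree(masks);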
2025-06-30  bcachefs: Fix incorrect transaction restart handling  Alan Huang
Reported-by: syzbot+cc7567f096079cb4146f@syzkaller.appspotmail.com Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-30  cifs: all initializations for tcon should happen in tcon_info_alloc  Shyam Prasad N
Today, a few work structs inside tcon are initialized in cifs_get_tcon and not in tcon_info_alloc. As a result, if a tcon is obtained from tcon_info_alloc but not as part of cifs_get_tcon, we may trip over these uninitialized work structs. Cc: <stable@vger.kernel.org> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-30  smb: client: fix warning when reconnecting channel  Paulo Alcantara
When reconnecting a channel in smb2_reconnect_server(), a dummy tcon is passed down to smb2_reconnect() with ->query_interface uninitialized, so we can't call queue_delayed_work() on it. Fix the following warning by ensuring that we're queueing the delayed worker from correct tcon. WARNING: CPU: 4 PID: 1126 at kernel/workqueue.c:2498 __queue_delayed_work+0x1d2/0x200 Modules linked in: cifs cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs] CPU: 4 UID: 0 PID: 1126 Comm: kworker/4:0 Not tainted 6.16.0-rc3 #5 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-4.fc42 04/01/2014 Workqueue: cifsiod smb2_reconnect_server [cifs] RIP: 0010:__queue_delayed_work+0x1d2/0x200 Code: 41 5e 41 5f e9 7f ee ff ff 90 0f 0b 90 e9 5d ff ff ff bf 02 00 00 00 e8 6c f3 07 00 89 c3 eb bd 90 0f 0b 90 e9 57 f> 0b 90 e9 65 fe ff ff 90 0f 0b 90 e9 72 fe ff ff 90 0f 0b 90 e9 RSP: 0018:ffffc900014afad8 EFLAGS: 00010003 RAX: 0000000000000000 RBX: ffff888124d99988 RCX: ffffffff81399cc1 RDX: dffffc0000000000 RSI: ffff888114326e00 RDI: ffff888124d999f0 RBP: 000000000000ea60 R08: 0000000000000001 R09: ffffed10249b3331 R10: ffff888124d9998f R11: 0000000000000004 R12: 0000000000000040 R13: ffff888114326e00 R14: ffff888124d999d8 R15: ffff888114939020 FS: 0000000000000000(0000) GS:ffff88829f7fe000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffe7a2b4038 CR3: 0000000120a6f000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> queue_delayed_work_on+0xb4/0xc0 smb2_reconnect+0xb22/0xf50 [cifs] smb2_reconnect_server+0x413/0xd40 [cifs] ? __pfx_smb2_reconnect_server+0x10/0x10 [cifs] ? local_clock_noinstr+0xd/0xd0 ? local_clock+0x15/0x30 ? lock_release+0x29b/0x390 process_one_work+0x4c5/0xa10 ? __pfx_process_one_work+0x10/0x10 ? __list_add_valid_or_report+0x37/0x120 worker_thread+0x2f1/0x5a0 ? __kthread_parkme+0xde/0x100 ? __pfx_worker_thread+0x10/0x10 kthread+0x1fe/0x380 ? kthread+0x10f/0x380 ? __pfx_kthread+0x10/0x10 ? local_clock_noinstr+0xd/0xd0 ? ret_from_fork+0x1b/0x1f0 ? local_clock+0x15/0x30 ? lock_release+0x29b/0x390 ? rcu_is_watching+0x20/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x15b/0x1f0 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 1116206 hardirqs last enabled at (1116205): [<ffffffff8143af42>] __up_console_sem+0x52/0x60 hardirqs last disabled at (1116206): [<ffffffff81399f0e>] queue_delayed_work_on+0x6e/0xc0 softirqs last enabled at (1116138): [<ffffffffc04562fd>] __smb_send_rqst+0x42d/0x950 [cifs] softirqs last disabled at (1116136): [<ffffffff823d35e1>] release_sock+0x21/0xf0 Cc: linux-cifs@vger.kernel.org Reported-by: David Howells <dhowells@redhat.com> Fixes: 42ca547b13a2 ("cifs: do not disable interface polling on failure") Reviewed-by: David Howells <dhowells@redhat.com> Tested-by: David Howells <dhowells@redhat.com> Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Steve French <stfrench@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-30  btrfs: stop parsing crc32c driver name  Eric Biggers
To determine whether the crc32c implementation is "fast", use crc32_optimizations() instead of parsing the crypto_shash driver name. This keeps the code working as intended after the driver name is changed by the next commit. Acked-by: David Sterba <dsterba@suse.com> Link: https://lore.kernel.org/r/20250613183753.31864-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30  xfs: add FALLOC_FL_ALLOCATE_RANGE to supported flags mask  Youling Tang
Add FALLOC_FL_ALLOCATE_RANGE to the set of supported fallocate flags in XFS_FALLOC_FL_SUPPORTED. This change improves code clarity and maintainability by explicitly showing this flag in the supported flags mask. Note that since FALLOC_FL_ALLOCATE_RANGE is defined as 0x00, this addition results in no functional change. Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>