Age | Commit message (Collapse) | Author |
|
The type argument of gfs2_meta_check_ii() is always set to "magic
number", so remove that argument and hardcode the string in
gfs2_meta_check_ii(). Change the string to "bad magic number" to
emphasize that the problem is the incorrect magic number.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Get rid of attempts to create multi-line syslog entries; this only makes
the messages harder to read.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
gfs2_consist_inode() logs an error message with the source file and line
number. When we jump before calling it, the line number becomes less
useful as it no longer relates to the source of the error. To aid
troubleshooting, replace the gotos with the gfs2_consist_inode() calls
so that the error messages are more informative.
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Remove the context parameter for message allocations and
always use GFP_ATOMIC. This prepares for softirq message
processing.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
|
If the "bootconfig" kernel command-line argument was specified or if
the kernel was built with CONFIG_BOOT_CONFIG_FORCE, but if there are
no embedded kernel parameter, omit the "# Parameters from bootloader:"
comment from the /proc/bootconfig file. This will cause automation
to fall back to the /proc/cmdline file, which will be identical to the
comment in this no-embedded-kernel-parameters case.
Link: https://lore.kernel.org/all/20240409044358.1156477-2-paulmck@kernel.org/
Fixes: 8b8ce6c75430 ("fs/proc: remove redundant comments from /proc/bootconfig")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: stable@vger.kernel.org
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
commit 717c7c894d4b ("fs/proc: Add boot loader arguments as comment to
/proc/bootconfig") adds bootloader argument comments into /proc/bootconfig.
/proc/bootconfig shows boot_command_line[] multiple times following
every xbc key value pair, that's duplicated and not necessary.
Remove redundant ones.
Output before and after the fix is like:
key1 = value1
*bootloader argument comments*
key2 = value2
*bootloader argument comments*
key3 = value3
*bootloader argument comments*
...
key1 = value1
key2 = value2
key3 = value3
*bootloader argument comments*
...
Link: https://lore.kernel.org/all/20240409044358.1156477-1-paulmck@kernel.org/
Fixes: 717c7c894d4b ("fs/proc: Add boot loader arguments as comment to /proc/bootconfig")
Signed-off-by: Zhenhua Huang <quic_zhenhuah@quicinc.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: <linux-trace-kernel@vger.kernel.org>
Cc: <linux-fsdevel@vger.kernel.org>
Cc: stable@vger.kernel.org
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
Currently, the __d_clear_type_and_inode() writes the value flags to
dentry->d_flags, then immediately re-reads it in order to use it in a if
statement. This re-read is useless because no other update to
dentry->d_flags can occur at this point.
This commit therefore re-use flags in the if statement instead of
re-reading dentry->d_flags.
Signed-off-by: linke li <lilinke99@qq.com>
Link: https://lore.kernel.org/r/tencent_5E187BD0A61BA28605E85405F15228254D0A@qq.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
These IOCTL commands should be implemented by setting attributes on the
superblock, rather than in the IOCTL hooks in struct file_operations.
By returning -ENOTTY instead of -ENOIOCTLCMD, we instruct the fs/ioctl.c
logic to return -ENOTTY immediately, rather than attempting to call
f_op->unlocked_ioctl() or f_op->compat_ioctl() as a fallback.
Why this is safe:
Before this change, fs/ioctl.c would unsuccessfully attempt calling the
IOCTL hooks, and then return -ENOTTY. By returning -ENOTTY directly, we
return the same error code immediately, but save ourselves the fallback
attempt.
Motivation:
This simplifies the logic for these IOCTL commands and lets us reason about
the side effects of these IOCTLs more easily. It will be possible to
permit these IOCTLs under LSM IOCTL policies, without having to worry about
them getting dispatched to problematic device drivers (which sometimes do
work before looking at the IOCTL command number).
Link: https://lore.kernel.org/all/cnwpkeovzbumhprco7q2c2y6zxzmxfpwpwe3tyy6c3gg2szgqd@vfzjaw5v5imr/
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20240405214040.101396-2-gnoack@google.com
Acked-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Instead of checking for specific file_operations, add a bit to
file_operations which denotes a file that only contain hugetlb pages.
This lets us make hugetlbfs_file_operations static, and removes
is_file_shm_hugepages() completely.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Link: https://lore.kernel.org/r/20240407201122.3783877-1-willy@infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
There is some care taken to ensure these destination buffers are
NUL-terminated by bounding the strncpy()'s by ORANGEFS_NAME_MAX - 1 or
ORANGEFS_MAX_SERVER_ADDR_LEN - 1. Instead, we can use the new 2-argument
version of strscpy() to guarantee NUL-termination on the destination
buffers while simplifying the code.
Based on usage with printf-likes, we can see these buffers are expected
to be NUL-terminated:
| gossip_debug(GOSSIP_NAME_DEBUG,
| "%s: doing lookup on %s under %pU,%d\n",
| __func__,
| new_op->upcall.req.lookup.d_name,
| &new_op->upcall.req.lookup.parent_refn.khandle,
| new_op->upcall.req.lookup.parent_refn.fs_id);
...
| gossip_debug(GOSSIP_SUPER_DEBUG,
| "Attempting ORANGEFS Remount via host %s\n",
| new_op->upcall.req.fs_mount.orangefs_config_server);
NUL-padding isn't required for any of these destination buffers as
they've all been zero-allocated with op_alloc() or kzalloc().
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20240322-strncpy-fs-orangefs-dcache-c-v1-1-15d12debbf38@google.com
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
clang-14 points out that v_size is always smaller than a 64KB
page size if that is configured by the CPU architecture:
fs/nilfs2/ioctl.c:63:19: error: result of comparison of constant 65536 with expression of type '__u16' (aka 'unsigned short') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
if (argv->v_size > PAGE_SIZE)
~~~~~~~~~~~~ ^ ~~~~~~~~~
This is ok, so just shut up that warning with a cast.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20240328143051.1069575-7-arnd@kernel.org
Fixes: 3358b4aaa84f ("nilfs2: fix problems of memory allocation in ioctl")
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reviewed-by: Justin Stitt <justinstitt@google.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
shoot down journal keys _before_ populating journal keys with pointers
to scanned nodes
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
- fix return types: promoting from unsigned to ssize_t does not do what
we want here, and was pointless since the rest of the eytzinger code
is u32
- nr, not size
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"Several fixes to qgroups that have been recently identified by test
generic/475:
- fix prealloc reserve leak in subvolume operations
- various other fixes in reservation setup, conversion or cleanup"
* tag 'for-6.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: always clear PERTRANS metadata during commit
btrfs: make btrfs_clear_delalloc_extent() free delalloc reserve
btrfs: qgroup: convert PREALLOC to PERTRANS after record_root_in_trans
btrfs: record delayed inode root in transaction
btrfs: qgroup: fix qgroup prealloc rsv leak in subvolume operations
btrfs: qgroup: correctly model root qgroup rsv in convert
|
|
bch2_acl_from_disk() uses allocate_dropping_locks, and can thus return
a transaction restart - this wasn't handled.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
When allocating bkey_cached from bc->freed_pcpu list, it missed
decreasing the count of nr_freed_pcpu which would cause the mismatch
between the value of nr_freed_pcpu and the list items. This problem
also exists in moving new bkey_cached to bc->freed_pcpu list.
If these happened, the bug info may appear in
bch2_fs_btree_key_cache_exit by the follow code:
BUG_ON(list_count_nodes(&bc->freed_pcpu) != bc->nr_freed_pcpu);
BUG_ON(list_count_nodes(&bc->freed_nonpcpu) != bc->nr_freed_nonpcpu);
Fixes: c65c13f0eac6 ("bcachefs: Run btree key cache shrinker less aggressively")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
There's a bunch of flags that are purely based on what the file
operations support while also never being conditionally set or unset.
IOW, they're not subject to change for individual files. Imho, such
flags don't need to live in f_mode they might as well live in the fops
structs itself. And the fops struct already has that lonely
mmap_supported_flags member. We might as well turn that into a generic
fop_flags member and move a few flags from FMODE_* space into FOP_*
space. That gets us four FMODE_* bits back and the ability for new
static flags that are about file ops to not have to live in FMODE_*
space but in their own FOP_* space. It's not the most beautiful thing
ever but it gets the job done. Yes, there'll be an additional pointer
chase but hopefully that won't matter for these flags.
I suspect there's a few more we can move into there and that we can also
redirect a bunch of new flag suggestions that follow this pattern into
the fop_flags field instead of f_mode.
Link: https://lore.kernel.org/r/20240328-gewendet-spargel-aa60a030ef74@brauner
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Multiple bug fixes for journal iters:
- When the journal keys gap buffer is resized, we have to adjust the
iterators for moving the gap to the end
- We don't want to rewind iterators to point to the key we just
inserted if it's not for the correct btree/level
Also, add some new assertions.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The struct field swap can collide with the swap() macro defined in
linux/minmax.h. Rename the struct field to prevent such collisions.
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
"bcachefs; Fix deadlock in bch2_btree_update_start()" was a significant
performance regression (nearly 50%) on multithreaded random writes with
fio.
The reason is that the journal watermark checks multiple things,
including the state of the btree write buffer, and on multithreaded
update heavy workloads we're bottleneked on write buffer flushing - we
don't want kicknig off btree updates to depend on the state of the write
buffer.
This isn't strictly correct; the interior btree update path does do
write buffer updates, but it's a tiny fraction of total accounting
updates and we're more concerned with space in the journal itself.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
BCH_IOCTL_FSCK_OFFLINE allows the userspace fsck tool to use the kernel
implementation of fsck - primarily when the kernel version is a better
version match.
It should look and act exactly like the normal userspace fsck that the
user expected to be invoking, so errors should never result in a kernel
panic.
We may want to consider further restricting errors=panic - it's only
intended for debugging in controlled test environments, it should have
no purpose it normal usage.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
To open an encrypted filesystem, we use request_key() to get the
encryption key from the user's keyring - but request_key() needs to
happen in the context of the process that invoked the ioctl.
This easily fixed by using bch2_fs_open() in nostart mode.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Address a slow memory leak with RPC-over-TCP
- Prevent another NFS4ERR_DELAY loop during CREATE_SESSION
* tag 'nfsd-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
nfsd: hold a lighter-weight client reference over CB_RECALL_ANY
SUNRPC: Fix a slow server-side memory leak with RPC-over-TCP
|
|
Pull xfs fix from Chandan Babu:
- Allow creating new links to special files which were not associated
with a project quota
* tag 'xfs-6.9-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: allow cross-linking special files without project quota
|
|
Pull smb client fixes from Steve French:
- fix to retry close to avoid potential handle leaks when server
returns EBUSY
- DFS fixes including a fix for potential use after free
- fscache fix
- minor strncpy cleanup
- reconnect race fix
- deal with various possible UAF race conditions tearing sessions down
* tag '6.9-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: fix potential UAF in cifs_signal_cifsd_for_reconnect()
smb: client: fix potential UAF in smb2_is_network_name_deleted()
smb: client: fix potential UAF in is_valid_oplock_break()
smb: client: fix potential UAF in smb2_is_valid_oplock_break()
smb: client: fix potential UAF in smb2_is_valid_lease_break()
smb: client: fix potential UAF in cifs_stats_proc_show()
smb: client: fix potential UAF in cifs_stats_proc_write()
smb: client: fix potential UAF in cifs_dump_full_key()
smb: client: fix potential UAF in cifs_debug_files_proc_show()
smb3: retrying on failed server close
smb: client: serialise cifs_construct_tcon() with cifs_mount_mutex
smb: client: handle DFS tcons in cifs_construct_tcon()
smb: client: refresh referral without acquiring refpath_lock
smb: client: guarantee refcounted children from parent session
cifs: Fix caching to try to do open O_WRONLY as rdwr on server
smb: client: fix UAF in smb2_reconnect_server()
smb: client: replace deprecated strncpy with strscpy
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The ! was obviously intended to be ~. As it is, this function does
the equivalent to: "addr[bit / 64] = 0;".
Fixes: 27fcec6c27ca ("bcachefs: Clear recovery_passes_required as they complete without errors")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Currently the CB_RECALL_ANY job takes a cl_rpc_users reference to the
client. While a callback job is technically an RPC that counter is
really more for client-driven RPCs, and this has the effect of
preventing the client from being unhashed until the callback completes.
If nfsd decides to send a CB_RECALL_ANY just as the client reboots, we
can end up in a situation where the callback can't complete on the (now
dead) callback channel, but the new client can't connect because the old
client can't be unhashed. This usually manifests as a NFS4ERR_DELAY
return on the CREATE_SESSION operation.
The job is only holding a reference to the client so it can clear a flag
after the RPC completes. Fix this by having CB_RECALL_ANY instead hold a
reference to the cl_nfsdfs.cl_ref. Typically we only take that sort of
reference when dealing with the nfsdfs info files, but it should work
appropriately here to ensure that the nfs4_client doesn't disappear.
Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition")
Reported-by: Vladimir Benes <vbenes@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Pull smb server fixes from Steve French:
"Three fixes, all also for stable:
- encryption fix
- memory overrun fix
- oplock break fix"
* tag '6.9-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: do not set SMB2_GLOBAL_CAP_ENCRYPTION for SMB 3.1.1
ksmbd: validate payload size in ipc response
ksmbd: don't send oplock break if rename fails
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"This contains a few small fixes. This comes with some delay because I
wanted to wait on people running their reproducers and the Easter
Holidays meant that those replies came in a little later than usual:
- Fix handling of preventing writes to mounted block devices.
Since last kernel we allow to prevent writing to mounted block
devices provided CONFIG_BLK_DEV_WRITE_MOUNTED isn't set and the
block device is opened with restricted writes. When we switched to
opening block devices as files we altered the mechanism by which we
recognize when a block device has been opened with write
restrictions.
The detection logic assumed that only read-write mounted
filesystems would apply write restrictions to their block devices
from other openers. That of course is not true since it also makes
sense to apply write restrictions for filesystems that are
read-only.
Fix the detection logic using an FMODE_* bit. We still have a few
left since we freed up a couple a while ago. I also picked up a
patch to free up four additional FMODE_* bits scheduled for the
next merge window.
- Fix counting the number of writers to a block device. This just
changes the logic to be consistent.
- Fix a bug in aio causing a NULL pointer derefernce after we
implemented batched processing in aio.
- Finally, add the changes we discussed that allows to yield block
devices early even though file closing itself is deferred.
This also allows us to remove two holder operations to get and
release the holder to align lifetime of file and holder of the
block device"
* tag 'vfs-6.9-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
aio: Fix null ptr deref in aio_complete() wakeup
fs,block: yield devices early
block: count BLK_OPEN_RESTRICT_WRITES openers
block: handle BLK_OPEN_RESTRICT_WRITES correctly
|
|
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
While there, use struct_size() helper, instead of the open-coded
version.
[brauner@kernel.org: contains a fix by Edward for an OOB access]
Reported-by: syzbot+4139435cb1b34cf759c2@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Link: https://lore.kernel.org/r/tencent_A7845DD769577306D813742365E976E3A205@qq.com
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/ZgImCXTdGDTeBvSS@neat
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Since aio use folios in most functions, convert ring/internal_pages
to ring/internal_folios, let's directly use folio instead of page
throughout aio to remove hidden calls to compound_head(), eg,
flush_dcache_page().
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20240321131640.948634-4-wangkefeng.wang@huawei.com
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Use a folio throughout aio_free_ring() to remove calls to compound_head(),
also move pr_debug after folio check to remove unnecessary print.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20240321131640.948634-3-wangkefeng.wang@huawei.com
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Use a folio throughout aio_setup_ring() to remove calls to compound_head(),
also use folio_end_read() to simultaneously mark the folio uptodate and
unlock it.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20240321131640.948634-2-wangkefeng.wang@huawei.com
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The 'TAG 66 Packet Format' description is missing the cipher code and
checksum fields that are packed into the message packet. As a result,
the buffer allocated for the packet is 3 bytes too small and
write_tag_66_packet() will write up to 3 bytes past the end of the
buffer.
Fix this by increasing the size of the allocation so the whole packet
will always fit in the buffer.
This fixes the below kasan slab-out-of-bounds bug:
BUG: KASAN: slab-out-of-bounds in ecryptfs_generate_key_packet_set+0x7d6/0xde0
Write of size 1 at addr ffff88800afbb2a5 by task touch/181
CPU: 0 PID: 181 Comm: touch Not tainted 6.6.13-gnu #1 4c9534092be820851bb687b82d1f92a426598dc6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2/GNU Guix 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x4c/0x70
print_report+0xc5/0x610
? ecryptfs_generate_key_packet_set+0x7d6/0xde0
? kasan_complete_mode_report_info+0x44/0x210
? ecryptfs_generate_key_packet_set+0x7d6/0xde0
kasan_report+0xc2/0x110
? ecryptfs_generate_key_packet_set+0x7d6/0xde0
__asan_store1+0x62/0x80
ecryptfs_generate_key_packet_set+0x7d6/0xde0
? __pfx_ecryptfs_generate_key_packet_set+0x10/0x10
? __alloc_pages+0x2e2/0x540
? __pfx_ovl_open+0x10/0x10 [overlay 30837f11141636a8e1793533a02e6e2e885dad1d]
? dentry_open+0x8f/0xd0
ecryptfs_write_metadata+0x30a/0x550
? __pfx_ecryptfs_write_metadata+0x10/0x10
? ecryptfs_get_lower_file+0x6b/0x190
ecryptfs_initialize_file+0x77/0x150
ecryptfs_create+0x1c2/0x2f0
path_openat+0x17cf/0x1ba0
? __pfx_path_openat+0x10/0x10
do_filp_open+0x15e/0x290
? __pfx_do_filp_open+0x10/0x10
? __kasan_check_write+0x18/0x30
? _raw_spin_lock+0x86/0xf0
? __pfx__raw_spin_lock+0x10/0x10
? __kasan_check_write+0x18/0x30
? alloc_fd+0xf4/0x330
do_sys_openat2+0x122/0x160
? __pfx_do_sys_openat2+0x10/0x10
__x64_sys_openat+0xef/0x170
? __pfx___x64_sys_openat+0x10/0x10
do_syscall_64+0x60/0xd0
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0033:0x7f00a703fd67
Code: 25 00 00 41 00 3d 00 00 41 00 74 37 64 8b 04 25 18 00 00 00 85 c0 75 5b 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 85 00 00 00 48 83 c4 68 5d 41 5c c3 0f 1f
RSP: 002b:00007ffc088e30b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 00007ffc088e3368 RCX: 00007f00a703fd67
RDX: 0000000000000941 RSI: 00007ffc088e48d7 RDI: 00000000ffffff9c
RBP: 00007ffc088e48d7 R08: 0000000000000001 R09: 0000000000000000
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941
R13: 0000000000000000 R14: 00007ffc088e48d7 R15: 00007f00a7180040
</TASK>
Allocated by task 181:
kasan_save_stack+0x2f/0x60
kasan_set_track+0x29/0x40
kasan_save_alloc_info+0x25/0x40
__kasan_kmalloc+0xc5/0xd0
__kmalloc+0x66/0x160
ecryptfs_generate_key_packet_set+0x6d2/0xde0
ecryptfs_write_metadata+0x30a/0x550
ecryptfs_initialize_file+0x77/0x150
ecryptfs_create+0x1c2/0x2f0
path_openat+0x17cf/0x1ba0
do_filp_open+0x15e/0x290
do_sys_openat2+0x122/0x160
__x64_sys_openat+0xef/0x170
do_syscall_64+0x60/0xd0
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Fixes: dddfa461fc89 ("[PATCH] eCryptfs: Public key; packet management")
Signed-off-by: Brian Kubisiak <brian@kubisiak.com>
Link: https://lore.kernel.org/r/5j2q56p6qkhezva6b2yuqfrsurmvrrqtxxzrnp3wqu7xrz22i7@hoecdztoplbl
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
writeback_inodes_sb doesn't have return value, just remove unnecessary
return in it.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Link: https://lore.kernel.org/r/20240228091958.288260-7-shikemeng@huaweicloud.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Commit e8e8a0c6c9bfc ("writeback: move nr_pages == 0 logic to one
location") removed parameter nr_pages of __wakeup_flusher_threads_bdi
and we try to writeback all dirty pages in __wakeup_flusher_threads_bdi
now. Just correct stale comment.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Link: https://lore.kernel.org/r/20240228091958.288260-6-shikemeng@huaweicloud.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The dirtied_before is only used when b_io is not empty, so only calculate
when b_io is not empty.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Link: https://lore.kernel.org/r/20240228091958.288260-5-shikemeng@huaweicloud.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Remove unused parameter wb of finish_writeback_work.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Link: https://lore.kernel.org/r/20240228091958.288260-4-shikemeng@huaweicloud.com
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
For case there is no more inodes for IO in io list from last wb_writeback,
We may bail out early even there is inode in dirty list should be written
back. Only bail out when we queued once to avoid missing dirtied inode.
This is from code reading...
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Link: https://lore.kernel.org/r/20240228091958.288260-3-shikemeng@huaweicloud.com
Reviewed-by: Jan Kara <jack@suse.cz>
[brauner@kernel.org: fold in memory corruption fix from Jan in [1]]
Link: https://lore.kernel.org/r/20240405132346.bid7gibby3lxxhez@quack3 [1]
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
list_del_init_careful() needs to be the last access to the wait queue
entry - it effectively unlocks access.
Previously, finish_wait() would see the empty list head and skip taking
the lock, and then we'd return - but the completion path would still
attempt to do the wakeup after the task_struct pointer had been
overwritten.
Fixes: 71eb6b6b0ba9 ("fs/aio: obey min_nr when doing wakeups")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/linux-fsdevel/CAHTA-ubfwwB51A5Wg5M6H_rPEQK9pNf8FkAGH=vr=FEkyRrtqw@mail.gmail.com/
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Link: https://lore.kernel.org/stable/20240331215212.522544-1-kent.overstreet%40linux.dev
Link: https://lore.kernel.org/r/20240331215212.522544-1-kent.overstreet@linux.dev
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
net/ipv4/ip_gre.c
17af420545a7 ("erspan: make sure erspan_base_hdr is present in skb->head")
5832c4a77d69 ("ip_tunnel: convert __be16 tunnel flags to bitmaps")
https://lore.kernel.org/all/20240402103253.3b54a1cf@canb.auug.org.au/
Adjacent changes:
net/ipv6/ip6_fib.c
d21d40605bca ("ipv6: Fix infinite recursion in fib6_dump_done().")
5fc68320c1fb ("ipv6: remove RTNL protection from inet6_dump_fib()")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzkaller reported a warning [0] triggered while destroying immature
netns.
rpc_proc_register() was called in init_nfs_fs(), but its error
has been ignored since at least the initial commit 1da177e4c3f4
("Linux-2.6.12-rc2").
Recently, commit d47151b79e32 ("nfs: expose /proc/net/sunrpc/nfs
in net namespaces") converted the procfs to per-netns and made
the problem more visible.
Even when rpc_proc_register() fails, nfs_net_init() could succeed,
and thus nfs_net_exit() will be called while destroying the netns.
Then, remove_proc_entry() will be called for non-existing proc
directory and trigger the warning below.
Let's handle the error of rpc_proc_register() properly in nfs_net_init().
[0]:
name 'nfs'
WARNING: CPU: 1 PID: 1710 at fs/proc/generic.c:711 remove_proc_entry+0x1bb/0x2d0 fs/proc/generic.c:711
Modules linked in:
CPU: 1 PID: 1710 Comm: syz-executor.2 Not tainted 6.8.0-12822-gcd51db110a7e #12
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:remove_proc_entry+0x1bb/0x2d0 fs/proc/generic.c:711
Code: 41 5d 41 5e c3 e8 85 09 b5 ff 48 c7 c7 88 58 64 86 e8 09 0e 71 02 e8 74 09 b5 ff 4c 89 e6 48 c7 c7 de 1b 80 84 e8 c5 ad 97 ff <0f> 0b eb b1 e8 5c 09 b5 ff 48 c7 c7 88 58 64 86 e8 e0 0d 71 02 eb
RSP: 0018:ffffc9000c6d7ce0 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff8880422b8b00 RCX: ffffffff8110503c
RDX: ffff888030652f00 RSI: ffffffff81105045 RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: ffffffff81bb62cb R12: ffffffff84807ffc
R13: ffff88804ad6fcc0 R14: ffffffff84807ffc R15: ffffffff85741ff8
FS: 00007f30cfba8640(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff51afe8000 CR3: 000000005a60a005 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
rpc_proc_unregister+0x64/0x70 net/sunrpc/stats.c:310
nfs_net_exit+0x1c/0x30 fs/nfs/inode.c:2438
ops_exit_list+0x62/0xb0 net/core/net_namespace.c:170
setup_net+0x46c/0x660 net/core/net_namespace.c:372
copy_net_ns+0x244/0x590 net/core/net_namespace.c:505
create_new_namespaces+0x2ed/0x770 kernel/nsproxy.c:110
unshare_nsproxy_namespaces+0xae/0x160 kernel/nsproxy.c:228
ksys_unshare+0x342/0x760 kernel/fork.c:3322
__do_sys_unshare kernel/fork.c:3393 [inline]
__se_sys_unshare kernel/fork.c:3391 [inline]
__x64_sys_unshare+0x1f/0x30 kernel/fork.c:3391
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x46/0x4e
RIP: 0033:0x7f30d0febe5d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 9f 1b 00 f7 d8 64 89 01 48
RSP: 002b:00007f30cfba7cc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
RAX: ffffffffffffffda RBX: 00000000004bbf80 RCX: 00007f30d0febe5d
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000006c020600
RBP: 00000000004bbf80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
R13: 000000000000000b R14: 00007f30d104c530 R15: 0000000000000000
</TASK>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Pull bcachefs repair code from Kent Overstreet:
"A couple more small fixes, and new repair code.
We can now automatically recover from arbitrary corrupted interior
btree nodes by scanning, and we can reconstruct metadata as needed to
bring a filesystem back into a working, consistent, read-write state
and preserve access to whatevver wasn't corrupted.
Meaning - you can blow away all metadata except for extents and
dirents leaf nodes, and repair will reconstruct everything else and
give you your data, and under the correct paths. If inodes are missing
i_size will be slightly off and permissions/ownership/timestamps will
be gone, and we do still need the snapshots btree if snapshots were in
use - in the future we'll be able to guess the snapshot tree structure
in some situations.
IOW - aside from shaking out remaining bugs (fuzz testing is still
coming), repair code should be complete and if repair ever doesn't
work that's the highest priority bug that I want to know about
immediately.
This patchset was kindly tested by a user from India who accidentally
wiped one drive out of a three drive filesystem with no replication on
the family computer - it took a couple weeks but we got everything
important back"
* tag 'bcachefs-2024-04-03' of https://evilpiepirate.org/git/bcachefs:
bcachefs: reconstruct_inode()
bcachefs: Subvolume reconstruction
bcachefs: Check for extents that point to same space
bcachefs: Reconstruct missing snapshot nodes
bcachefs: Flag btrees with missing data
bcachefs: Topology repair now uses nodes found by scanning to fill holes
bcachefs: Repair pass for scanning for btree nodes
bcachefs: Don't skip fake btree roots in fsck
bcachefs: bch2_btree_root_alloc() -> bch2_btree_root_alloc_fake()
bcachefs: Etyzinger cleanups
bcachefs: bch2_shoot_down_journal_keys()
bcachefs: Clear recovery_passes_required as they complete without errors
bcachefs: ratelimit informational fsck errors
bcachefs: Check for bad needs_discard before doing discard
bcachefs: Improve bch2_btree_update_to_text()
mean_and_variance: Drop always failing tests
bcachefs: fix nocow lock deadlock
bcachefs: BCH_WATERMARK_interior_updates
bcachefs: Fix btree node reserve
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Print start and end level of the btree update; also a bit of cleanup.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
sysfs is limited to PAGE_SIZE, and when we're debugging strange
deadlocks/priority inversions we need to see the full list.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|