Age | Commit message (Collapse) | Author |
|
`err` could be -EINTR and it should not be the case. Actually such
DBG_BUGON is useless.
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230309053148.9223-2-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Let's revert commit 12724ba38992 ("erofs: fix kvcalloc() misuse with
__GFP_NOFAIL") since kvmalloc() already supports __GFP_NOFAIL in commit
a421ef303008 ("mm: allow !GFP_KERNEL allocations for kvmalloc"). So
the original fix was wrong.
Actually there was some issue as [1] discussed, so before that mm fix
is landed, the warn could still happen but applying this commit first
will cause less.
[1] https://lore.kernel.org/r/20230305053035.1911-1-hsiangkao@linux.alibaba.com
Fixes: 12724ba38992 ("erofs: fix kvcalloc() misuse with __GFP_NOFAIL")
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230309053148.9223-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
As the call trace shown, the root cause is kunmap incorrect pages:
BUG: kernel NULL pointer dereference, address: 00000000
CPU: 1 PID: 40 Comm: kworker/u5:0 Not tainted 6.2.0-rc5 #4
Workqueue: erofs_worker z_erofs_decompressqueue_work
EIP: z_erofs_lzma_decompress+0x34b/0x8ac
z_erofs_decompress+0x12/0x14
z_erofs_decompress_queue+0x7e7/0xb1c
z_erofs_decompressqueue_work+0x32/0x60
process_one_work+0x24b/0x4d8
? process_one_work+0x1a4/0x4d8
worker_thread+0x14c/0x3fc
kthread+0xe6/0x10c
? rescuer_thread+0x358/0x358
? kthread_complete_and_exit+0x18/0x18
ret_from_fork+0x1c/0x28
---[ end trace 0000000000000000 ]---
The bug is trivial and should be fixed now. It has no impact on
!HIGHMEM platforms.
Fixes: 622ceaddb764 ("erofs: lzma compression support")
Cc: <stable@vger.kernel.org> # 5.16+
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230305134455.88236-1-hsiangkao@linux.alibaba.com
|
|
They are used during the erofs module init phase. Let's mark it as
__init like any other function.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230303063731.66760-1-frank.li@vivo.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Variable 'cmd' set but not used.
fs/locks.c:2428:3: warning: Value stored to 'cmd' is never read.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4439
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
|
The variable ret belongs to redundant assignment and can be deleted.
fs/splice.c:940:2: warning: Value stored to 'ret' is never read.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4406
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
|
This reverts commit 76d62f24db07f22ccf9bc18ca793c27d4ebef721.
So while priority inversion on the pmsg_lock is an occasional
problem that an rt_mutex would help with, in uses where logging
is writing to pmsg heavily from multiple threads, the pmsg_lock
can be heavily contended.
After this change landed, it was reported that cases where the
mutex locking overhead was commonly adding on the order of 10s
of usecs delay had suddenly jumped to ~msec delay with rtmutex.
It seems the slight differences in the locks under this level
of contention causes the normal mutexes to utilize the spinning
optimizations, while the rtmutexes end up in the sleeping
slowpath (which allows additional threads to pile on trying
to take the lock).
In this case, it devolves to a worse case senerio where the lock
acquisition and scheduling overhead dominates, and each thread
is waiting on the order of ~ms to do ~us of work.
Obviously, having tons of threads all contending on a single
lock for logging is non-optimal, so the proper fix is probably
reworking pstore pmsg to have per-cpu buffers so we don't have
contention.
Additionally, Steven Rostedt has provided some furhter
optimizations for rtmutexes that improves the rtmutex spinning
path, but at least in my testing, I still see the test tripping
into the sleeping path on rtmutexes while utilizing the spinning
path with mutexes.
But in the short term, lets revert the change to the rt_mutex
and go back to normal mutexes to avoid a potentially major
performance regression. And we can work on optimizations to both
rtmutexes and finer-grained locking for pstore pmsg in the
future.
Cc: Wei Wang <wvw@google.com>
Cc: Midas Chien<midaschieh@google.com>
Cc: "Chunhui Li (李春辉)" <chunhui.li@mediatek.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: kernel-team@android.com
Fixes: 76d62f24db07 ("pstore: Switch pmsg_lock to an rt_mutex to avoid priority inversion")
Reported-by: "Chunhui Li (李春辉)" <chunhui.li@mediatek.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230308204043.2061631-1-jstultz@google.com
|
|
Make udf_adinicb_writepage() fully converted to using the passed folio
instead of converting it to the page.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
As lockdep properly warns, we should not be locking i_rwsem while having
transactions started as the proper lock ordering used by all directory
handling operations is i_rwsem -> transaction start. Fix the lock
ordering by moving the locking of the directory earlier in
ext4_rename().
Reported-by: syzbot+9d16c39efb5fade84574@syzkaller.appspotmail.com
Fixes: 0813299c586b ("ext4: Fix possible corruption when moving a directory")
Link: https://syzkaller.appspot.com/bug?extid=9d16c39efb5fade84574
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230301141004.15087-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
64BIT is part of the incompatible feature set, update the comment
accordingly.
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20230301133842.671821-1-tudor.ambarus@linaro.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Apparently syzbot figured out that issuing this FSMAP call:
struct fsmap_head cmd = {
.fmh_count = ...;
.fmh_keys = {
{ .fmr_device = /* ext4 dev */, .fmr_physical = 0, },
{ .fmr_device = /* ext4 dev */, .fmr_physical = 0, },
},
...
};
ret = ioctl(fd, FS_IOC_GETFSMAP, &cmd);
Produces this crash if the underlying filesystem is a 1k-block ext4
filesystem:
kernel BUG at fs/ext4/ext4.h:3331!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 3 PID: 3227965 Comm: xfs_io Tainted: G W O 6.2.0-rc8-achx
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
RIP: 0010:ext4_mb_load_buddy_gfp+0x47c/0x570 [ext4]
RSP: 0018:ffffc90007c03998 EFLAGS: 00010246
RAX: ffff888004978000 RBX: ffffc90007c03a20 RCX: ffff888041618000
RDX: 0000000000000000 RSI: 00000000000005a4 RDI: ffffffffa0c99b11
RBP: ffff888012330000 R08: ffffffffa0c2b7d0 R09: 0000000000000400
R10: ffffc90007c03950 R11: 0000000000000000 R12: 0000000000000001
R13: 00000000ffffffff R14: 0000000000000c40 R15: ffff88802678c398
FS: 00007fdf2020c880(0000) GS:ffff88807e100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffd318a5fe8 CR3: 000000007f80f001 CR4: 00000000001706e0
Call Trace:
<TASK>
ext4_mballoc_query_range+0x4b/0x210 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
ext4_getfsmap_datadev+0x713/0x890 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
ext4_getfsmap+0x2b7/0x330 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
ext4_ioc_getfsmap+0x153/0x2b0 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
__ext4_ioctl+0x2a7/0x17e0 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
__x64_sys_ioctl+0x82/0xa0
do_syscall_64+0x2b/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7fdf20558aff
RSP: 002b:00007ffd318a9e30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000000200c0 RCX: 00007fdf20558aff
RDX: 00007fdf1feb2010 RSI: 00000000c0c0583b RDI: 0000000000000003
RBP: 00005625c0634be0 R08: 00005625c0634c40 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fdf1feb2010
R13: 00005625be70d994 R14: 0000000000000800 R15: 0000000000000000
For GETFSMAP calls, the caller selects a physical block device by
writing its block number into fsmap_head.fmh_keys[01].fmr_device.
To query mappings for a subrange of the device, the starting byte of the
range is written to fsmap_head.fmh_keys[0].fmr_physical and the last
byte of the range goes in fsmap_head.fmh_keys[1].fmr_physical.
IOWs, to query what mappings overlap with bytes 3-14 of /dev/sda, you'd
set the inputs as follows:
fmh_keys[0] = { .fmr_device = major(8, 0), .fmr_physical = 3},
fmh_keys[1] = { .fmr_device = major(8, 0), .fmr_physical = 14},
Which would return you whatever is mapped in the 12 bytes starting at
physical offset 3.
The crash is due to insufficient range validation of keys[1] in
ext4_getfsmap_datadev. On 1k-block filesystems, block 0 is not part of
the filesystem, which means that s_first_data_block is nonzero.
ext4_get_group_no_and_offset subtracts this quantity from the blocknr
argument before cracking it into a group number and a block number
within a group. IOWs, block group 0 spans blocks 1-8192 (1-based)
instead of 0-8191 (0-based) like what happens with larger blocksizes.
The net result of this encoding is that blocknr < s_first_data_block is
not a valid input to this function. The end_fsb variable is set from
the keys that are copied from userspace, which means that in the above
example, its value is zero. That leads to an underflow here:
blocknr = blocknr - le32_to_cpu(es->s_first_data_block);
The division then operates on -1:
offset = do_div(blocknr, EXT4_BLOCKS_PER_GROUP(sb)) >>
EXT4_SB(sb)->s_cluster_bits;
Leaving an impossibly large group number (2^32-1) in blocknr.
ext4_getfsmap_check_keys checked that keys[0].fmr_physical and
keys[1].fmr_physical are in increasing order, but
ext4_getfsmap_datadev adjusts keys[0].fmr_physical to be at least
s_first_data_block. This implies that we have to check it again after
the adjustment, which is the piece that I forgot.
Reported-by: syzbot+6be2b977c89f79b6b153@syzkaller.appspotmail.com
Fixes: 4a4956249dac ("ext4: fix off-by-one fsmap error on 1k block filesystems")
Link: https://syzkaller.appspot.com/bug?id=79d5768e9bfe362911ac1a5057a36fc6b5c30002
Cc: stable@vger.kernel.org
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/Y+58NPTH7VNGgzdd@magnolia
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
A significant number of xfstests can cause ext4 to log one or more
warning messages when they are run on a test file system where the
inline_data feature has been enabled. An example:
"EXT4-fs warning (device vdc): ext4_dirblock_csum_set:425: inode
#16385: comm fsstress: No space for directory leaf checksum. Please
run e2fsck -D."
The xfstests include: ext4/057, 058, and 307; generic/013, 051, 068,
070, 076, 078, 083, 232, 269, 270, 390, 461, 475, 476, 482, 579, 585,
589, 626, 631, and 650.
In this situation, the warning message indicates a bug in the code that
performs the RENAME_WHITEOUT operation on a directory entry that has
been stored inline. It doesn't detect that the directory is stored
inline, and incorrectly attempts to compute a dirent block checksum on
the whiteout inode when creating it. This attempt fails as a result
of the integrity checking in get_dirent_tail (usually due to a failure
to match the EXT4_FT_DIR_CSUM magic cookie), and the warning message
is then emitted.
Fix this by simply collecting the inlined data state at the time the
search for the source directory entry is performed. Existing code
handles the rest, and this is sufficient to eliminate all spurious
warning messages produced by the tests above. Go one step further
and do the same in the code that resets the source directory entry in
the event of failure. The inlined state should be present in the
"old" struct, but given the possibility of a race there's no harm
in taking a conservative approach and getting that information again
since the directory entry is being reread anyway.
Fixes: b7ff91fd030d ("ext4: find old entry again if failed to rename whiteout")
Cc: stable@kernel.org
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230210173244.679890-1-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.
Take advantage of this to constify the structure definitions to prevent
modification at runtime.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230209-kobj_type-ext4-v1-1-6865fb05c1f8@weissschuh.net
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
When writing a page from an encrypted file that is using
filesystem-layer encryption (not inline encryption), ext4 encrypts the
pagecache page into a bounce page, then writes the bounce page.
It also passes the bounce page to wbc_account_cgroup_owner(). That's
incorrect, because the bounce page is a newly allocated temporary page
that doesn't have the memory cgroup of the original pagecache page.
This makes wbc_account_cgroup_owner() not account the I/O to the owner
of the pagecache page as it should.
Fix this by always passing the pagecache page to
wbc_account_cgroup_owner().
Fixes: 001e4a8775f6 ("ext4: implement cgroup writeback support")
Cc: stable@vger.kernel.org
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203005503.141557-1-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
When buffered write fails to copy data into underlying page cache page,
ocfs2_write_end_nolock() just zeroes out and dirties the page. This can
leave dirty page beyond EOF and if page writeback tries to write this page
before write succeeds and expands i_size, page gets into inconsistent
state where page dirty bit is clear but buffer dirty bits stay set
resulting in page data never getting written and so data copied to the
page is lost. Fix the problem by invalidating page beyond EOF after
failed write.
Link: https://lkml.kernel.org/r/20230302153843.18499-1-jack@suse.cz
Fixes: 6dbf7bb55598 ("fs: Don't invalidate page buffers in block_write_full_page()")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
We can often end up inserting a block group item, for a new block group,
with a wrong value for the used bytes field.
This happens if for the new allocated block group, in the same transaction
that created the block group, we have tasks allocating extents from it as
well as tasks removing extents from it.
For example:
1) Task A creates a metadata block group X;
2) Two extents are allocated from block group X, so its "used" field is
updated to 32K, and its "commit_used" field remains as 0;
3) Transaction commit starts, by some task B, and it enters
btrfs_start_dirty_block_groups(). There it tries to update the block
group item for block group X, which currently has its "used" field with
a value of 32K. But that fails since the block group item was not yet
inserted, and so on failure update_block_group_item() sets the
"commit_used" field of the block group back to 0;
4) The block group item is inserted by task A, when for example
btrfs_create_pending_block_groups() is called when releasing its
transaction handle. This results in insert_block_group_item() inserting
the block group item in the extent tree (or block group tree), with a
"used" field having a value of 32K, but without updating the
"commit_used" field in the block group, which remains with value of 0;
5) The two extents are freed from block X, so its "used" field changes
from 32K to 0;
6) The transaction commit by task B continues, it enters
btrfs_write_dirty_block_groups() which calls update_block_group_item()
for block group X, and there it decides to skip the block group item
update, because "used" has a value of 0 and "commit_used" has a value
of 0 too.
As a result, we end up with a block item having a 32K "used" field but
no extents allocated from it.
When this issue happens, a btrfs check reports an error like this:
[1/7] checking root items
[2/7] checking extents
block group [1104150528 1073741824] used 39796736 but extent items used 0
ERROR: errors found in extent allocation tree or chunk allocation
(...)
Fix this by making insert_block_group_item() update the block group's
"commit_used" field.
Fixes: 7248e0cebbef ("btrfs: skip update of block group item if used bytes are the same")
CC: stable@vger.kernel.org # 6.2+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Flole observes this WARNING on occasion:
[1210423.486503] WARNING: CPU: 8 PID: 1524732 at fs/ext4/ext4_jbd2.c:75 ext4_journal_check_start+0x68/0xb0
Reported-by: <flole@flole.de>
Suggested-by: Jan Kara <jack@suse.cz>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217123
Fixes: 73da852e3831 ("nfsd: use vfs_iter_read/write")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
This patch moves lkb_sbflags handling to atomic bits ops. This should
prepare for a possible manipulating of lkb_sbflags flags at the same
time by concurrent execution.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
|
This patch moves the rsb hash table handling to atomic flag operations.
The flag operations for DLM_RTF_SHRINK are protected by
ls->ls_rsbtbl[b].lock. However we switch to atomic ops if new possible
flags will be used in a different way and don't assume such lock
dependencies.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
|
This patch will move the lkb_flags value to the recently introduced
lkb_iflags value. For lkb_iflags we use atomic bit operations because
some flags like DLM_IFL_CB_PENDING are used while non rsb lock is held
to avoid issues with other flag manipulations which might run at the
same time we switch to atomic bit operations. Snapshot the bit values to
an uint32_t value is only used for debugging/logging use cases and don't
need to be 100% correct.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
|
Currently manipulating lkb_dflags assumes to held the rsb lock assigned
to the lkb. This is held by dlm message processing after certain
time to lookup the right rsb from the received lkb message id. For user
space locks flags, which is currently the only use case for lkb_dflags,
flags are also being set during dlm character device handling without
holding the rsb lock. To minimize the risk that bit operations are
getting corrupted we switch to atomic bit operations. This patch will
also introduce helpers to snapshot atomic bit values in an non atomic
way. There might be still issues with the flag handling e.g. running in
case of manipulating bit ops and snapshot them at the same time, but this
patch minimize them and will start to use atomic bit operations.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
|
This patch stores lkb distributed flags value in an separate value
instead of sharing internal and distributed flags in lkb->lkb_flags value.
This has the advantage to not mask/write back flag values in
receive_flags() functionality. The dlm debug_fs does not provide the
distributed flags anymore, those can be added in future.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
|
The DLM_IFL_LOCAL_MS flag is an internal non shared flag but used in
m_flags of dlm messages. It is not shared because it is only used for
local messaging. Instead using DLM_IFL_LOCAL_MS in dlm messages we pass a
parameter around to signal local messaging or not. This patch is adding
the local parameter to signal local messaging.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
|
This patch renames DLM_IFL_STUB_MS to DLM_IFL_LOCAL_MS flag. The
DLM_IFL_STUB_MS flag is somewhat misnamed, it means the dlm message is
used for local message transfer only. It is used by recovery to resolve
lock states if a node got fenced.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
|
This patch removes code parts which was declared deprecated by
commit 6b0afc0cc3e9 ("fs: dlm: don't use deprecated timeout features by
default"). This contains the following dlm functionality:
- start a cancel of a dlm request did not complete after certain timeout:
The current way how dlm cancellation works and interfering with other
dlm requests triggered by the user can end in an overlapping and
returning in -EBUSY. The most user don't handle this case and are
unaware that DLM can return such errno in such situation. Due the
timeout the user are mostly unaware when this happens.
- start a netlink warning messages for user space if dlm requests did
not complete after certain timeout:
This feature was never being built in the only known dlm user space side.
As we are to remove the timeout cancellation feature we can directly
remove this feature as well.
There might be the possibility to bring the timeout cancellation feature
back. However the current way of handling the -EBUSY case which is only
a software limitation and not a hardware limitation should be changed.
We minimize the current code base in DLM cancellation feature to not have
to deal with those existing features while solving the DLM cancellation
feature in general.
UAPI define DLM_LSFL_TIMEWARN is commented as deprecated and reserved
value. We should avoid at first to give it a new meaning but let
possible users still compile by keeping this define. In far future we
can give this flag a new meaning. The same for the DLM_LKF_TIMEOUT lock
request flag.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
|
On a 16 node virtual cluster with e1000 NICs joining the 12th node prints
SYN flood warnings for the DLM port:
Dec 21 01:46:41 localhost kernel: [ 2146.516664] TCP: request_sock_TCP: Possible SYN flooding on port 21064. Sending cookies. Check SNMP counters.
And then joining a DLM lockspace hangs:
```
Dec 21 01:49:00 localhost kernel: [ 2285.780913] INFO: task xapi-clusterd:17638 blocked for more than 120 seconds.
Dec 21 01:49:00 localhost kernel: [ 2285.786476] Not tainted 4.4.0+10 #1
Dec 21 01:49:00 localhost kernel: [ 2285.789043] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 21 01:49:00 localhost kernel: [ 2285.794611] xapi-clusterd D ffff88001930bc58 0 17638 1 0x00000000
Dec 21 01:49:00 localhost kernel: [ 2285.794615] ffff88001930bc58 ffff880025593800 ffff880022433800 ffff88001930c000
Dec 21 01:49:00 localhost kernel: [ 2285.794617] ffff88000ef4a660 ffff88000ef4a658 ffff880022433800 ffff88000ef4a000
Dec 21 01:49:00 localhost kernel: [ 2285.794619] ffff88001930bc70 ffffffff8159f6b4 7fffffffffffffff ffff88001930bd10
Dec 21 01:49:00 localhost kernel: [ 2285.794644] [<ffffffff811570fe>] ? printk+0x4d/0x4f
Dec 21 01:49:00 localhost kernel: [ 2285.794647] [<ffffffff810b1741>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
Dec 21 01:49:00 localhost kernel: [ 2285.794649] [<ffffffff815a085d>] wait_for_completion+0x9d/0x110
Dec 21 01:49:00 localhost kernel: [ 2285.794653] [<ffffffff810979e0>] ? wake_up_q+0x80/0x80
Dec 21 01:49:00 localhost kernel: [ 2285.794661] [<ffffffffa03fa4b8>] dlm_new_lockspace+0x908/0xac0 [dlm]
Dec 21 01:49:00 localhost kernel: [ 2285.794665] [<ffffffff810aaa60>] ? prepare_to_wait_event+0x100/0x100
Dec 21 01:49:00 localhost kernel: [ 2285.794670] [<ffffffffa0402e37>] device_write+0x497/0x6b0 [dlm]
Dec 21 01:49:00 localhost kernel: [ 2285.794673] [<ffffffff811834f0>] ? handle_mm_fault+0x7f0/0x13b0
Dec 21 01:49:00 localhost kernel: [ 2285.794677] [<ffffffff811b4438>] __vfs_write+0x28/0xd0
Dec 21 01:49:00 localhost kernel: [ 2285.794679] [<ffffffff811b4b7f>] ? rw_verify_area+0x6f/0xd0
Dec 21 01:49:00 localhost kernel: [ 2285.794681] [<ffffffff811b4dc1>] vfs_write+0xb1/0x190
Dec 21 01:49:00 localhost kernel: [ 2285.794686] [<ffffffff8105ffc2>] ? __do_page_fault+0x302/0x420
Dec 21 01:49:00 localhost kernel: [ 2285.794688] [<ffffffff811b5986>] SyS_write+0x46/0xa0
Dec 21 01:49:00 localhost kernel: [ 2285.794690] [<ffffffff815a31ae>] entry_SYSCALL_64_fastpath+0x12/0x71
```
The previous limit of 5 seems like an arbitrary number, that doesn't match any
known DLM cluster size upper bound limit.
Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Cc: Christine Caulfield <ccaulfie@redhat.com>
Cc: David Teigland <teigland@redhat.com>
Cc: cluster-devel@redhat.com
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
|
This patch will add the WQ_UNBOUND flag to the lowcomms dlm_io workqueue
which handles socket io handling to send and receive dlm messages.
The amount of sockets will be 2 for a 3 node cluster. Each socket has
two different workers for doing send and receive work by calling socket
API functionality. Each worker will do their task in order to send dlm
messages in a ordered stream based socket communication. On receive
side the receive buffer will be queued up for an ordered dlm_process
workqueue to parse received dlm messages. The parsing need to be done
currently in an ordered synchronized way because the dlm message processing
is not being made to parse parallel.
After explaining all those workqueue behaviours in lowcomms, the dlm_io
workqueue is only being used for socket handling. Each socket handling
has 2 workers (send and receive). In a 3 cluster node we will end up
with 4 workers. Without the WQ_UNBOUND flag the workers are tight to a
CPU and can never switch, this could be an advantage because local CPU
execution. However with dlm_locktorture testcase I expierenced not all
workers are always in use and my assumption is that some workers are
bound to the same CPU. We should always send or receive when we are
ready to do so, one reason why we disable nigel algorithm on sockets.
We should be safe to do the socket io handling on any CPU which can be
switched during runtime. There is no assumption that the worker stays on
the same CPU. There is no need to respect any workqueue concurrency
model that each worker can only run on one CPU. Lowcomms queue_work()
mechanism has an higher level flag to be sure that it can't schedule
work if the previous worker did not signal it to keep ordered socket
handling. Therefore this patch sets the WQ_UNBOUND flag to allow workers
being executed by any available CPU.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
|
This patch introduce a new internal flag per lkb value to handle
internal flags which are handled not on wire. The current lkb internal
flags stored as lkb->lkb_flags are split in upper and lower bits, the
lower bits are used to share internal flags over wire for other cluster
wide lkb copies on other nodes.
In commit 61bed0baa4db ("fs: dlm: use a non-static queue for callbacks")
we introduced a new internal flag for pending callbacks for the dlm
callback queue. This flag is protected by the lkb->lkb_cb_lock lock.
This patch overlooked that on dlm receive path and the mentioned upper
and lower bits, that dlm will read the flags, mask it and write it
back. As example receive_flags() in fs/dlm/lock.c. This flag
manipulation is not done atomically and is not protected by
lkb->lkb_cb_lock. This has unknown side effects of the current callback
handling.
In future we should move to set/clear/test bit functionality and avoid
read, mask and writing back flag values. In later patches we will move
the upper parts to the new introduced internal lkb flags which are not
shared between other cluster nodes to the new non shared internal flag
field to avoid similar issues.
Cc: stable@vger.kernel.org
Fixes: 61bed0baa4db ("fs: dlm: use a non-static queue for callbacks")
Reported-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
|
range
At btrfs_drop_extent_map_range() we are clearing the EXTENT_FLAG_LOGGING
bit on a 'flags' variable that was not initialized. This makes static
checkers complain about it, so initialize the 'flags' variable before
clearing the bit.
In practice this has no consequences, because EXTENT_FLAG_LOGGING should
not be set when btrfs_drop_extent_map_range() is called, as an fsync locks
the inode in exclusive mode, locks the inode's mmap semaphore in exclusive
mode too and it always flushes all delalloc.
Also add a comment about why we clear EXTENT_FLAG_LOGGING on a copy of the
flags of the split extent map.
Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/Y%2FyipSVozUDEZKow@kili/
Fixes: db21370bffbc ("btrfs: drop extent map range more efficiently")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We have a report, that the info message for block-group reclaim is
crossing the 100% used mark.
This is happening as we were truncating the divisor for the division
(the block_group->length) to a 32bit value.
Fix this by using div64_u64() to not truncate the divisor.
In the worst case, it can lead to a div by zero error and should be
possible to trigger on 4 disks RAID0, and each device is large enough:
$ mkfs.btrfs -f /dev/test/scratch[1234] -m raid1 -d raid0
btrfs-progs v6.1
[...]
Filesystem size: 40.00GiB
Block group profiles:
Data: RAID0 4.00GiB <<<
Metadata: RAID1 256.00MiB
System: RAID1 8.00MiB
Reported-by: Forza <forza@tnonline.net>
Link: https://lore.kernel.org/linux-btrfs/e99483.c11a58d.1863591ca52@tnonline.net/
Fixes: 5f93e776c673 ("btrfs: zoned: print unusable percentage when reclaiming block groups")
CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add Qu's note ]
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Current btrfs_log_dev_io_error() increases the read error count even if the
erroneous IO is a WRITE request. This is because it forget to use "else
if", and all the error WRITE requests counts as READ error as there is (of
course) no REQ_RAHEAD bit set.
Fixes: c3a62baf21ad ("btrfs: use chained bios when cloning")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Even if the slot is already read out, we may still need to re-balance
the tree, thus it can cause error in that btrfs_del_item() call and we
need to handle it properly.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: void0red <void0red@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Currently user space utilizes dev info ioctl to grab the info of a
certain devid, this includes its device uuid. But the returned info is
not enough to determine if a device is a seed.
Commit a26d60dedf9a ("btrfs: sysfs: add devinfo/fsid to retrieve actual
fsid from the device") exports the same value in sysfs so this is for
parity with ioctl. Add a new member, fsid, into
btrfs_ioctl_dev_info_args, and populate the member with fsid value.
This should not cause any compatibility problem, following the
combinations:
- Old user space, old kernel
- Old user space, new kernel
User space tool won't even check the new member.
- New user space, old kernel
The kernel won't touch the new member, and user space tool should
zero out its argument, thus the new member is all zero.
User space tool can then know the kernel doesn't support this fsid
reporting, and falls back to whatever they can.
- New user space, new kernel
Go as planned.
Would find the fsid member is no longer zero, and trust its value.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
As reported by Filipe, there's a potential deadlock caused by
using btrfs_search_forward on commit_root. The locking there is
unconditional, even if ->skip_locking and ->search_commit_root is set.
It's not meant to be used for commit roots, so it always needs to do
locking.
So if another task is COWing a child node of the same root node and
then needs to wait for block group caching to complete when trying to
allocate a metadata extent, it deadlocks.
For example:
[539604.239315] sysrq: Show Blocked State
[539604.240133] task:kworker/u16:6 state:D stack:0 pid:2119594 ppid:2 flags:0x00004000
[539604.241613] Workqueue: btrfs-cache btrfs_work_helper [btrfs]
[539604.242673] Call Trace:
[539604.243129] <TASK>
[539604.243925] __schedule+0x41d/0xee0
[539604.244797] ? rcu_read_lock_sched_held+0x12/0x70
[539604.245399] ? rwsem_down_read_slowpath+0x185/0x490
[539604.246111] schedule+0x5d/0xf0
[539604.246593] rwsem_down_read_slowpath+0x2da/0x490
[539604.247290] ? rcu_barrier_tasks_trace+0x10/0x20
[539604.248090] __down_read_common+0x3d/0x150
[539604.248702] down_read_nested+0xc3/0x140
[539604.249280] __btrfs_tree_read_lock+0x24/0x100 [btrfs]
[539604.250097] btrfs_read_lock_root_node+0x48/0x60 [btrfs]
[539604.250915] btrfs_search_forward+0x59/0x460 [btrfs]
[539604.251781] ? btrfs_global_root+0x50/0x70 [btrfs]
[539604.252476] caching_thread+0x1be/0x920 [btrfs]
[539604.253167] btrfs_work_helper+0xf6/0x400 [btrfs]
[539604.253848] process_one_work+0x24f/0x5a0
[539604.254476] worker_thread+0x52/0x3b0
[539604.255166] ? __pfx_worker_thread+0x10/0x10
[539604.256047] kthread+0xf0/0x120
[539604.256591] ? __pfx_kthread+0x10/0x10
[539604.257212] ret_from_fork+0x29/0x50
[539604.257822] </TASK>
[539604.258233] task:btrfs-transacti state:D stack:0 pid:2236474 ppid:2 flags:0x00004000
[539604.259802] Call Trace:
[539604.260243] <TASK>
[539604.260615] __schedule+0x41d/0xee0
[539604.261205] ? rcu_read_lock_sched_held+0x12/0x70
[539604.262000] ? rwsem_down_read_slowpath+0x185/0x490
[539604.262822] schedule+0x5d/0xf0
[539604.263374] rwsem_down_read_slowpath+0x2da/0x490
[539604.266228] ? lock_acquire+0x160/0x310
[539604.266917] ? rcu_read_lock_sched_held+0x12/0x70
[539604.267996] ? lock_contended+0x19e/0x500
[539604.268720] __down_read_common+0x3d/0x150
[539604.269400] down_read_nested+0xc3/0x140
[539604.270057] __btrfs_tree_read_lock+0x24/0x100 [btrfs]
[539604.271129] btrfs_read_lock_root_node+0x48/0x60 [btrfs]
[539604.272372] btrfs_search_slot+0x143/0xf70 [btrfs]
[539604.273295] update_block_group_item+0x9e/0x190 [btrfs]
[539604.274282] btrfs_start_dirty_block_groups+0x1c4/0x4f0 [btrfs]
[539604.275381] ? __mutex_unlock_slowpath+0x45/0x280
[539604.276390] btrfs_commit_transaction+0xee/0xed0 [btrfs]
[539604.277391] ? lock_acquire+0x1a4/0x310
[539604.278080] ? start_transaction+0xcb/0x6c0 [btrfs]
[539604.279099] transaction_kthread+0x142/0x1c0 [btrfs]
[539604.279996] ? __pfx_transaction_kthread+0x10/0x10 [btrfs]
[539604.280673] kthread+0xf0/0x120
[539604.281050] ? __pfx_kthread+0x10/0x10
[539604.281496] ret_from_fork+0x29/0x50
[539604.281966] </TASK>
[539604.282255] task:fsstress state:D stack:0 pid:2236483 ppid:1 flags:0x00004006
[539604.283897] Call Trace:
[539604.284700] <TASK>
[539604.285088] __schedule+0x41d/0xee0
[539604.285660] schedule+0x5d/0xf0
[539604.286175] btrfs_wait_block_group_cache_progress+0xf2/0x170 [btrfs]
[539604.287342] ? __pfx_autoremove_wake_function+0x10/0x10
[539604.288450] find_free_extent+0xd93/0x1750 [btrfs]
[539604.289256] ? _raw_spin_unlock+0x29/0x50
[539604.289911] ? btrfs_get_alloc_profile+0x127/0x2a0 [btrfs]
[539604.290843] btrfs_reserve_extent+0x147/0x290 [btrfs]
[539604.291943] btrfs_alloc_tree_block+0xcb/0x3e0 [btrfs]
[539604.292903] __btrfs_cow_block+0x138/0x580 [btrfs]
[539604.293773] btrfs_cow_block+0x10e/0x240 [btrfs]
[539604.294595] btrfs_search_slot+0x7f3/0xf70 [btrfs]
[539604.295585] btrfs_update_device+0x71/0x1b0 [btrfs]
[539604.296459] btrfs_chunk_alloc_add_chunk_item+0xe0/0x340 [btrfs]
[539604.297489] btrfs_chunk_alloc+0x1bf/0x490 [btrfs]
[539604.298335] find_free_extent+0x6fa/0x1750 [btrfs]
[539604.299174] ? _raw_spin_unlock+0x29/0x50
[539604.299950] ? btrfs_get_alloc_profile+0x127/0x2a0 [btrfs]
[539604.300918] btrfs_reserve_extent+0x147/0x290 [btrfs]
[539604.301797] btrfs_alloc_tree_block+0xcb/0x3e0 [btrfs]
[539604.303017] ? lock_release+0x224/0x4a0
[539604.303855] __btrfs_cow_block+0x138/0x580 [btrfs]
[539604.304789] btrfs_cow_block+0x10e/0x240 [btrfs]
[539604.305611] btrfs_search_slot+0x7f3/0xf70 [btrfs]
[539604.306682] ? btrfs_global_root+0x50/0x70 [btrfs]
[539604.308198] lookup_inline_extent_backref+0x17b/0x7a0 [btrfs]
[539604.309254] lookup_extent_backref+0x43/0xd0 [btrfs]
[539604.310122] __btrfs_free_extent+0xf8/0x810 [btrfs]
[539604.310874] ? lock_release+0x224/0x4a0
[539604.311724] ? btrfs_merge_delayed_refs+0x17b/0x1d0 [btrfs]
[539604.313023] __btrfs_run_delayed_refs+0x2ba/0x1260 [btrfs]
[539604.314271] btrfs_run_delayed_refs+0x8f/0x1c0 [btrfs]
[539604.315445] ? rcu_read_lock_sched_held+0x12/0x70
[539604.316706] btrfs_commit_transaction+0xa2/0xed0 [btrfs]
[539604.317855] ? do_raw_spin_unlock+0x4b/0xa0
[539604.318544] ? _raw_spin_unlock+0x29/0x50
[539604.319240] create_subvol+0x53d/0x6e0 [btrfs]
[539604.320283] btrfs_mksubvol+0x4f5/0x590 [btrfs]
[539604.321220] __btrfs_ioctl_snap_create+0x11b/0x180 [btrfs]
[539604.322307] btrfs_ioctl_snap_create_v2+0xc6/0x150 [btrfs]
[539604.323295] btrfs_ioctl+0x9f7/0x33e0 [btrfs]
[539604.324331] ? rcu_read_lock_sched_held+0x12/0x70
[539604.325137] ? lock_release+0x224/0x4a0
[539604.325808] ? __x64_sys_ioctl+0x87/0xc0
[539604.326467] __x64_sys_ioctl+0x87/0xc0
[539604.327109] do_syscall_64+0x38/0x90
[539604.327875] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[539604.328792] RIP: 0033:0x7f05a7babaeb
This needs to use regular btrfs_search_slot() with some skip and stop
logic.
Since we only consider five samples (five search slots), don't bother
with the complexity of looking for commit_root_sem contention. If
necessary, it can be added to the load function in between samples.
Reported-by: Filipe Manana <fdmanana@kernel.org>
Link: https://lore.kernel.org/linux-btrfs/CAL3q7H7eKMD44Z1+=Kb-1RFMMeZpAm2fwyO59yeBwCcSOU80Pg@mail.gmail.com/
Fixes: c7eec3d9aa95 ("btrfs: load block group size class when caching")
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Check that log of block size stored in the superblock has sensible
value. Otherwise the shift computing the block size can overflow leading
to undefined behavior.
Reported-by: syzbot+4fec412f59eba8c01b77@syzkaller.appspotmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Ext2 has traditionally supported filesystem block sizes upto page size
or upto 65536. Macro EXT2_MAX_BLOCK_SIZE is set to 4096, however that is
never used in ext2 so practically we always allowed whatever
sb_set_blocksize() accepted. Fix value of EXT2_MAX_BLOCK_SIZE because it
will be used in the next patch.
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Now that address space operations are merge dfor in-ICB and normal
files, it is more likely some code mistakenly tries to map blocks for
in-ICB files. WARN and return error instead of silently returning
garbage.
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
After merging address space operations of normal and in-ICB files,
readahead could get called for in-ICB files which resulted in
udf_get_block() being called for these files. udf_get_block() is not
prepared to be called for in-ICB files and ends up returning garbage
results as it interprets file data as extent list. Fix the problem by
skipping readahead for in-ICB files.
Fixes: 37a8a39f7ad3 ("udf: Switch to single address_space_operations")
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
The patch converting udf_adinicb_writepage() to avoid manually kmapping
the page used memcpy_to_page() however that copies in the wrong
direction (effectively overwriting file data with the old contents).
What we should be using is memcpy_from_page() to copy data from the page
into the inode and then mark inode dirty to store the data.
Fixes: 5cfc45321a6d ("udf: Convert udf_adinicb_writepage() to memcpy_to_page()")
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
All codepaths that don't want to implement POSIX ACLs should simply not
implement the associated inode operations instead of relying on
IOP_XATTR. That's the case for all filesystems today.
For vfs_listxattr() all filesystems that explicitly turn of xattrs for a
given inode all set inode->i_op to a dedicated set of inode operations
that doesn't implement ->listxattr(). We can remove the dependency of
vfs_listxattr() on IOP_XATTR.
Removing this dependency will allow us to decouple POSIX ACLs from
IOP_XATTR and they can still be listed even if no other xattr handlers
are implemented. Otherwise we would have to implement elaborate schemes
to raise IOP_XATTR even if sb->s_xattr is set to NULL.
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
|
We have decoupled vfs_listxattr() from IOP_XATTR. Instead we just need
to check whether inode->i_op->listxattr is implemented.
Cc: linux-unionfs@vger.kernel.org
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
|
Reiserfs is the only filesystem that removes IOP_XATTR without also
using a set of dedicated inode operations at the same time that nop all
xattr related inode operations. This means we need to have a IOP_XATTR
check in vfs_listxattr() instead of just being able to check for
->listxatt() being implemented.
Introduce a dedicated set of nop inode operations that are used when
IOP_XATTR is removed, allowing us to remove that check from
vfs_listxattr(). This in turn allows us to completely decouple POSIX ACLs from
IOP_XATTR.
Cc: reiserfs-devel@vger.kernel.org
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
|
Reflect in their naming and document that they are kept around for
legacy reasons and shouldn't be used anymore by new code.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
|
Rework reiserfs so it doesn't have to rely on the dummy xattr handlers
in its s_xattr list anymore as this is completely unused for setting and
getting posix acls.
Cc: reiserfs-devel@vger.kernel.org
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
|
The ext{2,4}, erofs, f2fs, and jffs2 filesystems use the same logic to
check whether a given xattr can be listed. Simplify them and avoid
open-coding the same check by calling the helper we introduced earlier.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: linux-f2fs-devel@lists.sourceforge.net
Cc: linux-erofs@lists.ozlabs.org
Cc: linux-ext4@vger.kernel.org
Cc: linux-mtd@lists.infradead.org
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
|
Remove struct posix_acl_{access,default}_handler for all filesystems
that don't depend on the xattr handler in their inode->i_op->listxattr()
method in any way. There's nothing more to do than to simply remove the
handler. It's been effectively unused ever since we introduced the new
posix acl api.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
|
his helpers is really just used to check for user.* xattr support so
don't make it pointlessly generic.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
|
The generic_listxattr() and simple_xattr_list() helpers list xattrs and
contain duplicated code. Add two helpers that both generic_listxattr()
and simple_xattr_list() can use.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
|
always NULL...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
When send SMB_COM_NT_CANCEL and RFC1002_SESSION_REQUEST, the
in_send statistic was lost.
Let's move the in_send statistic to the send function to avoid
this scenario.
Fixes: 7ee1af765dfa ("[CIFS]")
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|