Age | Commit message (Collapse) | Author |
|
Now that all vs_dispatch functions invoke svcxdr_init_encode(), it
is common code and can be pushed down into the generic RPC server.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The WARN_ON_ONCE check is not terribly useful. It also seems possible
for nfs4_find_file to race with the destruction of an fi_deleg_file
while trying to take a reference to it.
Now that it's safe to pass nfs_get_file a NULL pointer, remove the WARN
and NULL pointer check. Take the fi_lock when fetching fi_deleg_file.
Cc: NeilBrown <neilb@suse.de>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
...and remove some now-useless NULL pointer checks in its callers.
Suggested-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Currently nfsd4_setup_inter_ssc returns the vfsmount of the source
server's export when the mount completes. After the copy is done
nfsd4_cleanup_inter_ssc is called with the vfsmount of the source
server and it searches nfsd_ssc_mount_list for a matching entry
to do the clean up.
The problems with this approach are (1) the need to search the
nfsd_ssc_mount_list and (2) the code has to handle the case where
the matching entry is not found which looks ugly.
The enhancement is instead of nfsd4_setup_inter_ssc returning the
vfsmount, it returns the nfsd4_ssc_umount_item which has the
vfsmount embedded in it. When nfsd4_cleanup_inter_ssc is called
it's passed with the nfsd4_ssc_umount_item directly to do the
clean up so no searching is needed and there is no need to handle
the 'not found' case.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
[ cel: adjusted whitespace and variable/function names ]
Reviewed-by: Olga Kornievskaia <kolga@netapp.com>
|
|
Now that all vs_dispatch functions invoke svcxdr_init_decode(), it
is common code and can be pushed down into the generic RPC server.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Now with NFSD being able to cross into auto mounts,
the check can be removed.
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Ian Kent <raven@themaw.net>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
This function is only used by NFSD to cross mount points.
If a mount point is of type auto mount, follow_down() will
not uncover it. Add LOOKUP_AUTOMOUNT to the lookup flags
to have ->d_automount() called when NFSD walks down the
mount tree.
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Ian Kent <raven@themaw.net>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Currently nfsd_mountpoint() tests for mount points using d_mountpoint(),
this works only when a mount point is already uncovered.
In our case the mount point is of type auto mount and can be coverted.
i.e. ->d_automount() was not called.
Using d_managed() nfsd_mountpoint() can test whether a mount point is
either already uncovered or can be uncovered later.
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Ian Kent <raven@themaw.net>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"Six hotfixes. Five are cc:stable: four for MM, one for nilfs2.
Also a MAINTAINERS update"
* tag 'mm-hotfixes-stable-2023-02-17-15-16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
nilfs2: fix underflow in second superblock position calculations
hugetlb: check for undefined shift on 32 bit architectures
mm/migrate: fix wrongly apply write bit after mkdirty on sparc64
MAINTAINERS: update FPU EMULATOR web page
mm/MADV_COLLAPSE: set EAGAIN on unexpected page refcount
mm/filemap: fix page end in filemap_get_read_batch
|
|
Macro NILFS_SB2_OFFSET_BYTES, which computes the position of the second
superblock, underflows when the argument device size is less than 4096
bytes. Therefore, when using this macro, it is necessary to check in
advance that the device size is not less than a lower limit, or at least
that underflow does not occur.
The current nilfs2 implementation lacks this check, causing out-of-bound
block access when mounting devices smaller than 4096 bytes:
I/O error, dev loop0, sector 36028797018963960 op 0x0:(READ) flags 0x0
phys_seg 1 prio class 2
NILFS (loop0): unable to read secondary superblock (blocksize = 1024)
In addition, when trying to resize the filesystem to a size below 4096
bytes, this underflow occurs in nilfs_resize_fs(), passing a huge number
of segments to nilfs_sufile_resize(), corrupting parameters such as the
number of segments in superblocks. This causes excessive loop iterations
in nilfs_sufile_resize() during a subsequent resize ioctl, causing
semaphore ns_segctor_sem to block for a long time and hang the writer
thread:
INFO: task segctord:5067 blocked for more than 143 seconds.
Not tainted 6.2.0-rc8-syzkaller-00015-gf6feea56f66d #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:segctord state:D stack:23456 pid:5067 ppid:2
flags:0x00004000
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5293 [inline]
__schedule+0x1409/0x43f0 kernel/sched/core.c:6606
schedule+0xc3/0x190 kernel/sched/core.c:6682
rwsem_down_write_slowpath+0xfcf/0x14a0 kernel/locking/rwsem.c:1190
nilfs_transaction_lock+0x25c/0x4f0 fs/nilfs2/segment.c:357
nilfs_segctor_thread_construct fs/nilfs2/segment.c:2486 [inline]
nilfs_segctor_thread+0x52f/0x1140 fs/nilfs2/segment.c:2570
kthread+0x270/0x300 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
</TASK>
...
Call Trace:
<TASK>
folio_mark_accessed+0x51c/0xf00 mm/swap.c:515
__nilfs_get_page_block fs/nilfs2/page.c:42 [inline]
nilfs_grab_buffer+0x3d3/0x540 fs/nilfs2/page.c:61
nilfs_mdt_submit_block+0xd7/0x8f0 fs/nilfs2/mdt.c:121
nilfs_mdt_read_block+0xeb/0x430 fs/nilfs2/mdt.c:176
nilfs_mdt_get_block+0x12d/0xbb0 fs/nilfs2/mdt.c:251
nilfs_sufile_get_segment_usage_block fs/nilfs2/sufile.c:92 [inline]
nilfs_sufile_truncate_range fs/nilfs2/sufile.c:679 [inline]
nilfs_sufile_resize+0x7a3/0x12b0 fs/nilfs2/sufile.c:777
nilfs_resize_fs+0x20c/0xed0 fs/nilfs2/super.c:422
nilfs_ioctl_resize fs/nilfs2/ioctl.c:1033 [inline]
nilfs_ioctl+0x137c/0x2440 fs/nilfs2/ioctl.c:1301
...
This fixes these issues by inserting appropriate minimum device size
checks or anti-underflow checks, depending on where the macro is used.
Link: https://lkml.kernel.org/r/0000000000004e1dfa05f4a48e6b@google.com
Link: https://lkml.kernel.org/r/20230214224043.24141-1-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: <syzbot+f0c4082ce5ebebdac63b@syzkaller.appspotmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This reverts commit 7fd461c47c6cfab4ca4d003790ec276209e52978.
Unfortunately, it has come to our attention that there is still a bug
somewhere in the READ_PLUS code that can result in nfsroot systems on
ARM to crash during boot.
Let's do the right thing and revert this change so we don't break
people's nfsroot setups.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fix from Chuck Lever:
- Fix a teardown bug in the new nfs4_file hashtable
* tag 'nfsd-6.2-6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
nfsd: don't destroy global nfs4_file table in per-net shutdown
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"Twelve hotfixes, mostly against mm/.
Five of these fixes are cc:stable"
* tag 'mm-hotfixes-stable-2023-02-13-13-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
of: reserved_mem: Have kmemleak ignore dynamically allocated reserved mem
scripts/gdb: fix 'lx-current' for x86
lib: parser: optimize match_NUMBER apis to use local array
mm: shrinkers: fix deadlock in shrinker debugfs
mm: hwpoison: support recovery from ksm_might_need_to_copy()
kasan: fix Oops due to missing calls to kasan_arch_is_ready()
revert "squashfs: harden sanity check in squashfs_read_xattr_id_table"
fsdax: dax_unshare_iter() should return a valid length
mm/gup: add folio to list when folio_isolate_lru() succeed
aio: fix mremap after fork null-deref
mailmap: add entry for Alexander Mikhalitsyn
mm: extend max struct page size for kmsan
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- one more fix for a tree-log 'write time corruption' report, update
the last dir index directly and don't keep in the log context
- do VFS-level inode lock around FIEMAP to prevent a deadlock with
concurrent fsync, the extent-level lock is not sufficient
- don't cache a single-device filesystem device to avoid cases when a
loop device is reformatted and the entry gets stale
* tag 'for-6.2-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: free device in btrfs_close_devices for a single device filesystem
btrfs: lock the inode in shared mode before starting fiemap
btrfs: simplify update of last_dir_index_offset when logging a directory
|
|
The nfs4_file table is global, so shutting it down when a containerized
nfsd is shut down is wrong and can lead to double-frees. Tear down the
nfs4_file_rhltable in nfs4_state_shutdown instead of
nfs4_state_shutdown_net.
Fixes: d47b295e8d76 ("NFSD: Use rhashtable for managing nfs4_file objects")
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2169017
Reported-by: JianHong Yin <jiyin@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Pull ceph fix from Ilya Dryomov:
"A fix for a pretty embarrassing omission in the session flush handler
from Xiubo, marked for stable"
* tag 'ceph-for-6.2-rc8' of https://github.com/ceph/ceph-client:
ceph: flush cap releases when the session is flushed
|
|
Pull cifx fix from Steve French:
"Small fix for use after free"
* tag '6.2-rc8-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix use-after-free in rdata->read_into_pages()
|
|
We have this check to make sure we don't accidentally add older devices
that may have disappeared and re-appeared with an older generation from
being added to an fs_devices (such as a replace source device). This
makes sense, we don't want stale disks in our file system. However for
single disks this doesn't really make sense.
I've seen this in testing, but I was provided a reproducer from a
project that builds btrfs images on loopback devices. The loopback
device gets cached with the new generation, and then if it is re-used to
generate a new file system we'll fail to mount it because the new fs is
"older" than what we have in cache.
Fix this by freeing the cache when closing the device for a single device
filesystem. This will ensure that the mount command passed device path is
scanned successfully during the next mount.
CC: stable@vger.kernel.org # 5.10+
Reported-by: Daan De Meyer <daandemeyer@fb.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Currently fiemap does not take the inode's lock (VFS lock), it only locks
a file range in the inode's io tree. This however can lead to a deadlock
if we have a concurrent fsync on the file and fiemap code triggers a fault
when accessing the user space buffer with fiemap_fill_next_extent(). The
deadlock happens on the inode's i_mmap_lock semaphore, which is taken both
by fsync and btrfs_page_mkwrite(). This deadlock was recently reported by
syzbot and triggers a trace like the following:
task:syz-executor361 state:D stack:20264 pid:5668 ppid:5119 flags:0x00004004
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5293 [inline]
__schedule+0x995/0xe20 kernel/sched/core.c:6606
schedule+0xcb/0x190 kernel/sched/core.c:6682
wait_on_state fs/btrfs/extent-io-tree.c:707 [inline]
wait_extent_bit+0x577/0x6f0 fs/btrfs/extent-io-tree.c:751
lock_extent+0x1c2/0x280 fs/btrfs/extent-io-tree.c:1742
find_lock_delalloc_range+0x4e6/0x9c0 fs/btrfs/extent_io.c:488
writepage_delalloc+0x1ef/0x540 fs/btrfs/extent_io.c:1863
__extent_writepage+0x736/0x14e0 fs/btrfs/extent_io.c:2174
extent_write_cache_pages+0x983/0x1220 fs/btrfs/extent_io.c:3091
extent_writepages+0x219/0x540 fs/btrfs/extent_io.c:3211
do_writepages+0x3c3/0x680 mm/page-writeback.c:2581
filemap_fdatawrite_wbc+0x11e/0x170 mm/filemap.c:388
__filemap_fdatawrite_range mm/filemap.c:421 [inline]
filemap_fdatawrite_range+0x175/0x200 mm/filemap.c:439
btrfs_fdatawrite_range fs/btrfs/file.c:3850 [inline]
start_ordered_ops fs/btrfs/file.c:1737 [inline]
btrfs_sync_file+0x4ff/0x1190 fs/btrfs/file.c:1839
generic_write_sync include/linux/fs.h:2885 [inline]
btrfs_do_write_iter+0xcd3/0x1280 fs/btrfs/file.c:1684
call_write_iter include/linux/fs.h:2189 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x7dc/0xc50 fs/read_write.c:584
ksys_write+0x177/0x2a0 fs/read_write.c:637
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f7d4054e9b9
RSP: 002b:00007f7d404fa2f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f7d405d87a0 RCX: 00007f7d4054e9b9
RDX: 0000000000000090 RSI: 0000000020000000 RDI: 0000000000000006
RBP: 00007f7d405a51d0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 61635f65646f6e69
R13: 65646f7475616f6e R14: 7261637369646f6e R15: 00007f7d405d87a8
</TASK>
INFO: task syz-executor361:5697 blocked for more than 145 seconds.
Not tainted 6.2.0-rc3-syzkaller-00376-g7c6984405241 #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor361 state:D stack:21216 pid:5697 ppid:5119 flags:0x00004004
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5293 [inline]
__schedule+0x995/0xe20 kernel/sched/core.c:6606
schedule+0xcb/0x190 kernel/sched/core.c:6682
rwsem_down_read_slowpath+0x5f9/0x930 kernel/locking/rwsem.c:1095
__down_read_common+0x54/0x2a0 kernel/locking/rwsem.c:1260
btrfs_page_mkwrite+0x417/0xc80 fs/btrfs/inode.c:8526
do_page_mkwrite+0x19e/0x5e0 mm/memory.c:2947
wp_page_shared+0x15e/0x380 mm/memory.c:3295
handle_pte_fault mm/memory.c:4949 [inline]
__handle_mm_fault mm/memory.c:5073 [inline]
handle_mm_fault+0x1b79/0x26b0 mm/memory.c:5219
do_user_addr_fault+0x69b/0xcb0 arch/x86/mm/fault.c:1428
handle_page_fault arch/x86/mm/fault.c:1519 [inline]
exc_page_fault+0x7a/0x110 arch/x86/mm/fault.c:1575
asm_exc_page_fault+0x22/0x30 arch/x86/include/asm/idtentry.h:570
RIP: 0010:copy_user_short_string+0xd/0x40 arch/x86/lib/copy_user_64.S:233
Code: 74 0a 89 (...)
RSP: 0018:ffffc9000570f330 EFLAGS: 00050202
RAX: ffffffff843e6601 RBX: 00007fffffffefc8 RCX: 0000000000000007
RDX: 0000000000000000 RSI: ffffc9000570f3e0 RDI: 0000000020000120
RBP: ffffc9000570f490 R08: 0000000000000000 R09: fffff52000ae1e83
R10: fffff52000ae1e83 R11: 1ffff92000ae1e7c R12: 0000000000000038
R13: ffffc9000570f3e0 R14: 0000000020000120 R15: ffffc9000570f3e0
copy_user_generic arch/x86/include/asm/uaccess_64.h:37 [inline]
raw_copy_to_user arch/x86/include/asm/uaccess_64.h:58 [inline]
_copy_to_user+0xe9/0x130 lib/usercopy.c:34
copy_to_user include/linux/uaccess.h:169 [inline]
fiemap_fill_next_extent+0x22e/0x410 fs/ioctl.c:144
emit_fiemap_extent+0x22d/0x3c0 fs/btrfs/extent_io.c:3458
fiemap_process_hole+0xa00/0xad0 fs/btrfs/extent_io.c:3716
extent_fiemap+0xe27/0x2100 fs/btrfs/extent_io.c:3922
btrfs_fiemap+0x172/0x1e0 fs/btrfs/inode.c:8209
ioctl_fiemap fs/ioctl.c:219 [inline]
do_vfs_ioctl+0x185b/0x2980 fs/ioctl.c:810
__do_sys_ioctl fs/ioctl.c:868 [inline]
__se_sys_ioctl+0x83/0x170 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f7d4054e9b9
RSP: 002b:00007f7d390d92f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f7d405d87b0 RCX: 00007f7d4054e9b9
RDX: 0000000020000100 RSI: 00000000c020660b RDI: 0000000000000005
RBP: 00007f7d405a51d0 R08: 00007f7d390d9700 R09: 0000000000000000
R10: 00007f7d390d9700 R11: 0000000000000246 R12: 61635f65646f6e69
R13: 65646f7475616f6e R14: 7261637369646f6e R15: 00007f7d405d87b8
</TASK>
What happens is the following:
1) Task A is doing an fsync, enters btrfs_sync_file() and flushes delalloc
before locking the inode and the i_mmap_lock semaphore, that is, before
calling btrfs_inode_lock();
2) After task A flushes delalloc and before it calls btrfs_inode_lock(),
another task dirties a page;
3) Task B starts a fiemap without FIEMAP_FLAG_SYNC, so the page dirtied
at step 2 remains dirty and unflushed. Then when it enters
extent_fiemap() and it locks a file range that includes the range of
the page dirtied in step 2;
4) Task A calls btrfs_inode_lock() and locks the inode (VFS lock) and the
inode's i_mmap_lock semaphore in write mode. Then it tries to flush
delalloc by calling start_ordered_ops(), which will block, at
find_lock_delalloc_range(), when trying to lock the range of the page
dirtied at step 2, since this range was locked by the fiemap task (at
step 3);
5) Task B generates a page fault when accessing the user space fiemap
buffer with a call to fiemap_fill_next_extent().
The fault handler needs to call btrfs_page_mkwrite() for some other
page of our inode, and there we deadlock when trying to lock the
inode's i_mmap_lock semaphore in read mode, since the fsync task locked
it in write mode (step 4) and the fsync task can not progress because
it's waiting to lock a file range that is currently locked by us (the
fiemap task, step 3).
Fix this by taking the inode's lock (VFS lock) in shared mode when
entering fiemap. This effectively serializes fiemap with fsync (except the
most expensive part of fsync, the log sync), preventing this deadlock.
Reported-by: syzbot+cc35f55c41e34c30dcb5@syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-btrfs/00000000000032dc7305f2a66f46@google.com/
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
MDS expects the completed cap release prior to responding to the
session flush for cache drop.
Cc: stable@vger.kernel.org
Link: http://tracker.ceph.com/issues/38009
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
When the network status is unstable, use-after-free may occur when
read data from the server.
BUG: KASAN: use-after-free in readpages_fill_pages+0x14c/0x7e0
Call Trace:
<TASK>
dump_stack_lvl+0x38/0x4c
print_report+0x16f/0x4a6
kasan_report+0xb7/0x130
readpages_fill_pages+0x14c/0x7e0
cifs_readv_receive+0x46d/0xa40
cifs_demultiplex_thread+0x121c/0x1490
kthread+0x16b/0x1a0
ret_from_fork+0x2c/0x50
</TASK>
Allocated by task 2535:
kasan_save_stack+0x22/0x50
kasan_set_track+0x25/0x30
__kasan_kmalloc+0x82/0x90
cifs_readdata_direct_alloc+0x2c/0x110
cifs_readdata_alloc+0x2d/0x60
cifs_readahead+0x393/0xfe0
read_pages+0x12f/0x470
page_cache_ra_unbounded+0x1b1/0x240
filemap_get_pages+0x1c8/0x9a0
filemap_read+0x1c0/0x540
cifs_strict_readv+0x21b/0x240
vfs_read+0x395/0x4b0
ksys_read+0xb8/0x150
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
Freed by task 79:
kasan_save_stack+0x22/0x50
kasan_set_track+0x25/0x30
kasan_save_free_info+0x2e/0x50
__kasan_slab_free+0x10e/0x1a0
__kmem_cache_free+0x7a/0x1a0
cifs_readdata_release+0x49/0x60
process_one_work+0x46c/0x760
worker_thread+0x2a4/0x6f0
kthread+0x16b/0x1a0
ret_from_fork+0x2c/0x50
Last potentially related work creation:
kasan_save_stack+0x22/0x50
__kasan_record_aux_stack+0x95/0xb0
insert_work+0x2b/0x130
__queue_work+0x1fe/0x660
queue_work_on+0x4b/0x60
smb2_readv_callback+0x396/0x800
cifs_abort_connection+0x474/0x6a0
cifs_reconnect+0x5cb/0xa50
cifs_readv_from_socket.cold+0x22/0x6c
cifs_read_page_from_socket+0xc1/0x100
readpages_fill_pages.cold+0x2f/0x46
cifs_readv_receive+0x46d/0xa40
cifs_demultiplex_thread+0x121c/0x1490
kthread+0x16b/0x1a0
ret_from_fork+0x2c/0x50
The following function calls will cause UAF of the rdata pointer.
readpages_fill_pages
cifs_read_page_from_socket
cifs_readv_from_socket
cifs_reconnect
__cifs_reconnect
cifs_abort_connection
mid->callback() --> smb2_readv_callback
queue_work(&rdata->work) # if the worker completes first,
# the rdata is freed
cifs_readv_complete
kref_put
cifs_readdata_release
kfree(rdata)
return rdata->... # UAF in readpages_fill_pages()
Similarly, this problem also occurs in the uncache_fill_pages().
Fix this by adjusts the order of condition judgment in the return
statement.
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Cc: stable@vger.kernel.org
Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When logging a directory, we always set the inode's last_dir_index_offset
to the offset of the last dir index item we found. This is using an extra
field in the log context structure, and it makes more sense to update it
only after we insert dir index items, and we could directly update the
inode's last_dir_index_offset field instead.
So make this simpler by updating the inode's last_dir_index_offset only
when we actually insert dir index keys in the log tree, and getting rid
of the last_dir_item_offset field in the log context structure.
Reported-by: David Arendt <admin@prnet.org>
Link: https://lore.kernel.org/linux-btrfs/ae169fc6-f504-28f0-a098-6fa6a4dfb612@leemhuis.info/
Reported-by: Maxim Mikityanskiy <maxtram95@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/Y8voyTXdnPDz8xwY@mail.gmail.com/
Reported-by: Hunter Wardlaw <wardlawhunter@gmail.com>
Link: https://bugzilla.suse.com/show_bug.cgi?id=1207231
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216851
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- explicitly initialize zlib work memory to fix a KCSAN warning
- limit number of send clones by maximum memory allocated
- limit device size extent in case it device shrink races with chunk
allocation
- raid56 fixes:
- fix copy&paste error in RAID6 stripe recovery
- make error bitmap update atomic
* tag 'for-6.2-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: raid56: make error_bitmap update atomic
btrfs: send: limit number of clones and allocated memory size
btrfs: zlib: zero-initialize zlib workspace
btrfs: limit device extents to the device size
btrfs: raid56: fix stripes if vertical errors are found
|
|
Pull ELF fix from Al Viro:
"One of the many equivalent build warning fixes for !CONFIG_ELF_CORE
configs. Geert's is the earliest one I've been able to find"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
coredump: Move dump_emit_page() to kill unused warning
|
|
This fix was nacked by Philip, for reasons identified in the email linked
below.
Link: https://lkml.kernel.org/r/68f15d67-8945-2728-1f17-5b53a80ec52d@squashfs.org.uk
Fixes: 72e544b1b28325 ("squashfs: harden sanity check in squashfs_read_xattr_id_table")
Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Fedor Pchelkin <pchelkin@ispras.ru>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The copy_mc_to_kernel() will return 0 if it executed successfully. Then
the return value should be set to the length it copied.
[akpm@linux-foundation.org: don't mess up `ret', per Matthew]
Link: https://lkml.kernel.org/r/1675341227-14-1-git-send-email-ruansy.fnst@fujitsu.com
Fixes: d984648e428b ("fsdax,xfs: port unshare to fsdax")
Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Commit e4a0d3e720e7 ("aio: Make it possible to remap aio ring") introduced
a null-deref if mremap is called on an old aio mapping after fork as
mm->ioctx_table will be set to NULL.
[jmoyer@redhat.com: fix 80 column issue]
Link: https://lkml.kernel.org/r/x49sffq4nvg.fsf@segfault.boston.devel.redhat.com
Fixes: e4a0d3e720e7 ("aio: Make it possible to remap aio ring")
Signed-off-by: Seth Jenkins <sethjenkins@google.com>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Jann Horn <jannh@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Pull ceph fix from Ilya Dryomov:
"A safeguard to prevent the kernel client from further damaging the
filesystem after running into a case of an invalid snap trace.
The root cause of this metadata corruption is still being investigated
but it appears to be stemming from the MDS. As such, this is the best
we can do for now"
* tag 'ceph-for-6.2-rc7' of https://github.com/ceph/ceph-client:
ceph: blocklist the kclient when receiving corrupted snap trace
ceph: move mount state enum to super.h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"25 hotfixes, mainly for MM. 13 are cc:stable"
* tag 'mm-hotfixes-stable-2023-02-02-19-24-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (26 commits)
mm: memcg: fix NULL pointer in mem_cgroup_track_foreign_dirty_slowpath()
Kconfig.debug: fix the help description in SCHED_DEBUG
mm/swapfile: add cond_resched() in get_swap_pages()
mm: use stack_depot_early_init for kmemleak
Squashfs: fix handling and sanity checking of xattr_ids count
sh: define RUNTIME_DISCARD_EXIT
highmem: round down the address passed to kunmap_flush_on_unmap()
migrate: hugetlb: check for hugetlb shared PMD in node migration
mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps
mm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups
Revert "mm: kmemleak: alloc gray object for reserved region with direct map"
freevxfs: Kconfig: fix spelling
maple_tree: should get pivots boundary by type
.mailmap: update e-mail address for Eugen Hristev
mm, mremap: fix mremap() expanding for vma's with vm_ops->close()
squashfs: harden sanity check in squashfs_read_xattr_id_table
ia64: fix build error due to switch case label appearing next to declaration
mm: multi-gen LRU: fix crash during cgroup migration
Revert "mm: add nodes= arg to memory.reclaim"
zsmalloc: fix a race with deferred_handles storing
...
|
|
When received corrupted snap trace we don't know what exactly has
happened in MDS side. And we shouldn't continue IOs and metadatas
access to MDS, which may corrupt or get incorrect contents.
This patch will just block all the further IO/MDS requests
immediately and then evict the kclient itself.
The reason why we still need to evict the kclient just after
blocking all the further IOs is that the MDS could revoke the caps
faster.
Link: https://tracker.ceph.com/issues/57686
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
These flags are only used in ceph filesystem in fs/ceph, so just
move it to the place it should be.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
A Sysbot [1] corrupted filesystem exposes two flaws in the handling and
sanity checking of the xattr_ids count in the filesystem. Both of these
flaws cause computation overflow due to incorrect typing.
In the corrupted filesystem the xattr_ids value is 4294967071, which
stored in a signed variable becomes the negative number -225.
Flaw 1 (64-bit systems only):
The signed integer xattr_ids variable causes sign extension.
This causes variable overflow in the SQUASHFS_XATTR_*(A) macros. The
variable is first multiplied by sizeof(struct squashfs_xattr_id) where the
type of the sizeof operator is "unsigned long".
On a 64-bit system this is 64-bits in size, and causes the negative number
to be sign extended and widened to 64-bits and then become unsigned. This
produces the very large number 18446744073709548016 or 2^64 - 3600. This
number when rounded up by SQUASHFS_METADATA_SIZE - 1 (8191 bytes) and
divided by SQUASHFS_METADATA_SIZE overflows and produces a length of 0
(stored in len).
Flaw 2 (32-bit systems only):
On a 32-bit system the integer variable is not widened by the unsigned
long type of the sizeof operator (32-bits), and the signedness of the
variable has no effect due it always being treated as unsigned.
The above corrupted xattr_ids value of 4294967071, when multiplied
overflows and produces the number 4294963696 or 2^32 - 3400. This number
when rounded up by SQUASHFS_METADATA_SIZE - 1 (8191 bytes) and divided by
SQUASHFS_METADATA_SIZE overflows again and produces a length of 0.
The effect of the 0 length computation:
In conjunction with the corrupted xattr_ids field, the filesystem also has
a corrupted xattr_table_start value, where it matches the end of
filesystem value of 850.
This causes the following sanity check code to fail because the
incorrectly computed len of 0 matches the incorrect size of the table
reported by the superblock (0 bytes).
len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids);
/*
* The computed size of the index table (len bytes) should exactly
* match the table start and end points
*/
start = table_start + sizeof(*id_table);
end = msblk->bytes_used;
if (len != (end - start))
return ERR_PTR(-EINVAL);
Changing the xattr_ids variable to be "usigned int" fixes the flaw on a
64-bit system. This relies on the fact the computation is widened by the
unsigned long type of the sizeof operator.
Casting the variable to u64 in the above macro fixes this flaw on a 32-bit
system.
It also means 64-bit systems do not implicitly rely on the type of the
sizeof operator to widen the computation.
[1] https://lore.kernel.org/lkml/000000000000cd44f005f1a0f17f@google.com/
Link: https://lkml.kernel.org/r/20230127061842.10965-1-phillip@squashfs.org.uk
Fixes: 506220d2ba21 ("squashfs: add more sanity checks in xattr id lookup")
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reported-by: <syzbot+082fa4af80a5bb1a9843@syzkaller.appspotmail.com>
Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Fedor Pchelkin <pchelkin@ispras.ru>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "Fixes for hugetlb mapcount at most 1 for shared PMDs".
This issue of mapcount in hugetlb pages referenced by shared PMDs was
discussed in [1]. The following two patches address user visible behavior
caused by this issue.
[1] https://lore.kernel.org/linux-mm/Y9BF+OCdWnCSilEu@monkey/
This patch (of 2):
A hugetlb page will have a mapcount of 1 if mapped by multiple processes
via a shared PMD. This is because only the first process increases the
map count, and subsequent processes just add the shared PMD page to their
page table.
page_mapcount is being used to decide if a hugetlb page is shared or
private in /proc/PID/smaps. Pages referenced via a shared PMD were
incorrectly being counted as private.
To fix, check for a shared PMD if mapcount is 1. If a shared PMD is found
count the hugetlb page as shared. A new helper to check for a shared PMD
is added.
[akpm@linux-foundation.org: simplification, per David]
[akpm@linux-foundation.org: hugetlb.h: include page_ref.h for page_count()]
Link: https://lkml.kernel.org/r/20230126222721.222195-2-mike.kravetz@oracle.com
Fixes: 25ee01a2fca0 ("mm: hugetlb: proc: add hugetlb-related fields to /proc/PID/smaps")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Fix a spello in freevxfs Kconfig.
(reported by codespell)
Link: https://lkml.kernel.org/r/20230124181638.15604-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
While mounting a corrupted filesystem, a signed integer '*xattr_ids' can
become less than zero. This leads to the incorrect computation of 'len'
and 'indexes' values which can cause null-ptr-deref in copy_bio_to_actor()
or out-of-bounds accesses in the next sanity checks inside
squashfs_read_xattr_id_table().
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Link: https://lkml.kernel.org/r/20230117105226.329303-2-pchelkin@ispras.ru
Fixes: 506220d2ba21 ("squashfs: add more sanity checks in xattr id lookup")
Reported-by: <syzbot+082fa4af80a5bb1a9843@syzkaller.appspotmail.com>
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
fscache_create_volume_work() uses wake_up_bit() to wake up the processes
which are waiting for the completion of volume creation. According to
comments in wake_up_bit() and waitqueue_active(), an extra smp_mb() is
needed to guarantee the memory order between FSCACHE_VOLUME_CREATING
flag and waitqueue_active() before invoking wake_up_bit().
Fixing it by using clear_and_wake_up_bit() to add the missing memory
barrier.
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20230113115211.2895845-3-houtao@huaweicloud.com/ # v3
|
|
The freeing of relinquished volume will wake up the pending volume
acquisition by using wake_up_bit(), however it is mismatched with
wait_var_event() used in fscache_wait_on_volume_collision() and it will
never wake up the waiter in the wait-queue because these two functions
operate on different wait-queues.
According to the implementation in fscache_wait_on_volume_collision(),
if the wake-up of pending acquisition is delayed longer than 20 seconds
(e.g., due to the delay of on-demand fd closing), the first
wait_var_event_timeout() will timeout and the following wait_var_event()
will hang forever as shown below:
FS-Cache: Potential volume collision new=00000024 old=00000022
......
INFO: task mount:1148 blocked for more than 122 seconds.
Not tainted 6.1.0-rc6+ #1
task:mount state:D stack:0 pid:1148 ppid:1
Call Trace:
<TASK>
__schedule+0x2f6/0xb80
schedule+0x67/0xe0
fscache_wait_on_volume_collision.cold+0x80/0x82
__fscache_acquire_volume+0x40d/0x4e0
erofs_fscache_register_volume+0x51/0xe0 [erofs]
erofs_fscache_register_fs+0x19c/0x240 [erofs]
erofs_fc_fill_super+0x746/0xaf0 [erofs]
vfs_get_super+0x7d/0x100
get_tree_nodev+0x16/0x20
erofs_fc_get_tree+0x20/0x30 [erofs]
vfs_get_tree+0x24/0xb0
path_mount+0x2fa/0xa90
do_mount+0x7c/0xa0
__x64_sys_mount+0x8b/0xe0
do_syscall_64+0x30/0x60
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Considering that wake_up_bit() is more selective, so fix it by using
wait_on_bit() instead of wait_var_event() to wait for the freeing of
relinquished volume. In addition because waitqueue_active() is used in
wake_up_bit() and clear_bit() doesn't imply any memory barrier, use
clear_and_wake_up_bit() to add the missing memory barrier between
cursor->flags and waitqueue_active().
Fixes: 62ab63352350 ("fscache: Implement volume registration")
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20230113115211.2895845-2-houtao@huaweicloud.com/ # v3
|
|
Pull ksmbd server fixes from Steve French:
"Four smb3 server fixes, all also for stable:
- fix for signing bug
- fix to more strictly check packet length
- add a max connections parm to limit simultaneous connections
- fix error message flood that can occur with newer Samba xattr
format"
* tag '6.2-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: downgrade ndr version error message to debug
ksmbd: limit pdu length size according to connection status
ksmbd: do not sign response to session request for guest login
ksmbd: add max connections parameter
|
|
Pull cifs fix from Steve French:
"Fix for reconnect oops in smbdirect (RDMA), also is marked for stable"
* tag '6.2-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix oops due to uncleared server->smbd_conn in reconnect
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs fixes from Miklos Szeredi:
"Fix two bugs, a recent one introduced in the last cycle, and an older
one from v5.11"
* tag 'ovl-fixes-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: fail on invalid uid/gid mapping at copy up
ovl: fix tmpfile leak
|
|
If st_uid/st_gid doesn't have a mapping in the mounter's user_ns, then
copy-up should fail, just like it would fail if the mounter task was doing
the copy using "cp -a".
There's a corner case where the "cp -a" would succeed but copy up fail: if
there's a mapping of the invalid uid/gid (65534 by default) in the user
namespace. This is because stat(2) will return this value if the mapping
doesn't exist in the current user_ns and "cp -a" will in turn be able to
create a file with this uid/gid.
This behavior would be inconsistent with POSIX ACL's, which return -1 for
invalid uid/gid which result in a failed copy.
For consistency and simplicity fail the copy of the st_uid/st_gid are
invalid.
Fixes: 459c7c565ac3 ("ovl: unprivieged mounts")
Cc: <stable@vger.kernel.org> # v5.11
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Seth Forshee <sforshee@kernel.org>
|
|
Missed an error cleanup.
Reported-by: syzbot+fd749a7ea127a84e0ffd@syzkaller.appspotmail.com
Fixes: 2b1a77461f16 ("ovl: use vfs_tmpfile_open() helper")
Cc: <stable@vger.kernel.org> # v6.1
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
In the rework of raid56 code, there is very limited concurrency in the
endio context.
Most of the work is done inside the sectors arrays, which different bios
will never touch the same sector.
But there is a concurrency here for error_bitmap. Both read and write
endio functions need to touch them, and we can have multiple write bios
touching the same error bitmap if they all hit some errors.
Here we fix the unprotected bitmap operation by going set_bit() in a
loop.
Since we have a very small ceiling of the sectors (at most 16 sectors),
such set_bit() in a loop should be very acceptable.
Fixes: 2942a50dea74 ("btrfs: raid56: introduce btrfs_raid_bio::error_bitmap")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The arg->clone_sources_count is u64 and can trigger a warning when a
huge value is passed from user space and a huge array is allocated.
Limit the allocated memory to 8MiB (can be increased if needed), which
in turn limits the number of clone sources to 8M / sizeof(struct
clone_root) = 8M / 40 = 209715. Real world number of clones is from
tens to hundreds, so this is future proof.
Reported-by: syzbot+4376a9a073770c173269@syzkaller.appspotmail.com
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When user switch samba to ksmbd, The following message flood is coming
when accessing files. Samba seems to changs dos attribute version to v5.
This patch downgrade ndr version error message to debug.
$ dmesg
...
[68971.766914] ksmbd: v5 version is not supported
[68971.779808] ksmbd: v5 version is not supported
[68971.871544] ksmbd: v5 version is not supported
[68971.910135] ksmbd: v5 version is not supported
...
Cc: stable@vger.kernel.org
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Stream protocol length will never be larger than 16KB until session setup.
After session setup, the size of requests will not be larger than
16KB + SMB2 MAX WRITE size. This patch limits these invalidly oversized
requests and closes the connection immediately.
Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers")
Cc: stable@vger.kernel.org
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-18259
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
KMSAN reports uses of uninitialized memory in zlib's longest_match()
called on memory originating from zlib_alloc_workspace().
This issue is known by zlib maintainers and is claimed to be harmless,
but to be on the safe side we'd better initialize the memory.
Link: https://zlib.net/zlib_faq.html#faq36
Reported-by: syzbot+14d9e7602ebdf7ec0a60@syzkaller.appspotmail.com
CC: stable@vger.kernel.org # 5.4+
Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
There was a recent regression in btrfs/177 that started happening with
the size class patches ("btrfs: introduce size class to block group
allocator"). This however isn't a regression introduced by those
patches, but rather the bug was uncovered by a change in behavior in
these patches. The patches triggered more chunk allocations in the
^free-space-tree case, which uncovered a race with device shrink.
The problem is we will set the device total size to the new size, and
use this to find a hole for a device extent. However during shrink we
may have device extents allocated past this range, so we could
potentially find a hole in a range past our new shrink size. We don't
actually limit our found extent to the device size anywhere, we assume
that we will not find a hole past our device size. This isn't true with
shrink as we're relocating block groups and thus creating holes past the
device size.
Fix this by making sure we do not search past the new device size, and
if we wander into any device extents that start after our device size
simply break from the loop and use whatever hole we've already found.
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We take two stripe numbers if vertical errors are found. In case it is
just a pstripe it does not matter but in case of raid 6 it matters as
both stripes need to be fixed.
Fixes: 7a3150723061 ("btrfs: raid56: do data csum verification during RMW cycle")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Tanmay Bhushan <007047221b@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull fuse ACL fix from Christian Brauner:
"The new posix acl API doesn't depend on the xattr handler
infrastructure anymore and instead only relies on the posix acl inode
operations. As a result daemons without FUSE_POSIX_ACL are unable to
use posix acls like they used to.
Fix this by copying what we did for overlayfs during the posix acl api
conversion. Make fuse implement a dedicated ->get_inode_acl() method
as does overlayfs. Fuse can then also uses this to express different
needs for vfs permission checking during lookup and acl based
retrieval via the regular system call path.
This allows fuse to continue to refuse retrieving posix acls for
daemons that don't set FUSE_POSXI_ACL for permission checking while
also allowing a fuse server to retrieve it via the usual system calls"
* tag 'fs.fuse.acl.v6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
fuse: fixes after adapting to new posix acl api
|