Age | Commit message (Collapse) | Author |
|
The FS_IOC_READ_VERITY_METADATA ioctl will need to return the fs-verity
descriptor (and signature) to userspace.
There are a few ways we could implement this:
- Save a copy of the descriptor (and signature) in the fsverity_info
struct that hangs off of the in-memory inode. However, this would
waste memory since most of the time it wouldn't be needed.
- Regenerate the descriptor from the merkle_tree_params in the
fsverity_info. However, this wouldn't work for the signature, nor for
the salt which the merkle_tree_params only contains indirectly as part
of the 'hashstate'. It would also be error-prone.
- Just get them from the filesystem again. The disadvantage is that in
general we can't trust that they haven't been maliciously changed
since the file has opened. However, the use cases for
FS_IOC_READ_VERITY_METADATA don't require that it verifies the chain
of trust. So this is okay as long as we do some basic validation.
In preparation for implementing the third option, factor out a helper
function fsverity_get_descriptor() which gets the descriptor (and
appended signature) from the filesystem and does some basic validation.
As part of this, start checking the sig_size field for overflow.
Currently fsverity_verify_signature() does this. But the new ioctl will
need this too, so do it earlier.
Link: https://lore.kernel.org/r/20210115181819.34732-2-ebiggers@kernel.org
Reviewed-by: Victor Hsieh <victorhsieh@google.com>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
|
|
Pull cifs fixes from Steve French:
"Three small smb3 fixes for stable"
* tag '5.11-rc6-smb3' of git://git.samba.org/sfrench/cifs-2.6:
cifs: report error instead of invalid when revalidating a dentry fails
smb3: fix crediting for compounding when only one request in flight
smb3: Fix out-of-bounds bug in SMB2_negotiate()
|
|
Pull io_uring fixes from Jens Axboe:
"Two small fixes that should go into 5.11:
- task_work resource drop fix (Pavel)
- identity COW fix (Xiaoguang)"
* tag 'io_uring-5.11-2021-02-05' of git://git.kernel.dk/linux-block:
io_uring: drop mm/files between task_work_submit
io_uring: don't modify identity's files uncess identity is cowed
|
|
Assuming
- //HOST/a is mounted on /mnt
- //HOST/b is mounted on /mnt/b
On a slow connection, running 'df' and killing it while it's
processing /mnt/b can make cifs_get_inode_info() returns -ERESTARTSYS.
This triggers the following chain of events:
=> the dentry revalidation fail
=> dentry is put and released
=> superblock associated with the dentry is put
=> /mnt/b is unmounted
This patch makes cifs_d_revalidate() return the error instead of 0
(invalid) when cifs_revalidate_dentry() fails, except for ENOENT (file
deleted) and ESTALE (file recreated).
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Suggested-by: Shyam Prasad N <nspmangalore@gmail.com>
Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com>
CC: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Patch fb6791d100d1 was designed to allow gfs2 to unmount quicker by
skipping the step where it tells dlm to unlock glocks in EX with lvbs.
This was done because when gfs2 unmounts a file system, it destroys the
dlm lockspace shortly after it destroys the glocks so it doesn't need to
unlock them all: the unlock is implied when the lockspace is destroyed
by dlm.
However, that patch introduced a use-after-free in dlm: as part of its
normal dlm_recoverd process, it can call ls_recovery to recover dead
locks. In so doing, it can call recover_rsbs which calls recover_lvb for
any mastered rsbs. Func recover_lvb runs through the list of lkbs queued
to the given rsb (if the glock is cached but unlocked, it will still be
queued to the lkb, but in NL--Unlocked--mode) and if it has an lvb,
copies it to the rsb, thus trying to preserve the lkb. However, when
gfs2 skips the dlm unlock step, it frees the glock and its lvb, which
means dlm's function recover_lvb references the now freed lvb pointer,
copying the freed lvb memory to the rsb.
This patch changes the check in gdlm_put_lock so that it calls
dlm_unlock for all glocks that contain an lvb pointer.
Fixes: fb6791d100d1 ("GFS2: skip dlm_unlock calls in unmount")
Cc: stable@vger.kernel.org # v3.8+
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
If a new hugetlb page is allocated during fallocate it will not be
marked as active (set_page_huge_active) which will result in a later
isolate_huge_page failure when the page migration code would like to
move that page. Such a failure would be unexpected and wrong.
Only export set_page_huge_active, just leave clear_page_huge_active as
static. Because there are no external users.
Link: https://lkml.kernel.org/r/20210115124942.46403-3-songmuchun@bytedance.com
Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate())
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In gfs2_recover_one, fix a sd_log_flush_lock imbalance when a recovery
pass fails.
Fixes: c9ebc4b73799 ("gfs2: allow journal replay to hold sd_log_flush_lock")
Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Current iov handling with recvmsg/sendmsg may be confusing. First make a
rule for msg->iov: either it points to an allocated iov that have to be
kfree()'d later, or it's NULL and we use fast_iov. That's much better
than current 3-state (also can point to fast_iov). And rename it into
free_iov for uniformity with read/write.
Also, instead of after struct io_async_msghdr copy fixing up of
msg.msg_iter.iov has been happening in io_recvmsg()/io_sendmsg(). Move
it into io_setup_async_msg(), that's the right place.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
[axboe: add comment on NULL check before kfree()]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Don't pretend we don't know that REQ_F_BUFFER_SELECT for recvmsg always
uses fast_iov -- clean up confusing intermixing kmsg->iov and
kmsg->fast_iov for buffer select.
Also don't init iter with garbage in __io_recvmsg_copy_hdr() only for it
to be set shortly after in io_recvmsg().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_setup_async_msg() should fully prepare io_async_msghdr, let it also
handle assigning msg_name and don't hand code it in [send,recv]msg().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently we try to guess if a compound request is going to
succeed waiting for credits or not based on the number of
requests in flight. This approach doesn't work correctly
all the time because there may be only one request in
flight which is going to bring multiple credits satisfying
the compound request.
Change the behavior to fail a request only if there are no requests
in flight at all and proceed waiting for credits otherwise.
Cc: <stable@vger.kernel.org> # 5.1+
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Since SQPOLL task can be shared and so task_work entries can be a mix of
them, we need to drop mm and files before trying to issue next request.
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs fixes from Miklos Szeredi:
- Fix capability conversion and minor overlayfs bugs that are related
to the unprivileged overlay mounts introduced in this cycle.
- Fix two recent (v5.10) and one old (v4.10) bug.
- Clean up security xattr copy-up (related to a SELinux regression).
* tag 'ovl-fixes-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: implement volatile-specific fsync error behaviour
ovl: skip getxattr of security labels
ovl: fix dentry leak in ovl_get_redirect
ovl: avoid deadlock on directory ioctl
cap: fix conversions on getxattr
ovl: perform vfs_getxattr() with mounter creds
ovl: add warning on user_ns mismatch
|
|
quota types
While writing up a regression test for broken behavior when a chprojid
request fails, I noticed that we were logging corruption notices about
the root dquot of the group/project quota file at mount time when
testing V4 filesystems.
In commit afeda6000b0c, I was trying to improve ondisk dquot validation
by making sure that when we load an ondisk dquot into memory on behalf
of an incore dquot, the dquot id and type matches. Unfortunately, I
forgot that V4 filesystems only have two quota files, and can switch
that file between group and project quota types at mount time. When we
perform that switch, we'll try to load the default quota limits from the
root dquot prior to running quotacheck and log a corruption error when
the types don't match.
This is inconsequential because quotacheck will reset the second quota
file as part of doing the switch, but we shouldn't leave scary messages
in the kernel log.
Fixes: afeda6000b0c ("xfs: validate ondisk/incore dquot flags")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
|
|
Saving one lock/unlock for io-wq is not super important, but adds some
ugliness in the code. More important, atomic decs not turning it to zero
for some archs won't give the right ordering/barriers so the
io_steal_work() may pretty easily get subtly and completely broken.
Return back 2-step io-wq work exchange and clean it up.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Extract a helper io_fixed_file_slot() returning a place in our fixed
files table, so we don't hand-code it three times in the code.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_import_iovec() doesn't return IO size anymore, only error code. Make
it more apparent by returning int instead of ssize and clean up
leftovers.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Make decision making of whether we need to retry read/write similar for
O_NONBLOCK and RWF_NOWAIT. Set REQ_F_NOWAIT when either is specified and
use it for all relevant checks. Also fix resubmitting NOWAIT requests
via io_rw_reissue().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We already have implicit do-while for read-retries but with goto in the
end. Convert it to an actual do-while, it highlights it so making a
bit more understandable and is cleaner in general.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_read() has not the simpliest control flow with a lot of jumps and
it's hard to read. One of those is a out_free: label, which frees iovec.
However, from the middle of io_read() iovec is NULL'ed and so
kfree(iovec) is no-op, it leaves us with two place where we can inline
it and further clean up the code.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We have invariant in io_read() of how much we're trying to read spilled
into an iter and io_size variable. The last one controls decision making
about whether to do read-retries. However, io_size is modified only
after the first read attempt, so if we happen to go for a third retry in
a single call to io_read(), we will get io_size greater than in the
iterator, so may lead to various side effects up to live-locking.
Modify io_size each time.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Now we give out ownership of iovec into io_setup_async_rw(), so it
either sets request's context right or frees the iovec on error itself.
Makes our life a bit easier at call sites.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
First, instead of checking iov_iter_count(iter) for 0 to find out that
all needed bytes were read, just compare returned code against io_size.
It's more reliable and arguably cleaner.
Also, place the half-read case into an else branch and delete an extra
label.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
!io_file_supports_async() case of io_read() is hard to read, it jumps
somewhere in the middle of the function just to do async setup and fail
on a similar check. Call io_setup_async_rw() directly for this case,
it's much easier to follow.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
It's easy to make a mistake in io_cqring_wait() because for all
break/continue clauses we need to watch for prepare/finish_wait to be
used correctly. Extract all those into a new helper
io_cqring_wait_schedule(), and transforming the loop into simple series
of func calls: prepare(); check_and_schedule(); finish();
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
schedule_timeout() with timeout=MAX_SCHEDULE_TIMEOUT is guaranteed to
work just as schedule(), so instead of hand-coding it based on arguments
always use the timeout version and simplify code.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Files and task cancellations go over same steps trying to cancel
requests in io-wq, poll, etc. Deduplicate it with a helper.
note: new io_uring_try_cancel_requests() is former
__io_uring_cancel_task_requests() with files passed as an agrument and
flushing overflowed requests.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Abaci Robot reported following panic:
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 800000010ef3f067 P4D 800000010ef3f067 PUD 10d9df067 PMD 0
Oops: 0002 [#1] SMP PTI
CPU: 0 PID: 1869 Comm: io_wqe_worker-0 Not tainted 5.11.0-rc3+ #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:put_files_struct+0x1b/0x120
Code: 24 18 c7 00 f4 ff ff ff e9 4d fd ff ff 66 90 0f 1f 44 00 00 41 57 41 56 49 89 fe 41 55 41 54 55 53 48 83 ec 08 e8 b5 6b db ff 41 ff 0e 74 13 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f e9 9c
RSP: 0000:ffffc90002147d48 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88810d9a5300 RCX: 0000000000000000
RDX: ffff88810d87c280 RSI: ffffffff8144ba6b RDI: 0000000000000000
RBP: 0000000000000080 R08: 0000000000000001 R09: ffffffff81431500
R10: ffff8881001be000 R11: 0000000000000000 R12: ffff88810ac2f800
R13: ffff88810af38a00 R14: 0000000000000000 R15: ffff8881057130c0
FS: 0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000010dbaa002 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__io_clean_op+0x10c/0x2a0
io_dismantle_req+0x3c7/0x600
__io_free_req+0x34/0x280
io_put_req+0x63/0xb0
io_worker_handle_work+0x60e/0x830
? io_wqe_worker+0x135/0x520
io_wqe_worker+0x158/0x520
? __kthread_parkme+0x96/0xc0
? io_worker_handle_work+0x830/0x830
kthread+0x134/0x180
? kthread_create_worker_on_cpu+0x90/0x90
ret_from_fork+0x1f/0x30
Modules linked in:
CR2: 0000000000000000
---[ end trace c358ca86af95b1e7 ]---
I guess case below can trigger above panic: there're two threads which
operates different io_uring ctxs and share same sqthread identity, and
later one thread exits, io_uring_cancel_task_requests() will clear
task->io_uring->identity->files to be NULL in sqpoll mode, then another
ctx that uses same identity will panic.
Indeed we don't need to clear task->io_uring->identity->files here,
io_grab_identity() should handle identity->files changes well, if
task->io_uring->identity->files is not equal to current->files,
io_cow_identity() should handle this changes well.
Cc: stable@vger.kernel.org # 5.5+
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Added "ckpt_thread_ioprio" sysfs node to give a way to change checkpoint
merge daemon's io priority. Its default value is "be,3", which means
"BE" I/O class and I/O priority "3". We can select the class between "rt"
and "be", and set the I/O priority within valid range of it.
"," delimiter is necessary in between I/O class and priority number.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
We've added a new mount options, "checkpoint_merge" and "nocheckpoint_merge",
which creates a kernel daemon and makes it to merge concurrent checkpoint
requests as much as possible to eliminate redundant checkpoint issues. Plus,
we can eliminate the sluggish issue caused by slow checkpoint operation
when the checkpoint is done in a process context in a cgroup having
low i/o budget and cpu shares. To make this do better, we set the
default i/o priority of the kernel daemon to "3", to give one higher
priority than other kernel threads. The below verification result
explains this.
The basic idea has come from https://opensource.samsung.com.
[Verification]
Android Pixel Device(ARM64, 7GB RAM, 256GB UFS)
Create two I/O cgroups (fg w/ weight 100, bg w/ wight 20)
Set "strict_guarantees" to "1" in BFQ tunables
In "fg" cgroup,
- thread A => trigger 1000 checkpoint operations
"for i in `seq 1 1000`; do touch test_dir1/file; fsync test_dir1;
done"
- thread B => gererating async. I/O
"fio --rw=write --numjobs=1 --bs=128k --runtime=3600 --time_based=1
--filename=test_img --name=test"
In "bg" cgroup,
- thread C => trigger repeated checkpoint operations
"echo $$ > /dev/blkio/bg/tasks; while true; do touch test_dir2/file;
fsync test_dir2; done"
We've measured thread A's execution time.
[ w/o patch ]
Elapsed Time: Avg. 68 seconds
[ w/ patch ]
Elapsed Time: Avg. 48 seconds
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
[Jaegeuk Kim: fix the return value in f2fs_start_ckpt_thread, reported by Dan]
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
The mp variable in xfs_file_compat_ioctl is only used when
BROKEN_X86_ALIGNMENT is define. Remove it and just open code the
dereference in a few places.
Link: https://lore.kernel.org/r/20210203173009.462205-1-christian.brauner@ubuntu.com
Fixes: f736d93d76d3 ("xfs: support idmapped mounts")
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
|
If uid or gid of mount options is larger than INT_MAX, udf_fill_super will
return -EINVAL.
The problem can be encountered by a domain user or reproduced via:
mount -o loop,uid=2147483648 something-in-udf-format.iso /mnt
This can be fixed as commit 233a01fa9c4c ("fuse: handle large user and
group ID").
Link: https://lore.kernel.org/r/20210129045502.10546-1-bingjingc@synology.com
Reviewed-by: Robbie Ko <robbieko@synology.com>
Reviewed-by: Chung-Chiang Cheng <cccheng@synology.com>
Signed-off-by: BingJing Chang <bingjingc@synology.com>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
If uid or gid of mount options is larger than INT_MAX, isofs_fill_super
will return -EINVAL.
The problem can be encountered by a domain user or reproduced via:
mount -o loop,uid=2147483648 ubuntu-16.04.6-server-amd64.iso /mnt
This can be fixed as commit 233a01fa9c4c ("fuse: handle large user and
group ID").
Link: https://lore.kernel.org/r/20210129045315.10375-1-bingjingc@synology.com
Reviewed-by: Robbie Ko <robbieko@synology.com>
Reviewed-by: Chung-Chiang Cheng <cccheng@synology.com>
Signed-off-by: BingJing Chang <bingjingc@synology.com>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Move this function further up in log.c so that we can use it in the next patch.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Keep the current value of the updated log tail in the super block as
sb_log_flush_tail instead of computing it on the fly. This avoids
unnecessary sd_ail_lock taking and cleans up the code.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Use a tighter bound for the number of blocks required by transactions in
gfs2_trans_begin: in the worst case, we'll have mixed data and metadata,
so we'll need a log desciptor for each type.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Wake up log waiters in gfs2_log_release when log space has actually become
available. This is a much better place for the wakeup than gfs2_logd.
Check if enough log space is immeditely available before anything else. If
there isn't, use io_wait_event to wait instead of open-coding it.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Commit 588bff95c94e added gfs2_write_log_header() and started using it in
clean_journal(), with an additional call to log_flush_wait() at the end of
gfs2_write_log_header() which is unnecessary for clean_journal(). Move
that call out of gfs2_write_log_header() to restore the previous behavior.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Move the read locking of sd_log_flush_lock from gfs2_log_reserve to
gfs2_trans_begin, and its unlocking from gfs2_log_release to
gfs2_trans_end. Use gfs2_log_release in two places in which it was open
coded before.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
This counter and the associated wait queue are only used so that
gfs2_make_fs_ro can efficiently wait for all pending log space
allocations to fail after setting the filesystem to read-only. This
comes at the cost of waking up that wait queue very frequently.
Instead, when gfs2_log_reserve fails because the filesystem has become
read-only, Wake up sd_log_waitq. In gfs2_make_fs_ro, set the file
system read-only and then wait until all the log space has been
released. Give up and report the problem after a while. With that,
sd_reserving_log and sd_reserving_log_wait can be removed.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Replace the TR_ALLOCED flag by its inverse, TR_ONSTACK: that way, the flag only
needs to be set in the exceptional case of on-stack transactions. Split off
__gfs2_trans_begin from gfs2_trans_begin and use it to replace the open-coded
version in gfs2_ail_empty_gl.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Such usage isn't encouraged by the kernel coding style. Leave the
definitions alone in case of userspace users.
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
It actually means the delta block count of growfs. Rename it in order
to make it clear. Also introduce nb_div to avoid reusing `delta`.
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
As xfs supports the feature of inode btree block counters now, expose
this feature flag in xfs geometry, for userspace can check if the
inobtcnt is enabled or not.
Signed-off-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Since xfs_inode_free_eofblocks and xfs_inode_free_cowblocks are now
internal static functions, we can save ourselves a cycling of the iolock
by passing the lock state out to xfs_blockgc_scan_inode and letting it
do all the unlocking.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Expose the workqueue sysfs knobs for the speculative preallocation gc
workers on all kernels, and update the sysadmin information.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Split the block preallocation garbage collection work into per-AG work
items so that we can take advantage of parallelization.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Shorten the names of the two functions that start and stop block
preallocation garbage collection and move them up to the other blockgc
functions.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Perform background block preallocation gc scans more efficiently by
walking the incore inode tree once.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Remove the separate cowblocks work items and knob so that we can control
and run everything from a single blockgc work queue. Note that the
speculative_prealloc_lifetime sysfs knob retains its historical name
even though the functions move to prefix xfs_blockgc_*.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|