We use non-blocking sockets in those tests. Retrying on
EAGAIN is ugly because there is no upper bound on the packet
arrival time, at least in theory. Now that poll() is fixed on
sockmap sockets, we can switch to select()+recv().
Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211008203306.37525-5-xiyou.wangcong@gmail.com
|
|
Yucong noticed we can't poll() sockets in sockmap even
when they are the destination sockets of redirections.
This is because we never poll any psock queues in ->poll(),
except for TCP. With ->sock_is_readable() now we can
overwrite ->sock_is_readable(), invoke and implement it for
both UDP and AF_UNIX sockets.
Reported-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211008203306.37525-4-xiyou.wangcong@gmail.com
|
|
tcp_bpf_sock_is_readable() is pretty much generic,
we can extract it and reuse it for non-TCP sockets.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211008203306.37525-3-xiyou.wangcong@gmail.com
|
|
The proto ops ->stream_memory_read() is currently only used
by TCP to check whether psock queue is empty or not. We need
to rename it before reusing it for non-TCP protocols, and
adjust the existing users accordingly.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211008203306.37525-2-xiyou.wangcong@gmail.com
|
|
With two Msgs, msgA and msgB and a user doing nonblocking sendmsg calls (or
multiple cores) on a single socket 'sk' we could get the following flow.
 msgA, sk                               msgB, sk
 -----------                            ---------------
 tcp_bpf_sendmsg()
 lock(sk)
 psock = sk->psock
                                        tcp_bpf_sendmsg()
                                        lock(sk) ... blocking
 tcp_bpf_send_verdict
 if (psock->eval == NONE)
   psock->eval = sk_psock_msg_verdict
 ..
 < handle SK_REDIRECT case >
   release_sock(sk)                     < lock dropped so grab here >
   ret = tcp_bpf_sendmsg_redir
                                        psock = sk->psock
                                        tcp_bpf_send_verdict
 lock_sock(sk) ... blocking on B
                                        if (psock->eval == NONE) <- boom.
                                         psock->eval will have msgA state
The problem here is we dropped the lock on msgA and grabbed it with msgB.
Now we have old state in psock and importantly psock->eval has not been
cleared. So msgB will run whatever action was done on A and the verdict
program may never see it.
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211012052019.184398-1-liujian56@huawei.com
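The stale-verdict flow above can be modelled in a few lines of
single-threaded C. Everything here is a simplified stand-in: the struct,
the enum and the function names are hypothetical. Clearing the cached
verdict before the lock is dropped is shown only as one possible way to
close the window; the message above does not say how the patch does it.

```c
#include <stdbool.h>

enum verdict_model { VERDICT_NONE, VERDICT_REDIRECT };

/* Hypothetical stand-in for the per-socket state shared by all senders. */
struct psock_model {
	enum verdict_model eval;	/* cached verdict, NONE when unset */
	int prog_runs;			/* times the verdict program ran */
};

/* Stand-in for the verdict program: always asks for a redirect. */
static enum verdict_model run_verdict_prog(struct psock_model *ps)
{
	ps->prog_runs++;
	return VERDICT_REDIRECT;
}

/* Phase 1: runs under the socket lock, up to where the lock is dropped. */
static void send_until_unlock(struct psock_model *ps, bool clear_before_unlock)
{
	if (ps->eval == VERDICT_NONE)
		ps->eval = run_verdict_prog(ps);
	if (clear_before_unlock)
		ps->eval = VERDICT_NONE;	/* no stale state left behind */
}

/* Phase 2: runs after the lock is re-taken once the redirect is done. */
static void send_after_relock(struct psock_model *ps)
{
	ps->eval = VERDICT_NONE;
}

/* Replays the msgA/msgB interleaving and counts verdict program runs. */
static int prog_runs_for_two_msgs(bool clear_before_unlock)
{
	struct psock_model ps = { .eval = VERDICT_NONE, .prog_runs = 0 };

	send_until_unlock(&ps, clear_before_unlock);	/* msgA, lock dropped */
	send_until_unlock(&ps, clear_before_unlock);	/* msgB grabs the lock */
	send_after_relock(&ps);				/* msgA resumes */
	send_after_relock(&ps);				/* msgB resumes */
	return ps.prog_runs;
}
```

In this model prog_runs_for_two_msgs(false) returns 1, meaning msgB
reused msgA's verdict, while prog_runs_for_two_msgs(true) returns 2.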
|
|
Use asm/unwind.h to implement wchan, since we cannot always rely on
STACKTRACE=y.
Fixes: bc9bbb81730e ("x86: Fix get_wchan() to support the ORC unwinder")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20211022152104.137058575@infradead.org
|
|
rpcif_sw_init() can fail, so check its return value and, on error,
exit the rpcif_spi_probe() callback with the error code.
Fixes: eb8d6d464a27 ("spi: add Renesas RPC-IF driver")
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20211025205631.21151-4-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
pm_runtime_disable() cancels all pending power requests, while they
should be completed for the Tegra SPI driver. Otherwise SPI clock won't
be disabled ever again because clk refcount will become unbalanced.
Enforce runtime PM suspension to put device into expected state before
driver is unbound and device's RPM state is reset by driver's core.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20211023225951.14253-2-digetx@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
pm_runtime_disable() cancels all pending power requests, while they
should be completed for the Tegra SPI driver. Otherwise SPI clock won't
be disabled ever again because clk refcount will become unbalanced.
Enforce runtime PM suspension to put device into expected state before
driver is unbound and device's RPM state is reset by driver's core.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20211023225951.14253-1-digetx@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Fix the missing clk_disable_unprepare() before return
from bcm_qspi_probe() in the error handling case.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211018073413.2029081-1-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
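The error-path pattern this fix restores can be sketched in userspace as
follows. The *_stub functions and probe_model() are hypothetical
stand-ins that only track an enable count; they are not the driver's
code or the clk API.

```c
#include <errno.h>

static int clk_enable_count;	/* models the clock's enable refcount */

/* Hypothetical stand-ins for clk_prepare_enable()/clk_disable_unprepare(). */
static int clk_prepare_enable_stub(void)
{
	clk_enable_count++;
	return 0;
}

static void clk_disable_unprepare_stub(void)
{
	clk_enable_count--;
}

/* later_step_ret simulates the result of a probe step after the enable. */
static int probe_model(int later_step_ret)
{
	int ret;

	ret = clk_prepare_enable_stub();
	if (ret)
		return ret;

	ret = later_step_ret;
	if (ret)
		goto err_disable_clk;	/* undo the enable before returning */

	return 0;

err_disable_clk:
	clk_disable_unprepare_stub();
	return ret;
}
```

Every failure after the enable goes through a single label, so the
enable count stays balanced when the probe fails.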
|
|
Fixes screen orientation for GPD Win 3 handheld gaming console.
Signed-off-by: Mario Risoldi <awxkrnl@gmail.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211026112737.9181-1-awxkrnl@gmail.com
|
|
TI's implementation does not service the watchdog even if the kernel
command line parameter omap_wdt.early_enable is set to 1. This patch
fixes the issue.
Signed-off-by: Walter Stoll <walter.stoll@duagon.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/88a8fe5229cd68fa0f1fd22f5d66666c1b7057a0.camel@duagon.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
|
|
sparse reports the following address space warning.
drivers/watchdog/ixp4xx_wdt.c:122:20: sparse:
incorrect type in assignment (different address spaces)
drivers/watchdog/ixp4xx_wdt.c:122:20: sparse:
expected void [noderef] __iomem *base
drivers/watchdog/ixp4xx_wdt.c:122:20: sparse:
got void *platform_data
Add a typecast to solve the problem.
Fixes: 21a0a29d16c6 ("watchdog: ixp4xx: Rewrite driver to use core")
Cc: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210911042925.556889-1-linux@roeck-us.net
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
|
|
The MODULE_DEVICE_TABLE already creates proper alias for platform
driver. Having another MODULE_ALIAS causes the alias to be duplicated.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210917092024.19323-1-krzysztof.kozlowski@canonical.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
|
|
SBSA says of the generic watchdog:
All registers are 32 bits in size and should be accessed using 32-bit
reads and writes. If an access size other than 32 bits is used then
the results are IMPLEMENTATION DEFINED.
and for qemu, the implementation will only allow 32-bit accesses
resulting in a synchronous external abort when configuring the watchdog.
Use lo_hi_* accessors rather than a readq/writeq.
Fixes: abd3ac7902fb ("watchdog: sbsa: Support architecture version 1")
Signed-off-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20210903112101.493552-1-quic_jiles@quicinc.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
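The idea behind the lo_hi_* accessors, a 64-bit access split into two
32-bit accesses with the low word first, can be illustrated with a
userspace model. The model_* names are hypothetical and plain memory
stands in for the device registers; the kernel helpers operate on
__iomem pointers instead.

```c
#include <stdint.h>

/* Model of a register block that only tolerates 32-bit accesses. */
static uint32_t model_read32(const volatile uint32_t *addr)
{
	return *addr;
}

static void model_write32(volatile uint32_t *addr, uint32_t val)
{
	*addr = val;
}

/* Read a 64-bit value as two 32-bit reads, low word first. */
static uint64_t model_lo_hi_read64(const volatile uint32_t *addr)
{
	uint32_t lo = model_read32(addr);
	uint32_t hi = model_read32(addr + 1);

	return ((uint64_t)hi << 32) | lo;
}

/* Write a 64-bit value as two 32-bit writes, low word first. */
static void model_lo_hi_write64(volatile uint32_t *addr, uint64_t val)
{
	model_write32(addr, (uint32_t)val);
	model_write32(addr + 1, (uint32_t)(val >> 32));
}
```

Unlike readq/writeq, the two halves are not accessed atomically, which
is the trade-off accepted for hardware that only allows 32-bit accesses.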
|
|
This reverts commit cb011044e34c ("watchdog: iTCO_wdt: Account for
rebooting on second timeout") and commit aec42642d91f ("watchdog: iTCO_wdt:
Fix detection of SMI-off case") since those patches cause a regression
on certain boards (https://bugzilla.kernel.org/show_bug.cgi?id=213809).
While this revert may result in some boards only resetting after twice
the configured timeout value, that is still better than a watchdog reset
after half the configured value.
Fixes: cb011044e34c ("watchdog: iTCO_wdt: Account for rebooting on second timeout")
Fixes: aec42642d91f ("watchdog: iTCO_wdt: Fix detection of SMI-off case")
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Mantas Mikulėnas <grawity@gmail.com>
Reported-by: Javier S. Pedro <debbugs@javispedro.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20211008003302.1461733-1-linux@roeck-us.net
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
|
|
revert commit 46ae40b94d88 ("net/mlx5: Let user configure io_eq_size param")
revert commit a6cb08daa3b4 ("net/mlx5: Let user configure event_eq_size param")
revert commit 554604061979 ("net/mlx5: Let user configure max_macs param")
The EQE parameters are applicable to more drivers, so they should
be configured via a standard API, probably ethtool. Example of
another driver needing something similar:
https://lore.kernel.org/all/1633454136-14679-3-git-send-email-sbhatta@marvell.com/
The last param for "max_macs" is probably fine but the documentation
is severely lacking. The meaning and implications for changing the
param need to be stated.
Link: https://lore.kernel.org/r/20211026152939.3125950-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 95ea0486b20e ("btrfs: allow read-write for 4K sectorsize on 64K
page size systems") added write support for 4K sectorsize on 64K page
size systems. Fix the now stale comments.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Christoph pointed out that I'm updating bdev->bd_inode for the device
time when we remove block devices from a btrfs file system, however this
isn't actually exposed to anything. The inode we want to update is the
one that's associated with the path to the device, usually on devtmpfs,
so that blkid notices the difference.
We still don't want to do the blkdev_open, so use kern_path() to get the
path to the given device and do the update time on that inode.
Fixes: 8f96a5bfa150 ("btrfs: update the bdev time directly when closing")
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
If you already have an inode and need to update the time on the inode
there is no way to do this properly. Export this helper to allow file
systems to update time on the inode so the appropriate handler is
called, either ->update_time or generic_update_time.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Attempting to defragment a Btrfs file containing a transparent huge page
immediately deadlocks with the following stack trace:
#0 context_switch (kernel/sched/core.c:4940:2)
#1 __schedule (kernel/sched/core.c:6287:8)
#2 schedule (kernel/sched/core.c:6366:3)
#3 io_schedule (kernel/sched/core.c:8389:2)
#4 wait_on_page_bit_common (mm/filemap.c:1356:4)
#5 __lock_page (mm/filemap.c:1648:2)
#6 lock_page (./include/linux/pagemap.h:625:3)
#7 pagecache_get_page (mm/filemap.c:1910:4)
#8 find_or_create_page (./include/linux/pagemap.h:420:9)
#9 defrag_prepare_one_page (fs/btrfs/ioctl.c:1068:9)
#10 defrag_one_range (fs/btrfs/ioctl.c:1326:14)
#11 defrag_one_cluster (fs/btrfs/ioctl.c:1421:9)
#12 btrfs_defrag_file (fs/btrfs/ioctl.c:1523:9)
#13 btrfs_ioctl_defrag (fs/btrfs/ioctl.c:3117:9)
#14 btrfs_ioctl (fs/btrfs/ioctl.c:4872:10)
#15 vfs_ioctl (fs/ioctl.c:51:10)
#16 __do_sys_ioctl (fs/ioctl.c:874:11)
#17 __se_sys_ioctl (fs/ioctl.c:860:1)
#18 __x64_sys_ioctl (fs/ioctl.c:860:1)
#19 do_syscall_x64 (arch/x86/entry/common.c:50:14)
#20 do_syscall_64 (arch/x86/entry/common.c:80:7)
#21 entry_SYSCALL_64+0x7c/0x15b (arch/x86/entry/entry_64.S:113)
A huge page is represented by a compound page, which consists of a
struct page for each PAGE_SIZE page within the huge page. The first
struct page is the "head page", and the remaining are "tail pages".
Defragmentation attempts to lock each page in the range. However,
lock_page() on a tail page actually locks the corresponding head page.
So, if defragmentation tries to lock more than one struct page in a
compound page, it tries to lock the same head page twice and deadlocks
with itself.
Ideally, we should be able to defragment transparent huge pages.
However, THP for filesystems is currently read-only, so a lot of code is
not ready to use huge pages for I/O. For now, let's just return
ETXTBSY.
This can be reproduced with the following on a kernel with
CONFIG_READ_ONLY_THP_FOR_FS=y:
  $ cat create_thp_file.c
  #include <fcntl.h>
  #include <stdbool.h>
  #include <stdio.h>
  #include <stdint.h>
  #include <stdlib.h>
  #include <unistd.h>
  #include <sys/mman.h>

  static const char zeroes[1024 * 1024];
  static const size_t FILE_SIZE = 2 * 1024 * 1024;

  int main(int argc, char **argv)
  {
  	if (argc != 2) {
  		fprintf(stderr, "usage: %s PATH\n", argv[0]);
  		return EXIT_FAILURE;
  	}
  	int fd = creat(argv[1], 0777);
  	if (fd == -1) {
  		perror("creat");
  		return EXIT_FAILURE;
  	}
  	size_t written = 0;
  	while (written < FILE_SIZE) {
  		ssize_t ret = write(fd, zeroes,
  				    sizeof(zeroes) < FILE_SIZE - written ?
  				    sizeof(zeroes) : FILE_SIZE - written);
  		if (ret < 0) {
  			perror("write");
  			return EXIT_FAILURE;
  		}
  		written += ret;
  	}
  	close(fd);
  	fd = open(argv[1], O_RDONLY);
  	if (fd == -1) {
  		perror("open");
  		return EXIT_FAILURE;
  	}

  	/*
  	 * Reserve some address space so that we can align the file mapping to
  	 * the huge page size.
  	 */
  	void *placeholder_map = mmap(NULL, FILE_SIZE * 2, PROT_NONE,
  				     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  	if (placeholder_map == MAP_FAILED) {
  		perror("mmap (placeholder)");
  		return EXIT_FAILURE;
  	}

  	void *aligned_address =
  		(void *)(((uintptr_t)placeholder_map + FILE_SIZE - 1) & ~(FILE_SIZE - 1));

  	void *map = mmap(aligned_address, FILE_SIZE, PROT_READ | PROT_EXEC,
  			 MAP_SHARED | MAP_FIXED, fd, 0);
  	if (map == MAP_FAILED) {
  		perror("mmap");
  		return EXIT_FAILURE;
  	}
  	if (madvise(map, FILE_SIZE, MADV_HUGEPAGE) < 0) {
  		perror("madvise");
  		return EXIT_FAILURE;
  	}

  	char *line = NULL;
  	size_t line_capacity = 0;
  	FILE *smaps_file = fopen("/proc/self/smaps", "r");
  	if (!smaps_file) {
  		perror("fopen");
  		return EXIT_FAILURE;
  	}
  	for (;;) {
  		for (size_t off = 0; off < FILE_SIZE; off += 4096)
  			((volatile char *)map)[off];

  		ssize_t ret;
  		bool this_mapping = false;
  		while ((ret = getline(&line, &line_capacity, smaps_file)) > 0) {
  			unsigned long start, end, huge;
  			if (sscanf(line, "%lx-%lx", &start, &end) == 2) {
  				this_mapping = (start <= (uintptr_t)map &&
  						(uintptr_t)map < end);
  			} else if (this_mapping &&
  				   sscanf(line, "FilePmdMapped: %ld", &huge) == 1 &&
  				   huge > 0) {
  				return EXIT_SUCCESS;
  			}
  		}

  		sleep(6);
  		rewind(smaps_file);
  		fflush(smaps_file);
  	}
  }
  $ ./create_thp_file huge
  $ btrfs fi defrag -czstd ./huge
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Commit 2efc459d06f1 ("sysfs: Add sysfs_emit and sysfs_emit_at to format
sysfs out") merged in 5.10 introduced two new functions sysfs_emit() and
sysfs_emit_at() which are aware of the PAGE_SIZE limit of the output
buffer.
Use the above two new functions instead of scnprintf() and snprintf()
in various sysfs show().
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
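What it means for an emit helper to be aware of the PAGE_SIZE limit can
be illustrated with a userspace model. This is not the kernel
implementation: emit_at_model() and MODEL_PAGE_SIZE are hypothetical
names, and a fixed 4096-byte buffer is assumed.

```c
#include <stdarg.h>
#include <stdio.h>

#define MODEL_PAGE_SIZE 4096	/* assumed size of the output buffer */

/*
 * Format into buf starting at offset 'at', never writing past the end of
 * the page-sized buffer. Returns the number of characters stored, not
 * counting the terminating NUL.
 */
static int emit_at_model(char *buf, int at, const char *fmt, ...)
{
	va_list args;
	int room, len;

	if (at < 0 || at >= MODEL_PAGE_SIZE)
		return 0;		/* offset outside the buffer */

	room = MODEL_PAGE_SIZE - at;
	va_start(args, fmt);
	len = vsnprintf(buf + at, room, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;		/* formatting error */
	if (len >= room)
		len = room - 1;		/* output was truncated to fit */
	return len;
}
```

Callers only pass the buffer and an offset; the bound is enforced inside
the helper rather than recomputed at every call site, which is the
convenience the two sysfs functions offer over scnprintf() and snprintf().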
|
|
It's a common practice to avoid using sizeof(struct btrfs_super_block)
(3531), and to use BTRFS_SUPER_INFO_SIZE (4096) instead.
The problem is that sizeof(struct btrfs_super_block) hasn't matched
BTRFS_SUPER_INFO_SIZE from the very beginning.
Furthermore, for all call sites except selftests, we always allocate
BTRFS_SUPER_INFO_SIZE space for super block, there isn't any real reason
to use the smaller value, and it doesn't really save any space.
So let's get rid of such confusing behavior, and unify those two values.
This modification also adds a new static_assert() to verify the size,
and moves the BTRFS_SUPER_INFO_* macros to the definition of
btrfs_super_block for the static_assert().
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
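The approach of padding the structure to the full size and verifying it
at compile time can be sketched as below. The struct layout is a
hypothetical model built from the sizes quoted in the message (3531 and
4096), not the real btrfs_super_block definition.

```c
#include <stdint.h>

#define MODEL_SUPER_INFO_SIZE 4096

/* Hypothetical model: 3531 bytes of real fields padded to the full size. */
struct super_block_model {
	uint8_t fields[3531];
	uint8_t padding[MODEL_SUPER_INFO_SIZE - 3531];
};

/* Compilation fails if the two sizes ever drift apart again. */
_Static_assert(sizeof(struct super_block_model) == MODEL_SUPER_INFO_SIZE,
	       "super block model must be exactly MODEL_SUPER_INFO_SIZE bytes");
```

With the assertion in place, sizeof() on the structure and the size
macro are interchangeable by construction.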
|
|
Update the comments at btrfs_chunk_alloc() and do_chunk_alloc() that
describe which cases can lead to a failure to allocate metadata and system
space despite having previously reserved space. This adds one more reason
that I previously forgot to mention.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When a task is doing some modification to the chunk btree and it is not in
the context of a chunk allocation or a chunk removal, it can deadlock with
another task that is currently allocating a new data or metadata chunk.
These contexts are the following:
* When relocating a system chunk, when we need to COW the extent buffers
that belong to the chunk btree;
* When adding a new device (ioctl), where we need to add a new device item
to the chunk btree;
* When removing a device (ioctl), where we need to remove a device item
from the chunk btree;
* When resizing a device (ioctl), where we need to update a device item in
the chunk btree and may need to relocate a system chunk that lies beyond
the new device size when shrinking a device.
The problem happens due to a sequence of steps like the following:
1) Task A starts a data or metadata chunk allocation and it locks the
chunk mutex;
2) Task B is relocating a system chunk, and when it needs to COW an extent
buffer of the chunk btree, it has locked both that extent buffer as
well as its parent extent buffer;
3) Since there is not enough available system space, either because none
of the existing system block groups have enough free space or because
the only one with enough free space is in RO mode due to the relocation,
task B triggers a new system chunk allocation. It blocks when trying to
acquire the chunk mutex, currently held by task A;
4) Task A enters btrfs_chunk_alloc_add_chunk_item(), in order to insert
the new chunk item into the chunk btree and update the existing device
items there. But in order to do that, it has to lock the extent buffer
that task B locked at step 2, or its parent extent buffer, but task B
is waiting on the chunk mutex, which is currently locked by task A,
therefore resulting in a deadlock.
One example report when the deadlock happens with system chunk relocation:
INFO: task kworker/u9:5:546 blocked for more than 143 seconds.
Not tainted 5.15.0-rc3+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u9:5 state:D stack:25936 pid: 546 ppid: 2 flags:0x00004000
Workqueue: events_unbound btrfs_async_reclaim_metadata_space
Call Trace:
context_switch kernel/sched/core.c:4940 [inline]
__schedule+0xcd9/0x2530 kernel/sched/core.c:6287
schedule+0xd3/0x270 kernel/sched/core.c:6366
rwsem_down_read_slowpath+0x4ee/0x9d0 kernel/locking/rwsem.c:993
__down_read_common kernel/locking/rwsem.c:1214 [inline]
__down_read kernel/locking/rwsem.c:1223 [inline]
down_read_nested+0xe6/0x440 kernel/locking/rwsem.c:1590
__btrfs_tree_read_lock+0x31/0x350 fs/btrfs/locking.c:47
btrfs_tree_read_lock fs/btrfs/locking.c:54 [inline]
btrfs_read_lock_root_node+0x8a/0x320 fs/btrfs/locking.c:191
btrfs_search_slot_get_root fs/btrfs/ctree.c:1623 [inline]
btrfs_search_slot+0x13b4/0x2140 fs/btrfs/ctree.c:1728
btrfs_update_device+0x11f/0x500 fs/btrfs/volumes.c:2794
btrfs_chunk_alloc_add_chunk_item+0x34d/0xea0 fs/btrfs/volumes.c:5504
do_chunk_alloc fs/btrfs/block-group.c:3408 [inline]
btrfs_chunk_alloc+0x84d/0xf50 fs/btrfs/block-group.c:3653
flush_space+0x54e/0xd80 fs/btrfs/space-info.c:670
btrfs_async_reclaim_metadata_space+0x396/0xa90 fs/btrfs/space-info.c:953
process_one_work+0x9df/0x16d0 kernel/workqueue.c:2297
worker_thread+0x90/0xed0 kernel/workqueue.c:2444
kthread+0x3e5/0x4d0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
INFO: task syz-executor:9107 blocked for more than 143 seconds.
Not tainted 5.15.0-rc3+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor state:D stack:23200 pid: 9107 ppid: 7792 flags:0x00004004
Call Trace:
context_switch kernel/sched/core.c:4940 [inline]
__schedule+0xcd9/0x2530 kernel/sched/core.c:6287
schedule+0xd3/0x270 kernel/sched/core.c:6366
schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:6425
__mutex_lock_common kernel/locking/mutex.c:669 [inline]
__mutex_lock+0xc96/0x1680 kernel/locking/mutex.c:729
btrfs_chunk_alloc+0x31a/0xf50 fs/btrfs/block-group.c:3631
find_free_extent_update_loop fs/btrfs/extent-tree.c:3986 [inline]
find_free_extent+0x25cb/0x3a30 fs/btrfs/extent-tree.c:4335
btrfs_reserve_extent+0x1f1/0x500 fs/btrfs/extent-tree.c:4415
btrfs_alloc_tree_block+0x203/0x1120 fs/btrfs/extent-tree.c:4813
__btrfs_cow_block+0x412/0x1620 fs/btrfs/ctree.c:415
btrfs_cow_block+0x2f6/0x8c0 fs/btrfs/ctree.c:570
btrfs_search_slot+0x1094/0x2140 fs/btrfs/ctree.c:1768
relocate_tree_block fs/btrfs/relocation.c:2694 [inline]
relocate_tree_blocks+0xf73/0x1770 fs/btrfs/relocation.c:2757
relocate_block_group+0x47e/0xc70 fs/btrfs/relocation.c:3673
btrfs_relocate_block_group+0x48a/0xc60 fs/btrfs/relocation.c:4070
btrfs_relocate_chunk+0x96/0x280 fs/btrfs/volumes.c:3181
__btrfs_balance fs/btrfs/volumes.c:3911 [inline]
btrfs_balance+0x1f03/0x3cd0 fs/btrfs/volumes.c:4301
btrfs_ioctl_balance+0x61e/0x800 fs/btrfs/ioctl.c:4137
btrfs_ioctl+0x39ea/0x7b70 fs/btrfs/ioctl.c:4949
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:874 [inline]
__se_sys_ioctl fs/ioctl.c:860 [inline]
__x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
So fix this by making sure that whenever we try to modify the chunk btree
and we are neither in a chunk allocation context nor in a chunk remove
context, we reserve system space before modifying the chunk btree.
Reported-by: Hao Sun <sunhao.th@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CACkBjsax51i4mu6C0C3vJqQN3NR_iVuucoeG3U1HXjrgzn5FFQ@mail.gmail.com/
Fixes: 79bd37120b1495 ("btrfs: rework chunk allocation to avoid exhaustion of the system chunk array")
CC: stable@vger.kernel.org # 5.14+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Currently auto reclaim of unusable zones reclaims the block-groups in
the order they have been added to the reclaim list.
Change this to a greedy algorithm by sorting the list so we have the
block-groups with the least amount of valid bytes reclaimed first.
Note: we can't splice the block groups from reclaim_bgs to let the sort
happen outside of the lock. The block groups can be still in use by
other parts eg. via bg_list and we must hold unused_bgs_lock while
processing them.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ write note and comment why we can't splice the list ]
Signed-off-by: David Sterba <dsterba@suse.com>
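The greedy ordering can be illustrated with a small sort. This sketch
uses qsort() on an array for brevity, whereas the commit sorts a list
under a lock; struct bg_model and its fields are hypothetical stand-ins
for a block group and its valid byte count.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a block group on the reclaim list. */
struct bg_model {
	uint64_t start;	/* identifies the block group */
	uint64_t used;	/* valid bytes that reclaim would have to move */
};

/* Order by ascending 'used' so the cheapest block groups come first. */
static int cmp_bg_used(const void *a, const void *b)
{
	const struct bg_model *x = a;
	const struct bg_model *y = b;

	/* Compare without subtracting, to avoid overflow on 64-bit values. */
	return (x->used > y->used) - (x->used < y->used);
}

static void sort_reclaim_array(struct bg_model *bgs, size_t n)
{
	qsort(bgs, n, sizeof(*bgs), cmp_bg_used);
}
```

Reclaiming in this order frees the most zones for the least copying,
which is the point of the greedy choice.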
|
|
Just use the %pg format specifier in all the debug printks previously
using bdevname(). Note that both bdevname and the %pg specifier never
print a pathname, so the kbasename call wasn't needed to start with.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
[ adjust messages and indentation ]
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
For device removal and replace we call btrfs_find_device_by_devspec,
which if we give it a device path and nothing else will call
btrfs_get_dev_args_from_path, which opens the block device and reads the
super block and then looks up our device based on that.
However at this point we're holding the sb write "lock", so reading the
block device pulls in the dependency of ->open_mutex, which produces the
following lockdep splat
======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc2+ #405 Not tainted
------------------------------------------------------
losetup/11576 is trying to acquire lock:
ffff9bbe8cded938 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x67/0x5e0
but task is already holding lock:
ffff9bbe88e4fc68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (&lo->lo_mutex){+.+.}-{3:3}:
__mutex_lock+0x7d/0x750
lo_open+0x28/0x60 [loop]
blkdev_get_whole+0x25/0xf0
blkdev_get_by_dev.part.0+0x168/0x3c0
blkdev_open+0xd2/0xe0
do_dentry_open+0x161/0x390
path_openat+0x3cc/0xa20
do_filp_open+0x96/0x120
do_sys_openat2+0x7b/0x130
__x64_sys_openat+0x46/0x70
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (&disk->open_mutex){+.+.}-{3:3}:
__mutex_lock+0x7d/0x750
blkdev_get_by_dev.part.0+0x56/0x3c0
blkdev_get_by_path+0x98/0xa0
btrfs_get_bdev_and_sb+0x1b/0xb0
btrfs_find_device_by_devspec+0x12b/0x1c0
btrfs_rm_device+0x127/0x610
btrfs_ioctl+0x2a31/0x2e70
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#12){.+.+}-{0:0}:
lo_write_bvec+0xc2/0x240 [loop]
loop_process_work+0x238/0xd00 [loop]
process_one_work+0x26b/0x560
worker_thread+0x55/0x3c0
kthread+0x140/0x160
ret_from_fork+0x1f/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}:
process_one_work+0x245/0x560
worker_thread+0x55/0x3c0
kthread+0x140/0x160
ret_from_fork+0x1f/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}:
__lock_acquire+0x10ea/0x1d90
lock_acquire+0xb5/0x2b0
flush_workqueue+0x91/0x5e0
drain_workqueue+0xa0/0x110
destroy_workqueue+0x36/0x250
__loop_clr_fd+0x9a/0x660 [loop]
block_ioctl+0x3f/0x50
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of:
(wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&lo->lo_mutex);
                               lock(&disk->open_mutex);
                               lock(&lo->lo_mutex);
  lock((wq_completion)loop0);
*** DEADLOCK ***
1 lock held by losetup/11576:
#0: ffff9bbe88e4fc68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
stack backtrace:
CPU: 0 PID: 11576 Comm: losetup Not tainted 5.14.0-rc2+ #405
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
dump_stack_lvl+0x57/0x72
check_noncircular+0xcf/0xf0
? stack_trace_save+0x3b/0x50
__lock_acquire+0x10ea/0x1d90
lock_acquire+0xb5/0x2b0
? flush_workqueue+0x67/0x5e0
? lockdep_init_map_type+0x47/0x220
flush_workqueue+0x91/0x5e0
? flush_workqueue+0x67/0x5e0
? verify_cpu+0xf0/0x100
drain_workqueue+0xa0/0x110
destroy_workqueue+0x36/0x250
__loop_clr_fd+0x9a/0x660 [loop]
? blkdev_ioctl+0x8d/0x2a0
block_ioctl+0x3f/0x50
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f31b02404cb
Instead what we want to do is populate our device lookup args before we
grab any locks, and then pass these args into btrfs_rm_device(). From
there we can find the device and do the appropriate removal.
Suggested-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We are going to want to populate our device lookup args outside of any
locks and then do the actual device lookup later, so add a helper to do
this work and make btrfs_find_device_by_devspec() use this helper for
now.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We have a lot of device lookup functions that all do something slightly
different. Clean this up by adding a struct to hold the different
lookup criteria, and then pass this around to btrfs_find_device() so it
can do the proper matching based on the lookup criteria.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
There's a subtle case where if we're removing the seed device from a
file system we need to free its private copy of the fs_devices. However
we do not need to call close_fs_devices(), because at this point there
are no devices left to close as we've closed the last one. The only
thing that close_fs_devices() does is decrement ->opened, which should
be 1. We want to avoid calling close_fs_devices() here because it has a
lockdep_assert_held(&uuid_mutex), and we are going to stop holding the
uuid_mutex in this path.
So simply decrement the ->opened counter like we should, and then clean
up like normal. Also add a comment explaining what we're doing here as
I initially removed this code erroneously.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
A bug was checking the wrong device count before we delete the struct
btrfs_fs_devices in btrfs_rm_device(). To avoid future confusion and
for easy reference, add a comment about the various device counts that
we have in the struct btrfs_fs_devices.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
For both sprout and seed fsids:
  btrfs_fs_devices::num_devices provides the device count including
  missing devices
  btrfs_fs_devices::open_devices provides the device count excluding
  missing devices
We create a dummy struct btrfs_device for the missing device, so
num_devices != open_devices when there is a missing device.
In btrfs_rm_device() we wrongly check %cur_devices->open_devices
before freeing the seed fs_devices. Instead we should check
%cur_devices->num_devices.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
At replay_dir_deletes(), if find_dir_range() returns an error we break out
of the main while loop and then assign a value of 0 (success) to the 'ret'
variable, resulting in completely ignoring that an error happened. Fix
that by jumping to the 'out' label when find_dir_range() returns an error
(negative value).
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
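The control-flow problem can be reproduced with a small model.
replay_model() is a hypothetical stand-in, not the btrfs function. It
assumes that a negative value is an error and that a positive value
means there are no more ranges; only the former is stated above.

```c
/*
 * range_rets[i] simulates what the range lookup returns on iteration i.
 * With jump_to_out == 0 the function behaves like the buggy version.
 */
static int replay_model(const int *range_rets, int n, int jump_to_out)
{
	int ret = 0;
	int i;

	for (i = 0; i < n; i++) {
		ret = range_rets[i];
		if (ret < 0 && jump_to_out)
			goto out;	/* fixed: keep the error code */
		if (ret != 0)
			break;		/* buggy version also breaks on errors */
	}
	ret = 0;	/* normal completion; this overwrites an error above */
out:
	return ret;
}
```

With the simulated results { 0, -5 }, the buggy variant returns 0 and
the fixed variant returns -5.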
|
|
The member btrfs_bio::logical is only initialized by two call sites:
- btrfs_repair_one_sector()
  No corresponding site to utilize it.
- btrfs_submit_direct()
  The corresponding site to utilize it is btrfs_check_read_dio_bio().
However for btrfs_check_read_dio_bio(), we can grab the file_offset from
btrfs_dio_private::file_offset directly.
Thus it turns out we don't really need that btrfs_bio::logical member at
all.
For btrfs_bio, the logical bytenr can be fetched from its
bio->bi_iter.bi_sector directly.
So let's just remove the member to save 8 bytes for structure btrfs_bio.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
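Deriving the logical bytenr from bi_sector is a single shift, because
bi_sector counts 512-byte sectors. A minimal sketch follows; the helper
name and MODEL_SECTOR_SHIFT are made up for the example.

```c
#include <stdint.h>

#define MODEL_SECTOR_SHIFT 9	/* bi_sector is in units of 512 bytes */

/* Convert a sector number, as stored in bi_sector, to a byte offset. */
static uint64_t sector_to_bytenr(uint64_t sector)
{
	return sector << MODEL_SECTOR_SHIFT;
}
```

Since the value can always be recomputed this way, storing it in a
separate member is redundant.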
|
|
The naming of "logical_offset" can be confused with logical bytenr of
the dio range.
In fact it's file offset, and the naming "file_offset" is already widely
used in all other sites.
Just do the rename to avoid confusion.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Using local kmaps slightly reduces the chances of stray writes, and
the bvec interface cleans up the code a little bit.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_update_block_group() accounts for the number of bytes allocated or
freed. Argument @alloc specifies whether the call is for alloc or free.
Convert the argument @alloc type from int to bool.
Reviewed-by: Su Yue <l@damenly.su>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Now that real_root is only used in the ref-verify core, gate it behind the
CONFIG_BTRFS_FS_REF_VERIFY ifdef. This shrinks the size of pending
delayed refs by 8 bytes per ref, of which we can have many at any one
time depending on intensity of the workload. Also change the comment
about the member as it no longer deals with qgroups.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Instead of checking whether qgroup processing for a delayed ref has to
happen in the core of delayed ref, simply pull the check at init time of
respective delayed ref structures. This eliminates the final use of
real_root in delayed-ref core paving the way to making this member
optional.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
In order to make 'real_root' used only in ref-verify it's required to
have the necessary context to perform the same checks that this member
is used for. So add 'mod_root' which will contain the root on behalf of
which a delayed ref was created and a 'skip_group' parameter which
will contain callsite-specific override of skip_qgroup.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
CHUNK_ROOT
The real_root field is going to be used only by ref-verify tool so limit
its use outside of it. Blocks belonging to the chunk root will always
have it as an owner so the check is equivalent.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Both data and metadata delayed ref structures have fields named
root/ref_root respectively. Those are somewhat cryptic and don't really
convey the real meaning. In fact those roots are really the original
owners of the respective block (i.e in case of a snapshot a data delayed
ref will contain the original root that owns the given block). Rename
those fields accordingly and adjust comments.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Error injection stressing uncovered a busy loop in our data reclaim
loop. There are two cases here: one where we loop creating block groups
until space_info->full is set, and one in the main loop where we skip
erroring out any tickets if space_info->full == 0. Unfortunately if we aborted
the transaction then we will never allocate chunks or reclaim any space
and thus never get ->full, and you'll see stack traces like this:
watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [kworker/u4:4:139]
CPU: 0 PID: 139 Comm: kworker/u4:4 Tainted: G W 5.13.0-rc1+ #328
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
Workqueue: events_unbound btrfs_async_reclaim_data_space
RIP: 0010:btrfs_join_transaction+0x12/0x20
RSP: 0018:ffffb2b780b77de0 EFLAGS: 00000246
RAX: ffffb2b781863d58 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000801 RSI: ffff987952b57400 RDI: ffff987940aa3000
RBP: ffff987954d55000 R08: 0000000000000001 R09: ffff98795539e8f0
R10: 000000000000000f R11: 000000000000000f R12: ffffffffffffffff
R13: ffff987952b574c8 R14: ffff987952b57400 R15: 0000000000000008
FS: 0000000000000000(0000) GS:ffff9879bbc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0703da4000 CR3: 0000000113398004 CR4: 0000000000370ef0
Call Trace:
flush_space+0x4a8/0x660
btrfs_async_reclaim_data_space+0x55/0x130
process_one_work+0x1e9/0x380
worker_thread+0x53/0x3e0
? process_one_work+0x380/0x380
kthread+0x118/0x140
? __kthread_bind_mask+0x60/0x60
ret_from_fork+0x1f/0x30
Fix this by checking to see if we have a btrfs fs error in either of the
reclaim loops, and if so fail the tickets and bail. In addition to
this, fix maybe_fail_all_tickets() to not try to grant tickets if we've
aborted, simply fail everything.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
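
The idea of the fix can be sketched as a loop that checks for a filesystem error before trying to make progress. This is a simplified user-space model under stated assumptions: space_info_sketch, fail_all_tickets() and reclaim_data_space() are hypothetical names, the fs_error flag stands in for the real fs error check, and max_iters exists only to bound the demonstration.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced model of the reclaim state. */
struct space_info_sketch {
	bool full;		/* set once no more chunks can be allocated */
	bool fs_error;		/* stands in for the btrfs fs error check */
	int pending_tickets;
	int failed_tickets;
};

/* After an abort nothing can be granted, so fail every pending ticket. */
static void fail_all_tickets(struct space_info_sketch *si)
{
	si->failed_tickets += si->pending_tickets;
	si->pending_tickets = 0;
}

/*
 * Returns the number of flush attempts made. Without the fs_error check the
 * loop would spin until 'full' is set, which never happens after an abort.
 */
static int reclaim_data_space(struct space_info_sketch *si, int max_iters)
{
	int iters = 0;

	while (!si->full && iters < max_iters) {
		if (si->fs_error) {
			fail_all_tickets(si);
			break;
		}
		/* A real flush attempt would go here. */
		iters++;
	}
	return iters;
}
```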
|
|
We have a few flags that are inconsistently used to describe the fs in
different states of failure. As of 5963ffcaf383 ("btrfs: always abort
the transaction if we abort a trans handle") we will always set
BTRFS_FS_STATE_ERROR if we abort, so we don't have to check both ABORTED
and ERROR to see if things have gone wrong. Add a helper to check
BTRFS_FS_STATE_ERROR and then convert all checkers of FS_STATE_ERROR to
use the helper.
The TRANS_ABORTED bit check was added in af7227338135 ("Btrfs: clean up
resources during umount after trans is aborted") but is not actually
specific.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
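
A helper of the kind described above could look roughly like the following user-space sketch. The names fs_info_sketch, fs_mark_error() and fs_has_error(), and the bit number, are assumptions for illustration; the kernel uses its own flag definitions and bit operations.

```c
#include <stdbool.h>

/* Hypothetical bit number; the kernel defines its own BTRFS_FS_STATE_* values. */
#define FS_STATE_ERROR_SKETCH 0

struct fs_info_sketch {
	unsigned long fs_state;	/* bitmask of state flags */
};

/* Record that the filesystem has hit an error (for example on abort). */
static void fs_mark_error(struct fs_info_sketch *fs_info)
{
	fs_info->fs_state |= 1UL << FS_STATE_ERROR_SKETCH;
}

/* Single place to ask "has something gone wrong?" instead of open-coding it. */
static bool fs_has_error(const struct fs_info_sketch *fs_info)
{
	return (fs_info->fs_state & (1UL << FS_STATE_ERROR_SKETCH)) != 0;
}
```

Routing every check through one helper means callers no longer need to test several flags to learn whether the fs is in a failed state.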
|
|
Currently we will abort the transaction if we get a random error (like
-EIO) while trying to remove the directory entries from the root log
during rename.
However since these are simply log tree related errors, we can mark the
trans as needing a full commit. Then if the error was truly
catastrophic we'll hit it during the normal commit and abort as
appropriate.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
During inspection of the return path for replay I noticed that we don't
actually abort the transaction if we get a failure during replay. This
isn't a problem necessarily, as we properly return the error and will
fail to mount. However we still leave this dangling transaction that
could conceivably be committed without thinking there was an error.
We were using btrfs_handle_fs_error() here, but that pre-dates the
transaction abort code. Simply replace the btrfs_handle_fs_error()
calls with transaction aborts, so we still know where exactly things
went wrong, and add a few in some other un-handled error cases.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Fix memdup.cocci warning:
fs/btrfs/zoned.c:1198:23-30: WARNING opportunity for kmemdup
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Kai Song <songkai01@inspur.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
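
The warning refers to an allocate-then-copy pair that a single duplicate call can replace. A user-space analogue, with malloc() and memcpy() standing in for the kernel allocator and memdup_sketch() being a hypothetical name, might look like:

```c
#include <stdlib.h>
#include <string.h>

/*
 * User-space analogue of a memdup helper: allocate 'len' bytes and copy
 * 'src' into them. Returns NULL if the allocation fails.
 */
static void *memdup_sketch(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}
```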
|
|
For compressed write, we use a mechanism called async COW, which unlike
regular run_delalloc_cow() or cow_file_range() will also unlock the
first page.
This mechanism allows us to continue handling next ranges, without
waiting for the time consuming compression.
But this has a problem for subpage case, as we could have the following
delalloc range for a page:
0               32K             64K
|       |///////|       |///////|
            \- A            \- B
In the above case, if we pass both ranges to cow_file_range_async(),
both range A and range B will try to unlock the full page [0, 64K).
Whichever one finishes later will then try to do other page operations
like end_page_writeback() on an unlocked page, triggering a
VM layer BUG_ON().
To make subpage compression work at least partially, here we add another
restriction for it, only allow compression if the delalloc range is
fully page aligned.
By that, async extent is always ensured to unlock the first page
exclusively, just like it used to be for regular sectorsize.
In theory, we only need to make sure the delalloc range fully covers its
first page, but the tail page will be locked anyway, blocking later
writeback until the compression finishes.
Thus here we choose to make sure the range is fully page aligned before
doing the compression.
In the future, we could optimize the situation by properly increasing
subpage::writers number for the locked page, but that also means we need
to change how we run delalloc range of page.
(Instead of running each delalloc range we hit, we need to find and lock
all delalloc ranges covering the page, then run each of them).
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
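
The restriction described above amounts to an alignment test on both ends of the delalloc range. A minimal sketch, assuming the range is given as inclusive byte offsets [start, end] and that the page size is a power of two; is_aligned() and range_fully_page_aligned() are hypothetical names:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if x is a multiple of a; a must be a power of two. */
static bool is_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

/*
 * start and end are inclusive byte offsets (an assumption of this sketch),
 * so the range is fully page aligned when both start and end + 1 fall on
 * page boundaries.
 */
static bool range_fully_page_aligned(uint64_t start, uint64_t end,
				     uint64_t page_size)
{
	return is_aligned(start, page_size) && is_aligned(end + 1, page_size);
}
```

With a 64K page, a range covering only the second 32K of the page fails the check, while a range covering whole pages passes.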
|
|
[BUG]
With experimental subpage compression enabled, a simple fsstress can
lead to self deadlock on page 720896:
mkfs.btrfs -f -s 4k $dev > /dev/null
mount $dev -o compress $mnt
$fsstress -p 1 -n 100 -w -d $mnt -v -s 1625511156
[CAUSE]
If we have a file layout that looks like below:
0       32K     64K     96K     128K
|//|            |///////////////|
   4K
Then we run delalloc range for the inode, it will:
- Call find_lock_delalloc_range() with @delalloc_start = 0
Then we got a delalloc range [0, 4K).
This range will be COWed.
- Call find_lock_delalloc_range() again with @delalloc_start = 4K
Since find_lock_delalloc_range() never cares whether the range
is still inside page range [0, 64K), it will return range [64K, 128K).
This range meets the condition for subpage compression, so it will go
through the async COW path.
And async COW path will return @page_started.
But that @page_started is now for range [64K, 128K), not for range
[0, 64K).
- writepage_delalloc() returned 1 for page [0, 64K)
Thus page [0, 64K) will not be unlocked, nor will its page dirty status
be cleared.
Next time when we try to lock page [0, 64K) we will deadlock, as there
is no one to release page [0, 64K).
This problem will never happen for regular page size as one page only
contains one sector. After the first find_lock_delalloc_range() call,
the @delalloc_end will go beyond @page_end no matter if we found a
delalloc range or not.
Thus this bug only happens for subpage, as now we need multiple runs to
exhaust the delalloc range of a page.
[FIX]
Fix the problem by ensuring the delalloc range we ran at least started
inside @locked_page.
So that we will never get incorrect @page_started.
And to prevent such problem from happening again:
- Make find_lock_delalloc_range() return false if the found range is
beyond @end value passed in.
Since @end will be utilized now, add an ASSERT() to ensure we pass
correct @end into find_lock_delalloc_range().
This also means, for selftests we need to populate @end before calling
find_lock_delalloc_range().
- New ASSERT() in find_lock_delalloc_range()
Now we will make sure the @start/@end passed in at least covers part
of the page.
- New ASSERT() in run_delalloc_range()
To make sure the range at least starts inside @locked_page.
- Use @delalloc_start as proper cursor, while @delalloc_end is always
reset to @page_end.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
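
The core condition of the fix, that a delalloc range must start inside the locked page, can be sketched as a simple bounds check. range_starts_in_page() is a hypothetical name, and the page is described by its start offset and size for the purposes of this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if range_start falls inside the page [page_start, page_start +
 * page_size). A range that starts beyond the page must not be run on behalf
 * of that page, otherwise its @page_started result would be attributed to
 * the wrong page.
 */
static bool range_starts_in_page(uint64_t range_start, uint64_t page_start,
				 uint64_t page_size)
{
	return range_start >= page_start &&
	       range_start < page_start + page_size;
}
```

In the layout from the [CAUSE] section, the range [0, 4K) starts inside page [0, 64K), but the range [64K, 128K) does not.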
|