|
request_queue param is no longer used by blk_rq_map_sg and
__blk_rq_map_sg. Remove it.
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250313035322.243239-1-anuj20.g@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs
Pull UBI and UBIFS updates from Richard Weinberger:
"UBI:
- New interface to dump detailed erase counters
- Fixes around wear-leveling
UBIFS:
- Minor cleanups
- Fix for TNC dumping code"
* tag 'ubifs-for-linus-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
ubi: ubi_get_ec_info: Fix compiling error 'cast specifies array type'
ubi: Implement ioctl for detailed erase counters
ubi: Expose interface for detailed erase counters
ubifs: skip dumping tnc tree when zroot is null
ubi: Revert "ubi: wl: Close down wear-leveling before nand is suspended"
ubifs: ubifs_dump_leb: remove return from end of void function
ubifs: dump_lpt_leb: remove return at end of void function
ubi: Add a check for ubi_num
|
|
On the RISC-V platform, untagged_addr() casts the return value of
__untagged_addr_remote() (of type unsigned long) back to the type of its
argument. The compiler complains when the parameter 'addr' is an array
type:
arch/riscv/include/asm/uaccess.h:33:9: error: cast specifies array type
(__force __typeof__(addr))__untagged_addr_remote(current->mm, __addr)
Fix it by passing the input parameter as a pointer.
Fixes: 01099f635a4c ("ubi: Implement ioctl for detailed erase counters")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501191405.WYnmdL0U-lkp@intel.com/
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
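
The compiler error can be reproduced outside the kernel with a small sketch. The macro retype_addr() below is a hypothetical stand-in for the cast in untagged_addr(), not the kernel's definition, and struct ec_info is an invented example type. The sketch relies on the GCC/Clang __typeof__ extension.

```c
#include <stdint.h>

/* Hypothetical stand-in for the cast in untagged_addr(): convert the
 * address to an integer and cast it back to the argument's own type. */
#define retype_addr(addr) ((__typeof__(addr))(uintptr_t)(addr))

/* Invented example type with an array member. */
struct ec_info {
	int counters[4];
};

/* Passing info->counters directly would make __typeof__ yield int[4],
 * and a cast to an array type is rejected by the compiler.  Taking the
 * address of the first element yields a plain int * instead. */
static int *first_counter(struct ec_info *info)
{
	return retype_addr(&info->counters[0]);
}
```

Replacing the argument with info->counters should bring back an error of the "cast specifies array type" kind, since the macro never gives the array a chance to decay into a pointer.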
|
|
Currently, "max_ec" can be read from sysfs, which provides a limited
view of the flash device’s wear. In certain cases, such as bugs in
the wear-leveling algorithm, specific blocks can be worn down more
than others, resulting in uneven wear distribution. Also some use cases
can wear the erase blocks of the fastmap area more heavily than other
parts of flash.
Providing detailed erase counter values gives a better understanding of
the overall flash wear and is needed to calculate, for example, the
expected lifetime.
There exists more detailed info in debugfs, but this information is
only available for debug builds.
Signed-off-by: Rickard Andersson <rickard.andersson@axis.com>
Tested-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
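
As an illustration of why per-block counters are more useful than a single maximum, the sketch below summarizes an array of erase counters. The struct, the function name and the choice of statistics are assumptions made for this example; they are not part of the UBI interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of per-block erase counters. */
struct wear_summary {
	uint32_t min_ec;
	uint32_t max_ec;
	uint64_t mean_ec;
};

/* Fill *out from n erase counters; returns -1 for an empty array. */
static int summarize_wear(const uint32_t *ec, size_t n,
			  struct wear_summary *out)
{
	uint64_t sum = 0;
	size_t i;

	if (n == 0)
		return -1;

	out->min_ec = ec[0];
	out->max_ec = ec[0];
	for (i = 0; i < n; i++) {
		if (ec[i] < out->min_ec)
			out->min_ec = ec[i];
		if (ec[i] > out->max_ec)
			out->max_ec = ec[i];
		sum += ec[i];
	}
	out->mean_ec = sum / n;	/* integer mean, rounded down */
	return 0;
}
```

A maximum far above the mean hints at uneven wear, which a single "max_ec" value cannot reveal on its own.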
|
|
Commit 5580cdae05ae ("ubi: wl: Close down wear-leveling before nand is
suspended") added a reboot notification in the UBI layer to shut down the
wear-leveling subsystem, which introduced a UAF problem[1]. Besides that,
the method also brings other potential UAF problems, for example:
     reboot                         kworker
 ubi_wl_reboot_notifier
  ubi_wl_close
   ubi_fastmap_close
    kfree(ubi->fm)
                            update_fastmap_work_fn
                             ubi_update_fastmap
                              old_fm = ubi->fm
                              if (old_fm && old_fm->e[i]) // UAF!
Actually, the problem fixed by commit 5580cdae05ae ("ubi: wl: Close down
wear-leveling before nand is suspended") has been solved by commit
8cba323437a4 ("mtd: rawnand: protect access to rawnand devices while in
suspend"), which was discussed in [2]. So we can revert the commit
5580cdae05ae ("ubi: wl: Close down wear-leveling before nand is
suspended") directly.
[1] https://lore.kernel.org/linux-mtd/20241208175211.9406-2-dennis.lamerice@gmail.com/
[2] https://lore.kernel.org/all/9bf76f5d-12a4-46ff-90d4-4a7f0f47c381@axis.com/
Fixes: 5580cdae05ae ("ubi: wl: Close down wear-leveling before nand is suspended")
Reported-by: Dennis Lam <dennis.lamerice@gmail.com>
Closes: https://lore.kernel.org/linux-mtd/20241208175211.9406-2-dennis.lamerice@gmail.com/
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Acked-by: Mårten Lindahl <marten.lindahl@axis.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Add a check that rejects negative values of ubi_num.
If the variable ubi_num takes a negative value, we get:
qemu-system-arm ... -append "ubi.mtd=0,0,0,-22222345" ...
[ 0.745065] ubi_attach_mtd_dev from ubi_init+0x178/0x218
[ 0.745230] ubi_init from do_one_initcall+0x70/0x1ac
[ 0.745344] do_one_initcall from kernel_init_freeable+0x198/0x224
[ 0.745474] kernel_init_freeable from kernel_init+0x18/0x134
[ 0.745600] kernel_init from ret_from_fork+0x14/0x28
[ 0.745727] Exception stack(0x90015fb0 to 0x90015ff8)
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 83ff59a06663 ("UBI: support ubi_num on mtd.ubi command line")
Cc: stable@vger.kernel.org
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
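
A user-space sketch of this kind of validation is shown below. The function name and the DEV_NUM_AUTO constant are hypothetical and only model the idea of allowing one special "auto" value while rejecting every other negative number; the kernel uses its own parsing helpers.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical "let the driver pick a number" value. */
#define DEV_NUM_AUTO (-1)

/* Parse a device number token.  Accept DEV_NUM_AUTO or any non-negative
 * int; reject everything else with -EINVAL. */
static int parse_dev_num(const char *token, int *dev_num)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(token, &end, 10);
	if (errno || end == token || *end != '\0')
		return -EINVAL;	/* overflow, empty input or trailing junk */
	if (val < DEV_NUM_AUTO || val > INT_MAX)
		return -EINVAL;	/* e.g. "-22222345" ends up here */

	*dev_num = (int)val;
	return 0;
}
```

With such a check in place, a value like the -22222345 from the command line above is refused before it can be used as a device number.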
|
|
BLK_MQ_F_SHOULD_MERGE is set for all tag_sets except those that purely
process passthrough commands (bsg-lib, ufs tmf, various nvme admin
queues) and thus don't even check the flag. Remove it to simplify the
driver interface.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20241219060214.1928848-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In function ubi_nvmem_reg_read() the while loop can only exit if
bytes_left is zero or an error has occurred. There is an early
return path if an error occurs, so bytes_left can only be
zero after that point. Hence the check for a non-zero bytes_left
at the end of the function is redundant. Remove the check and
just return 0.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
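
The loop shape described above can be sketched in user space as follows. The function chunked_read() and its fail_at parameter are invented for this illustration and only mimic the control flow; they are not the driver code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical chunked read: copies len bytes from src to dst in pieces
 * of at most chunk bytes (chunk must be non-zero).  A non-negative
 * fail_at simulates an I/O error on that chunk index. */
static int chunked_read(char *dst, const char *src, size_t len,
			size_t chunk, int fail_at)
{
	size_t bytes_left = len;
	int i = 0;

	while (bytes_left) {
		size_t to_read = bytes_left < chunk ? bytes_left : chunk;

		if (i++ == fail_at)
			return -1;	/* the only early exit from the loop */

		memcpy(dst, src, to_read);
		dst += to_read;
		src += to_read;
		bytes_left -= to_read;
	}

	/* The loop condition guarantees bytes_left == 0 here, so testing
	 * it again would be dead code. */
	return 0;
}
```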
|
|
The 'fw_vols' fwnode_handle initialized via
device_get_named_child_node() requires explicit calls to
fwnode_handle_put() when the variable is no longer required.
Add the missing calls to fwnode_handle_put() before the function
returns.
Cc: stable@vger.kernel.org
Fixes: 51932f9fc487 ("mtd: ubi: populate ubi volume fwnode")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Since commit 4c39529663b9 ("slab: Warn on duplicate cache names when
DEBUG_VM=y"), the duplicate slab cache names can be detected and a
kernel WARNING is thrown out.
In the UBI fast attaching process, alloc_ai() could be invoked twice
with the same slab cache name 'ubi_aeb_slab_cache', which will trigger
the following warning messages:
kmem_cache of name 'ubi_aeb_slab_cache' already exists
WARNING: CPU: 0 PID: 7519 at mm/slab_common.c:107
__kmem_cache_create_args+0x100/0x5f0
Modules linked in: ubi(+) nandsim [last unloaded: nandsim]
CPU: 0 UID: 0 PID: 7519 Comm: modprobe Tainted: G 6.12.0-rc2
RIP: 0010:__kmem_cache_create_args+0x100/0x5f0
Call Trace:
__kmem_cache_create_args+0x100/0x5f0
alloc_ai+0x295/0x3f0 [ubi]
ubi_attach+0x3c3/0xcc0 [ubi]
ubi_attach_mtd_dev+0x17cf/0x3fa0 [ubi]
ubi_init+0x3fb/0x800 [ubi]
do_init_module+0x265/0x7d0
__x64_sys_finit_module+0x7a/0xc0
The problem can be easily reproduced by attaching a UBI device via fastmap
with CONFIG_DEBUG_VM=y.
Fix it by using different slab names for alloc_ai() callers.
Fixes: d2158f69a7d4 ("UBI: Remove alloc_ai() slab name from parameter list")
Fixes: fdf10ed710c0 ("ubi: Rework Fastmap attach base code")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
If a reboot/shutdown signal with double force (-ff) is triggered when
the erase worker or wear-leveling worker function runs we may end up in
a race condition since the MTD device gets a reboot notification and
suspends the nand flash before the erase or wear-leveling is done. This
will reject all accesses to the flash with -EBUSY.
Sequence for the erase worker function:
   systemctl reboot -ff           ubi_thread

                                  do_work
 __do_sys_reboot
   blocking_notifier_call_chain
     mtd_reboot_notifier
       nand_shutdown
         nand_suspend
                                  __erase_worker
                                    ubi_sync_erase
                                      mtd_erase
                                        nand_erase_nand

                                          # Blocked by suspended chip
                                          nand_get_device
                                            => EBUSY
Similar sequence for the wear-leveling function:
   systemctl reboot -ff           ubi_thread

                                  do_work
 __do_sys_reboot
   blocking_notifier_call_chain
     mtd_reboot_notifier
       nand_shutdown
         nand_suspend
                                  wear_leveling_worker
                                    ubi_eba_copy_leb
                                      ubi_io_write
                                        mtd_write
                                          nand_write_oob

                                            # Blocked by suspended chip
                                            nand_get_device
                                              => EBUSY
systemd-shutdown[1]: Rebooting.
ubi0 error: ubi_io_write: error -16 while writing 2048 bytes to PEB
CPU: 1 PID: 82 Comm: ubi_bgt0d Kdump: loaded Tainted: G O
(unwind_backtrace) from [<80107b9f>] (show_stack+0xb/0xc)
(show_stack) from [<8033641f>] (dump_stack_lvl+0x2b/0x34)
(dump_stack_lvl) from [<803b7f3f>] (ubi_io_write+0x3ab/0x4a8)
(ubi_io_write) from [<803b817d>] (ubi_io_write_vid_hdr+0x71/0xb4)
(ubi_io_write_vid_hdr) from [<803b6971>] (ubi_eba_copy_leb+0x195/0x2f0)
(ubi_eba_copy_leb) from [<803b939b>] (wear_leveling_worker+0x2ff/0x738)
(wear_leveling_worker) from [<803b86ef>] (do_work+0x5b/0xb0)
(do_work) from [<803b9ee1>] (ubi_thread+0xb1/0x11c)
(ubi_thread) from [<8012c113>] (kthread+0x11b/0x134)
(kthread) from [<80100139>] (ret_from_fork+0x11/0x38)
Exception stack(0x80c43fb0 to 0x80c43ff8)
...
ubi0 error: ubi_dump_flash: err -16 while reading 2048 bytes from PEB
ubi0 error: wear_leveling_worker: error -16 while moving PEB 246 to PEB
ubi0 warning: ubi_ro_mode.part.0: switch to read-only mode
...
ubi0 error: do_work: work failed with error code -16
ubi0 error: ubi_thread: ubi_bgt0d: work failed with error code -16
...
Kernel panic - not syncing: Software Watchdog Timer expired
Add a reboot notification for ubi/wear-leveling to shut down any
potential flash work actions before the nand is suspended.
Signed-off-by: Mårten Lindahl <marten.lindahl@axis.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
The definition of ubi_destroy_ai() has been removed since
commit dac6e2087a41 ("UBI: Add fastmap stuff to attach.c").
Remove the dangling declaration from the header file.
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Since commit 14072ee33d5a ("ubi: fastmap: Check wl_pool for free peb
before wear leveling"), wear_leveling_worker() won't schedule fm_work
if the wear-leveling pool is empty, which could temporarily disable
wear-leveling until the fastmap is updated (e.g. the pool becomes empty).
Fix it by scheduling fm_work if wl_pool is empty during wear-leveling.
Fixes: 14072ee33d5a ("ubi: fastmap: Check wl_pool for free peb before wear leveling")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
During wear-leveling work, the source PEB will be moved into the scrub
list when the source LEB cannot be locked in ubi_eba_copy_leb(), which is
wrong for a non-scrub type source PEB. The problem can cause extra and
ineffective wear-leveling jobs, which has a more or less negative effect
on the lifetime of the flash. Specifically, the process is divided into
2 steps:
1. wear_leveling_worker // generate false scrub type PEB
     ubi_eba_copy_leb // MOVE_RETRY is returned
       leb_write_trylock // trylock failed
     scrubbing = 1;
     e1 is put into ubi->scrub
2. wear_leveling_worker // schedule false scrub type PEB for wl
     scrubbing = 1
     e1 = rb_entry(rb_first(&ubi->scrub))
The problem can be reproduced easily by running fsstress on a small
UBIFS partition(<64M, simulated by nandsim) for 5~10mins
(CONFIG_MTD_UBI_FASTMAP=y,CONFIG_MTD_UBI_WL_THRESHOLD=50). The following
message is shown:
ubi0: scrubbed PEB 66 (LEB 0:10), data moved to PEB 165
A scrub type source PEB already has the variable scrubbing set to '1',
and scrubbing is checked before the variable keep, so the problem can
be fixed by setting keep to 1 directly if the source LEB cannot
be locked.
Fixes: e801e128b220 ("UBI: fix missing scrub when there is a bit-flip")
CC: stable@vger.kernel.org
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
no_llseek had been defined to NULL two years ago, in commit 868941b14441
("fs: remove no_llseek")
To quote that commit,
At -rc1 we'll need do a mechanical removal of no_llseek -
git grep -l -w no_llseek | grep -v porting.rst | while read i; do
sed -i '/\<no_llseek\>/d' $i
done
would do it.
Unfortunately, that hadn't been done. Linus, could you do that now, so
that we could finally put that thing to rest? All instances are of the
form
.llseek = no_llseek,
so it's obviously safe.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Since ubiblock_exit() is now called from an init function,
the __exit section no longer makes sense.
Cc: Ben Hutchings <bwh@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202407131403.wZJpd8n2-lkp@intel.com/
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
|
|
Similar to commit adbf4c4954e3 ("ubi: block: fix memleak in
ubiblock_create()"), 'dev->gd' is not assigned but dereferenced if
blk_mq_alloc_tag_set() fails, leading to a null-pointer dereference.
Fix it by using pr_err() and the variable 'dev' to print the error log.
Additionally, the log in the error handle path of idr_alloc() has
been improved by using pr_err(), too. Before initializing device
name, using dev_err() will print error log with 'null' instead of
the actual device name, like this:
block (null): ...
~~~~~~
It is unclear. Using pr_err() can print more details of the device.
The improved log is:
ubiblock0_0: ...
Fixes: 77567b25ab9f ("ubi: use blk_mq_alloc_disk and blk_cleanup_disk")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
The UBIFS_DFS_DIR_LEN macro, which defines the maximum length of the UBIFS
debugfs directory name, has an incorrect formula and misleading comments.
The current formula is (3 + 1 + 2*2 + 1), which assumes that both UBI device
number and volume ID are limited to 2 characters. However, UBI device number
ranges from 0 to 31 (2 characters), and volume ID ranges from 0 to 127 (up
to 3 characters).
Although the current code works due to the cancellation of mathematical
errors (9 + 1 = 10, which matches the correct UBIFS_DFS_DIR_LEN value), it
can lead to confusion and potential issues in the future.
This patch aims to improve the code clarity and maintainability by making
the following changes:
1. Corrects the UBIFS_DFS_DIR_LEN macro definition to (3 + 1 + 2 + 3 + 1),
accommodating the maximum lengths of both UBI device number and volume ID,
plus the separators and null terminator.
2. Updates the snprintf calls to use UBIFS_DFS_DIR_LEN instead of
UBIFS_DFS_DIR_LEN + 1, removing the unnecessary +1.
3. Modifies the error checks to compare against UBIFS_DFS_DIR_LEN using >=
instead of >, aligning with the corrected macro definition.
4. Removes the redundant +1 in the dfs_dir_name array definitions in ubi.h
and debug.h.
While these changes do not affect the runtime behavior, they make the code
more readable, maintainable, and less prone to future errors.
v2->v3:
- Removes the duplicated UBIFS_DFS_DIR_LEN and UBIFS_DFS_DIR_NAME macro
definitions in ubifs.h, as they are already defined in debug.h.
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
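
The arithmetic behind the corrected macro and the >= check can be tried out in user space. In the sketch below, DFS_DIR_LEN, DFS_DIR_NAME and make_dfs_dir_name() are local stand-ins named after the kernel macros, and the "ubi%d_%d" format is assumed from the length breakdown given above.

```c
#include <stdio.h>

/* "ubi" (3) + "_" (1) + device number (2) + volume ID (3) + NUL (1) */
#define DFS_DIR_LEN  (3 + 1 + 2 + 3 + 1)
#define DFS_DIR_NAME "ubi%d_%d"

/* Format the directory name into buf, which must hold DFS_DIR_LEN bytes.
 * Returns 0 on success and -1 if the name did not fit. */
static int make_dfs_dir_name(char *buf, int ubi_num, int vol_id)
{
	int n = snprintf(buf, DFS_DIR_LEN, DFS_DIR_NAME, ubi_num, vol_id);

	/* snprintf() returns the length without the NUL terminator, so
	 * n == DFS_DIR_LEN already means the output was truncated. */
	return (n < 0 || n >= DFS_DIR_LEN) ? -1 : 0;
}
```

The longest valid name, "ubi31_127", is 9 characters plus the terminator and therefore fits exactly, which is why the buffer no longer needs the extra +1.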
|
|
We need to clean up debugfs and ubiblock if we fail after initialising
them.
Signed-off-by: Ben Hutchings <ben.hutchings@mind.be>
Fixes: 927c145208b0 ("mtd: ubi: attach from device tree")
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
The use of do_div() in ubi_nvmem_reg_read() makes calling it on
32-bit machines rather expensive. Since the 'from' variable is
known to be a 32-bit quantity, the 64-bit division is clearly never
needed and can be replaced with a regular division operation.
Fixes: b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems")
Fixes: 3ce485803da1 ("mtd: ubi: provide NVMEM layer over UBI volumes")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
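
A minimal sketch of the idea, assuming the division is used to split a byte offset into a LEB number and an offset within that LEB. The struct and function names are invented for the example and do not come from the driver.

```c
#include <stdint.h>

/* Hypothetical position inside a volume. */
struct leb_pos {
	uint32_t lnum;	/* logical eraseblock number */
	uint32_t offs;	/* byte offset inside that eraseblock */
};

/* Split a 32-bit byte offset using plain 32-bit division and modulo;
 * no 64-bit division helper is involved.  leb_size must be non-zero. */
static struct leb_pos split_offset(uint32_t from, uint32_t leb_size)
{
	struct leb_pos pos = {
		.lnum = from / leb_size,
		.offs = from % leb_size,
	};

	return pos;
}
```

Because both operands are 32 bits wide, the compiler can emit an ordinary divide even on 32-bit machines.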
|
|
Since commit 43a7206b0963 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the ubi_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
In case of a memory allocation failure in the volumes loop we can only
process the already allocated scan_eba and fm_eba array elements on the
error path - others are still uninitialized.
Found by Linux Verification Center (linuxtesting.org).
Fixes: 00abf3041590 ("UBI: Add self_check_eba()")
Cc: stable@vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
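
The error-path rule described above can be sketched in user space. The functions alloc_tables() and free_tables() and the fail_at parameter are hypothetical; the outer array is deliberately left uninitialized by malloc() to mirror the situation in the commit.

```c
#include <stdlib.h>

/* Allocate n tables of len ints each (n > 0).  A non-negative fail_at
 * simulates an allocation failure at that index so that the error path
 * can be exercised. */
static int **alloc_tables(int n, size_t len, int fail_at)
{
	int **tbl = malloc((size_t)n * sizeof(*tbl));
	int i, j;

	if (!tbl)
		return NULL;

	for (i = 0; i < n; i++) {
		tbl[i] = (i == fail_at) ? NULL : malloc(len * sizeof(**tbl));
		if (!tbl[i])
			goto out_free;
	}
	return tbl;

out_free:
	/* Only indices below i hold valid pointers; the entries after the
	 * failing one are still uninitialized and must not be freed. */
	for (j = 0; j < i; j++)
		free(tbl[j]);
	free(tbl);
	return NULL;
}

/* Release a fully allocated set of tables. */
static void free_tables(int **tbl, int n)
{
	int i;

	for (i = 0; i < n; i++)
		free(tbl[i]);
	free(tbl);
}
```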
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs
Pull UBI and UBIFS updates from Richard Weinberger:
"UBI:
- Add Zhihao Cheng as reviewer
- Attach via device tree
- Add NVMEM layer
- Various fastmap related fixes
UBIFS:
- Add Zhihao Cheng as reviewer
- Convert to folios
- Various fixes (memory leaks in error paths, function prototypes)"
* tag 'ubifs-for-linus-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: (34 commits)
mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems
mtd: ubi: provide NVMEM layer over UBI volumes
mtd: ubi: populate ubi volume fwnode
mtd: ubi: introduce pre-removal notification for UBI volumes
mtd: ubi: attach from device tree
mtd: ubi: block: use notifier to create ubiblock from parameter
dt-bindings: mtd: ubi-volume: allow UBI volumes to provide NVMEM
dt-bindings: mtd: add basic bindings for UBI
ubifs: Queue up space reservation tasks if retrying many times
ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path
ubifs: dbg_check_idx_size: Fix kmemleak if loading znode failed
ubi: Correct the number of PEBs after a volume resize failure
ubi: fix slab-out-of-bounds in ubi_eba_get_ldesc+0xfb/0x130
ubi: correct the calculation of fastmap size
ubifs: Remove unreachable code in dbg_check_ltab_lnum
ubifs: fix function pointer cast warnings
ubifs: fix sort function prototype
ubi: Check for too small LEB size in VTBL code
MAINTAINERS: Add Zhihao Cheng as UBI/UBIFS reviewer
ubifs: Convert populate_page() to take a folio
...
|
|
A compiler warning related to sizeof(int) != 8 when calling do_div()
is triggered when building on 32-bit platforms.
Address this by using integer types having a well-defined size.
Fixes: 3ce485803da1 ("mtd: ubi: provide NVMEM layer over UBI volumes")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
In an ideal world we would like UBI to be used wherever possible on a
NAND chip. And with UBI support in ARM Trusted Firmware and U-Boot it
is possible to achieve an (almost-)all-UBI flash layout. Hence the need
for a way to also use UBI volumes to store board-level constants, such
as MAC addresses and calibration data of wireless interfaces.
Add UBI volume NVMEM driver module exposing UBI volumes as NVMEM
providers. Allow UBI devices to have a "volumes" firmware subnode with
volumes which may be compatible with "nvmem-cells".
Access to UBI volumes via the NVMEM interface at this point is
read-only, and it is slow, opening and closing the UBI volume for each
access due to limitations of the NVMEM provider API.
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Look for the 'volumes' subnode of an MTD partition attached to a UBI
device and attach matching child nodes to UBI volumes.
This allows UBI volumes to be referenced in device tree, e.g. for use
as NVMEM providers.
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Introduce a new notification type UBI_VOLUME_SHUTDOWN to inform users
that a volume is just about to be removed.
This is needed because users (such as the NVMEM subsystem) expect that
at the time their removal function is called, the parenting device is
still available (for removal of sysfs nodes, for example, in case of
NVMEM which otherwise WARNs on volume removal).
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Introduce device tree compatible 'linux,ubi' and attach compatible MTD
devices using the MTD add notifier. This is needed for a UBI device to
be available early at boot (and not only after late_initcall), so
volumes on them can be used e.g. as NVMEM providers for other drivers.
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Use UBI_VOLUME_ADDED notification to create ubiblock device specified
on kernel cmdline or module parameter.
This makes things simpler and has the advantage that ubiblock devices
on volumes which are not present at the time the ubi module is probed
will still be created.
Suggested-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
In the error handling path `out_acc` of `ubi_resize_volume()`,
when `pebs < 0`, it indicates that the volume table record failed to
update when the volume was shrunk. In this case, the number of `ubi->avail_pebs`
and `ubi->rsvd_pebs` should be restored to their previous values to prevent
the UBI layer from reporting an incorrect number of available PEBs.
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
When using the ioctl interface to resize a UBI volume, `ubi_resize_volume`
resizes the EBA table first but does not change `vol->reserved_pebs` in
the same atomic context, which may cause concurrent access to the EBA table.
For example, a user may shrink UBI volume A by calling `ubi_resize_volume`
while another thread is writing to volume B and triggering wear-leveling,
which may call `ubi_write_fastmap`. Under these circumstances, KASAN may
report a slab-out-of-bounds error in `ubi_eba_get_ldesc+0xfb/0x130`.
This patch fixes race conditions in `ubi_resize_volume` and
`ubi_update_fastmap` to avoid out-of-bounds reads of `eba_tbl`. First,
it ensures that updates to `eba_tbl` and `reserved_pebs` are protected
by `vol->volumes_lock`. Second, it implements a rollback mechanism in case
of resize failure. It is also worth mentioning that for volume shrinkage
failures, since part of the volume has already been shrunk and unmapped,
there is no need to recover `{rsvd/avail}_pebs`.
==================================================================
BUG: KASAN: slab-out-of-bounds in ubi_eba_get_ldesc+0xfb/0x130 [ubi]
Read of size 4 at addr ffff88800f43f570 by task kworker/u16:0/7
CPU: 0 PID: 7 Comm: kworker/u16:0 Not tainted 5.16.0-rc7 #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Workqueue: writeback wb_workfn (flush-ubifs_0_0)
Call Trace:
<TASK>
dump_stack_lvl+0x4d/0x66
print_address_description.constprop.0+0x41/0x60
kasan_report.cold+0x83/0xdf
ubi_eba_get_ldesc+0xfb/0x130 [ubi]
ubi_update_fastmap.cold+0x60f/0xc7d [ubi]
ubi_wl_get_peb+0x25b/0x4f0 [ubi]
try_write_vid_and_data+0x9a/0x4d0 [ubi]
ubi_eba_write_leb+0x7e4/0x17d0 [ubi]
ubi_leb_map+0x1a0/0x2c0 [ubi]
ubifs_leb_map+0x139/0x270 [ubifs]
ubifs_add_bud_to_log+0xb40/0xf30 [ubifs]
make_reservation+0x86e/0xb00 [ubifs]
ubifs_jnl_write_data+0x430/0x9d0 [ubifs]
do_writepage+0x1d1/0x550 [ubifs]
ubifs_writepage+0x37c/0x670 [ubifs]
__writepage+0x67/0x170
write_cache_pages+0x259/0xa90
do_writepages+0x277/0x5d0
__writeback_single_inode+0xb8/0x850
writeback_sb_inodes+0x4b3/0xb20
__writeback_inodes_wb+0xc1/0x220
wb_writeback+0x59f/0x740
wb_workfn+0x6d0/0xca0
process_one_work+0x711/0xfc0
worker_thread+0x95/0xd00
kthread+0x3a6/0x490
ret_from_fork+0x1f/0x30
</TASK>
Allocated by task 711:
kasan_save_stack+0x1e/0x50
__kasan_kmalloc+0x81/0xa0
ubi_eba_create_table+0x88/0x1a0 [ubi]
ubi_resize_volume.cold+0x175/0xae7 [ubi]
ubi_cdev_ioctl+0x57f/0x1a60 [ubi]
__x64_sys_ioctl+0x13a/0x1c0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Last potentially related work creation:
kasan_save_stack+0x1e/0x50
__kasan_record_aux_stack+0xb7/0xc0
call_rcu+0xd6/0x1000
blk_stat_free_callback+0x28/0x30
blk_release_queue+0x8a/0x2e0
kobject_put+0x186/0x4c0
scsi_device_dev_release_usercontext+0x620/0xbd0
execute_in_process_context+0x2f/0x120
device_release+0xa4/0x240
kobject_put+0x186/0x4c0
put_device+0x20/0x30
__scsi_remove_device+0x1c3/0x300
scsi_probe_and_add_lun+0x2140/0x2eb0
__scsi_scan_target+0x1f2/0xbb0
scsi_scan_channel+0x11b/0x1a0
scsi_scan_host_selected+0x24c/0x310
do_scsi_scan_host+0x1e0/0x250
do_scan_async+0x45/0x490
async_run_entry_fn+0xa2/0x530
process_one_work+0x711/0xfc0
worker_thread+0x95/0xd00
kthread+0x3a6/0x490
ret_from_fork+0x1f/0x30
The buggy address belongs to the object at ffff88800f43f500
which belongs to the cache kmalloc-128 of size 128
The buggy address is located 112 bytes inside of
128-byte region [ffff88800f43f500, ffff88800f43f580)
The buggy address belongs to the page:
page:ffffea00003d0f00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xf43c
head:ffffea00003d0f00 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
raw: 001fffff80010200 ffffea000046ba08 ffffea0000457208 ffff88810004d1c0
raw: 0000000000000000 0000000000190019 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88800f43f400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88800f43f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff88800f43f500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
^
ffff88800f43f580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88800f43f600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
The following steps can be used to reproduce:
Process 1: write and trigger ubi wear-leveling
ubimkvol /dev/ubi0 -s 5000MiB -N v1
ubimkvol /dev/ubi0 -s 2000MiB -N v2
ubimkvol /dev/ubi0 -s 10MiB -N v3
mount -t ubifs /dev/ubi0_0 /mnt/ubifs
while true;
do
filename=/mnt/ubifs/$((RANDOM))
dd if=/dev/random of=${filename} bs=1M count=$((RANDOM % 1000))
rm -rf ${filename}
sync /mnt/ubifs/
done
Process 2: do random resize
  struct ubi_rsvol_req req;
  req.vol_id = 1;
  req.bytes = (rand() % 50) * 512 * 1024; /* multiples of 512 KiB */
  ioctl(fd, UBI_IOCRSVOL, &req);
V3:
- Fix the commit message error.
V2:
- Add volumes_lock in ubi_eba_copy_leb() to avoid race caused by
updating eba_tbl.
V1:
- Rebase the patch on the latest mainline.
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
The calculation of the fastmap size in ubi_calc_fm_size() is currently
incorrect since it misses each user volume's ubi_fm_eba structure and the
internal UBI volume info. Let's correct the calculation.
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
If the LEB size is smaller than a volume table record we cannot
have volumes.
In this case abort attaching.
Cc: Chenyuan Yang <cy54@illinois.edu>
Cc: stable@vger.kernel.org
Fixes: 801c135ce73d ("UBI: Unsorted Block Images")
Reported-by: Chenyuan Yang <cy54@illinois.edu>
Closes: https://lore.kernel.org/linux-mtd/1433EB7A-FC89-47D6-8F47-23BE41B263B3@illinois.edu/
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
|
|
Pass the few limits ubiblock imposes directly to blk_mq_alloc_disk
instead of setting them one at a time.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Link: https://lore.kernel.org/r/20240215070300.2200308-15-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pass a queue_limits to blk_mq_alloc_disk and apply it if non-NULL. This
will allow allocating queues with valid queue limits instead of setting
the values one at a time later.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20240213073425.1621680-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If idr_alloc() fails, dev->gd will be put after goto out_cleanup_disk in
ubiblock_create(), but dev->gd has not been assigned yet at this time, and
'gd' will not be put anymore. Fix it by putting 'gd' directly.
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Because the mask received by the emulate_failures interface
is a 32-bit unsigned integer, ensure that there is sufficient
buffer length to receive and display this value.
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
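
The buffer sizing can be checked with a short sketch. It assumes the mask is printed in decimal followed by a newline; the macro MASK_BUF_LEN and the function format_mask() are hypothetical names, not taken from the driver.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* The largest 32-bit value, "4294967295", has 10 digits; add one byte
 * for the newline and one for the NUL terminator. */
#define MASK_BUF_LEN (10 + 1 + 1)

/* Format mask into buf, which must hold MASK_BUF_LEN bytes.  Returns the
 * number of characters written, not counting the NUL terminator. */
static int format_mask(char *buf, uint32_t mask)
{
	return snprintf(buf, MASK_BUF_LEN, "%" PRIu32 "\n", mask);
}
```

A buffer sized for a smaller range would truncate the upper values of the mask, which is the situation the commit guards against.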
|
|
This commit adds six fault injection types for testing, to cover the
abnormal paths of the UBI driver.
Inject the following faults when the UBI reads the LEB:
+----------------------------+-----------------------------------+
| Interface name             | Emulated behavior                 |
+----------------------------+-----------------------------------+
| emulate_eccerr             | ECC error                         |
+----------------------------+-----------------------------------+
| emulate_read_failure       | read failure                      |
+----------------------------+-----------------------------------+
| emulate_io_ff              | read content as all FF            |
+----------------------------+-----------------------------------+
| emulate_io_ff_bitflips     | content FF with MTD err reported  |
+----------------------------+-----------------------------------+
| emulate_bad_hdr            | bad LEB header                    |
+----------------------------+-----------------------------------+
| emulate_bad_hdr_ebadmsg    | bad header with ECC err           |
+----------------------------+-----------------------------------+
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
The emulate_io_failures debugfs entry controls both write
failure and erase failure. This patch splits io_failures
into write_failure and erase_failure.
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
To make debug parameters configurable at run time, use the
fault injection framework to reconstruct the debugfs interface,
and retain the legacy fault injection interface.
Now, the emulate_failures file and the fault_attr files control whether
fault emulation is enabled.
The file emulate_failures receives a mask that controls type and
process of fault injection. Generally, for ease of use, you can
directly enter a mask with all 1s.
echo 0xffff > /sys/kernel/debug/ubi/ubi0/emulate_failures
You also need to configure the other fault-injection attributes for
testing purposes:
echo 100 > /sys/kernel/debug/ubi/fault_inject/emulate_power_cut/probability
echo 15 > /sys/kernel/debug/ubi/fault_inject/emulate_power_cut/space
echo 2 > /sys/kernel/debug/ubi/fault_inject/emulate_power_cut/verbose
echo -1 > /sys/kernel/debug/ubi/fault_inject/emulate_power_cut/times
CONFIG_MTD_UBI_FAULT_INJECTION is added to Kconfig to enable
fault injection.
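The echo commands above can also be scripted. The following is a minimal sketch that only mirrors the paths and values quoted in this message; the function names and the default device name are hypothetical, and writing the files needs root plus a kernel built with CONFIG_MTD_UBI_FAULT_INJECTION.

```python
from pathlib import PurePosixPath

DEBUGFS_UBI = PurePosixPath("/sys/kernel/debug/ubi")

def power_cut_settings(ubi_dev="ubi0", mask=0xffff):
    """Return (path, value) pairs mirroring the echo commands in the message."""
    fault = DEBUGFS_UBI / "fault_inject" / "emulate_power_cut"
    return [
        (str(DEBUGFS_UBI / ubi_dev / "emulate_failures"), hex(mask)),
        (str(fault / "probability"), "100"),
        (str(fault / "space"), "15"),
        (str(fault / "verbose"), "2"),
        (str(fault / "times"), "-1"),
    ]

def apply_settings(settings):
    """Write each value to its debugfs file (not called here; needs root)."""
    for path, value in settings:
        with open(path, "w") as f:
            f.write(value)
```

Keeping the settings as data makes it easy to review them before calling apply_settings() on a test machine.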
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
The following BUG is reported when a ubiblock is removed:
==================================================================
BUG: KASAN: slab-use-after-free in ubiblock_cleanup+0x88/0xa0 [ubi]
Read of size 4 at addr ffff88810c8f3804 by task ubiblock/1716
CPU: 5 PID: 1716 Comm: ubiblock Not tainted 6.6.0-rc2+ #135
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x37/0x50
print_report+0xd0/0x620
kasan_report+0xb6/0xf0
ubiblock_cleanup+0x88/0xa0 [ubi]
ubiblock_remove+0x121/0x190 [ubi]
vol_cdev_ioctl+0x355/0x630 [ubi]
__x64_sys_ioctl+0xc7/0x100
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0033:0x7f08d7445577
Code: b3 66 90 48 8b 05 11 89 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 8
RSP: 002b:00007ffde05a3018 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000ffffffff RCX: 00007f08d7445577
RDX: 0000000000000000 RSI: 0000000000004f08 RDI: 0000000000000003
RBP: 0000000000816010 R08: 00000000008163a7 R09: 0000000000000000
R10: 0000000000000003 R11: 0000000000000206 R12: 0000000000000003
R13: 00007ffde05a3130 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Allocated by task 1715:
kasan_save_stack+0x22/0x50
kasan_set_track+0x25/0x30
__kasan_kmalloc+0x7f/0x90
__alloc_disk_node+0x40/0x2b0
__blk_mq_alloc_disk+0x3e/0xb0
ubiblock_create+0x2ba/0x620 [ubi]
vol_cdev_ioctl+0x581/0x630 [ubi]
__x64_sys_ioctl+0xc7/0x100
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Freed by task 0:
kasan_save_stack+0x22/0x50
kasan_set_track+0x25/0x30
kasan_save_free_info+0x2b/0x50
__kasan_slab_free+0x10e/0x190
__kmem_cache_free+0x96/0x220
bdev_free_inode+0xa4/0xf0
rcu_core+0x496/0xec0
__do_softirq+0xeb/0x384
The buggy address belongs to the object at ffff88810c8f3800
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 4 bytes inside of
freed 1024-byte region [ffff88810c8f3800, ffff88810c8f3c00)
The buggy address belongs to the physical page:
page:00000000d03de848 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10c8f0
head:00000000d03de848 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x200000000000840(slab|head|node=0|zone=2)
page_type: 0xffffffff()
raw: 0200000000000840 ffff888100042dc0 ffffea0004244400 dead000000000002
raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88810c8f3700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88810c8f3780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88810c8f3800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88810c8f3880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88810c8f3900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Fix it by using a local variable to record the gendisk ID.
Fixes: 77567b25ab9f ("ubi: use blk_mq_alloc_disk and blk_cleanup_disk")
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
pools
This patch introduces a new field 'need_resv_pool' in struct 'ubi_attach_req'
to control whether free PEBs are reserved for filling pool/wl_pool.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Add a 6th module parameter in 'mtd=xxx' to control whether PEBs are
reserved for filling pool/wl_pool.
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
The anchor PEB must be picked from the first 64 PEBs, so these PEBs can
end up with much larger erase counters than other PEBs, especially when
free space is nearly exhausted.
ubi_update_fastmap is called whenever pool/wl_pool is empty, and the
old anchor PEB is erased on every fastmap update. Given a UBI device with
N PEBs whose free PEBs are nearly exhausted, the pool is filled with only
1 PEB each time ubi_update_fastmap is invoked. So t=N/POOL_SIZE[1]/64
means that, in the worst case, the erase counter of the first 64 PEBs is
in theory t times greater than that of other PEBs.
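The formula can be evaluated for the two device sizes used in this message. This is only a worked calculation; the function name is hypothetical, and a pool size of 1 corresponds to the nearly-out-of-space case described above.

```python
ANCHOR_PEBS = 64  # the anchor PEB is picked from the first 64 PEBs

def worst_case_ratio(total_pebs, pool_size):
    """Evaluate t = N / POOL_SIZE / 64 from the commit message."""
    return total_pebs / pool_size / ANCHOR_PEBS

# Pool refilled with a single PEB per fastmap update (free space nearly exhausted).
t_1024 = worst_case_ratio(1024, 1)
t_8192 = worst_case_ratio(8192, 1)
# The same 8192-PEB device with a pool of 128 PEBs.
t_8192_pool128 = worst_case_ratio(8192, 128)
```

The larger the pool refill, the fewer fastmap updates are needed per pass over the device, which is why t shrinks as POOL_SIZE grows.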
After running fsstress for 24h, the erase counter statistics for two UBI
devices are as follows (CONFIG_MTD_UBI_WL_THRESHOLD=128):
Device A(1024 PEBs, pool=50, wl_pool=25):
=========================================================
from to count min avg max
---------------------------------------------------------
0 .. 9: 0 0 0 0
10 .. 99: 0 0 0 0
100 .. 999: 0 0 0 0
1000 .. 9999: 0 0 0 0
10000 .. 99999: 960 29224 29282 29362
100000 .. inf: 64 117897 117934 117940
---------------------------------------------------------
Total : 1024 29224 34822 117940
Device B(8192 PEBs, pool=256, wl_pool=128):
=========================================================
from to count min avg max
---------------------------------------------------------
0 .. 9: 0 0 0 0
10 .. 99: 0 0 0 0
100 .. 999: 0 0 0 0
1000 .. 9999: 8128 2253 2321 2387
10000 .. 99999: 64 35387 35387 35388
100000 .. inf: 0 0 0 0
---------------------------------------------------------
Total : 8192 2253 2579 35388
The key point is to reduce the fastmap update frequency by enlarging
POOL_SIZE, so let UBI reserve ubi->fm_pool.max_size PEBs during
attach. POOL_SIZE then becomes ubi->fm_pool.max_size/2 even when
free space is running out.
Given a UBI device with 8192 PEBs (16384/8192/4096 are common sizes
for large-capacity flash), t=8192/128/64=1. A fastmap update happens
when either wl_pool or pool is empty, so setting fm_pool_rsv_cnt
to ubi->fm_pool.max_size allows wl_pool to be filled completely.
After pool reservation, running fsstress for 24h:
Device A(1024 PEBs, pool=50, wl_pool=25):
=========================================================
from to count min avg max
---------------------------------------------------------
0 .. 9: 0 0 0 0
10 .. 99: 0 0 0 0
100 .. 999: 0 0 0 0
1000 .. 9999: 0 0 0 0
10000 .. 99999: 1024 33801 33997 34056
100000 .. inf: 0 0 0 0
---------------------------------------------------------
Total : 1024 33801 33997 34056
Device B(8192 PEBs, pool=256, wl_pool=128):
=========================================================
from to count min avg max
---------------------------------------------------------
0 .. 9: 0 0 0 0
10 .. 99: 0 0 0 0
100 .. 999: 0 0 0 0
1000 .. 9999: 8192 2205 2397 2460
10000 .. 99999: 0 0 0 0
100000 .. inf: 0 0 0 0
---------------------------------------------------------
Total : 8192 2205 2397 2460
The difference in erase counter between the first 64 PEBs and the others
is under WL_FREE_MAX_DIFF (2*UBI_WL_THRESHOLD=2*128=256):
Device A: 34056 - 33801 = 255
Device B: 2460 - 2205 = 255
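The same comparison can be reproduced from the min/max columns of the "Total" rows in the tables above. This is a small sketch; the dictionary layout and function name are just for illustration.

```python
UBI_WL_THRESHOLD = 128
WL_FREE_MAX_DIFF = 2 * UBI_WL_THRESHOLD  # 256, as stated in the message

def spread(min_ec, max_ec):
    """Difference between the largest and smallest erase counter on a device."""
    return max_ec - min_ec

# (min, max) erase counters copied from the "Total" rows of the tables.
before = {"A": (29224, 117940), "B": (2253, 35388)}
after = {"A": (33801, 34056), "B": (2205, 2460)}

spread_before = {dev: spread(*ecs) for dev, ecs in before.items()}
spread_after = {dev: spread(*ecs) for dev, ecs in after.items()}
within_limit = all(s < WL_FREE_MAX_DIFF for s in spread_after.values())
```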
The next patch will add a switch to control whether UBI needs to reserve
PEBs for filling the pool.
Fixes: dbb7d2a88d2a ("UBI: Add fastmap core")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
This is part 2 of the fix for cyclically reusing a single fastmap data PEB.
Consider a situation in which there are four free PEBs for fm_anchor, pool,
wl_pool and the fastmap data PEB, with erase counters 100, 100, 100 and 5096
(ubi->beb_rsvd_pebs is 0). The PEB with erase counter 5096 is always picked
for fastmap data because of how find_wl_entry() is implemented. Since the
fastmap data PEB is not scheduled for wl, there end up being two PEBs
(fm data) with greater erase counters than the other PEBs.
Get the wl PEB even if its erase counter exceeds 'max' in find_wl_entry()
when free PEBs have run out after filling the pools and fm data. The PEB
with the biggest erase counter is then taken as the wl PEB, so it can be
scheduled for wl.
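The selection behaviour can be illustrated with a toy model. It is a deliberate simplification and not the kernel code: it assumes find_wl_entry() returns the free entry with the largest erase counter below min + diff, and the pick_max flag and the drain() helper are hypothetical names used here to show the changed behaviour.

```python
def find_wl_entry(free_ecs, diff, pick_max=False):
    """Toy model: pick the largest erase counter below min + diff,
    or simply the largest one when pick_max is set."""
    if pick_max:
        return max(free_ecs)
    limit = min(free_ecs) + diff
    return max(ec for ec in free_ecs if ec < limit)

def drain(free_ecs, diff):
    """Take entries one by one until none are left; return the pick order."""
    remaining = list(free_ecs)
    order = []
    while remaining:
        ec = find_wl_entry(remaining, diff)
        remaining.remove(ec)
        order.append(ec)
    return order

free = [100, 100, 100, 5096]
order = drain(free, 256)
```

In this model the PEB with erase counter 5096 is always the one left over for the last consumer, while picking with pick_max=True hands it out first so that it can take part in wear leveling.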
Fixes: dbb7d2a88d2a ("UBI: Add fastmap core")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
This is part 1 of the fix for cyclically reusing a single fastmap data PEB.
After running fsstress on UBIFS for a while, UBI (16384 blocks, fastmap
takes 2 blocks) has an erase block (PEB: 8031) with an erase counter
much greater than that of any other PEB:
=========================================================
from to count min avg max
---------------------------------------------------------
0 .. 9: 0 0 0 0
10 .. 99: 532 84 92 99
100 .. 999: 15787 100 147 229
1000 .. 9999: 64 4699 4765 4826
10000 .. 99999: 0 0 0 0
100000 .. inf: 1 272935 272935 272935
---------------------------------------------------------
Total : 16384 84 180 272935
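A table in this shape can be derived from a flat list of per-PEB erase counters. The sketch below is an assumption about how such statistics could be computed and is not the tool that produced the table; in particular, the integer average and the function name are illustrative choices.

```python
def ec_histogram(erase_counters):
    """Bucket erase counters by decade and return rows of
    (from, to, count, min, avg, max); `to` is None for the open-ended bucket."""
    bounds = [(0, 9), (10, 99), (100, 999), (1000, 9999),
              (10000, 99999), (100000, None)]
    rows = []
    for lo, hi in bounds:
        vals = [ec for ec in erase_counters
                if ec >= lo and (hi is None or ec <= hi)]
        if vals:
            rows.append((lo, hi, len(vals), min(vals),
                         sum(vals) // len(vals), max(vals)))
        else:
            rows.append((lo, hi, 0, 0, 0, 0))
    return rows

# A tiny made-up sample, not the real device data.
sample_rows = ec_histogram([84, 99, 100, 229, 4699, 272935])
```

A single row in the top bucket with a count of 1, as in the table above, is the signature of one PEB being erased far more often than the rest.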
Unlike fm_anchor, there are no candidate PEBs for the fastmap data area,
so the old fastmap data PEBs are reused once all free PEBs have been
filled into pool/wl_pool:
ubi_update_fastmap
for (i = 1; i < new_fm->used_blocks; i++)
erase_block(ubi, old_fm->e[i]->pnum)
new_fm->e[i] = old_fm->e[i]
According to the wear-leveling algorithm, UBI selects one PEB with a small
erase counter from ubi->used and one PEB with a big erase counter from
wl_pool; the reused fastmap data PEB is not in these trees. UBI won't
schedule this PEB for wl even if it is in ubi->used, because the wl
algorithm expects a used PEB to have a small erase counter.
Don't reserve a PEB for fastmap in may_reserve_for_fm() if fm_anchor
already exists. Otherwise, when UBI is running out of free PEBs,
the only free PEB (pnum < 64) will be skipped and the fastmap data
will be written to the same old PEB.
Fixes: dbb7d2a88d2a ("UBI: Add fastmap core")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Change the pool filling stop condition. Commit d09e9a2bddba ("ubi:
fastmap: Fix high cpu usage of ubi_bgt by making sure wl_pool
not empty") reserves fastmap data PEBs after filling 1 PEB in
wl_pool. Now that wait_free_pebs_for_pool() ensures there are enough
free PEBs before the pools are filled, there will still be at least
1 PEB in pool and 1 PEB in wl_pool after ubi_refill_pools() runs.
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Wait until there are enough free PEBs before filling pool/wl_pool.
Sometimes erase_worker is not scheduled in time, which causes two
problems:
A. Few PEBs are filled into the pool, so ubi_update_fastmap is called
frequently and the first 64 PEBs are erased more often than other
PEBs. Waiting for free PEBs before filling the pool reduces the
fastmap update frequency and prolongs flash service life.
B. When space is nearly exhausted, ubi_refill_pools() cannot guarantee
that pool and wl_pool are filled with free PEBs, because of the delay
of erase_worker. With this patch applied, there must be free PEBs in
the pool after one call of ubi_update_fastmap.
Besides, this patch is preparation for fixing the large erase counter of
the fastmap data block and the lapsed wear leveling of the first 64 PEBs.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
If new bad PEBs occur, UBI first consumes ubi->beb_rsvd_pebs, then
ubi->avail_pebs, and finally becomes read-only if both are 0, which
means that the number of PEBs for user volumes is not affected.
Besides, UBI reserves ubi->beb_rsvd_pebs free PEBs while filling the
wl pool or getting free PEBs, but ubi->avail_pebs is not reserved.
So ubi->beb_rsvd_pebs and ubi->avail_pebs have nothing to do with the
usage of free PEBs; UBI can use all free PEBs.
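The order in which a new bad PEB is absorbed, as described at the start of this message, can be modelled with plain counters. This is a sketch of the accounting only; the function name and return shape are hypothetical and it does not reflect the actual kernel data structures.

```python
def handle_bad_peb(beb_rsvd_pebs, avail_pebs):
    """Account for one new bad PEB.

    Returns (beb_rsvd_pebs, avail_pebs, read_only) after the event:
    the reserve is consumed first, then available PEBs, and the device
    turns read-only once both counters are already 0.
    """
    if beb_rsvd_pebs > 0:
        return beb_rsvd_pebs - 1, avail_pebs, False
    if avail_pebs > 0:
        return 0, avail_pebs - 1, False
    return 0, 0, True

# Walk a device with 1 reserved and 1 available PEB through three bad PEBs.
state = (1, 1, False)
history = []
for _ in range(3):
    state = handle_bad_peb(state[0], state[1])
    history.append(state)
```

Neither counter refers to the set of free PEBs itself, which matches the argument that free PEBs do not need to be held back on their behalf.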
Commit 78d6d497a648 ("UBI: Move fastmap specific functions out of wl.c")
removed the beb_rsvd_pebs check while filling the pool. Now, don't reserve
ubi->beb_rsvd_pebs while filling wl_pool either. This fills more PEBs into
the pool and also reduces the fastmap update frequency.
Also remove the beb_rsvd_pebs check in ubi_wl_get_fm_peb.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Since erase_block() has the same logic as sync_erase(), replace it
with sync_erase(), and rename 'sync_erase()' to 'ubi_sync_erase()'.
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|