summaryrefslogtreecommitdiff
path: root/drivers/mtd/ubi
AgeCommit message (Collapse)Author
2025-03-13block: remove unused parameter 'q' parameter in __blk_rq_map_sg()Anuj Gupta
request_queue param is no longer used by blk_rq_map_sg and __blk_rq_map_sg. Remove it. Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250313035322.243239-1-anuj20.g@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-30Merge tag 'ubifs-for-linus-6.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull UBI and UBIFS updates from Richard Weinberger: "UBI: - New interface to dump detailed erase counters - Fixes around wear-leveling UBIFS: - Minor cleanups - Fix for TNC dumping code" * tag 'ubifs-for-linus-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: ubi: ubi_get_ec_info: Fix compiling error 'cast specifies array type' ubi: Implement ioctl for detailed erase counters ubi: Expose interface for detailed erase counters ubifs: skip dumping tnc tree when zroot is null ubi: Revert "ubi: wl: Close down wear-leveling before nand is suspended" ubifs: ubifs_dump_leb: remove return from end of void function ubifs: dump_lpt_leb: remove return at end of void function ubi: Add a check for ubi_num
2025-01-20ubi: ubi_get_ec_info: Fix compiling error 'cast specifies array type'Zhihao Cheng
On risc V platform, there is a type conversion for the return value (unsigned long type) of __untagged_addr_remote() in function untagged_addr(). The compiler will complain when the parameter 'addr' is an array type: arch/riscv/include/asm/uaccess.h:33:9: error: cast specifies array type (__force __typeof__(addr))__untagged_addr_remote(current->mm, __addr) Fix it by converting the input parameter as a pointer. Fixes: 01099f635a4c ("ubi: Implement ioctl for detailed erase counters") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501191405.WYnmdL0U-lkp@intel.com/ Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2025-01-18ubi: Implement ioctl for detailed erase countersRickard Andersson
Currently, "max_ec" can be read from sysfs, which provides a limited view of the flash device’s wear. In certain cases, such as bugs in the wear-leveling algorithm, specific blocks can be worn down more than others, resulting in uneven wear distribution. Also some use cases can wear the erase blocks of the fastmap area more heavily than other parts of flash. Providing detailed erase counter values give a better understanding of the overall flash wear and is needed to be able to calculate for example expected life time. There exists more detailed info in debugfs, but this information is only available for debug builds. Signed-off-by: Rickard Andersson <rickard.andersson@axis.com> Tested-by: Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2025-01-18ubi: Revert "ubi: wl: Close down wear-leveling before nand is suspended"Zhihao Cheng
Commit 5580cdae05ae ("ubi: wl: Close down wear-leveling before nand is suspended") added a reboot notification in UBI layer to shutdown the wear-leveling subsystem, which imported an UAF problem[1]. Besides that, the method also brings other potential UAF problems, for example: reboot kworker ubi_wl_reboot_notifier ubi_wl_close ubi_fastmap_close kfree(ubi->fm) update_fastmap_work_fn ubi_update_fastmap old_fm = ubi->fm if (old_fm && old_fm->e[i]) // UAF! Actually, the problem fixed by commit 5580cdae05ae ("ubi: wl: Close down wear-leveling before nand is suspended") has been solved by commit 8cba323437a4 ("mtd: rawnand: protect access to rawnand devices while in suspend"), which was discussed in [2]. So we can revert the commit 5580cdae05ae ("ubi: wl: Close down wear-leveling before nand is suspended") directly. [1] https://lore.kernel.org/linux-mtd/20241208175211.9406-2-dennis.lamerice@gmail.com/ [2] https://lore.kernel.org/all/9bf76f5d-12a4-46ff-90d4-4a7f0f47c381@axis.com/ Fixes: 5580cdae05ae ("ubi: wl: Close down wear-leveling before nand is suspended") Reported-by: Dennis Lam <dennis.lamerice@gmail.com> Closes: https://lore.kernel.org/linux-mtd/20241208175211.9406-2-dennis.lamerice@gmail.com/ Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Acked-by: Mårten Lindahl <marten.lindahl@axis.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2025-01-18ubi: Add a check for ubi_numDenis Arefev
Added a check for ubi_num for negative numbers If the variable ubi_num takes negative values then we get: qemu-system-arm ... -append "ubi.mtd=0,0,0,-22222345" ... [ 0.745065] ubi_attach_mtd_dev from ubi_init+0x178/0x218 [ 0.745230] ubi_init from do_one_initcall+0x70/0x1ac [ 0.745344] do_one_initcall from kernel_init_freeable+0x198/0x224 [ 0.745474] kernel_init_freeable from kernel_init+0x18/0x134 [ 0.745600] kernel_init from ret_from_fork+0x14/0x28 [ 0.745727] Exception stack(0x90015fb0 to 0x90015ff8) Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 83ff59a06663 ("UBI: support ubi_num on mtd.ubi command line") Cc: stable@vger.kernel.org Signed-off-by: Denis Arefev <arefev@swemel.ru> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-12-23block: remove BLK_MQ_F_SHOULD_MERGEChristoph Hellwig
BLK_MQ_F_SHOULD_MERGE is set for all tag_sets except those that purely process passthrough commands (bsg-lib, ufs tmf, various nvme admin queues) and thus don't even check the flag. Remove it to simplify the driver interface. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241219060214.1928848-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-14mtd: ubi: remove redundant check on bytes_left at end of functionColin Ian King
In function ubi_nvmem_reg_read the while-loop can only be exiting of bytes_left is zero or an error has occurred. There is an exit return path if an error occurs, so the bytes_left can only be zero after that point. Hence the check for a non-zero bytes_left at the end of the function is redundant and can be removed. Remove the check and just return 0. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14mtd: ubi: fix unreleased fwnode_handle in find_volume_fwnode()Javier Carrasco
The 'fw_vols' fwnode_handle initialized via device_get_named_child_node() requires explicit calls to fwnode_handle_put() when the variable is no longer required. Add the missing calls to fwnode_handle_put() before the function returns. Cc: stable@vger.kernel.org Fixes: 51932f9fc487 ("mtd: ubi: populate ubi volume fwnode") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14ubi: fastmap: Fix duplicate slab cache names while attachingZhihao Cheng
Since commit 4c39529663b9 ("slab: Warn on duplicate cache names when DEBUG_VM=y"), the duplicate slab cache names can be detected and a kernel WARNING is thrown out. In UBI fast attaching process, alloc_ai() could be invoked twice with the same slab cache name 'ubi_aeb_slab_cache', which will trigger following warning messages: kmem_cache of name 'ubi_aeb_slab_cache' already exists WARNING: CPU: 0 PID: 7519 at mm/slab_common.c:107 __kmem_cache_create_args+0x100/0x5f0 Modules linked in: ubi(+) nandsim [last unloaded: nandsim] CPU: 0 UID: 0 PID: 7519 Comm: modprobe Tainted: G 6.12.0-rc2 RIP: 0010:__kmem_cache_create_args+0x100/0x5f0 Call Trace: __kmem_cache_create_args+0x100/0x5f0 alloc_ai+0x295/0x3f0 [ubi] ubi_attach+0x3c3/0xcc0 [ubi] ubi_attach_mtd_dev+0x17cf/0x3fa0 [ubi] ubi_init+0x3fb/0x800 [ubi] do_init_module+0x265/0x7d0 __x64_sys_finit_module+0x7a/0xc0 The problem could be easily reproduced by loading UBI device by fastmap with CONFIG_DEBUG_VM=y. Fix it by using different slab names for alloc_ai() callers. Fixes: d2158f69a7d4 ("UBI: Remove alloc_ai() slab name from parameter list") Fixes: fdf10ed710c0 ("ubi: Rework Fastmap attach base code") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14ubi: wl: Close down wear-leveling before nand is suspendedMårten Lindahl
If a reboot/shutdown signal with double force (-ff) is triggered when the erase worker or wear-leveling worker function runs we may end up in a race condition since the MTD device gets a reboot notification and suspends the nand flash before the erase or wear-leveling is done. This will reject all accesses to the flash with -EBUSY. Sequence for the erase worker function: systemctl reboot -ff ubi_thread do_work __do_sys_reboot blocking_notifier_call_chain mtd_reboot_notifier nand_shutdown nand_suspend __erase_worker ubi_sync_erase mtd_erase nand_erase_nand # Blocked by suspended chip nand_get_device => EBUSY Similar sequence for the wear-leveling function: systemctl reboot -ff ubi_thread do_work __do_sys_reboot blocking_notifier_call_chain mtd_reboot_notifier nand_shutdown nand_suspend wear_leveling_worker ubi_eba_copy_leb ubi_io_write mtd_write nand_write_oob # Blocked by suspended chip nand_get_device => EBUSY systemd-shutdown[1]: Rebooting. ubi0 error: ubi_io_write: error -16 while writing 2048 bytes to PEB CPU: 1 PID: 82 Comm: ubi_bgt0d Kdump: loaded Tainted: G O (unwind_backtrace) from [<80107b9f>] (show_stack+0xb/0xc) (show_stack) from [<8033641f>] (dump_stack_lvl+0x2b/0x34) (dump_stack_lvl) from [<803b7f3f>] (ubi_io_write+0x3ab/0x4a8) (ubi_io_write) from [<803b817d>] (ubi_io_write_vid_hdr+0x71/0xb4) (ubi_io_write_vid_hdr) from [<803b6971>] (ubi_eba_copy_leb+0x195/0x2f0) (ubi_eba_copy_leb) from [<803b939b>] (wear_leveling_worker+0x2ff/0x738) (wear_leveling_worker) from [<803b86ef>] (do_work+0x5b/0xb0) (do_work) from [<803b9ee1>] (ubi_thread+0xb1/0x11c) (ubi_thread) from [<8012c113>] (kthread+0x11b/0x134) (kthread) from [<80100139>] (ret_from_fork+0x11/0x38) Exception stack(0x80c43fb0 to 0x80c43ff8) ... ubi0 error: ubi_dump_flash: err -16 while reading 2048 bytes from PEB ubi0 error: wear_leveling_worker: error -16 while moving PEB 246 to PEB ubi0 warning: ubi_ro_mode.part.0: switch to read-only mode ... ubi0 error: do_work: work failed with error code -16 ubi0 error: ubi_thread: ubi_bgt0d: work failed with error code -16 ... Kernel panic - not syncing: Software Watchdog Timer expired Add a reboot notification for the ubi/wear-leveling to shutdown any potential flash work actions before the nand is suspended. Signed-off-by: Mårten Lindahl <marten.lindahl@axis.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14mtd: ubi: Rmove unused declaration in header fileZhang Zekun
The definition of ubi_destroy_ai() has been removed since commit dac6e2087a41 ("UBI: Add fastmap stuff to attach.c"). Remove the empty declaration in header file. Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14ubi: fastmap: wl: Schedule fm_work if wear-leveling pool is emptyZhihao Cheng
Since commit 14072ee33d5a ("ubi: fastmap: Check wl_pool for free peb before wear leveling"), wear_leveling_worker() won't schedule fm_work if wear-leveling pool is empty, which could temporarily disable the wear-leveling until the fastmap is updated(eg. pool becomes empty). Fix it by scheduling fm_work if wl_pool is empty during wear-leveing. Fixes: 14072ee33d5a ("ubi: fastmap: Check wl_pool for free peb before wear leveling") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14ubi: wl: Put source PEB into correct list if trying locking LEB failedZhihao Cheng
During wear-leveing work, the source PEB will be moved into scrub list when source LEB cannot be locked in ubi_eba_copy_leb(), which is wrong for non-scrub type source PEB. The problem could bring extra and ineffective wear-leveing jobs, which makes more or less negative effects for the life time of flash. Specifically, the process is divided 2 steps: 1. wear_leveling_worker // generate false scrub type PEB ubi_eba_copy_leb // MOVE_RETRY is returned leb_write_trylock // trylock failed scrubbing = 1; e1 is put into ubi->scrub 2. wear_leveling_worker // schedule false scrub type PEB for wl scrubbing = 1 e1 = rb_entry(rb_first(&ubi->scrub)) The problem can be reproduced easily by running fsstress on a small UBIFS partition(<64M, simulated by nandsim) for 5~10mins (CONFIG_MTD_UBI_FASTMAP=y,CONFIG_MTD_UBI_WL_THRESHOLD=50). Following message is shown: ubi0: scrubbed PEB 66 (LEB 0:10), data moved to PEB 165 Since scrub type source PEB has set variable scrubbing as '1', and variable scrubbing is checked before variable keep, so the problem can be fixed by setting keep variable as 1 directly if the source LEB cannot be locked. Fixes: e801e128b220 ("UBI: fix missing scrub when there is a bit-flip") CC: stable@vger.kernel.org Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-09-27[tree-wide] finally take no_llseek outAl Viro
no_llseek had been defined to NULL two years ago, in commit 868941b14441 ("fs: remove no_llseek") To quote that commit, At -rc1 we'll need do a mechanical removal of no_llseek - git grep -l -w no_llseek | grep -v porting.rst | while read i; do sed -i '/\<no_llseek\>/d' $i done would do it. Unfortunately, that hadn't been done. Linus, could you do that now, so that we could finally put that thing to rest? All instances are of the form .llseek = no_llseek, so it's obviously safe. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-28ubi: Fix ubi_init() ubiblock_exit() section mismatchRichard Weinberger
Since ubiblock_exit() is now called from an init function, the __exit section no longer makes sense. Cc: Ben Hutchings <bwh@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202407131403.wZJpd8n2-lkp@intel.com/ Signed-off-by: Richard Weinberger <richard@nod.at> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
2024-07-12ubi: block: fix null-pointer-dereference in ubiblock_create()Li Nan
Similar to commit adbf4c4954e3 ("ubi: block: fix memleak in ubiblock_create()"), 'dev->gd' is not assigned but dereferenced if blk_mq_alloc_tag_set() fails, and leading to a null-pointer-dereference. Fix it by using pr_err() and variable 'dev' to print error log. Additionally, the log in the error handle path of idr_alloc() has been improved by using pr_err(), too. Before initializing device name, using dev_err() will print error log with 'null' instead of the actual device name, like this: block (null): ... ~~~~~~ It is unclear. Using pr_err() can print more details of the device. The improved log is: ubiblock0_0: ... Fixes: 77567b25ab9f ("ubi: use blk_mq_alloc_disk and blk_cleanup_disk") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-07-12ubifs: correct UBIFS_DFS_DIR_LEN macro definition and improve code clarityZhaoLong Wang
The UBIFS_DFS_DIR_LEN macro, which defines the maximum length of the UBIFS debugfs directory name, has an incorrect formula and misleading comments. The current formula is (3 + 1 + 2*2 + 1), which assumes that both UBI device number and volume ID are limited to 2 characters. However, UBI device number ranges from 0 to 31 (2 characters), and volume ID ranges from 0 to 127 (up to 3 characters). Although the current code works due to the cancellation of mathematical errors (9 + 1 = 10, which matches the correct UBIFS_DFS_DIR_LEN value), it can lead to confusion and potential issues in the future. This patch aims to improve the code clarity and maintainability by making the following changes: 1. Corrects the UBIFS_DFS_DIR_LEN macro definition to (3 + 1 + 2 + 3 + 1), accommodating the maximum lengths of both UBI device number and volume ID, plus the separators and null terminator. 2. Updates the snprintf calls to use UBIFS_DFS_DIR_LEN instead of UBIFS_DFS_DIR_LEN + 1, removing the unnecessary +1. 3. Modifies the error checks to compare against UBIFS_DFS_DIR_LEN using >= instead of >, aligning with the corrected macro definition. 4. Removes the redundant +1 in the dfs_dir_name array definitions in ubi.h and debug.h. While these changes do not affect the runtime behavior, they make the code more readable, maintainable, and less prone to future errors. v2->v3: - Removes the duplicated UBIFS_DFS_DIR_LEN and UBIFS_DFS_DIR_NAME macro definitions in ubifs.h, as they are already defined in debug.h. Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-07-12mtd: ubi: Restore missing cleanup on ubi_init() failure pathBen Hutchings
We need to clean-up debugfs and ubiblock if we fail after initialising them. Signed-off-by: Ben Hutchings <ben.hutchings@mind.be> Fixes: 927c145208b0 ("mtd: ubi: attach from device tree") Signed-off-by: Richard Weinberger <richard@nod.at>
2024-07-12mtd: ubi: avoid expensive do_div() on 32-bit machinesArnd Bergmann
The use of do_div() in ubi_nvmem_reg_read() makes calling it on 32-bit machines rather expensive. Since the 'from' variable is known to be a 32-bit quantity, it is clearly never needed and can be optimized into a regular division operation. Fixes: b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems") Fixes: 3ce485803da1 ("mtd: ubi: provide NVMEM layer over UBI volumes") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-07-12mtd: ubi: make ubi_class constantRicardo B. Marliere
Since commit 43a7206b0963 ("driver core: class: make class_register() take a const *"), the driver core allows for struct class to be in read-only memory, so move the ubi_class structure to be declared at build time placing it into read-only memory, instead of having to be dynamically allocated at boot time. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-07-12ubi: eba: properly rollback inside self_check_ebaFedor Pchelkin
In case of a memory allocation failure in the volumes loop we can only process the already allocated scan_eba and fm_eba array elements on the error path - others are still uninitialized. Found by Linux Verification Center (linuxtesting.org). Fixes: 00abf3041590 ("UBI: Add self_check_eba()") Cc: stable@vger.kernel.org Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-03-21Merge tag 'ubifs-for-linus-6.9-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull UBI and UBIFS updates from Richard Weinberger: "UBI: - Add Zhihao Cheng as reviewer - Attach via device tree - Add NVMEM layer - Various fastmap related fixes UBIFS: - Add Zhihao Cheng as reviewer - Convert to folios - Various fixes (memory leaks in error paths, function prototypes)" * tag 'ubifs-for-linus-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: (34 commits) mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems mtd: ubi: provide NVMEM layer over UBI volumes mtd: ubi: populate ubi volume fwnode mtd: ubi: introduce pre-removal notification for UBI volumes mtd: ubi: attach from device tree mtd: ubi: block: use notifier to create ubiblock from parameter dt-bindings: mtd: ubi-volume: allow UBI volumes to provide NVMEM dt-bindings: mtd: add basic bindings for UBI ubifs: Queue up space reservation tasks if retrying many times ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path ubifs: dbg_check_idx_size: Fix kmemleak if loading znode failed ubi: Correct the number of PEBs after a volume resize failure ubi: fix slab-out-of-bounds in ubi_eba_get_ldesc+0xfb/0x130 ubi: correct the calculation of fastmap size ubifs: Remove unreachable code in dbg_check_ltab_lnum ubifs: fix function pointer cast warnings ubifs: fix sort function prototype ubi: Check for too small LEB size in VTBL code MAINTAINERS: Add Zhihao Cheng as UBI/UBIFS reviewer ubifs: Convert populate_page() to take a folio ...
2024-03-10mtd: ubi: fix NVMEM over UBI volumes on 32-bit systemsDaniel Golle
A compiler warning related to sizeof(int) != 8 when calling do_div() is triggered when building on 32-bit platforms. Address this by using integer types having a well-defined size. Fixes: 3ce485803da1 ("mtd: ubi: provide NVMEM layer over UBI volumes") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Tested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25mtd: ubi: provide NVMEM layer over UBI volumesDaniel Golle
In an ideal world we would like UBI to be used where ever possible on a NAND chip. And with UBI support in ARM Trusted Firmware and U-Boot it is possible to achieve an (almost-)all-UBI flash layout. Hence the need for a way to also use UBI volumes to store board-level constants, such as MAC addresses and calibration data of wireless interfaces. Add UBI volume NVMEM driver module exposing UBI volumes as NVMEM providers. Allow UBI devices to have a "volumes" firmware subnode with volumes which may be compatible with "nvmem-cells". Access to UBI volumes via the NVMEM interface at this point is read-only, and it is slow, opening and closing the UBI volume for each access due to limitations of the NVMEM provider API. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25mtd: ubi: populate ubi volume fwnodeDaniel Golle
Look for the 'volumes' subnode of an MTD partition attached to a UBI device and attach matching child nodes to UBI volumes. This allows UBI volumes to be referenced in device tree, e.g. for use as NVMEM providers. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25mtd: ubi: introduce pre-removal notification for UBI volumesDaniel Golle
Introduce a new notification type UBI_VOLUME_SHUTDOWN to inform users that a volume is just about to be removed. This is needed because users (such as the NVMEM subsystem) expect that at the time their removal function is called, the parenting device is still available (for removal of sysfs nodes, for example, in case of NVMEM which otherwise WARNs on volume removal). Signed-off-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25mtd: ubi: attach from device treeDaniel Golle
Introduce device tree compatible 'linux,ubi' and attach compatible MTD devices using the MTD add notifier. This is needed for a UBI device to be available early at boot (and not only after late_initcall), so volumes on them can be used eg. as NVMEM providers for other drivers. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25mtd: ubi: block: use notifier to create ubiblock from parameterDaniel Golle
Use UBI_VOLUME_ADDED notification to create ubiblock device specified on kernel cmdline or module parameter. This makes thing more simple and has the advantage that ubiblock devices on volumes which are not present at the time the ubi module is probed will still be created. Suggested-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25ubi: Correct the number of PEBs after a volume resize failureZhaoLong Wang
In the error handling path `out_acc` of `ubi_resize_volume()`, when `pebs < 0`, it indicates that the volume table record failed to update when the volume was shrunk. In this case, the number of `ubi->avail_pebs` and `ubi->rsvd_pebs` should be restored to their previous values to prevent the UBI layer from reporting an incorrect number of available PEBs. Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25ubi: fix slab-out-of-bounds in ubi_eba_get_ldesc+0xfb/0x130Guo Xuenan
When using the ioctl interface to resize a UBI volume, `ubi_resize_volume` resizes the EBA table first but does not change `vol->reserved_pebs` in the same atomic context, which may cause concurrent access to the EBA table. For example, when a user shrinks UBI volume A by calling `ubi_resize_volume`, while another thread is writing to volume B and triggering wear-leveling, which may call `ubi_write_fastmap`, under these circumstances, KASAN may report a slab-out-of-bounds error in `ubi_eba_get_ldesc+0xfb/0x130`. This patch fixes race conditions in `ubi_resize_volume` and `ubi_update_fastmap` to avoid out-of-bounds reads of `eba_tbl`. First, it ensures that updates to `eba_tbl` and `reserved_pebs` are protected by `vol->volumes_lock`. Second, it implements a rollback mechanism in case of resize failure. It is also worth mentioning that for volume shrinkage failures, since part of the volume has already been shrunk and unmapped, there is no need to recover `{rsvd/avail}_pebs`. ================================================================== BUG: KASAN: slab-out-of-bounds in ubi_eba_get_ldesc+0xfb/0x130 [ubi] Read of size 4 at addr ffff88800f43f570 by task kworker/u16:0/7 CPU: 0 PID: 7 Comm: kworker/u16:0 Not tainted 5.16.0-rc7 #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Workqueue: writeback wb_workfn (flush-ubifs_0_0) Call Trace: <TASK> dump_stack_lvl+0x4d/0x66 print_address_description.constprop.0+0x41/0x60 kasan_report.cold+0x83/0xdf ubi_eba_get_ldesc+0xfb/0x130 [ubi] ubi_update_fastmap.cold+0x60f/0xc7d [ubi] ubi_wl_get_peb+0x25b/0x4f0 [ubi] try_write_vid_and_data+0x9a/0x4d0 [ubi] ubi_eba_write_leb+0x7e4/0x17d0 [ubi] ubi_leb_map+0x1a0/0x2c0 [ubi] ubifs_leb_map+0x139/0x270 [ubifs] ubifs_add_bud_to_log+0xb40/0xf30 [ubifs] make_reservation+0x86e/0xb00 [ubifs] ubifs_jnl_write_data+0x430/0x9d0 [ubifs] do_writepage+0x1d1/0x550 [ubifs] ubifs_writepage+0x37c/0x670 [ubifs] __writepage+0x67/0x170 write_cache_pages+0x259/0xa90 do_writepages+0x277/0x5d0 __writeback_single_inode+0xb8/0x850 writeback_sb_inodes+0x4b3/0xb20 __writeback_inodes_wb+0xc1/0x220 wb_writeback+0x59f/0x740 wb_workfn+0x6d0/0xca0 process_one_work+0x711/0xfc0 worker_thread+0x95/0xd00 kthread+0x3a6/0x490 ret_from_fork+0x1f/0x30 </TASK> Allocated by task 711: kasan_save_stack+0x1e/0x50 __kasan_kmalloc+0x81/0xa0 ubi_eba_create_table+0x88/0x1a0 [ubi] ubi_resize_volume.cold+0x175/0xae7 [ubi] ubi_cdev_ioctl+0x57f/0x1a60 [ubi] __x64_sys_ioctl+0x13a/0x1c0 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Last potentially related work creation: kasan_save_stack+0x1e/0x50 __kasan_record_aux_stack+0xb7/0xc0 call_rcu+0xd6/0x1000 blk_stat_free_callback+0x28/0x30 blk_release_queue+0x8a/0x2e0 kobject_put+0x186/0x4c0 scsi_device_dev_release_usercontext+0x620/0xbd0 execute_in_process_context+0x2f/0x120 device_release+0xa4/0x240 kobject_put+0x186/0x4c0 put_device+0x20/0x30 __scsi_remove_device+0x1c3/0x300 scsi_probe_and_add_lun+0x2140/0x2eb0 __scsi_scan_target+0x1f2/0xbb0 scsi_scan_channel+0x11b/0x1a0 scsi_scan_host_selected+0x24c/0x310 do_scsi_scan_host+0x1e0/0x250 do_scan_async+0x45/0x490 async_run_entry_fn+0xa2/0x530 process_one_work+0x711/0xfc0 worker_thread+0x95/0xd00 kthread+0x3a6/0x490 ret_from_fork+0x1f/0x30 The buggy address belongs to the object at ffff88800f43f500 which belongs to the cache kmalloc-128 of size 128 The buggy address is located 112 bytes inside of 128-byte region [ffff88800f43f500, ffff88800f43f580) The buggy address belongs to the page: page:ffffea00003d0f00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xf43c head:ffffea00003d0f00 order:2 compound_mapcount:0 compound_pincount:0 flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff) raw: 001fffff80010200 ffffea000046ba08 ffffea0000457208 ffff88810004d1c0 raw: 0000000000000000 0000000000190019 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88800f43f400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88800f43f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc > ffff88800f43f500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc ^ ffff88800f43f580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88800f43f600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc The following steps can used to reproduce: Process 1: write and trigger ubi wear-leveling ubimkvol /dev/ubi0 -s 5000MiB -N v1 ubimkvol /dev/ubi0 -s 2000MiB -N v2 ubimkvol /dev/ubi0 -s 10MiB -N v3 mount -t ubifs /dev/ubi0_0 /mnt/ubifs while true; do filename=/mnt/ubifs/$((RANDOM)) dd if=/dev/random of=${filename} bs=1M count=$((RANDOM % 1000)) rm -rf ${filename} sync /mnt/ubifs/ done Process 2: do random resize struct ubi_rsvol_req req; req.vol_id = 1; req.bytes = (rand() % 50) * 512KB; ioctl(fd, UBI_IOCRSVOL, &req); V3: - Fix the commit message error. V2: - Add volumes_lock in ubi_eba_copy_leb() to avoid race caused by updating eba_tbl. V1: - Rebase the patch on the latest mainline. Signed-off-by: Guo Xuenan <guoxuenan@huawei.com> Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25ubi: correct the calculation of fastmap sizeZhang Yi
Now that the calculation of fastmap size in ubi_calc_fm_size() is incorrect since it miss each user volume's ubi_fm_eba structure and the Internal UBI volume info. Let's correct the calculation. Cc: stable@vger.kernel.org Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25ubi: Check for too small LEB size in VTBL codeRichard Weinberger
If the LEB size is smaller than a volume table record we cannot have volumes. In this case abort attaching. Cc: Chenyuan Yang <cy54@illinois.edu> Cc: stable@vger.kernel.org Fixes: 801c135ce73d ("UBI: Unsorted Block Images") Reported-by: Chenyuan Yang <cy54@illinois.edu> Closes: https://lore.kernel.org/linux-mtd/1433EB7A-FC89-47D6-8F47-23BE41B263B3@illinois.edu/ Signed-off-by: Richard Weinberger <richard@nod.at> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
2024-02-19ubiblock: pass queue_limits to blk_mq_alloc_diskChristoph Hellwig
Pass the few limits ubiblock imposes directly to blk_mq_alloc_disk instead of setting them one at a time. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Link: https://lore.kernel.org/r/20240215070300.2200308-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-02-13block: pass a queue_limits argument to blk_mq_alloc_diskChristoph Hellwig
Pass a queue_limits to blk_mq_alloc_disk and apply it if non-NULL. This will allow allocating queues with valid queue limits instead of setting the values one at a time later. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20240213073425.1621680-11-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-06ubi: block: fix memleak in ubiblock_create()Li Nan
If idr_alloc() fails, dev->gd will be put after goto out_cleanup_disk in ubiblock_create(), but dev->gd has not been assigned yet at this time, and 'gd' will not be put anymore. Fix it by putting 'gd' directly. Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-06ubi: Reserve sufficient buffer length for the input maskZhaoLong Wang
Because the mask received by the emulate_failures interface is a 32-bit unsigned integer, ensure that there is sufficient buffer length to receive and display this value. Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-06ubi: Add six fault injection type for testingZhaoLong Wang
This commit adds six fault injection type for testing to cover the abnormal path of the UBI driver. Inject the following faults when the UBI reads the LEB: +----------------------------+-----------------------------------+ | Interface name | emulate behavior | +----------------------------+-----------------------------------+ | emulate_eccerr | ECC error | +----------------------------+-----------------------------------+ | emulate_read_failure | read failure | |----------------------------+-----------------------------------+ | emulate_io_ff | read content as all FF | |----------------------------+-----------------------------------+ | emulate_io_ff_bitflips | content FF with MTD err reported | +----------------------------+-----------------------------------+ | emulate_bad_hdr | bad leb header | |----------------------------+-----------------------------------+ | emulate_bad_hdr_ebadmsg | bad header with ECC err | +----------------------------+-----------------------------------+ Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-06ubi: Split io_failures into write_failure and erase_failureZhaoLong Wang
The emulate_io_failures debugfs entry controls both write failure and erase failure. This patch split io_failures to write_failure and erase_failure. Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-06ubi: Use the fault injection framework to enhance the fault injection capabilityZhaoLong Wang
To make debug parameters configurable at run time, use the fault injection framework to reconstruct the debugfs interface, and retain the legacy fault injection interface. Now, the file emulate_failures and fault_attr files control whether to enable fault emmulation. The file emulate_failures receives a mask that controls type and process of fault injection. Generally, for ease of use, you can directly enter a mask with all 1s. echo 0xffff > /sys/kernel/debug/ubi/ubi0/emulate_failures And you need to configure other fault-injection capabilities for testing purpose: echo 100 > /sys/kernel/debug/ubi/fault_inject/emulate_power_cut/probability echo 15 > /sys/kernel/debug/ubi/fault_inject/emulate_power_cut/space echo 2 > /sys/kernel/debug/ubi/fault_inject/emulate_power_cut/verbose echo -1 > /sys/kernel/debug/ubi/fault_inject/emulate_power_cut/times The CONFIG_MTD_UBI_FAULT_INJECTION to enable the Fault Injection is added to kconfig. Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28ubi: block: Fix use-after-free in ubiblock_cleanupZhaoLong Wang
The following BUG is reported when a ubiblock is removed: ================================================================== BUG: KASAN: slab-use-after-free in ubiblock_cleanup+0x88/0xa0 [ubi] Read of size 4 at addr ffff88810c8f3804 by task ubiblock/1716 CPU: 5 PID: 1716 Comm: ubiblock Not tainted 6.6.0-rc2+ #135 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x37/0x50 print_report+0xd0/0x620 kasan_report+0xb6/0xf0 ubiblock_cleanup+0x88/0xa0 [ubi] ubiblock_remove+0x121/0x190 [ubi] vol_cdev_ioctl+0x355/0x630 [ubi] __x64_sys_ioctl+0xc7/0x100 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 RIP: 0033:0x7f08d7445577 Code: b3 66 90 48 8b 05 11 89 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 8 RSP: 002b:00007ffde05a3018 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00000000ffffffff RCX: 00007f08d7445577 RDX: 0000000000000000 RSI: 0000000000004f08 RDI: 0000000000000003 RBP: 0000000000816010 R08: 00000000008163a7 R09: 0000000000000000 R10: 0000000000000003 R11: 0000000000000206 R12: 0000000000000003 R13: 00007ffde05a3130 R14: 0000000000000000 R15: 0000000000000000 </TASK> Allocated by task 1715: kasan_save_stack+0x22/0x50 kasan_set_track+0x25/0x30 __kasan_kmalloc+0x7f/0x90 __alloc_disk_node+0x40/0x2b0 __blk_mq_alloc_disk+0x3e/0xb0 ubiblock_create+0x2ba/0x620 [ubi] vol_cdev_ioctl+0x581/0x630 [ubi] __x64_sys_ioctl+0xc7/0x100 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Freed by task 0: kasan_save_stack+0x22/0x50 kasan_set_track+0x25/0x30 kasan_save_free_info+0x2b/0x50 __kasan_slab_free+0x10e/0x190 __kmem_cache_free+0x96/0x220 bdev_free_inode+0xa4/0xf0 rcu_core+0x496/0xec0 __do_softirq+0xeb/0x384 The buggy address belongs to the object at ffff88810c8f3800 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 4 bytes inside of freed 1024-byte region [ffff88810c8f3800, ffff88810c8f3c00) The buggy address belongs to the physical page: page:00000000d03de848 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10c8f0 head:00000000d03de848 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x200000000000840(slab|head|node=0|zone=2) page_type: 0xffffffff() raw: 0200000000000840 ffff888100042dc0 ffffea0004244400 dead000000000002 raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88810c8f3700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88810c8f3780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88810c8f3800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88810c8f3880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88810c8f3900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Fix it by using a local variable to record the gendisk ID. Fixes: 77567b25ab9f ("ubi: use blk_mq_alloc_disk and blk_cleanup_disk") Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28ubi: fastmap: Add control in 'UBI_IOCATT' ioctl to reserve PEBs for filling ↵Zhihao Cheng
pools This patch imports a new field 'need_resv_pool' in struct 'ubi_attach_req' to control whether or not reserving free PEBs for filling pool/wl_pool. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28ubi: fastmap: Add module parameter to control reserving filling pool PEBsZhihao Cheng
Adding 6th module parameter in 'mtd=xxx' to control whether or not reserving PEBs for filling pool/wl_pool. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28ubi: fastmap: Fix lapsed wear leveling for first 64 PEBsZhihao Cheng
The anchor PEB must be picked from first 64 PEBs, these PEBs could have large erase counter greater than other PEBs especially when free space is nearly running out. The ubi_update_fastmap will be called as long as pool/wl_pool is empty, old anchor PEB is erased when updating fastmap. Given an UBI device with N PEBs, free PEBs is nearly running out and pool will be filled with 1 PEB every time ubi_update_fastmap invoked. So t=N/POOL_SIZE[1]/64 means that in worst case the erase counter of first 64 PEBs is t times greater than other PEBs in theory. After running fsstress for 24h, the erase counter statistics for two UBI devices shown as follow(CONFIG_MTD_UBI_WL_THRESHOLD=128): Device A(1024 PEBs, pool=50, wl_pool=25): ========================================================= from to count min avg max --------------------------------------------------------- 0 .. 9: 0 0 0 0 10 .. 99: 0 0 0 0 100 .. 999: 0 0 0 0 1000 .. 9999: 0 0 0 0 10000 .. 99999: 960 29224 29282 29362 100000 .. inf: 64 117897 117934 117940 --------------------------------------------------------- Total : 1024 29224 34822 117940 Device B(8192 PEBs, pool=256, wl_pool=128): ========================================================= from to count min avg max --------------------------------------------------------- 0 .. 9: 0 0 0 0 10 .. 99: 0 0 0 0 100 .. 999: 0 0 0 0 1000 .. 9999: 8128 2253 2321 2387 10000 .. 99999: 64 35387 35387 35388 100000 .. inf: 0 0 0 0 --------------------------------------------------------- Total : 8192 2253 2579 35388 The key point is reducing fastmap updating frequency by enlarging POOL_SIZE, so let UBI reserve ubi->fm_pool.max_size PEBs during attaching. Then POOL_SIZE will become ubi->fm_pool.max_size/2 even in free space running out case. Given an UBI device with 8192 PEBs(16384\8192\4096 is common large-capacity flash), t=8192/128/64=1. The fastmap updating will happen in either wl_pool or pool is empty, so setting fm_pool_rsv_cnt as ubi->fm_pool.max_size can fill wl_pool in full state. After pool reservation, running fsstress for 24h: Device A(1024 PEBs, pool=50, wl_pool=25): ========================================================= from to count min avg max --------------------------------------------------------- 0 .. 9: 0 0 0 0 10 .. 99: 0 0 0 0 100 .. 999: 0 0 0 0 1000 .. 9999: 0 0 0 0 10000 .. 99999: 1024 33801 33997 34056 100000 .. inf: 0 0 0 0 --------------------------------------------------------- Total : 1024 33801 33997 34056 Device B(8192 PEBs, pool=256, wl_pool=128): ========================================================= from to count min avg max --------------------------------------------------------- 0 .. 9: 0 0 0 0 10 .. 99: 0 0 0 0 100 .. 999: 0 0 0 0 1000 .. 9999: 8192 2205 2397 2460 10000 .. 99999: 0 0 0 0 100000 .. inf: 0 0 0 0 --------------------------------------------------------- Total : 8192 2205 2397 2460 The difference of erase counter between first 64 PEBs and others is under WL_FREE_MAX_DIFF(2*UBI_WL_THRESHOLD=2*128=256). Device A: 34056 - 33801 = 255 Device B: 2460 - 2205 = 255 Next patch will add a switch to control whether UBI needs to reserve PEBs for filling pool. Fixes: dbb7d2a88d2a ("UBI: Add fastmap core") Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28ubi: fastmap: Get wl PEB even ec beyonds the 'max' if free PEBs are run outZhihao Cheng
This is the part 2 to fix cyclically reusing single fastmap data PEBs. Consider one situation, if there are four free PEBs for fm_anchor, pool, wl_pool and fastmap data PEB with erase counter 100, 100, 100, 5096 (ubi->beb_rsvd_pebs is 0). PEB with erase counter 5096 is always picked for fastmap data according to the realization of find_wl_entry(), since fastmap data PEB is not scheduled for wl, finally there are two PEBs (fm data) with great erase counter than other PEBS. Get wl PEB even its erase counter exceeds the 'max' in find_wl_entry() when free PEBs are run out after filling pools and fm data. Then the PEB with biggest erase conter is taken as wl PEB, it can be scheduled for wl. Fixes: dbb7d2a88d2a ("UBI: Add fastmap core") Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28ubi: fastmap: may_reserve_for_fm: Don't reserve PEB if fm_anchor existsZhihao Cheng
This is the part 1 to fix cyclically reusing single fastmap data PEBs. After running fsstress on UBIFS for a while, UBI (16384 blocks, fastmap takes 2 blocks) has an erase block(PEB: 8031) with big erase counter greater than any other pebs: ========================================================= from to count min avg max --------------------------------------------------------- 0 .. 9: 0 0 0 0 10 .. 99: 532 84 92 99 100 .. 999: 15787 100 147 229 1000 .. 9999: 64 4699 4765 4826 10000 .. 99999: 0 0 0 0 100000 .. inf: 1 272935 272935 272935 --------------------------------------------------------- Total : 16384 84 180 272935 Not like fm_anchor, there is no candidate PEBs for fastmap data area, so old fastmap data pebs will be reused after all free pebs are filled into pool/wl_pool: ubi_update_fastmap for (i = 1; i < new_fm->used_blocks; i++) erase_block(ubi, old_fm->e[i]->pnum) new_fm->e[i] = old_fm->e[i] According to wear leveling algorithm, UBI selects one small erase counter PEB from ubi->used and one big erase counter PEB from wl_pool, the reused fastmap data PEB is not in these trees. UBI won't schedule this PEB for wl even it is in ubi->used because wl algorithm expects small erase counter for used PEB. Don't reserve PEB for fastmap in may_reserve_for_fm() if fm_anchor already exists. Otherwise, when UBI is running out of free PEBs, the only one free PEB (pnum < 64) will be skipped and fastmap data will be written on the same old PEB. Fixes: dbb7d2a88d2a ("UBI: Add fastmap core") Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28ubi: fastmap: Remove unneeded break condition while filling poolsZhihao Cheng
Change pool filling stop condition. Commit d09e9a2bddba ("ubi: fastmap: Fix high cpu usage of ubi_bgt by making sure wl_pool not empty") reserves fastmap data PEBs after filling 1 PEB in wl_pool. Now wait_free_pebs_for_pool() makes enough free PEBs before filling pool, there will still be at least 1 PEB in pool and 1 PEB in wl_pool after doing ubi_refill_pools(). Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28ubi: fastmap: Wait until there are enough free PEBs before filling poolsZhihao Cheng
Wait until there are enough free PEBs before filling pool/wl_pool, sometimes erase_worker is not scheduled in time, which causes two situations: A. There are few PEBs filled in pool, which makes ubi_update_fastmap is frequently called and leads first 64 PEBs are erased more times than other PEBs. So waiting free PEBs before filling pool reduces fastmap updating frequency and prolongs flash service life. B. In situation that space is nearly running out, ubi_refill_pools() cannot make sure pool and wl_pool are filled with free PEBs, caused by the delay of erase_worker. After this patch applied, there must exist free PEBs in pool after one call of ubi_update_fastmap. Besides, this patch is a preparetion for fixing large erase counter in fastmap data block and fixing lapsed wear leveling for first 64 PEBs. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28ubi: fastmap: Use free pebs reserved for bad block handlingZhihao Cheng
If new bad PEBs occur, UBI firstly consumes ubi->beb_rsvd_pebs, and then ubi->avail_pebs, finally UBI becomes read-only if above two items are 0, which means that the amount of PEBs for user volumes is not effected. Besides, UBI reserves count of free PBEs is ubi->beb_rsvd_pebs while filling wl pool or getting free PEBs, but ubi->avail_pebs is not reserved. So ubi->beb_rsvd_pebs and ubi->avail_pebs have nothing to do with the usage of free PEBs, UBI can use all free PEBs. Commit 78d6d497a648 ("UBI: Move fastmap specific functions out of wl.c") has removed beb_rsvd_pebs checking while filling pool. Now, don't reserve ubi->beb_rsvd_pebs while filling wl_pool. This will fill more PEBs in pool and also reduce fastmap updating frequency. Also remove beb_rsvd_pebs checking in ubi_wl_get_fm_peb. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28ubi: Replace erase_block() with sync_erase()Zhihao Cheng
Since erase_block() has same logic with sync_erase(), just replace it with sync_erase(), also rename 'sync_erase()' to 'ubi_sync_erase()'. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>