summaryrefslogtreecommitdiff
path: root/scripts/lib/kdoc/kdoc_files.py
diff options
context:
space:
mode:
authorNilay Shroff <nilay@linux.ibm.com>2025-07-24 15:31:51 +0530
committerJens Axboe <axboe@kernel.dk>2025-07-25 06:10:02 -0600
commit5989bfe6ac6bf230c2c84e118c786be0ed4be3f4 (patch)
treeb19d54652f6e206df5c3b27d9b6aea8971227167 /scripts/lib/kdoc/kdoc_files.py
parent5ec9d26b78c4eb7c2fab54dcec6c0eb845302a98 (diff)
block: restore two stage elevator switch while running nr_hw_queue update
The kmemleak reports memory leaks related to elevator resources that were originally allocated in the ->init_hctx() method. The following leak traces are observed after running blktests block/040: unreferenced object 0xffff8881b82f7400 (size 512): comm "check", pid 68454, jiffies 4310588881 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 5bac8b34): __kvmalloc_node_noprof+0x55d/0x7a0 sbitmap_init_node+0x15a/0x6a0 kyber_init_hctx+0x316/0xb90 blk_mq_init_sched+0x419/0x580 elevator_switch+0x18b/0x630 elv_update_nr_hw_queues+0x219/0x2c0 __blk_mq_update_nr_hw_queues+0x36a/0x6f0 blk_mq_update_nr_hw_queues+0x3a/0x60 0xffffffffc09ceb80 0xffffffffc09d7e0b configfs_write_iter+0x2b1/0x470 vfs_write+0x527/0xe70 ksys_write+0xff/0x200 do_syscall_64+0x98/0x3c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e unreferenced object 0xffff8881b82f6000 (size 512): comm "check", pid 68454, jiffies 4310588881 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 5bac8b34): __kvmalloc_node_noprof+0x55d/0x7a0 sbitmap_init_node+0x15a/0x6a0 kyber_init_hctx+0x316/0xb90 blk_mq_init_sched+0x419/0x580 elevator_switch+0x18b/0x630 elv_update_nr_hw_queues+0x219/0x2c0 __blk_mq_update_nr_hw_queues+0x36a/0x6f0 blk_mq_update_nr_hw_queues+0x3a/0x60 0xffffffffc09ceb80 0xffffffffc09d7e0b configfs_write_iter+0x2b1/0x470 vfs_write+0x527/0xe70 ksys_write+0xff/0x200 do_syscall_64+0x98/0x3c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e unreferenced object 0xffff8881b82f5800 (size 512): comm "check", pid 68454, jiffies 4310588881 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 5bac8b34): __kvmalloc_node_noprof+0x55d/0x7a0 sbitmap_init_node+0x15a/0x6a0 kyber_init_hctx+0x316/0xb90 blk_mq_init_sched+0x419/0x580 elevator_switch+0x18b/0x630 elv_update_nr_hw_queues+0x219/0x2c0 __blk_mq_update_nr_hw_queues+0x36a/0x6f0 blk_mq_update_nr_hw_queues+0x3a/0x60 0xffffffffc09ceb80 0xffffffffc09d7e0b configfs_write_iter+0x2b1/0x470 vfs_write+0x527/0xe70 ksys_write+0xff/0x200 do_syscall_64+0x98/0x3c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e The issue arises while we run nr_hw_queue update, Specifically, we first reallocate hardware contexts (hctx) via __blk_mq_realloc_hw_ctxs(), and then later invoke elevator_switch() (assuming q->elevator is not NULL). The elevator switch code would first exit old elevator (elevator_exit) and then switches to the new elevator. The elevator_exit loops through each hctx and invokes the elevator’s per-hctx exit method ->exit_hctx(), which releases resources allocated during ->init_hctx(). This memleak manifests when we reduce the num of h/w queues - for example, when the initial update sets the number of queues to X, and a later update reduces it to Y, where Y < X. In this case, we'd loose the access to old hctxs while we get to elevator exit code because __blk_mq_realloc_hw_ctxs would have already released the old hctxs. As we don't now have any reference left to the old hctxs, we don't have any way to free the scheduler resources (which are allocate in ->init_hctx()) and kmemleak complains about it. This issue was caused due to the commit 596dce110b7d ("block: simplify elevator reattachment for updating nr_hw_queues"). That change unified the two-stage elevator teardown and reattachment into a single call that occurs after __blk_mq_realloc_hw_ctxs() has already freed the hctxs. This patch restores the previous two-stage elevator switch logic during nr_hw_queues updates. First, the elevator is switched to 'none', which ensures all scheduler resources are properly freed. Then, the hardware contexts (hctxs) are reallocated, and the software-to-hardware queue mappings are updated. Finally, the original elevator is reattached. This sequence prevents loss of references to old hctxs and avoids the scheduler resource leaks reported by kmemleak. Reported-by : Yi Zhang <yi.zhang@redhat.com> Fixes: 596dce110b7d ("block: simplify elevator reattachment for updating nr_hw_queues") Closes: https://lore.kernel.org/all/CAHj4cs8oJFvz=daCvjHM5dYCNQH4UXwSySPPU4v-WHce_kZXZA@mail.gmail.com/ Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250724102540.1366308-1-nilay@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
Diffstat (limited to 'scripts/lib/kdoc/kdoc_files.py')
0 files changed, 0 insertions, 0 deletions