summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-11-20ALSA: usb-audio: Fix potential out-of-bound accesses for Extigy and Mbox devicesBenoît Sevens
A bogus device can provide a bNumConfigurations value that exceeds the initial value used in usb_get_configuration for allocating dev->config. This can lead to out-of-bounds accesses later, e.g. in usb_destroy_configuration. Signed-off-by: Benoît Sevens <bsevens@google.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@kernel.org Link: https://patch.msgid.link/20241120124144.3814457-1-bsevens@google.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-11-20ALSA: hda/realtek: Update ALC256 depop procedureKailang Yang
Old procedure has a chance to meet Headphone no output. Fixes: 4a219ef8f370 ("ALSA: hda/realtek - Add ALC256 HP depop function") Signed-off-by: Kailang Yang <kailang@realtek.com> Link: https://lore.kernel.org/463c5f93715d4714967041a0a8cec28e@realtek.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-11-20ALSA: ac97: bus: Fix the mistake in the commentPei Xiao
Fix mistake in the comment. sound/ac97/bus.c:192: warning: Function parameter or member 'drv' not described in 'snd_ac97_codec_driver_register' sound/ac97/bus.c:192: warning: Excess function parameter 'dev' description in 'snd_ac97_codec_driver_register' sound/ac97/bus.c:205: warning: Function parameter or member 'drv' not described in 'snd_ac97_codec_driver_unregister' sound/ac97/bus.c:205: warning: Excess function parameter 'dev' description in 'snd_ac97_codec_driver_unregister' sound/ac97/bus.c:351: warning: Function parameter or member 'codecs_pdata' not described in 'snd_ac97_controller_register' Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202411180804.FUfdymYO-lkp@intel.com/ Fixes: 74426fbff66e ("ALSA: ac97: add an ac97 bus") Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn> Link: https://patch.msgid.link/3990bfc8cd47637908eaa179802c1d91459d829b.1732083924.git.xiaopei01@kylinos.cn Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-11-20dm-verity: remove the unused "data_start" variableMikulas Patocka
Remove the unused "data_start" variable. It is always set to zero and the user can't override it. If the user needs to use some existing offset within a block device, it is possible to use the linear target. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm-bufio: use kmalloc to allocate power-of-two sized buffersMikulas Patocka
Vlastimil Babka said [1] that kmalloc will return a power-of-two-aligned buffer if it was called with a power-of-two size. So, we can use kmalloc instead of our own slab cache in dm-bufio. Note that the code for the slab cache was not removed because dm-bufio supports non-power-of-two buffer sizes. Link: https://lore.kernel.org/linux-mm/e7fca292-7c79-4f97-a90c-d68178d8ca59@suse.cz/ [1] Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm thin: Add missing destroy_work_on_stack()Yuan Can
This commit add missed destroy_work_on_stack() operations for pw->worker in pool_work_wait(). Fixes: e7a3e871d895 ("dm thin: cleanup noflush_work to use a proper completion") Cc: stable@vger.kernel.org Signed-off-by: Yuan Can <yuancan@huawei.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm: add support for get_unique_idBenjamin Coddington
This adds support to obtain a device's unique id through dm, similar to the existing ioctl and persistent resevation handling. We limit this to single-target devices. This enables knfsd to export pNFS SCSI luns that have been exported from multipath devices. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-11-20dm vdo: fix function doc comment formattingMatthew Sakai
Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm vdo int-map: remove unused parametersMatthew Sakai
Remove __always_unused parameters from static functions. Also fix minor formatting issues. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202407141607.M3E2XQ0Z-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202409101018.B75pIBKR-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202410011107.U2xbVLRA-lkp@intel.com/ Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm-vdo: reset bi_ioprio to the default value when the bio is resetSusan LeGendre-McGhee
Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm-vdo murmurhash: remove u64 alignment requirementSusan LeGendre-McGhee
Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm: Fix typo in error messageSsuhung Yeh
Remove the redundant "i" at the beginning of the error message. This "i" came from commit 1c1318866928 ("dm: prefer '"%s...", __func__'"), the "i" is accidentally left. Signed-off-by: Ssuhung Yeh <ssuhung@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Fixes: 1c1318866928 ("dm: prefer '"%s...", __func__'") Cc: stable@vger.kernel.org # v6.3+
2024-11-20dm ioctl: rate limit a couple of ioctl based error messagesColin Ian King
It is possible to spam the kernel log with a misbehaving user process that is passing incorrect dm ioctls to /dev/mapper/control. Use a rate limit on these error messages to reduce the noise. These errors were hit when running the stress-ng's device test. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm vdo: Remove unused uds_compute_index_sizeDr. David Alan Gilbert
uds_compute_index_size() has been unused since it was added in commit b46d79bdb82a ("dm vdo: add deduplication index storage interface") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm vdo: Remove unused functionsDr. David Alan Gilbert
get_data_vio_pool_active_discards() get_data_vio_pool_discard_limit() get_data_vio_pool_maximum_discards() set_data_vio_pool_discard_limit() are all unused since commit a9da0fb6d8c6 ("dm vdo: remove all sysfs interfaces") Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm: zoned: Remove unused functionsDr. David Alan Gilbert
dmz_resume_metadata() is unused since it was added in commit 3b1a94c88b79 ("dm zoned: drive-managed zoned block device target") dmz_zone_nr_blocks_shift is unused since it was added in commit 368205601375 ("dm zoned: move fields from struct dmz_dev to dmz_metadata") Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm: Remove unused dm_table_bio_basedDr. David Alan Gilbert
dm_table_bio_based() is unused since commit 29dec90a0f1d ("dm: fix bio_set allocation") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm: Remove unused dm_set_md_typeDr. David Alan Gilbert
dm_set_md_type() has been unused since commit ba30585936b0 ("dm: move setting md->type into dm_setup_md_queue") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm cache: Remove unused functions in bio-prison-v1Dr. David Alan Gilbert
dm_cache_size() and dm_cache_dump() are unused since commit b29d4986d0da ("dm cache: significant rework to leverage dm-bio-prison-v2") Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm cache: Remove unused dm_cache_sizeDr. David Alan Gilbert
dm_cache_size() has been unused since the original commit c6b4fcbad044 ("dm: add cache target") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm cache: Remove unused dm_cache_dumpDr. David Alan Gilbert
dm_cache_dump() has been unused since the original commit c6b4fcbad044 ("dm: add cache target") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20dm cache: Remove unused btracker_nr_writebacks_queuedDr. David Alan Gilbert
btracker_nr_writebacks_queued() has been unused since commit 2e63309507c8 ("dm cache policy smq: don't do any writebacks unless IDLE") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-11-20ovl: Filter invalid inodes with missing lookup functionVasiliy Kovalev
Add a check to the ovl_dentry_weird() function to prevent the processing of directory inodes that lack the lookup function. This is important because such inodes can cause errors in overlayfs when passed to the lowerstack. Reported-by: syzbot+a8c9d476508bd14a90e5@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=a8c9d476508bd14a90e5 Suggested-by: Miklos Szeredi <miklos@szeredi.hu> Link: https://lore.kernel.org/linux-unionfs/CAJfpegvx-oS9XGuwpJx=Xe28_jzWx5eRo1y900_ZzWY+=gGzUg@mail.gmail.com/ Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org> Cc: <stable@vger.kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2024-11-20selftests/mount_setattr: Fix failures on 64K PAGE_SIZE kernelsMichael Ellerman
Currently the mount_setattr_test fails on machines with a 64K PAGE_SIZE, with errors such as: # RUN mount_setattr_idmapped.invalid_fd_negative ... mkfs.ext4: No space left on device while writing out and closing file system # mount_setattr_test.c:1055:invalid_fd_negative:Expected system("mkfs.ext4 -q /mnt/C/ext4.img") (256) == 0 (0) # invalid_fd_negative: Test terminated by assertion # FAIL mount_setattr_idmapped.invalid_fd_negative not ok 12 mount_setattr_idmapped.invalid_fd_negative The code creates a 100,000 byte tmpfs: ASSERT_EQ(mount("testing", "/mnt", "tmpfs", MS_NOATIME | MS_NODEV, "size=100000,mode=700"), 0); And then a little later creates a 2MB ext4 filesystem in that tmpfs: ASSERT_EQ(ftruncate(img_fd, 1024 * 2048), 0); ASSERT_EQ(system("mkfs.ext4 -q /mnt/C/ext4.img"), 0); At first glance it seems like that should never work, after all 2MB is larger than 100,000 bytes. However the filesystem image doesn't actually occupy 2MB on "disk" (actually RAM, due to tmpfs). On 4K kernels the ext4.img uses ~84KB of actual space (according to du), which just fits. However on 64K PAGE_SIZE kernels the ext4.img takes at least 256KB, which is too large to fit in the tmpfs, hence the errors. It seems fraught to rely on the ext4.img taking less space on disk than the allocated size, so instead create the tmpfs with a size of 2MB. With that all 21 tests pass on 64K PAGE_SIZE kernels. Fixes: 01eadc8dd96d ("tests: add mount_setattr() selftests") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20241115134114.1219555-1-mpe@ellerman.id.au Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-19Input: mpr121 - use devm_regulator_get_enable_read_voltage()David Lechner
We can reduce boilerplate code by using devm_regulator_get_enable_read_voltage(). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20241119-input-mpr121-regulator-get-enable-read-voltage-v3-1-1d8ee5c22f6c@baylibre.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2024-11-19Input: sun4i-lradc-keys - don't include 'pm_wakeup.h' directlyWolfram Sang
The header clearly states that it does not want to be included directly, only via 'device.h'. 'platform_device.h' works equally well. Remove the direct inclusion. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20241118072917.3853-7-wsa+renesas@sang-engineering.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2024-11-19Input: spear-keyboard - don't include 'pm_wakeup.h' directlyWolfram Sang
The header clearly states that it does not want to be included directly, only via 'device.h'. 'platform_device.h' works equally well. Remove the direct inclusion. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20241118072917.3853-6-wsa+renesas@sang-engineering.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2024-11-19block: return bool from get_disk_ro and bdev_read_onlyChristoph Hellwig
get_disk_ro and bdev_read_only return boolean conditions, don't masquerade them as int. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20241119160932.1327864-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-19block: remove a duplicate definition for bdev_read_onlyChristoph Hellwig
bdev_read_only is already defined as an inline function in blkdev.h. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20241119160932.1327864-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-19block: return bool from blk_rq_alignedChristoph Hellwig
blk_rq_aligned returns a boolean condition, don't mascquerade it as int. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20241119160932.1327864-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-19block: return unsigned int from blk_lim_dma_alignment_and_padChristoph Hellwig
The underlying limits are defined as unsigned int, so return that from blk_lim_dma_alignment_and_pad as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20241119160932.1327864-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-19block: return unsigned int from queue_dma_alignmentChristoph Hellwig
The underlying limit is defined as an unsigned int, so return that from queue_dma_alignment as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20241119160932.1327864-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-19block: return unsigned int from bdev_io_optChristoph Hellwig
The underlying limit is defined as an unsigned int, so return that from bdev_io_opt as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20241119160932.1327864-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-19block: req->bio is always set in the merge codeChristoph Hellwig
As smatch, which is a lot smarter than me noticed. So remove the checks for it, and condense these checks a bit including the comments stating the obvious. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20241119161157.1328171-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-19block: don't bother checking the data direction for mergesChristoph Hellwig
Because it already is encoded in the opcode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20241119161157.1328171-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-19block: blk-mq: fix uninit-value in blk_rq_prep_clone and refactorSuraj Sonawane
Fix an issue detected by the `smatch` tool: block/blk-mq.c:3314 blk_rq_prep_clone() error: uninitialized symbol 'bio'. This patch refactors `blk_rq_prep_clone()` to improve code readability and ensure safety by addressing potential misuse of the `bio` variable: - Move the bio_put(bio); call to the bio_ctr error handling block, which is the only place where it can be triggered. - Move the bio variable into the __rq_for_each_bio loop scope. This change removes the need to set bio to NULL at the loop's end. discussion on why bio remains uninitialized: https://lore.kernel.org/lkml/20241004141037.43277-1-surajsonawane0215@gmail.com Summary of above discussion: - I pointed out that `bio` can remain uninitialized if the allocation with `bio_alloc_clone` fails. - Keith Busch explained that `bio` is initialized to `NULL` when `bio_alloc_clone()` fails, preventing uninitialized usage. - John Garry questioned whether `rq_src->bio` being `NULL` could leave `bio` uninitialized. Keith clarified that in such cases, `bio` is not referenced, so it does not need initialization. - Christoph Hellwig recommended code improvements: - move the bio_put to the bio_ctr error handling, which is the only case where it can happen - move the bio variable into the __rq_for_each_bio scope, which also removed the need to zero it at the end of the loop These changes enhance code clarity, address static analysis tool warnings, and make the function more maintainable. thread of previous version patch discussion: https://lore.kernel.org/lkml/20241004100842.9052-1-surajsonawane0215@gmail.com Signed-off-by: Suraj Sonawane <surajsonawane0215@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241119164412.37609-1-surajsonawane0215@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-19Revert "block, bfq: merge bfq_release_process_ref() into bfq_put_cooperator()"Zach Wade
This reverts commit bc3b1e9e7c50e1de0f573eea3871db61dd4787de. The bic is associated with sync_bfqq, and bfq_release_process_ref cannot be put into bfq_put_cooperator. kasan report: [ 400.347277] ================================================================== [ 400.347287] BUG: KASAN: slab-use-after-free in bic_set_bfqq+0x200/0x230 [ 400.347420] Read of size 8 at addr ffff88881cab7d60 by task dockerd/5800 [ 400.347430] [ 400.347436] CPU: 24 UID: 0 PID: 5800 Comm: dockerd Kdump: loaded Tainted: G E 6.12.0 #32 [ 400.347450] Tainted: [E]=UNSIGNED_MODULE [ 400.347454] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713 07/28/2022 [ 400.347460] Call Trace: [ 400.347464] <TASK> [ 400.347468] dump_stack_lvl+0x5d/0x80 [ 400.347490] print_report+0x174/0x505 [ 400.347521] kasan_report+0xe0/0x160 [ 400.347541] bic_set_bfqq+0x200/0x230 [ 400.347549] bfq_bic_update_cgroup+0x419/0x740 [ 400.347560] bfq_bio_merge+0x133/0x320 [ 400.347584] blk_mq_submit_bio+0x1761/0x1e20 [ 400.347625] __submit_bio+0x28b/0x7b0 [ 400.347664] submit_bio_noacct_nocheck+0x6b2/0xd30 [ 400.347690] iomap_readahead+0x50c/0x680 [ 400.347731] read_pages+0x17f/0x9c0 [ 400.347785] page_cache_ra_unbounded+0x366/0x4a0 [ 400.347795] filemap_fault+0x83d/0x2340 [ 400.347819] __xfs_filemap_fault+0x11a/0x7d0 [xfs] [ 400.349256] __do_fault+0xf1/0x610 [ 400.349270] do_fault+0x977/0x11a0 [ 400.349281] __handle_mm_fault+0x5d1/0x850 [ 400.349314] handle_mm_fault+0x1f8/0x560 [ 400.349324] do_user_addr_fault+0x324/0x970 [ 400.349337] exc_page_fault+0x76/0xf0 [ 400.349350] asm_exc_page_fault+0x26/0x30 [ 400.349360] RIP: 0033:0x55a480d77375 [ 400.349384] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 3b 66 10 0f 86 ae 02 00 00 55 48 89 e5 48 83 ec 58 48 8b 10 <83> 7a 10 00 0f 84 27 02 00 00 44 0f b6 42 28 44 0f b6 4a 29 41 80 [ 400.349392] RSP: 002b:00007f18c37fd8b8 EFLAGS: 00010216 [ 400.349401] RAX: 00007f18c37fd9d0 RBX: 0000000000000000 RCX: 0000000000000000 [ 400.349407] RDX: 000055a484407d38 RSI: 000000c000e8b0c0 RDI: 0000000000000000 [ 400.349412] RBP: 00007f18c37fd910 R08: 000055a484017f60 R09: 000055a484066f80 [ 400.349417] R10: 0000000000194000 R11: 0000000000000005 R12: 0000000000000008 [ 400.349422] R13: 0000000000000000 R14: 000000c000476a80 R15: 0000000000000000 [ 400.349430] </TASK> [ 400.349452] [ 400.349454] Allocated by task 5800: [ 400.349459] kasan_save_stack+0x30/0x50 [ 400.349469] kasan_save_track+0x14/0x30 [ 400.349475] __kasan_slab_alloc+0x89/0x90 [ 400.349482] kmem_cache_alloc_node_noprof+0xdc/0x2a0 [ 400.349492] bfq_get_queue+0x1ef/0x1100 [ 400.349502] __bfq_get_bfqq_handle_split+0x11a/0x510 [ 400.349511] bfq_insert_requests+0xf55/0x9030 [ 400.349519] blk_mq_flush_plug_list+0x446/0x14c0 [ 400.349527] __blk_flush_plug+0x27c/0x4e0 [ 400.349534] blk_finish_plug+0x52/0xa0 [ 400.349540] _xfs_buf_ioapply+0x739/0xc30 [xfs] [ 400.350246] __xfs_buf_submit+0x1b2/0x640 [xfs] [ 400.350967] xfs_buf_read_map+0x306/0xa20 [xfs] [ 400.351672] xfs_trans_read_buf_map+0x285/0x7d0 [xfs] [ 400.352386] xfs_imap_to_bp+0x107/0x270 [xfs] [ 400.353077] xfs_iget+0x70d/0x1eb0 [xfs] [ 400.353786] xfs_lookup+0x2ca/0x3a0 [xfs] [ 400.354506] xfs_vn_lookup+0x14e/0x1a0 [xfs] [ 400.355197] __lookup_slow+0x19c/0x340 [ 400.355204] lookup_one_unlocked+0xfc/0x120 [ 400.355211] ovl_lookup_single+0x1b3/0xcf0 [overlay] [ 400.355255] ovl_lookup_layer+0x316/0x490 [overlay] [ 400.355295] ovl_lookup+0x844/0x1fd0 [overlay] [ 400.355351] lookup_one_qstr_excl+0xef/0x150 [ 400.355357] do_unlinkat+0x22a/0x620 [ 400.355366] __x64_sys_unlinkat+0x109/0x1e0 [ 400.355375] do_syscall_64+0x82/0x160 [ 400.355384] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 400.355393] [ 400.355395] Freed by task 5800: [ 400.355400] kasan_save_stack+0x30/0x50 [ 400.355407] kasan_save_track+0x14/0x30 [ 400.355413] kasan_save_free_info+0x3b/0x70 [ 400.355422] __kasan_slab_free+0x4f/0x70 [ 400.355429] kmem_cache_free+0x176/0x520 [ 400.355438] bfq_put_queue+0x67e/0x980 [ 400.355447] bfq_bic_update_cgroup+0x407/0x740 [ 400.355454] bfq_bio_merge+0x133/0x320 [ 400.355460] blk_mq_submit_bio+0x1761/0x1e20 [ 400.355467] __submit_bio+0x28b/0x7b0 [ 400.355473] submit_bio_noacct_nocheck+0x6b2/0xd30 [ 400.355480] iomap_readahead+0x50c/0x680 [ 400.355490] read_pages+0x17f/0x9c0 [ 400.355498] page_cache_ra_unbounded+0x366/0x4a0 [ 400.355505] filemap_fault+0x83d/0x2340 [ 400.355514] __xfs_filemap_fault+0x11a/0x7d0 [xfs] [ 400.356204] __do_fault+0xf1/0x610 [ 400.356213] do_fault+0x977/0x11a0 [ 400.356221] __handle_mm_fault+0x5d1/0x850 [ 400.356230] handle_mm_fault+0x1f8/0x560 [ 400.356238] do_user_addr_fault+0x324/0x970 [ 400.356248] exc_page_fault+0x76/0xf0 [ 400.356258] asm_exc_page_fault+0x26/0x30 [ 400.356266] [ 400.356269] The buggy address belongs to the object at ffff88881cab7bc0 which belongs to the cache bfq_queue of size 576 [ 400.356276] The buggy address is located 416 bytes inside of freed 576-byte region [ffff88881cab7bc0, ffff88881cab7e00) [ 400.356285] [ 400.356287] The buggy address belongs to the physical page: [ 400.356292] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88881cab0b00 pfn:0x81cab0 [ 400.356300] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 400.356323] flags: 0x50000000000040(head|node=1|zone=2) [ 400.356331] page_type: f5(slab) [ 400.356340] raw: 0050000000000040 ffff88880a00c280 dead000000000122 0000000000000000 [ 400.356347] raw: ffff88881cab0b00 00000000802e0025 00000001f5000000 0000000000000000 [ 400.356354] head: 0050000000000040 ffff88880a00c280 dead000000000122 0000000000000000 [ 400.356359] head: ffff88881cab0b00 00000000802e0025 00000001f5000000 0000000000000000 [ 400.356365] head: 0050000000000003 ffffea002072ac01 ffffffffffffffff 0000000000000000 [ 400.356370] head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 [ 400.356378] page dumped because: kasan: bad access detected [ 400.356381] [ 400.356383] Memory state around the buggy address: [ 400.356387] ffff88881cab7c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 400.356392] ffff88881cab7c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 400.356397] >ffff88881cab7d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 400.356400] ^ [ 400.356405] ffff88881cab7d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 400.356409] ffff88881cab7e00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 400.356413] ================================================================== Cc: stable@vger.kernel.org Fixes: bc3b1e9e7c50 ("block, bfq: merge bfq_release_process_ref() into bfq_put_cooperator()") Signed-off-by: Zach Wade <zachwade.k@gmail.com> Cc: Ding Hui <dinghui@sangfor.com.cn> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20241119153410.2546-1-zachwade.k@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-19Merge tag 'timers-core-2024-11-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "A rather large update for timekeeping and timers: - The final step to get rid of auto-rearming posix-timers posix-timers are currently auto-rearmed by the kernel when the signal of the timer is ignored so that the timer signal can be delivered once the corresponding signal is unignored. This requires to throttle the timer to prevent a DoS by small intervals and keeps the system pointlessly out of low power states for no value. This is a long standing non-trivial problem due to the lock order of posix-timer lock and the sighand lock along with life time issues as the timer and the sigqueue have different life time rules. Cure this by: - Embedding the sigqueue into the timer struct to have the same life time rules. Aside of that this also avoids the lookup of the timer in the signal delivery and rearm path as it's just a always valid container_of() now. - Queuing ignored timer signals onto a seperate ignored list. - Moving queued timer signals onto the ignored list when the signal is switched to SIG_IGN before it could be delivered. - Walking the ignored list when SIG_IGN is lifted and requeue the signals to the actual signal lists. This allows the signal delivery code to rearm the timer. This also required to consolidate the signal delivery rules so they are consistent across all situations. With that all self test scenarios finally succeed. - Core infrastructure for VFS multigrain timestamping This is required to allow the kernel to use coarse grained time stamps by default and switch to fine grained time stamps when inode attributes are actively observed via getattr(). These changes have been provided to the VFS tree as well, so that the VFS specific infrastructure could be built on top. - Cleanup and consolidation of the sleep() infrastructure - Move all sleep and timeout functions into one file - Rework udelay() and ndelay() into proper documented inline functions and replace the hardcoded magic numbers by proper defines. - Rework the fsleep() implementation to take the reality of the timer wheel granularity on different HZ values into account. Right now the boundaries are hard coded time ranges which fail to provide the requested accuracy on different HZ settings. - Update documentation for all sleep/timeout related functions and fix up stale documentation links all over the place - Fixup a few usage sites - Rework of timekeeping and adjtimex(2) to prepare for multiple PTP clocks A system can have multiple PTP clocks which are participating in seperate and independent PTP clock domains. So far the kernel only considers the PTP clock which is based on CLOCK TAI relevant as that's the clock which drives the timekeeping adjustments via the various user space daemons through adjtimex(2). The non TAI based clock domains are accessible via the file descriptor based posix clocks, but their usability is very limited. They can't be accessed fast as they always go all the way out to the hardware and they cannot be utilized in the kernel itself. As Time Sensitive Networking (TSN) gains traction it is required to provide fast user and kernel space access to these clocks. The approach taken is to utilize the timekeeping and adjtimex(2) infrastructure to provide this access in a similar way how the kernel provides access to clock MONOTONIC, REALTIME etc. Instead of creating a duplicated infrastructure this rework converts timekeeping and adjtimex(2) into generic functionality which operates on pointers to data structures instead of using static variables. This allows to provide time accessors and adjtimex(2) functionality for the independent PTP clocks in a subsequent step. - Consolidate hrtimer initialization hrtimers are set up by initializing the data structure and then seperately setting the callback function for historical reasons. That's an extra unnecessary step and makes Rust support less straight forward than it should be. Provide a new set of hrtimer_setup*() functions and convert the core code and a few usage sites of the less frequently used interfaces over. The bulk of the htimer_init() to hrtimer_setup() conversion is already prepared and scheduled for the next merge window. - Drivers: - Ensure that the global timekeeping clocksource is utilizing the cluster 0 timer on MIPS multi-cluster systems. Otherwise CPUs on different clusters use their cluster specific clocksource which is not guaranteed to be synchronized with other clusters. - Mostly boring cleanups, fixes, improvements and code movement" * tag 'timers-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (140 commits) posix-timers: Fix spurious warning on double enqueue versus do_exit() clocksource/drivers/arm_arch_timer: Use of_property_present() for non-boolean properties clocksource/drivers/gpx: Remove redundant casts clocksource/drivers/timer-ti-dm: Fix child node refcount handling dt-bindings: timer: actions,owl-timer: convert to YAML clocksource/drivers/ralink: Add Ralink System Tick Counter driver clocksource/drivers/mips-gic-timer: Always use cluster 0 counter as clocksource clocksource/drivers/timer-ti-dm: Don't fail probe if int not found clocksource/drivers:sp804: Make user selectable clocksource/drivers/dw_apb: Remove unused dw_apb_clockevent functions hrtimers: Delete hrtimer_init_on_stack() alarmtimer: Switch to use hrtimer_setup() and hrtimer_setup_on_stack() io_uring: Switch to use hrtimer_setup_on_stack() sched/idle: Switch to use hrtimer_setup_on_stack() hrtimers: Delete hrtimer_init_sleeper_on_stack() wait: Switch to use hrtimer_setup_sleeper_on_stack() timers: Switch to use hrtimer_setup_sleeper_on_stack() net: pktgen: Switch to use hrtimer_setup_sleeper_on_stack() futex: Switch to use hrtimer_setup_sleeper_on_stack() fs/aio: Switch to use hrtimer_setup_sleeper_on_stack() ...
2024-11-19KVM: x86: Break CONFIG_KVM_X86's direct dependency on KVM_INTEL || KVM_AMDSean Christopherson
Rework CONFIG_KVM_X86's dependency to only check if KVM_INTEL or KVM_AMD is selected, i.e. not 'n'. Having KVM_X86 depend directly on the vendor modules results in KVM_X86 being set to 'm' if at least one of KVM_INTEL or KVM_AMD is enabled, but neither is 'y', regardless of the value of KVM itself. The documentation for def_tristate doesn't explicitly state that this is the intended behavior, but it does clearly state that the "if" section is parsed as a dependency, i.e. the behavior is consistent with how tristate dependencies are handled in general. Optionally dependencies for this default value can be added with "if". Fixes: ea4290d77bda ("KVM: x86: leave kvm.ko out of the build if no vendor module is requested") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20241118172002.1633824-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-11-19KVM: x86: add back X86_LOCAL_APIC dependencyArnd Bergmann
Enabling KVM now causes a build failure on x86-32 if X86_LOCAL_APIC is disabled: arch/x86/kvm/svm/svm.c: In function 'svm_emergency_disable_virtualization_cpu': arch/x86/kvm/svm/svm.c:597:9: error: 'kvm_rebooting' undeclared (first use in this function); did you mean 'kvm_irq_routing'? 597 | kvm_rebooting = true; | ^~~~~~~~~~~~~ | kvm_irq_routing arch/x86/kvm/svm/svm.c:597:9: note: each undeclared identifier is reported only once for each function it appears in make[6]: *** [scripts/Makefile.build:221: arch/x86/kvm/svm/svm.o] Error 1 In file included from include/linux/rculist.h:11, from include/linux/hashtable.h:14, from arch/x86/kvm/svm/avic.c:18: arch/x86/kvm/svm/avic.c: In function 'avic_pi_update_irte': arch/x86/kvm/svm/avic.c:909:38: error: 'struct kvm' has no member named 'irq_routing' 909 | irq_rt = srcu_dereference(kvm->irq_routing, &kvm->irq_srcu); | ^~ include/linux/rcupdate.h:538:17: note: in definition of macro '__rcu_dereference_check' 538 | typeof(*p) *local = (typeof(*p) *__force)READ_ONCE(p); \ Move the dependency to the same place as before. Fixes: ea4290d77bda ("KVM: x86: leave kvm.ko out of the build if no vendor module is requested") Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410060426.e9Xsnkvi-lkp@intel.com/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Sean Christopherson <seanjc@google.com> [sean: add Cc to stable, tweak shortlog scope] Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20241118172002.1633824-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-11-19Revert "KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of ↵Sean Christopherson
setup_vmcs_config()" Revert back to clearing VM_{ENTRY,EXIT}_LOAD_IA32_PERF_GLOBAL_CTRL in KVM's golden VMCS config, as applying the workaround during vCPU creation is pointless and broken. KVM *unconditionally* clears the controls in the values returned by vmx_vmentry_ctrl() and vmx_vmexit_ctrl(), as KVM loads PERF_GLOBAL_CTRL if and only if its necessary to do so. E.g. if KVM wants to run the guest with the same PERF_GLOBAL_CTRL as the host, then there's no need to re-load the MSR on entry and exit. Even worse, the buggy commit failed to apply the erratum where it's actually needed, add_atomic_switch_msr(). As a result, KVM completely ignores the erratum for all intents and purposes, i.e. uses the flawed VMCS controls to load PERF_GLOBAL_CTRL. To top things off, the patch was intended to be dropped, as the premise of an L1 VMM being able to pivot on FMS is flawed, and KVM can (and now does) fully emulate the controls in software. Simply revert the commit, as all upstream supported kernels that have the buggy commit should also have commit f4c93d1a0e71 ("KVM: nVMX: Always emulate PERF_GLOBAL_CTRL VM-Entry/VM-Exit controls"), i.e. the (likely theoretical) live migration concern is a complete non-issue. Opportunistically drop the manual "kvm: " scope from the warning about the erratum, as KVM now uses pr_fmt() to provide the correct scope (v6.1 kernels and earlier don't, but the erratum only applies to CPUs that are 15+ years old; it's not worth a separate patch). This reverts commit 9d78d6fb186bc4aff41b5d6c4726b76649d3cb53. Link: https://lore.kernel.org/all/YtnZmCutdd5tpUmz@google.com Fixes: 9d78d6fb186b ("KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()") Cc: stable@vger.kernel.org Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-ID: <20241119011433.1797921-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-11-19Merge tag 'timers-vdso-2024-11-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull vdso data page handling updates from Thomas Gleixner: "First steps of consolidating the VDSO data page handling. The VDSO data page handling is architecture specific for historical reasons, but there is no real technical reason to do so. Aside of that VDSO data has become a dump ground for various mechanisms and fail to provide a clear separation of the functionalities. Clean this up by: - consolidating the VDSO page data by getting rid of architecture specific warts especially in x86 and PowerPC. - removing the last includes of header files which are pulling in other headers outside of the VDSO namespace. - seperating timekeeping and other VDSO data accordingly. Further consolidation of the VDSO page handling is done in subsequent changes scheduled for the next merge window. This also lays the ground for expanding the VDSO time getters for independent PTP clocks in a generic way without making every architecture add support seperately" * tag 'timers-vdso-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits) x86/vdso: Add missing brackets in switch case vdso: Rename struct arch_vdso_data to arch_vdso_time_data powerpc: Split systemcfg struct definitions out from vdso powerpc: Split systemcfg data out of vdso data page powerpc: Add kconfig option for the systemcfg page powerpc/pseries/lparcfg: Use num_possible_cpus() for potential processors powerpc/pseries/lparcfg: Fix printing of system_active_processors powerpc/procfs: Propagate error of remap_pfn_range() powerpc/vdso: Remove offset comment from 32bit vdso_arch_data x86/vdso: Split virtual clock pages into dedicated mapping x86/vdso: Delete vvar.h x86/vdso: Access vdso data without vvar.h x86/vdso: Move the rng offset to vsyscall.h x86/vdso: Access rng vdso data without vvar.h x86/vdso: Access timens vdso data without vvar.h x86/vdso: Allocate vvar page from C code x86/vdso: Access rng data from kernel without vvar x86/vdso: Place vdso_data at beginning of vvar page x86/vdso: Use __arch_get_vdso_data() to access vdso data x86/mm/mmap: Remove arch_vma_name() ...
2024-11-19Merge tag 'irq-core-2024-11-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull interrupt subsystem updates from Thomas Gleixner: "Tree wide: - Make nr_irqs static to the core code and provide accessor functions to remove existing and prevent future aliasing problems with local variables or function arguments of the same name. Core code: - Prevent freeing an interrupt in the devres code which is not managed by devres in the first place. - Use seq_put_decimal_ull_width() for decimal values output in /proc/interrupts which increases performance significantly as it avoids parsing the format strings over and over. - Optimize raising the timer and hrtimer soft interrupts by using the 'set bit only' variants instead of the combined version which checks whether ksoftirqd should be woken up. The latter is a pointless exercise as both soft interrupts are raised in the context of the timer interrupt and therefore never wake up ksoftirqd. - Delegate timer/hrtimer soft interrupt processing to a dedicated thread on RT. Timer and hrtimer soft interrupts are always processed in ksoftirqd on RT enabled kernels. This can lead to high latencies when other soft interrupts are delegated to ksoftirqd as well. The separate thread allows to run them seperately under a RT scheduling policy to reduce the latency overhead. Drivers: - New drivers or extensions of existing drivers to support Renesas RZ/V2H(P), Aspeed AST27XX, T-HEAD C900 and ATMEL sam9x7 interrupt chips - Support for multi-cluster GICs on MIPS. MIPS CPUs can come with multiple CPU clusters, where each CPU cluster has its own GIC (Generic Interrupt Controller). This requires to access the GIC of a remote cluster through a redirect register block. This is encapsulated into a set of helper functions to keep the complexity out of the actual code paths which handle the GIC details. - Support for encrypted guests in the ARM GICV3 ITS driver The ITS page needs to be shared with the hypervisor and therefore must be decrypted. - Small cleanups and fixes all over the place" * tag 'irq-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) irqchip/riscv-aplic: Prevent crash when MSI domain is missing genirq/proc: Use seq_put_decimal_ull_width() for decimal values softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT. timers: Use __raise_softirq_irqoff() to raise the softirq. hrtimer: Use __raise_softirq_irqoff() to raise the softirq riscv: defconfig: Enable T-HEAD C900 ACLINT SSWI drivers irqchip: Add T-HEAD C900 ACLINT SSWI driver dt-bindings: interrupt-controller: Add T-HEAD C900 ACLINT SSWI device irqchip/stm32mp-exti: Use of_property_present() for non-boolean properties irqchip/mips-gic: Fix selection of GENERIC_IRQ_EFFECTIVE_AFF_MASK irqchip/mips-gic: Prevent indirect access to clusters without CPU cores irqchip/mips-gic: Multi-cluster support irqchip/mips-gic: Setup defaults in each cluster irqchip/mips-gic: Support multi-cluster in for_each_online_cpu_gic() irqchip/mips-gic: Replace open coded online CPU iterations genirq/irqdesc: Use str_enabled_disabled() helper in wakeup_show() genirq/devres: Don't free interrupt which is not managed by devres irqchip/gic-v3-its: Fix over allocation in itt_alloc_pool() irqchip/aspeed-intc: Add AST27XX INTC support dt-bindings: interrupt-controller: Add support for ASPEED AST27XX INTC ...
2024-11-19Merge tag 'core-debugobjects-2024-11-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull debugobjects updates from Thomas Gleixner: - Prevent destroying the kmem_cache on early failure. Destroying a kmem_cache requires work queues to be set up, but in the early failure case they are not yet initializated. So rather leak the cache instead of triggering a BUG. - Reduce parallel pool fill attempts. Refilling the object pool requires to take the global pool lock, which causes a massive performance issue when a large number of CPUs attempt to refill concurrently. It turns out that it's sufficient to let one CPU handle the refill from the to free list and in case there are not enough objects on it to allocate new objects from the kmem cache. This also splits the free list handling from the actual allocation path as that yields better results on RT where allocation is restricted to preemptible code paths. The refill from free list has no such restrictions. - Consolidate the global and the per CPU pools to use the same data structure, so all helper functions can be shared. - Simplify the object allocation/free logic. The allocation/free logic is an incomprehensible maze, which tries to utilize the to free list and the global pool in the best way. This all can be simplified into a straight forward comprehensible code flow. - Convert the allocation/free mechanism to batch mode. Transferring objects from the global pool to the per CPU pools or vice versa is done by walking the hlist and moving object by object. That not only increases the pool lock held time, it also dirties up to 17 cache lines. This can be avoided by storing the pointer to the first object in a batch of 16 objects in the objects themself and propagate it through the batch when an object is enqueued into a pool or to a temporary hlist head on allocation. This allows to move batches of objects with at max four cache lines dirtied and reduces the pool lock held time and therefore contention significantly. - Improve the object reusage The current implementation is too agressively freeing unused objects, which is counterproductive on bursty workloads like a kernel compile. Address this by: * increasing the per CPU pool size * refilling the per CPU pool from the to be freed pool when the per CPU pool emptied a batch * keeping track of object usage with a exponentially wheighted moving average which prevents the work queue callback to free objects prematuraly. This combined reduces the allocation/free rate for a full kernel compile significantly: kmem_cache_alloc() kmem_cache_free() Baseline: 380k 330k Improved: 170k 117k - A few cleanups and a more cache line friendly layout of debug information on top. * tag 'core-debugobjects-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) debugobjects: Track object usage to avoid premature freeing of objects debugobjects: Refill per CPU pool more agressively debugobjects: Double the per CPU slots debugobjects: Move pool statistics into global_pool struct debugobjects: Implement batch processing debugobjects: Prepare kmem_cache allocations for batching debugobjects: Prepare for batching debugobjects: Use static key for boot pool selection debugobjects: Rework free_object_work() debugobjects: Rework object freeing debugobjects: Rework object allocation debugobjects: Move min/max count into pool struct debugobjects: Rename and tidy up per CPU pools debugobjects: Use separate list head for boot pool debugobjects: Move pools into a datastructure debugobjects: Reduce parallel pool fill attempts debugobjects: Make debug_objects_enabled bool debugobjects: Provide and use free_object_list() debugobjects: Remove pointless debug printk debugobjects: Reuse put_objects() on OOM ...
2024-11-19Merge tag 'x86-mm-2024-11-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: - Put cpumask_test_cpu() check in switch_mm_irqs_off() under CONFIG_DEBUG_VM, to micro-optimize the context-switching code (Rik van Riel) - Add missing details in virtual memory layout (Kirill A. Shutemov) * tag 'x86-mm-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm/tlb: Put cpumask_test_cpu() check in switch_mm_irqs_off() under CONFIG_DEBUG_VM x86/mm/doc: Add missing details in virtual memory layout
2024-11-19Merge tag 'x86-cleanups-2024-11-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: - x86/boot: Remove unused function atou() (Dr. David Alan Gilbert) - x86/cpu: Use str_yes_no() helper in show_cpuinfo_misc() (Thorsten Blum) - x86/platform: Switch back to struct platform_driver::remove() (Uwe Kleine-König) * tag 'x86-cleanups-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Remove unused function atou() x86/cpu: Use str_yes_no() helper in show_cpuinfo_misc() x86/platform: Switch back to struct platform_driver::remove()
2024-11-19Merge tag 'x86-splitlock-2024-11-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 splitlock updates from Ingo Molnar: - Move Split and Bus lock code to a dedicated file (Ravi Bangoria) - Add split/bus lock support for AMD (Ravi Bangoria) * tag 'x86-splitlock-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/bus_lock: Add support for AMD x86/split_lock: Move Split and Bus lock code to a dedicated file
2024-11-19kunit: qemu_configs: loongarch: Enable shutdownThomas Weißschuh
QEMU for LoongArch does not yet support shutdown/restart through ACPI. Use the pvpanic driver to enable shutdowns. This requires 9.1.0 for shutdown support in pvpanic, but that is the requirement of kunit on LoongArch anyways. Link: https://lore.kernel.org/r/20241111-kunit-loongarch-v2-3-7676eb5f2da3@linutronix.de Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Bibo Mao <maobibo@loongson.cn> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-11-19kunit: tool: Allow overriding the shutdown mode from qemu configThomas Weißschuh
Not all platforms support machine reboot. If it a proper reboot is not supported the machine will hang. Allow the QEMU configuration to override the necessary shutdown mode for the specific system under test. Link: https://lore.kernel.org/r/20241111-kunit-loongarch-v2-2-7676eb5f2da3@linutronix.de Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-11-19kunit: qemu_configs: Add LoongArch configThomas Weißschuh
Add a basic config to run kunit tests on LoongArch. This requires QEMU 9.1.0 or later for the necessary direct kernel boot support. Link: https://lore.kernel.org/r/20241111-kunit-loongarch-v2-1-7676eb5f2da3@linutronix.de Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Bibo Mao <maobibo@loongson.cn> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>