linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-04-09	dma/contiguous: avoid warning about unused size_bytes	Arnd Bergmann
	When building with W=1, this variable is unused for configs with CONFIG_CMA_SIZE_SEL_PERCENTAGE=y: kernel/dma/contiguous.c:67:26: error: 'size_bytes' defined but not used [-Werror=unused-const-variable=] Change this to a macro to avoid the warning. Fixes: c64be2bb1c6e ("drivers: add Contiguous Memory Allocator") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20250409151557.3890443-1-arnd@kernel.org
2025-04-09	smb: client: fix UAF in decryption with multichannel	Paulo Alcantara
	After commit f7025d861694 ("smb: client: allocate crypto only for primary server") and commit b0abcd65ec54 ("smb: client: fix UAF in async decryption"), the channels started reusing AEAD TFM from primary channel to perform synchronous decryption, but that can't done as there could be multiple cifsd threads (one per channel) simultaneously accessing it to perform decryption. This fixes the following KASAN splat when running fstest generic/249 with 'vers=3.1.1,multichannel,max_channels=4,seal' against Windows Server 2022: BUG: KASAN: slab-use-after-free in gf128mul_4k_lle+0xba/0x110 Read of size 8 at addr ffff8881046c18a0 by task cifsd/986 CPU: 3 UID: 0 PID: 986 Comm: cifsd Not tainted 6.15.0-rc1 #1 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-3.fc41 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x5d/0x80 print_report+0x156/0x528 ? gf128mul_4k_lle+0xba/0x110 ? __virt_addr_valid+0x145/0x300 ? __phys_addr+0x46/0x90 ? gf128mul_4k_lle+0xba/0x110 kasan_report+0xdf/0x1a0 ? gf128mul_4k_lle+0xba/0x110 gf128mul_4k_lle+0xba/0x110 ghash_update+0x189/0x210 shash_ahash_update+0x295/0x370 ? __pfx_shash_ahash_update+0x10/0x10 ? __pfx_shash_ahash_update+0x10/0x10 ? __pfx_extract_iter_to_sg+0x10/0x10 ? ___kmalloc_large_node+0x10e/0x180 ? __asan_memset+0x23/0x50 crypto_ahash_update+0x3c/0xc0 gcm_hash_assoc_remain_continue+0x93/0xc0 crypt_message+0xe09/0xec0 [cifs] ? __pfx_crypt_message+0x10/0x10 [cifs] ? _raw_spin_unlock+0x23/0x40 ? __pfx_cifs_readv_from_socket+0x10/0x10 [cifs] decrypt_raw_data+0x229/0x380 [cifs] ? __pfx_decrypt_raw_data+0x10/0x10 [cifs] ? __pfx_cifs_read_iter_from_socket+0x10/0x10 [cifs] smb3_receive_transform+0x837/0xc80 [cifs] ? __pfx_smb3_receive_transform+0x10/0x10 [cifs] ? __pfx___might_resched+0x10/0x10 ? __pfx_smb3_is_transform_hdr+0x10/0x10 [cifs] cifs_demultiplex_thread+0x692/0x1570 [cifs] ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs] ? rcu_is_watching+0x20/0x50 ? rcu_lockdep_current_cpu_online+0x62/0xb0 ? find_held_lock+0x32/0x90 ? kvm_sched_clock_read+0x11/0x20 ? local_clock_noinstr+0xd/0xd0 ? trace_irq_enable.constprop.0+0xa8/0xe0 ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs] kthread+0x1fe/0x380 ? kthread+0x10f/0x380 ? __pfx_kthread+0x10/0x10 ? local_clock_noinstr+0xd/0xd0 ? ret_from_fork+0x1b/0x60 ? local_clock+0x15/0x30 ? lock_release+0x29b/0x390 ? rcu_is_watching+0x20/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x60 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Tested-by: David Howells <dhowells@redhat.com> Reported-by: Steve French <stfrench@microsoft.com> Closes: https://lore.kernel.org/r/CAH2r5mu6Yc0-RJXM3kFyBYUB09XmXBrNodOiCVR4EDrmxq5Szg@mail.gmail.com Fixes: f7025d861694 ("smb: client: allocate crypto only for primary server") Fixes: b0abcd65ec54 ("smb: client: fix UAF in async decryption") Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-09	x86/cpu: Avoid running off the end of an AMD erratum table	Dave Hansen
	The NULL array terminator at the end of erratum_1386_microcode was removed during the switch from x86_cpu_desc to x86_cpu_id. This causes readers to run off the end of the array. Replace the NULL. Fixes: f3f325152673 ("x86/cpu: Move AMD erratum 1386 table over to 'x86_cpu_id'") Reported-by: Jiri Slaby <jirislaby@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
2025-04-09	erofs: fix encoded extents handling	Gao Xiang
	- The MSB 32 bits of `z_fragmentoff` are available only in extent records of size >= 8B. - Use round_down() to calculate `lstart` as well as increase `pos` correspondingly for extent records of size == 8B. Fixes: 1d191b4ca51d ("erofs: implement encoded extent metadata") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20250408114448.4040220-2-hsiangkao@linux.alibaba.com
2025-04-09	erofs: add __packed annotation to union(__le16..)	Gao Xiang
	I'm unsure why they aren't 2 bytes in size only in arm-linux-gnueabi. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202504051202.DS7QIknJ-lkp@intel.com Fixes: 61ba89b57905 ("erofs: add 48-bit block addressing on-disk support") Fixes: efb2aef569b3 ("erofs: add encoded extent on-disk definition") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20250408114448.4040220-1-hsiangkao@linux.alibaba.com
2025-04-09	erofs: set error to bio if file-backed IO fails	Sheng Yong
	If a file-backed IO fails before submitting the bio to the lower filesystem, an error is returned, but the bio->bi_status is not marked as an error. However, the error information should be passed to the end_io handler. Otherwise, the IO request will be treated as successful. Fixes: 283213718f5d ("erofs: support compressed inodes for fileio") Signed-off-by: Sheng Yong <shengyong1@xiaomi.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20250408122351.2104507-1-shengyong1@xiaomi.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2025-04-09	drm/amdgpu/mes12: optimize MES pipe FW version fetching	Alex Deucher
	Don't fetch it again if we already have it. It seems the registers don't reliably have the value at resume in some cases. Fixes: 785f0f9fe742 ("drm/amdgpu: Add mes v12_0 ip block support (v4)") Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9e7b08d239c2f21e8f417854f81e5ff40edbebff) Cc: stable@vger.kernel.org # 6.12.x
2025-04-09	drm/amd/pm/smu11: Prevent division by zero	Denis Arefev
	The user can set any speed value. If speed is greater than UINT_MAX/8, division by zero is possible. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 1e866f1fe528 ("drm/amd/pm: Prevent divide by zero") Signed-off-by: Denis Arefev <arefev@swemel.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit da7dc714a8f8e1c9fc33c57cd63583779a3bef71) Cc: stable@vger.kernel.org
2025-04-09	drm/amdgpu: cancel gfx idle work in device suspend for s0ix	Alex Deucher
	This is normally handled in the gfx IP suspend callbacks, but for S0ix, those are skipped because we don't want to touch gfx. So handle it in device suspend. Fixes: b9467983b774 ("drm/amdgpu: add dynamic workload profile switching for gfx10") Fixes: 963537ca2325 ("drm/amdgpu: add dynamic workload profile switching for gfx11") Fixes: 5f95a1549555 ("drm/amdgpu: add dynamic workload profile switching for gfx12") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 906ad451675155380c1dc1881a244ebde8e8df0a) Cc: stable@vger.kernel.org
2025-04-09	drm/amd/display: pause the workload setting in dm	Kenneth Feng
	Pause the workload setting in dm when doing idle optimization Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit b23f81c442ac33af0c808b4bb26333b881669bb7)
2025-04-09	drm/amdgpu/pm/swsmu: implement pause workload profile	Alex Deucher
	Add the callback for implementation for swsmu. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 92e511d1cecc6a8fa7bdfc8657f16ece9ab4d456)
2025-04-09	drm/amdgpu/pm: add workload profile pause helper	Alex Deucher
	To be used for display idle optimizations when we want to pause non-default profiles. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 6dafb5d4c7cdfc8f994e789d050e29e0d5ca6efd)
2025-04-09	drm/i915/wm: convert i9xx_wm.c internally to struct intel_display	Jani Nikula
	Going forward, struct intel_display is the main display device data pointer. Convert as much as possible of i9xx_wm.c to struct intel_display. Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/bbee93f837fe7fedfd1627ff6fa295da8881df8d.1744119460.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-04-09	drm/i915/wm: convert i9xx_wm.c to intel_de_*() register interface	Jani Nikula
	The registers handled in i9xx_wm.c are mostly display registers. The MCH_SSKPD and MLTR_ILK registers are not. Convert register access to intel_de_*() interface where applicaple. Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/68367382759570413669d5648895a1da8f6c68f7.1744119460.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-04-09	drm/i915/wm: convert i9xx_wm.h external interfaces to struct intel_display	Jani Nikula
	Going forward, struct intel_display is the main display device data pointer. Convert the i9xx_wm.h interface to struct intel_display. With this, we can make intel_wm.c independent of i915_drv.h. v2: Also remove i915_drv.h, fix commit message Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/3e30634d85c0e0aac9c95f9a2f928131ba400271.1744119460.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-04-09	drm/i915/wm: convert skl_watermarks.c internally to struct intel_display	Jani Nikula
	Going forward, struct intel_display is the main display device data pointer. Convert as much as possible of skl_watermarks.c to struct intel_display. Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/61ae2013c5db962e90e072be7d37d630cb7dfc34.1744119460.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-04-09	drm/i915/wm: convert skl_watermark.h external interfaces to struct intel_display	Jani Nikula
	Going forward, struct intel_display is the main display device data pointer. Convert the skl_watermark.h interface to struct intel_display. Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/cd2b1863dee25b69b4766090dd183a7467c4edea.1744119460.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-04-09	drm/i915/wm: convert intel_wm.c internally to struct intel_display	Jani Nikula
	Going forward, struct intel_display is the main display device data pointer. Convert as much as possible of intel_wm.c to struct intel_display. Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/6106c0313190ee904c7f7737d0b78b61983eed91.1744119460.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-04-09	drm/i915/wm: convert intel_wm.h external interfaces to struct intel_display	Jani Nikula
	Going forward, struct intel_display is the main display device data pointer. Convert the intel_wm.h interface as well as the hooks in struct intel_wm_funcs to struct intel_display. Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/1085900b4e46bbb514e6918c321639ac380331ce.1744119460.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-04-09	ASoC: cs42l43: Reset clamp override on jack removal	Charles Keepax
	Some of the manually selected jack configurations will disable the headphone clamp override. Restore this on jack removal, such that the state is consistent for a new insert. Fixes: fc918cbe874e ("ASoC: cs42l43: Add support for the cs42l43") Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250409120717.1294528-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-04-09	ublk: pass ublksrv_ctrl_cmd * instead of io_uring_cmd *	Caleb Sander Mateos
	The ublk_ctrl_() handlers all take struct io_uring_cmd cmd but only use it to get struct ublksrv_ctrl_cmd *header from the io_uring SQE. Since the caller ublk_ctrl_uring_cmd() has already computed header, pass it instead of cmd. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250409012928.3527198-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-09	dm table: Fix W=1 build warning when mempool_needs_integrity is unused	Andy Shevchenko
	The mempool_needs_integrity is unused. This, in particular, prevents kernel builds with Clang, `make W=1` and CONFIG_WERROR=y: drivers/md/dm-table.c:1052:7: error: variable 'mempool_needs_integrity' set but not used [-Werror,-Wunused-but-set-variable] 1052 \| bool mempool_needs_integrity = t->integrity_supported; \| ^ Fix this by removing the leftover. Fixes: 105ca2a2c2ff ("block: split struct bio_integrity_payload") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-04-09	ublk: don't fail request for recovery & reissue in case of ubq->canceling	Ming Lei
	ubq->canceling is set with request queue quiesced when io_uring context is exiting. USER_RECOVERY or !RECOVERY_FAIL_IO requires request to be re-queued and re-dispatch after device is recovered. However commit d796cea7b9f3 ("ublk: implement ->queue_rqs()") still may fail any request in case of ubq->canceling, this way breaks USER_RECOVERY or !RECOVERY_FAIL_IO. Fix it by calling __ublk_abort_rq() in case of ubq->canceling. Reviewed-by: Uday Shankar <ushankar@purestorage.com> Reported-by: Uday Shankar <ushankar@purestorage.com> Closes: https://lore.kernel.org/linux-block/Z%2FQkkTRHfRxtN%2FmB@dev-ushankar.dev.purestorage.com/ Fixes: d796cea7b9f3 ("ublk: implement ->queue_rqs()") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250409011444.2142010-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-09	ublk: fix handling recovery & reissue in ublk_abort_queue()	Ming Lei
	Commit 8284066946e6 ("ublk: grab request reference when the request is handled by userspace") doesn't grab request reference in case of recovery reissue. Then the request can be requeued & re-dispatch & failed when canceling uring command. If it is one zc request, the request can be freed before io_uring returns the zc buffer back, then cause kernel panic: [ 126.773061] BUG: kernel NULL pointer dereference, address: 00000000000000c8 [ 126.773657] #PF: supervisor read access in kernel mode [ 126.774052] #PF: error_code(0x0000) - not-present page [ 126.774455] PGD 0 P4D 0 [ 126.774698] Oops: Oops: 0000 [#1] SMP NOPTI [ 126.775034] CPU: 13 UID: 0 PID: 1612 Comm: kworker/u64:55 Not tainted 6.14.0_blk+ #182 PREEMPT(full) [ 126.775676] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-1.fc39 04/01/2014 [ 126.776275] Workqueue: iou_exit io_ring_exit_work [ 126.776651] RIP: 0010:ublk_io_release+0x14/0x130 [ublk_drv] Fixes it by always grabbing request reference for aborting the request. Reported-by: Caleb Sander Mateos <csander@purestorage.com> Closes: https://lore.kernel.org/linux-block/CADUfDZodKfOGUeWrnAxcZiLT+puaZX8jDHoj_sfHZCOZwhzz6A@mail.gmail.com/ Fixes: 8284066946e6 ("ublk: grab request reference when the request is handled by userspace") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250409011444.2142010-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-09	crypto: caam/qi - Fix drv_ctx refcount bug	Herbert Xu
	Ensure refcount is raised before request is enqueued since it could be dequeued before the call returns. Reported-by: Sean Anderson <sean.anderson@linux.dev> Cc: <stable@vger.kernel.org> Fixes: 11144416a755 ("crypto: caam/qi - optimize frame queue cleanup") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Horia Geantă <horia.geanta@nxp.com> Tested-by: Sean Anderson <sean.anderson@linux.dev> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-09	crypto: scomp - Fix null-pointer deref when freeing streams	Herbert Xu
	As the scomp streams are freed when an algorithm is unregistered, it is possible that the algorithm has never been used at all (e.g., an algorithm that does not have a self-test). So test whether the streams exist before freeing them. Reported-by: Sourabh Jain <sourabhjain@linux.ibm.com> Fixes: 3d72ad46a23a ("crypto: acomp - Move stream management into scomp layer") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-09	ALSA: hda/realtek - Fixed ASUS platform headset Mic issue	Kailang Yang
	ASUS platform Headset Mic was disable by default. Assigned verb table for Mic pin will enable it. Fixes: 7ab61d0a9a35 ("ALSA: hda/realtek: Add support for ASUS B3405 and B3605 Laptops using CS35L41 HDA") Fixes: c86dd79a7c33 ("ALSA: hda/realtek: Add support for ASUS B5405 and B5605 Laptops using CS35L41 HDA") Signed-off-by: Kailang Yang <kailang@realtek.com> Link: https://lore.kernel.org/0fe3421a6850461fb0b7012cb28ef71d@realtek.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-04-09	ALSA: hda/cirrus_scodec_test: Don't select dependencies	Richard Fitzgerald
	Depend on SND_HDA_CIRRUS_SCODEC and GPIOLIB instead of selecting them. KUNIT_ALL_TESTS should only build tests that have satisfied dependencies and test components that are already being built. It must not cause other stuff to be added to the build. Fixes: 2144833e7b41 ("ALSA: hda: cirrus_scodec: Add KUnit test") Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20250409114520.914079-1-rf@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-04-09	Documentation/x86: Zap the subsection letters	Borislav Petkov (AMD)
	The subsections already have numbering - no need for the letters too. Zap the latter. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250409111435.GEZ_ZWmz3_lkP8S9Lb@fat_crate.local
2025-04-09	Documentation/x86: Update the naming of CPU features for /proc/cpuinfo	Naveen N Rao (AMD)
	Commit: 78ce84b9e0a5 ("x86/cpufeatures: Flip the /proc/cpuinfo appearance logic") changed how CPU feature names should be specified. Update document to reflect the same. Signed-off-by: Naveen N Rao (AMD) <naveen@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250409111341.GDZ_ZWZS4LckBcirLE@fat_crate.local
2025-04-09	Merge branch 'sch_sfq-derived-limit'	David S. Miller
	Octavian Purdila says: ==================== net_sched: sch_sfq: reject a derived limit of 1 Because sfq parameters can influence each other there can be situations where although the user sets a limit of 2 it can be lowered to 1: $ tc qdisc add dev dummy0 handle 1: root sfq limit 2 flows 1 depth 1 $ tc qdisc show dev dummy0 qdisc sfq 1: dev dummy0 root refcnt 2 limit 1p quantum 1514b depth 1 divisor 1024 $ tc qdisc add dev dummy0 handle 1: root sfq limit 2 flows 10 depth 1 divisor 1 $ tc qdisc show dev dummy0 qdisc sfq 2: root refcnt 2 limit 1p quantum 1514b depth 1 divisor 1 As a limit of 1 is invalid, this patch series moves the limit validation to after all configuration changes have been done. To do so, the configuration is done in a temporary work area then applied to the internal state. The patch series also adds new test cases. v3: - remove a couple of unnecessary comments - rearrange local variables to use reverse Christmas tree style declaration order v2: https://lore.kernel.org/all/20250402162750.1671155-1-tavip@google.com/ - remove tmp struct and directly use local variables v1: https://lore.kernel.org/all/20250328201634.3876474-1-tavip@google.com/ =================== Signed-off-by: David S. Miller <davem@davemloft.net>
2025-04-09	selftests/tc-testing: sfq: check that a derived limit of 1 is rejected	Octavian Purdila
	Because the limit is updated indirectly when other parameters are updated, there are cases where even though the user requests a limit of 2 it can actually be set to 1. Add the following test cases to check that the kernel rejects them: - limit 2 depth 1 flows 1 - limit 2 depth 1 divisor 1 Signed-off-by: Octavian Purdila <tavip@google.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-04-09	net_sched: sch_sfq: move the limit validation	Octavian Purdila
	It is not sufficient to directly validate the limit on the data that the user passes as it can be updated based on how the other parameters are changed. Move the check at the end of the configuration update process to also catch scenarios where the limit is indirectly updated, for example with the following configurations: tc qdisc add dev dummy0 handle 1: root sfq limit 2 flows 1 depth 1 tc qdisc add dev dummy0 handle 1: root sfq limit 2 flows 1 divisor 1 This fixes the following syzkaller reported crash: ------------[ cut here ]------------ UBSAN: array-index-out-of-bounds in net/sched/sch_sfq.c:203:6 index 65535 is out of range for type 'struct sfq_head[128]' CPU: 1 UID: 0 PID: 3037 Comm: syz.2.16 Not tainted 6.14.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x201/0x300 lib/dump_stack.c:120 ubsan_epilogue lib/ubsan.c:231 [inline] __ubsan_handle_out_of_bounds+0xf5/0x120 lib/ubsan.c:429 sfq_link net/sched/sch_sfq.c:203 [inline] sfq_dec+0x53c/0x610 net/sched/sch_sfq.c:231 sfq_dequeue+0x34e/0x8c0 net/sched/sch_sfq.c:493 sfq_reset+0x17/0x60 net/sched/sch_sfq.c:518 qdisc_reset+0x12e/0x600 net/sched/sch_generic.c:1035 tbf_reset+0x41/0x110 net/sched/sch_tbf.c:339 qdisc_reset+0x12e/0x600 net/sched/sch_generic.c:1035 dev_reset_queue+0x100/0x1b0 net/sched/sch_generic.c:1311 netdev_for_each_tx_queue include/linux/netdevice.h:2590 [inline] dev_deactivate_many+0x7e5/0xe70 net/sched/sch_generic.c:1375 Reported-by: syzbot <syzkaller@googlegroups.com> Fixes: 10685681bafc ("net_sched: sch_sfq: don't allow 1 packet limit") Signed-off-by: Octavian Purdila <tavip@google.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-04-09	net_sched: sch_sfq: use a temporary work area for validating configuration	Octavian Purdila
	Many configuration parameters have influence on others (e.g. divisor -> flows -> limit, depth -> limit) and so it is difficult to correctly do all of the validation before applying the configuration. And if a validation error is detected late it is difficult to roll back a partially applied configuration. To avoid these issues use a temporary work area to update and validate the configuration and only then apply the configuration to the internal state. Signed-off-by: Octavian Purdila <tavip@google.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-04-09	RDMA/cma: Fix workqueue crash in cma_netevent_work_handler	Sharath Srinivasan
	struct rdma_cm_id has member "struct work_struct net_work" that is reused for enqueuing cma_netevent_work_handler()s onto cma_wq. Below crash[1] can occur if more than one call to cma_netevent_callback() occurs in quick succession, which further enqueues cma_netevent_work_handler()s for the same rdma_cm_id, overwriting any previously queued work-item(s) that was just scheduled to run i.e. there is no guarantee the queued work item may run between two successive calls to cma_netevent_callback() and the 2nd INIT_WORK would overwrite the 1st work item (for the same rdma_cm_id), despite grabbing id_table_lock during enqueue. Also drgn analysis [2] indicates the work item was likely overwritten. Fix this by moving the INIT_WORK() to __rdma_create_id(), so that it doesn't race with any existing queue_work() or its worker thread. [1] Trimmed crash stack: ============================================= BUG: kernel NULL pointer dereference, address: 0000000000000008 kworker/u256:6 ... 6.12.0-0... Workqueue: cma_netevent_work_handler [rdma_cm] (rdma_cm) RIP: 0010:process_one_work+0xba/0x31a Call Trace: worker_thread+0x266/0x3a0 kthread+0xcf/0x100 ret_from_fork+0x31/0x50 ret_from_fork_asm+0x1a/0x30 ============================================= [2] drgn crash analysis: >>> trace = prog.crashed_thread().stack_trace() >>> trace (0) crash_setup_regs (./arch/x86/include/asm/kexec.h:111:15) (1) __crash_kexec (kernel/crash_core.c:122:4) (2) panic (kernel/panic.c:399:3) (3) oops_end (arch/x86/kernel/dumpstack.c:382:3) ... (8) process_one_work (kernel/workqueue.c:3168:2) (9) process_scheduled_works (kernel/workqueue.c:3310:3) (10) worker_thread (kernel/workqueue.c:3391:4) (11) kthread (kernel/kthread.c:389:9) Line workqueue.c:3168 for this kernel version is in process_one_work(): 3168 strscpy(worker->desc, pwq->wq->name, WORKER_DESC_LEN); >>> trace[8]["work"] (struct work_struct )0xffff92577d0a21d8 = { .data = (atomic_long_t){ .counter = (s64)536870912, <=== Note }, .entry = (struct list_head){ .next = (struct list_head )0xffff924d075924c0, .prev = (struct list_head )0xffff924d075924c0, }, .func = (work_func_t)cma_netevent_work_handler+0x0 = 0xffffffffc2cec280, } Suspicion is that pwq is NULL: >>> trace[8]["pwq"] (struct pool_workqueue )<absent> In process_one_work(), pwq is assigned from: struct pool_workqueue pwq = get_work_pwq(work); and get_work_pwq() is: static struct pool_workqueue get_work_pwq(struct work_struct work) { unsigned long data = atomic_long_read(&work->data); if (data & WORK_STRUCT_PWQ) return work_struct_pwq(data); else return NULL; } WORK_STRUCT_PWQ is 0x4: >>> print(repr(prog['WORK_STRUCT_PWQ'])) Object(prog, 'enum work_flags', value=4) But work->data is 536870912 which is 0x20000000. So, get_work_pwq() returns NULL and we crash in process_one_work(): 3168 strscpy(worker->desc, pwq->wq->name, WORKER_DESC_LEN); ============================================= Fixes: 925d046e7e52 ("RDMA/core: Add a netevent notifier to cma") Cc: stable@vger.kernel.org Co-developed-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Sharath Srinivasan <sharath.srinivasan@oracle.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Link: https://patch.msgid.link/bf0082f9-5b25-4593-92c6-d130aa8ba439@oracle.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-04-09	nvmet-fc: put ref when assoc->del_work is already scheduled	Daniel Wagner
	Do not leak the tgtport reference when the work is already scheduled. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-09	nvmet-fc: take tgtport reference only once	Daniel Wagner
	The reference counting code can be simplified. Instead taking a tgtport refrerence at the beginning of nvmet_fc_alloc_hostport and put it back if not a new hostport object is allocated, only take it when a new hostport object is allocated. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-09	nvmet-fc: update tgtport ref per assoc	Daniel Wagner
	We need to take for each unique association a reference. nvmet_fc_alloc_hostport for each newly created association. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-09	nvmet-fc: inline nvmet_fc_free_hostport	Daniel Wagner
	No need for this tiny helper with only one user, let's inline it. And since the hostport ref counter needs to stay in sync, it's not optional anymore to give back the reference. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-09	nvmet-fc: inline nvmet_fc_delete_assoc	Daniel Wagner
	No need for this tiny helper with only one user, just inline it. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-09	nvmet-fcloop: add ref counting to lport	Daniel Wagner
	The fcloop_lport objects live time is controlled by the user interface add_local_port and del_local_port. nport, rport and tport objects are pointing to the lport objects but here is no clear tracking. Let's introduce an explicit ref counter for the lport objects and prepare the stage for restructuring how lports are used. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-09	nvmet-fcloop: replace kref with refcount	Daniel Wagner
	The kref wrapper is not really adding any value ontop of refcount. Thus replace the kref API with the refcount API. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-09	nvmet-fcloop: swap list_add_tail arguments	Daniel Wagner
	The newly element to be added to the list is the first argument of list_add_tail. This fix is missing dcfad4ab4d67 ("nvmet-fcloop: swap the list_add_tail arguments"). Fixes: 437c0b824dbd ("nvme-fcloop: add target to host LS request support") Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-09	x86/bugs: Add RSB mitigation document	Josh Poimboeuf
	Create a document to summarize hard-earned knowledge about RSB-related mitigations, with references, and replace the overly verbose yet incomplete comments with a reference to the document. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/ab73f4659ba697a974759f07befd41ae605e33dd.1744148254.git.jpoimboe@kernel.org
2025-04-09	x86/bugs: Don't fill RSB on context switch with eIBRS	Josh Poimboeuf
	User->user Spectre v2 attacks (including RSB) across context switches are already mitigated by IBPB in cond_mitigation(), if enabled globally or if either the prev or the next task has opted in to protection. RSB filling without IBPB serves no purpose for protecting user space, as indirect branches are still vulnerable. User->kernel RSB attacks are mitigated by eIBRS. In which case the RSB filling on context switch isn't needed, so remove it. Suggested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Reviewed-by: Amit Shah <amit.shah@amd.com> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Link: https://lore.kernel.org/r/98cdefe42180358efebf78e3b80752850c7a3e1b.1744148254.git.jpoimboe@kernel.org
2025-04-09	x86/bugs: Don't fill RSB on VMEXIT with eIBRS+retpoline	Josh Poimboeuf
	eIBRS protects against guest->host RSB underflow/poisoning attacks. Adding retpoline to the mix doesn't change that. Retpoline has a balanced CALL/RET anyway. So the current full RSB filling on VMEXIT with eIBRS+retpoline is overkill. Disable it or do the VMEXIT_LITE mitigation if needed. Suggested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Reviewed-by: Amit Shah <amit.shah@amd.com> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: David Woodhouse <dwmw2@infradead.org> Link: https://lore.kernel.org/r/84a1226e5c9e2698eae1b5ade861f1b8bf3677dc.1744148254.git.jpoimboe@kernel.org
2025-04-09	x86/bugs: Fix RSB clearing in indirect_branch_prediction_barrier()	Josh Poimboeuf
	IBPB is expected to clear the RSB. However, if X86_BUG_IBPB_NO_RET is set, that doesn't happen. Make indirect_branch_prediction_barrier() take that into account by calling write_ibpb() which clears RSB on X86_BUG_IBPB_NO_RET: /* Make sure IBPB clears return stack preductions too. */ FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_BUG_IBPB_NO_RET Note that, as of the previous patch, write_ibpb() also reads 'x86_pred_cmd' in order to use SBPB when applicable: movl _ASM_RIP(x86_pred_cmd), %eax Therefore that existing behavior in indirect_branch_prediction_barrier() is not lost. Fixes: 50e4b3b94090 ("x86/entry: Have entry_ibpb() invalidate return predictions") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Link: https://lore.kernel.org/r/bba68888c511743d4cd65564d1fc41438907523f.1744148254.git.jpoimboe@kernel.org
2025-04-09	x86/bugs: Use SBPB in write_ibpb() if applicable	Josh Poimboeuf
	write_ibpb() does IBPB, which (among other things) flushes branch type predictions on AMD. If the CPU has SRSO_NO, or if the SRSO mitigation has been disabled, branch type flushing isn't needed, in which case the lighter-weight SBPB can be used. The 'x86_pred_cmd' variable already keeps track of whether IBPB or SBPB should be used. Use that instead of hardcoding IBPB. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/17c5dcd14b29199b75199d67ff7758de9d9a4928.1744148254.git.jpoimboe@kernel.org
2025-04-09	x86/bugs: Rename entry_ibpb() to write_ibpb()	Josh Poimboeuf
	There's nothing entry-specific about entry_ibpb(). In preparation for calling it from elsewhere, rename it to write_ibpb(). Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/1e54ace131e79b760de3fe828264e26d0896e3ac.1744148254.git.jpoimboe@kernel.org
2025-04-09	x86/early_printk: Use 'mmio32' for consistency, fix comments	Andy Shevchenko
	First of all, using 'mmio' prevents proper implementation of 8-bit accessors. Second, it's simply inconsistent with uart8250 set of options. Rename it to 'mmio32'. While at it, remove rather misleading comment in the documentation. From now on mmio32 is self-explanatory and pciserial supports not only 32-bit MMIO accessors. Also, while at it, fix the comment for the "pciserial" case. The comment seems to be a copy'n'paste error when mentioning "serial" instead of "pciserial" (with double quotes). Fix this. With that, move it upper, so we don't calculate 'buf' twice. Fixes: 3181424aeac2 ("x86/early_printk: Add support for MMIO-based UARTs") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Denis Mukhin <dmukhin@ford.com> Link: https://lore.kernel.org/r/20250407172214.792745-1-andriy.shevchenko@linux.intel.com