linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-05-02	libbpf: Fix bpf_ksym_exists() in GCC	Jose E. Marchesi
	The macro bpf_ksym_exists is defined in bpf_helpers.h as: #define bpf_ksym_exists(sym) ({ \ _Static_assert(!__builtin_constant_p(!!sym), #sym " should be marked as __weak"); \ !!sym; \ }) The purpose of the macro is to determine whether a given symbol has been defined, given the address of the object associated with the symbol. It also has a compile-time check to make sure the object whose address is passed to the macro has been declared as weak, which makes the check on `sym' meaningful. As it happens, the check for weak doesn't work in GCC in all cases, because __builtin_constant_p not always folds at parse time when optimizing. This is because optimizations that happen later in the compilation process, like inlining, may make a previously non-constant expression a constant. This results in errors like the following when building the selftests with GCC: bpf_helpers.h:190:24: error: expression in static assertion is not constant 190 \| _Static_assert(!__builtin_constant_p(!!sym), #sym " should be marked as __weak"); \ \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fortunately recent versions of GCC support a __builtin_has_attribute that can be used to directly check for the __weak__ attribute. This patch changes bpf_helpers.h to use that builtin when building with a recent enough GCC, and to omit the check if GCC is too old to support the builtin. The macro used for GCC becomes: #define bpf_ksym_exists(sym) ({ \ _Static_assert(__builtin_has_attribute (sym, __weak__), #sym " should be marked as __weak"); \ !!sym; \ }) Note that since bpf_ksym_exists is designed to get the address of the object associated with symbol SYM, we pass sym to __builtin_has_attribute instead of sym. When an expression is passed to __builtin_has_attribute then it is the type of the passed expression that is checked for the specified attribute. The expression itself is not evaluated. This accommodates well with the existing usages of the macro: - For function objects: struct task_struct bpf_task_acquire(struct task_struct p) __ksym __weak; [...] bpf_ksym_exists(bpf_task_acquire) - For variable objects: extern const struct rq runqueues __ksym __weak; /* typed */ [...] bpf_ksym_exists(&runqueues) Note also that BPF support was added in GCC 10 and support for __builtin_has_attribute in GCC 9. Locally tested in bpf-next master branch. No regressions. Signed-of-by: Jose E. Marchesi <jose.marchesi@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240428112559.10518-1-jose.marchesi@oracle.com
2024-05-03	slimbus: qcom-ngd-ctrl: Add timeout for wait operation	Viken Dadhaniya
	In current driver qcom_slim_ngd_up_worker() indefinitely waiting for ctrl->qmi_up completion object. This is resulting in workqueue lockup on Kthread. Added wait_for_completion_interruptible_timeout to allow the thread to wait for specific timeout period and bail out instead waiting infinitely. Fixes: a899d324863a ("slimbus: qcom-ngd-ctrl: add Sub System Restart support") Cc: stable@vger.kernel.org Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Viken Dadhaniya <quic_vdadhani@quicinc.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20240430091238.35209-2-srinivas.kandagatla@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-03	spi: pxa2xx: Don't provide struct chip_data for others	Andy Shevchenko
	Now the struct chip_data is local to spi-pxa2xx.c, move its definition to the C file. This will slightly speed up a build and also hide badly named data type (too generic). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-10-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: pxa2xx: Remove timeout field from struct chip_data	Andy Shevchenko
	The timeout field is used only once and assigned to a predefined constant. Replace all that by using the constant directly. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-9-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: pxa2xx: Remove DMA parameters from struct chip_data	Andy Shevchenko
	The DMA related fields are set once and never modified. It effectively repeats the content of the same fields in struct pxa2xx_spi_controller. With that, remove DMA parameters from struct chip_data. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-8-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: pxa2xx: Drop struct pxa2xx_spi_chip	Andy Shevchenko
	No more users. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-7-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: pxa2xx: Don't use "proxy" headers	Andy Shevchenko
	Update header inclusions to follow IWYU (Include What You Use) principle. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-6-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: pxa2xx: Remove outdated documentation	Andy Shevchenko
	The documentation is referring to the legacy enumeration of the SPI host controllers and target devices. It has nothing to do with the modern way, which is the only supported in kernel right now. Hence, remove outdated documentation file. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-5-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: pxa2xx: Move contents of linux/spi/pxa2xx_spi.h to a local one	Andy Shevchenko
	There is no user of the linux/spi/pxa2xx_spi.h. Move its contents to the drivers/spi/spi-pxa2xx.h. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-4-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: pxa2xx: Provide num-cs for Sharp PDAs via device properties	Andy Shevchenko
	Since driver can parse num-cs device property, replace platform data with this new approach. This pursues the following objectives: - getting rid of the public header that barely used outside of the SPI subsystem (more specifically the SPI PXA2xx drivers) - making a trampoline for the driver to support non-default number of the chip select pins in case the original code is going to be converted to Device Tree model It's not expected to have more users in board files except this one. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-3-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: pxa2xx: Allow number of chip select pins to be read from property	Andy Shevchenko
	In some cases the number of the chip select pins might come from the device property. Allow driver to use it. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-2-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: dt-bindings: ti,qspi: convert to dtschema	Kousik Sanagavarapu
	Convert txt binding of TI's qspi controller (found on their omap SoCs) to dtschema to allow for validation. The changes, w.r.t. the original txt binding, are: - Introduce "clocks" and "clock-names" which was never mentioned. - Reflect that "ti,hwmods" is deprecated and is not a "required" property anymore. - Introduce "num-cs" which allows for setting the number of chip selects. - Drop "qspi_ctrlmod". Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240501165203.13763-1-five231003@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: bitbang: Add missing MODULE_DESCRIPTION()	Andy Shevchenko
	The modpost script is not happy WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/spi/spi-bitbang.o because there is a missing module description. Add it to the module. While at it, update the terminology in Kconfig section to be in align with added description along with the code comments. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240502171518.2792895-1-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: bitbang: Use NSEC_PER_*SEC rather than hard coding	Andy Shevchenko
	Use NSEC_PER_*SEC rather than the hard coded value of 1000s. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240502154825.2752464-1-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: dw: Drop default number of CS setting	Serge Semin
	DW APB/AHB SSI core now supports the procedure automatically detecting the number of native chip-select lines. Thus there is no longer point in defaulting to four CS if the platform doesn't specify the real number especially seeing the default number didn't correspond to any original DW APB/AHB databook. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20240424150657.9678-5-fancer.lancer@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: dw: Convert dw_spi::num_cs to u32	Serge Semin
	Number of native chip-select lines is either retrieved from the "num-cs" DT-property or auto-detected in the generic DW APB/AHB SSI probe method. In the former case the property is supposed to be of the "u32" size. Convert the field type to being u32 then to be able to drop the temporary variable afterwards. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20240424150657.9678-4-fancer.lancer@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: dw: Add a number of native CS auto-detection	Serge Semin
	Aside with the FIFO depth and DFS field size it's possible to auto-detect a number of native chip-select synthesized in the DW APB/AHB SSI IP-core. It can be done just by writing ones to the SER register. The number of writable flags in the register is limited by the SSI_NUM_SLAVES IP-core synthesize parameter. All the upper flags are read-only and wired to zero. Based on that let's add the number of native CS auto-detection procedure so the low-level platform drivers wouldn't need to manually set it up unless it's required to set a constraint due to platform-specific reasons (for instance, due to a hardware bug). Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20240424150657.9678-3-fancer.lancer@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: dw: Convert to using BITS_TO_BYTES() macro	Serge Semin
	Since commit dd3e7cba1627 ("ocfs2/dlm: move BITS_TO_BYTES() to bitops.h for wider use") there is a generic helper available to calculate a number of bytes needed to accommodate the specified number of bits. Let's use it instead of the hard-coded DIV_ROUND_UP() macro function. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20240424150657.9678-2-fancer.lancer@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	hwmon: (da9052) Use devm_regulator_get_enable_read_voltage()	David Lechner
	We can reduce boilerplate code by using devm_regulator_get_enable_read_voltage(). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: David Lechner <dlechner@baylibre.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240429-regulator-get-enable-get-votlage-v2-3-b1f11ab766c1@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	hwmon: (adc128d818) Use devm_regulator_get_enable_read_voltage()	David Lechner
	We can reduce boilerplate code and eliminate the driver remove() function by using devm_regulator_get_enable_read_voltage(). A new external_vref flag is added since we no longer have the handle to the regulator to check if it is present. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: David Lechner <dlechner@baylibre.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240429-regulator-get-enable-get-votlage-v2-2-b1f11ab766c1@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	regulator: devres: add API for reference voltage supplies	David Lechner
	A common use case for regulators is to supply a reference voltage to an analog input or output device. This adds a new devres API to get, enable, and get the voltage in a single call. This allows eliminating boilerplate code in drivers that use reference supplies in this way. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20240429-regulator-get-enable-get-votlage-v2-1-b1f11ab766c1@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	ASoC: codecs: Drop explicit initialization of struct ↵	Uwe Kleine-König
	i2c_device_id::driver_data to 0 These drivers don't use the driver_data member of struct i2c_device_id, so don't explicitly initialize this member. This prepares putting driver_data in an anonymous union which requires either no initialization or named designators. But it's also a nice cleanup on its own. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20240502074722.1103986-2-u.kleine-koenig@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	spi: stm32: enable controller before asserting CS	Ben Wolsieffer
	On the STM32F4/7, the MOSI and CLK pins float while the controller is disabled. CS is a regular GPIO, and therefore always driven. Currently, the controller is enabled in the transfer_one() callback, which runs after CS is asserted. Therefore, there is a period where the SPI pins are floating while CS is asserted, making it possible for stray signals to disrupt communications. An analogous problem occurs at the end of the transfer when the controller is disabled before CS is released. This problem can be reliably observed by enabling the pull-up (if CPOL=0) or pull-down (if CPOL=1) on the clock pin. This will cause two extra unintended clock edges per transfer, when the controller is enabled and disabled. Note that this bug is likely not present on the STM32H7, because this driver sets the AFCNTR bit (not supported on F4/F7), which keeps the SPI pins driven even while the controller is disabled. Enabling/disabling the controller as part of runtime PM was suggested as an alternative approach, but this breaks the driver on the STM32MP1 (see [1]). The following quote from the manual may explain this: > To restart the internal state machine properly, SPI is strongly > suggested to be disabled and re-enabled before next transaction starts > despite its setting is not changed. This patch has been tested on an STM32F746 with a MAX14830 UART expander. [1] https://lore.kernel.org/lkml/ZXzRi_h2AMqEhMVw@dell-precision-5540/T/ Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com> Link: https://lore.kernel.org/r/20240424135237.1329001-2-ben.wolsieffer@hefring.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03	ASoC: amd: acp: fix for acp platform device creation failure	Vijendar Mukunda
	ACP pin configuration varies based on acp version. ACP PCI driver should read the ACP PIN config value and based on config value, it has to create a platform device in below two conditions. 1) If ACP PDM configuration is selected from BIOS and ACP PDM controller exists. 2) If ACP I2S configuration is selected from BIOS. Other than above scenarios, ACP PCI driver should skip the platform device creation logic, i.e. ACP PCI driver probe sequence should never fail if other acp pin configuration is selected. It should skip platform device creation logic. check_acp_pdm() function was implemented for ACP6.x platforms to check ACP PDM configuration. Previously, this code was safe guarded by FLAG_AMD_LEGACY_ONLY_DMIC flag check. This implementation breaks audio use cases for Huawei Matebooks which are based on ACP3.x varaint uses I2S configuration. In current scenario, check_acp_pdm() function returns -ENODEV value which results in ACP PCI driver probe failure without creating a platform device even in case of valid ACP pin configuration. Implement check_acp_config() as a common function which invokes platform specific acp pin configuration check functions for ACP3.x, ACP6.0 & ACP6.3 & ACP7.0 variants and checks for ACP PDM controller. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218780 Fixes: 4af565de9f8c ("ASoC: amd: acp: fix for acp pdm configuration check") Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Link: https://lore.kernel.org/r/20240502140340.4049021-1-Vijendar.Mukunda@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-02	tcp: Use refcount_inc_not_zero() in tcp_twsk_unique().	Kuniyuki Iwashima
	Anderson Nascimento reported a use-after-free splat in tcp_twsk_unique() with nice analysis. Since commit ec94c2696f0b ("tcp/dccp: avoid one atomic operation for timewait hashdance"), inet_twsk_hashdance() sets TIME-WAIT socket's sk_refcnt after putting it into ehash and releasing the bucket lock. Thus, there is a small race window where other threads could try to reuse the port during connect() and call sock_hold() in tcp_twsk_unique() for the TIME-WAIT socket with zero refcnt. If that happens, the refcnt taken by tcp_twsk_unique() is overwritten and sock_put() will cause underflow, triggering a real use-after-free somewhere else. To avoid the use-after-free, we need to use refcount_inc_not_zero() in tcp_twsk_unique() and give up on reusing the port if it returns false. [0]: refcount_t: addition on 0; use-after-free. WARNING: CPU: 0 PID: 1039313 at lib/refcount.c:25 refcount_warn_saturate+0xe5/0x110 CPU: 0 PID: 1039313 Comm: trigger Not tainted 6.8.6-200.fc39.x86_64 #1 Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023 RIP: 0010:refcount_warn_saturate+0xe5/0x110 Code: 42 8e ff 0f 0b c3 cc cc cc cc 80 3d aa 13 ea 01 00 0f 85 5e ff ff ff 48 c7 c7 f8 8e b7 82 c6 05 96 13 ea 01 01 e8 7b 42 8e ff <0f> 0b c3 cc cc cc cc 48 c7 c7 50 8f b7 82 c6 05 7a 13 ea 01 01 e8 RSP: 0018:ffffc90006b43b60 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff888009bb3ef0 RCX: 0000000000000027 RDX: ffff88807be218c8 RSI: 0000000000000001 RDI: ffff88807be218c0 RBP: 0000000000069d70 R08: 0000000000000000 R09: ffffc90006b439f0 R10: ffffc90006b439e8 R11: 0000000000000003 R12: ffff8880029ede84 R13: 0000000000004e20 R14: ffffffff84356dc0 R15: ffff888009bb3ef0 FS: 00007f62c10926c0(0000) GS:ffff88807be00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020ccb000 CR3: 000000004628c005 CR4: 0000000000f70ef0 PKRU: 55555554 Call Trace: <TASK> ? refcount_warn_saturate+0xe5/0x110 ? __warn+0x81/0x130 ? refcount_warn_saturate+0xe5/0x110 ? report_bug+0x171/0x1a0 ? refcount_warn_saturate+0xe5/0x110 ? handle_bug+0x3c/0x80 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? refcount_warn_saturate+0xe5/0x110 tcp_twsk_unique+0x186/0x190 __inet_check_established+0x176/0x2d0 __inet_hash_connect+0x74/0x7d0 ? __pfx___inet_check_established+0x10/0x10 tcp_v4_connect+0x278/0x530 __inet_stream_connect+0x10f/0x3d0 inet_stream_connect+0x3a/0x60 __sys_connect+0xa8/0xd0 __x64_sys_connect+0x18/0x20 do_syscall_64+0x83/0x170 entry_SYSCALL_64_after_hwframe+0x78/0x80 RIP: 0033:0x7f62c11a885d Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a3 45 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007f62c1091e58 EFLAGS: 00000296 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000020ccb004 RCX: 00007f62c11a885d RDX: 0000000000000010 RSI: 0000000020ccb000 RDI: 0000000000000003 RBP: 00007f62c1091e90 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000296 R12: 00007f62c10926c0 R13: ffffffffffffff88 R14: 0000000000000000 R15: 00007ffe237885b0 </TASK> Fixes: ec94c2696f0b ("tcp/dccp: avoid one atomic operation for timewait hashdance") Reported-by: Anderson Nascimento <anderson@allelesecurity.com> Closes: https://lore.kernel.org/netdev/37a477a6-d39e-486b-9577-3463f655a6b7@allelesecurity.com/ Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240501213145.62261-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02	tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets	Eric Dumazet
	TCP_SYN_RECV state is really special, it is only used by cross-syn connections, mostly used by fuzzers. In the following crash [1], syzbot managed to trigger a divide by zero in tcp_rcv_space_adjust() A socket makes the following state transitions, without ever calling tcp_init_transfer(), meaning tcp_init_buffer_space() is also not called. TCP_CLOSE connect() TCP_SYN_SENT TCP_SYN_RECV shutdown() -> tcp_shutdown(sk, SEND_SHUTDOWN) TCP_FIN_WAIT1 To fix this issue, change tcp_shutdown() to not perform a TCP_SYN_RECV -> TCP_FIN_WAIT1 transition, which makes no sense anyway. When tcp_rcv_state_process() later changes socket state from TCP_SYN_RECV to TCP_ESTABLISH, then look at sk->sk_shutdown to finally enter TCP_FIN_WAIT1 state, and send a FIN packet from a sane socket state. This means tcp_send_fin() can now be called from BH context, and must use GFP_ATOMIC allocations. [1] divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 1 PID: 5084 Comm: syz-executor358 Not tainted 6.9.0-rc6-syzkaller-00022-g98369dccd2f8 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 RIP: 0010:tcp_rcv_space_adjust+0x2df/0x890 net/ipv4/tcp_input.c:767 Code: e3 04 4c 01 eb 48 8b 44 24 38 0f b6 04 10 84 c0 49 89 d5 0f 85 a5 03 00 00 41 8b 8e c8 09 00 00 89 e8 29 c8 48 0f af c3 31 d2 <48> f7 f1 48 8d 1c 43 49 8d 96 76 08 00 00 48 89 d0 48 c1 e8 03 48 RSP: 0018:ffffc900031ef3f0 EFLAGS: 00010246 RAX: 0c677a10441f8f42 RBX: 000000004fb95e7e RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000027d4b11f R08: ffffffff89e535a4 R09: 1ffffffff25e6ab7 R10: dffffc0000000000 R11: ffffffff8135e920 R12: ffff88802a9f8d30 R13: dffffc0000000000 R14: ffff88802a9f8d00 R15: 1ffff1100553f2da FS: 00005555775c0380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1155bf2304 CR3: 000000002b9f2000 CR4: 0000000000350ef0 Call Trace: <TASK> tcp_recvmsg_locked+0x106d/0x25a0 net/ipv4/tcp.c:2513 tcp_recvmsg+0x25d/0x920 net/ipv4/tcp.c:2578 inet6_recvmsg+0x16a/0x730 net/ipv6/af_inet6.c:680 sock_recvmsg_nosec net/socket.c:1046 [inline] sock_recvmsg+0x109/0x280 net/socket.c:1068 ____sys_recvmsg+0x1db/0x470 net/socket.c:2803 ___sys_recvmsg net/socket.c:2845 [inline] do_recvmmsg+0x474/0xae0 net/socket.c:2939 __sys_recvmmsg net/socket.c:3018 [inline] __do_sys_recvmmsg net/socket.c:3041 [inline] __se_sys_recvmmsg net/socket.c:3034 [inline] __x64_sys_recvmmsg+0x199/0x250 net/socket.c:3034 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7faeb6363db9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 c1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffcc1997168 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007faeb6363db9 RDX: 0000000000000001 RSI: 0000000020000bc0 RDI: 0000000000000005 RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000001c R10: 0000000000000122 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000001 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Link: https://lore.kernel.org/r/20240501125448.896529-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02	net_sched: sch_sfq: annotate data-races around q->perturb_period	Eric Dumazet
	sfq_perturbation() reads q->perturb_period locklessly. Add annotations to fix potential issues. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240430180015.3111398-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02	net: dsa: mv88e6xxx: Correct check for empty list	Simon Horman
	Since commit a3c53be55c95 ("net: dsa: mv88e6xxx: Support multiple MDIO busses") mv88e6xxx_default_mdio_bus() has checked that the return value of list_first_entry() is non-NULL. This appears to be intended to guard against the list chip->mdios being empty. However, it is not the correct check as the implementation of list_first_entry is not designed to return NULL for empty lists. Instead, use list_first_entry_or_null() which does return NULL if the list is empty. Flagged by Smatch. Compile tested only. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240430-mv88e6xx-list_empty-v3-1-c35c69d88d2e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02	selftests/net: skip partial checksum packets in csum test	Willem de Bruijn
	Detect packets with ip_summed CHECKSUM_PARTIAL and skip these. These should not exist, as the test sends individual packets between two hosts. But if (HW) GRO is on, with randomized content sometimes subsequent packets can be coalesced. In this case the GSO packet checksum is converted to a pseudo checksum in anticipation of sending out as TSO/USO. So the field will not match the expected value. Do not count these as test errors. Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240501193156.3627344-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02	selftests: net: py: check process exit code in bkg() and background cmd()	Jakub Kicinski
	We're a bit too loose with error checking for background processes. cmd() completely ignores the fail argument passed to the constructor if background is True. Default to checking for errors if process is not terminated explicitly. Caller can override with True / False. For bkg() the processing step is called magically by __exit__ so record the value passed in the constructor. Reported-by: Willem de Bruijn <willemb@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240502025325.1924923-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02	IB/hfi1: allocate dummy net_device dynamically	Breno Leitao
	Embedding net_device into structures prohibits the usage of flexible arrays in the net_device structure. For more details, see the discussion at [1]. Un-embed the net_device from struct hfi1_netdev_rx by converting it into a pointer. Then use the leverage alloc_netdev() to allocate the net_device object at hfi1_alloc_rx(). [1] https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/ Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Leon Romanovsky <leon@kernel.org> Link: https://lore.kernel.org/r/20240430162213.746492-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02	net/mlx5e: flower: check for unsupported control flags	Asbjørn Sloth Tønnesen
	Use flow_rule_is_supp_control_flags() to reject filters with unsupported control flags. In case any unsupported control flags are masked, flow_rule_is_supp_control_flags() sets a NL extended error message, and we return -EOPNOTSUPP. Remove FLOW_DIS_FIRST_FRAG specific error message, and treat it as any other unsupported control flag. Only compile-tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20240422152728.175677-1-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-03	Merge tag 'drm-misc-fixes-2024-05-02' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: imagination: - fix page-count macro nouveau: - avoid page-table allocation failures - fix firmware memory allocation panel: - ili9341: avoid OF for device properties; respect deferred probe; fix usage of errno codes ttm: - fix status output vmwgfx: - fix legacy display unit - fix read length in fence signalling Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240502192117.GA12158@linux.fritz.box
2024-05-03	Merge tag 'drm-xe-fixes-2024-05-02' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Fix UAF on rebind worker - Fix ADL-N display integration Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/6bontwst3mbxozs6u3ad5n3g5zmaucrngbfwv4hkfhpscnwlym@wlwjgjx6pwue
2024-05-03	Merge tag 'drm-xe-next-fixes-2024-05-02' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Driver Changes: - Fix for a backmerge going slightly wrong. - An UAF fix - Avoid a WA error on LNL. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZjOijQA43zhu3SZ4@fedora
2024-05-03	Merge tag 'amd-drm-fixes-6.9-2024-05-01' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.9-2024-05-01: amdgpu: - Fix VRAM memory accounting - DCN 3.1 fixes - DCN 2.0 fix - DCN 3.1.5 fix - DCN 3.5 fix - DCN 3.2.1 fix - DP fixes - Seamless boot fix - Fix call order in amdgpu_ttm_move() - Fix doorbell regression - Disable panel replay temporarily amdkfd: - Flush wq before creating kfd process Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240501135054.1919108-1-alexander.deucher@amd.com
2024-05-02	libbpf: fix ring_buffer__consume_n() return result logic	Andrii Nakryiko
	Add INT_MAX check to ring_buffer__consume_n(). We do the similar check to handle int return result of all these ring buffer APIs in other APIs and ring_buffer__consume_n() is missing one. This patch fixes this omission. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20240430201952.888293-2-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02	libbpf: fix potential overflow in ring__consume_n()	Andrii Nakryiko
	ringbuf_process_ring() return int64_t, while ring__consume_n() assigns it to int. It's highly unlikely, but possible for ringbuf_process_ring() to return value larger than INT_MAX, so use int64_t. ring__consume_n() does check INT_MAX before returning int result to the user. Fixes: 4d22ea94ea33 ("libbpf: Add ring__consume_n / ring_buffer__consume_n") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20240430201952.888293-1-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02	Merge branch 'Add new args into tcp_congestion_ops' cong_control'	Martin KaFai Lau
	Miao Xu says: ==================== This patchset attempts to add two new arguments into the hookpoint cong_control in tcp_congestion_ops. The new arguments are inherited from the caller tcp_cong_control and can be used by any bpf cc prog that implements its own logic inside this hookpoint. Please review. Thanks a lot! Changelog ===== v2->v3: - Fixed the broken selftest caused by the new arguments. - Renamed the selftest file name and bpf prog name. v1->v2: - Split the patchset into 3 separate patches. - Added highlights in the selftest prog. - Removed the dependency on bpf_tcp_helpers.h. ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02	selftests/bpf: Add test for the use of new args in cong_control	Miao Xu
	This patch adds a selftest to show the usage of the new arguments in cong_control. For simplicity's sake, the testing example reuses cubic's kernel functions. Signed-off-by: Miao Xu <miaxu@meta.com> Link: https://lore.kernel.org/r/20240502042318.801932-4-miaxu@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02	bpf: tcp: Allow to write tp->snd_cwnd_stamp in bpf_tcp_ca	Miao Xu
	This patch allows the write of tp->snd_cwnd_stamp in a bpf tcp ca program. An use case of writing this field is to keep track of the time whenever tp->snd_cwnd is raised or reduced inside the `cong_control` callback. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Miao Xu <miaxu@meta.com> Link: https://lore.kernel.org/r/20240502042318.801932-3-miaxu@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02	tcp: Add new args for cong_control in tcp_congestion_ops	Miao Xu
	This patch adds two new arguments for cong_control of struct tcp_congestion_ops: - ack - flag These two arguments are inherited from the caller tcp_cong_control in tcp_intput.c. One use case of them is to update cwnd and pacing rate inside cong_control based on the info they provide. For example, the flag can be used to decide if it is the right time to raise or reduce a sender's cwnd. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Miao Xu <miaxu@meta.com> Link: https://lore.kernel.org/r/20240502042318.801932-2-miaxu@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02	KVM: selftests: Require KVM_CAP_USER_MEMORY2 for tests that create memslots	Sean Christopherson
	Explicitly require KVM_CAP_USER_MEMORY2 for selftests that create memslots, i.e. skip selftests that need memslots instead of letting them fail on KVM_SET_USER_MEMORY_REGION2. While it's ok to take a dependency on new kernel features, selftests should skip gracefully instead of failing hard when run on older kernels. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/69ae0694-8ca3-402c-b864-99b500b24f5d@moroto.mountain Suggested-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20240430162133.337541-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-05-02	KVM: selftests: Allow skipping the KVM_RUN sanity check in rseq_test	Zide Chen
	The rseq test's migration worker delays 1-10 us, assuming that one KVM_RUN iteration only takes a few microseconds. But if the CPU low power wakeup latency is large enough, for example, hundreds or even thousands of microseconds for deep C-state exit latencies on x86 server CPUs, it may happen that the target CPU is unable to wakeup and run the vCPU before the migration worker starts to migrate the vCPU thread to the _next_ CPU. If the system workload is light, most CPUs could be at a certain low power state, which may result in less successful migrations and fail the migration/KVM_RUN ratio sanity check. But this is not supposed to be deemed a test failure. Add a command line option to skip the sanity check, along with a comment and a verbose assert message to try to help the user resolve the potential source of failures without having to resort to disabling the check. Co-developed-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com> Signed-off-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Link: https://lore.kernel.org/r/20240502213936.27619-1-zide.chen@intel.com [sean: massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-05-02	Merge branch 'selftests/bpf: Add sockaddr tests for kernel networking'	Martin KaFai Lau
	Jordan Rife says: ==================== This patch series adds test coverage for BPF sockaddr hooks and their interactions with kernel socket functions (i.e. kernel_bind(), kernel_connect(), kernel_sendmsg(), sock_sendmsg(), kernel_getpeername(), and kernel_getsockname()) while also rounding out IPv4 and IPv6 sockaddr hook coverage in prog_tests/sock_addr.c. As with v1 of this patch series, we add regression coverage for the issues addressed by these patches, - commit 0bdf399342c5("net: Avoid address overwrite in kernel_connect") - commit 86a7e0b69bd5("net: prevent rewrite of msg_name in sock_sendmsg()") - commit c889a99a21bf("net: prevent address rewrite in kernel_bind()") - commit 01b2885d9415("net: Save and restore msg_namelen in sock_sendmsg") but broaden the focus a bit. In order to extend prog_tests/sock_addr.c to test these kernel functions, we add a set of new kfuncs that wrap individual socket operations to bpf_testmod and invoke them through set of corresponding SYSCALL programs (progs/sock_addr_kern.c). Each test case can be configured to use a different set of "sock_ops" depending on whether it is testing kernel calls (kernel_bind(), kernel_connect(), etc.) or system calls (bind(), connect(), etc.). ======= Patches ======= * Patch 1 fixes the sock_addr bind test program to work for big endian architectures such as s390x. * Patch 2 introduces the new kfuncs to bpf_testmod. * Patch 3 introduces the BPF program which allows us to invoke these kfuncs invividually from the test program. * Patch 4 lays the groundwork for IPv4 and IPv6 sockaddr hook coverage by migrating much of the environment setup logic from bpf/test_sock_addr.sh into prog_tests/sock_addr.c and moves test cases to cover bind4/6, connect4/6, sendmsg4/6 and recvmsg4/6 hooks. * Patch 5 makes the set of socket operations for each test case configurable, laying the groundwork for Patch 6. * Patch 6 introduces two sets of sock_ops that invoke the kernel equivalents of connect(), bind(), etc. and uses these to add coverage for the kernel socket functions. ======= Changes ======= v2->v3 ------ * Renamed bind helpers. Dropped "_ntoh" suffix. * Added guards to kfuncs to make sure addrlen and msglen do not exceed the buffer capacity. * Added KF_SLEEPABLE flag to kfuncs. * Added a mutex (sock_lock) to kfuncs to serialize access to sock. * Added NULL check for sock to each kfunc. * Use the "sock_addr" networking namespace for all network interface setup and testing. * Use "nodad" when calling "ip -6 addr add" during interface setup to avoid delays and remove ping loop. * Removed test cases from test_sock_addr.c to make it clear what remains to be migrated. * Removed unused parameter (expect_change) from sock_addr_op(). Link: https://lore.kernel.org/bpf/20240412165230.2009746-1-jrife@google.com/T/#u v1->v2 ------ * Dropped test_progs/sock_addr_kern.c and the sock_addr_kern test module in favor of simply expanding bpf_testmod and test_progs/sock_addr.c. * Migrated environment setup logic from bpf/test_sock_addr.sh into prog_tests/sock_addr.c rather than invoking the script from the test program. * Added kfuncs to bpf_testmod as well as the sock_addr_kern BPF program to enable us to invoke kernel socket functions from test_progs/sock_addr.c. * Added test coverage for kernel socket functions to test_progs/sock_addr.c. Link: https://lore.kernel.org/bpf/20240329191907.1808635-1-jrife@google.com/T/#u ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02	selftests/bpf: Add kernel socket operation tests	Jordan Rife
	This patch creates two sets of sock_ops that call out to the SYSCALL hooks in the sock_addr_kern BPF program and uses them to construct test cases for the range of supported operations (kernel_connect(), kernel_bind(), kernel_sendms(), sock_sendmsg(), kernel_getsockname(), kenel_getpeername()). This ensures that these interact with BPF sockaddr hooks as intended. Beyond this it also ensures that these operations do not modify their address parameter, providing regression coverage for the issues addressed by this set of patches: - commit 0bdf399342c5("net: Avoid address overwrite in kernel_connect") - commit 86a7e0b69bd5("net: prevent rewrite of msg_name in sock_sendmsg()") - commit c889a99a21bf("net: prevent address rewrite in kernel_bind()") - commit 01b2885d9415("net: Save and restore msg_namelen in sock_sendmsg") Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240429214529.2644801-7-jrife@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02	selftests/bpf: Make sock configurable for each test case	Jordan Rife
	In order to reuse the same test code for both socket system calls (e.g. connect(), bind(), etc.) and kernel socket functions (e.g. kernel_connect(), kernel_bind(), etc.), this patch introduces the "ops" field to sock_addr_test. This field allows each test cases to configure the set of functions used in the test case to create, manipulate, and tear down a socket. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240429214529.2644801-6-jrife@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02	selftests/bpf: Move IPv4 and IPv6 sockaddr test cases	Jordan Rife
	This patch lays the groundwork for testing IPv4 and IPv6 sockaddr hooks and their interaction with both socket syscalls and kernel functions (e.g. kernel_connect, kernel_bind, etc.). It moves some of the test cases from the old-style bpf/test_sock_addr.c self test into the sock_addr prog_test in a step towards fully retiring bpf/test_sock_addr.c. We will expand the test dimensions in the sock_addr prog_test in a later patch series in order to migrate the remaining test cases. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240429214529.2644801-5-jrife@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02	Merge tag 'md-6.10-20240502' of ↵	Jens Axboe
	https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.10/block Pull MD fix from Song: "This fixes an issue observed with dm-raid." * tag 'md-6.10-20240502' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: md: fix resync softlockup when bitmap size is less than array size
2024-05-02	md: fix resync softlockup when bitmap size is less than array size	Yu Kuai
	Is is reported that for dm-raid10, lvextend + lvchange --syncaction will trigger following softlockup: kernel:watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [mdX_resync:6976] CPU: 7 PID: 3588 Comm: mdX_resync Kdump: loaded Not tainted 6.9.0-rc4-next-20240419 #1 RIP: 0010:_raw_spin_unlock_irq+0x13/0x30 Call Trace: <TASK> md_bitmap_start_sync+0x6b/0xf0 raid10_sync_request+0x25c/0x1b40 [raid10] md_do_sync+0x64b/0x1020 md_thread+0xa7/0x170 kthread+0xcf/0x100 ret_from_fork+0x30/0x50 ret_from_fork_asm+0x1a/0x30 And the detailed process is as follows: md_do_sync j = mddev->resync_min while (j < max_sectors) sectors = raid10_sync_request(mddev, j, &skipped) if (!md_bitmap_start_sync(..., &sync_blocks)) // md_bitmap_start_sync set sync_blocks to 0 return sync_blocks + sectors_skippe; // sectors = 0; j += sectors; // j never change Root cause is that commit 301867b1c168 ("md/raid10: check slab-out-of-bounds in md_bitmap_get_counter") return early from md_bitmap_get_counter(), without setting returned blocks. Fix this problem by always set returned blocks from md_bitmap_get_counter"(), as it used to be. Noted that this patch just fix the softlockup problem in kernel, the case that bitmap size doesn't match array size still need to be fixed. Fixes: 301867b1c168 ("md/raid10: check slab-out-of-bounds in md_bitmap_get_counter") Reported-and-tested-by: Nigel Croxon <ncroxon@redhat.com> Closes: https://lore.kernel.org/all/71ba5272-ab07-43ba-8232-d2da642acb4e@redhat.com/ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240422065824.2516-1-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>