From d425aef66e62221fa6bb0ccb94296df29e4cc107 Mon Sep 17 00:00:00 2001 From: Anand Moon Date: Mon, 13 Oct 2025 20:50:03 +0530 Subject: arm64: dts: rockchip: Set correct pinctrl for I2S1 8ch TX on odroid-m1 Enable proper pin multiplexing for the I2S1 8-channel transmit interface by adding the default pinctrl configuration which esures correct signal routing and avoids pinmux conflicts during audio playback. Changes fix the error [ 116.856643] [ T782] rockchip-pinctrl pinctrl: pin gpio1-10 already requested by affinity_hint; cannot claim for fe410000.i2s [ 116.857567] [ T782] rockchip-pinctrl pinctrl: error -EINVAL: pin-42 (fe410000.i2s) [ 116.857618] [ T782] rockchip-pinctrl pinctrl: error -EINVAL: could not request pin 42 (gpio1-10) from group i2s1m0-sdi1 on device rockchip-pinctrl [ 116.857659] [ T782] rockchip-i2s-tdm fe410000.i2s: Error applying setting, reverse things back I2S1 on the M1 to the codec in the RK809 only uses the SCLK, LRCK, SDI0 and SDO0 signals, so limit the claimed pins to those. With this change audio output works as expected: $ aplay -l **** List of PLAYBACK Hardware Devices **** card 0: HDMI [HDMI], device 0: fe400000.i2s-i2s-hifi i2s-hifi-0 [fe400000.i2s-i2s-hifi i2s-hifi-0] Subdevices: 1/1 Subdevice #0: subdevice #0 card 1: RK817 [Analog RK817], device 0: fe410000.i2s-rk817-hifi rk817-hifi-0 [fe410000.i2s-rk817-hifi rk817-hifi-0] Subdevices: 1/1 Subdevice #0: subdevice #0 Fixes: 78f858447cb7 ("arm64: dts: rockchip: Add analog audio on ODROID-M1") Cc: Aurelien Jarno Signed-off-by: Anand Moon [adapted the commit message a bit] Signed-off-by: Heiko Stuebner --- arch/arm64/boot/dts/rockchip/rk3568-odroid-m1.dts | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/boot/dts/rockchip/rk3568-odroid-m1.dts b/arch/arm64/boot/dts/rockchip/rk3568-odroid-m1.dts index 0f844806ec54..442a2bc43ba8 100644 --- a/arch/arm64/boot/dts/rockchip/rk3568-odroid-m1.dts +++ b/arch/arm64/boot/dts/rockchip/rk3568-odroid-m1.dts @@ -482,6 +482,8 @@ }; &i2s1_8ch { + pinctrl-names = "default"; + pinctrl-0 = <&i2s1m0_sclktx &i2s1m0_lrcktx &i2s1m0_sdi0 &i2s1m0_sdo0>; rockchip,trcm-sync-tx-only; status = "okay"; }; -- cgit From e179de737d13ad99bd19ea0fafab759d4074a425 Mon Sep 17 00:00:00 2001 From: Andrey Leonchikov Date: Sun, 12 Oct 2025 14:33:36 +0200 Subject: arm64: dts: rockchip: Fix PCIe power enable pin for BigTreeTech CB2 and Pi2 Fix typo into regulator GPIO definition. With current definition, PCIe doesn't start up. Valid definition is already used in "pinctrl" section, "pcie_drv" (gpio4, RK_PB1). Fixes: bfbc663d2733a ("arm64: dts: rockchip: Add BigTreeTech CB2 and Pi2") Signed-off-by: Andrey Leonchikov Signed-off-by: Heiko Stuebner --- arch/arm64/boot/dts/rockchip/rk3566-bigtreetech-cb2.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3566-bigtreetech-cb2.dtsi b/arch/arm64/boot/dts/rockchip/rk3566-bigtreetech-cb2.dtsi index 7f578c50b4ad..f74590af7e33 100644 --- a/arch/arm64/boot/dts/rockchip/rk3566-bigtreetech-cb2.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3566-bigtreetech-cb2.dtsi @@ -120,7 +120,7 @@ compatible = "regulator-fixed"; regulator-name = "vcc3v3_pcie"; enable-active-high; - gpios = <&gpio0 RK_PB1 GPIO_ACTIVE_HIGH>; + gpios = <&gpio4 RK_PB1 GPIO_ACTIVE_HIGH>; pinctrl-names = "default"; pinctrl-0 = <&pcie_drv>; regulator-always-on; -- cgit From 05b80cd1f37db042e074ecc7ee0d39869fed2f52 Mon Sep 17 00:00:00 2001 From: Alexey Charkov Date: Thu, 9 Oct 2025 16:34:01 +0400 Subject: arm64: dts: rockchip: Remove non-functioning CPU OPPs from RK3576 Drop the top-frequency OPPs from both the LITTLE and big CPU clusters on RK3576, as neither the opensource TF-A [1] nor the recent (after v1.08) binary BL31 images provided by Rockchip expose those. This fixes the problem [2] when the cpufreq governor tries to jump directly to the highest-frequency OPP, which results in a failed SCMI call leaving the system stuck at the previous OPP before the attempted change. [1] https://github.com/ARM-software/arm-trusted-firmware/blob/master/plat/rockchip/rk3576/scmi/rk3576_clk.c#L264-L304 [2] https://lore.kernel.org/linux-rockchip/CABjd4Yz4NbqzZH4Qsed3ias56gcga9K6CmYA+BLDBxtbG915Ag@mail.gmail.com/ Fixes: 57b1ce903966 ("arm64: dts: rockchip: Add rk3576 SoC base DT") Cc: stable@vger.kernel.org Signed-off-by: Alexey Charkov Signed-off-by: Heiko Stuebner --- arch/arm64/boot/dts/rockchip/rk3576.dtsi | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3576.dtsi b/arch/arm64/boot/dts/rockchip/rk3576.dtsi index fc4e9e07f1cf..f0c3ab00a7f3 100644 --- a/arch/arm64/boot/dts/rockchip/rk3576.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3576.dtsi @@ -276,12 +276,6 @@ opp-microvolt = <900000 900000 950000>; clock-latency-ns = <40000>; }; - - opp-2208000000 { - opp-hz = /bits/ 64 <2208000000>; - opp-microvolt = <950000 950000 950000>; - clock-latency-ns = <40000>; - }; }; cluster1_opp_table: opp-table-cluster1 { @@ -348,12 +342,6 @@ opp-microvolt = <925000 925000 950000>; clock-latency-ns = <40000>; }; - - opp-2304000000 { - opp-hz = /bits/ 64 <2304000000>; - opp-microvolt = <950000 950000 950000>; - clock-latency-ns = <40000>; - }; }; gpu_opp_table: opp-table-gpu { -- cgit From afb5f84b216d14a71e2962ed569edcea30cf9763 Mon Sep 17 00:00:00 2001 From: Diederik de Haas Date: Wed, 8 Oct 2025 10:21:22 +0200 Subject: arm64: dts: rockchip: Drop 'rockchip,grf' prop from tsadc on rk3328 The 'rockchip,grf' property for tsadc in rk3328 wasn't actually used in the driver and is no longer allowed in the DT since commit e881662aa06a ("dt-bindings: thermal: rockchip: Tighten grf requirements") So remove that property which fixes the following DT validation issue tsadc@ff250000 (rockchip,rk3328-tsadc): rockchip,grf: False schema does not allow [[58]] Signed-off-by: Diederik de Haas Signed-off-by: Heiko Stuebner --- arch/arm64/boot/dts/rockchip/rk3328.dtsi | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3328.dtsi b/arch/arm64/boot/dts/rockchip/rk3328.dtsi index 283d9cbc4368..03b7c4313750 100644 --- a/arch/arm64/boot/dts/rockchip/rk3328.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3328.dtsi @@ -598,7 +598,6 @@ pinctrl-2 = <&otp_pin>; resets = <&cru SRST_TSADC>; reset-names = "tsadc-apb"; - rockchip,grf = <&grf>; rockchip,hw-tshut-temp = <100000>; #thermal-sensor-cells = <1>; status = "disabled"; -- cgit From b3fd04e23f6e4496f5a2279466a33fbdc83500f0 Mon Sep 17 00:00:00 2001 From: Dragan Simic Date: Sat, 6 Sep 2025 12:01:22 +0200 Subject: arm64: dts: rockchip: Make RK3588 GPU OPP table naming less generic Unify the naming of the existing GPU OPP table nodes found in the RK3588 and RK3588J SoC dtsi files with the other SoC's GPU OPP nodes, following the more "modern" node naming scheme. Fixes: a7b2070505a2 ("arm64: dts: rockchip: Split GPU OPPs of RK3588 and RK3588j") Signed-off-by: Dragan Simic [opp-table also is way too generic on systems with like 4-5 opp-tables] Signed-off-by: Heiko Stuebner --- arch/arm64/boot/dts/rockchip/rk3588-opp.dtsi | 2 +- arch/arm64/boot/dts/rockchip/rk3588j.dtsi | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3588-opp.dtsi b/arch/arm64/boot/dts/rockchip/rk3588-opp.dtsi index 0f1a77697351..b5d630d2c879 100644 --- a/arch/arm64/boot/dts/rockchip/rk3588-opp.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3588-opp.dtsi @@ -115,7 +115,7 @@ }; }; - gpu_opp_table: opp-table { + gpu_opp_table: opp-table-gpu { compatible = "operating-points-v2"; opp-300000000 { diff --git a/arch/arm64/boot/dts/rockchip/rk3588j.dtsi b/arch/arm64/boot/dts/rockchip/rk3588j.dtsi index 9884a5df47df..e1e0e3fc0ca7 100644 --- a/arch/arm64/boot/dts/rockchip/rk3588j.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3588j.dtsi @@ -66,7 +66,7 @@ }; }; - gpu_opp_table: opp-table { + gpu_opp_table: opp-table-gpu { compatible = "operating-points-v2"; opp-300000000 { -- cgit From ce121914f38aaa59504e20a1a625e5988fc6ead4 Mon Sep 17 00:00:00 2001 From: "Russell King (Oracle)" Date: Wed, 15 Oct 2025 16:52:26 +0100 Subject: arm64: tegra: Mark Jetson Xavier NX's PHY as a wakeup source Mark the RTL8211F PHY as a wakeup source for the Jetson Xavier NX. This allows the reworked RTL8211F driver to know that the PHY is wired to wakeup capable hardware, and thus to expose WoL capabilities. Signed-off-by: Russell King (Oracle) Acked-by: Jon Hunter Signed-off-by: Thierry Reding --- arch/arm64/boot/dts/nvidia/tegra194-p3668.dtsi | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/boot/dts/nvidia/tegra194-p3668.dtsi b/arch/arm64/boot/dts/nvidia/tegra194-p3668.dtsi index a410fc335fa3..c0f17f8189fa 100644 --- a/arch/arm64/boot/dts/nvidia/tegra194-p3668.dtsi +++ b/arch/arm64/boot/dts/nvidia/tegra194-p3668.dtsi @@ -42,6 +42,7 @@ interrupt-parent = <&gpio>; interrupts = ; #phy-cells = <0>; + wakeup-source; }; }; }; -- cgit From 85893094535cced32b33766e283240164a5b11f8 Mon Sep 17 00:00:00 2001 From: Tao Ren Date: Wed, 15 Oct 2025 13:48:37 -0700 Subject: ARM: dts: aspeed: fuji-data64: Enable mac3 controller "mac3" controller was removed from the initial version of fuji-data64 dts because the rgmii setting is incorrect, but dropping mac3 leads to regression in the existing fuji platform, because fuji.dts simply includes fuji-data64.dts. This patch adds mac3 back to fuji-data64.dts to fix the fuji regression[1], and rgmii settings need to be fixed later. Fixes: b0f294fdfc3e ("ARM: dts: aspeed: facebook-fuji: Include facebook-fuji-data64.dts") Link: https://lore.kernel.org/all/79ddc7b9-ef26-4959-9a16-aa4e006eb145@roeck-us.net/ [1] Signed-off-by: Tao Ren Reviewed-by: Andrew Lunn Signed-off-by: Andrew Jeffery --- .../boot/dts/aspeed/aspeed-bmc-facebook-fuji-data64.dts | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-fuji-data64.dts b/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-fuji-data64.dts index aa9576d8ab56..48ca25f57ef6 100644 --- a/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-fuji-data64.dts +++ b/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-fuji-data64.dts @@ -1254,3 +1254,17 @@ max-frequency = <25000000>; bus-width = <4>; }; + +/* + * FIXME: rgmii delay is introduced by MAC (configured in u-boot now) + * instead of PCB on fuji board, so the "phy-mode" should be updated to + * "rgmii-[tx|rx]id" when the aspeed-mac driver can handle the delay + * properly. + */ +&mac3 { + status = "okay"; + phy-mode = "rgmii"; + phy-handle = <ðphy3>; + pinctrl-names = "default"; + pinctrl-0 = <&pinctrl_rgmii4_default>; +}; -- cgit From 62bf7708fe80ec0db14b9179c25eeeda9f81e9d0 Mon Sep 17 00:00:00 2001 From: Dario Binacchi Date: Sat, 13 Sep 2025 11:16:31 +0200 Subject: ARM: dts: imx6ull-engicam-microgea-rmm: fix report-rate-hz value The 'report-rate-hz' property for the edt-ft5x06 driver was added and handled in the Linux kernel by me with patches [1] and [2] for this specific board. The v1 upstream version, which was the one applied to the customer's kernel, used the 'report-rate' property, which was written directly to the controller register. During review, the 'hz' suffix was added, changing its handling so that writing the value directly to the register was no longer possible for the M06 controller. Once the patches were accepted in mainline, I did not reapply them to the customer's kernel, and when upstreaming the DTS for this board, I forgot to correct the 'report-rate-hz' property value. The property must be set to 60 because this board uses the M06 controller, which expects the report rate in units of 10 Hz, meaning the actual value written to the register is 6. [1] 625f829586ea ("dt-bindings: input: touchscreen: edt-ft5x06: add report-rate-hz") [2] 5bcee83a406c ("Input: edt-ft5x06 - set report rate by dts property") Fixes: ffea3cac94ba ("ARM: dts: imx6ul: support Engicam MicroGEA RMM board") Co-developed-by: Michael Trimarchi Signed-off-by: Michael Trimarchi Signed-off-by: Dario Binacchi Signed-off-by: Shawn Guo --- arch/arm/boot/dts/nxp/imx/imx6ull-engicam-microgea-rmm.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/nxp/imx/imx6ull-engicam-microgea-rmm.dts b/arch/arm/boot/dts/nxp/imx/imx6ull-engicam-microgea-rmm.dts index 107b00b9a939..540642e99a41 100644 --- a/arch/arm/boot/dts/nxp/imx/imx6ull-engicam-microgea-rmm.dts +++ b/arch/arm/boot/dts/nxp/imx/imx6ull-engicam-microgea-rmm.dts @@ -136,7 +136,7 @@ interrupt-parent = <&gpio2>; interrupts = <8 IRQ_TYPE_EDGE_FALLING>; reset-gpios = <&gpio2 14 GPIO_ACTIVE_LOW>; - report-rate-hz = <6>; + report-rate-hz = <60>; /* settings valid only for Hycon touchscreen */ touchscreen-size-x = <1280>; touchscreen-size-y = <800>; -- cgit From f31e261712a0d107f09fb1d3dc8f094806149c83 Mon Sep 17 00:00:00 2001 From: Jihed Chaibi Date: Tue, 16 Sep 2025 00:06:55 +0200 Subject: ARM: dts: imx51-zii-rdu1: Fix audmux node names Rename the 'ssi2' and 'aud3' nodes to 'mux-ssi2' and 'mux-aud3' in the audmux configuration of imx51-zii-rdu1.dts to comply with the naming convention in imx-audmux.yaml. This fixes the following dt-schema warning: imx51-zii-rdu1.dtb: audmux@83fd0000 (fsl,imx51-audmux): 'aud3', 'ssi2' do not match any of the regexes: '^mux-[0-9a-z]*$', '^pinctrl-[0-9]+$' Fixes: ceef0396f367f ("ARM: dts: imx: add ZII RDU1 board") Signed-off-by: Jihed Chaibi Signed-off-by: Shawn Guo --- arch/arm/boot/dts/nxp/imx/imx51-zii-rdu1.dts | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm/boot/dts/nxp/imx/imx51-zii-rdu1.dts b/arch/arm/boot/dts/nxp/imx/imx51-zii-rdu1.dts index 06545a6052f7..43ff5eafb2bb 100644 --- a/arch/arm/boot/dts/nxp/imx/imx51-zii-rdu1.dts +++ b/arch/arm/boot/dts/nxp/imx/imx51-zii-rdu1.dts @@ -259,7 +259,7 @@ pinctrl-0 = <&pinctrl_audmux>; status = "okay"; - ssi2 { + mux-ssi2 { fsl,audmux-port = <1>; fsl,port-config = < (IMX_AUDMUX_V2_PTCR_SYN | @@ -271,7 +271,7 @@ >; }; - aud3 { + mux-aud3 { fsl,audmux-port = <2>; fsl,port-config = < IMX_AUDMUX_V2_PTCR_SYN -- cgit From 26f0f122f92f2e8c384c08a05956417bfb5f6fbe Mon Sep 17 00:00:00 2001 From: Heiko Stuebner Date: Mon, 20 Oct 2025 11:11:39 +0200 Subject: arm64: dts: rockchip: Fix indentation on rk3399 haikou demo dtso The regulator-cam-dovdd-1v8 uses spaces for indentation, where it should use tabs. Fix this. Fixes: 066a69db9db3 ("arm64: dts: rockchip: add overlay for RK3399 Puma Haikou Video Demo adapter") Signed-off-by: Heiko Stuebner Reviewed-by: Quentin Schulz Link: https://patch.msgid.link/20251020091139.3652738-1-heiko@sntech.de Signed-off-by: Heiko Stuebner --- .../arm64/boot/dts/rockchip/rk3399-puma-haikou-video-demo.dtso | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3399-puma-haikou-video-demo.dtso b/arch/arm64/boot/dts/rockchip/rk3399-puma-haikou-video-demo.dtso index 5e8f729c2cf2..141a921a06e4 100644 --- a/arch/arm64/boot/dts/rockchip/rk3399-puma-haikou-video-demo.dtso +++ b/arch/arm64/boot/dts/rockchip/rk3399-puma-haikou-video-demo.dtso @@ -45,11 +45,11 @@ cam_dovdd_1v8: regulator-cam-dovdd-1v8 { compatible = "regulator-fixed"; - gpio = <&pca9670 3 GPIO_ACTIVE_LOW>; - regulator-max-microvolt = <1800000>; - regulator-min-microvolt = <1800000>; - regulator-name = "cam-dovdd-1v8"; - vin-supply = <&vcc1v8_video>; + gpio = <&pca9670 3 GPIO_ACTIVE_LOW>; + regulator-max-microvolt = <1800000>; + regulator-min-microvolt = <1800000>; + regulator-name = "cam-dovdd-1v8"; + vin-supply = <&vcc1v8_video>; }; cam_dvdd_1v2: regulator-cam-dvdd-1v2 { -- cgit From 8d2a2a49c30f67a480fa9ed25e08436a446f057e Mon Sep 17 00:00:00 2001 From: Sabrina Dubroca Date: Thu, 16 Oct 2025 12:39:12 +0200 Subject: xfrm: drop SA reference in xfrm_state_update if dir doesn't match We're not updating x1, but we still need to put() it. Fixes: a4a87fa4e96c ("xfrm: Add Direction to the SA in or out") Signed-off-by: Sabrina Dubroca Signed-off-by: Steffen Klassert --- net/xfrm/xfrm_state.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index d213ca3653a8..e4736d1ebb44 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -2191,14 +2191,18 @@ int xfrm_state_update(struct xfrm_state *x) } if (x1->km.state == XFRM_STATE_ACQ) { - if (x->dir && x1->dir != x->dir) + if (x->dir && x1->dir != x->dir) { + to_put = x1; goto out; + } __xfrm_state_insert(x); x = NULL; } else { - if (x1->dir != x->dir) + if (x1->dir != x->dir) { + to_put = x1; goto out; + } } err = 0; -- cgit From 10deb69864840ccf96b00ac2ab3a2055c0c04721 Mon Sep 17 00:00:00 2001 From: Sabrina Dubroca Date: Thu, 16 Oct 2025 12:39:13 +0200 Subject: xfrm: also call xfrm_state_delete_tunnel at destroy time for states that were never added In commit b441cf3f8c4b ("xfrm: delete x->tunnel as we delete x"), I missed the case where state creation fails between full initialization (->init_state has been called) and being inserted on the lists. In this situation, ->init_state has been called, so for IPcomp tunnels, the fallback tunnel has been created and added onto the lists, but the user state never gets added, because we fail before that. The user state doesn't go through __xfrm_state_delete, so we don't call xfrm_state_delete_tunnel for those states, and we end up leaking the FB tunnel. There are several codepaths affected by this: the add/update paths, in both net/key and xfrm, and the migrate code (xfrm_migrate, xfrm_state_migrate). A "proper" rollback of the init_state work would probably be doable in the add/update code, but for migrate it gets more complicated as multiple states may be involved. At some point, the new (not-inserted) state will be destroyed, so call xfrm_state_delete_tunnel during xfrm_state_gc_destroy. Most states will have their fallback tunnel cleaned up during __xfrm_state_delete, which solves the issue that b441cf3f8c4b (and other patches before it) aimed at. All states (including FB tunnels) will be removed from the lists once xfrm_state_fini has called flush_work(&xfrm_state_gc_work). Reported-by: syzbot+999eb23467f83f9bf9bf@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=999eb23467f83f9bf9bf Fixes: b441cf3f8c4b ("xfrm: delete x->tunnel as we delete x") Signed-off-by: Sabrina Dubroca Signed-off-by: Steffen Klassert --- net/xfrm/xfrm_state.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index e4736d1ebb44..721ef0f409b5 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -592,6 +592,7 @@ void xfrm_state_free(struct xfrm_state *x) } EXPORT_SYMBOL(xfrm_state_free); +static void xfrm_state_delete_tunnel(struct xfrm_state *x); static void xfrm_state_gc_destroy(struct xfrm_state *x) { if (x->mode_cbs && x->mode_cbs->destroy_state) @@ -607,6 +608,7 @@ static void xfrm_state_gc_destroy(struct xfrm_state *x) kfree(x->replay_esn); kfree(x->preplay_esn); xfrm_unset_type_offload(x); + xfrm_state_delete_tunnel(x); if (x->type) { x->type->destructor(x); xfrm_put_type(x->type); @@ -806,7 +808,6 @@ void __xfrm_state_destroy(struct xfrm_state *x) } EXPORT_SYMBOL(__xfrm_state_destroy); -static void xfrm_state_delete_tunnel(struct xfrm_state *x); int __xfrm_state_delete(struct xfrm_state *x) { struct net *net = xs_net(x); -- cgit From 5502bc4746e86bfe91ecbe0ed1ad53cb17673920 Mon Sep 17 00:00:00 2001 From: Sabrina Dubroca Date: Thu, 16 Oct 2025 12:39:14 +0200 Subject: xfrm: make state as DEAD before final put when migrate fails xfrm_state_migrate/xfrm_state_clone_and_setup create a new state, and call xfrm_state_put to destroy it in case of failure. __xfrm_state_destroy expects the state to be in XFRM_STATE_DEAD, but we currently don't do that. Reported-by: syzbot+5cd6299ede4d4f70987b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=5cd6299ede4d4f70987b Fixes: 78347c8c6b2d ("xfrm: Fix xfrm_state_migrate leak") Signed-off-by: Sabrina Dubroca Signed-off-by: Steffen Klassert --- net/xfrm/xfrm_state.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 721ef0f409b5..1ab19ca007de 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -2074,6 +2074,7 @@ static struct xfrm_state *xfrm_state_clone_and_setup(struct xfrm_state *orig, return x; error: + x->km.state = XFRM_STATE_DEAD; xfrm_state_put(x); out: return NULL; @@ -2163,6 +2164,7 @@ struct xfrm_state *xfrm_state_migrate(struct xfrm_state *x, return xc; error: + xc->km.state = XFRM_STATE_DEAD; xfrm_state_put(xc); return NULL; } -- cgit From 7f02285764790e0ff1a731b4187fa3e389ed02c7 Mon Sep 17 00:00:00 2001 From: Sabrina Dubroca Date: Thu, 16 Oct 2025 12:39:15 +0200 Subject: xfrm: call xfrm_dev_state_delete when xfrm_state_migrate fails to add the state In case xfrm_state_migrate fails after calling xfrm_dev_state_add, we directly release the last reference and destroy the new state, without calling xfrm_dev_state_delete (this only happens in __xfrm_state_delete, which we're not calling on this path, since the state was never added). Call xfrm_dev_state_delete on error when an offload configuration was provided. Fixes: ab244a394c7f ("xfrm: Migrate offload configuration") Signed-off-by: Sabrina Dubroca Signed-off-by: Steffen Klassert --- net/xfrm/xfrm_state.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 1ab19ca007de..c3518d1498cd 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -2159,10 +2159,13 @@ struct xfrm_state *xfrm_state_migrate(struct xfrm_state *x, xfrm_state_insert(xc); } else { if (xfrm_state_add(xc) < 0) - goto error; + goto error_add; } return xc; +error_add: + if (xuo) + xfrm_dev_state_delete(xc); error: xc->km.state = XFRM_STATE_DEAD; xfrm_state_put(xc); -- cgit From 1dcf617bec5cb85f68ca19969e7537ef6f6931d3 Mon Sep 17 00:00:00 2001 From: Sabrina Dubroca Date: Thu, 16 Oct 2025 12:39:16 +0200 Subject: xfrm: set err and extack on failure to create pcpu SA xfrm_state_construct can fail without setting an error if the requested pcpu_num value is too big. Set err and add an extack message to avoid confusing userspace. Fixes: 1ddf9916ac09 ("xfrm: Add support for per cpu xfrm state handling.") Signed-off-by: Sabrina Dubroca Signed-off-by: Steffen Klassert --- net/xfrm/xfrm_user.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 010c9e6638c0..9d98cc9daa37 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -947,8 +947,11 @@ static struct xfrm_state *xfrm_state_construct(struct net *net, if (attrs[XFRMA_SA_PCPU]) { x->pcpu_num = nla_get_u32(attrs[XFRMA_SA_PCPU]); - if (x->pcpu_num >= num_possible_cpus()) + if (x->pcpu_num >= num_possible_cpus()) { + err = -ERANGE; + NL_SET_ERR_MSG(extack, "pCPU number too big"); goto error; + } } err = __xfrm_init_state(x, extack); -- cgit From f2bc8231fd43a02f9d97252b3435869727054d60 Mon Sep 17 00:00:00 2001 From: Sabrina Dubroca Date: Thu, 16 Oct 2025 12:39:17 +0200 Subject: xfrm: check all hash buckets for leftover states during netns deletion The current hlist_empty checks only test the first bucket of each hashtable, ignoring any other bucket. They should be caught by the WARN_ON for state_all, but better to make all the checks accurate. Fixes: 73d189dce486 ("netns xfrm: per-netns xfrm_state_bydst hash") Fixes: d320bbb306f2 ("netns xfrm: per-netns xfrm_state_bysrc hash") Fixes: b754a4fd8f58 ("netns xfrm: per-netns xfrm_state_byspi hash") Fixes: fe9f1d8779cb ("xfrm: add state hashtable keyed by seq") Signed-off-by: Sabrina Dubroca Signed-off-by: Steffen Klassert --- net/xfrm/xfrm_state.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index c3518d1498cd..9e14e453b55c 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -3308,6 +3308,7 @@ out_bydst: void xfrm_state_fini(struct net *net) { unsigned int sz; + int i; flush_work(&net->xfrm.state_hash_work); xfrm_state_flush(net, 0, false); @@ -3315,14 +3316,17 @@ void xfrm_state_fini(struct net *net) WARN_ON(!list_empty(&net->xfrm.state_all)); + for (i = 0; i <= net->xfrm.state_hmask; i++) { + WARN_ON(!hlist_empty(net->xfrm.state_byseq + i)); + WARN_ON(!hlist_empty(net->xfrm.state_byspi + i)); + WARN_ON(!hlist_empty(net->xfrm.state_bysrc + i)); + WARN_ON(!hlist_empty(net->xfrm.state_bydst + i)); + } + sz = (net->xfrm.state_hmask + 1) * sizeof(struct hlist_head); - WARN_ON(!hlist_empty(net->xfrm.state_byseq)); xfrm_hash_free(net->xfrm.state_byseq, sz); - WARN_ON(!hlist_empty(net->xfrm.state_byspi)); xfrm_hash_free(net->xfrm.state_byspi, sz); - WARN_ON(!hlist_empty(net->xfrm.state_bysrc)); xfrm_hash_free(net->xfrm.state_bysrc, sz); - WARN_ON(!hlist_empty(net->xfrm.state_bydst)); xfrm_hash_free(net->xfrm.state_bydst, sz); free_percpu(net->xfrm.state_cache_input); } -- cgit From a7b17ece4032dd86bb411297f2169dda395cdc3c Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Fri, 10 Oct 2025 15:38:56 +0200 Subject: mmc: wmt-sdmmc: fix compile test default Enabling compile testing should not enable every individual driver (we have "allyesconfig" for that). Fixes: 7cd8db0fb0b2 ("mmc: add COMPILE_TEST to multiple drivers") Cc: Mikko Rapeli Signed-off-by: Johan Hovold Signed-off-by: Ulf Hansson --- drivers/mmc/host/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig index 2c963cb6724b..10d0ef58ef49 100644 --- a/drivers/mmc/host/Kconfig +++ b/drivers/mmc/host/Kconfig @@ -950,7 +950,7 @@ config MMC_USHC config MMC_WMT tristate "Wondermedia SD/MMC Host Controller support" depends on ARCH_VT8500 || COMPILE_TEST - default y + default ARCH_VT8500 help This selects support for the SD/MMC Host Controller on Wondermedia WM8505/WM8650 based SoCs. -- cgit From 0778ac7df5137d5041783fadfc201f8fd55a1d9b Mon Sep 17 00:00:00 2001 From: Zhen Ni Date: Mon, 13 Oct 2025 19:41:51 +0800 Subject: fs: Fix uninitialized 'offp' in statmount_string() In statmount_string(), most flags assign an output offset pointer (offp) which is later updated with the string offset. However, the STATMOUNT_MNT_UIDMAP and STATMOUNT_MNT_GIDMAP cases directly set the struct fields instead of using offp. This leaves offp uninitialized, leading to a possible uninitialized dereference when *offp is updated. Fix it by assigning offp for UIDMAP and GIDMAP as well, keeping the code path consistent. Fixes: 37c4a9590e1e ("statmount: allow to retrieve idmappings") Fixes: e52e97f09fb6 ("statmount: let unset strings be empty") Cc: stable@vger.kernel.org Signed-off-by: Zhen Ni Link: https://patch.msgid.link/20251013114151.664341-1-zhen.ni@easystack.cn Reviewed-by: Jan Kara Signed-off-by: Christian Brauner --- fs/namespace.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index d82910f33dc4..5b5ab2ae238b 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -5454,11 +5454,11 @@ static int statmount_string(struct kstatmount *s, u64 flag) ret = statmount_sb_source(s, seq); break; case STATMOUNT_MNT_UIDMAP: - sm->mnt_uidmap = start; + offp = &sm->mnt_uidmap; ret = statmount_mnt_uidmap(s, seq); break; case STATMOUNT_MNT_GIDMAP: - sm->mnt_gidmap = start; + offp = &sm->mnt_gidmap; ret = statmount_mnt_gidmap(s, seq); break; default: -- cgit From 2c2b67af5f5f77fc68261a137ad65dcfb8e52506 Mon Sep 17 00:00:00 2001 From: Hongbo Li Date: Sat, 11 Oct 2025 09:22:35 +0000 Subject: hostfs: Fix only passing host root in boot stage with new mount In the old mount proceedure, hostfs could only pass root directory during boot. This is because it constructed the root directory using the @root_ino event without any mount options. However, when using it with the new mount API, this step is no longer triggered. As a result, if users mounts without specifying any mount options, the @host_root_path remains uninitialized. To prevent this issue, the @host_root_path should be initialized at the time of allocation. Reported-by: Geoffrey Thorpe Closes: https://lore.kernel.org/all/643333a0-f434-42fb-82ac-d25a0b56f3b7@geoffthorpe.net/ Fixes: cd140ce9f611 ("hostfs: convert hostfs to use the new mount API") Signed-off-by: Hongbo Li Link: https://patch.msgid.link/20251011092235.29880-1-lihongbo22@huawei.com Signed-off-by: Christian Brauner --- fs/hostfs/hostfs_kern.c | 29 ++++++++++++++++++----------- 1 file changed, 18 insertions(+), 11 deletions(-) diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c index 1e1acf5775ab..86455eebbf6c 100644 --- a/fs/hostfs/hostfs_kern.c +++ b/fs/hostfs/hostfs_kern.c @@ -979,7 +979,7 @@ static int hostfs_parse_param(struct fs_context *fc, struct fs_parameter *param) { struct hostfs_fs_info *fsi = fc->s_fs_info; struct fs_parse_result result; - char *host_root; + char *host_root, *tmp_root; int opt; opt = fs_parse(fc, hostfs_param_specs, param, &result); @@ -990,11 +990,13 @@ static int hostfs_parse_param(struct fs_context *fc, struct fs_parameter *param) case Opt_hostfs: host_root = param->string; if (!*host_root) - host_root = ""; - fsi->host_root_path = - kasprintf(GFP_KERNEL, "%s/%s", root_ino, host_root); - if (fsi->host_root_path == NULL) + break; + tmp_root = kasprintf(GFP_KERNEL, "%s%s", + fsi->host_root_path, host_root); + if (!tmp_root) return -ENOMEM; + kfree(fsi->host_root_path); + fsi->host_root_path = tmp_root; break; } @@ -1004,17 +1006,17 @@ static int hostfs_parse_param(struct fs_context *fc, struct fs_parameter *param) static int hostfs_parse_monolithic(struct fs_context *fc, void *data) { struct hostfs_fs_info *fsi = fc->s_fs_info; - char *host_root = (char *)data; + char *tmp_root, *host_root = (char *)data; /* NULL is printed as '(null)' by printf(): avoid that. */ if (host_root == NULL) - host_root = ""; + return 0; - fsi->host_root_path = - kasprintf(GFP_KERNEL, "%s/%s", root_ino, host_root); - if (fsi->host_root_path == NULL) + tmp_root = kasprintf(GFP_KERNEL, "%s%s", fsi->host_root_path, host_root); + if (!tmp_root) return -ENOMEM; - + kfree(fsi->host_root_path); + fsi->host_root_path = tmp_root; return 0; } @@ -1049,6 +1051,11 @@ static int hostfs_init_fs_context(struct fs_context *fc) if (!fsi) return -ENOMEM; + fsi->host_root_path = kasprintf(GFP_KERNEL, "%s/", root_ino); + if (!fsi->host_root_path) { + kfree(fsi); + return -ENOMEM; + } fc->s_fs_info = fsi; fc->ops = &hostfs_context_ops; return 0; -- cgit From 90c82941adf1986364e0f82c35cf59f2bf5f6a1d Mon Sep 17 00:00:00 2001 From: André Draszik Date: Thu, 16 Oct 2025 16:58:37 +0100 Subject: pmdomain: samsung: plug potential memleak during probe MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit of_genpd_add_provider_simple() could fail, in which case this code leaks the domain name, pd->pd.name. Use devm_kstrdup_const() to plug this leak. As a side-effect, we can simplify existing error handling. Fixes: c09a3e6c97f0 ("soc: samsung: pm_domains: Convert to regular platform driver") Cc: stable@vger.kernel.org Reviewed-by: Peter Griffin Reviewed-by: Krzysztof Kozlowski Signed-off-by: André Draszik Tested-by: Marek Szyprowski Signed-off-by: Ulf Hansson --- drivers/pmdomain/samsung/exynos-pm-domains.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/pmdomain/samsung/exynos-pm-domains.c b/drivers/pmdomain/samsung/exynos-pm-domains.c index 5d478bb37ad6..f53e1bd24798 100644 --- a/drivers/pmdomain/samsung/exynos-pm-domains.c +++ b/drivers/pmdomain/samsung/exynos-pm-domains.c @@ -92,13 +92,14 @@ static const struct of_device_id exynos_pm_domain_of_match[] = { { }, }; -static const char *exynos_get_domain_name(struct device_node *node) +static const char *exynos_get_domain_name(struct device *dev, + struct device_node *node) { const char *name; if (of_property_read_string(node, "label", &name) < 0) name = kbasename(node->full_name); - return kstrdup_const(name, GFP_KERNEL); + return devm_kstrdup_const(dev, name, GFP_KERNEL); } static int exynos_pd_probe(struct platform_device *pdev) @@ -115,15 +116,13 @@ static int exynos_pd_probe(struct platform_device *pdev) if (!pd) return -ENOMEM; - pd->pd.name = exynos_get_domain_name(np); + pd->pd.name = exynos_get_domain_name(dev, np); if (!pd->pd.name) return -ENOMEM; pd->base = of_iomap(np, 0); - if (!pd->base) { - kfree_const(pd->pd.name); + if (!pd->base) return -ENODEV; - } pd->pd.power_off = exynos_pd_power_off; pd->pd.power_on = exynos_pd_power_on; -- cgit From 7c3643f204edf1c5edb12b36b34838683ee5f8dc Mon Sep 17 00:00:00 2001 From: Shuai Xue Date: Sat, 13 Sep 2025 10:32:24 +0800 Subject: acpi,srat: Fix incorrect device handle check for Generic Initiator The Generic Initiator Affinity Structure in SRAT table uses device handle type field to indicate the device type. According to ACPI specification, the device handle type value of 1 represents PCI device, not 0. Fixes: 894c26a1c274 ("ACPI: Support Generic Initiator only domains") Reported-by: Wu Zongyong Signed-off-by: Shuai Xue Reviewed-by: Jonathan Cameron Link: https://patch.msgid.link/20250913023224.39281-1-xueshuai@linux.alibaba.com Signed-off-by: Dave Jiang --- drivers/acpi/numa/srat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c index 53816dfab645..aa87ee1583a4 100644 --- a/drivers/acpi/numa/srat.c +++ b/drivers/acpi/numa/srat.c @@ -237,7 +237,7 @@ acpi_table_print_srat_entry(struct acpi_subtable_header *header) struct acpi_srat_generic_affinity *p = (struct acpi_srat_generic_affinity *)header; - if (p->device_handle_type == 0) { + if (p->device_handle_type == 1) { /* * For pci devices this may be the only place they * are assigned a proximity domain -- cgit From a28352cf2d2f8380e7aca8cb61682396dca7a991 Mon Sep 17 00:00:00 2001 From: Shawn Lin Date: Mon, 20 Oct 2025 09:49:41 +0800 Subject: mmc: sdhci-of-dwcmshc: Change DLL_STRBIN_TAPNUM_DEFAULT to 0x4 strbin signal delay under 0x8 configuration is not stable after massive test. The recommandation of it should be 0x4. Signed-off-by: Shawn Lin Tested-by: Alexey Charkov Tested-by: Hugh Cole-Baker Fixes: 08f3dff799d4 ("mmc: sdhci-of-dwcmshc: add rockchip platform support") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson --- drivers/mmc/host/sdhci-of-dwcmshc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c b/drivers/mmc/host/sdhci-of-dwcmshc.c index eebd45389956..5b61401a7f3d 100644 --- a/drivers/mmc/host/sdhci-of-dwcmshc.c +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c @@ -94,7 +94,7 @@ #define DLL_TXCLK_TAPNUM_DEFAULT 0x10 #define DLL_TXCLK_TAPNUM_90_DEGREES 0xA #define DLL_TXCLK_TAPNUM_FROM_SW BIT(24) -#define DLL_STRBIN_TAPNUM_DEFAULT 0x8 +#define DLL_STRBIN_TAPNUM_DEFAULT 0x4 #define DLL_STRBIN_TAPNUM_FROM_SW BIT(24) #define DLL_STRBIN_DELAY_NUM_SEL BIT(26) #define DLL_STRBIN_DELAY_NUM_OFFSET 16 -- cgit From e4185bed738da755b191aa3f2e16e8b48450e1b8 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Tue, 30 Sep 2025 15:32:34 +0300 Subject: mtdchar: fix integer overflow in read/write ioctls The "req.start" and "req.len" variables are u64 values that come from the user at the start of the function. We mask away the high 32 bits of "req.len" so that's capped at U32_MAX but the "req.start" variable can go up to U64_MAX which means that the addition can still integer overflow. Use check_add_overflow() to fix this bug. Fixes: 095bb6e44eb1 ("mtdchar: add MEMREAD ioctl") Fixes: 6420ac0af95d ("mtdchar: prevent unbounded allocation in MEMWRITE ioctl") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter Signed-off-by: Miquel Raynal --- drivers/mtd/mtdchar.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/mtd/mtdchar.c b/drivers/mtd/mtdchar.c index 8dc4f5c493fc..335c702633ff 100644 --- a/drivers/mtd/mtdchar.c +++ b/drivers/mtd/mtdchar.c @@ -599,6 +599,7 @@ mtdchar_write_ioctl(struct mtd_info *mtd, struct mtd_write_req __user *argp) uint8_t *datbuf = NULL, *oobbuf = NULL; size_t datbuf_len, oobbuf_len; int ret = 0; + u64 end; if (copy_from_user(&req, argp, sizeof(req))) return -EFAULT; @@ -618,7 +619,7 @@ mtdchar_write_ioctl(struct mtd_info *mtd, struct mtd_write_req __user *argp) req.len &= 0xffffffff; req.ooblen &= 0xffffffff; - if (req.start + req.len > mtd->size) + if (check_add_overflow(req.start, req.len, &end) || end > mtd->size) return -EINVAL; datbuf_len = min_t(size_t, req.len, mtd->erasesize); @@ -698,6 +699,7 @@ mtdchar_read_ioctl(struct mtd_info *mtd, struct mtd_read_req __user *argp) size_t datbuf_len, oobbuf_len; size_t orig_len, orig_ooblen; int ret = 0; + u64 end; if (copy_from_user(&req, argp, sizeof(req))) return -EFAULT; @@ -724,7 +726,7 @@ mtdchar_read_ioctl(struct mtd_info *mtd, struct mtd_read_req __user *argp) req.len &= 0xffffffff; req.ooblen &= 0xffffffff; - if (req.start + req.len > mtd->size) { + if (check_add_overflow(req.start, req.len, &end) || end > mtd->size) { ret = -EINVAL; goto out; } -- cgit From 9225f02ff201837e1443076f37a3c008140d1835 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Fri, 3 Oct 2025 12:30:10 +0300 Subject: mtd: nand: realtek-ecc: Fix a IS_ERR() vs NULL bug in probe The dma_alloc_noncoherent() function doesn't return error pointers, it returns NULL on error. Fix the error checking to match. Fixes: 3148d0e5b1c5 ("mtd: nand: realtek-ecc: Add Realtek external ECC engine support") Signed-off-by: Dan Carpenter Reviewed-by: Geert Uytterhoeven Signed-off-by: Miquel Raynal --- drivers/mtd/nand/ecc-realtek.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/mtd/nand/ecc-realtek.c b/drivers/mtd/nand/ecc-realtek.c index 7d718934c909..7c275f1eb4a7 100644 --- a/drivers/mtd/nand/ecc-realtek.c +++ b/drivers/mtd/nand/ecc-realtek.c @@ -418,8 +418,8 @@ static int rtl_ecc_probe(struct platform_device *pdev) rtlc->buf = dma_alloc_noncoherent(dev, RTL_ECC_DMA_SIZE, &rtlc->buf_dma, DMA_BIDIRECTIONAL, GFP_KERNEL); - if (IS_ERR(rtlc->buf)) - return PTR_ERR(rtlc->buf); + if (!rtlc->buf) + return -ENOMEM; rtlc->dev = dev; rtlc->engine.dev = dev; -- cgit From 0d9c80aa572182d4b1464826cd77aa8973213216 Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Wed, 8 Oct 2025 11:47:15 +0200 Subject: mtd: nand: MTD_NAND_ECC_REALTEK should depend on HAS_DMA If CONFIG_NO_DMA=y: ERROR: modpost: "dma_free_pages" [drivers/mtd/nand/ecc-realtek.ko] undefined! ERROR: modpost: "dma_alloc_pages" [drivers/mtd/nand/ecc-realtek.ko] undefined! The driver cannot function without DMA, hence fix this by adding a dependency on HAS_DMA. Fixes: 3148d0e5b1c5733d ("mtd: nand: realtek-ecc: Add Realtek external ECC engine support") Signed-off-by: Geert Uytterhoeven Signed-off-by: Miquel Raynal --- drivers/mtd/nand/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/mtd/nand/Kconfig b/drivers/mtd/nand/Kconfig index 4a17271076bc..1e57c8de8578 100644 --- a/drivers/mtd/nand/Kconfig +++ b/drivers/mtd/nand/Kconfig @@ -63,7 +63,7 @@ config MTD_NAND_ECC_MEDIATEK config MTD_NAND_ECC_REALTEK tristate "Realtek RTL93xx hardware ECC engine" - depends on HAS_IOMEM + depends on HAS_IOMEM && HAS_DMA depends on MACH_REALTEK_RTL || COMPILE_TEST select MTD_NAND_ECC help -- cgit From 9631350885929819d4e46c6521df35960b472ef3 Mon Sep 17 00:00:00 2001 From: Li Qiang Date: Mon, 20 Oct 2025 20:53:33 +0800 Subject: mtd: rawnand: realtek: Make rtl_ecc_engine_ops const The rtl_ecc_engine_ops structure is only used to provide a set of callback functions and is never modified after initialization. Mark it as const so it can be placed in the read-only section, which improves safety and allows better compiler optimization. Signed-off-by: Li Qiang Signed-off-by: Miquel Raynal --- drivers/mtd/nand/ecc-realtek.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/mtd/nand/ecc-realtek.c b/drivers/mtd/nand/ecc-realtek.c index 7c275f1eb4a7..0046da37ea3e 100644 --- a/drivers/mtd/nand/ecc-realtek.c +++ b/drivers/mtd/nand/ecc-realtek.c @@ -380,7 +380,7 @@ static void rtl_ecc_cleanup_ctx(struct nand_device *nand) nand_ecc_cleanup_req_tweaking(&ctx->req_ctx); } -static struct nand_ecc_engine_ops rtl_ecc_engine_ops = { +static const struct nand_ecc_engine_ops rtl_ecc_engine_ops = { .init_ctx = rtl_ecc_init_ctx, .cleanup_ctx = rtl_ecc_cleanup_ctx, .prepare_io_req = rtl_ecc_prepare_io_req, -- cgit From 7458f72cc28f9eb0de811effcb5376d0ec19094a Mon Sep 17 00:00:00 2001 From: Sudeep Holla Date: Fri, 17 Oct 2025 12:03:20 +0100 Subject: pmdomain: arm: scmi: Fix genpd leak on provider registration failure If of_genpd_add_provider_onecell() fails during probe, the previously created generic power domains are not removed, leading to a memory leak and potential kernel crash later in genpd_debug_add(). Add proper error handling to unwind the initialized domains before returning from probe to ensure all resources are correctly released on failure. Example crash trace observed without this fix: | Unable to handle kernel paging request at virtual address fffffffffffffc70 | CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.18.0-rc1 #405 PREEMPT | Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform | pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : genpd_debug_add+0x2c/0x160 | lr : genpd_debug_init+0x74/0x98 | Call trace: | genpd_debug_add+0x2c/0x160 (P) | genpd_debug_init+0x74/0x98 | do_one_initcall+0xd0/0x2d8 | do_initcall_level+0xa0/0x140 | do_initcalls+0x60/0xa8 | do_basic_setup+0x28/0x40 | kernel_init_freeable+0xe8/0x170 | kernel_init+0x2c/0x140 | ret_from_fork+0x10/0x20 Fixes: 898216c97ed2 ("firmware: arm_scmi: add device power domain support using genpd") Signed-off-by: Sudeep Holla Reviewed-by: Peng Fan Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson --- drivers/pmdomain/arm/scmi_pm_domain.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/pmdomain/arm/scmi_pm_domain.c b/drivers/pmdomain/arm/scmi_pm_domain.c index 8fe1c0a501c9..b5e2ffd5ea64 100644 --- a/drivers/pmdomain/arm/scmi_pm_domain.c +++ b/drivers/pmdomain/arm/scmi_pm_domain.c @@ -41,7 +41,7 @@ static int scmi_pd_power_off(struct generic_pm_domain *domain) static int scmi_pm_domain_probe(struct scmi_device *sdev) { - int num_domains, i; + int num_domains, i, ret; struct device *dev = &sdev->dev; struct device_node *np = dev->of_node; struct scmi_pm_domain *scmi_pd; @@ -108,9 +108,18 @@ static int scmi_pm_domain_probe(struct scmi_device *sdev) scmi_pd_data->domains = domains; scmi_pd_data->num_domains = num_domains; + ret = of_genpd_add_provider_onecell(np, scmi_pd_data); + if (ret) + goto err_rm_genpds; + dev_set_drvdata(dev, scmi_pd_data); - return of_genpd_add_provider_onecell(np, scmi_pd_data); + return 0; +err_rm_genpds: + for (i = num_domains - 1; i >= 0; i--) + pm_genpd_remove(domains[i]); + + return ret; } static void scmi_pm_domain_remove(struct scmi_device *sdev) -- cgit From 5c56bf214af85ca042bf97f8584aab2151035840 Mon Sep 17 00:00:00 2001 From: Niravkumar L Rabara Date: Thu, 23 Oct 2025 11:32:01 +0800 Subject: mtd: rawnand: cadence: fix DMA device NULL pointer dereference The DMA device pointer `dma_dev` was being dereferenced before ensuring that `cdns_ctrl->dmac` is properly initialized. Move the assignment of `dma_dev` after successfully acquiring the DMA channel to ensure the pointer is valid before use. Fixes: d76d22b5096c ("mtd: rawnand: cadence: use dma_map_resource for sdma address") Cc: stable@vger.kernel.org Signed-off-by: Niravkumar L Rabara Signed-off-by: Miquel Raynal --- drivers/mtd/nand/raw/cadence-nand-controller.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/mtd/nand/raw/cadence-nand-controller.c b/drivers/mtd/nand/raw/cadence-nand-controller.c index 6667eea95597..32ed38b89394 100644 --- a/drivers/mtd/nand/raw/cadence-nand-controller.c +++ b/drivers/mtd/nand/raw/cadence-nand-controller.c @@ -2871,7 +2871,7 @@ cadence_nand_irq_cleanup(int irqnum, struct cdns_nand_ctrl *cdns_ctrl) static int cadence_nand_init(struct cdns_nand_ctrl *cdns_ctrl) { dma_cap_mask_t mask; - struct dma_device *dma_dev = cdns_ctrl->dmac->device; + struct dma_device *dma_dev; int ret; cdns_ctrl->cdma_desc = dma_alloc_coherent(cdns_ctrl->dev, @@ -2915,6 +2915,7 @@ static int cadence_nand_init(struct cdns_nand_ctrl *cdns_ctrl) } } + dma_dev = cdns_ctrl->dmac->device; cdns_ctrl->io.iova_dma = dma_map_resource(dma_dev->dev, cdns_ctrl->io.dma, cdns_ctrl->io.size, DMA_BIDIRECTIONAL, 0); -- cgit From 6f37469a933030692741710db809722076f71973 Mon Sep 17 00:00:00 2001 From: Aaron Kling Date: Tue, 21 Oct 2025 14:47:06 -0500 Subject: memory: tegra210: Fix incorrect client ids The original commit had typos for two of the memory client ids. Fix them to reference the correct bindings. Fixes: 3804cef4c597 ("memory: tegra210: Use bindings for client ids") Signed-off-by: Aaron Kling Link: https://patch.msgid.link/20251021-t210-mem-clientid-fixup-v1-1-5094946faa31@gmail.com Signed-off-by: Krzysztof Kozlowski --- drivers/memory/tegra/tegra210.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/memory/tegra/tegra210.c b/drivers/memory/tegra/tegra210.c index cfa61dd88557..3c2949c16fde 100644 --- a/drivers/memory/tegra/tegra210.c +++ b/drivers/memory/tegra/tegra210.c @@ -1015,7 +1015,7 @@ static const struct tegra_mc_client tegra210_mc_clients[] = { }, }, }, { - .id = TEGRA210_MC_SESRD, + .id = TEGRA210_MC_SESWR, .name = "seswr", .swgroup = TEGRA_SWGROUP_SE, .regs = { @@ -1079,7 +1079,7 @@ static const struct tegra_mc_client tegra210_mc_clients[] = { }, }, }, { - .id = TEGRA210_MC_ETRR, + .id = TEGRA210_MC_ETRW, .name = "etrw", .swgroup = TEGRA_SWGROUP_ETR, .regs = { -- cgit From ff7b5a27438275e4fd8c4809d815638c829fe520 Mon Sep 17 00:00:00 2001 From: Andreas Kemnade Date: Mon, 13 Oct 2025 14:17:09 +0200 Subject: arm: imx_v6_v7_defconfig: enable ext4 directly In former times, ext4 was enabled implicitely by enabling ext3 but with the ext3 fs gone, it does not get enabled, which lets devices fail to mount root on non-initrd based boots with an ext4 root. Fixes: d6ace46c82fd ("ext4: remove obsolete EXT3 config options") Signed-off-by: Andreas Kemnade Reviewed-by: Fabio Estevam Signed-off-by: Shawn Guo --- arch/arm/configs/imx_v6_v7_defconfig | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/arm/configs/imx_v6_v7_defconfig b/arch/arm/configs/imx_v6_v7_defconfig index 9a57763a8d38..0d55056c6f82 100644 --- a/arch/arm/configs/imx_v6_v7_defconfig +++ b/arch/arm/configs/imx_v6_v7_defconfig @@ -436,9 +436,9 @@ CONFIG_EXT2_FS=y CONFIG_EXT2_FS_XATTR=y CONFIG_EXT2_FS_POSIX_ACL=y CONFIG_EXT2_FS_SECURITY=y -CONFIG_EXT3_FS=y -CONFIG_EXT3_FS_POSIX_ACL=y -CONFIG_EXT3_FS_SECURITY=y +CONFIG_EXT4_FS=y +CONFIG_EXT4_FS_POSIX_ACL=y +CONFIG_EXT4_FS_SECURITY=y CONFIG_QUOTA=y CONFIG_QUOTA_NETLINK_INTERFACE=y CONFIG_AUTOFS_FS=y -- cgit From ec4daace64a44b53df76f0629e82684ef09ce869 Mon Sep 17 00:00:00 2001 From: João Paulo Gonçalves Date: Tue, 14 Oct 2025 09:56:43 -0300 Subject: arm64: dts: imx8-ss-img: Avoid gpio0_mipi_csi GPIOs being deferred MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The gpio0_mipi_csi DT nodes are enabled by default, but they are dependent on the irqsteer_csi nodes, which are not enabled. This causes the gpio0_mipi_csi GPIOs to be probe deferred. Since these GPIOs can be used independently of the CSI controller, enable irqsteer_csi by default too to prevent them from being deferred and to ensure they work out of the box. Fixes: 2217f8243714 ("arm64: dts: imx8: add capture controller for i.MX8's img subsystem") Signed-off-by: João Paulo Gonçalves Reviewed-by: Frank Li Signed-off-by: Shawn Guo --- arch/arm64/boot/dts/freescale/imx8-ss-img.dtsi | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/arm64/boot/dts/freescale/imx8-ss-img.dtsi b/arch/arm64/boot/dts/freescale/imx8-ss-img.dtsi index 2cf0f7208350..a72b2f1c4a1b 100644 --- a/arch/arm64/boot/dts/freescale/imx8-ss-img.dtsi +++ b/arch/arm64/boot/dts/freescale/imx8-ss-img.dtsi @@ -67,7 +67,6 @@ img_subsys: bus@58000000 { power-domains = <&pd IMX_SC_R_CSI_0>; fsl,channel = <0>; fsl,num-irqs = <32>; - status = "disabled"; }; gpio0_mipi_csi0: gpio@58222000 { @@ -144,7 +143,6 @@ img_subsys: bus@58000000 { power-domains = <&pd IMX_SC_R_CSI_1>; fsl,channel = <0>; fsl,num-irqs = <32>; - status = "disabled"; }; gpio0_mipi_csi1: gpio@58242000 { -- cgit From 1eb42bacd7cebede5d317569e4b874b54e5c41d6 Mon Sep 17 00:00:00 2001 From: Frank Li Date: Fri, 17 Oct 2025 16:39:46 -0400 Subject: arm64: dts: imx95: Fix MSI mapping for PCIe endpoint nodes The msi-map property was incorrectly applied to pcie0-ep instead of pcie1-ep. Correct the msi-map for both pcie0-ep and pcie1-ep nodes. Fixes: bbe4b2f7d6533 ("arm64: dts: imx95: Add msi-map for pci-ep device") Signed-off-by: Frank Li Signed-off-by: Shawn Guo --- arch/arm64/boot/dts/freescale/imx95.dtsi | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/freescale/imx95.dtsi b/arch/arm64/boot/dts/freescale/imx95.dtsi index 1292677cbe4e..6da961eb3fe5 100644 --- a/arch/arm64/boot/dts/freescale/imx95.dtsi +++ b/arch/arm64/boot/dts/freescale/imx95.dtsi @@ -1886,7 +1886,7 @@ assigned-clock-rates = <3600000000>, <100000000>, <10000000>; assigned-clock-parents = <0>, <0>, <&scmi_clk IMX95_CLK_SYSPLL1_PFD1_DIV2>; - msi-map = <0x0 &its 0x98 0x1>; + msi-map = <0x0 &its 0x10 0x1>; power-domains = <&scmi_devpd IMX95_PD_HSIO_TOP>; status = "disabled"; }; @@ -1963,6 +1963,7 @@ assigned-clock-rates = <3600000000>, <100000000>, <10000000>; assigned-clock-parents = <0>, <0>, <&scmi_clk IMX95_CLK_SYSPLL1_PFD1_DIV2>; + msi-map = <0x0 &its 0x98 0x1>; power-domains = <&scmi_devpd IMX95_PD_HSIO_TOP>; status = "disabled"; }; -- cgit From 6504297872c7a5d0d06247970d32940eba26b8b3 Mon Sep 17 00:00:00 2001 From: Frieder Schrempf Date: Mon, 20 Oct 2025 15:21:51 +0200 Subject: arm64: dts: imx8mp-kontron: Fix USB OTG role switching The VBUS supply regulator is currently assigned to the PHY node. This causes the VBUS to be always on, even when the controller needs to be switched to peripheral mode. Fix the OTG role switching by adding a connector node and moving the VBUS supply regulator to that node. This way the VBUS gets correctly switched according to the current role. Fixes: 946ab10e3f40 ("arm64: dts: Add support for Kontron OSM-S i.MX8MP SoM and BL carrier board") Signed-off-by: Frieder Schrempf Signed-off-by: Shawn Guo --- .../boot/dts/freescale/imx8mp-kontron-bl-osm-s.dts | 24 +++++++++++++++++----- 1 file changed, 19 insertions(+), 5 deletions(-) diff --git a/arch/arm64/boot/dts/freescale/imx8mp-kontron-bl-osm-s.dts b/arch/arm64/boot/dts/freescale/imx8mp-kontron-bl-osm-s.dts index 614b4ce330b1..0924ac50fd2d 100644 --- a/arch/arm64/boot/dts/freescale/imx8mp-kontron-bl-osm-s.dts +++ b/arch/arm64/boot/dts/freescale/imx8mp-kontron-bl-osm-s.dts @@ -16,11 +16,20 @@ ethernet1 = &eqos; }; - extcon_usbc: usbc { - compatible = "linux,extcon-usb-gpio"; + connector { + compatible = "gpio-usb-b-connector", "usb-b-connector"; + id-gpios = <&gpio1 10 GPIO_ACTIVE_HIGH>; + label = "Type-C"; pinctrl-names = "default"; pinctrl-0 = <&pinctrl_usb1_id>; - id-gpios = <&gpio1 10 GPIO_ACTIVE_HIGH>; + type = "micro"; + vbus-supply = <®_usb1_vbus>; + + port { + usb_dr_connector: endpoint { + remote-endpoint = <&usb3_dwc>; + }; + }; }; leds { @@ -244,9 +253,15 @@ hnp-disable; srp-disable; dr_mode = "otg"; - extcon = <&extcon_usbc>; usb-role-switch; + role-switch-default-mode = "peripheral"; status = "okay"; + + port { + usb3_dwc: endpoint { + remote-endpoint = <&usb_dr_connector>; + }; + }; }; &usb_dwc3_1 { @@ -273,7 +288,6 @@ }; &usb3_phy0 { - vbus-supply = <®_usb1_vbus>; status = "okay"; }; -- cgit From 330e2c514823008b22e6afd2055715bc46dd8d55 Mon Sep 17 00:00:00 2001 From: David Howells Date: Wed, 22 Oct 2025 19:48:32 +0100 Subject: afs: Fix dynamic lookup to fail on cell lookup failure When a process tries to access an entry in /afs, normally what happens is that an automount dentry is created by ->lookup() and then triggered, which jumps through the ->d_automount() op. Currently, afs_dynroot_lookup() does not do cell DNS lookup, leaving that to afs_d_automount() to perform - however, it is possible to use access() or stat() on the automount point, which will always return successfully, have briefly created an afs_cell record if one did not already exist. This means that something like: test -d "/afs/.west" && echo Directory exists will print "Directory exists" even though no such cell is configured. This breaks the "west" python module available on PIP as it expects this access to fail. Now, it could be possible to make afs_dynroot_lookup() perform the DNS[*] lookup, but that would make "ls --color /afs" do this for each cell in /afs that is listed but not yet probed. kafs-client, probably wrongly, preloads the entire cell database and all the known cells are then listed in /afs - and doing ls /afs would be very, very slow, especially if any cell supplied addresses but was wholly inaccessible. [*] When I say "DNS", actually read getaddrinfo(), which could use any one of a host of mechanisms. Could also use static configuration. To fix this, make the following changes: (1) Create an enum to specify the origination point of a call to afs_lookup_cell() and pass this value into that function in place of the "excl" parameter (which can be derived from it). There are six points of origination: - Cell preload through /proc/net/afs/cells - Root cell config through /proc/net/afs/rootcell - Lookup in dynamic root - Automount trigger - Direct mount with mount() syscall - Alias check where YFS tells us the cell name is different (2) Add an extra state into the afs_cell state machine to indicate a cell that's been initialised, but not yet looked up. This is separate from one that can be considered active and has been looked up at least once. (3) Make afs_lookup_cell() vary its behaviour more, depending on where it was called from: If called from preload or root cell config, DNS lookup will not happen until we definitely want to use the cell (dynroot mount, automount, direct mount or alias check). The cell will appear in /afs but stat() won't trigger DNS lookup. If the cell already exists, dynroot will not wait for the DNS lookup to complete. If the cell did not already exist, dynroot will wait. If called from automount, direct mount or alias check, it will wait for the DNS lookup to complete. (4) Make afs_lookup_cell() return an error if lookup failed in one way or another. We try to return -ENOENT if the DNS says the cell does not exist and -EDESTADDRREQ if we couldn't access the DNS. Reported-by: Markus Suvanto Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220685 Signed-off-by: David Howells Link: https://patch.msgid.link/1784747.1761158912@warthog.procyon.org.uk Fixes: 1d0b929fc070 ("afs: Change dynroot to create contents on demand") Tested-by: Markus Suvanto cc: Marc Dionne cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner --- fs/afs/cell.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++--------- fs/afs/dynroot.c | 3 ++- fs/afs/internal.h | 12 ++++++++- fs/afs/mntpt.c | 3 ++- fs/afs/proc.c | 3 ++- fs/afs/super.c | 2 +- fs/afs/vl_alias.c | 3 ++- 7 files changed, 86 insertions(+), 18 deletions(-) diff --git a/fs/afs/cell.c b/fs/afs/cell.c index f31359922e98..d9b6fa1088b7 100644 --- a/fs/afs/cell.c +++ b/fs/afs/cell.c @@ -229,7 +229,7 @@ error: * @name: The name of the cell. * @namesz: The strlen of the cell name. * @vllist: A colon/comma separated list of numeric IP addresses or NULL. - * @excl: T if an error should be given if the cell name already exists. + * @reason: The reason we're doing the lookup * @trace: The reason to be logged if the lookup is successful. * * Look up a cell record by name and query the DNS for VL server addresses if @@ -239,7 +239,8 @@ error: */ struct afs_cell *afs_lookup_cell(struct afs_net *net, const char *name, unsigned int namesz, - const char *vllist, bool excl, + const char *vllist, + enum afs_lookup_cell_for reason, enum afs_cell_trace trace) { struct afs_cell *cell, *candidate, *cursor; @@ -247,12 +248,18 @@ struct afs_cell *afs_lookup_cell(struct afs_net *net, enum afs_cell_state state; int ret, n; - _enter("%s,%s", name, vllist); + _enter("%s,%s,%u", name, vllist, reason); - if (!excl) { + if (reason != AFS_LOOKUP_CELL_PRELOAD) { cell = afs_find_cell(net, name, namesz, trace); - if (!IS_ERR(cell)) + if (!IS_ERR(cell)) { + if (reason == AFS_LOOKUP_CELL_DYNROOT) + goto no_wait; + if (cell->state == AFS_CELL_SETTING_UP || + cell->state == AFS_CELL_UNLOOKED) + goto lookup_cell; goto wait_for_cell; + } } /* Assume we're probably going to create a cell and preallocate and @@ -298,26 +305,69 @@ struct afs_cell *afs_lookup_cell(struct afs_net *net, rb_insert_color(&cell->net_node, &net->cells); up_write(&net->cells_lock); - afs_queue_cell(cell, afs_cell_trace_queue_new); +lookup_cell: + if (reason != AFS_LOOKUP_CELL_PRELOAD && + reason != AFS_LOOKUP_CELL_ROOTCELL) { + set_bit(AFS_CELL_FL_DO_LOOKUP, &cell->flags); + afs_queue_cell(cell, afs_cell_trace_queue_new); + } wait_for_cell: - _debug("wait_for_cell"); state = smp_load_acquire(&cell->state); /* vs error */ - if (state != AFS_CELL_ACTIVE && - state != AFS_CELL_DEAD) { + switch (state) { + case AFS_CELL_ACTIVE: + case AFS_CELL_DEAD: + break; + case AFS_CELL_UNLOOKED: + default: + if (reason == AFS_LOOKUP_CELL_PRELOAD || + reason == AFS_LOOKUP_CELL_ROOTCELL) + break; + _debug("wait_for_cell"); afs_see_cell(cell, afs_cell_trace_wait); wait_var_event(&cell->state, ({ state = smp_load_acquire(&cell->state); /* vs error */ state == AFS_CELL_ACTIVE || state == AFS_CELL_DEAD; })); + _debug("waited_for_cell %d %d", cell->state, cell->error); } +no_wait: /* Check the state obtained from the wait check. */ + state = smp_load_acquire(&cell->state); /* vs error */ if (state == AFS_CELL_DEAD) { ret = cell->error; goto error; } + if (state == AFS_CELL_ACTIVE) { + switch (cell->dns_status) { + case DNS_LOOKUP_NOT_DONE: + if (cell->dns_source == DNS_RECORD_FROM_CONFIG) { + ret = 0; + break; + } + fallthrough; + default: + ret = -EIO; + goto error; + case DNS_LOOKUP_GOOD: + case DNS_LOOKUP_GOOD_WITH_BAD: + ret = 0; + break; + case DNS_LOOKUP_GOT_NOT_FOUND: + ret = -ENOENT; + goto error; + case DNS_LOOKUP_BAD: + ret = -EREMOTEIO; + goto error; + case DNS_LOOKUP_GOT_LOCAL_FAILURE: + case DNS_LOOKUP_GOT_TEMP_FAILURE: + case DNS_LOOKUP_GOT_NS_FAILURE: + ret = -EDESTADDRREQ; + goto error; + } + } _leave(" = %p [cell]", cell); return cell; @@ -325,7 +375,7 @@ wait_for_cell: cell_already_exists: _debug("cell exists"); cell = cursor; - if (excl) { + if (reason == AFS_LOOKUP_CELL_PRELOAD) { ret = -EEXIST; } else { afs_use_cell(cursor, trace); @@ -384,7 +434,8 @@ int afs_cell_init(struct afs_net *net, const char *rootcell) return -EINVAL; /* allocate a cell record for the root/workstation cell */ - new_root = afs_lookup_cell(net, rootcell, len, vllist, false, + new_root = afs_lookup_cell(net, rootcell, len, vllist, + AFS_LOOKUP_CELL_ROOTCELL, afs_cell_trace_use_lookup_ws); if (IS_ERR(new_root)) { _leave(" = %ld", PTR_ERR(new_root)); @@ -777,6 +828,7 @@ static bool afs_manage_cell(struct afs_cell *cell) switch (cell->state) { case AFS_CELL_SETTING_UP: goto set_up_cell; + case AFS_CELL_UNLOOKED: case AFS_CELL_ACTIVE: goto cell_is_active; case AFS_CELL_REMOVING: @@ -797,7 +849,7 @@ set_up_cell: goto remove_cell; } - afs_set_cell_state(cell, AFS_CELL_ACTIVE); + afs_set_cell_state(cell, AFS_CELL_UNLOOKED); cell_is_active: if (afs_has_cell_expired(cell, &next_manage)) @@ -807,6 +859,8 @@ cell_is_active: ret = afs_update_cell(cell); if (ret < 0) cell->error = ret; + if (cell->state == AFS_CELL_UNLOOKED) + afs_set_cell_state(cell, AFS_CELL_ACTIVE); } if (next_manage < TIME64_MAX && cell->net->live) { diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c index 8c6130789fde..dc9d29e3739e 100644 --- a/fs/afs/dynroot.c +++ b/fs/afs/dynroot.c @@ -108,7 +108,8 @@ static struct dentry *afs_dynroot_lookup_cell(struct inode *dir, struct dentry * dotted = true; } - cell = afs_lookup_cell(net, name, len, NULL, false, + cell = afs_lookup_cell(net, name, len, NULL, + AFS_LOOKUP_CELL_DYNROOT, afs_cell_trace_use_lookup_dynroot); if (IS_ERR(cell)) { ret = PTR_ERR(cell); diff --git a/fs/afs/internal.h b/fs/afs/internal.h index a45ae5c2ef8a..b92f96f56767 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -343,6 +343,7 @@ extern const char afs_init_sysname[]; enum afs_cell_state { AFS_CELL_SETTING_UP, + AFS_CELL_UNLOOKED, AFS_CELL_ACTIVE, AFS_CELL_REMOVING, AFS_CELL_DEAD, @@ -1049,9 +1050,18 @@ static inline bool afs_cb_is_broken(unsigned int cb_break, extern int afs_cell_init(struct afs_net *, const char *); extern struct afs_cell *afs_find_cell(struct afs_net *, const char *, unsigned, enum afs_cell_trace); +enum afs_lookup_cell_for { + AFS_LOOKUP_CELL_DYNROOT, + AFS_LOOKUP_CELL_MOUNTPOINT, + AFS_LOOKUP_CELL_DIRECT_MOUNT, + AFS_LOOKUP_CELL_PRELOAD, + AFS_LOOKUP_CELL_ROOTCELL, + AFS_LOOKUP_CELL_ALIAS_CHECK, +}; struct afs_cell *afs_lookup_cell(struct afs_net *net, const char *name, unsigned int namesz, - const char *vllist, bool excl, + const char *vllist, + enum afs_lookup_cell_for reason, enum afs_cell_trace trace); extern struct afs_cell *afs_use_cell(struct afs_cell *, enum afs_cell_trace); void afs_unuse_cell(struct afs_cell *cell, enum afs_cell_trace reason); diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c index 1ad048e6e164..57c204a3c04e 100644 --- a/fs/afs/mntpt.c +++ b/fs/afs/mntpt.c @@ -107,7 +107,8 @@ static int afs_mntpt_set_params(struct fs_context *fc, struct dentry *mntpt) if (size > AFS_MAXCELLNAME) return -ENAMETOOLONG; - cell = afs_lookup_cell(ctx->net, p, size, NULL, false, + cell = afs_lookup_cell(ctx->net, p, size, NULL, + AFS_LOOKUP_CELL_MOUNTPOINT, afs_cell_trace_use_lookup_mntpt); if (IS_ERR(cell)) { pr_err("kAFS: unable to lookup cell '%pd'\n", mntpt); diff --git a/fs/afs/proc.c b/fs/afs/proc.c index 40e879c8ca77..44520549b509 100644 --- a/fs/afs/proc.c +++ b/fs/afs/proc.c @@ -122,7 +122,8 @@ static int afs_proc_cells_write(struct file *file, char *buf, size_t size) if (strcmp(buf, "add") == 0) { struct afs_cell *cell; - cell = afs_lookup_cell(net, name, strlen(name), args, true, + cell = afs_lookup_cell(net, name, strlen(name), args, + AFS_LOOKUP_CELL_PRELOAD, afs_cell_trace_use_lookup_add); if (IS_ERR(cell)) { ret = PTR_ERR(cell); diff --git a/fs/afs/super.c b/fs/afs/super.c index da407f2d6f0d..d672b7ab57ae 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -290,7 +290,7 @@ static int afs_parse_source(struct fs_context *fc, struct fs_parameter *param) /* lookup the cell record */ if (cellname) { cell = afs_lookup_cell(ctx->net, cellname, cellnamesz, - NULL, false, + NULL, AFS_LOOKUP_CELL_DIRECT_MOUNT, afs_cell_trace_use_lookup_mount); if (IS_ERR(cell)) { pr_err("kAFS: unable to lookup cell '%*.*s'\n", diff --git a/fs/afs/vl_alias.c b/fs/afs/vl_alias.c index 709b4cdb723e..fc9676abd252 100644 --- a/fs/afs/vl_alias.c +++ b/fs/afs/vl_alias.c @@ -269,7 +269,8 @@ static int yfs_check_canonical_cell_name(struct afs_cell *cell, struct key *key) if (!name_len || name_len > AFS_MAXCELLNAME) master = ERR_PTR(-EOPNOTSUPP); else - master = afs_lookup_cell(cell->net, cell_name, name_len, NULL, false, + master = afs_lookup_cell(cell->net, cell_name, name_len, NULL, + AFS_LOOKUP_CELL_ALIAS_CHECK, afs_cell_trace_use_lookup_canonical); kfree(cell_name); if (IS_ERR(master)) -- cgit From 34ab4c75588c07cca12884f2bf6b0347c7a13872 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Thu, 23 Oct 2025 22:25:49 +0900 Subject: bfs: Reconstruct file type when loading from disk syzbot is reporting that S_IFMT bits of inode->i_mode can become bogus when the S_IFMT bits of the 32bits "mode" field loaded from disk are corrupted or when the 32bits "attributes" field loaded from disk are corrupted. A documentation says that BFS uses only lower 9 bits of the "mode" field. But I can't find an explicit explanation that the unused upper 23 bits (especially, the S_IFMT bits) are initialized with 0. Therefore, ignore the S_IFMT bits of the "mode" field loaded from disk. Also, verify that the value of the "attributes" field loaded from disk is either BFS_VREG or BFS_VDIR (because BFS supports only regular files and the root directory). Reported-by: syzbot+895c23f6917da440ed0d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d Signed-off-by: Tetsuo Handa Link: https://patch.msgid.link/fabce673-d5b9-4038-8287-0fd65d80203b@I-love.SAKURA.ne.jp Reviewed-by: Tigran Aivazian Signed-off-by: Christian Brauner --- fs/bfs/inode.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c index 1d41ce477df5..984b365df046 100644 --- a/fs/bfs/inode.c +++ b/fs/bfs/inode.c @@ -61,7 +61,19 @@ struct inode *bfs_iget(struct super_block *sb, unsigned long ino) off = (ino - BFS_ROOT_INO) % BFS_INODES_PER_BLOCK; di = (struct bfs_inode *)bh->b_data + off; - inode->i_mode = 0x0000FFFF & le32_to_cpu(di->i_mode); + /* + * https://martin.hinner.info/fs/bfs/bfs-structure.html explains that + * BFS in SCO UnixWare environment used only lower 9 bits of di->i_mode + * value. This means that, although bfs_write_inode() saves whole + * inode->i_mode bits (which include S_IFMT bits and S_IS{UID,GID,VTX} + * bits), middle 7 bits of di->i_mode value can be garbage when these + * bits were not saved by bfs_write_inode(). + * Since we can't tell whether middle 7 bits are garbage, use only + * lower 12 bits (i.e. tolerate S_IS{UID,GID,VTX} bits possibly being + * garbage) and reconstruct S_IFMT bits for Linux environment from + * di->i_vtype value. + */ + inode->i_mode = 0x00000FFF & le32_to_cpu(di->i_mode); if (le32_to_cpu(di->i_vtype) == BFS_VDIR) { inode->i_mode |= S_IFDIR; inode->i_op = &bfs_dir_inops; @@ -71,6 +83,11 @@ struct inode *bfs_iget(struct super_block *sb, unsigned long ino) inode->i_op = &bfs_file_inops; inode->i_fop = &bfs_file_operations; inode->i_mapping->a_ops = &bfs_aops; + } else { + brelse(bh); + printf("Unknown vtype=%u %s:%08lx\n", + le32_to_cpu(di->i_vtype), inode->i_sb->s_id, ino); + goto error; } BFS_I(inode)->i_sblock = le32_to_cpu(di->i_sblock); -- cgit From 9db8d46712d274a27d1d22c38e70211f20d508c2 Mon Sep 17 00:00:00 2001 From: Andy Shevchenko Date: Fri, 24 Oct 2025 15:23:36 +0200 Subject: mnt: Remove dead code which might prevent from building Clang, in particular, is not happy about dead code: fs/namespace.c:135:37: error: unused function 'node_to_mnt_ns' [-Werror,-Wunused-function] 135 | static inline struct mnt_namespace *node_to_mnt_ns(const struct rb_node *node) | ^~~~~~~~~~~~~~ 1 error generated. Remove a leftover from the previous cleanup. Fixes: 7d7d16498958 ("mnt: support ns lookup") Signed-off-by: Andy Shevchenko Link: https://patch.msgid.link/20251024132336.1666382-1-andriy.shevchenko@linux.intel.com Signed-off-by: Christian Brauner --- fs/namespace.c | 10 ---------- 1 file changed, 10 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index 5b5ab2ae238b..cc6e00e72437 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -132,16 +132,6 @@ EXPORT_SYMBOL_GPL(fs_kobj); */ __cacheline_aligned_in_smp DEFINE_SEQLOCK(mount_lock); -static inline struct mnt_namespace *node_to_mnt_ns(const struct rb_node *node) -{ - struct ns_common *ns; - - if (!node) - return NULL; - ns = rb_entry(node, struct ns_common, ns_tree_node); - return container_of(ns, struct mnt_namespace, ns); -} - static void mnt_ns_release(struct mnt_namespace *ns) { /* keep alive for {list,stat}mount() */ -- cgit From f4fa7c25f632cd925352b4d46f245653a23b1d1a Mon Sep 17 00:00:00 2001 From: Andrea Righi Date: Wed, 29 Oct 2025 14:08:43 +0100 Subject: sched_ext: Fix use of uninitialized variable in scx_bpf_cpuperf_set() scx_bpf_cpuperf_set() has a typo where it dereferences the local variable @sch, instead of the global @scx_root pointer. Fix by dereferencing the correct variable. Fixes: 956f2b11a8a4f ("sched_ext: Drop kf_cpu_valid()") Signed-off-by: Andrea Righi Reviewed-by: Christian Loehle Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index ecb251e883ea..1a019a7728fb 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -6401,7 +6401,7 @@ __bpf_kfunc void scx_bpf_cpuperf_set(s32 cpu, u32 perf) guard(rcu)(); - sch = rcu_dereference(sch); + sch = rcu_dereference(scx_root); if (unlikely(!sch)) return; -- cgit From beab067dbcff642243291fd528355d64c41dc3b2 Mon Sep 17 00:00:00 2001 From: Zhang Heng Date: Fri, 12 Sep 2025 20:38:18 +0800 Subject: HID: quirks: work around VID/PID conflict for 0x4c4a/0x4155 Based on available evidence, the USB ID 4c4a:4155 used by multiple devices has been attributed to Jieli. The commit 1a8953f4f774 ("HID: Add IGNORE quirk for SMARTLINKTECHNOLOGY") affected touchscreen functionality. Added checks for manufacturer and serial number to maintain microphone compatibility, enabling both devices to function properly. [jkosina@suse.com: edit shortlog] Fixes: 1a8953f4f774 ("HID: Add IGNORE quirk for SMARTLINKTECHNOLOGY") Cc: stable@vger.kernel.org Tested-by: staffan.melin@oscillator.se Reviewed-by: Terry Junge Signed-off-by: Zhang Heng Signed-off-by: Jiri Kosina --- drivers/hid/hid-ids.h | 4 ++-- drivers/hid/hid-quirks.c | 13 ++++++++++++- 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index 0723b4b1c9ec..52ae7c29f9e0 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -1543,7 +1543,7 @@ #define USB_VENDOR_ID_SIGNOTEC 0x2133 #define USB_DEVICE_ID_SIGNOTEC_VIEWSONIC_PD1011 0x0018 -#define USB_VENDOR_ID_SMARTLINKTECHNOLOGY 0x4c4a -#define USB_DEVICE_ID_SMARTLINKTECHNOLOGY_4155 0x4155 +#define USB_VENDOR_ID_JIELI_SDK_DEFAULT 0x4c4a +#define USB_DEVICE_ID_JIELI_SDK_4155 0x4155 #endif diff --git a/drivers/hid/hid-quirks.c b/drivers/hid/hid-quirks.c index bcd4bccf1a7c..22760ac50f2d 100644 --- a/drivers/hid/hid-quirks.c +++ b/drivers/hid/hid-quirks.c @@ -915,7 +915,6 @@ static const struct hid_device_id hid_ignore_list[] = { #endif { HID_USB_DEVICE(USB_VENDOR_ID_YEALINK, USB_DEVICE_ID_YEALINK_P1K_P4K_B2K) }, { HID_USB_DEVICE(USB_VENDOR_ID_QUANTA, USB_DEVICE_ID_QUANTA_HP_5MP_CAMERA_5473) }, - { HID_USB_DEVICE(USB_VENDOR_ID_SMARTLINKTECHNOLOGY, USB_DEVICE_ID_SMARTLINKTECHNOLOGY_4155) }, { } }; @@ -1064,6 +1063,18 @@ bool hid_ignore(struct hid_device *hdev) strlen(elan_acpi_id[i].id))) return true; break; + case USB_VENDOR_ID_JIELI_SDK_DEFAULT: + /* + * Multiple USB devices with identical IDs (mic & touchscreen). + * The touch screen requires hid core processing, but the + * microphone does not. They can be distinguished by manufacturer + * and serial number. + */ + if (hdev->product == USB_DEVICE_ID_JIELI_SDK_4155 && + strncmp(hdev->name, "SmartlinkTechnology", 19) == 0 && + strncmp(hdev->uniq, "20201111000001", 14) == 0) + return true; + break; } if (hdev->type == HID_TYPE_USBMOUSE && -- cgit From a45f15808fb753a14c6041fd1e5bef5d552bd2e3 Mon Sep 17 00:00:00 2001 From: Lauri Tirkkonen Date: Sat, 18 Oct 2025 15:35:15 +0900 Subject: HID: lenovo: fixup Lenovo Yoga Slim 7x Keyboard rdesc The keyboard of this device has the following in its report description for Usage (Keyboard) in Collection (Application): # 0x15, 0x00, // Logical Minimum (0) 52 # 0x25, 0x65, // Logical Maximum (101) 54 # 0x05, 0x07, // Usage Page (Keyboard) 56 # 0x19, 0x00, // Usage Minimum (0) 58 # 0x29, 0xdd, // Usage Maximum (221) 60 # 0x81, 0x00, // Input (Data,Arr,Abs) 62 Since the Usage Min/Max range exceeds the Logical Min/Max range, keypresses outside the Logical range are not recognized. This includes, for example, the Japanese language keyboard variant's keys for |, _ and \. Fixup the report description to make the Logical range match the Usage range, fixing the interpretation of keypresses above 101 on this device. Signed-off-by: Lauri Tirkkonen Signed-off-by: Jiri Kosina --- drivers/hid/hid-ids.h | 1 + drivers/hid/hid-lenovo.c | 17 +++++++++++++++++ 2 files changed, 18 insertions(+) diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index 52ae7c29f9e0..85db279baa72 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -718,6 +718,7 @@ #define USB_DEVICE_ID_ITE_LENOVO_YOGA2 0x8350 #define I2C_DEVICE_ID_ITE_LENOVO_LEGION_Y720 0x837a #define USB_DEVICE_ID_ITE_LENOVO_YOGA900 0x8396 +#define I2C_DEVICE_ID_ITE_LENOVO_YOGA_SLIM_7X_KEYBOARD 0x8987 #define USB_DEVICE_ID_ITE8595 0x8595 #define USB_DEVICE_ID_ITE_MEDION_E1239T 0xce50 diff --git a/drivers/hid/hid-lenovo.c b/drivers/hid/hid-lenovo.c index 654879814f97..9cc3e029e9f6 100644 --- a/drivers/hid/hid-lenovo.c +++ b/drivers/hid/hid-lenovo.c @@ -148,6 +148,14 @@ static const __u8 lenovo_tpIIbtkbd_need_fixup_collection[] = { 0x81, 0x01, /* Input (Const,Array,Abs,No Wrap,Linear,Preferred State,No Null Position) */ }; +static const __u8 lenovo_yoga7x_kbd_need_fixup_collection[] = { + 0x15, 0x00, // Logical Minimum (0) + 0x25, 0x65, // Logical Maximum (101) + 0x05, 0x07, // Usage Page (Keyboard) + 0x19, 0x00, // Usage Minimum (0) + 0x29, 0xDD, // Usage Maximum (221) +}; + static const __u8 *lenovo_report_fixup(struct hid_device *hdev, __u8 *rdesc, unsigned int *rsize) { @@ -177,6 +185,13 @@ static const __u8 *lenovo_report_fixup(struct hid_device *hdev, __u8 *rdesc, rdesc[260] = 0x01; /* report count (2) = 0x01 */ } break; + case I2C_DEVICE_ID_ITE_LENOVO_YOGA_SLIM_7X_KEYBOARD: + if (*rsize == 176 && + memcmp(&rdesc[52], lenovo_yoga7x_kbd_need_fixup_collection, + sizeof(lenovo_yoga7x_kbd_need_fixup_collection)) == 0) { + rdesc[55] = rdesc[61]; // logical maximum = usage maximum + } + break; } return rdesc; } @@ -1538,6 +1553,8 @@ static const struct hid_device_id lenovo_devices[] = { USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_X12_TAB) }, { HID_DEVICE(BUS_USB, HID_GROUP_GENERIC, USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_X12_TAB2) }, + { HID_DEVICE(BUS_I2C, HID_GROUP_GENERIC, + USB_VENDOR_ID_ITE, I2C_DEVICE_ID_ITE_LENOVO_YOGA_SLIM_7X_KEYBOARD) }, { } }; -- cgit From 082ef944e55da8a9a8df92e3842ca82a626d359a Mon Sep 17 00:00:00 2001 From: Jianbo Liu Date: Tue, 28 Oct 2025 04:22:47 +0200 Subject: xfrm: Check inner packet family directly from skb_dst In the output path, xfrm_dev_offload_ok and xfrm_get_inner_ipproto need to determine the protocol family of the inner packet (skb) before it gets encapsulated. In xfrm_dev_offload_ok, the code checked x->inner_mode.family. This is unreliable because, for states handling both IPv4 and IPv6, the relevant inner family could be either x->inner_mode.family or x->inner_mode_iaf.family. Checking only the former can lead to a mismatch with the actual packet being processed. In xfrm_get_inner_ipproto, the code checked x->outer_mode.family. This is also incorrect for tunnel mode, as the inner packet's family can be different from the outer header's family. At both of these call sites, the skb variable holds the original inner packet. The most direct and reliable source of truth for its protocol family is its destination entry. This patch fixes the issue by using skb_dst(skb)->ops->family to ensure protocol-specific headers are only accessed for the correct packet type. Fixes: 91d8a53db219 ("xfrm: fix offloading of cross-family tunnels") Fixes: 45a98ef4922d ("net/xfrm: IPsec tunnel mode fix inner_ipproto setting in sec_path") Signed-off-by: Jianbo Liu Reviewed-by: Cosmin Ratiu Reviewed-by: Zhu Yanjun Reviewed-by: Sabrina Dubroca Signed-off-by: Steffen Klassert --- net/xfrm/xfrm_device.c | 2 +- net/xfrm/xfrm_output.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c index 44b9de6e4e77..52ae0e034d29 100644 --- a/net/xfrm/xfrm_device.c +++ b/net/xfrm/xfrm_device.c @@ -438,7 +438,7 @@ ok: check_tunnel_size = x->xso.type == XFRM_DEV_OFFLOAD_PACKET && x->props.mode == XFRM_MODE_TUNNEL; - switch (x->inner_mode.family) { + switch (skb_dst(skb)->ops->family) { case AF_INET: /* Check for IPv4 options */ if (ip_hdr(skb)->ihl != 5) diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c index 9077730ff7d0..a98b5bf55ac3 100644 --- a/net/xfrm/xfrm_output.c +++ b/net/xfrm/xfrm_output.c @@ -698,7 +698,7 @@ static void xfrm_get_inner_ipproto(struct sk_buff *skb, struct xfrm_state *x) return; if (x->outer_mode.encap == XFRM_MODE_TUNNEL) { - switch (x->outer_mode.family) { + switch (skb_dst(skb)->ops->family) { case AF_INET: xo->inner_ipproto = ip_hdr(skb)->protocol; break; -- cgit From 61fafbee6cfed283c02a320896089f658fa67e56 Mon Sep 17 00:00:00 2001 From: Jianbo Liu Date: Tue, 28 Oct 2025 04:22:48 +0200 Subject: xfrm: Determine inner GSO type from packet inner protocol The GSO segmentation functions for ESP tunnel mode (xfrm4_tunnel_gso_segment and xfrm6_tunnel_gso_segment) were determining the inner packet's L2 protocol type by checking the static x->inner_mode.family field from the xfrm state. This is unreliable. In tunnel mode, the state's actual inner family could be defined by x->inner_mode.family or by x->inner_mode_iaf.family. Checking only the former can lead to a mismatch with the actual packet being processed, causing GSO to create segments with the wrong L2 header type. This patch fixes the bug by deriving the inner mode directly from the packet's inner protocol stored in XFRM_MODE_SKB_CB(skb)->protocol. Instead of replicating the code, this patch modifies the xfrm_ip2inner_mode helper function. It now correctly returns &x->inner_mode if the selector family (x->sel.family) is already specified, thereby handling both specific and AF_UNSPEC cases appropriately. With this change, ESP GSO can use xfrm_ip2inner_mode to get the correct inner mode. It doesn't affect existing callers, as the updated logic now mirrors the checks they were already performing externally. Fixes: 26dbd66eab80 ("esp: choose the correct inner protocol for GSO on inter address family tunnels") Signed-off-by: Jianbo Liu Reviewed-by: Cosmin Ratiu Reviewed-by: Sabrina Dubroca Signed-off-by: Steffen Klassert --- include/net/xfrm.h | 3 ++- net/ipv4/esp4_offload.c | 6 ++++-- net/ipv6/esp6_offload.c | 6 ++++-- 3 files changed, 10 insertions(+), 5 deletions(-) diff --git a/include/net/xfrm.h b/include/net/xfrm.h index f3014e4f54fc..0a14daaa5dd4 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -536,7 +536,8 @@ static inline int xfrm_af2proto(unsigned int family) static inline const struct xfrm_mode *xfrm_ip2inner_mode(struct xfrm_state *x, int ipproto) { - if ((ipproto == IPPROTO_IPIP && x->props.family == AF_INET) || + if ((x->sel.family != AF_UNSPEC) || + (ipproto == IPPROTO_IPIP && x->props.family == AF_INET) || (ipproto == IPPROTO_IPV6 && x->props.family == AF_INET6)) return &x->inner_mode; else diff --git a/net/ipv4/esp4_offload.c b/net/ipv4/esp4_offload.c index e0d94270da28..05828d4cb6cd 100644 --- a/net/ipv4/esp4_offload.c +++ b/net/ipv4/esp4_offload.c @@ -122,8 +122,10 @@ static struct sk_buff *xfrm4_tunnel_gso_segment(struct xfrm_state *x, struct sk_buff *skb, netdev_features_t features) { - __be16 type = x->inner_mode.family == AF_INET6 ? htons(ETH_P_IPV6) - : htons(ETH_P_IP); + const struct xfrm_mode *inner_mode = xfrm_ip2inner_mode(x, + XFRM_MODE_SKB_CB(skb)->protocol); + __be16 type = inner_mode->family == AF_INET6 ? htons(ETH_P_IPV6) + : htons(ETH_P_IP); return skb_eth_gso_segment(skb, features, type); } diff --git a/net/ipv6/esp6_offload.c b/net/ipv6/esp6_offload.c index 7b41fb4f00b5..22410243ebe8 100644 --- a/net/ipv6/esp6_offload.c +++ b/net/ipv6/esp6_offload.c @@ -158,8 +158,10 @@ static struct sk_buff *xfrm6_tunnel_gso_segment(struct xfrm_state *x, struct sk_buff *skb, netdev_features_t features) { - __be16 type = x->inner_mode.family == AF_INET ? htons(ETH_P_IP) - : htons(ETH_P_IPV6); + const struct xfrm_mode *inner_mode = xfrm_ip2inner_mode(x, + XFRM_MODE_SKB_CB(skb)->protocol); + __be16 type = inner_mode->family == AF_INET ? htons(ETH_P_IP) + : htons(ETH_P_IPV6); return skb_eth_gso_segment(skb, features, type); } -- cgit From 59630e2ccd728703cc826e3a3515d70f8c7a766c Mon Sep 17 00:00:00 2001 From: Jianbo Liu Date: Wed, 29 Oct 2025 11:50:25 +0200 Subject: xfrm: Prevent locally generated packets from direct output in tunnel mode Add a check to ensure locally generated packets (skb->sk != NULL) do not use direct output in tunnel mode, as these packets require proper L2 header setup that is handled by the normal XFRM processing path. Fixes: 5eddd76ec2fd ("xfrm: fix tunnel mode TX datapath in packet offload mode") Signed-off-by: Jianbo Liu Reviewed-by: Leon Romanovsky Signed-off-by: Steffen Klassert --- net/xfrm/xfrm_output.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c index a98b5bf55ac3..54222fcbd7fd 100644 --- a/net/xfrm/xfrm_output.c +++ b/net/xfrm/xfrm_output.c @@ -772,8 +772,12 @@ int xfrm_output(struct sock *sk, struct sk_buff *skb) /* Exclusive direct xmit for tunnel mode, as * some filtering or matching rules may apply * in transport mode. + * Locally generated packets also require + * the normal XFRM path for L2 header setup, + * as the hardware needs the L2 header to match + * for encryption, so skip direct output as well. */ - if (x->props.mode == XFRM_MODE_TUNNEL) + if (x->props.mode == XFRM_MODE_TUNNEL && !skb->sk) return xfrm_dev_direct_output(sk, x, skb); return xfrm_output_resume(sk, skb, 0); -- cgit From 743c81cdc98fd4fef62a89eb70efff994112c2d9 Mon Sep 17 00:00:00 2001 From: April Grimoire Date: Thu, 23 Oct 2025 00:37:26 +0800 Subject: HID: apple: Add SONiX AK870 PRO to non_apple_keyboards quirk list SONiX AK870 PRO keyboard pretends to be an apple keyboard by VID:PID, rendering function keys not treated properly. Despite being a SONiX USB DEVICE, it uses a different name, so adding it to the list. Signed-off-by: April Grimoire Signed-off-by: Jiri Kosina --- drivers/hid/hid-apple.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/hid/hid-apple.c b/drivers/hid/hid-apple.c index 61404d7a43ee..57da4f86a9fa 100644 --- a/drivers/hid/hid-apple.c +++ b/drivers/hid/hid-apple.c @@ -355,6 +355,7 @@ static const struct apple_key_translation swapped_fn_leftctrl_keys[] = { static const struct apple_non_apple_keyboard non_apple_keyboards[] = { { "SONiX USB DEVICE" }, + { "SONiX AK870 PRO" }, { "Keychron" }, { "AONE" }, { "GANSS" }, -- cgit From 4d3a13afa8b64dc49293b3eab3e7beac11072c12 Mon Sep 17 00:00:00 2001 From: "Mario Limonciello (AMD)" Date: Mon, 20 Oct 2025 10:50:42 -0500 Subject: HID: amd_sfh: Stop sensor before starting Titas reports that the accelerometer sensor on their laptop only works after a warm boot or unloading/reloading the amd-sfh kernel module. Presumably the sensor is in a bad state on cold boot and failing to start, so explicitly stop it before starting. Cc: stable@vger.kernel.org Fixes: 93ce5e0231d79 ("HID: amd_sfh: Implement SFH1.1 functionality") Reported-by: Titas Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220670 Tested-by: Titas Signed-off-by: Mario Limonciello (AMD) Signed-off-by: Jiri Kosina --- drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_init.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_init.c b/drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_init.c index 0a9b44ce4904..b0bab2a1ddcc 100644 --- a/drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_init.c +++ b/drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_init.c @@ -194,6 +194,8 @@ static int amd_sfh1_1_hid_client_init(struct amd_mp2_dev *privdata) if (rc) goto cleanup; + mp2_ops->stop(privdata, cl_data->sensor_idx[i]); + amd_sfh_wait_for_response(privdata, cl_data->sensor_idx[i], DISABLE_SENSOR); writel(0, privdata->mmio + amd_get_p2c_val(privdata, 0)); mp2_ops->start(privdata, info); status = amd_sfh_wait_for_response -- cgit From 53f731f5bba0cf03b751ccceb98b82fadc9ccd1e Mon Sep 17 00:00:00 2001 From: Masami Ichikawa Date: Sun, 21 Sep 2025 14:31:02 +0900 Subject: HID: hid-ntrig: Prevent memory leak in ntrig_report_version() Use a scope-based cleanup helper for the buffer allocated with kmalloc() in ntrig_report_version() to simplify the cleanup logic and prevent memory leaks (specifically the !hid_is_usb()-case one). [jkosina@suse.com: elaborate on the actual existing leak] Fixes: 185c926283da ("HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()") Signed-off-by: Masami Ichikawa Signed-off-by: Jiri Kosina --- drivers/hid/hid-ntrig.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/hid/hid-ntrig.c b/drivers/hid/hid-ntrig.c index 0f76e241e0af..a7f10c45f62b 100644 --- a/drivers/hid/hid-ntrig.c +++ b/drivers/hid/hid-ntrig.c @@ -142,13 +142,13 @@ static void ntrig_report_version(struct hid_device *hdev) int ret; char buf[20]; struct usb_device *usb_dev = hid_to_usb_dev(hdev); - unsigned char *data = kmalloc(8, GFP_KERNEL); + unsigned char *data __free(kfree) = kmalloc(8, GFP_KERNEL); if (!hid_is_usb(hdev)) return; if (!data) - goto err_free; + return; ret = usb_control_msg(usb_dev, usb_rcvctrlpipe(usb_dev, 0), USB_REQ_CLEAR_FEATURE, @@ -163,9 +163,6 @@ static void ntrig_report_version(struct hid_device *hdev) hid_info(hdev, "Firmware version: %s (%02x%02x %02x%02x)\n", buf, data[2], data[3], data[4], data[5]); } - -err_free: - kfree(data); } static ssize_t show_phys_width(struct device *dev, -- cgit From 534ca75e8e3b713514b3f2da85dab96831cf5b2a Mon Sep 17 00:00:00 2001 From: "Mario Limonciello (AMD)" Date: Thu, 30 Oct 2025 11:06:25 -0500 Subject: HID: hid-input: Extend Elan ignore battery quirk to USB MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit USB Elan devices have the same problem as the I2C ones with a fake battery device showing up. Reviewed-by: Hans de Goede Reported-by: André Barata Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220722 Signed-off-by: Mario Limonciello (AMD) Signed-off-by: Jiri Kosina --- drivers/hid/hid-input.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c index e56e7de53279..2bbb645c2ff4 100644 --- a/drivers/hid/hid-input.c +++ b/drivers/hid/hid-input.c @@ -399,10 +399,11 @@ static const struct hid_device_id hid_battery_quirks[] = { { HID_I2C_DEVICE(USB_VENDOR_ID_ELAN, I2C_DEVICE_ID_CHROMEBOOK_TROGDOR_POMPOM), HID_BATTERY_QUIRK_AVOID_QUERY }, /* - * Elan I2C-HID touchscreens seem to all report a non present battery, - * set HID_BATTERY_QUIRK_IGNORE for all Elan I2C-HID devices. + * Elan HID touchscreens seem to all report a non present battery, + * set HID_BATTERY_QUIRK_IGNORE for all Elan I2C and USB HID devices. */ { HID_I2C_DEVICE(USB_VENDOR_ID_ELAN, HID_ANY_ID), HID_BATTERY_QUIRK_IGNORE }, + { HID_USB_DEVICE(USB_VENDOR_ID_ELAN, HID_ANY_ID), HID_BATTERY_QUIRK_IGNORE }, {} }; -- cgit From 08d70143e3033d267507deb98a5fd187df3e6640 Mon Sep 17 00:00:00 2001 From: Quentin Schulz Date: Wed, 29 Oct 2025 14:50:59 +0100 Subject: arm64: dts: rockchip: include rk3399-base instead of rk3399 in rk3399-op1 In commit 296602b8e5f7 ("arm64: dts: rockchip: Move RK3399 OPPs to dtsi files for SoC variants"), everything shared between variants of RK3399 was put into rk3399-base.dtsi and the rest in variant-specific DTSI, such as rk3399-t, rk3399-op1, rk3399, etc. Therefore, the variant-specific DTSI should include rk3399-base.dtsi and not another variant's DTSI. rk3399-op1 wrongly includes rk3399 (a variant) DTSI instead of rk3399-base DTSI, let's fix this oversight by including the intended DTSI. Fortunately, this had no impact on the resulting DTB since all nodes were named the same and all node properties were overridden in rk3399-op1.dtsi. This was checked by doing a checksum of rk3399-op1 DTBs before and after this commit. No intended change in behavior. Fixes: 296602b8e5f7 ("arm64: dts: rockchip: Move RK3399 OPPs to dtsi files for SoC variants") Cc: stable@vger.kernel.org Signed-off-by: Quentin Schulz Reviewed-by: Dragan Simic Link: https://patch.msgid.link/20251029-rk3399-op1-include-v1-1-2472ee60e7f8@cherry.de Signed-off-by: Heiko Stuebner --- arch/arm64/boot/dts/rockchip/rk3399-op1.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3399-op1.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-op1.dtsi index c4f4f1ff6117..9da6fd82e46b 100644 --- a/arch/arm64/boot/dts/rockchip/rk3399-op1.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3399-op1.dtsi @@ -3,7 +3,7 @@ * Copyright (c) 2016-2017 Fuzhou Rockchip Electronics Co., Ltd */ -#include "rk3399.dtsi" +#include "rk3399-base.dtsi" / { cluster0_opp: opp-table-0 { -- cgit From 03c7e964a02e388ee168c804add7404eda23908c Mon Sep 17 00:00:00 2001 From: Diederik de Haas Date: Mon, 27 Oct 2025 16:54:28 +0100 Subject: arm64: dts: rockchip: Fix vccio4-supply on rk3566-pinetab2 Page 13 of the PineTab2 v2 schematic dd 20230417 shows VCCIO4's power source is VCCIO_WL. Page 19 shows that VCCIO_WL is connected to VCCA1V8_PMU, so fix the PineTab2 dtsi to reflect that. Fixes: 1b7e19448f8f ("arm64: dts: rockchip: Add devicetree for Pine64 PineTab2") Cc: stable@vger.kernel.org Reviewed-by: Dragan Simic Signed-off-by: Diederik de Haas Link: https://patch.msgid.link/20251027155724.138096-1-diederik@cknow-tech.com Signed-off-by: Heiko Stuebner --- arch/arm64/boot/dts/rockchip/rk3566-pinetab2.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3566-pinetab2.dtsi b/arch/arm64/boot/dts/rockchip/rk3566-pinetab2.dtsi index d0e38412d56a..08bf40de17ea 100644 --- a/arch/arm64/boot/dts/rockchip/rk3566-pinetab2.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3566-pinetab2.dtsi @@ -789,7 +789,7 @@ vccio1-supply = <&vccio_acodec>; vccio2-supply = <&vcc_1v8>; vccio3-supply = <&vccio_sd>; - vccio4-supply = <&vcc_1v8>; + vccio4-supply = <&vcca1v8_pmu>; vccio5-supply = <&vcc_1v8>; vccio6-supply = <&vcc1v8_dvp>; vccio7-supply = <&vcc_3v3>; -- cgit From a1d3bc606bf5c3b3ea811cc2019df6285d75b00f Mon Sep 17 00:00:00 2001 From: Mikhail Kshevetskiy Date: Mon, 3 Nov 2025 04:01:48 +0300 Subject: mtd: spinand: fmsh: remove QE bit for FM25S01A flash According to datasheet (http://eng.fmsh.com/nvm/FM25S01A_ds_eng.pdf) there is no QE (Quad Enable) bit for FM25S01A flash, so remove it. Fixes: 5f284dc15ca86 ("mtd: spinand: add support for FudanMicro FM25S01A") Signed-off-by: Mikhail Kshevetskiy Tested-by: Tianling Shen Signed-off-by: Miquel Raynal --- drivers/mtd/nand/spi/fmsh.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/mtd/nand/spi/fmsh.c b/drivers/mtd/nand/spi/fmsh.c index 8b2097bfc771..c2b9a8c113cb 100644 --- a/drivers/mtd/nand/spi/fmsh.c +++ b/drivers/mtd/nand/spi/fmsh.c @@ -58,7 +58,7 @@ static const struct spinand_info fmsh_spinand_table[] = { SPINAND_INFO_OP_VARIANTS(&read_cache_variants, &write_cache_variants, &update_cache_variants), - SPINAND_HAS_QE_BIT, + 0, SPINAND_ECCINFO(&fm25s01a_ooblayout, NULL)), }; -- cgit From 97315e7c901a1de60e8ca9b11e0e96d0f9253e18 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Sat, 1 Nov 2025 16:25:48 +0300 Subject: mtd: onenand: Pass correct pointer to IRQ handler This was supposed to pass "onenand" instead of "&onenand" with the ampersand. Passing a random stack address which will be gone when the function ends makes no sense. However the good thing is that the pointer is never used, so this doesn't cause a problem at run time. Fixes: e23abf4b7743 ("mtd: OneNAND: S5PC110: Implement DMA interrupt method") Signed-off-by: Dan Carpenter Signed-off-by: Miquel Raynal --- drivers/mtd/nand/onenand/onenand_samsung.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/mtd/nand/onenand/onenand_samsung.c b/drivers/mtd/nand/onenand/onenand_samsung.c index f37a6138e461..6d6aa709a21f 100644 --- a/drivers/mtd/nand/onenand/onenand_samsung.c +++ b/drivers/mtd/nand/onenand/onenand_samsung.c @@ -906,7 +906,7 @@ static int s3c_onenand_probe(struct platform_device *pdev) err = devm_request_irq(&pdev->dev, r->start, s5pc110_onenand_irq, IRQF_SHARED, "onenand", - &onenand); + onenand); if (err) { dev_err(&pdev->dev, "failed to get irq\n"); return err; -- cgit From 8da0efc3da9312b65f5cbf06e57d284f69222b2e Mon Sep 17 00:00:00 2001 From: Richard Fitzgerald Date: Mon, 3 Nov 2025 11:58:09 +0000 Subject: ASoC: doc: cs35l56: Update firmware filename description for B0 silicon Update the text for firmware file naming to show that the l?u? suffix is supported on CS35L56 B0 silicon and ampN was only used on early firmware. The previous version of this text only said that B0 silicon used the ampN suffix. Since kernel 6.16 the driver supports both the old ampN and new l?u? suffix for B0 silicon. New firmwares will use the l?u? suffix. Signed-off-by: Richard Fitzgerald Link: https://patch.msgid.link/20251103115809.33953-1-rf@opensource.cirrus.com Signed-off-by: Mark Brown --- Documentation/sound/codecs/cs35l56.rst | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/Documentation/sound/codecs/cs35l56.rst b/Documentation/sound/codecs/cs35l56.rst index 57d1964453e1..d5363b08f515 100644 --- a/Documentation/sound/codecs/cs35l56.rst +++ b/Documentation/sound/codecs/cs35l56.rst @@ -105,10 +105,10 @@ In this example the SSID is 10280c63. The format of the firmware file names is: -SoundWire (except CS35L56 Rev B0): +SoundWire: cs35lxx-b0-dsp1-misc-SSID[-spkidX]-l?u? -SoundWire CS35L56 Rev B0: +SoundWire CS35L56 Rev B0 firmware released before kernel version 6.16: cs35lxx-b0-dsp1-misc-SSID[-spkidX]-ampN Non-SoundWire (HDA and I2S): @@ -127,9 +127,8 @@ Where: * spkidX is an optional part, used for laptops that have firmware configurations for different makes and models of internal speakers. -The CS35L56 Rev B0 continues to use the old filename scheme because a -large number of firmware files have already been published with these -names. +Early firmware for CS35L56 Rev B0 used the ALSA prefix (ampN) as the +filename qualifier. Support for the l?u? qualifier was added in kernel 6.16. Sound Open Firmware and ALSA topology files ------------------------------------------- -- cgit From 8f05967b022d255412640670915475ac4cdc10e9 Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Mon, 3 Nov 2025 08:49:32 -0700 Subject: MAINTAINERS: correct git location for block layer tree As part of a recent move go exclusively listing git.kernel.org trees for the block and io_uring development, the "BLOCK LAYER" entry wasn't updated as it already used git.kernel.org. However, outside of just moving from git.kernel.dk to git.kernel.org, the "block" part of the trees was also dropped, as the tree serves both block and io_uring development trees. Fix up the "BLOCK LAYER" entry so they all use the same tree. Reported-by: John Garry Reviewed-by: John Garry Signed-off-by: Jens Axboe --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 0554bf05b426..b986f4635b7d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -4362,7 +4362,7 @@ BLOCK LAYER M: Jens Axboe L: linux-block@vger.kernel.org S: Maintained -T: git git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git +T: git git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux.git F: Documentation/ABI/stable/sysfs-block F: Documentation/block/ F: block/ -- cgit From 56b3c85e153b84f27e6cff39623ba40a1ad299d3 Mon Sep 17 00:00:00 2001 From: Song Liu Date: Mon, 27 Oct 2025 10:50:21 -0700 Subject: ftrace: Fix BPF fexit with livepatch When livepatch is attached to the same function as bpf trampoline with a fexit program, bpf trampoline code calls register_ftrace_direct() twice. The first time will fail with -EAGAIN, and the second time it will succeed. This requires register_ftrace_direct() to unregister the address on the first attempt. Otherwise, the bpf trampoline cannot attach. Here is an easy way to reproduce this issue: insmod samples/livepatch/livepatch-sample.ko bpftrace -e 'fexit:cmdline_proc_show {}' ERROR: Unable to attach probe: fexit:vmlinux:cmdline_proc_show... Fix this by cleaning up the hash when register_ftrace_function_nolock hits errors. Also, move the code that resets ops->func and ops->trampoline to the error path of register_ftrace_direct(); and add a helper function reset_direct() in register_ftrace_direct() and unregister_ftrace_direct(). Fixes: d05cb470663a ("ftrace: Fix modification of direct_function hash while in use") Cc: stable@vger.kernel.org # v6.6+ Reported-by: Andrey Grodzovsky Closes: https://lore.kernel.org/live-patching/c5058315a39d4615b333e485893345be@crowdstrike.com/ Cc: Steven Rostedt (Google) Cc: Masami Hiramatsu (Google) Acked-and-tested-by: Andrey Grodzovsky Signed-off-by: Song Liu Reviewed-by: Jiri Olsa Link: https://lore.kernel.org/r/20251027175023.1521602-2-song@kernel.org Signed-off-by: Alexei Starovoitov Acked-by: Steven Rostedt (Google) --- kernel/bpf/trampoline.c | 5 ----- kernel/trace/ftrace.c | 20 ++++++++++++++------ 2 files changed, 14 insertions(+), 11 deletions(-) diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c index 5949095e51c3..f2cb0b097093 100644 --- a/kernel/bpf/trampoline.c +++ b/kernel/bpf/trampoline.c @@ -479,11 +479,6 @@ again: * BPF_TRAMP_F_SHARE_IPMODIFY is set, we can generate the * trampoline again, and retry register. */ - /* reset fops->func and fops->trampoline for re-register */ - tr->fops->func = NULL; - tr->fops->trampoline = 0; - - /* free im memory and reallocate later */ bpf_tramp_image_free(im); goto again; } diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 42bd2ba68a82..cbeb7e833131 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -5953,6 +5953,17 @@ static void register_ftrace_direct_cb(struct rcu_head *rhp) free_ftrace_hash(fhp); } +static void reset_direct(struct ftrace_ops *ops, unsigned long addr) +{ + struct ftrace_hash *hash = ops->func_hash->filter_hash; + + remove_direct_functions_hash(hash, addr); + + /* cleanup for possible another register call */ + ops->func = NULL; + ops->trampoline = 0; +} + /** * register_ftrace_direct - Call a custom trampoline directly * for multiple functions registered in @ops @@ -6048,6 +6059,8 @@ int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr) ops->direct_call = addr; err = register_ftrace_function_nolock(ops); + if (err) + reset_direct(ops, addr); out_unlock: mutex_unlock(&direct_mutex); @@ -6080,7 +6093,6 @@ EXPORT_SYMBOL_GPL(register_ftrace_direct); int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr, bool free_filters) { - struct ftrace_hash *hash = ops->func_hash->filter_hash; int err; if (check_direct_multi(ops)) @@ -6090,13 +6102,9 @@ int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr, mutex_lock(&direct_mutex); err = unregister_ftrace_function(ops); - remove_direct_functions_hash(hash, addr); + reset_direct(ops, addr); mutex_unlock(&direct_mutex); - /* cleanup for possible another register call */ - ops->func = NULL; - ops->trampoline = 0; - if (free_filters) ftrace_free_filter(ops); return err; -- cgit From 3e9a18e1c3e931abecf501cbb23d28d69f85bb56 Mon Sep 17 00:00:00 2001 From: Song Liu Date: Mon, 27 Oct 2025 10:50:22 -0700 Subject: ftrace: bpf: Fix IPMODIFY + DIRECT in modify_ftrace_direct() ftrace_hash_ipmodify_enable() checks IPMODIFY and DIRECT ftrace_ops on the same kernel function. When needed, ftrace_hash_ipmodify_enable() calls ops->ops_func() to prepare the direct ftrace (BPF trampoline) to share the same function as the IPMODIFY ftrace (livepatch). ftrace_hash_ipmodify_enable() is called in register_ftrace_direct() path, but not called in modify_ftrace_direct() path. As a result, the following operations will break livepatch: 1. Load livepatch to a kernel function; 2. Attach fentry program to the kernel function; 3. Attach fexit program to the kernel function. After 3, the kernel function being used will not be the livepatched version, but the original version. Fix this by adding __ftrace_hash_update_ipmodify() to __modify_ftrace_direct() and adjust some logic around the call. Signed-off-by: Song Liu Reviewed-by: Jiri Olsa Link: https://lore.kernel.org/r/20251027175023.1521602-3-song@kernel.org Signed-off-by: Alexei Starovoitov Acked-by: Steven Rostedt (Google) --- kernel/trace/ftrace.c | 40 +++++++++++++++++++++++++++++++--------- 1 file changed, 31 insertions(+), 9 deletions(-) diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index cbeb7e833131..59cfacb8a5bb 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -1971,7 +1971,8 @@ static void ftrace_hash_rec_enable_modify(struct ftrace_ops *ops) */ static int __ftrace_hash_update_ipmodify(struct ftrace_ops *ops, struct ftrace_hash *old_hash, - struct ftrace_hash *new_hash) + struct ftrace_hash *new_hash, + bool update_target) { struct ftrace_page *pg; struct dyn_ftrace *rec, *end = NULL; @@ -2006,10 +2007,13 @@ static int __ftrace_hash_update_ipmodify(struct ftrace_ops *ops, if (rec->flags & FTRACE_FL_DISABLED) continue; - /* We need to update only differences of filter_hash */ + /* + * Unless we are updating the target of a direct function, + * we only need to update differences of filter_hash + */ in_old = !!ftrace_lookup_ip(old_hash, rec->ip); in_new = !!ftrace_lookup_ip(new_hash, rec->ip); - if (in_old == in_new) + if (!update_target && (in_old == in_new)) continue; if (in_new) { @@ -2020,7 +2024,16 @@ static int __ftrace_hash_update_ipmodify(struct ftrace_ops *ops, if (is_ipmodify) goto rollback; - FTRACE_WARN_ON(rec->flags & FTRACE_FL_DIRECT); + /* + * If this is called by __modify_ftrace_direct() + * then it is only changing where the direct + * pointer is jumping to, and the record already + * points to a direct trampoline. If it isn't, + * then it is a bug to update ipmodify on a direct + * caller. + */ + FTRACE_WARN_ON(!update_target && + (rec->flags & FTRACE_FL_DIRECT)); /* * Another ops with IPMODIFY is already @@ -2076,7 +2089,7 @@ static int ftrace_hash_ipmodify_enable(struct ftrace_ops *ops) if (ftrace_hash_empty(hash)) hash = NULL; - return __ftrace_hash_update_ipmodify(ops, EMPTY_HASH, hash); + return __ftrace_hash_update_ipmodify(ops, EMPTY_HASH, hash, false); } /* Disabling always succeeds */ @@ -2087,7 +2100,7 @@ static void ftrace_hash_ipmodify_disable(struct ftrace_ops *ops) if (ftrace_hash_empty(hash)) hash = NULL; - __ftrace_hash_update_ipmodify(ops, hash, EMPTY_HASH); + __ftrace_hash_update_ipmodify(ops, hash, EMPTY_HASH, false); } static int ftrace_hash_ipmodify_update(struct ftrace_ops *ops, @@ -2101,7 +2114,7 @@ static int ftrace_hash_ipmodify_update(struct ftrace_ops *ops, if (ftrace_hash_empty(new_hash)) new_hash = NULL; - return __ftrace_hash_update_ipmodify(ops, old_hash, new_hash); + return __ftrace_hash_update_ipmodify(ops, old_hash, new_hash, false); } static void print_ip_ins(const char *fmt, const unsigned char *p) @@ -6114,7 +6127,7 @@ EXPORT_SYMBOL_GPL(unregister_ftrace_direct); static int __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr) { - struct ftrace_hash *hash; + struct ftrace_hash *hash = ops->func_hash->filter_hash; struct ftrace_func_entry *entry, *iter; static struct ftrace_ops tmp_ops = { .func = ftrace_stub, @@ -6134,13 +6147,21 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr) if (err) return err; + /* + * Call __ftrace_hash_update_ipmodify() here, so that we can call + * ops->ops_func for the ops. This is needed because the above + * register_ftrace_function_nolock() worked on tmp_ops. + */ + err = __ftrace_hash_update_ipmodify(ops, hash, hash, true); + if (err) + goto out; + /* * Now the ftrace_ops_list_func() is called to do the direct callers. * We can safely change the direct functions attached to each entry. */ mutex_lock(&ftrace_lock); - hash = ops->func_hash->filter_hash; size = 1 << hash->size_bits; for (i = 0; i < size; i++) { hlist_for_each_entry(iter, &hash->buckets[i], hlist) { @@ -6155,6 +6176,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr) mutex_unlock(&ftrace_lock); +out: /* Removing the tmp_ops will add the updated direct callers to the functions */ unregister_ftrace_function(&tmp_ops); -- cgit From 62d2d0a33839c28173909616db2ef16e1a4a5071 Mon Sep 17 00:00:00 2001 From: Song Liu Date: Mon, 27 Oct 2025 10:50:23 -0700 Subject: selftests/bpf: Add tests for livepatch + bpf trampoline Both livepatch and BPF trampoline use ftrace. Special attention is needed when livepatch and fexit program touch the same function at the same time, because livepatch updates a kernel function and the BPF trampoline need to call into the right version of the kernel function. Use samples/livepatch/livepatch-sample.ko for the test. The test covers two cases: 1) When a fentry program is loaded first. This exercises the modify_ftrace_direct code path. 2) When a fentry program is loaded first. This exercises the register_ftrace_direct code path. Signed-off-by: Song Liu Reviewed-by: Jiri Olsa Link: https://lore.kernel.org/r/20251027175023.1521602-4-song@kernel.org Signed-off-by: Alexei Starovoitov Acked-by: Steven Rostedt (Google) --- tools/testing/selftests/bpf/config | 3 + .../bpf/prog_tests/livepatch_trampoline.c | 107 +++++++++++++++++++++ .../selftests/bpf/progs/livepatch_trampoline.c | 30 ++++++ 3 files changed, 140 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/livepatch_trampoline.c create mode 100644 tools/testing/selftests/bpf/progs/livepatch_trampoline.c diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config index 70b28c1e653e..f2a2fd236ca8 100644 --- a/tools/testing/selftests/bpf/config +++ b/tools/testing/selftests/bpf/config @@ -50,6 +50,7 @@ CONFIG_IPV6_SIT=y CONFIG_IPV6_TUNNEL=y CONFIG_KEYS=y CONFIG_LIRC=y +CONFIG_LIVEPATCH=y CONFIG_LWTUNNEL=y CONFIG_MODULE_SIG=y CONFIG_MODULE_SRCVERSION_ALL=y @@ -111,6 +112,8 @@ CONFIG_IP6_NF_FILTER=y CONFIG_NF_NAT=y CONFIG_PACKET=y CONFIG_RC_CORE=y +CONFIG_SAMPLES=y +CONFIG_SAMPLE_LIVEPATCH=m CONFIG_SECURITY=y CONFIG_SECURITYFS=y CONFIG_SYN_COOKIES=y diff --git a/tools/testing/selftests/bpf/prog_tests/livepatch_trampoline.c b/tools/testing/selftests/bpf/prog_tests/livepatch_trampoline.c new file mode 100644 index 000000000000..72aa5376c30e --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/livepatch_trampoline.c @@ -0,0 +1,107 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */ + +#include +#include "testing_helpers.h" +#include "livepatch_trampoline.skel.h" + +static int load_livepatch(void) +{ + char path[4096]; + + /* CI will set KBUILD_OUTPUT */ + snprintf(path, sizeof(path), "%s/samples/livepatch/livepatch-sample.ko", + getenv("KBUILD_OUTPUT") ? : "../../../.."); + + return load_module(path, env_verbosity > VERBOSE_NONE); +} + +static void unload_livepatch(void) +{ + /* Disable the livepatch before unloading the module */ + system("echo 0 > /sys/kernel/livepatch/livepatch_sample/enabled"); + + unload_module("livepatch_sample", env_verbosity > VERBOSE_NONE); +} + +static void read_proc_cmdline(void) +{ + char buf[4096]; + int fd, ret; + + fd = open("/proc/cmdline", O_RDONLY); + if (!ASSERT_OK_FD(fd, "open /proc/cmdline")) + return; + + ret = read(fd, buf, sizeof(buf)); + if (!ASSERT_GT(ret, 0, "read /proc/cmdline")) + goto out; + + ASSERT_OK(strncmp(buf, "this has been live patched", 26), "strncmp"); + +out: + close(fd); +} + +static void __test_livepatch_trampoline(bool fexit_first) +{ + struct livepatch_trampoline *skel = NULL; + int err; + + skel = livepatch_trampoline__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open_and_load")) + goto out; + + skel->bss->my_pid = getpid(); + + if (!fexit_first) { + /* fentry program is loaded first by default */ + err = livepatch_trampoline__attach(skel); + if (!ASSERT_OK(err, "skel_attach")) + goto out; + } else { + /* Manually load fexit program first. */ + skel->links.fexit_cmdline = bpf_program__attach(skel->progs.fexit_cmdline); + if (!ASSERT_OK_PTR(skel->links.fexit_cmdline, "attach_fexit")) + goto out; + + skel->links.fentry_cmdline = bpf_program__attach(skel->progs.fentry_cmdline); + if (!ASSERT_OK_PTR(skel->links.fentry_cmdline, "attach_fentry")) + goto out; + } + + read_proc_cmdline(); + + ASSERT_EQ(skel->bss->fentry_hit, 1, "fentry_hit"); + ASSERT_EQ(skel->bss->fexit_hit, 1, "fexit_hit"); +out: + livepatch_trampoline__destroy(skel); +} + +void test_livepatch_trampoline(void) +{ + int retry_cnt = 0; + +retry: + if (load_livepatch()) { + if (retry_cnt) { + ASSERT_OK(1, "load_livepatch"); + goto out; + } + /* + * Something else (previous run of the same test?) loaded + * the KLP module. Unload the KLP module and retry. + */ + unload_livepatch(); + retry_cnt++; + goto retry; + } + + if (test__start_subtest("fentry_first")) + __test_livepatch_trampoline(false); + + if (test__start_subtest("fexit_first")) + __test_livepatch_trampoline(true); +out: + unload_livepatch(); +} diff --git a/tools/testing/selftests/bpf/progs/livepatch_trampoline.c b/tools/testing/selftests/bpf/progs/livepatch_trampoline.c new file mode 100644 index 000000000000..15579d5bcd91 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/livepatch_trampoline.c @@ -0,0 +1,30 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */ + +#include +#include +#include + +int fentry_hit; +int fexit_hit; +int my_pid; + +SEC("fentry/cmdline_proc_show") +int BPF_PROG(fentry_cmdline) +{ + if (my_pid != (bpf_get_current_pid_tgid() >> 32)) + return 0; + + fentry_hit = 1; + return 0; +} + +SEC("fexit/cmdline_proc_show") +int BPF_PROG(fexit_cmdline) +{ + if (my_pid != (bpf_get_current_pid_tgid() >> 32)) + return 0; + + fexit_hit = 1; + return 0; +} -- cgit From b98b69c38512c3a8277c83b2d07674fd1ff59625 Mon Sep 17 00:00:00 2001 From: Pauli Virtanen Date: Sun, 2 Nov 2025 21:10:15 +0200 Subject: ALSA: usb-audio: add min_mute quirk for SteelSeries Arctis ID 1038:1294 SteelSeries ApS Arctis Pro Wireless is reported to have muted min playback volume. Apply quirk for that. Link: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4229#note_3174448 Signed-off-by: Pauli Virtanen Link: https://patch.msgid.link/a83f2694b1f8c37e4667a3cf057ffdc408b0f70d.1762108507.git.pav@iki.fi Signed-off-by: Takashi Iwai --- sound/usb/quirks.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index 71638e6dfb20..e5b857129caf 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -2267,6 +2267,8 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = { QUIRK_FLAG_FIXED_RATE), DEVICE_FLG(0x0fd9, 0x0008, /* Hauppauge HVR-950Q */ QUIRK_FLAG_SHARE_MEDIA_DEVICE | QUIRK_FLAG_ALIGN_TRANSFER), + DEVICE_FLG(0x1038, 0x1294, /* SteelSeries Arctis Pro Wireless */ + QUIRK_FLAG_MIXER_PLAYBACK_MIN_MUTE), DEVICE_FLG(0x1101, 0x0003, /* Audioengine D1 */ QUIRK_FLAG_GET_SAMPLE_RATE), DEVICE_FLG(0x12d1, 0x3a07, /* Huawei Technologies Co., Ltd. */ -- cgit From 249d96b492efb7a773296ab2c62179918301c146 Mon Sep 17 00:00:00 2001 From: Claudiu Beznea Date: Tue, 4 Nov 2025 13:49:14 +0200 Subject: ASoC: da7213: Use component driver suspend/resume Since snd_soc_suspend() is invoked through snd_soc_pm_ops->suspend(), and snd_soc_pm_ops is associated with the soc_driver (defined in sound/soc/soc-core.c), and there is no parent-child relationship between the soc_driver and the DA7213 codec driver, the power management subsystem does not enforce a specific suspend/resume order between the DA7213 driver and the soc_driver. Because of this, the different codec component functionalities, called from snd_soc_resume() to reconfigure various functions, can race with the DA7213 struct dev_pm_ops::resume function, leading to misapplied configuration. This occasionally results in clipped sound. Fix this by dropping the struct dev_pm_ops::{suspend, resume} and use instead struct snd_soc_component_driver::{suspend, resume}. This ensures the proper configuration sequence is handled by the ASoC subsystem. Cc: stable@vger.kernel.org Fixes: 431e040065c8 ("ASoC: da7213: Add suspend to RAM support") Signed-off-by: Claudiu Beznea Link: https://patch.msgid.link/20251104114914.2060603-1-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Mark Brown --- sound/soc/codecs/da7213.c | 69 ++++++++++++++++++++++++++++++----------------- sound/soc/codecs/da7213.h | 1 + 2 files changed, 45 insertions(+), 25 deletions(-) diff --git a/sound/soc/codecs/da7213.c b/sound/soc/codecs/da7213.c index ae89260ca215..3420011da444 100644 --- a/sound/soc/codecs/da7213.c +++ b/sound/soc/codecs/da7213.c @@ -2124,11 +2124,50 @@ static int da7213_probe(struct snd_soc_component *component) return 0; } +static int da7213_runtime_suspend(struct device *dev) +{ + struct da7213_priv *da7213 = dev_get_drvdata(dev); + + regcache_cache_only(da7213->regmap, true); + regcache_mark_dirty(da7213->regmap); + regulator_bulk_disable(DA7213_NUM_SUPPLIES, da7213->supplies); + + return 0; +} + +static int da7213_runtime_resume(struct device *dev) +{ + struct da7213_priv *da7213 = dev_get_drvdata(dev); + int ret; + + ret = regulator_bulk_enable(DA7213_NUM_SUPPLIES, da7213->supplies); + if (ret < 0) + return ret; + regcache_cache_only(da7213->regmap, false); + return regcache_sync(da7213->regmap); +} + +static int da7213_suspend(struct snd_soc_component *component) +{ + struct da7213_priv *da7213 = snd_soc_component_get_drvdata(component); + + return da7213_runtime_suspend(da7213->dev); +} + +static int da7213_resume(struct snd_soc_component *component) +{ + struct da7213_priv *da7213 = snd_soc_component_get_drvdata(component); + + return da7213_runtime_resume(da7213->dev); +} + static const struct snd_soc_component_driver soc_component_dev_da7213 = { .probe = da7213_probe, .set_bias_level = da7213_set_bias_level, .controls = da7213_snd_controls, .num_controls = ARRAY_SIZE(da7213_snd_controls), + .suspend = da7213_suspend, + .resume = da7213_resume, .dapm_widgets = da7213_dapm_widgets, .num_dapm_widgets = ARRAY_SIZE(da7213_dapm_widgets), .dapm_routes = da7213_audio_map, @@ -2175,6 +2214,8 @@ static int da7213_i2c_probe(struct i2c_client *i2c) if (!da7213->fin_min_rate) return -EINVAL; + da7213->dev = &i2c->dev; + i2c_set_clientdata(i2c, da7213); /* Get required supplies */ @@ -2224,31 +2265,9 @@ static void da7213_i2c_remove(struct i2c_client *i2c) pm_runtime_disable(&i2c->dev); } -static int da7213_runtime_suspend(struct device *dev) -{ - struct da7213_priv *da7213 = dev_get_drvdata(dev); - - regcache_cache_only(da7213->regmap, true); - regcache_mark_dirty(da7213->regmap); - regulator_bulk_disable(DA7213_NUM_SUPPLIES, da7213->supplies); - - return 0; -} - -static int da7213_runtime_resume(struct device *dev) -{ - struct da7213_priv *da7213 = dev_get_drvdata(dev); - int ret; - - ret = regulator_bulk_enable(DA7213_NUM_SUPPLIES, da7213->supplies); - if (ret < 0) - return ret; - regcache_cache_only(da7213->regmap, false); - return regcache_sync(da7213->regmap); -} - -static DEFINE_RUNTIME_DEV_PM_OPS(da7213_pm, da7213_runtime_suspend, - da7213_runtime_resume, NULL); +static const struct dev_pm_ops da7213_pm = { + RUNTIME_PM_OPS(da7213_runtime_suspend, da7213_runtime_resume, NULL) +}; static const struct i2c_device_id da7213_i2c_id[] = { { "da7213" }, diff --git a/sound/soc/codecs/da7213.h b/sound/soc/codecs/da7213.h index b9ab791d6b88..29cbf0eb6124 100644 --- a/sound/soc/codecs/da7213.h +++ b/sound/soc/codecs/da7213.h @@ -595,6 +595,7 @@ enum da7213_supplies { /* Codec private data */ struct da7213_priv { struct regmap *regmap; + struct device *dev; struct mutex ctrl_lock; struct regulator_bulk_data supplies[DA7213_NUM_SUPPLIES]; struct clk *mclk; -- cgit From fccac54b0d3d0602f177bb79f203ae6fbea0e32a Mon Sep 17 00:00:00 2001 From: Marek Szyprowski Date: Mon, 27 Oct 2025 13:55:15 +0100 Subject: pmdomain: samsung: Rework legacy splash-screen handover workaround Limit the workaround for the lack of the proper splash-screen handover handling to the legacy ARM 32bit systems and replace forcing a sync_state by explicite power domain shutdown. This approach lets compiler to optimize it out on newer ARM 64bit systems. Suggested-by: Ulf Hansson Fixes: 0745658aebbe ("pmdomain: samsung: Fix splash-screen handover by enforcing a sync_state") Signed-off-by: Marek Szyprowski Acked-by: Krzysztof Kozlowski Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson --- drivers/pmdomain/samsung/exynos-pm-domains.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/pmdomain/samsung/exynos-pm-domains.c b/drivers/pmdomain/samsung/exynos-pm-domains.c index f53e1bd24798..5c3aa8983087 100644 --- a/drivers/pmdomain/samsung/exynos-pm-domains.c +++ b/drivers/pmdomain/samsung/exynos-pm-domains.c @@ -128,6 +128,15 @@ static int exynos_pd_probe(struct platform_device *pdev) pd->pd.power_on = exynos_pd_power_on; pd->local_pwr_cfg = pm_domain_cfg->local_pwr_cfg; + /* + * Some Samsung platforms with bootloaders turning on the splash-screen + * and handing it over to the kernel, requires the power-domains to be + * reset during boot. + */ + if (IS_ENABLED(CONFIG_ARM) && + of_device_is_compatible(np, "samsung,exynos4210-pd")) + exynos_pd_power_off(&pd->pd); + on = readl_relaxed(pd->base + 0x4) & pd->local_pwr_cfg; pm_genpd_init(&pd->pd, NULL, !on); @@ -146,15 +155,6 @@ static int exynos_pd_probe(struct platform_device *pdev) parent.np, child.np); } - /* - * Some Samsung platforms with bootloaders turning on the splash-screen - * and handing it over to the kernel, requires the power-domains to be - * reset during boot. As a temporary hack to manage this, let's enforce - * a sync_state. - */ - if (!ret) - of_genpd_sync_state(np); - pm_runtime_enable(dev); return ret; } -- cgit From bbde14682eba21d86f5f3d6fe2d371b1f97f1e61 Mon Sep 17 00:00:00 2001 From: Miaoqian Lin Date: Tue, 28 Oct 2025 11:16:20 +0800 Subject: pmdomain: imx: Fix reference count leak in imx_gpc_remove of_get_child_by_name() returns a node pointer with refcount incremented, we should use of_node_put() on it when not needed anymore. Add the missing of_node_put() to avoid refcount leak. Fixes: 721cabf6c660 ("soc: imx: move PGC handling to a new GPC driver") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin Signed-off-by: Ulf Hansson --- drivers/pmdomain/imx/gpc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/pmdomain/imx/gpc.c b/drivers/pmdomain/imx/gpc.c index 33991f3c6b55..a34b260274f7 100644 --- a/drivers/pmdomain/imx/gpc.c +++ b/drivers/pmdomain/imx/gpc.c @@ -536,6 +536,8 @@ static void imx_gpc_remove(struct platform_device *pdev) return; } } + + of_node_put(pgc_node); } static struct platform_driver imx_gpc_driver = { -- cgit From 9c16e4d216d8103a8178bbe070eb21f779f190c0 Mon Sep 17 00:00:00 2001 From: Stefan Wahren Date: Tue, 4 Nov 2025 18:45:18 +0100 Subject: arm64: defconfig: Fix V3D deferred probe timeout The commit 4adc20ba95d4 ("ARM: dts: broadcom: rpi: Switch to V3D firmware clock") causes a regression in arm64 developer setups, which stores the kernel modules via NFS. Before this change the involved V3D clock provider was builtin, but after this DT change the clk-raspberrypi is responsible for V3D and for arm64/defconfig this driver is build as a kernel module. In case these kernel modules are provided via NFS this takes too long and the PM domain core give up before the clock driver could be loaded: v3d fec00000.gpu: deferred probe timeout, ignoring dependency So resolve this issue by making this critical driver builtin. Reported-by: Mark Brown Closes: https://lore.kernel.org/linux-arm-kernel/9ebda74e-e700-4fbe-bca5-382f92417a9c@sirena.org.uk/ Fixes: 4adc20ba95d4 ("ARM: dts: broadcom: rpi: Switch to V3D firmware clock") Signed-off-by: Stefan Wahren Link: https://lore.kernel.org/r/20251104174518.11783-1-wahrenst@gmx.net Signed-off-by: Florian Fainelli --- arch/arm64/configs/defconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig index e3a2d37bd104..1a48faad2473 100644 --- a/arch/arm64/configs/defconfig +++ b/arch/arm64/configs/defconfig @@ -1341,7 +1341,7 @@ CONFIG_COMMON_CLK_RS9_PCIE=y CONFIG_COMMON_CLK_VC3=y CONFIG_COMMON_CLK_VC5=y CONFIG_COMMON_CLK_BD718XX=m -CONFIG_CLK_RASPBERRYPI=m +CONFIG_CLK_RASPBERRYPI=y CONFIG_CLK_IMX8MM=y CONFIG_CLK_IMX8MN=y CONFIG_CLK_IMX8MP=y -- cgit From 3d1c795bdef43363ed1ff71e3f476d86c22e059b Mon Sep 17 00:00:00 2001 From: Rafał Miłecki Date: Thu, 2 Oct 2025 21:48:52 +0200 Subject: ARM: dts: BCM53573: Fix address of Luxul XAP-1440's Ethernet PHY MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Luxul XAP-1440 has BCM54210E PHY at address 25. Fixes: 44ad82078069 ("ARM: dts: BCM53573: Fix Ethernet info for Luxul devices") Signed-off-by: Rafał Miłecki Link: https://lore.kernel.org/r/20251002194852.13929-1-zajec5@gmail.com Signed-off-by: Florian Fainelli --- arch/arm/boot/dts/broadcom/bcm47189-luxul-xap-1440.dts | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm/boot/dts/broadcom/bcm47189-luxul-xap-1440.dts b/arch/arm/boot/dts/broadcom/bcm47189-luxul-xap-1440.dts index ac44c745bdf8..a39a021a3910 100644 --- a/arch/arm/boot/dts/broadcom/bcm47189-luxul-xap-1440.dts +++ b/arch/arm/boot/dts/broadcom/bcm47189-luxul-xap-1440.dts @@ -55,8 +55,8 @@ mdio { /delete-node/ switch@1e; - bcm54210e: ethernet-phy@0 { - reg = <0>; + bcm54210e: ethernet-phy@25 { + reg = <25>; }; }; }; -- cgit From ea0714d61dea6e00b853a0116d0afe2b2fe70ef3 Mon Sep 17 00:00:00 2001 From: Mykyta Yatsenko Date: Tue, 4 Nov 2025 22:54:25 +0000 Subject: bpf:add _impl suffix for bpf_task_work_schedule* kfuncs Rename: bpf_task_work_schedule_resume()->bpf_task_work_schedule_resume_impl() bpf_task_work_schedule_signal()->bpf_task_work_schedule_signal_impl() This aligns task work scheduling kfuncs with the established naming scheme for kfuncs with the bpf_prog_aux argument provided by the verifier implicitly. This convention will be taken advantage of with the upcoming KF_IMPLICIT_ARGS feature to preserve backwards compatibility to BPF programs. Acked-by: Andrii Nakryiko Signed-off-by: Mykyta Yatsenko Link: https://lore.kernel.org/r/20251104-implv2-v3-1-4772b9ae0e06@meta.com Signed-off-by: Alexei Starovoitov Acked-by: Ihor Solodrai --- kernel/bpf/helpers.c | 24 +++++++++++++--------- kernel/bpf/verifier.c | 12 +++++------ tools/testing/selftests/bpf/progs/task_work.c | 6 +++--- tools/testing/selftests/bpf/progs/task_work_fail.c | 8 ++++---- .../testing/selftests/bpf/progs/task_work_stress.c | 4 ++-- 5 files changed, 29 insertions(+), 25 deletions(-) diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index eb25e70e0bdc..33173b027ccf 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -4169,7 +4169,8 @@ release_prog: } /** - * bpf_task_work_schedule_signal - Schedule BPF callback using task_work_add with TWA_SIGNAL mode + * bpf_task_work_schedule_signal_impl - Schedule BPF callback using task_work_add with TWA_SIGNAL + * mode * @task: Task struct for which callback should be scheduled * @tw: Pointer to struct bpf_task_work in BPF map value for internal bookkeeping * @map__map: bpf_map that embeds struct bpf_task_work in the values @@ -4178,15 +4179,17 @@ release_prog: * * Return: 0 if task work has been scheduled successfully, negative error code otherwise */ -__bpf_kfunc int bpf_task_work_schedule_signal(struct task_struct *task, struct bpf_task_work *tw, - void *map__map, bpf_task_work_callback_t callback, - void *aux__prog) +__bpf_kfunc int bpf_task_work_schedule_signal_impl(struct task_struct *task, + struct bpf_task_work *tw, void *map__map, + bpf_task_work_callback_t callback, + void *aux__prog) { return bpf_task_work_schedule(task, tw, map__map, callback, aux__prog, TWA_SIGNAL); } /** - * bpf_task_work_schedule_resume - Schedule BPF callback using task_work_add with TWA_RESUME mode + * bpf_task_work_schedule_resume_impl - Schedule BPF callback using task_work_add with TWA_RESUME + * mode * @task: Task struct for which callback should be scheduled * @tw: Pointer to struct bpf_task_work in BPF map value for internal bookkeeping * @map__map: bpf_map that embeds struct bpf_task_work in the values @@ -4195,9 +4198,10 @@ __bpf_kfunc int bpf_task_work_schedule_signal(struct task_struct *task, struct b * * Return: 0 if task work has been scheduled successfully, negative error code otherwise */ -__bpf_kfunc int bpf_task_work_schedule_resume(struct task_struct *task, struct bpf_task_work *tw, - void *map__map, bpf_task_work_callback_t callback, - void *aux__prog) +__bpf_kfunc int bpf_task_work_schedule_resume_impl(struct task_struct *task, + struct bpf_task_work *tw, void *map__map, + bpf_task_work_callback_t callback, + void *aux__prog) { return bpf_task_work_schedule(task, tw, map__map, callback, aux__prog, TWA_RESUME); } @@ -4377,8 +4381,8 @@ BTF_ID_FLAGS(func, bpf_strnstr); BTF_ID_FLAGS(func, bpf_cgroup_read_xattr, KF_RCU) #endif BTF_ID_FLAGS(func, bpf_stream_vprintk, KF_TRUSTED_ARGS) -BTF_ID_FLAGS(func, bpf_task_work_schedule_signal, KF_TRUSTED_ARGS) -BTF_ID_FLAGS(func, bpf_task_work_schedule_resume, KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_task_work_schedule_signal_impl, KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_task_work_schedule_resume_impl, KF_TRUSTED_ARGS) BTF_KFUNCS_END(common_btf_ids) static const struct btf_kfunc_id_set common_kfunc_set = { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index ff40e5e65c43..8314518c8d93 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -12259,8 +12259,8 @@ enum special_kfunc_type { KF_bpf_res_spin_lock_irqsave, KF_bpf_res_spin_unlock_irqrestore, KF___bpf_trap, - KF_bpf_task_work_schedule_signal, - KF_bpf_task_work_schedule_resume, + KF_bpf_task_work_schedule_signal_impl, + KF_bpf_task_work_schedule_resume_impl, }; BTF_ID_LIST(special_kfunc_list) @@ -12331,13 +12331,13 @@ BTF_ID(func, bpf_res_spin_unlock) BTF_ID(func, bpf_res_spin_lock_irqsave) BTF_ID(func, bpf_res_spin_unlock_irqrestore) BTF_ID(func, __bpf_trap) -BTF_ID(func, bpf_task_work_schedule_signal) -BTF_ID(func, bpf_task_work_schedule_resume) +BTF_ID(func, bpf_task_work_schedule_signal_impl) +BTF_ID(func, bpf_task_work_schedule_resume_impl) static bool is_task_work_add_kfunc(u32 func_id) { - return func_id == special_kfunc_list[KF_bpf_task_work_schedule_signal] || - func_id == special_kfunc_list[KF_bpf_task_work_schedule_resume]; + return func_id == special_kfunc_list[KF_bpf_task_work_schedule_signal_impl] || + func_id == special_kfunc_list[KF_bpf_task_work_schedule_resume_impl]; } static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta) diff --git a/tools/testing/selftests/bpf/progs/task_work.c b/tools/testing/selftests/bpf/progs/task_work.c index 23217f06a3ec..663a80990f8f 100644 --- a/tools/testing/selftests/bpf/progs/task_work.c +++ b/tools/testing/selftests/bpf/progs/task_work.c @@ -66,7 +66,7 @@ int oncpu_hash_map(struct pt_regs *args) if (!work) return 0; - bpf_task_work_schedule_resume(task, &work->tw, &hmap, process_work, NULL); + bpf_task_work_schedule_resume_impl(task, &work->tw, &hmap, process_work, NULL); return 0; } @@ -80,7 +80,7 @@ int oncpu_array_map(struct pt_regs *args) work = bpf_map_lookup_elem(&arrmap, &key); if (!work) return 0; - bpf_task_work_schedule_signal(task, &work->tw, &arrmap, process_work, NULL); + bpf_task_work_schedule_signal_impl(task, &work->tw, &arrmap, process_work, NULL); return 0; } @@ -102,6 +102,6 @@ int oncpu_lru_map(struct pt_regs *args) work = bpf_map_lookup_elem(&lrumap, &key); if (!work || work->data[0]) return 0; - bpf_task_work_schedule_resume(task, &work->tw, &lrumap, process_work, NULL); + bpf_task_work_schedule_resume_impl(task, &work->tw, &lrumap, process_work, NULL); return 0; } diff --git a/tools/testing/selftests/bpf/progs/task_work_fail.c b/tools/testing/selftests/bpf/progs/task_work_fail.c index 77fe8f28facd..1270953fd092 100644 --- a/tools/testing/selftests/bpf/progs/task_work_fail.c +++ b/tools/testing/selftests/bpf/progs/task_work_fail.c @@ -53,7 +53,7 @@ int mismatch_map(struct pt_regs *args) work = bpf_map_lookup_elem(&arrmap, &key); if (!work) return 0; - bpf_task_work_schedule_resume(task, &work->tw, &hmap, process_work, NULL); + bpf_task_work_schedule_resume_impl(task, &work->tw, &hmap, process_work, NULL); return 0; } @@ -65,7 +65,7 @@ int no_map_task_work(struct pt_regs *args) struct bpf_task_work tw; task = bpf_get_current_task_btf(); - bpf_task_work_schedule_resume(task, &tw, &hmap, process_work, NULL); + bpf_task_work_schedule_resume_impl(task, &tw, &hmap, process_work, NULL); return 0; } @@ -76,7 +76,7 @@ int task_work_null(struct pt_regs *args) struct task_struct *task; task = bpf_get_current_task_btf(); - bpf_task_work_schedule_resume(task, NULL, &hmap, process_work, NULL); + bpf_task_work_schedule_resume_impl(task, NULL, &hmap, process_work, NULL); return 0; } @@ -91,6 +91,6 @@ int map_null(struct pt_regs *args) work = bpf_map_lookup_elem(&arrmap, &key); if (!work) return 0; - bpf_task_work_schedule_resume(task, &work->tw, NULL, process_work, NULL); + bpf_task_work_schedule_resume_impl(task, &work->tw, NULL, process_work, NULL); return 0; } diff --git a/tools/testing/selftests/bpf/progs/task_work_stress.c b/tools/testing/selftests/bpf/progs/task_work_stress.c index 90fca06fff56..55e555f7f41b 100644 --- a/tools/testing/selftests/bpf/progs/task_work_stress.c +++ b/tools/testing/selftests/bpf/progs/task_work_stress.c @@ -51,8 +51,8 @@ int schedule_task_work(void *ctx) if (!work) return 0; } - err = bpf_task_work_schedule_signal(bpf_get_current_task_btf(), &work->tw, &hmap, - process_work, NULL); + err = bpf_task_work_schedule_signal_impl(bpf_get_current_task_btf(), &work->tw, &hmap, + process_work, NULL); if (err) __sync_fetch_and_add(&schedule_error, 1); else -- cgit From 137cc92ffe2e71705fce112656a460d924934ebe Mon Sep 17 00:00:00 2001 From: Mykyta Yatsenko Date: Tue, 4 Nov 2025 22:54:26 +0000 Subject: bpf: add _impl suffix for bpf_stream_vprintk() kfunc Rename bpf_stream_vprintk() to bpf_stream_vprintk_impl(). This makes bpf_stream_vprintk() follow the already established "_impl" suffix-based naming convention for kfuncs with the bpf_prog_aux argument provided by the verifier implicitly. This convention will be taken advantage of with the upcoming KF_IMPLICIT_ARGS feature to preserve backwards compatibility to BPF programs. Acked-by: Andrii Nakryiko Signed-off-by: Mykyta Yatsenko Link: https://lore.kernel.org/r/20251104-implv2-v3-2-4772b9ae0e06@meta.com Signed-off-by: Alexei Starovoitov Acked-by: Ihor Solodrai --- kernel/bpf/helpers.c | 2 +- kernel/bpf/stream.c | 3 ++- tools/bpf/bpftool/Documentation/bpftool-prog.rst | 2 +- tools/lib/bpf/bpf_helpers.h | 28 ++++++++++++------------ tools/testing/selftests/bpf/progs/stream_fail.c | 6 ++--- 5 files changed, 21 insertions(+), 20 deletions(-) diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 33173b027ccf..e4007fea4909 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -4380,7 +4380,7 @@ BTF_ID_FLAGS(func, bpf_strnstr); #if defined(CONFIG_BPF_LSM) && defined(CONFIG_CGROUPS) BTF_ID_FLAGS(func, bpf_cgroup_read_xattr, KF_RCU) #endif -BTF_ID_FLAGS(func, bpf_stream_vprintk, KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_stream_vprintk_impl, KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_task_work_schedule_signal_impl, KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_task_work_schedule_resume_impl, KF_TRUSTED_ARGS) BTF_KFUNCS_END(common_btf_ids) diff --git a/kernel/bpf/stream.c b/kernel/bpf/stream.c index eb6c5a21c2ef..ff16c631951b 100644 --- a/kernel/bpf/stream.c +++ b/kernel/bpf/stream.c @@ -355,7 +355,8 @@ __bpf_kfunc_start_defs(); * Avoid using enum bpf_stream_id so that kfunc users don't have to pull in the * enum in headers. */ -__bpf_kfunc int bpf_stream_vprintk(int stream_id, const char *fmt__str, const void *args, u32 len__sz, void *aux__prog) +__bpf_kfunc int bpf_stream_vprintk_impl(int stream_id, const char *fmt__str, const void *args, + u32 len__sz, void *aux__prog) { struct bpf_bprintf_data data = { .get_bin_args = true, diff --git a/tools/bpf/bpftool/Documentation/bpftool-prog.rst b/tools/bpf/bpftool/Documentation/bpftool-prog.rst index 009633294b09..35aeeaf5f711 100644 --- a/tools/bpf/bpftool/Documentation/bpftool-prog.rst +++ b/tools/bpf/bpftool/Documentation/bpftool-prog.rst @@ -182,7 +182,7 @@ bpftool prog tracelog bpftool prog tracelog { stdout | stderr } *PROG* Dump the BPF stream of the program. BPF programs can write to these streams - at runtime with the **bpf_stream_vprintk**\ () kfunc. The kernel may write + at runtime with the **bpf_stream_vprintk_impl**\ () kfunc. The kernel may write error messages to the standard error stream. This facility should be used only for debugging purposes. diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index 80c028540656..d4e4e388e625 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -315,20 +315,20 @@ enum libbpf_tristate { ___param, sizeof(___param)); \ }) -extern int bpf_stream_vprintk(int stream_id, const char *fmt__str, const void *args, - __u32 len__sz, void *aux__prog) __weak __ksym; - -#define bpf_stream_printk(stream_id, fmt, args...) \ -({ \ - static const char ___fmt[] = fmt; \ - unsigned long long ___param[___bpf_narg(args)]; \ - \ - _Pragma("GCC diagnostic push") \ - _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \ - ___bpf_fill(___param, args); \ - _Pragma("GCC diagnostic pop") \ - \ - bpf_stream_vprintk(stream_id, ___fmt, ___param, sizeof(___param), NULL);\ +extern int bpf_stream_vprintk_impl(int stream_id, const char *fmt__str, const void *args, + __u32 len__sz, void *aux__prog) __weak __ksym; + +#define bpf_stream_printk(stream_id, fmt, args...) \ +({ \ + static const char ___fmt[] = fmt; \ + unsigned long long ___param[___bpf_narg(args)]; \ + \ + _Pragma("GCC diagnostic push") \ + _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \ + ___bpf_fill(___param, args); \ + _Pragma("GCC diagnostic pop") \ + \ + bpf_stream_vprintk_impl(stream_id, ___fmt, ___param, sizeof(___param), NULL); \ }) /* Use __bpf_printk when bpf_printk call has 3 or fewer fmt args diff --git a/tools/testing/selftests/bpf/progs/stream_fail.c b/tools/testing/selftests/bpf/progs/stream_fail.c index b4a0d0cc8ec8..3662515f0107 100644 --- a/tools/testing/selftests/bpf/progs/stream_fail.c +++ b/tools/testing/selftests/bpf/progs/stream_fail.c @@ -10,7 +10,7 @@ SEC("syscall") __failure __msg("Possibly NULL pointer passed") int stream_vprintk_null_arg(void *ctx) { - bpf_stream_vprintk(BPF_STDOUT, "", NULL, 0, NULL); + bpf_stream_vprintk_impl(BPF_STDOUT, "", NULL, 0, NULL); return 0; } @@ -18,7 +18,7 @@ SEC("syscall") __failure __msg("R3 type=scalar expected=") int stream_vprintk_scalar_arg(void *ctx) { - bpf_stream_vprintk(BPF_STDOUT, "", (void *)46, 0, NULL); + bpf_stream_vprintk_impl(BPF_STDOUT, "", (void *)46, 0, NULL); return 0; } @@ -26,7 +26,7 @@ SEC("syscall") __failure __msg("arg#1 doesn't point to a const string") int stream_vprintk_string_arg(void *ctx) { - bpf_stream_vprintk(BPF_STDOUT, ctx, NULL, 0, NULL); + bpf_stream_vprintk_impl(BPF_STDOUT, ctx, NULL, 0, NULL); return 0; } -- cgit From 636f4618b1cd96f6b5a2b8c7c4f665c8533ecf13 Mon Sep 17 00:00:00 2001 From: Haotian Zhang Date: Wed, 29 Oct 2025 01:28:28 +0800 Subject: regulator: fixed: fix GPIO descriptor leak on register failure In the commit referenced by the Fixes tag, devm_gpiod_get_optional() was replaced by manual GPIO management, relying on the regulator core to release the GPIO descriptor. However, this approach does not account for the error path: when regulator registration fails, the core never takes over the GPIO, resulting in a resource leak. Add gpiod_put() before returning on regulator registration failure. Fixes: 5e6f3ae5c13b ("regulator: fixed: Let core handle GPIO descriptor") Signed-off-by: Haotian Zhang Link: https://patch.msgid.link/20251028172828.625-1-vulab@iscas.ac.cn Signed-off-by: Mark Brown --- drivers/regulator/fixed.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/regulator/fixed.c b/drivers/regulator/fixed.c index 1cb647ed70c6..a2d16e9abfb5 100644 --- a/drivers/regulator/fixed.c +++ b/drivers/regulator/fixed.c @@ -334,6 +334,7 @@ static int reg_fixed_voltage_probe(struct platform_device *pdev) ret = dev_err_probe(&pdev->dev, PTR_ERR(drvdata->dev), "Failed to register regulator: %ld\n", PTR_ERR(drvdata->dev)); + gpiod_put(cfg.ena_gpiod); return ret; } -- cgit From 63b5aa01da0f38cdbd97d021477258e511631497 Mon Sep 17 00:00:00 2001 From: Yongpeng Yang Date: Tue, 4 Nov 2025 20:50:06 +0800 Subject: vfat: fix missing sb_min_blocksize() return value checks When emulating an nvme device on qemu with both logical_block_size and physical_block_size set to 8 KiB, but without format, a kernel panic was triggered during the early boot stage while attempting to mount a vfat filesystem. [95553.682035] EXT4-fs (nvme0n1): unable to set blocksize [95553.684326] EXT4-fs (nvme0n1): unable to set blocksize [95553.686501] EXT4-fs (nvme0n1): unable to set blocksize [95553.696448] ISOFS: unsupported/invalid hardware sector size 8192 [95553.697117] ------------[ cut here ]------------ [95553.697567] kernel BUG at fs/buffer.c:1582! [95553.697984] Oops: invalid opcode: 0000 [#1] SMP NOPTI [95553.698602] CPU: 0 UID: 0 PID: 7212 Comm: mount Kdump: loaded Not tainted 6.18.0-rc2+ #38 PREEMPT(voluntary) [95553.699511] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [95553.700534] RIP: 0010:folio_alloc_buffers+0x1bb/0x1c0 [95553.701018] Code: 48 8b 15 e8 93 18 02 65 48 89 35 e0 93 18 02 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d 31 d2 31 c9 31 f6 31 ff c3 cc cc cc cc <0f> 0b 90 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f [95553.702648] RSP: 0018:ffffd1b0c676f990 EFLAGS: 00010246 [95553.703132] RAX: ffff8cfc4176d820 RBX: 0000000000508c48 RCX: 0000000000000001 [95553.703805] RDX: 0000000000002000 RSI: 0000000000000000 RDI: 0000000000000000 [95553.704481] RBP: ffffd1b0c676f9c8 R08: 0000000000000000 R09: 0000000000000000 [95553.705148] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 [95553.705816] R13: 0000000000002000 R14: fffff8bc8257e800 R15: 0000000000000000 [95553.706483] FS: 000072ee77315840(0000) GS:ffff8cfdd2c8d000(0000) knlGS:0000000000000000 [95553.707248] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [95553.707782] CR2: 00007d8f2a9e5a20 CR3: 0000000039d0c006 CR4: 0000000000772ef0 [95553.708439] PKRU: 55555554 [95553.708734] Call Trace: [95553.709015] [95553.709266] __getblk_slow+0xd2/0x230 [95553.709641] ? find_get_block_common+0x8b/0x530 [95553.710084] bdev_getblk+0x77/0xa0 [95553.710449] __bread_gfp+0x22/0x140 [95553.710810] fat_fill_super+0x23a/0xfc0 [95553.711216] ? __pfx_setup+0x10/0x10 [95553.711580] ? __pfx_vfat_fill_super+0x10/0x10 [95553.712014] vfat_fill_super+0x15/0x30 [95553.712401] get_tree_bdev_flags+0x141/0x1e0 [95553.712817] get_tree_bdev+0x10/0x20 [95553.713177] vfat_get_tree+0x15/0x20 [95553.713550] vfs_get_tree+0x2a/0x100 [95553.713910] vfs_cmd_create+0x62/0xf0 [95553.714273] __do_sys_fsconfig+0x4e7/0x660 [95553.714669] __x64_sys_fsconfig+0x20/0x40 [95553.715062] x64_sys_call+0x21ee/0x26a0 [95553.715453] do_syscall_64+0x80/0x670 [95553.715816] ? __fs_parse+0x65/0x1e0 [95553.716172] ? fat_parse_param+0x103/0x4b0 [95553.716587] ? vfs_parse_fs_param_source+0x21/0xa0 [95553.717034] ? __do_sys_fsconfig+0x3d9/0x660 [95553.717548] ? __x64_sys_fsconfig+0x20/0x40 [95553.717957] ? x64_sys_call+0x21ee/0x26a0 [95553.718360] ? do_syscall_64+0xb8/0x670 [95553.718734] ? __x64_sys_fsconfig+0x20/0x40 [95553.719141] ? x64_sys_call+0x21ee/0x26a0 [95553.719545] ? do_syscall_64+0xb8/0x670 [95553.719922] ? x64_sys_call+0x1405/0x26a0 [95553.720317] ? do_syscall_64+0xb8/0x670 [95553.720702] ? __x64_sys_close+0x3e/0x90 [95553.721080] ? x64_sys_call+0x1b5e/0x26a0 [95553.721478] ? do_syscall_64+0xb8/0x670 [95553.721841] ? irqentry_exit+0x43/0x50 [95553.722211] ? exc_page_fault+0x90/0x1b0 [95553.722681] entry_SYSCALL_64_after_hwframe+0x76/0x7e [95553.723166] RIP: 0033:0x72ee774f3afe [95553.723562] Code: 73 01 c3 48 8b 0d 0a 33 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 49 89 ca b8 af 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d da 32 0f 00 f7 d8 64 89 01 48 [95553.725188] RSP: 002b:00007ffe97148978 EFLAGS: 00000246 ORIG_RAX: 00000000000001af [95553.725892] RAX: ffffffffffffffda RBX: 00005dcfe53d0080 RCX: 000072ee774f3afe [95553.726526] RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000003 [95553.727176] RBP: 00007ffe97148ac0 R08: 0000000000000000 R09: 000072ee775e7ac0 [95553.727818] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [95553.728459] R13: 00005dcfe53d04b0 R14: 000072ee77670b00 R15: 00005dcfe53d1a28 [95553.729086] The panic occurs as follows: 1. logical_block_size is 8KiB, causing {struct super_block *sb}->s_blocksize is initialized to 0. vfat_fill_super - fat_fill_super - sb_min_blocksize - sb_set_blocksize //return 0 when size is 8KiB. 2. __bread_gfp is called with size == 0, causing folio_alloc_buffers() to compute an offset equal to folio_size(folio), which triggers a BUG_ON. fat_fill_super - sb_bread - __bread_gfp // size == {struct super_block *sb}->s_blocksize == 0 - bdev_getblk - __getblk_slow - grow_buffers - grow_dev_folio - folio_alloc_buffers // size == 0 - folio_set_bh //offset == folio_size(folio) and panic To fix this issue, add proper return value checks for sb_min_blocksize(). Cc: stable@vger.kernel.org # v6.15 Fixes: a64e5a596067bd ("bdev: add back PAGE_SIZE block size validation for sb_set_blocksize()") Reviewed-by: Matthew Wilcox Reviewed-by: Darrick J. Wong Reviewed-by: Jan Kara Reviewed-by: OGAWA Hirofumi Reviewed-by: Christoph Hellwig Signed-off-by: Yongpeng Yang Link: https://patch.msgid.link/20251104125009.2111925-2-yangyongpeng.storage@gmail.com Acked-by: OGAWA Hirofumi Signed-off-by: Christian Brauner --- fs/fat/inode.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/fat/inode.c b/fs/fat/inode.c index 9648ed097816..9cfe20a3daaf 100644 --- a/fs/fat/inode.c +++ b/fs/fat/inode.c @@ -1595,8 +1595,12 @@ int fat_fill_super(struct super_block *sb, struct fs_context *fc, setup(sb); /* flavour-specific stuff that needs options */ + error = -EINVAL; + if (!sb_min_blocksize(sb, 512)) { + fat_msg(sb, KERN_ERR, "unable to set blocksize"); + goto out_fail; + } error = -EIO; - sb_min_blocksize(sb, 512); bh = sb_bread(sb, 0); if (bh == NULL) { fat_msg(sb, KERN_ERR, "unable to read boot sector"); -- cgit From f2c1f631630e01821fe4c3fdf6077bc7a8284f82 Mon Sep 17 00:00:00 2001 From: Yongpeng Yang Date: Tue, 4 Nov 2025 20:50:07 +0800 Subject: exfat: check return value of sb_min_blocksize in exfat_read_boot_sector sb_min_blocksize() may return 0. Check its return value to avoid accessing the filesystem super block when sb->s_blocksize is 0. Cc: stable@vger.kernel.org # v6.15 Fixes: 719c1e1829166d ("exfat: add super block operations") Reviewed-by: Christoph Hellwig Signed-off-by: Yongpeng Yang Link: https://patch.msgid.link/20251104125009.2111925-3-yangyongpeng.storage@gmail.com Signed-off-by: Christian Brauner --- fs/exfat/super.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/exfat/super.c b/fs/exfat/super.c index 7f9592856bf7..74d451f732c7 100644 --- a/fs/exfat/super.c +++ b/fs/exfat/super.c @@ -433,7 +433,10 @@ static int exfat_read_boot_sector(struct super_block *sb) struct exfat_sb_info *sbi = EXFAT_SB(sb); /* set block size to read super block */ - sb_min_blocksize(sb, 512); + if (!sb_min_blocksize(sb, 512)) { + exfat_err(sb, "unable to set blocksize"); + return -EINVAL; + } /* read boot sector */ sbi->boot_bh = sb_bread(sb, 0); -- cgit From e106e269c5cb38315eb0a0e7e38f71e9b20c8c66 Mon Sep 17 00:00:00 2001 From: Yongpeng Yang Date: Tue, 4 Nov 2025 20:50:08 +0800 Subject: isofs: check the return value of sb_min_blocksize() in isofs_fill_super sb_min_blocksize() may return 0. Check its return value to avoid opt->blocksize and sb->s_blocksize is 0. Cc: stable@vger.kernel.org # v6.15 Fixes: 1b17a46c9243e9 ("isofs: convert isofs to use the new mount API") Reviewed-by: Jan Kara Reviewed-by: Christoph Hellwig Signed-off-by: Yongpeng Yang Link: https://patch.msgid.link/20251104125009.2111925-4-yangyongpeng.storage@gmail.com Signed-off-by: Christian Brauner --- fs/isofs/inode.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c index 6f0e6b19383c..ad3143d4066b 100644 --- a/fs/isofs/inode.c +++ b/fs/isofs/inode.c @@ -610,6 +610,11 @@ static int isofs_fill_super(struct super_block *s, struct fs_context *fc) goto out_freesbi; } opt->blocksize = sb_min_blocksize(s, opt->blocksize); + if (!opt->blocksize) { + printk(KERN_ERR + "ISOFS: unable to set blocksize\n"); + goto out_freesbi; + } sbi->s_high_sierra = 0; /* default is iso9660 */ sbi->s_session = opt->session; -- cgit From 124af0868ec6929ba838fb76d25f00c06ba8fc0d Mon Sep 17 00:00:00 2001 From: Yongpeng Yang Date: Tue, 4 Nov 2025 20:50:09 +0800 Subject: xfs: check the return value of sb_min_blocksize() in xfs_fs_fill_super sb_min_blocksize() may return 0. Check its return value to avoid the filesystem super block when sb->s_blocksize is 0. Cc: stable@vger.kernel.org # v6.15 Fixes: a64e5a596067bd ("bdev: add back PAGE_SIZE block size validation for sb_set_blocksize()") Reviewed-by: Christoph Hellwig Signed-off-by: Yongpeng Yang Link: https://patch.msgid.link/20251104125009.2111925-5-yangyongpeng.storage@gmail.com Reviewed-by: Darrick J. Wong Signed-off-by: Christian Brauner --- fs/xfs/xfs_super.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index e85a156dc17d..fbb8009f1c0f 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1662,7 +1662,10 @@ xfs_fs_fill_super( if (error) return error; - sb_min_blocksize(sb, BBSIZE); + if (!sb_min_blocksize(sb, BBSIZE)) { + xfs_err(mp, "unable to set blocksize"); + return -EINVAL; + } sb->s_xattr = xfs_xattr_handlers; sb->s_export_op = &xfs_export_operations; #ifdef CONFIG_XFS_QUOTA -- cgit From c014021253d77cd89b2d8788ce522283d83fbd40 Mon Sep 17 00:00:00 2001 From: Alok Tiwari Date: Mon, 27 Oct 2025 03:46:47 -0700 Subject: virtio-fs: fix incorrect check for fsvq->kobj In virtio_fs_add_queues_sysfs(), the code incorrectly checks fs->mqs_kobj after calling kobject_create_and_add(). Change the check to fsvq->kobj (fs->mqs_kobj -> fsvq->kobj) to ensure the per-queue kobject is successfully created. Fixes: 87cbdc396a31 ("virtio_fs: add sysfs entries for queue information") Signed-off-by: Alok Tiwari Link: https://patch.msgid.link/20251027104658.1668537-1-alok.a.tiwari@oracle.com Reviewed-by: Stefan Hajnoczi Signed-off-by: Christian Brauner --- fs/fuse/virtio_fs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c index 6bc7c97b017d..b2f6486fe1d5 100644 --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -373,7 +373,7 @@ static int virtio_fs_add_queues_sysfs(struct virtio_fs *fs) sprintf(buff, "%d", i); fsvq->kobj = kobject_create_and_add(buff, fs->mqs_kobj); - if (!fs->mqs_kobj) { + if (!fsvq->kobj) { ret = -ENOMEM; goto out_del; } -- cgit From 8637fa89e678422995301ddb20b74190dffcccee Mon Sep 17 00:00:00 2001 From: Yongpeng Yang Date: Tue, 4 Nov 2025 20:50:10 +0800 Subject: block: add __must_check attribute to sb_min_blocksize() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When sb_min_blocksize() returns 0 and the return value is not checked, it may lead to a situation where sb->s_blocksize is 0 when accessing the filesystem super block. After commit a64e5a596067bd ("bdev: add back PAGE_SIZE block size validation for sb_set_blocksize()"), this becomes more likely to happen when the block device’s logical_block_size is larger than PAGE_SIZE and the filesystem is unformatted. Add the __must_check attribute to ensure callers always check the return value. Cc: stable@vger.kernel.org # v6.15 Suggested-by: Matthew Wilcox Reviewed-by: Christoph Hellwig Reviewed-by: Jan Kara Signed-off-by: Yongpeng Yang Link: https://patch.msgid.link/20251104125009.2111925-6-yangyongpeng.storage@gmail.com Signed-off-by: Christian Brauner --- block/bdev.c | 2 +- include/linux/fs.h | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/block/bdev.c b/block/bdev.c index 810707cca970..638f0cd458ae 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -231,7 +231,7 @@ int sb_set_blocksize(struct super_block *sb, int size) EXPORT_SYMBOL(sb_set_blocksize); -int sb_min_blocksize(struct super_block *sb, int size) +int __must_check sb_min_blocksize(struct super_block *sb, int size) { int minsize = bdev_logical_block_size(sb->s_bdev); if (size < minsize) diff --git a/include/linux/fs.h b/include/linux/fs.h index c895146c1444..3ea98c6cce81 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3423,8 +3423,8 @@ static inline void remove_inode_hash(struct inode *inode) extern void inode_sb_list_add(struct inode *inode); extern void inode_add_lru(struct inode *inode); -extern int sb_set_blocksize(struct super_block *, int); -extern int sb_min_blocksize(struct super_block *, int); +int sb_set_blocksize(struct super_block *sb, int size); +int __must_check sb_min_blocksize(struct super_block *sb, int size); int generic_file_mmap(struct file *, struct vm_area_struct *); int generic_file_mmap_prepare(struct vm_area_desc *desc); -- cgit From 90f601b497d76f40fa66795c3ecf625b6aced9fd Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Wed, 5 Nov 2025 02:29:23 +0000 Subject: binfmt_misc: restore write access before closing files opened by open_exec() bm_register_write() opens an executable file using open_exec(), which internally calls do_open_execat() and denies write access on the file to avoid modification while it is being executed. However, when an error occurs, bm_register_write() closes the file using filp_close() directly. This does not restore the write permission, which may cause subsequent write operations on the same file to fail. Fix this by calling exe_file_allow_write_access() before filp_close() to restore the write permission properly. Fixes: e7850f4d844e ("binfmt_misc: fix possible deadlock in bm_register_write") Signed-off-by: Zilin Guan Link: https://patch.msgid.link/20251105022923.1813587-1-zilin@seu.edu.cn Signed-off-by: Christian Brauner --- fs/binfmt_misc.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c index a839f960cd4a..a8b1d79e4af0 100644 --- a/fs/binfmt_misc.c +++ b/fs/binfmt_misc.c @@ -837,8 +837,10 @@ out: inode_unlock(d_inode(root)); if (err) { - if (f) + if (f) { + exe_file_allow_write_access(f); filp_close(f, NULL); + } kfree(e); return err; } -- cgit From 3cd2018e15b3d66d2187d92867e265f45ad79e6f Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Sun, 2 Nov 2025 20:09:21 +0100 Subject: spi: Try to get ACPI GPIO IRQ earlier Since commit d24cfee7f63d ("spi: Fix acpi deferred irq probe"), the acpi_dev_gpio_irq_get() call gets delayed till spi_probe() is called on the SPI device. If there is no driver for the SPI device then the move to spi_probe() results in acpi_dev_gpio_irq_get() never getting called. This may cause problems by leaving the GPIO pin floating because this call is responsible for setting up the GPIO pin direction and/or bias according to the values from the ACPI tables. Re-add the removed acpi_dev_gpio_irq_get() in acpi_register_spi_device() to ensure the GPIO pin is always correctly setup, while keeping the acpi_dev_gpio_irq_get() call added to spi_probe() to deal with -EPROBE_DEFER returns caused by the GPIO controller not having a driver yet. Link: https://bbs.archlinux.org/viewtopic.php?id=302348 Fixes: d24cfee7f63d ("spi: Fix acpi deferred irq probe") Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede Link: https://patch.msgid.link/20251102190921.30068-1-hansg@kernel.org Signed-off-by: Mark Brown --- drivers/spi/spi.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index 2e0647a06890..8588e8562220 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -2851,6 +2851,16 @@ static acpi_status acpi_register_spi_device(struct spi_controller *ctlr, acpi_set_modalias(adev, acpi_device_hid(adev), spi->modalias, sizeof(spi->modalias)); + /* + * This gets re-tried in spi_probe() for -EPROBE_DEFER handling in case + * the GPIO controller does not have a driver yet. This needs to be done + * here too, because this call sets the GPIO direction and/or bias. + * Setting these needs to be done even if there is no driver, in which + * case spi_probe() will never get called. + */ + if (spi->irq < 0) + spi->irq = acpi_dev_gpio_irq_get(adev, 0); + acpi_device_set_enumerated(adev); adev->power.flags.ignore_parent = true; -- cgit From 997c06330fd5c2e220b692f2a358986c6c8fd5a2 Mon Sep 17 00:00:00 2001 From: Laurentiu Mihalcea Date: Tue, 4 Nov 2025 04:02:54 -0800 Subject: reset: imx8mp-audiomix: Fix bad mask values As per the i.MX8MP TRM, section 14.2 "AUDIO_BLK_CTRL", table 14.2.3.1.1 "memory map", the definition of the EARC control register shows that the EARC controller software reset is controlled via bit 0, while the EARC PHY software reset is controlled via bit 1. This means that the current definitions of IMX8MP_AUDIOMIX_EARC_RESET_MASK and IMX8MP_AUDIOMIX_EARC_PHY_RESET_MASK are wrong since their values would imply that the EARC controller software reset is controlled via bit 1 and the EARC PHY software reset is controlled via bit 2. Fix them. Fixes: a83bc87cd30a ("reset: imx8mp-audiomix: Prepare the code for more reset bits") Cc: stable@vger.kernel.org Reviewed-by: Shengjiu Wang Reviewed-by: Frank Li Reviewed-by: Daniel Baluta Signed-off-by: Laurentiu Mihalcea Signed-off-by: Philipp Zabel --- drivers/reset/reset-imx8mp-audiomix.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/reset/reset-imx8mp-audiomix.c b/drivers/reset/reset-imx8mp-audiomix.c index 6b357adfe646..eceb37ff5dc5 100644 --- a/drivers/reset/reset-imx8mp-audiomix.c +++ b/drivers/reset/reset-imx8mp-audiomix.c @@ -14,8 +14,8 @@ #include #define IMX8MP_AUDIOMIX_EARC_RESET_OFFSET 0x200 -#define IMX8MP_AUDIOMIX_EARC_RESET_MASK BIT(1) -#define IMX8MP_AUDIOMIX_EARC_PHY_RESET_MASK BIT(2) +#define IMX8MP_AUDIOMIX_EARC_RESET_MASK BIT(0) +#define IMX8MP_AUDIOMIX_EARC_PHY_RESET_MASK BIT(1) #define IMX8MP_AUDIOMIX_DSP_RUNSTALL_OFFSET 0x108 #define IMX8MP_AUDIOMIX_DSP_RUNSTALL_MASK BIT(5) -- cgit From a7da9c6a2fc08b6ad1a2e9aebbb14bcc59320374 Mon Sep 17 00:00:00 2001 From: Andrea della Porta Date: Tue, 21 Oct 2025 15:55:33 +0200 Subject: arm64: dts: broadcom: Assign clock rates in eth node for RPi5 In Raspberry Pi 5 DTS, the Ethernet clock rates must be assigned as the default clock register values are not valid for the Ethernet interface to function. This can be done either in rp1_clocks node or in rp1_eth node. Define the rates in rp1_eth node, as those clocks are 'leaf' clocks used specifically by the Ethernet device only. Fixes: 43456fdfc014 ("arm64: dts: broadcom: Enable RP1 ethernet for Raspberry Pi 5") Signed-off-by: Andrea della Porta Link: https://lore.kernel.org/r/20251021135533.5517-1-andrea.porta@suse.com Signed-off-by: Florian Fainelli --- arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b.dts | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b.dts b/arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b.dts index b8f256545022..09a849dd09b1 100644 --- a/arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b.dts +++ b/arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b.dts @@ -23,6 +23,10 @@ }; &rp1_eth { + assigned-clocks = <&rp1_clocks RP1_CLK_ETH_TSU>, + <&rp1_clocks RP1_CLK_ETH>; + assigned-clock-rates = <50000000>, + <125000000>; status = "okay"; phy-mode = "rgmii-id"; phy-handle = <&phy1>; -- cgit From 5e44c5a2cc84bed6b92cdfd9c567fcdb9f792604 Mon Sep 17 00:00:00 2001 From: Laurent Pinchart Date: Sun, 2 Nov 2025 13:14:42 +0200 Subject: arm64: dts: broadcom: bcm2712: rpi-5: Add ethernet0 alias The RP1 ethernet controller DT node contains a local-mac-address property to pass the MAC address from the boot loader to the kernel. The boot loader does not fill the MAC address as the ethernet0 alias is missing. Add it. Signed-off-by: Laurent Pinchart Reviewed-by: Andrea della Porta Link: https://lore.kernel.org/r/20251102111443.18206-1-laurent.pinchart@ideasonboard.com Fixes: 43456fdfc014 ("arm64: dts: broadcom: Enable RP1 ethernet for Raspberry Pi 5") Signed-off-by: Florian Fainelli --- arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b.dts | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b.dts b/arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b.dts index 09a849dd09b1..3e0319fdb93f 100644 --- a/arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b.dts +++ b/arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b.dts @@ -18,6 +18,12 @@ #include "bcm2712-rpi-5-b-ovl-rp1.dts" +/ { + aliases { + ethernet0 = &rp1_eth; + }; +}; + &pcie2 { #include "rp1-nexus.dtsi" }; -- cgit From 6d08340d1e354787d6c65a8c3cdd4d41ffb8a5ed Mon Sep 17 00:00:00 2001 From: Jiri Olsa Date: Tue, 4 Nov 2025 22:54:02 +0100 Subject: Revert "perf/x86: Always store regs->ip in perf_callchain_kernel()" This reverts commit 83f44ae0f8afcc9da659799db8693f74847e66b3. Currently we store initial stacktrace entry twice for non-HW ot_regs, which means callers that fail perf_hw_regs(regs) condition in perf_callchain_kernel. It's easy to reproduce this bpftrace: # bpftrace -e 'tracepoint:sched:sched_process_exec { print(kstack()); }' Attaching 1 probe... bprm_execve+1767 bprm_execve+1767 do_execveat_common.isra.0+425 __x64_sys_execve+56 do_syscall_64+133 entry_SYSCALL_64_after_hwframe+118 When perf_callchain_kernel calls unwind_start with first_frame, AFAICS we do not skip regs->ip, but it's added as part of the unwind process. Hence reverting the extra perf_callchain_store for non-hw regs leg. I was not able to bisect this, so I'm not really sure why this was needed in v5.2 and why it's not working anymore, but I could see double entries as far as v5.10. I did the test for both ORC and framepointer unwind with and without the this fix and except for the initial entry the stacktraces are the same. Acked-by: Song Liu Signed-off-by: Jiri Olsa Link: https://lore.kernel.org/r/20251104215405.168643-2-jolsa@kernel.org Signed-off-by: Alexei Starovoitov Acked-by: Steven Rostedt (Google) --- arch/x86/events/core.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index 745caa6c15a3..fa6c47b50989 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -2789,13 +2789,13 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re return; } - if (perf_callchain_store(entry, regs->ip)) - return; - - if (perf_hw_regs(regs)) + if (perf_hw_regs(regs)) { + if (perf_callchain_store(entry, regs->ip)) + return; unwind_start(&state, current, regs, NULL); - else + } else { unwind_start(&state, current, NULL, (void *)regs->sp); + } for (; !unwind_done(&state); unwind_next_frame(&state)) { addr = unwind_get_return_address(&state); -- cgit From 20a0bc10272fa17a44fc857c31574a8306f60d20 Mon Sep 17 00:00:00 2001 From: Jiri Olsa Date: Tue, 4 Nov 2025 22:54:03 +0100 Subject: x86/fgraph,bpf: Fix stack ORC unwind from kprobe_multi return probe Currently we don't get stack trace via ORC unwinder on top of fgraph exit handler. We can see that when generating stacktrace from kretprobe_multi bpf program which is based on fprobe/fgraph. The reason is that the ORC unwind code won't get pass the return_to_handler callback installed by fgraph return probe machinery. Solving this by creating stack frame in return_to_handler expected by ftrace_graph_ret_addr function to recover original return address and continue with the unwind. Also updating the pt_regs data with cs/flags/rsp which are needed for successful stack retrieval from ebpf bpf_get_stackid helper. - in get_perf_callchain we check user_mode(regs) so CS has to be set - in perf_callchain_kernel we call perf_hw_regs(regs), so EFLAGS/FIXED has to be unset Acked-by: Masami Hiramatsu (Google) Signed-off-by: Jiri Olsa Link: https://lore.kernel.org/r/20251104215405.168643-3-jolsa@kernel.org Signed-off-by: Alexei Starovoitov Acked-by: Steven Rostedt (Google) --- arch/x86/include/asm/ftrace.h | 5 +++++ arch/x86/kernel/ftrace_64.S | 8 +++++++- include/linux/ftrace.h | 10 +++++++++- 3 files changed, 21 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h index 93156ac4ffe0..b08c95872eed 100644 --- a/arch/x86/include/asm/ftrace.h +++ b/arch/x86/include/asm/ftrace.h @@ -56,6 +56,11 @@ arch_ftrace_get_regs(struct ftrace_regs *fregs) return &arch_ftrace_regs(fregs)->regs; } +#define arch_ftrace_partial_regs(regs) do { \ + regs->flags &= ~X86_EFLAGS_FIXED; \ + regs->cs = __KERNEL_CS; \ +} while (0) + #define arch_ftrace_fill_perf_regs(fregs, _regs) do { \ (_regs)->ip = arch_ftrace_regs(fregs)->regs.ip; \ (_regs)->sp = arch_ftrace_regs(fregs)->regs.sp; \ diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S index 367da3638167..823dbdd0eb41 100644 --- a/arch/x86/kernel/ftrace_64.S +++ b/arch/x86/kernel/ftrace_64.S @@ -354,12 +354,17 @@ SYM_CODE_START(return_to_handler) UNWIND_HINT_UNDEFINED ANNOTATE_NOENDBR + /* Restore return_to_handler value that got eaten by previous ret instruction. */ + subq $8, %rsp + UNWIND_HINT_FUNC + /* Save ftrace_regs for function exit context */ subq $(FRAME_SIZE), %rsp movq %rax, RAX(%rsp) movq %rdx, RDX(%rsp) movq %rbp, RBP(%rsp) + movq %rsp, RSP(%rsp) movq %rsp, %rdi call ftrace_return_to_handler @@ -368,7 +373,8 @@ SYM_CODE_START(return_to_handler) movq RDX(%rsp), %rdx movq RAX(%rsp), %rax - addq $(FRAME_SIZE), %rsp + addq $(FRAME_SIZE) + 8, %rsp + /* * Jump back to the old return address. This cannot be JMP_NOSPEC rdi * since IBT would demand that contain ENDBR, which simply isn't so for diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 7ded7df6e9b5..07f8c309e432 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -193,6 +193,10 @@ static __always_inline struct pt_regs *ftrace_get_regs(struct ftrace_regs *fregs #if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) || \ defined(CONFIG_HAVE_FTRACE_REGS_HAVING_PT_REGS) +#ifndef arch_ftrace_partial_regs +#define arch_ftrace_partial_regs(regs) do {} while (0) +#endif + static __always_inline struct pt_regs * ftrace_partial_regs(struct ftrace_regs *fregs, struct pt_regs *regs) { @@ -202,7 +206,11 @@ ftrace_partial_regs(struct ftrace_regs *fregs, struct pt_regs *regs) * Since arch_ftrace_get_regs() will check some members and may return * NULL, we can not use it. */ - return &arch_ftrace_regs(fregs)->regs; + regs = &arch_ftrace_regs(fregs)->regs; + + /* Allow arch specific updates to regs. */ + arch_ftrace_partial_regs(regs); + return regs; } #endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_FTRACE_REGS_HAVING_PT_REGS */ -- cgit From c9e208fa93cd66f8077ee15df0728e62b105a687 Mon Sep 17 00:00:00 2001 From: Jiri Olsa Date: Tue, 4 Nov 2025 22:54:04 +0100 Subject: selftests/bpf: Add stacktrace ips test for kprobe_multi/kretprobe_multi Adding test that attaches kprobe/kretprobe multi and verifies the ORC stacktrace matches expected functions. Adding bpf_testmod_stacktrace_test function to bpf_testmod kernel module which is called through several functions so we get reliable call path for stacktrace. The test is only for ORC unwinder to keep it simple. Signed-off-by: Jiri Olsa Link: https://lore.kernel.org/r/20251104215405.168643-4-jolsa@kernel.org Signed-off-by: Alexei Starovoitov Acked-by: Steven Rostedt (Google) --- .../selftests/bpf/prog_tests/stacktrace_ips.c | 104 +++++++++++++++++++++ tools/testing/selftests/bpf/progs/stacktrace_ips.c | 41 ++++++++ .../testing/selftests/bpf/test_kmods/bpf_testmod.c | 26 ++++++ 3 files changed, 171 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/stacktrace_ips.c create mode 100644 tools/testing/selftests/bpf/progs/stacktrace_ips.c diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_ips.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_ips.c new file mode 100644 index 000000000000..6fca459ba550 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_ips.c @@ -0,0 +1,104 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include "stacktrace_ips.skel.h" + +#ifdef __x86_64__ +static int check_stacktrace_ips(int fd, __u32 key, int cnt, ...) +{ + __u64 ips[PERF_MAX_STACK_DEPTH]; + struct ksyms *ksyms = NULL; + int i, err = 0; + va_list args; + + /* sorted by addr */ + ksyms = load_kallsyms_local(); + if (!ASSERT_OK_PTR(ksyms, "load_kallsyms_local")) + return -1; + + /* unlikely, but... */ + if (!ASSERT_LT(cnt, PERF_MAX_STACK_DEPTH, "check_max")) + return -1; + + err = bpf_map_lookup_elem(fd, &key, ips); + if (err) + goto out; + + /* + * Compare all symbols provided via arguments with stacktrace ips, + * and their related symbol addresses.t + */ + va_start(args, cnt); + + for (i = 0; i < cnt; i++) { + unsigned long val; + struct ksym *ksym; + + val = va_arg(args, unsigned long); + ksym = ksym_search_local(ksyms, ips[i]); + if (!ASSERT_OK_PTR(ksym, "ksym_search_local")) + break; + ASSERT_EQ(ksym->addr, val, "stack_cmp"); + } + + va_end(args); + +out: + free_kallsyms_local(ksyms); + return err; +} + +static void test_stacktrace_ips_kprobe_multi(bool retprobe) +{ + LIBBPF_OPTS(bpf_kprobe_multi_opts, opts, + .retprobe = retprobe + ); + LIBBPF_OPTS(bpf_test_run_opts, topts); + struct stacktrace_ips *skel; + + skel = stacktrace_ips__open_and_load(); + if (!ASSERT_OK_PTR(skel, "stacktrace_ips__open_and_load")) + return; + + if (!skel->kconfig->CONFIG_UNWINDER_ORC) { + test__skip(); + goto cleanup; + } + + skel->links.kprobe_multi_test = bpf_program__attach_kprobe_multi_opts( + skel->progs.kprobe_multi_test, + "bpf_testmod_stacktrace_test", &opts); + if (!ASSERT_OK_PTR(skel->links.kprobe_multi_test, "bpf_program__attach_kprobe_multi_opts")) + goto cleanup; + + trigger_module_test_read(1); + + load_kallsyms(); + + check_stacktrace_ips(bpf_map__fd(skel->maps.stackmap), skel->bss->stack_key, 4, + ksym_get_addr("bpf_testmod_stacktrace_test_3"), + ksym_get_addr("bpf_testmod_stacktrace_test_2"), + ksym_get_addr("bpf_testmod_stacktrace_test_1"), + ksym_get_addr("bpf_testmod_test_read")); + +cleanup: + stacktrace_ips__destroy(skel); +} + +static void __test_stacktrace_ips(void) +{ + if (test__start_subtest("kprobe_multi")) + test_stacktrace_ips_kprobe_multi(false); + if (test__start_subtest("kretprobe_multi")) + test_stacktrace_ips_kprobe_multi(true); +} +#else +static void __test_stacktrace_ips(void) +{ + test__skip(); +} +#endif + +void test_stacktrace_ips(void) +{ + __test_stacktrace_ips(); +} diff --git a/tools/testing/selftests/bpf/progs/stacktrace_ips.c b/tools/testing/selftests/bpf/progs/stacktrace_ips.c new file mode 100644 index 000000000000..e2eb30945c1b --- /dev/null +++ b/tools/testing/selftests/bpf/progs/stacktrace_ips.c @@ -0,0 +1,41 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2018 Facebook + +#include +#include +#include + +#ifndef PERF_MAX_STACK_DEPTH +#define PERF_MAX_STACK_DEPTH 127 +#endif + +typedef __u64 stack_trace_t[PERF_MAX_STACK_DEPTH]; + +struct { + __uint(type, BPF_MAP_TYPE_STACK_TRACE); + __uint(max_entries, 16384); + __type(key, __u32); + __type(value, stack_trace_t); +} stackmap SEC(".maps"); + +extern bool CONFIG_UNWINDER_ORC __kconfig __weak; + +/* + * This function is here to have CONFIG_UNWINDER_ORC + * used and added to object BTF. + */ +int unused(void) +{ + return CONFIG_UNWINDER_ORC ? 0 : 1; +} + +__u32 stack_key; + +SEC("kprobe.multi") +int kprobe_multi_test(struct pt_regs *ctx) +{ + stack_key = bpf_get_stackid(ctx, &stackmap, 0); + return 0; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c index 8074bc5f6f20..ed0a4721d8fd 100644 --- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c +++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c @@ -417,6 +417,30 @@ noinline int bpf_testmod_fentry_test11(u64 a, void *b, short c, int d, return a + (long)b + c + d + (long)e + f + g + h + i + j + k; } +noinline void bpf_testmod_stacktrace_test(void) +{ + /* used for stacktrace test as attach function */ + asm volatile (""); +} + +noinline void bpf_testmod_stacktrace_test_3(void) +{ + bpf_testmod_stacktrace_test(); + asm volatile (""); +} + +noinline void bpf_testmod_stacktrace_test_2(void) +{ + bpf_testmod_stacktrace_test_3(); + asm volatile (""); +} + +noinline void bpf_testmod_stacktrace_test_1(void) +{ + bpf_testmod_stacktrace_test_2(); + asm volatile (""); +} + int bpf_testmod_fentry_ok; noinline ssize_t @@ -497,6 +521,8 @@ bpf_testmod_test_read(struct file *file, struct kobject *kobj, 21, 22, 23, 24, 25, 26) != 231) goto out; + bpf_testmod_stacktrace_test_1(); + bpf_testmod_fentry_ok = 1; out: return -EIO; /* always fail */ -- cgit From 3490d29964bdd524366d266b655112cb549c7460 Mon Sep 17 00:00:00 2001 From: Jiri Olsa Date: Tue, 4 Nov 2025 22:54:05 +0100 Subject: selftests/bpf: Add stacktrace ips test for raw_tp Adding test that verifies we get expected initial 2 entries from stacktrace for rawtp probe via ORC unwind. Signed-off-by: Jiri Olsa Link: https://lore.kernel.org/r/20251104215405.168643-5-jolsa@kernel.org Signed-off-by: Alexei Starovoitov Acked-by: Steven Rostedt (Google) --- .../selftests/bpf/prog_tests/stacktrace_ips.c | 46 ++++++++++++++++++++++ tools/testing/selftests/bpf/progs/stacktrace_ips.c | 8 ++++ 2 files changed, 54 insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_ips.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_ips.c index 6fca459ba550..c9efdd2a5b18 100644 --- a/tools/testing/selftests/bpf/prog_tests/stacktrace_ips.c +++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_ips.c @@ -84,12 +84,58 @@ cleanup: stacktrace_ips__destroy(skel); } +static void test_stacktrace_ips_raw_tp(void) +{ + __u32 info_len = sizeof(struct bpf_prog_info); + LIBBPF_OPTS(bpf_test_run_opts, topts); + struct bpf_prog_info info = {}; + struct stacktrace_ips *skel; + __u64 bpf_prog_ksym = 0; + int err; + + skel = stacktrace_ips__open_and_load(); + if (!ASSERT_OK_PTR(skel, "stacktrace_ips__open_and_load")) + return; + + if (!skel->kconfig->CONFIG_UNWINDER_ORC) { + test__skip(); + goto cleanup; + } + + skel->links.rawtp_test = bpf_program__attach_raw_tracepoint( + skel->progs.rawtp_test, + "bpf_testmod_test_read"); + if (!ASSERT_OK_PTR(skel->links.rawtp_test, "bpf_program__attach_raw_tracepoint")) + goto cleanup; + + /* get bpf program address */ + info.jited_ksyms = ptr_to_u64(&bpf_prog_ksym); + info.nr_jited_ksyms = 1; + err = bpf_prog_get_info_by_fd(bpf_program__fd(skel->progs.rawtp_test), + &info, &info_len); + if (!ASSERT_OK(err, "bpf_prog_get_info_by_fd")) + goto cleanup; + + trigger_module_test_read(1); + + load_kallsyms(); + + check_stacktrace_ips(bpf_map__fd(skel->maps.stackmap), skel->bss->stack_key, 2, + bpf_prog_ksym, + ksym_get_addr("bpf_trace_run2")); + +cleanup: + stacktrace_ips__destroy(skel); +} + static void __test_stacktrace_ips(void) { if (test__start_subtest("kprobe_multi")) test_stacktrace_ips_kprobe_multi(false); if (test__start_subtest("kretprobe_multi")) test_stacktrace_ips_kprobe_multi(true); + if (test__start_subtest("raw_tp")) + test_stacktrace_ips_raw_tp(); } #else static void __test_stacktrace_ips(void) diff --git a/tools/testing/selftests/bpf/progs/stacktrace_ips.c b/tools/testing/selftests/bpf/progs/stacktrace_ips.c index e2eb30945c1b..a96c8150d7f5 100644 --- a/tools/testing/selftests/bpf/progs/stacktrace_ips.c +++ b/tools/testing/selftests/bpf/progs/stacktrace_ips.c @@ -38,4 +38,12 @@ int kprobe_multi_test(struct pt_regs *ctx) return 0; } +SEC("raw_tp/bpf_testmod_test_read") +int rawtp_test(void *ctx) +{ + /* Skip ebpf program entry in the stack. */ + stack_key = bpf_get_stackid(ctx, &stackmap, 0); + return 0; +} + char _license[] SEC("license") = "GPL"; -- cgit From 59b0afd01b2ce353ab422ea9c8375b03db313a21 Mon Sep 17 00:00:00 2001 From: Miaoqian Lin Date: Mon, 27 Oct 2025 23:09:34 +0800 Subject: crypto: hisilicon/qm - Fix device reference leak in qm_get_qos_value The qm_get_qos_value() function calls bus_find_device_by_name() which increases the device reference count, but fails to call put_device() to balance the reference count and lead to a device reference leak. Add put_device() calls in both the error path and success path to properly balance the reference count. Found via static analysis. Fixes: 22d7a6c39cab ("crypto: hisilicon/qm - add pci bdf number check") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin Reviewed-by: Longfang Liu Signed-off-by: Herbert Xu --- drivers/crypto/hisilicon/qm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c index a5b96adf2d1e..3b391a146635 100644 --- a/drivers/crypto/hisilicon/qm.c +++ b/drivers/crypto/hisilicon/qm.c @@ -3871,10 +3871,12 @@ static ssize_t qm_get_qos_value(struct hisi_qm *qm, const char *buf, pdev = container_of(dev, struct pci_dev, dev); if (pci_physfn(pdev) != qm->pdev) { pci_err(qm->pdev, "the pdev input does not match the pf!\n"); + put_device(dev); return -EINVAL; } *fun_index = pdev->devfn; + put_device(dev); return 0; } -- cgit From 82420bd4e17bdaba8453fbf9e10c58c9ed0c9727 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 6 Nov 2025 11:46:45 +0100 Subject: ALSA: hda/hdmi: Fix breakage at probing nvhdmi-mcp driver After restructuring and splitting the HDMI codec driver code, each HDMI codec driver contains the own build_controls and build_pcms ops. A copy-n-paste error put the wrong entries for nvhdmi-mcp driver; both build_controls and build_pcms are swapped. Unfortunately both callbacks have the very same form, and the compiler didn't complain it, either. This resulted in a NULL dereference because the PCM instance hasn't been initialized at calling the build_controls callback. Fix it by passing the proper entries. Fixes: ad781b550f9a ("ALSA: hda/hdmi: Rewrite to new probe method") Cc: Link: https://bugzilla.kernel.org/show_bug.cgi?id=220743 Link: https://patch.msgid.link/20251106104647.25805-1-tiwai@suse.de Signed-off-by: Takashi Iwai --- sound/hda/codecs/hdmi/nvhdmi-mcp.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sound/hda/codecs/hdmi/nvhdmi-mcp.c b/sound/hda/codecs/hdmi/nvhdmi-mcp.c index 8fd8d76fa72f..1c5fdfe872f2 100644 --- a/sound/hda/codecs/hdmi/nvhdmi-mcp.c +++ b/sound/hda/codecs/hdmi/nvhdmi-mcp.c @@ -350,8 +350,8 @@ static int nvhdmi_mcp_probe(struct hda_codec *codec, static const struct hda_codec_ops nvhdmi_mcp_codec_ops = { .probe = nvhdmi_mcp_probe, .remove = snd_hda_hdmi_simple_remove, - .build_controls = nvhdmi_mcp_build_pcms, - .build_pcms = nvhdmi_mcp_build_controls, + .build_pcms = nvhdmi_mcp_build_pcms, + .build_controls = nvhdmi_mcp_build_controls, .init = nvhdmi_mcp_init, .unsol_event = snd_hda_hdmi_simple_unsol_event, }; -- cgit From 6b6eddc63ce871897d3a5bc4f8f593e698aef104 Mon Sep 17 00:00:00 2001 From: Haotian Zhang Date: Wed, 5 Nov 2025 14:22:46 +0800 Subject: ASoC: cs4271: Fix regulator leak on probe failure The probe function enables regulators at the beginning but fails to disable them in its error handling path. If any operation after enabling the regulators fails, the probe will exit with an error, leaving the regulators permanently enabled, which could lead to a resource leak. Add a proper error handling path to call regulator_bulk_disable() before returning an error. Fixes: 9a397f473657 ("ASoC: cs4271: add regulator consumer support") Signed-off-by: Haotian Zhang Reviewed-by: Charles Keepax Link: https://patch.msgid.link/20251105062246.1955-1-vulab@iscas.ac.cn Signed-off-by: Mark Brown --- sound/soc/codecs/cs4271.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/sound/soc/codecs/cs4271.c b/sound/soc/codecs/cs4271.c index 6a3cca3d26c7..ead447a5da7f 100644 --- a/sound/soc/codecs/cs4271.c +++ b/sound/soc/codecs/cs4271.c @@ -581,17 +581,17 @@ static int cs4271_component_probe(struct snd_soc_component *component) ret = regcache_sync(cs4271->regmap); if (ret < 0) - return ret; + goto err_disable_regulator; ret = regmap_update_bits(cs4271->regmap, CS4271_MODE2, CS4271_MODE2_PDN | CS4271_MODE2_CPEN, CS4271_MODE2_PDN | CS4271_MODE2_CPEN); if (ret < 0) - return ret; + goto err_disable_regulator; ret = regmap_update_bits(cs4271->regmap, CS4271_MODE2, CS4271_MODE2_PDN, 0); if (ret < 0) - return ret; + goto err_disable_regulator; /* Power-up sequence requires 85 uS */ udelay(85); @@ -601,6 +601,10 @@ static int cs4271_component_probe(struct snd_soc_component *component) CS4271_MODE2_MUTECAEQUB); return 0; + +err_disable_regulator: + regulator_bulk_disable(ARRAY_SIZE(cs4271->supplies), cs4271->supplies); + return ret; } static void cs4271_component_remove(struct snd_soc_component *component) -- cgit From 1a58d865f423f4339edf59053e496089075fa950 Mon Sep 17 00:00:00 2001 From: Miaoqian Lin Date: Wed, 29 Oct 2025 15:17:58 +0800 Subject: ASoC: sdw_utils: fix device reference leak in is_sdca_endpoint_present() The bus_find_device_by_name() function returns a device pointer with an incremented reference count, but the original code was missing put_device() calls in some return paths, leading to reference count leaks. Fix this by ensuring put_device() is called before function exit after bus_find_device_by_name() succeeds This follows the same pattern used elsewhere in the kernel where bus_find_device_by_name() is properly paired with put_device(). Found via static analysis and code review. Fixes: 4f8ef33dd44a ("ASoC: soc_sdw_utils: skip the endpoint that doesn't present") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin Link: https://patch.msgid.link/20251029071804.8425-1-linmq006@gmail.com Signed-off-by: Mark Brown --- sound/soc/sdw_utils/soc_sdw_utils.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/sound/soc/sdw_utils/soc_sdw_utils.c b/sound/soc/sdw_utils/soc_sdw_utils.c index f7c8c16308de..3848c7df1916 100644 --- a/sound/soc/sdw_utils/soc_sdw_utils.c +++ b/sound/soc/sdw_utils/soc_sdw_utils.c @@ -1277,7 +1277,7 @@ static int is_sdca_endpoint_present(struct device *dev, struct sdw_slave *slave; struct device *sdw_dev; const char *sdw_codec_name; - int i; + int ret, i; dlc = kzalloc(sizeof(*dlc), GFP_KERNEL); if (!dlc) @@ -1307,13 +1307,16 @@ static int is_sdca_endpoint_present(struct device *dev, } slave = dev_to_sdw_dev(sdw_dev); - if (!slave) - return -EINVAL; + if (!slave) { + ret = -EINVAL; + goto put_device; + } /* Make sure BIOS provides SDCA properties */ if (!slave->sdca_data.interface_revision) { dev_warn(&slave->dev, "SDCA properties not found in the BIOS\n"); - return 1; + ret = 1; + goto put_device; } for (i = 0; i < slave->sdca_data.num_functions; i++) { @@ -1322,7 +1325,8 @@ static int is_sdca_endpoint_present(struct device *dev, if (dai_type == dai_info->dai_type) { dev_dbg(&slave->dev, "DAI type %d sdca function %s found\n", dai_type, slave->sdca_data.function[i].name); - return 1; + ret = 1; + goto put_device; } } @@ -1330,7 +1334,11 @@ static int is_sdca_endpoint_present(struct device *dev, "SDCA device function for DAI type %d not supported, skip endpoint\n", dai_info->dai_type); - return 0; + ret = 0; + +put_device: + put_device(sdw_dev); + return ret; } int asoc_sdw_parse_sdw_endpoints(struct snd_soc_card *card, -- cgit From 84f5526e4dce0a44d050ceb1b1bf21d43016d91b Mon Sep 17 00:00:00 2001 From: Niranjan H Y Date: Thu, 30 Oct 2025 20:46:37 +0530 Subject: ASoC: tas2783A: Fix issues in firmware parsing During firmware download, if the size of the firmware is too small, it wrongly assumes the firmware download is successful. If there is size mismatch with chunk's header, invalid memory is accessed. Fix these issues by throwing error during these cases. Fixes: 4cc9bd8d7b32 (ASoc: tas2783A: Add soundwire based codec driver) Reported-by: kernel test robot Reported-by: Dan Carpenter Closes: https://lore.kernel.org/r/202510291226.2R3fbYNh-lkp@intel.com/ Signed-off-by: Niranjan H Y Link: https://patch.msgid.link/20251030151637.566-1-niranjan.hy@ti.com Signed-off-by: Mark Brown --- sound/soc/codecs/tas2783-sdw.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/sound/soc/codecs/tas2783-sdw.c b/sound/soc/codecs/tas2783-sdw.c index 1fb4227b711e..e273b80d033e 100644 --- a/sound/soc/codecs/tas2783-sdw.c +++ b/sound/soc/codecs/tas2783-sdw.c @@ -762,10 +762,17 @@ static void tas2783_fw_ready(const struct firmware *fmw, void *context) goto out; } - mutex_lock(&tas_dev->pde_lock); img_sz = fmw->size; buf = fmw->data; offset += FW_DL_OFFSET; + if (offset >= (img_sz - FW_FL_HDR)) { + dev_err(tas_dev->dev, + "firmware is too small"); + ret = -EINVAL; + goto out; + } + + mutex_lock(&tas_dev->pde_lock); while (offset < (img_sz - FW_FL_HDR)) { memset(&hdr, 0, sizeof(hdr)); offset += read_header(&buf[offset], &hdr); @@ -776,6 +783,14 @@ static void tas2783_fw_ready(const struct firmware *fmw, void *context) /* size also includes the header */ file_blk_size = hdr.length - FW_FL_HDR; + /* make sure that enough data is there */ + if (offset + file_blk_size > img_sz) { + ret = -EINVAL; + dev_err(tas_dev->dev, + "corrupt firmware file"); + break; + } + switch (hdr.file_id) { case 0: ret = sdw_nwrite_no_pm(tas_dev->sdw_peripheral, @@ -808,7 +823,8 @@ static void tas2783_fw_ready(const struct firmware *fmw, void *context) break; } mutex_unlock(&tas_dev->pde_lock); - tas2783_update_calibdata(tas_dev); + if (!ret) + tas2783_update_calibdata(tas_dev); out: if (!ret) -- cgit From 86d57d9c07d54e8cb385ffe800930816ccdba0c1 Mon Sep 17 00:00:00 2001 From: Robin Gong Date: Fri, 24 Oct 2025 13:53:20 +0800 Subject: spi: imx: keep dma request disabled before dma transfer setup Since sdma hardware configure postpone to transfer phase, have to disable dma request before dma transfer setup because there is a hardware limitation on sdma event enable(ENBLn) as below: "It is thus essential for the Arm platform to program them before any DMA request is triggered to the SDMA, otherwise an unpredictable combination of channels may be started." Signed-off-by: Carlos Song Signed-off-by: Robin Gong Link: https://patch.msgid.link/20251024055320.408482-1-carlos.song@nxp.com Signed-off-by: Mark Brown --- drivers/spi/spi-imx.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/drivers/spi/spi-imx.c b/drivers/spi/spi-imx.c index 155ddeb8fcd4..bbf1fd4fe1e9 100644 --- a/drivers/spi/spi-imx.c +++ b/drivers/spi/spi-imx.c @@ -519,9 +519,15 @@ static void mx51_ecspi_trigger(struct spi_imx_data *spi_imx) { u32 reg; - reg = readl(spi_imx->base + MX51_ECSPI_CTRL); - reg |= MX51_ECSPI_CTRL_XCH; - writel(reg, spi_imx->base + MX51_ECSPI_CTRL); + if (spi_imx->usedma) { + reg = readl(spi_imx->base + MX51_ECSPI_DMA); + reg |= MX51_ECSPI_DMA_TEDEN | MX51_ECSPI_DMA_RXDEN; + writel(reg, spi_imx->base + MX51_ECSPI_DMA); + } else { + reg = readl(spi_imx->base + MX51_ECSPI_CTRL); + reg |= MX51_ECSPI_CTRL_XCH; + writel(reg, spi_imx->base + MX51_ECSPI_CTRL); + } } static void mx51_ecspi_disable(struct spi_imx_data *spi_imx) @@ -759,7 +765,6 @@ static void mx51_setup_wml(struct spi_imx_data *spi_imx) writel(MX51_ECSPI_DMA_RX_WML(spi_imx->wml - 1) | MX51_ECSPI_DMA_TX_WML(tx_wml) | MX51_ECSPI_DMA_RXT_WML(spi_imx->wml) | - MX51_ECSPI_DMA_TEDEN | MX51_ECSPI_DMA_RXDEN | MX51_ECSPI_DMA_RXTDEN, spi_imx->base + MX51_ECSPI_DMA); } @@ -1520,6 +1525,8 @@ static int spi_imx_dma_transfer(struct spi_imx_data *spi_imx, reinit_completion(&spi_imx->dma_tx_completion); dma_async_issue_pending(controller->dma_tx); + spi_imx->devtype_data->trigger(spi_imx); + transfer_timeout = spi_imx_calculate_timeout(spi_imx, transfer->len); /* Wait SDMA to finish the data transfer.*/ -- cgit From 3dc8c73365d3ca25c99e7e1a0f493039d7291df5 Mon Sep 17 00:00:00 2001 From: Haotian Zhang Date: Thu, 6 Nov 2025 22:31:14 +0800 Subject: ASoC: codecs: va-macro: fix resource leak in probe error path In the commit referenced by the Fixes tag, clk_hw_get_clk() was added in va_macro_probe() to get the fsgen clock, but forgot to add the corresponding clk_put() in va_macro_remove(). This leads to a clock reference leak when the driver is unloaded. Switch to devm_clk_hw_get_clk() to automatically manage the clock resource. Fixes: 30097967e056 ("ASoC: codecs: va-macro: use fsgen as clock") Suggested-by: Konrad Dybcio Signed-off-by: Haotian Zhang Reviewed-by: Konrad Dybcio Link: https://patch.msgid.link/20251106143114.729-1-vulab@iscas.ac.cn Signed-off-by: Mark Brown --- sound/soc/codecs/lpass-va-macro.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/codecs/lpass-va-macro.c b/sound/soc/codecs/lpass-va-macro.c index 2e1b77973a3e..92c177b82a02 100644 --- a/sound/soc/codecs/lpass-va-macro.c +++ b/sound/soc/codecs/lpass-va-macro.c @@ -1638,7 +1638,7 @@ static int va_macro_probe(struct platform_device *pdev) if (ret) goto err_clkout; - va->fsgen = clk_hw_get_clk(&va->hw, "fsgen"); + va->fsgen = devm_clk_hw_get_clk(dev, &va->hw, "fsgen"); if (IS_ERR(va->fsgen)) { ret = PTR_ERR(va->fsgen); goto err_clkout; -- cgit From a59e927ff46a967f84ddf94e89cbb045810e8974 Mon Sep 17 00:00:00 2001 From: Andrey Leonchikov Date: Wed, 5 Nov 2025 22:07:33 +0100 Subject: arm64: dts: rockchip: Fix USB power enable pin for BTT CB2 and Pi2 Fix typo into regulator GPIO definition. With current definition - USB powered off. Valid definition can be found on "pinctrl" section: vcc5v0_usb2t_en: vcc5v0-usb2t-en { rockchip,pins = <3 RK_PD5 RK_FUNC_GPIO &pcfg_pull_none>; }; vcc5v0_usb2b_en: vcc5v0-usb2b-en { rockchip,pins = <4 RK_PC4 RK_FUNC_GPIO &pcfg_pull_none>; }; Fixes: bfbc663d2733a ("arm64: dts: rockchip: Add BigTreeTech CB2 and Pi2") Signed-off-by: Andrey Leonchikov Link: https://patch.msgid.link/20251105210741.850031-1-andreil499@gmail.com Signed-off-by: Heiko Stuebner --- arch/arm64/boot/dts/rockchip/rk3566-bigtreetech-cb2.dtsi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3566-bigtreetech-cb2.dtsi b/arch/arm64/boot/dts/rockchip/rk3566-bigtreetech-cb2.dtsi index f74590af7e33..b6cf03a7ba66 100644 --- a/arch/arm64/boot/dts/rockchip/rk3566-bigtreetech-cb2.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3566-bigtreetech-cb2.dtsi @@ -187,7 +187,7 @@ vcc5v0_usb2b: regulator-vcc5v0-usb2b { compatible = "regulator-fixed"; enable-active-high; - gpio = <&gpio0 RK_PC4 GPIO_ACTIVE_HIGH>; + gpio = <&gpio4 RK_PC4 GPIO_ACTIVE_HIGH>; pinctrl-names = "default"; pinctrl-0 = <&vcc5v0_usb2b_en>; regulator-name = "vcc5v0_usb2b"; @@ -199,7 +199,7 @@ vcc5v0_usb2t: regulator-vcc5v0-usb2t { compatible = "regulator-fixed"; enable-active-high; - gpios = <&gpio0 RK_PD5 GPIO_ACTIVE_HIGH>; + gpios = <&gpio3 RK_PD5 GPIO_ACTIVE_HIGH>; pinctrl-names = "default"; pinctrl-0 = <&vcc5v0_usb2t_en>; regulator-name = "vcc5v0_usb2t"; -- cgit From 32b415a9dc2c212e809b7ebc2b14bc3fbda2b9af Mon Sep 17 00:00:00 2001 From: Ian Forbes Date: Tue, 21 Oct 2025 14:01:28 -0500 Subject: drm/vmwgfx: Validate command header size against SVGA_CMD_MAX_DATASIZE This data originates from userspace and is used in buffer offset calculations which could potentially overflow causing an out-of-bounds access. Fixes: 8ce75f8ab904 ("drm/vmwgfx: Update device includes for DX device functionality") Reported-by: Rohit Keshri Signed-off-by: Ian Forbes Reviewed-by: Maaz Mombasawala Signed-off-by: Zack Rusin Link: https://patch.msgid.link/20251021190128.13014-1-ian.forbes@broadcom.com --- drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c index d539f25b5fbe..3057f8baa7d2 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c @@ -3668,6 +3668,11 @@ static int vmw_cmd_check(struct vmw_private *dev_priv, cmd_id = header->id; + if (header->size > SVGA_CMD_MAX_DATASIZE) { + VMW_DEBUG_USER("SVGA3D command: %d is too big.\n", + cmd_id + SVGA_3D_CMD_BASE); + return -E2BIG; + } *size = header->size + sizeof(SVGA3dCmdHeader); cmd_id -= SVGA_3D_CMD_BASE; -- cgit From c1962742ffff7e245f935903a4658eb6f94f6058 Mon Sep 17 00:00:00 2001 From: Ian Forbes Date: Thu, 30 Oct 2025 14:36:40 -0500 Subject: drm/vmwgfx: Use kref in vmw_bo_dirty Rather than using an ad hoc reference count use kref which is atomic and has underflow warnings. Signed-off-by: Ian Forbes Signed-off-by: Zack Rusin Link: https://patch.msgid.link/20251030193640.153697-1-ian.forbes@broadcom.com --- drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c b/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c index 7de20e56082c..fd4e76486f2d 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c @@ -32,22 +32,22 @@ enum vmw_bo_dirty_method { /** * struct vmw_bo_dirty - Dirty information for buffer objects + * @ref_count: Reference count for this structure. Must be first member! * @start: First currently dirty bit * @end: Last currently dirty bit + 1 * @method: The currently used dirty method * @change_count: Number of consecutive method change triggers - * @ref_count: Reference count for this structure * @bitmap_size: The size of the bitmap in bits. Typically equal to the * nuber of pages in the bo. * @bitmap: A bitmap where each bit represents a page. A set bit means a * dirty page. */ struct vmw_bo_dirty { + struct kref ref_count; unsigned long start; unsigned long end; enum vmw_bo_dirty_method method; unsigned int change_count; - unsigned int ref_count; unsigned long bitmap_size; unsigned long bitmap[]; }; @@ -221,7 +221,7 @@ int vmw_bo_dirty_add(struct vmw_bo *vbo) int ret; if (dirty) { - dirty->ref_count++; + kref_get(&dirty->ref_count); return 0; } @@ -235,7 +235,7 @@ int vmw_bo_dirty_add(struct vmw_bo *vbo) dirty->bitmap_size = num_pages; dirty->start = dirty->bitmap_size; dirty->end = 0; - dirty->ref_count = 1; + kref_init(&dirty->ref_count); if (num_pages < PAGE_SIZE / sizeof(pte_t)) { dirty->method = VMW_BO_DIRTY_PAGETABLE; } else { @@ -274,10 +274,8 @@ void vmw_bo_dirty_release(struct vmw_bo *vbo) { struct vmw_bo_dirty *dirty = vbo->dirty; - if (dirty && --dirty->ref_count == 0) { - kvfree(dirty); + if (dirty && kref_put(&dirty->ref_count, (void *)kvfree)) vbo->dirty = NULL; - } } /** -- cgit From eef295a8508202e750e4f103a97447f3c9d5e3d0 Mon Sep 17 00:00:00 2001 From: Ian Forbes Date: Mon, 3 Nov 2025 14:19:20 -0600 Subject: drm/vmwgfx: Restore Guest-Backed only cursor plane support The referenced fixes commit broke the cursor plane for configurations which have Guest-Backed surfaces but no cursor MOB support. Fixes: 965544150d1c ("drm/vmwgfx: Refactor cursor handling") Signed-off-by: Ian Forbes Signed-off-by: Zack Rusin Link: https://patch.msgid.link/20251103201920.381503-1-ian.forbes@broadcom.com --- drivers/gpu/drm/vmwgfx/vmwgfx_cursor_plane.c | 16 +++++++++++++++- drivers/gpu/drm/vmwgfx/vmwgfx_cursor_plane.h | 1 + 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_cursor_plane.c b/drivers/gpu/drm/vmwgfx/vmwgfx_cursor_plane.c index 718832b08d96..c46f17ba7236 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_cursor_plane.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_cursor_plane.c @@ -100,8 +100,10 @@ vmw_cursor_update_type(struct vmw_private *vmw, struct vmw_plane_state *vps) if (vmw->has_mob) { if ((vmw->capabilities2 & SVGA_CAP2_CURSOR_MOB) != 0) return VMW_CURSOR_UPDATE_MOB; + else + return VMW_CURSOR_UPDATE_GB_ONLY; } - + drm_warn_once(&vmw->drm, "Unknown Cursor Type!\n"); return VMW_CURSOR_UPDATE_NONE; } @@ -139,6 +141,7 @@ static u32 vmw_cursor_mob_size(enum vmw_cursor_update_type update_type, { switch (update_type) { case VMW_CURSOR_UPDATE_LEGACY: + case VMW_CURSOR_UPDATE_GB_ONLY: case VMW_CURSOR_UPDATE_NONE: return 0; case VMW_CURSOR_UPDATE_MOB: @@ -623,6 +626,7 @@ int vmw_cursor_plane_prepare_fb(struct drm_plane *plane, if (!surface || vps->cursor.legacy.id == surface->snooper.id) vps->cursor.update_type = VMW_CURSOR_UPDATE_NONE; break; + case VMW_CURSOR_UPDATE_GB_ONLY: case VMW_CURSOR_UPDATE_MOB: { bo = vmw_user_object_buffer(&vps->uo); if (bo) { @@ -737,6 +741,7 @@ void vmw_cursor_plane_atomic_update(struct drm_plane *plane, struct drm_atomic_state *state) { + struct vmw_bo *bo; struct drm_plane_state *new_state = drm_atomic_get_new_plane_state(state, plane); struct drm_plane_state *old_state = @@ -762,6 +767,15 @@ vmw_cursor_plane_atomic_update(struct drm_plane *plane, case VMW_CURSOR_UPDATE_MOB: vmw_cursor_update_mob(dev_priv, vps); break; + case VMW_CURSOR_UPDATE_GB_ONLY: + bo = vmw_user_object_buffer(&vps->uo); + if (bo) + vmw_send_define_cursor_cmd(dev_priv, bo->map.virtual, + vps->base.crtc_w, + vps->base.crtc_h, + vps->base.hotspot_x, + vps->base.hotspot_y); + break; case VMW_CURSOR_UPDATE_NONE: /* do nothing */ break; diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_cursor_plane.h b/drivers/gpu/drm/vmwgfx/vmwgfx_cursor_plane.h index 40694925a70e..0c2cc0699b0d 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_cursor_plane.h +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_cursor_plane.h @@ -33,6 +33,7 @@ static const u32 __maybe_unused vmw_cursor_plane_formats[] = { enum vmw_cursor_update_type { VMW_CURSOR_UPDATE_NONE = 0, VMW_CURSOR_UPDATE_LEGACY, + VMW_CURSOR_UPDATE_GB_ONLY, VMW_CURSOR_UPDATE_MOB, }; -- cgit From 29528c8e643bb0c54da01237a35010c6438423d2 Mon Sep 17 00:00:00 2001 From: Shenghao Ding Date: Fri, 7 Nov 2025 13:49:59 +0800 Subject: ASoC: tas2781: fix getting the wrong device number The return value of device_property_read_u32_array used for getting the property is the status instead of the number of the property. Fixes: ef3bcde75d06 ("ASoC: tas2781: Add tas2781 driver") Signed-off-by: Shenghao Ding Link: https://patch.msgid.link/20251107054959.950-1-shenghao-ding@ti.com Signed-off-by: Mark Brown --- sound/soc/codecs/tas2781-i2c.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/sound/soc/codecs/tas2781-i2c.c b/sound/soc/codecs/tas2781-i2c.c index ba880b5de7e8..8f37aa00e62e 100644 --- a/sound/soc/codecs/tas2781-i2c.c +++ b/sound/soc/codecs/tas2781-i2c.c @@ -1957,7 +1957,8 @@ static void tasdevice_parse_dt(struct tasdevice_priv *tas_priv) { struct i2c_client *client = (struct i2c_client *)tas_priv->client; unsigned int dev_addrs[TASDEVICE_MAX_CHANNELS]; - int i, ndev = 0; + int ndev = 0; + int i, rc; if (tas_priv->isacpi) { ndev = device_property_read_u32_array(&client->dev, @@ -1968,8 +1969,12 @@ static void tasdevice_parse_dt(struct tasdevice_priv *tas_priv) } else { ndev = (ndev < ARRAY_SIZE(dev_addrs)) ? ndev : ARRAY_SIZE(dev_addrs); - ndev = device_property_read_u32_array(&client->dev, + rc = device_property_read_u32_array(&client->dev, "ti,audio-slots", dev_addrs, ndev); + if (rc != 0) { + ndev = 1; + dev_addrs[0] = client->addr; + } } tas_priv->irq = -- cgit From 939edfaa10f1d22e6af6a84bf4bd96dc49c67302 Mon Sep 17 00:00:00 2001 From: Alvaro Gamez Machado Date: Thu, 6 Nov 2025 14:45:35 +0100 Subject: spi: xilinx: increase number of retries before declaring stall SPI devices using a (relative) slow frequency need a larger time. For instance, microblaze running at 83.25MHz and performing a 3 bytes transaction using a 10MHz/16 = 625kHz needed this stall value increased to at least 20. The SPI device is quite slow, but also is the microblaze, so set this value to 32 to give it even more margin. Signed-off-by: Alvaro Gamez Machado Reviewed-by: Ricardo Ribalda Link: https://patch.msgid.link/20251106134545.31942-1-alvaro.gamez@hazent.com Signed-off-by: Mark Brown --- drivers/spi/spi-xilinx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/spi/spi-xilinx.c b/drivers/spi/spi-xilinx.c index d59cc8a18484..c86dc56f38b4 100644 --- a/drivers/spi/spi-xilinx.c +++ b/drivers/spi/spi-xilinx.c @@ -300,7 +300,7 @@ static int xilinx_spi_txrx_bufs(struct spi_device *spi, struct spi_transfer *t) /* Read out all the data from the Rx FIFO */ rx_words = n_words; - stalled = 10; + stalled = 32; while (rx_words) { if (rx_words == n_words && !(stalled--) && !(sr & XSPI_SR_TX_EMPTY_MASK) && -- cgit From 62b9ca1706e1bbb60d945a58de7c7b5826f6b2a2 Mon Sep 17 00:00:00 2001 From: "Mario Limonciello (AMD)" Date: Wed, 5 Nov 2025 22:51:05 -0600 Subject: PM: hibernate: Emit an error when image writing fails If image writing fails, a return code is passed up to the caller, but none of the callers log anything to the log and so the only record of it is the return code that userspace gets. Adjust the logging so that the image size and speed of writing is only emitted on success and if there is an error, it's saved to the logs. Fixes: a06c6f5d3cc9 ("PM: hibernate: Move to crypto APIs for LZO compression") Reported-by: Askar Safin Closes: https://lore.kernel.org/linux-pm/20251105180506.137448-1-safinaskar@gmail.com/ Signed-off-by: Mario Limonciello (AMD) Tested-by: Askar Safin Cc: 6.9+ # 6.9+ [ rjw: Added missing braces after "else", changelog edits ] Link: https://patch.msgid.link/20251106045158.3198061-2-superm1@kernel.org Signed-off-by: Rafael J. Wysocki --- kernel/power/swap.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/kernel/power/swap.c b/kernel/power/swap.c index 0beff7eeaaba..7daa716d2cb1 100644 --- a/kernel/power/swap.c +++ b/kernel/power/swap.c @@ -877,11 +877,14 @@ out_finish: stop = ktime_get(); if (!ret) ret = err2; - if (!ret) + if (!ret) { + swsusp_show_speed(start, stop, nr_to_write, "Wrote"); + pr_info("Image size after compression: %d kbytes\n", + (atomic_read(&compressed_size) / 1024)); pr_info("Image saving done\n"); - swsusp_show_speed(start, stop, nr_to_write, "Wrote"); - pr_info("Image size after compression: %d kbytes\n", - (atomic_read(&compressed_size) / 1024)); + } else { + pr_err("Image saving failed: %d\n", ret); + } out_clean: hib_finish_batch(&hb); -- cgit From 66ededc694f1d06a71ca35a3c8e3689e9b85b3ce Mon Sep 17 00:00:00 2001 From: "Mario Limonciello (AMD)" Date: Wed, 5 Nov 2025 22:51:06 -0600 Subject: PM: hibernate: Use atomic64_t for compressed_size variable `compressed_size` can overflow, showing nonsensical values. Change from `atomic_t` to `atomic64_t` to prevent overflow. Fixes: a06c6f5d3cc9 ("PM: hibernate: Move to crypto APIs for LZO compression") Reported-by: Askar Safin Closes: https://lore.kernel.org/linux-pm/20251105180506.137448-1-safinaskar@gmail.com/ Signed-off-by: Mario Limonciello (AMD) Tested-by: Askar Safin Cc: 6.9+ # 6.9+ Link: https://patch.msgid.link/20251106045158.3198061-3-superm1@kernel.org Signed-off-by: Rafael J. Wysocki --- kernel/power/swap.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/power/swap.c b/kernel/power/swap.c index 7daa716d2cb1..e0441483dbee 100644 --- a/kernel/power/swap.c +++ b/kernel/power/swap.c @@ -635,7 +635,7 @@ struct cmp_data { }; /* Indicates the image size after compression */ -static atomic_t compressed_size = ATOMIC_INIT(0); +static atomic64_t compressed_size = ATOMIC_INIT(0); /* * Compression function that runs in its own thread. @@ -664,7 +664,7 @@ static int compress_threadfn(void *data) d->ret = crypto_acomp_compress(d->cr); d->cmp_len = d->cr->dlen; - atomic_set(&compressed_size, atomic_read(&compressed_size) + d->cmp_len); + atomic64_add(d->cmp_len, &compressed_size); atomic_set_release(&d->stop, 1); wake_up(&d->done); } @@ -696,7 +696,7 @@ static int save_compressed_image(struct swap_map_handle *handle, hib_init_batch(&hb); - atomic_set(&compressed_size, 0); + atomic64_set(&compressed_size, 0); /* * We'll limit the number of threads for compression to limit memory @@ -879,8 +879,8 @@ out_finish: ret = err2; if (!ret) { swsusp_show_speed(start, stop, nr_to_write, "Wrote"); - pr_info("Image size after compression: %d kbytes\n", - (atomic_read(&compressed_size) / 1024)); + pr_info("Image size after compression: %lld kbytes\n", + (atomic64_read(&compressed_size) / 1024)); pr_info("Image saving done\n"); } else { pr_err("Image saving failed: %d\n", ret); -- cgit From 0b6c10cb8479d0d1b7b208277df2e2afe082d4bd Mon Sep 17 00:00:00 2001 From: "Mario Limonciello (AMD)" Date: Wed, 5 Nov 2025 22:51:07 -0600 Subject: PM: hibernate: Fix style issues in save_compressed_image() Address two issues indicated by checkpatch: - Trailing statements should be on next line. - Prefer 'unsigned int' to bare use of 'unsigned'. Signed-off-by: Mario Limonciello (AMD) [ rjw: Changelog edits ] Link: https://patch.msgid.link/20251106045158.3198061-4-superm1@kernel.org Signed-off-by: Rafael J. Wysocki --- kernel/power/swap.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/kernel/power/swap.c b/kernel/power/swap.c index e0441483dbee..70ae21f7370d 100644 --- a/kernel/power/swap.c +++ b/kernel/power/swap.c @@ -689,7 +689,7 @@ static int save_compressed_image(struct swap_map_handle *handle, ktime_t start; ktime_t stop; size_t off; - unsigned thr, run_threads, nr_threads; + unsigned int thr, run_threads, nr_threads; unsigned char *page = NULL; struct cmp_data *data = NULL; struct crc_data *crc = NULL; @@ -902,7 +902,8 @@ out_clean: } vfree(data); } - if (page) free_page((unsigned long)page); + if (page) + free_page((unsigned long)page); return ret; } -- cgit From b6cfddd26ec55e865b4715f73e9bbb17a15091ed Mon Sep 17 00:00:00 2001 From: Dave Jiang Date: Fri, 31 Oct 2025 10:32:24 -0700 Subject: cxl: Adjust offset calculation for poison injection The HPA to DPA translation for poison injection assumes that the base address starts from where the CXL region begins. When the extended linear cache is active, the offset can be within the DRAM region. Adjust the offset so that it correctly reflects the offset within the CXL region. [ dj: Add fixes tag from Alison ] Fixes: c3dd67681c70 ("cxl/region: Add inject and clear poison by region offset") Link: https://patch.msgid.link/20251031173224.3537030-5-dave.jiang@intel.com Reviewed-by: Alison Schofield Signed-off-by: Dave Jiang --- drivers/cxl/core/region.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index b06fee1978ba..41b64d871c5a 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -3702,6 +3702,7 @@ static int cxl_region_debugfs_poison_inject(void *data, u64 offset) if (validate_region_offset(cxlr, offset)) return -EINVAL; + offset -= cxlr->params.cache_size; rc = region_offset_to_dpa_result(cxlr, offset, &result); if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) { dev_dbg(&cxlr->dev, @@ -3734,6 +3735,7 @@ static int cxl_region_debugfs_poison_clear(void *data, u64 offset) if (validate_region_offset(cxlr, offset)) return -EINVAL; + offset -= cxlr->params.cache_size; rc = region_offset_to_dpa_result(cxlr, offset, &result); if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) { dev_dbg(&cxlr->dev, -- cgit From 4fe5934db4a7187d358f1af1b3ef9b6dd59bce58 Mon Sep 17 00:00:00 2001 From: "Gautham R. Shenoy" Date: Fri, 7 Nov 2025 13:11:41 +0530 Subject: ACPI: CPPC: Detect preferred core availability on online CPUs Commit 279f838a61f9 ("x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()") introduced the ability to detect the preferred core on AMD platforms by checking if there at least two distinct highest_perf values. However, it uses for_each_present_cpu() to iterate through all the CPUs in the platform, which is problematic when the kernel is booted with "nosmt=force" commandline option. Hence limit the search to only the online CPUs. Fixes: 279f838a61f9 ("x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()") Reported-by: Christopher Harris Closes: https://lore.kernel.org/lkml/CAM+eXpdDT7KjLV0AxEwOLkSJ2QtrsvGvjA2cCHvt1d0k2_C4Cw@mail.gmail.com/ Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" Tested-by: Chrisopher Harris Signed-off-by: Gautham R. Shenoy Link: https://patch.msgid.link/20251107074145.2340-2-gautham.shenoy@amd.com Signed-off-by: Rafael J. Wysocki --- arch/x86/kernel/acpi/cppc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/acpi/cppc.c b/arch/x86/kernel/acpi/cppc.c index 7047124490f6..d7c8ef1e354d 100644 --- a/arch/x86/kernel/acpi/cppc.c +++ b/arch/x86/kernel/acpi/cppc.c @@ -196,7 +196,7 @@ int amd_detect_prefcore(bool *detected) break; } - for_each_present_cpu(cpu) { + for_each_online_cpu(cpu) { u32 tmp; int ret; -- cgit From 6dd3b8a709a130a4d55c866af9804c81b8486d28 Mon Sep 17 00:00:00 2001 From: "Gautham R. Shenoy" Date: Fri, 7 Nov 2025 13:11:42 +0530 Subject: ACPI: CPPC: Check _CPC validity for only the online CPUs per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online CPUs via acpi_soft_cpu_online() --> __acpi_processor_start() --> acpi_cppc_processor_probe(). However the function acpi_cpc_valid() checks for the validity of the _CPC object for all the present CPUs. This breaks when the kernel is booted with "nosmt=force". Hence check the validity of the _CPC objects of only the online CPUs. Fixes: 2aeca6bd0277 ("ACPI: CPPC: Check present CPUs for determining _CPC is valid") Reported-by: Christopher Harris Closes: https://lore.kernel.org/lkml/CAM+eXpdDT7KjLV0AxEwOLkSJ2QtrsvGvjA2cCHvt1d0k2_C4Cw@mail.gmail.com/ Suggested-by: Mario Limonciello Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" Tested-by: Chrisopher Harris Signed-off-by: Gautham R. Shenoy Link: https://patch.msgid.link/20251107074145.2340-3-gautham.shenoy@amd.com Signed-off-by: Rafael J. Wysocki --- drivers/acpi/cppc_acpi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c index 6c684e54fe01..da6b35ac8c87 100644 --- a/drivers/acpi/cppc_acpi.c +++ b/drivers/acpi/cppc_acpi.c @@ -460,7 +460,7 @@ bool acpi_cpc_valid(void) if (acpi_disabled) return false; - for_each_present_cpu(cpu) { + for_each_online_cpu(cpu) { cpc_ptr = per_cpu(cpc_desc_ptr, cpu); if (!cpc_ptr) return false; -- cgit From 8821c8e80a65bc4eb73daf63b34aac6b8ad69461 Mon Sep 17 00:00:00 2001 From: "Gautham R. Shenoy" Date: Fri, 7 Nov 2025 13:11:43 +0530 Subject: ACPI: CPPC: Perform fast check switch only for online CPUs per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online CPUs via acpi_soft_cpu_online() --> __acpi_processor_start() --> acpi_cppc_processor_probe(). However the function cppc_allow_fast_switch() checks for the validity of the _CPC object for all the present CPUs. This breaks when the kernel is booted with "nosmt=force". Check fast_switch capability only on online CPUs Fixes: 15eece6c5b05 ("ACPI: CPPC: Fix NULL pointer dereference when nosmp is used") Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" Signed-off-by: Gautham R. Shenoy Link: https://patch.msgid.link/20251107074145.2340-4-gautham.shenoy@amd.com Signed-off-by: Rafael J. Wysocki --- drivers/acpi/cppc_acpi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c index da6b35ac8c87..7492c9922866 100644 --- a/drivers/acpi/cppc_acpi.c +++ b/drivers/acpi/cppc_acpi.c @@ -476,7 +476,7 @@ bool cppc_allow_fast_switch(void) struct cpc_desc *cpc_ptr; int cpu; - for_each_present_cpu(cpu) { + for_each_online_cpu(cpu) { cpc_ptr = per_cpu(cpc_desc_ptr, cpu); desired_reg = &cpc_ptr->cpc_regs[DESIRED_PERF]; if (!CPC_IN_SYSTEM_MEMORY(desired_reg) && -- cgit From 0fce75870666b46b700cfbd3216380b422f975da Mon Sep 17 00:00:00 2001 From: "Gautham R. Shenoy" Date: Fri, 7 Nov 2025 13:11:44 +0530 Subject: ACPI: CPPC: Limit perf ctrs in PCC check only to online CPUs per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online CPU via acpi_soft_cpu_online() --> __acpi_processor_start() --> acpi_cppc_processor_probe(). However the function cppc_perf_ctrs_in_pcc() checks if the CPPC perf-ctrs are in a PCC region for all the present CPUs, which breaks when the kernel is booted with "nosmt=force". Hence, limit the check only to the online CPUs. Fixes: ae2df912d1a5 ("ACPI: CPPC: Disable FIE if registers in PCC regions") Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" Signed-off-by: Gautham R. Shenoy Link: https://patch.msgid.link/20251107074145.2340-5-gautham.shenoy@amd.com Signed-off-by: Rafael J. Wysocki --- drivers/acpi/cppc_acpi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c index 7492c9922866..3bdeeee3414e 100644 --- a/drivers/acpi/cppc_acpi.c +++ b/drivers/acpi/cppc_acpi.c @@ -1435,7 +1435,7 @@ bool cppc_perf_ctrs_in_pcc(void) { int cpu; - for_each_present_cpu(cpu) { + for_each_online_cpu(cpu) { struct cpc_register_resource *ref_perf_reg; struct cpc_desc *cpc_desc; -- cgit From 2cf95b9baa52262bfb645cb3c04f902dd50c29e2 Mon Sep 17 00:00:00 2001 From: Shubhrajyoti Datta Date: Thu, 23 Oct 2025 17:01:08 +0530 Subject: EDAC/versalnet: Handle split messages for non-standard errors The current code assumes that only DDR errors have split messages. Ensure proper logging of non-standard event errors that may be split across multiple messages too. [ bp: Massage, move comment too, fix it up. ] Fixes: d5fe2fec6c40 ("EDAC: Add a driver for the AMD Versal NET DDR controller") Signed-off-by: Shubhrajyoti Datta Signed-off-by: Borislav Petkov (AMD) Link: https://patch.msgid.link/20251023113108.3467132-1-shubhrajyoti.datta@amd.com --- drivers/edac/versalnet_edac.c | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-) diff --git a/drivers/edac/versalnet_edac.c b/drivers/edac/versalnet_edac.c index 1ded4c3f0213..1a1092793092 100644 --- a/drivers/edac/versalnet_edac.c +++ b/drivers/edac/versalnet_edac.c @@ -605,21 +605,23 @@ static int rpmsg_cb(struct rpmsg_device *rpdev, void *data, length = result[MSG_ERR_LENGTH]; offset = result[MSG_ERR_OFFSET]; + /* + * The data can come in two stretches. Construct the regs from two + * messages. The offset indicates the offset from which the data is to + * be taken. + */ + for (i = 0 ; i < length; i++) { + k = offset + i; + j = ERROR_DATA + i; + mc_priv->regs[k] = result[j]; + } + if (result[TOTAL_ERR_LENGTH] > length) { if (!mc_priv->part_len) mc_priv->part_len = length; else mc_priv->part_len += length; - /* - * The data can come in 2 stretches. Construct the regs from 2 - * messages the offset indicates the offset from which the data is to - * be taken - */ - for (i = 0 ; i < length; i++) { - k = offset + i; - j = ERROR_DATA + i; - mc_priv->regs[k] = result[j]; - } + if (mc_priv->part_len < result[TOTAL_ERR_LENGTH]) return 0; mc_priv->part_len = 0; @@ -705,7 +707,7 @@ static int rpmsg_cb(struct rpmsg_device *rpdev, void *data, /* Convert to bytes */ length = result[TOTAL_ERR_LENGTH] * 4; log_non_standard_event(sec_type, &amd_versalnet_guid, mc_priv->message, - sec_sev, (void *)&result[ERROR_DATA], length); + sec_sev, (void *)&mc_priv->regs, length); return 0; } -- cgit From 4b93d211bbffd3dce76664d95f2306d23e7215ce Mon Sep 17 00:00:00 2001 From: Kaushlendra Kumar Date: Thu, 30 Oct 2025 08:02:28 +0530 Subject: ACPI: MRRM: Fix memory leaks and improve error handling Add proper error handling and resource cleanup to prevent memory leaks in add_boot_memory_ranges(). The function now checks for NULL return from kobject_create_and_add(), uses local buffer for range names to avoid dynamic allocation, and implements a cleanup path that removes previously created sysfs groups and kobjects on failure. This prevents resource leaks when kobject creation or sysfs group creation fails during boot memory range initialization. Signed-off-by: Kaushlendra Kumar Reviewed-by: Tony Luck Link: https://patch.msgid.link/20251030023228.3956296-1-kaushlendra.kumar@intel.com Signed-off-by: Rafael J. Wysocki --- drivers/acpi/acpi_mrrm.c | 43 +++++++++++++++++++++++++++++++++---------- 1 file changed, 33 insertions(+), 10 deletions(-) diff --git a/drivers/acpi/acpi_mrrm.c b/drivers/acpi/acpi_mrrm.c index a6dbf623e557..6d69554c940e 100644 --- a/drivers/acpi/acpi_mrrm.c +++ b/drivers/acpi/acpi_mrrm.c @@ -152,26 +152,49 @@ ATTRIBUTE_GROUPS(memory_range); static __init int add_boot_memory_ranges(void) { - struct kobject *pkobj, *kobj; + struct kobject *pkobj, *kobj, **kobjs; int ret = -EINVAL; - char *name; + char name[16]; + int i; pkobj = kobject_create_and_add("memory_ranges", acpi_kobj); + if (!pkobj) + return -ENOMEM; - for (int i = 0; i < mrrm_mem_entry_num; i++) { - name = kasprintf(GFP_KERNEL, "range%d", i); - if (!name) { - ret = -ENOMEM; - break; - } + kobjs = kcalloc(mrrm_mem_entry_num, sizeof(*kobjs), GFP_KERNEL); + if (!kobjs) { + kobject_put(pkobj); + return -ENOMEM; + } + for (i = 0; i < mrrm_mem_entry_num; i++) { + scnprintf(name, sizeof(name), "range%d", i); kobj = kobject_create_and_add(name, pkobj); + if (!kobj) { + ret = -ENOMEM; + goto cleanup; + } ret = sysfs_create_groups(kobj, memory_range_groups); - if (ret) - return ret; + if (ret) { + kobject_put(kobj); + goto cleanup; + } + kobjs[i] = kobj; } + kfree(kobjs); + return 0; + +cleanup: + for (int j = 0; j < i; j++) { + if (kobjs[j]) { + sysfs_remove_groups(kobjs[j], memory_range_groups); + kobject_put(kobjs[j]); + } + } + kfree(kobjs); + kobject_put(pkobj); return ret; } -- cgit From 7a39c723b7472b8aaa2e0a67d2b6c7cf1c45cafb Mon Sep 17 00:00:00 2001 From: Baojun Xu Date: Sat, 8 Nov 2025 22:23:25 +0800 Subject: ALSA: hda/tas2781: Add new quirk for HP new projects Add new vendor_id and subsystem_id in quirk for HP new projects. Signed-off-by: Baojun Xu Link: https://patch.msgid.link/20251108142325.2563-1-baojun.xu@ti.com Signed-off-by: Takashi Iwai --- sound/hda/codecs/realtek/alc269.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index 4aec5067c59d..a9698bf26887 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -6694,6 +6694,15 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x8e60, "HP Trekker ", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8e61, "HP Trekker ", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8e62, "HP Trekker ", ALC287_FIXUP_CS35L41_I2C_2), + SND_PCI_QUIRK(0x103c, 0x8ed5, "HP Merino13X", ALC245_FIXUP_TAS2781_SPI_2), + SND_PCI_QUIRK(0x103c, 0x8ed6, "HP Merino13", ALC245_FIXUP_TAS2781_SPI_2), + SND_PCI_QUIRK(0x103c, 0x8ed7, "HP Merino14", ALC245_FIXUP_TAS2781_SPI_2), + SND_PCI_QUIRK(0x103c, 0x8ed8, "HP Merino16", ALC245_FIXUP_TAS2781_SPI_2), + SND_PCI_QUIRK(0x103c, 0x8ed9, "HP Merino14W", ALC245_FIXUP_TAS2781_SPI_2), + SND_PCI_QUIRK(0x103c, 0x8eda, "HP Merino16W", ALC245_FIXUP_TAS2781_SPI_2), + SND_PCI_QUIRK(0x103c, 0x8f40, "HP Lampas14", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x103c, 0x8f41, "HP Lampas16", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x103c, 0x8f42, "HP LampasW14", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x1043, 0x1032, "ASUS VivoBook X513EA", ALC256_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x1034, "ASUS GU605C", ALC285_FIXUP_ASUS_GU605_SPI_SPEAKER2_TO_DAC1), SND_PCI_QUIRK(0x1043, 0x103e, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC), -- cgit From 79280191c2fd7f24899bbd640003b5389d3c109c Mon Sep 17 00:00:00 2001 From: Henrique Carvalho Date: Fri, 7 Nov 2025 18:59:53 -0300 Subject: smb: client: fix cifs_pick_channel when channel needs reconnect cifs_pick_channel iterates candidate channels using cur. The reconnect-state test mistakenly used a different variable. This checked the wrong slot and would cause us to skip a healthy channel and to dispatch on one that needs reconnect, occasionally failing operations when a channel was down. Fix by replacing for the correct variable. Fixes: fc43a8ac396d ("cifs: cifs_pick_channel should try selecting active channels") Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N Signed-off-by: Henrique Carvalho Signed-off-by: Steve French --- fs/smb/client/transport.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/smb/client/transport.c b/fs/smb/client/transport.c index 051cd9dbba13..915cedde5d66 100644 --- a/fs/smb/client/transport.c +++ b/fs/smb/client/transport.c @@ -830,7 +830,7 @@ struct TCP_Server_Info *cifs_pick_channel(struct cifs_ses *ses) if (!server || server->terminate) continue; - if (CIFS_CHAN_NEEDS_RECONNECT(ses, i)) + if (CIFS_CHAN_NEEDS_RECONNECT(ses, cur)) continue; /* -- cgit From e8c73eb7db0a498cd4b22d2819e6ab1a6f506bd6 Mon Sep 17 00:00:00 2001 From: Edward Adam Davis Date: Fri, 7 Nov 2025 22:01:39 +0800 Subject: cifs: client: fix memory leak in smb3_fs_context_parse_param The user calls fsconfig twice, but when the program exits, free() only frees ctx->source for the second fsconfig, not the first. Regarding fc->source, there is no code in the fs context related to its memory reclamation. To fix this memory leak, release the source memory corresponding to ctx or fc before each parsing. syzbot reported: BUG: memory leak unreferenced object 0xffff888128afa360 (size 96): backtrace (crc 79c9c7ba): kstrdup+0x3c/0x80 mm/util.c:84 smb3_fs_context_parse_param+0x229b/0x36c0 fs/smb/client/fs_context.c:1444 BUG: memory leak unreferenced object 0xffff888112c7d900 (size 96): backtrace (crc 79c9c7ba): smb3_fs_context_fullpath+0x70/0x1b0 fs/smb/client/fs_context.c:629 smb3_fs_context_parse_param+0x2266/0x36c0 fs/smb/client/fs_context.c:1438 Reported-by: syzbot+72afd4c236e6bc3f4bac@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=72afd4c236e6bc3f4bac Cc: stable@vger.kernel.org Reviewed-by: Paulo Alcantara (Red Hat) Signed-off-by: Edward Adam Davis Signed-off-by: Steve French --- fs/smb/client/fs_context.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c index e60927b2a7c8..c2d5bb23040c 100644 --- a/fs/smb/client/fs_context.c +++ b/fs/smb/client/fs_context.c @@ -1435,12 +1435,14 @@ static int smb3_fs_context_parse_param(struct fs_context *fc, cifs_errorf(fc, "Unknown error parsing devname\n"); goto cifs_parse_mount_err; } + kfree(ctx->source); ctx->source = smb3_fs_context_fullpath(ctx, '/'); if (IS_ERR(ctx->source)) { ctx->source = NULL; cifs_errorf(fc, "OOM when copying UNC string\n"); goto cifs_parse_mount_err; } + kfree(fc->source); fc->source = kstrdup(ctx->source, GFP_KERNEL); if (fc->source == NULL) { cifs_errorf(fc, "OOM when copying UNC string\n"); -- cgit From aaf46c6a6df6052881c2e75cba65aeb6f1cfa88a Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Thu, 23 Oct 2025 22:21:02 -0700 Subject: tee: : - add ending ':' to some struct members as needed for kernel-doc - change struct name in kernel-doc to match the actual struct name (2x) - add a @params: kernel-doc entry multiple times Warning: tee.h:265 struct member 'ret_origin' not described in 'tee_ioctl_open_session_arg' Warning: tee.h:265 struct member 'num_params' not described in 'tee_ioctl_open_session_arg' Warning: tee.h:265 struct member 'params' not described in 'tee_ioctl_open_session_arg' Warning: tee.h:351 struct member 'num_params' not described in 'tee_iocl_supp_recv_arg' Warning: tee.h:351 struct member 'params' not described in 'tee_iocl_supp_recv_arg' Warning: tee.h:372 struct member 'num_params' not described in 'tee_iocl_supp_send_arg' Warning: tee.h:372 struct member 'params' not described in 'tee_iocl_supp_send_arg' Warning: tee.h:298: expecting prototype for struct tee_ioctl_invoke_func_arg. Prototype was for struct tee_ioctl_invoke_arg instead Warning: tee.h:473: expecting prototype for struct tee_ioctl_invoke_func_arg. Prototype was for struct tee_ioctl_object_invoke_arg instead Signed-off-by: Randy Dunlap Reviewed-by: Sumit Garg Signed-off-by: Jens Wiklander --- include/uapi/linux/tee.h | 23 ++++++++++++++--------- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/include/uapi/linux/tee.h b/include/uapi/linux/tee.h index 386ad36f1a0a..cab5cadca8ef 100644 --- a/include/uapi/linux/tee.h +++ b/include/uapi/linux/tee.h @@ -249,8 +249,9 @@ struct tee_ioctl_param { * @cancel_id: [in] Cancellation id, a unique value to identify this request * @session: [out] Session id * @ret: [out] return value - * @ret_origin [out] origin of the return value - * @num_params [in] number of parameters following this struct + * @ret_origin: [out] origin of the return value + * @num_params: [in] number of &struct tee_ioctl_param entries in @params + * @params: array of ioctl parameters */ struct tee_ioctl_open_session_arg { __u8 uuid[TEE_IOCTL_UUID_LEN]; @@ -276,14 +277,14 @@ struct tee_ioctl_open_session_arg { struct tee_ioctl_buf_data) /** - * struct tee_ioctl_invoke_func_arg - Invokes a function in a Trusted - * Application + * struct tee_ioctl_invoke_arg - Invokes a function in a Trusted Application * @func: [in] Trusted Application function, specific to the TA * @session: [in] Session id * @cancel_id: [in] Cancellation id, a unique value to identify this request * @ret: [out] return value - * @ret_origin [out] origin of the return value - * @num_params [in] number of parameters following this struct + * @ret_origin: [out] origin of the return value + * @num_params: [in] number of parameters following this struct + * @params: array of ioctl parameters */ struct tee_ioctl_invoke_arg { __u32 func; @@ -338,7 +339,8 @@ struct tee_ioctl_close_session_arg { /** * struct tee_iocl_supp_recv_arg - Receive a request for a supplicant function * @func: [in] supplicant function - * @num_params [in/out] number of parameters following this struct + * @num_params: [in/out] number of &struct tee_ioctl_param entries in @params + * @params: array of ioctl parameters * * @num_params is the number of params that tee-supplicant has room to * receive when input, @num_params is the number of actual params @@ -363,7 +365,8 @@ struct tee_iocl_supp_recv_arg { /** * struct tee_iocl_supp_send_arg - Send a response to a received request * @ret: [out] return value - * @num_params [in] number of parameters following this struct + * @num_params: [in] number of &struct tee_ioctl_param entries in @params + * @params: array of ioctl parameters */ struct tee_iocl_supp_send_arg { __u32 ret; @@ -454,11 +457,13 @@ struct tee_ioctl_shm_register_fd_data { */ /** - * struct tee_ioctl_invoke_func_arg - Invokes an object in a Trusted Application + * struct tee_ioctl_object_invoke_arg - Invokes an object in a + * Trusted Application * @id: [in] Object id * @op: [in] Object operation, specific to the object * @ret: [out] return value * @num_params: [in] number of parameters following this struct + * @params: array of ioctl parameters */ struct tee_ioctl_object_invoke_arg { __u64 id; -- cgit From 05a1fc5efdd8560f34a3af39c9cf1e1526cc3ddf Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Sun, 9 Nov 2025 10:12:07 +0100 Subject: ALSA: usb-audio: Fix potential overflow of PCM transfer buffer The PCM stream data in USB-audio driver is transferred over USB URB packet buffers, and each packet size is determined dynamically. The packet sizes are limited by some factors such as wMaxPacketSize USB descriptor. OTOH, in the current code, the actually used packet sizes are determined only by the rate and the PPS, which may be bigger than the size limit above. This results in a buffer overflow, as reported by syzbot. Basically when the limit is smaller than the calculated packet size, it implies that something is wrong, most likely a weird USB descriptor. So the best option would be just to return an error at the parameter setup time before doing any further operations. This patch introduces such a sanity check, and returns -EINVAL when the packet size is greater than maxpacksize. The comparison with ep->packsize[1] alone should suffice since it's always equal or greater than ep->packsize[0]. Reported-by: syzbot+bfd77469c8966de076f7@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=bfd77469c8966de076f7 Link: https://lore.kernel.org/690b6b46.050a0220.3d0d33.0054.GAE@google.com Cc: Lizhi Xu Cc: Link: https://patch.msgid.link/20251109091211.12739-1-tiwai@suse.de Signed-off-by: Takashi Iwai --- sound/usb/endpoint.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c index 880f5afcce60..cc15624ecaff 100644 --- a/sound/usb/endpoint.c +++ b/sound/usb/endpoint.c @@ -1362,6 +1362,11 @@ int snd_usb_endpoint_set_params(struct snd_usb_audio *chip, ep->sample_rem = ep->cur_rate % ep->pps; ep->packsize[0] = ep->cur_rate / ep->pps; ep->packsize[1] = (ep->cur_rate + (ep->pps - 1)) / ep->pps; + if (ep->packsize[1] > ep->maxpacksize) { + usb_audio_dbg(chip, "Too small maxpacksize %u for rate %u / pps %u\n", + ep->maxpacksize, ep->cur_rate, ep->pps); + return -EINVAL; + } /* calculate the frequency in 16.16 format */ ep->freqm = ep->freqn; -- cgit From 66e9feb03e7cf8983b1d0c540e2dad90d5146d48 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Sun, 9 Nov 2025 16:53:39 +0100 Subject: spi: Add TODO comment about ACPI GPIO setup Add a TODO comment that ideally the ACPI/gpiolib core code should take care of setting GPIO direction and/or bias according to ACPI GPIO resources. If this TODO gets implemented then the acpi_dev_gpio_irq_get() call in acpi_register_spi_device() can be dropped. Suggested-by: Andy Shevchenko Signed-off-by: Hans de Goede Reviewed-by: Andy Shevchenko Link: https://patch.msgid.link/20251109155340.26199-1-johannes.goede@oss.qualcomm.com Signed-off-by: Mark Brown --- drivers/spi/spi.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index 8588e8562220..e25df9990f82 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -2857,6 +2857,8 @@ static acpi_status acpi_register_spi_device(struct spi_controller *ctlr, * here too, because this call sets the GPIO direction and/or bias. * Setting these needs to be done even if there is no driver, in which * case spi_probe() will never get called. + * TODO: ideally the setup of the GPIO should be handled in a generic + * manner in the ACPI/gpiolib core code. */ if (spi->irq < 0) spi->irq = acpi_dev_gpio_irq_get(adev, 0); -- cgit From 576c930e5e7dcb937648490611a83f1bf0171048 Mon Sep 17 00:00:00 2001 From: Boris Brezillon Date: Fri, 7 Nov 2025 18:12:14 +0100 Subject: drm/panthor: Flush shmem writes before mapping buffers CPU-uncached The shmem layer zeroes out the new pages using cached mappings, and if we don't CPU-flush we might leave dirty cachelines behind, leading to potential data leaks and/or asynchronous buffer corruption when dirty cachelines are evicted. Fixes: 8a1cc07578bf ("drm/panthor: Add GEM logical block") Signed-off-by: Boris Brezillon Reviewed-by: Steven Price Reviewed-by: Liviu Dudau Signed-off-by: Steven Price Link: https://patch.msgid.link/20251107171214.1186299-1-boris.brezillon@collabora.com --- drivers/gpu/drm/panthor/panthor_gem.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c index 156c7a0b62a2..3f43686f0195 100644 --- a/drivers/gpu/drm/panthor/panthor_gem.c +++ b/drivers/gpu/drm/panthor/panthor_gem.c @@ -288,6 +288,23 @@ panthor_gem_create_with_handle(struct drm_file *file, panthor_gem_debugfs_set_usage_flags(bo, 0); + /* If this is a write-combine mapping, we query the sgt to force a CPU + * cache flush (dma_map_sgtable() is called when the sgt is created). + * This ensures the zero-ing is visible to any uncached mapping created + * by vmap/mmap. + * FIXME: Ideally this should be done when pages are allocated, not at + * BO creation time. + */ + if (shmem->map_wc) { + struct sg_table *sgt; + + sgt = drm_gem_shmem_get_pages_sgt(shmem); + if (IS_ERR(sgt)) { + ret = PTR_ERR(sgt); + goto out_put_gem; + } + } + /* * Allocate an id of idr table where the obj is registered * and handle has the id what user can see. @@ -296,6 +313,7 @@ panthor_gem_create_with_handle(struct drm_file *file, if (!ret) *size = bo->base.base.size; +out_put_gem: /* drop reference from allocate - handle holds it now. */ drm_gem_object_put(&shmem->base); -- cgit From 994dec10991b53beac3e16109d876ae363e8a329 Mon Sep 17 00:00:00 2001 From: Jani Nikula Date: Thu, 6 Nov 2025 22:00:00 +0200 Subject: drm/i915/psr: fix pipe to vblank conversion MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit First, we can't assume pipe == crtc index. If a pipe is fused off in between, it no longer holds. intel_crtc_for_pipe() is the only proper way to get from a pipe to the corresponding crtc. Second, drivers aren't supposed to access or index drm->vblank[] directly. There's drm_crtc_vblank_crtc() for this. Use both functions to fix the pipe to vblank conversion. Fixes: f02658c46cf7 ("drm/i915/psr: Add mechanism to notify PSR of pipe enable/disable") Cc: Jouni Högander Cc: stable@vger.kernel.org # v6.16+ Reviewed-by: Jouni Högander Link: https://patch.msgid.link/20251106200000.1455164-1-jani.nikula@intel.com Signed-off-by: Jani Nikula (cherry picked from commit 2750f6765d6974f7e163c5d540a96c8703f6d8dd) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/i915/display/intel_psr.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c index 10eb93a34cf2..d5e0a1e66944 100644 --- a/drivers/gpu/drm/i915/display/intel_psr.c +++ b/drivers/gpu/drm/i915/display/intel_psr.c @@ -888,7 +888,8 @@ static bool is_dc5_dc6_blocked(struct intel_dp *intel_dp) { struct intel_display *display = to_intel_display(intel_dp); u32 current_dc_state = intel_display_power_get_current_dc_state(display); - struct drm_vblank_crtc *vblank = &display->drm->vblank[intel_dp->psr.pipe]; + struct intel_crtc *crtc = intel_crtc_for_pipe(display, intel_dp->psr.pipe); + struct drm_vblank_crtc *vblank = drm_crtc_vblank_crtc(&crtc->base); return (current_dc_state != DC_STATE_EN_UPTO_DC5 && current_dc_state != DC_STATE_EN_UPTO_DC6) || -- cgit From 7aca00d950e782e66c34fbd045c9605eca343a36 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Sat, 18 Oct 2025 20:10:33 -0400 Subject: pnfs: Fix TLS logic in _nfs4_pnfs_v3_ds_connect() Don't try to add an RDMA transport to a client that is already marked as being a TCP/TLS transport. Fixes: 04a15263662a ("pnfs/flexfiles: connect to NFSv3 DS using TLS if MDS connection uses TLS") Signed-off-by: Trond Myklebust Signed-off-by: Anna Schumaker --- fs/nfs/pnfs_nfs.c | 32 ++++++++++++++++++-------------- 1 file changed, 18 insertions(+), 14 deletions(-) diff --git a/fs/nfs/pnfs_nfs.c b/fs/nfs/pnfs_nfs.c index 7b32afb29782..ff48056bf750 100644 --- a/fs/nfs/pnfs_nfs.c +++ b/fs/nfs/pnfs_nfs.c @@ -809,8 +809,11 @@ static int _nfs4_pnfs_v3_ds_connect(struct nfs_server *mds_srv, unsigned int retrans) { struct nfs_client *clp = ERR_PTR(-EIO); + struct nfs_client *mds_clp = mds_srv->nfs_client; + enum xprtsec_policies xprtsec_policy = mds_clp->cl_xprtsec.policy; struct nfs4_pnfs_ds_addr *da; unsigned long connect_timeout = timeo * (retrans + 1) * HZ / 10; + int ds_proto; int status = 0; dprintk("--> %s DS %s\n", __func__, ds->ds_remotestr); @@ -834,27 +837,28 @@ static int _nfs4_pnfs_v3_ds_connect(struct nfs_server *mds_srv, .xprtsec = clp->cl_xprtsec, }; - if (da->da_transport != clp->cl_proto && - clp->cl_proto != XPRT_TRANSPORT_TCP_TLS) - continue; - if (da->da_transport == XPRT_TRANSPORT_TCP && - mds_srv->nfs_client->cl_proto == XPRT_TRANSPORT_TCP_TLS) + if (xprt_args.ident == XPRT_TRANSPORT_TCP && + clp->cl_proto == XPRT_TRANSPORT_TCP_TLS) xprt_args.ident = XPRT_TRANSPORT_TCP_TLS; - if (da->da_addr.ss_family != clp->cl_addr.ss_family) + if (xprt_args.ident != clp->cl_proto) + continue; + if (xprt_args.dstaddr->sa_family != + clp->cl_addr.ss_family) continue; /* Add this address as an alias */ rpc_clnt_add_xprt(clp->cl_rpcclient, &xprt_args, - rpc_clnt_test_and_add_xprt, NULL); + rpc_clnt_test_and_add_xprt, NULL); continue; } - if (da->da_transport == XPRT_TRANSPORT_TCP && - mds_srv->nfs_client->cl_proto == XPRT_TRANSPORT_TCP_TLS) - da->da_transport = XPRT_TRANSPORT_TCP_TLS; - clp = get_v3_ds_connect(mds_srv, - &da->da_addr, - da->da_addrlen, da->da_transport, - timeo, retrans); + + ds_proto = da->da_transport; + if (ds_proto == XPRT_TRANSPORT_TCP && + xprtsec_policy != RPC_XPRTSEC_NONE) + ds_proto = XPRT_TRANSPORT_TCP_TLS; + + clp = get_v3_ds_connect(mds_srv, &da->da_addr, da->da_addrlen, + ds_proto, timeo, retrans); if (IS_ERR(clp)) continue; clp->cl_rpcclient->cl_softerr = 0; -- cgit From 28e19737e1570c7c71890547c2e43c3e0da79df9 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Sat, 18 Oct 2025 20:10:34 -0400 Subject: pnfs: Fix TLS logic in _nfs4_pnfs_v4_ds_connect() Don't try to add an RDMA transport to a client that is already marked as being a TCP/TLS transport. Fixes: a35518cae4b3 ("NFSv4.1/pnfs: fix NFS with TLS in pnfs") Signed-off-by: Trond Myklebust Signed-off-by: Anna Schumaker --- fs/nfs/pnfs_nfs.c | 34 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/fs/nfs/pnfs_nfs.c b/fs/nfs/pnfs_nfs.c index ff48056bf750..9976cc16b689 100644 --- a/fs/nfs/pnfs_nfs.c +++ b/fs/nfs/pnfs_nfs.c @@ -884,7 +884,10 @@ static int _nfs4_pnfs_v4_ds_connect(struct nfs_server *mds_srv, u32 minor_version) { struct nfs_client *clp = ERR_PTR(-EIO); + struct nfs_client *mds_clp = mds_srv->nfs_client; + enum xprtsec_policies xprtsec_policy = mds_clp->cl_xprtsec.policy; struct nfs4_pnfs_ds_addr *da; + int ds_proto; int status = 0; dprintk("--> %s DS %s\n", __func__, ds->ds_remotestr); @@ -912,12 +915,8 @@ static int _nfs4_pnfs_v4_ds_connect(struct nfs_server *mds_srv, .data = &xprtdata, }; - if (da->da_transport != clp->cl_proto && - clp->cl_proto != XPRT_TRANSPORT_TCP_TLS) - continue; - if (da->da_transport == XPRT_TRANSPORT_TCP && - mds_srv->nfs_client->cl_proto == - XPRT_TRANSPORT_TCP_TLS) { + if (xprt_args.ident == XPRT_TRANSPORT_TCP && + clp->cl_proto == XPRT_TRANSPORT_TCP_TLS) { struct sockaddr *addr = (struct sockaddr *)&da->da_addr; struct sockaddr_in *sin = @@ -948,7 +947,10 @@ static int _nfs4_pnfs_v4_ds_connect(struct nfs_server *mds_srv, xprt_args.ident = XPRT_TRANSPORT_TCP_TLS; xprt_args.servername = servername; } - if (da->da_addr.ss_family != clp->cl_addr.ss_family) + if (xprt_args.ident != clp->cl_proto) + continue; + if (xprt_args.dstaddr->sa_family != + clp->cl_addr.ss_family) continue; /** @@ -962,15 +964,14 @@ static int _nfs4_pnfs_v4_ds_connect(struct nfs_server *mds_srv, if (xprtdata.cred) put_cred(xprtdata.cred); } else { - if (da->da_transport == XPRT_TRANSPORT_TCP && - mds_srv->nfs_client->cl_proto == - XPRT_TRANSPORT_TCP_TLS) - da->da_transport = XPRT_TRANSPORT_TCP_TLS; - clp = nfs4_set_ds_client(mds_srv, - &da->da_addr, - da->da_addrlen, - da->da_transport, timeo, - retrans, minor_version); + ds_proto = da->da_transport; + if (ds_proto == XPRT_TRANSPORT_TCP && + xprtsec_policy != RPC_XPRTSEC_NONE) + ds_proto = XPRT_TRANSPORT_TCP_TLS; + + clp = nfs4_set_ds_client(mds_srv, &da->da_addr, + da->da_addrlen, ds_proto, + timeo, retrans, minor_version); if (IS_ERR(clp)) continue; @@ -981,7 +982,6 @@ static int _nfs4_pnfs_v4_ds_connect(struct nfs_server *mds_srv, clp = ERR_PTR(-EIO); continue; } - } } -- cgit From 8ab523ce78d4ca13add6b4ecbacff0f84c274603 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Sat, 18 Oct 2025 20:10:35 -0400 Subject: pnfs: Set transport security policy to RPC_XPRTSEC_NONE unless using TLS The default setting for the transport security policy must be RPC_XPRTSEC_NONE, when using a TCP or RDMA connection without TLS. Conversely, when using TLS, the security policy needs to be set. Fixes: 6c0a8c5fcf71 ("NFS: Have struct nfs_client carry a TLS policy field") Signed-off-by: Trond Myklebust Reviewed-by: Chuck Lever Signed-off-by: Anna Schumaker --- fs/nfs/nfs3client.c | 14 ++++++++++++-- fs/nfs/nfs4client.c | 14 ++++++++++++-- 2 files changed, 24 insertions(+), 4 deletions(-) diff --git a/fs/nfs/nfs3client.c b/fs/nfs/nfs3client.c index 0d7310c1ee0c..5d97c1d38bb6 100644 --- a/fs/nfs/nfs3client.c +++ b/fs/nfs/nfs3client.c @@ -2,6 +2,7 @@ #include #include #include +#include #include "internal.h" #include "nfs3_fs.h" #include "netns.h" @@ -98,7 +99,11 @@ struct nfs_client *nfs3_set_ds_client(struct nfs_server *mds_srv, .net = mds_clp->cl_net, .timeparms = &ds_timeout, .cred = mds_srv->cred, - .xprtsec = mds_clp->cl_xprtsec, + .xprtsec = { + .policy = RPC_XPRTSEC_NONE, + .cert_serial = TLS_NO_CERT, + .privkey_serial = TLS_NO_PRIVKEY, + }, .connect_timeout = connect_timeout, .reconnect_timeout = connect_timeout, }; @@ -111,9 +116,14 @@ struct nfs_client *nfs3_set_ds_client(struct nfs_server *mds_srv, cl_init.hostname = buf; switch (ds_proto) { + case XPRT_TRANSPORT_TCP_TLS: + if (mds_clp->cl_xprtsec.policy != RPC_XPRTSEC_NONE) + cl_init.xprtsec = mds_clp->cl_xprtsec; + else + ds_proto = XPRT_TRANSPORT_TCP; + fallthrough; case XPRT_TRANSPORT_RDMA: case XPRT_TRANSPORT_TCP: - case XPRT_TRANSPORT_TCP_TLS: if (mds_clp->cl_nconnect > 1) cl_init.nconnect = mds_clp->cl_nconnect; } diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c index 5998d6bd8a4f..3a4baed993c9 100644 --- a/fs/nfs/nfs4client.c +++ b/fs/nfs/nfs4client.c @@ -11,6 +11,7 @@ #include #include #include +#include #include "internal.h" #include "callback.h" #include "delegation.h" @@ -983,7 +984,11 @@ struct nfs_client *nfs4_set_ds_client(struct nfs_server *mds_srv, .net = mds_clp->cl_net, .timeparms = &ds_timeout, .cred = mds_srv->cred, - .xprtsec = mds_srv->nfs_client->cl_xprtsec, + .xprtsec = { + .policy = RPC_XPRTSEC_NONE, + .cert_serial = TLS_NO_CERT, + .privkey_serial = TLS_NO_PRIVKEY, + }, }; char buf[INET6_ADDRSTRLEN + 1]; @@ -992,9 +997,14 @@ struct nfs_client *nfs4_set_ds_client(struct nfs_server *mds_srv, cl_init.hostname = buf; switch (ds_proto) { + case XPRT_TRANSPORT_TCP_TLS: + if (mds_srv->nfs_client->cl_xprtsec.policy != RPC_XPRTSEC_NONE) + cl_init.xprtsec = mds_srv->nfs_client->cl_xprtsec; + else + ds_proto = XPRT_TRANSPORT_TCP; + fallthrough; case XPRT_TRANSPORT_RDMA: case XPRT_TRANSPORT_TCP: - case XPRT_TRANSPORT_TCP_TLS: if (mds_clp->cl_nconnect > 1) { cl_init.nconnect = mds_clp->cl_nconnect; cl_init.max_connect = NFS_MAX_TRANSPORTS; -- cgit From fb2cba0854a7f315c8100a807a6959b99d72479e Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Sat, 18 Oct 2025 20:10:36 -0400 Subject: NFS: Check the TLS certificate fields in nfs_match_client() If the TLS security policy is of type RPC_XPRTSEC_TLS_X509, then the cert_serial and privkey_serial fields need to match as well since they define the client's identity, as presented to the server. Fixes: 90c9550a8d65 ("NFS: support the kernel keyring for TLS") Signed-off-by: Trond Myklebust Signed-off-by: Anna Schumaker --- fs/nfs/client.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 4e3dcc157a83..54699299d5b1 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -338,6 +338,14 @@ again: /* Match the xprt security policy */ if (clp->cl_xprtsec.policy != data->xprtsec.policy) continue; + if (clp->cl_xprtsec.policy == RPC_XPRTSEC_TLS_X509) { + if (clp->cl_xprtsec.cert_serial != + data->xprtsec.cert_serial) + continue; + if (clp->cl_xprtsec.privkey_serial != + data->xprtsec.privkey_serial) + continue; + } refcount_inc(&clp->cl_count); return clp; -- cgit From 51a491f2708de79da76791523d40926921823b7e Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Mon, 27 Oct 2025 09:08:31 -0400 Subject: nfs/localio: remove unecessary ENOTBLK handling in DIO WRITE support Each filesystem is meant to fallback to retrying DIO in terms buffered IO when it might encounter -ENOTBLK when issuing DIO (which can happen if the VFS cannot invalidate the page cache). So NFS doesn't need special handling for -ENOTBLK. Also, explicitly initialize a couple DIO related iocb members rather than simply rely on data structure zeroing. Fixes: c817248fc831 ("nfs/localio: add proper O_DIRECT support for READ and WRITE") Reported-by: Christoph Hellwig Signed-off-by: Mike Snitzer Signed-off-by: Anna Schumaker --- fs/nfs/localio.c | 13 +++---------- 1 file changed, 3 insertions(+), 10 deletions(-) diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c index 2c0455e91571..0383d6eb2f46 100644 --- a/fs/nfs/localio.c +++ b/fs/nfs/localio.c @@ -315,6 +315,7 @@ nfs_local_iocb_alloc(struct nfs_pgio_header *hdr, iocb->hdr = hdr; iocb->kiocb.ki_flags &= ~IOCB_APPEND; + iocb->kiocb.ki_complete = NULL; iocb->aio_complete_work = NULL; iocb->end_iter_index = -1; @@ -484,6 +485,7 @@ nfs_local_iters_init(struct nfs_local_kiocb *iocb, int rw) /* Use buffered IO */ iocb->offset[0] = hdr->args.offset; iov_iter_bvec(&iocb->iters[0], rw, iocb->bvec, v, len); + iocb->iter_is_dio_aligned[0] = false; iocb->n_iters = 1; } @@ -803,7 +805,7 @@ static void nfs_local_call_write(struct work_struct *work) iocb->kiocb.ki_complete = nfs_local_write_aio_complete; iocb->aio_complete_work = nfs_local_write_aio_complete_work; } -retry: + iocb->kiocb.ki_pos = iocb->offset[i]; status = filp->f_op->write_iter(&iocb->kiocb, &iocb->iters[i]); if (status != -EIOCBQUEUED) { @@ -823,15 +825,6 @@ retry: nfs_local_pgio_done(iocb->hdr, status); break; } - } else if (unlikely(status == -ENOTBLK && - (iocb->kiocb.ki_flags & IOCB_DIRECT))) { - /* VFS will return -ENOTBLK if DIO WRITE fails to - * invalidate the page cache. Retry using buffered IO. - */ - iocb->kiocb.ki_flags &= ~IOCB_DIRECT; - iocb->kiocb.ki_complete = NULL; - iocb->aio_complete_work = NULL; - goto retry; } nfs_local_pgio_done(iocb->hdr, status); if (iocb->hdr->task.tk_status) -- cgit From f2060bdc21d70f3d8a4753a9fd3b0b02cb48c0bc Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Mon, 27 Oct 2025 09:08:32 -0400 Subject: nfs/localio: add refcounting for each iocb IO associated with NFS pgio header Improve completion handling of as many as 3 IOs associated with each misaligned DIO by using a atomic_t to track completion of each IO. Update nfs_local_pgio_done() to use precise atomic_t accounting for remaining iov_iter (up to 3) associated with each iocb, so that each NFS LOCALIO pgio header is only released after all IOs have completed. But also allow early return if/when a short read or write occurs. Fixes reported BUG: KASAN: slab-use-after-free in nfs_local_call_read: https://lore.kernel.org/linux-nfs/aPSvi5Yr2lGOh5Jh@dell-per750-06-vm-07.rhts.eng.pek2.redhat.com/ Reported-by: Yongcheng Yang Fixes: c817248fc831 ("nfs/localio: add proper O_DIRECT support for READ and WRITE") Signed-off-by: Mike Snitzer Signed-off-by: Anna Schumaker --- fs/nfs/localio.c | 110 +++++++++++++++++++++++++++++++++---------------------- 1 file changed, 67 insertions(+), 43 deletions(-) diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c index 0383d6eb2f46..647fa19b0479 100644 --- a/fs/nfs/localio.c +++ b/fs/nfs/localio.c @@ -42,7 +42,7 @@ struct nfs_local_kiocb { /* Begin mostly DIO-specific members */ size_t end_len; short int end_iter_index; - short int n_iters; + atomic_t n_iters; bool iter_is_dio_aligned[NFSLOCAL_MAX_IOS]; loff_t offset[NFSLOCAL_MAX_IOS] ____cacheline_aligned; struct iov_iter iters[NFSLOCAL_MAX_IOS]; @@ -407,6 +407,7 @@ nfs_local_iters_setup_dio(struct nfs_local_kiocb *iocb, int rw, iters[n_iters].count = local_dio->start_len; iocb->offset[n_iters] = iocb->hdr->args.offset; iocb->iter_is_dio_aligned[n_iters] = false; + atomic_inc(&iocb->n_iters); ++n_iters; } @@ -425,6 +426,7 @@ nfs_local_iters_setup_dio(struct nfs_local_kiocb *iocb, int rw, /* Save index and length of end */ iocb->end_iter_index = n_iters; iocb->end_len = local_dio->end_len; + atomic_inc(&iocb->n_iters); ++n_iters; } @@ -448,7 +450,6 @@ nfs_local_iters_setup_dio(struct nfs_local_kiocb *iocb, int rw, } ++n_iters; - iocb->n_iters = n_iters; return n_iters; } @@ -474,6 +475,12 @@ nfs_local_iters_init(struct nfs_local_kiocb *iocb, int rw) } len = hdr->args.count - total; + /* + * For each iocb, iocb->n_iter is always at least 1 and we always + * end io after first nfs_local_pgio_done call unless misaligned DIO. + */ + atomic_set(&iocb->n_iters, 1); + if (test_bit(NFS_IOHDR_ODIRECT, &hdr->flags)) { struct nfs_local_dio local_dio; @@ -486,7 +493,6 @@ nfs_local_iters_init(struct nfs_local_kiocb *iocb, int rw) iocb->offset[0] = hdr->args.offset; iov_iter_bvec(&iocb->iters[0], rw, iocb->bvec, v, len); iocb->iter_is_dio_aligned[0] = false; - iocb->n_iters = 1; } static void @@ -506,9 +512,11 @@ nfs_local_pgio_init(struct nfs_pgio_header *hdr, hdr->task.tk_start = ktime_get(); } -static void -nfs_local_pgio_done(struct nfs_pgio_header *hdr, long status) +static bool +nfs_local_pgio_done(struct nfs_local_kiocb *iocb, long status, bool force) { + struct nfs_pgio_header *hdr = iocb->hdr; + /* Must handle partial completions */ if (status >= 0) { hdr->res.count += status; @@ -519,6 +527,12 @@ nfs_local_pgio_done(struct nfs_pgio_header *hdr, long status) hdr->res.op_status = nfs_localio_errno_to_nfs4_stat(status); hdr->task.tk_status = status; } + + if (force) + return true; + + BUG_ON(atomic_read(&iocb->n_iters) <= 0); + return atomic_dec_and_test(&iocb->n_iters); } static void @@ -549,11 +563,11 @@ static inline void nfs_local_pgio_aio_complete(struct nfs_local_kiocb *iocb) queue_work(nfsiod_workqueue, &iocb->work); } -static void -nfs_local_read_done(struct nfs_local_kiocb *iocb, long status) +static void nfs_local_read_done(struct nfs_local_kiocb *iocb) { struct nfs_pgio_header *hdr = iocb->hdr; struct file *filp = iocb->kiocb.ki_filp; + long status = hdr->task.tk_status; if ((iocb->kiocb.ki_flags & IOCB_DIRECT) && status == -EINVAL) { /* Underlying FS will return -EINVAL if misaligned DIO is attempted. */ @@ -574,12 +588,18 @@ nfs_local_read_done(struct nfs_local_kiocb *iocb, long status) status > 0 ? status : 0, hdr->res.eof); } +static inline void nfs_local_read_iocb_done(struct nfs_local_kiocb *iocb) +{ + nfs_local_read_done(iocb); + nfs_local_pgio_release(iocb); +} + static void nfs_local_read_aio_complete_work(struct work_struct *work) { struct nfs_local_kiocb *iocb = container_of(work, struct nfs_local_kiocb, work); - nfs_local_pgio_release(iocb); + nfs_local_read_iocb_done(iocb); } static void nfs_local_read_aio_complete(struct kiocb *kiocb, long ret) @@ -587,8 +607,10 @@ static void nfs_local_read_aio_complete(struct kiocb *kiocb, long ret) struct nfs_local_kiocb *iocb = container_of(kiocb, struct nfs_local_kiocb, kiocb); - nfs_local_pgio_done(iocb->hdr, ret); - nfs_local_read_done(iocb, ret); + /* AIO completion of DIO read should always be last to complete */ + if (unlikely(!nfs_local_pgio_done(iocb, ret, false))) + return; + nfs_local_pgio_aio_complete(iocb); /* Calls nfs_local_read_aio_complete_work */ } @@ -599,10 +621,13 @@ static void nfs_local_call_read(struct work_struct *work) struct file *filp = iocb->kiocb.ki_filp; const struct cred *save_cred; ssize_t status; + int n_iters; save_cred = override_creds(filp->f_cred); - for (int i = 0; i < iocb->n_iters ; i++) { + n_iters = atomic_read(&iocb->n_iters); + for (int i = 0; i < n_iters ; i++) { + /* DIO-aligned middle is always issued last with AIO completion */ if (iocb->iter_is_dio_aligned[i]) { iocb->kiocb.ki_flags |= IOCB_DIRECT; iocb->kiocb.ki_complete = nfs_local_read_aio_complete; @@ -612,18 +637,14 @@ static void nfs_local_call_read(struct work_struct *work) iocb->kiocb.ki_pos = iocb->offset[i]; status = filp->f_op->read_iter(&iocb->kiocb, &iocb->iters[i]); if (status != -EIOCBQUEUED) { - nfs_local_pgio_done(iocb->hdr, status); - if (iocb->hdr->task.tk_status) + if (nfs_local_pgio_done(iocb, status, false)) { + nfs_local_read_iocb_done(iocb); break; + } } } revert_creds(save_cred); - - if (status != -EIOCBQUEUED) { - nfs_local_read_done(iocb, status); - nfs_local_pgio_release(iocb); - } } static int @@ -738,11 +759,10 @@ static void nfs_local_vfs_getattr(struct nfs_local_kiocb *iocb) fattr->du.nfs3.used = stat.blocks << 9; } -static void -nfs_local_write_done(struct nfs_local_kiocb *iocb, long status) +static void nfs_local_write_done(struct nfs_local_kiocb *iocb) { struct nfs_pgio_header *hdr = iocb->hdr; - struct inode *inode = hdr->inode; + long status = hdr->task.tk_status; dprintk("%s: wrote %ld bytes.\n", __func__, status > 0 ? status : 0); @@ -761,10 +781,17 @@ nfs_local_write_done(struct nfs_local_kiocb *iocb, long status) nfs_set_pgio_error(hdr, -ENOSPC, hdr->args.offset); status = -ENOSPC; /* record -ENOSPC in terms of nfs_local_pgio_done */ - nfs_local_pgio_done(hdr, status); + (void) nfs_local_pgio_done(iocb, status, true); } if (hdr->task.tk_status < 0) - nfs_reset_boot_verifier(inode); + nfs_reset_boot_verifier(hdr->inode); +} + +static inline void nfs_local_write_iocb_done(struct nfs_local_kiocb *iocb) +{ + nfs_local_write_done(iocb); + nfs_local_vfs_getattr(iocb); + nfs_local_pgio_release(iocb); } static void nfs_local_write_aio_complete_work(struct work_struct *work) @@ -772,8 +799,7 @@ static void nfs_local_write_aio_complete_work(struct work_struct *work) struct nfs_local_kiocb *iocb = container_of(work, struct nfs_local_kiocb, work); - nfs_local_vfs_getattr(iocb); - nfs_local_pgio_release(iocb); + nfs_local_write_iocb_done(iocb); } static void nfs_local_write_aio_complete(struct kiocb *kiocb, long ret) @@ -781,8 +807,10 @@ static void nfs_local_write_aio_complete(struct kiocb *kiocb, long ret) struct nfs_local_kiocb *iocb = container_of(kiocb, struct nfs_local_kiocb, kiocb); - nfs_local_pgio_done(iocb->hdr, ret); - nfs_local_write_done(iocb, ret); + /* AIO completion of DIO write should always be last to complete */ + if (unlikely(!nfs_local_pgio_done(iocb, ret, false))) + return; + nfs_local_pgio_aio_complete(iocb); /* Calls nfs_local_write_aio_complete_work */ } @@ -793,13 +821,17 @@ static void nfs_local_call_write(struct work_struct *work) struct file *filp = iocb->kiocb.ki_filp; unsigned long old_flags = current->flags; const struct cred *save_cred; + bool force_done = false; ssize_t status; + int n_iters; current->flags |= PF_LOCAL_THROTTLE | PF_MEMALLOC_NOIO; save_cred = override_creds(filp->f_cred); file_start_write(filp); - for (int i = 0; i < iocb->n_iters ; i++) { + n_iters = atomic_read(&iocb->n_iters); + for (int i = 0; i < n_iters ; i++) { + /* DIO-aligned middle is always issued last with AIO completion */ if (iocb->iter_is_dio_aligned[i]) { iocb->kiocb.ki_flags |= IOCB_DIRECT; iocb->kiocb.ki_complete = nfs_local_write_aio_complete; @@ -812,35 +844,27 @@ static void nfs_local_call_write(struct work_struct *work) if (unlikely(status >= 0 && status < iocb->iters[i].count)) { /* partial write */ if (i == iocb->end_iter_index) { - /* Must not account partial end, otherwise, due - * to end being issued before middle: the partial + /* Must not account DIO partial end, otherwise (due + * to end being issued before middle): the partial * write accounting in nfs_local_write_done() * would incorrectly advance hdr->args.offset */ status = 0; } else { - /* Partial write at start or buffered middle, - * exit early. - */ - nfs_local_pgio_done(iocb->hdr, status); - break; + /* Partial write at start or middle, force done */ + force_done = true; } } - nfs_local_pgio_done(iocb->hdr, status); - if (iocb->hdr->task.tk_status) + if (nfs_local_pgio_done(iocb, status, force_done)) { + nfs_local_write_iocb_done(iocb); break; + } } } file_end_write(filp); revert_creds(save_cred); current->flags = old_flags; - - if (status != -EIOCBQUEUED) { - nfs_local_write_done(iocb, status); - nfs_local_vfs_getattr(iocb); - nfs_local_pgio_release(iocb); - } } static int -- cgit From d0497dd27452c79a48414df813a16cd12d274b3b Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Mon, 27 Oct 2025 09:08:33 -0400 Subject: nfs/localio: backfill missing partial read support for misaligned DIO Misaligned DIO read can be split into 3 IOs, must handle potential for short read from each component IO (follows same pattern used for handling partial writes, except upper layer read code handles advancing offset before retry). Fixes: c817248fc831 ("nfs/localio: add proper O_DIRECT support for READ and WRITE") Signed-off-by: Mike Snitzer Signed-off-by: Anna Schumaker --- fs/nfs/localio.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c index 647fa19b0479..9c205f8b5e59 100644 --- a/fs/nfs/localio.c +++ b/fs/nfs/localio.c @@ -414,7 +414,7 @@ nfs_local_iters_setup_dio(struct nfs_local_kiocb *iocb, int rw, /* Setup misaligned end? * If so, the end is purposely setup to be issued using buffered IO * before the middle (which will use DIO, if DIO-aligned, with AIO). - * This creates problems if/when the end results in a partial write. + * This creates problems if/when the end results in short read or write. * So must save index and length of end to handle this corner case. */ if (local_dio->end_len) { @@ -580,8 +580,9 @@ static void nfs_local_read_done(struct nfs_local_kiocb *iocb) */ hdr->res.replen = 0; - if (hdr->res.count != hdr->args.count || - hdr->args.offset + hdr->res.count >= i_size_read(file_inode(filp))) + /* nfs_readpage_result() handles short read */ + + if (hdr->args.offset + hdr->res.count >= i_size_read(file_inode(filp))) hdr->res.eof = true; dprintk("%s: read %ld bytes eof %d.\n", __func__, @@ -620,6 +621,7 @@ static void nfs_local_call_read(struct work_struct *work) container_of(work, struct nfs_local_kiocb, work); struct file *filp = iocb->kiocb.ki_filp; const struct cred *save_cred; + bool force_done = false; ssize_t status; int n_iters; @@ -637,7 +639,21 @@ static void nfs_local_call_read(struct work_struct *work) iocb->kiocb.ki_pos = iocb->offset[i]; status = filp->f_op->read_iter(&iocb->kiocb, &iocb->iters[i]); if (status != -EIOCBQUEUED) { - if (nfs_local_pgio_done(iocb, status, false)) { + if (unlikely(status >= 0 && status < iocb->iters[i].count)) { + /* partial read */ + if (i == iocb->end_iter_index) { + /* Must not account DIO partial end, otherwise (due + * to end being issued before middle): the partial + * read accounting in nfs_local_read_done() + * would incorrectly advance hdr->args.offset + */ + status = 0; + } else { + /* Partial read at start or middle, force done */ + force_done = true; + } + } + if (nfs_local_pgio_done(iocb, status, force_done)) { nfs_local_read_iocb_done(iocb); break; } -- cgit From d32ddfeb559342e89a4d06b1df4e7e5e96df3762 Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Mon, 27 Oct 2025 13:52:28 -0400 Subject: nfs/localio: Ensure DIO WRITE's IO on stable storage upon completion LOCALIO's misaligned DIO WRITE support requires synchronous IO for any misaligned head and/or tail that are issued using buffered IO. In addition, it is important that the O_DIRECT middle be on stable storage upon its completion via AIO. Otherwise, a misaligned DIO WRITE could mix buffered IO for the head/tail and direct IO for the DIO-aligned middle -- which could lead to problems associated with deferred writes to stable storage (such as out of order partial completions causing incorrect advancement of the file's offset, etc). Fixes: c817248fc831 ("nfs/localio: add proper O_DIRECT support for READ and WRITE") Signed-off-by: Mike Snitzer Signed-off-by: Anna Schumaker --- fs/nfs/localio.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c index 9c205f8b5e59..839dbda0b370 100644 --- a/fs/nfs/localio.c +++ b/fs/nfs/localio.c @@ -485,8 +485,12 @@ nfs_local_iters_init(struct nfs_local_kiocb *iocb, int rw) struct nfs_local_dio local_dio; if (nfs_is_local_dio_possible(iocb, rw, len, &local_dio) && - nfs_local_iters_setup_dio(iocb, rw, v, len, &local_dio) != 0) + nfs_local_iters_setup_dio(iocb, rw, v, len, &local_dio) != 0) { + /* Ensure DIO WRITE's IO on stable storage upon completion */ + if (rw == ITER_SOURCE) + iocb->kiocb.ki_flags |= IOCB_DSYNC|IOCB_SYNC; return; /* is DIO-aligned */ + } } /* Use buffered IO */ -- cgit From eb2d6774cc0d9d6ab8f924825695a85c14b2e0c2 Mon Sep 17 00:00:00 2001 From: Niranjan H Y Date: Mon, 10 Nov 2025 20:56:46 +0530 Subject: ASoC: SDCA: bug fix while parsing mipi-sdca-control-cn-list "struct sdca_control" declares "values" field as integer array. But the memory allocated to it is of char array. This causes crash for sdca_parse_function API. This patch addresses the issue by allocating correct data size. Signed-off-by: Niranjan H Y Reviewed-by: Charles Keepax Link: https://patch.msgid.link/20251110152646.192-1-niranjan.hy@ti.com Signed-off-by: Mark Brown --- sound/soc/sdca/sdca_functions.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/sound/soc/sdca/sdca_functions.c b/sound/soc/sdca/sdca_functions.c index 13f68f7b6dd6..0ccb6775f4de 100644 --- a/sound/soc/sdca/sdca_functions.c +++ b/sound/soc/sdca/sdca_functions.c @@ -894,7 +894,8 @@ static int find_sdca_entity_control(struct device *dev, struct sdca_entity *enti return ret; } - control->values = devm_kzalloc(dev, hweight64(control->cn_list), GFP_KERNEL); + control->values = devm_kcalloc(dev, hweight64(control->cn_list), + sizeof(int), GFP_KERNEL); if (!control->values) return -ENOMEM; -- cgit From 0b2f7be548006b0651e1e8320790f49723265cbc Mon Sep 17 00:00:00 2001 From: Nitin Gote Date: Mon, 27 Oct 2025 14:56:43 +0530 Subject: drm/xe/xe3: Add WA_14024681466 for Xe3_LPG Apply WA_14024681466 to Xe3_LPG graphics IP versions from 30.00 to 30.05. v2: (Matthew Roper) - Remove stepping filter as workaround applies to all steppings. - Add an engine class filter so it only applies to the RENDER engine. Signed-off-by: Nitin Gote Link: https://patch.msgid.link/20251027092643.335904-1-nitin.r.gote@intel.com Reviewed-by: Matt Roper Signed-off-by: Matt Roper (cherry picked from commit 071089a69e199bd810ff31c4c933bd528e502743) Cc: stable@vger.kernel.org # v6.16+ Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/regs/xe_gt_regs.h | 1 + drivers/gpu/drm/xe/xe_wa.c | 4 ++++ 2 files changed, 5 insertions(+) diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h index 51f2a03847f9..f680c8b8f258 100644 --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h @@ -168,6 +168,7 @@ #define XEHP_SLICE_COMMON_ECO_CHICKEN1 XE_REG_MCR(0x731c, XE_REG_OPTION_MASKED) #define MSC_MSAA_REODER_BUF_BYPASS_DISABLE REG_BIT(14) +#define FAST_CLEAR_VALIGN_FIX REG_BIT(13) #define XE2LPM_CCCHKNREG1 XE_REG(0x82a8) diff --git a/drivers/gpu/drm/xe/xe_wa.c b/drivers/gpu/drm/xe/xe_wa.c index cd03891654a1..c33719e2e0df 100644 --- a/drivers/gpu/drm/xe/xe_wa.c +++ b/drivers/gpu/drm/xe/xe_wa.c @@ -916,6 +916,10 @@ static const struct xe_rtp_entry_sr lrc_was[] = { XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, 3003), ENGINE_CLASS(RENDER)), XE_RTP_ACTIONS(SET(COMMON_SLICE_CHICKEN4, SBE_PUSH_CONSTANT_BEHIND_FIX_ENABLE)) }, + { XE_RTP_NAME("14024681466"), + XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, 3005), ENGINE_CLASS(RENDER)), + XE_RTP_ACTIONS(SET(XEHP_SLICE_COMMON_ECO_CHICKEN1, FAST_CLEAR_VALIGN_FIX)) + }, }; static __maybe_unused const struct xe_rtp_entry oob_was[] = { -- cgit From fa3376319b83ba8b7fd55f2c1a268dcbf9d6eedc Mon Sep 17 00:00:00 2001 From: Tangudu Tilak Tirumalesh Date: Thu, 30 Oct 2025 21:16:26 +0530 Subject: drm/xe/xe3: Extend wa_14023061436 Extend wa_14023061436 to Graphics Versions 30.03, 30.04 and 30.05. Signed-off-by: Tangudu Tilak Tirumalesh Reviewed-by: Matt Roper Link: https://patch.msgid.link/20251030154626.3124565-1-tilak.tirumalesh.tangudu@intel.com Signed-off-by: Matt Roper (cherry picked from commit 0dd656d06f50ae4cedf160634cf13fd9e0944cf7) Cc: stable@vger.kernel.org # v6.17+ Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_wa.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_wa.c b/drivers/gpu/drm/xe/xe_wa.c index c33719e2e0df..917b97317d11 100644 --- a/drivers/gpu/drm/xe/xe_wa.c +++ b/drivers/gpu/drm/xe/xe_wa.c @@ -679,6 +679,8 @@ static const struct xe_rtp_entry_sr engine_was[] = { }, { XE_RTP_NAME("14023061436"), XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, 3001), + FUNC(xe_rtp_match_first_render_or_compute), OR, + GRAPHICS_VERSION_RANGE(3003, 3005), FUNC(xe_rtp_match_first_render_or_compute)), XE_RTP_ACTIONS(SET(TDL_CHICKEN, QID_WAIT_FOR_THREAD_NOT_RUN_DISABLE)) }, -- cgit From 240372edaf854c9136f5ead45f2d8cd9496a9cb3 Mon Sep 17 00:00:00 2001 From: Nitin Gote Date: Thu, 6 Nov 2025 15:35:17 +0530 Subject: drm/xe/xe3lpg: Extend Wa_15016589081 for xe3lpg Wa_15016589081 applies to Xe3_LPG renderCS Signed-off-by: Nitin Gote Link: https://patch.msgid.link/20251106100516.318863-2-nitin.r.gote@intel.com Signed-off-by: Matt Roper (cherry picked from commit 715974499a2199bd199fb4630501f55545342ea4) Cc: stable@vger.kernel.org # v6.16+ Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_wa.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_wa.c b/drivers/gpu/drm/xe/xe_wa.c index 917b97317d11..3cf30718b200 100644 --- a/drivers/gpu/drm/xe/xe_wa.c +++ b/drivers/gpu/drm/xe/xe_wa.c @@ -922,6 +922,11 @@ static const struct xe_rtp_entry_sr lrc_was[] = { XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, 3005), ENGINE_CLASS(RENDER)), XE_RTP_ACTIONS(SET(XEHP_SLICE_COMMON_ECO_CHICKEN1, FAST_CLEAR_VALIGN_FIX)) }, + { XE_RTP_NAME("15016589081"), + XE_RTP_RULES(GRAPHICS_VERSION(3000), GRAPHICS_STEP(A0, B0), + ENGINE_CLASS(RENDER)), + XE_RTP_ACTIONS(SET(CHICKEN_RASTER_1, DIS_CLIP_NEGATIVE_BOUNDING_BOX)) + }, }; static __maybe_unused const struct xe_rtp_entry oob_was[] = { -- cgit From 6a218b9c3183ed19d5703130025282cf20463d87 Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Wed, 5 Nov 2025 22:03:04 -0500 Subject: nfs/localio: do not issue misaligned DIO out-of-order From https://lore.kernel.org/linux-nfs/aQHASIumLJyOoZGH@infradead.org/ On Wed, Oct 29, 2025 at 12:20:40AM -0700, Christoph Hellwig wrote: > On Mon, Oct 27, 2025 at 12:18:30PM -0400, Mike Snitzer wrote: > > LOCALIO's misaligned DIO will issue head/tail followed by O_DIRECT > > middle (via AIO completion of that aligned middle). So out of order > > relative to file offset. > > That's in general a really bad idea. It will obviously work, but > both on SSDs and out of place write file systems it is a sure way > to increase your garbage collection overhead a lot down the line. Fix this by never issuing misaligned DIO out of order. This fix means the DIO-aligned middle will only use AIO completion if there is no misaligned end segment. Otherwise, all 3 segments of a misaligned DIO will be issued without AIO completion to ensure file offset increases properly for all partial READ or WRITE situations. Factoring out nfs_local_iter_setup() helps standardize repetitive nfs_local_iters_setup_dio() code and is inspired by cleanup work that Chuck Lever did on the NFSD Direct code. Fixes: c817248fc831 ("nfs/localio: add proper O_DIRECT support for READ and WRITE") Reported-by: Christoph Hellwig Signed-off-by: Mike Snitzer Signed-off-by: Anna Schumaker --- fs/nfs/localio.c | 128 ++++++++++++++++++++++--------------------------------- 1 file changed, 52 insertions(+), 76 deletions(-) diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c index 839dbda0b370..656976b4f42c 100644 --- a/fs/nfs/localio.c +++ b/fs/nfs/localio.c @@ -44,8 +44,7 @@ struct nfs_local_kiocb { short int end_iter_index; atomic_t n_iters; bool iter_is_dio_aligned[NFSLOCAL_MAX_IOS]; - loff_t offset[NFSLOCAL_MAX_IOS] ____cacheline_aligned; - struct iov_iter iters[NFSLOCAL_MAX_IOS]; + struct iov_iter iters[NFSLOCAL_MAX_IOS] ____cacheline_aligned; /* End mostly DIO-specific members */ }; @@ -314,6 +313,7 @@ nfs_local_iocb_alloc(struct nfs_pgio_header *hdr, init_sync_kiocb(&iocb->kiocb, file); iocb->hdr = hdr; + iocb->kiocb.ki_pos = hdr->args.offset; iocb->kiocb.ki_flags &= ~IOCB_APPEND; iocb->kiocb.ki_complete = NULL; iocb->aio_complete_work = NULL; @@ -389,13 +389,24 @@ static bool nfs_iov_iter_aligned_bvec(const struct iov_iter *i, return true; } +static void +nfs_local_iter_setup(struct iov_iter *iter, int rw, struct bio_vec *bvec, + unsigned int nvecs, unsigned long total, + size_t start, size_t len) +{ + iov_iter_bvec(iter, rw, bvec, nvecs, total); + if (start) + iov_iter_advance(iter, start); + iov_iter_truncate(iter, len); +} + /* * Setup as many as 3 iov_iter based on extents described by @local_dio. * Returns the number of iov_iter that were setup. */ static int nfs_local_iters_setup_dio(struct nfs_local_kiocb *iocb, int rw, - unsigned int nvecs, size_t len, + unsigned int nvecs, unsigned long total, struct nfs_local_dio *local_dio) { int n_iters = 0; @@ -403,41 +414,17 @@ nfs_local_iters_setup_dio(struct nfs_local_kiocb *iocb, int rw, /* Setup misaligned start? */ if (local_dio->start_len) { - iov_iter_bvec(&iters[n_iters], rw, iocb->bvec, nvecs, len); - iters[n_iters].count = local_dio->start_len; - iocb->offset[n_iters] = iocb->hdr->args.offset; - iocb->iter_is_dio_aligned[n_iters] = false; - atomic_inc(&iocb->n_iters); + nfs_local_iter_setup(&iters[n_iters], rw, iocb->bvec, + nvecs, total, 0, local_dio->start_len); ++n_iters; } - /* Setup misaligned end? - * If so, the end is purposely setup to be issued using buffered IO - * before the middle (which will use DIO, if DIO-aligned, with AIO). - * This creates problems if/when the end results in short read or write. - * So must save index and length of end to handle this corner case. - */ - if (local_dio->end_len) { - iov_iter_bvec(&iters[n_iters], rw, iocb->bvec, nvecs, len); - iocb->offset[n_iters] = local_dio->end_offset; - iov_iter_advance(&iters[n_iters], - local_dio->start_len + local_dio->middle_len); - iocb->iter_is_dio_aligned[n_iters] = false; - /* Save index and length of end */ - iocb->end_iter_index = n_iters; - iocb->end_len = local_dio->end_len; - atomic_inc(&iocb->n_iters); - ++n_iters; - } - - /* Setup DIO-aligned middle to be issued last, to allow for - * DIO with AIO completion (see nfs_local_call_{read,write}). + /* + * Setup DIO-aligned middle, if there is no misaligned end (below) + * then AIO completion is used, see nfs_local_call_{read,write} */ - iov_iter_bvec(&iters[n_iters], rw, iocb->bvec, nvecs, len); - if (local_dio->start_len) - iov_iter_advance(&iters[n_iters], local_dio->start_len); - iters[n_iters].count -= local_dio->end_len; - iocb->offset[n_iters] = local_dio->middle_offset; + nfs_local_iter_setup(&iters[n_iters], rw, iocb->bvec, nvecs, + total, local_dio->start_len, local_dio->middle_len); iocb->iter_is_dio_aligned[n_iters] = nfs_iov_iter_aligned_bvec(&iters[n_iters], @@ -445,11 +432,22 @@ nfs_local_iters_setup_dio(struct nfs_local_kiocb *iocb, int rw, if (unlikely(!iocb->iter_is_dio_aligned[n_iters])) { trace_nfs_local_dio_misaligned(iocb->hdr->inode, - iocb->hdr->args.offset, len, local_dio); + local_dio->start_len, local_dio->middle_len, local_dio); return 0; /* no DIO-aligned IO possible */ } + iocb->end_iter_index = n_iters; ++n_iters; + /* Setup misaligned end? */ + if (local_dio->end_len) { + nfs_local_iter_setup(&iters[n_iters], rw, iocb->bvec, + nvecs, total, local_dio->start_len + + local_dio->middle_len, local_dio->end_len); + iocb->end_iter_index = n_iters; + ++n_iters; + } + + atomic_set(&iocb->n_iters, n_iters); return n_iters; } @@ -476,7 +474,7 @@ nfs_local_iters_init(struct nfs_local_kiocb *iocb, int rw) len = hdr->args.count - total; /* - * For each iocb, iocb->n_iter is always at least 1 and we always + * For each iocb, iocb->n_iters is always at least 1 and we always * end io after first nfs_local_pgio_done call unless misaligned DIO. */ atomic_set(&iocb->n_iters, 1); @@ -494,9 +492,7 @@ nfs_local_iters_init(struct nfs_local_kiocb *iocb, int rw) } /* Use buffered IO */ - iocb->offset[0] = hdr->args.offset; iov_iter_bvec(&iocb->iters[0], rw, iocb->bvec, v, len); - iocb->iter_is_dio_aligned[0] = false; } static void @@ -633,30 +629,20 @@ static void nfs_local_call_read(struct work_struct *work) n_iters = atomic_read(&iocb->n_iters); for (int i = 0; i < n_iters ; i++) { - /* DIO-aligned middle is always issued last with AIO completion */ if (iocb->iter_is_dio_aligned[i]) { iocb->kiocb.ki_flags |= IOCB_DIRECT; - iocb->kiocb.ki_complete = nfs_local_read_aio_complete; - iocb->aio_complete_work = nfs_local_read_aio_complete_work; - } + /* Only use AIO completion if DIO-aligned segment is last */ + if (i == iocb->end_iter_index) { + iocb->kiocb.ki_complete = nfs_local_read_aio_complete; + iocb->aio_complete_work = nfs_local_read_aio_complete_work; + } + } else + iocb->kiocb.ki_flags &= ~IOCB_DIRECT; - iocb->kiocb.ki_pos = iocb->offset[i]; status = filp->f_op->read_iter(&iocb->kiocb, &iocb->iters[i]); if (status != -EIOCBQUEUED) { - if (unlikely(status >= 0 && status < iocb->iters[i].count)) { - /* partial read */ - if (i == iocb->end_iter_index) { - /* Must not account DIO partial end, otherwise (due - * to end being issued before middle): the partial - * read accounting in nfs_local_read_done() - * would incorrectly advance hdr->args.offset - */ - status = 0; - } else { - /* Partial read at start or middle, force done */ - force_done = true; - } - } + if (unlikely(status >= 0 && status < iocb->iters[i].count)) + force_done = true; /* Partial read */ if (nfs_local_pgio_done(iocb, status, force_done)) { nfs_local_read_iocb_done(iocb); break; @@ -851,30 +837,20 @@ static void nfs_local_call_write(struct work_struct *work) file_start_write(filp); n_iters = atomic_read(&iocb->n_iters); for (int i = 0; i < n_iters ; i++) { - /* DIO-aligned middle is always issued last with AIO completion */ if (iocb->iter_is_dio_aligned[i]) { iocb->kiocb.ki_flags |= IOCB_DIRECT; - iocb->kiocb.ki_complete = nfs_local_write_aio_complete; - iocb->aio_complete_work = nfs_local_write_aio_complete_work; - } + /* Only use AIO completion if DIO-aligned segment is last */ + if (i == iocb->end_iter_index) { + iocb->kiocb.ki_complete = nfs_local_write_aio_complete; + iocb->aio_complete_work = nfs_local_write_aio_complete_work; + } + } else + iocb->kiocb.ki_flags &= ~IOCB_DIRECT; - iocb->kiocb.ki_pos = iocb->offset[i]; status = filp->f_op->write_iter(&iocb->kiocb, &iocb->iters[i]); if (status != -EIOCBQUEUED) { - if (unlikely(status >= 0 && status < iocb->iters[i].count)) { - /* partial write */ - if (i == iocb->end_iter_index) { - /* Must not account DIO partial end, otherwise (due - * to end being issued before middle): the partial - * write accounting in nfs_local_write_done() - * would incorrectly advance hdr->args.offset - */ - status = 0; - } else { - /* Partial write at start or middle, force done */ - force_done = true; - } - } + if (unlikely(status >= 0 && status < iocb->iters[i].count)) + force_done = true; /* Partial write */ if (nfs_local_pgio_done(iocb, status, force_done)) { nfs_local_write_iocb_done(iocb); break; -- cgit From 85d2c2392ac6348e1171d627497034a341a250c1 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Tue, 28 Oct 2025 17:27:43 -0400 Subject: NFSv2/v3: Fix error handling in nfs_atomic_open_v23() When nfs_do_create() returns an EEXIST error, it means that a regular file could not be created. That could mean that a symlink needs to be resolved. If that's the case, a lookup needs to be kicked off. Reported-by: Stephen Abbene Link: https://bugzilla.kernel.org/show_bug.cgi?id=220710 Fixes: 7c6c5249f061 ("NFS: add atomic_open for NFSv3 to handle O_TRUNC correctly.") Signed-off-by: Trond Myklebust Reviewed-by: NeilBrown Signed-off-by: Anna Schumaker --- fs/nfs/dir.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 46d9c65d50f8..ea9f6ca8f30f 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -2268,11 +2268,12 @@ int nfs_atomic_open_v23(struct inode *dir, struct dentry *dentry, return -ENAMETOOLONG; if (open_flags & O_CREAT) { - file->f_mode |= FMODE_CREATED; error = nfs_do_create(dir, dentry, mode, open_flags); - if (error) + if (!error) { + file->f_mode |= FMODE_CREATED; + return finish_open(file, dentry, NULL); + } else if (error != -EEXIST || open_flags & O_EXCL) return error; - return finish_open(file, dentry, NULL); } if (d_in_lookup(dentry)) { /* The only flags nfs_lookup considers are -- cgit From 7a7a3456520b309a0bffa1d9d62bd6c9dcab89b3 Mon Sep 17 00:00:00 2001 From: Yang Xiuwei Date: Thu, 30 Oct 2025 11:03:25 +0800 Subject: NFS: sysfs: fix leak when nfs_client kobject add fails If adding the second kobject fails, drop both references to avoid sysfs residue and memory leak. Fixes: e96f9268eea6 ("NFS: Make all of /sys/fs/nfs network-namespace unique") Signed-off-by: Yang Xiuwei Reviewed-by: Benjamin Coddington Signed-off-by: Anna Schumaker --- fs/nfs/sysfs.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/nfs/sysfs.c b/fs/nfs/sysfs.c index 545148d42dcc..ea6e6168092b 100644 --- a/fs/nfs/sysfs.c +++ b/fs/nfs/sysfs.c @@ -189,6 +189,7 @@ static struct nfs_netns_client *nfs_netns_client_alloc(struct kobject *parent, return p; kobject_put(&p->kobject); + kobject_put(&p->nfs_net_kobj); } return NULL; } -- cgit From 1f214e9c3aef2d0936be971072e991d78a174d71 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Fri, 31 Oct 2025 10:51:42 -0400 Subject: NFSv4: Fix an incorrect parameter when calling nfs4_call_sync() The Smatch static checker noted that in _nfs4_proc_lookupp(), the flag RPC_TASK_TIMEOUT is being passed as an argument to nfs4_init_sequence(), which is clearly incorrect. Since LOOKUPP is an idempotent operation, nfs4_init_sequence() should not ask the server to cache the result. The RPC_TASK_TIMEOUT flag needs to be passed down to the RPC layer. Reported-by: Dan Carpenter Reported-by: Harshit Mogalapalli Fixes: 76998ebb9158 ("NFSv4: Observe the NFS_MOUNT_SOFTREVAL flag in _nfs4_proc_lookupp") Signed-off-by: Trond Myklebust Signed-off-by: Anna Schumaker --- fs/nfs/nfs4proc.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 411776718494..93c6ce04332b 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -4715,16 +4715,19 @@ static int _nfs4_proc_lookupp(struct inode *inode, }; unsigned short task_flags = 0; - if (NFS_SERVER(inode)->flags & NFS_MOUNT_SOFTREVAL) + if (server->flags & NFS_MOUNT_SOFTREVAL) task_flags |= RPC_TASK_TIMEOUT; + if (server->caps & NFS_CAP_MOVEABLE) + task_flags |= RPC_TASK_MOVEABLE; args.bitmask = nfs4_bitmask(server, fattr->label); nfs_fattr_init(fattr); + nfs4_init_sequence(&args.seq_args, &res.seq_res, 0, 0); dprintk("NFS call lookupp ino=0x%lx\n", inode->i_ino); - status = nfs4_call_sync(clnt, server, &msg, &args.seq_args, - &res.seq_res, task_flags); + status = nfs4_do_call_sync(clnt, server, &msg, &args.seq_args, + &res.seq_res, task_flags); dprintk("NFS reply lookupp: %d\n", status); return status; } -- cgit From b623390045a81fc559decb9bfeb79319721d3dfb Mon Sep 17 00:00:00 2001 From: Dai Ngo Date: Sun, 9 Nov 2025 09:05:08 -0800 Subject: NFS: Fix LTP test failures when timestamps are delegated The utimes01 and utime06 tests fail when delegated timestamps are enabled, specifically in subtests that modify the atime and mtime fields using the 'nobody' user ID. The problem can be reproduced as follow: # echo "/media *(rw,no_root_squash,sync)" >> /etc/exports # export -ra # mount -o rw,nfsvers=4.2 127.0.0.1:/media /tmpdir # cd /opt/ltp # ./runltp -d /tmpdir -s utimes01 # ./runltp -d /tmpdir -s utime06 This issue occurs because nfs_setattr does not verify the inode's UID against the caller's fsuid when delegated timestamps are permitted for the inode. This patch adds the UID check and if it does not match then the request is sent to the server for permission checking. Fixes: e12912d94137 ("NFSv4: Add support for delegated atime and mtime attributes") Signed-off-by: Dai Ngo Signed-off-by: Anna Schumaker --- fs/nfs/inode.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index 18b57c7c2f97..13ad70fc00d8 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -718,6 +718,8 @@ nfs_setattr(struct mnt_idmap *idmap, struct dentry *dentry, struct nfs_fattr *fattr; loff_t oldsize = i_size_read(inode); int error = 0; + kuid_t task_uid = current_fsuid(); + kuid_t owner_uid = inode->i_uid; nfs_inc_stats(inode, NFSIOS_VFSSETATTR); @@ -739,9 +741,11 @@ nfs_setattr(struct mnt_idmap *idmap, struct dentry *dentry, if (nfs_have_delegated_mtime(inode) && attr->ia_valid & ATTR_MTIME) { spin_lock(&inode->i_lock); if (attr->ia_valid & ATTR_MTIME_SET) { - nfs_set_timestamps_to_ts(inode, attr); - attr->ia_valid &= ~(ATTR_MTIME|ATTR_MTIME_SET| + if (uid_eq(task_uid, owner_uid)) { + nfs_set_timestamps_to_ts(inode, attr); + attr->ia_valid &= ~(ATTR_MTIME|ATTR_MTIME_SET| ATTR_ATIME|ATTR_ATIME_SET); + } } else { nfs_update_timestamps(inode, attr->ia_valid); attr->ia_valid &= ~(ATTR_MTIME|ATTR_ATIME); @@ -751,10 +755,12 @@ nfs_setattr(struct mnt_idmap *idmap, struct dentry *dentry, attr->ia_valid & ATTR_ATIME && !(attr->ia_valid & ATTR_MTIME)) { if (attr->ia_valid & ATTR_ATIME_SET) { - spin_lock(&inode->i_lock); - nfs_set_timestamps_to_ts(inode, attr); - spin_unlock(&inode->i_lock); - attr->ia_valid &= ~(ATTR_ATIME|ATTR_ATIME_SET); + if (uid_eq(task_uid, owner_uid)) { + spin_lock(&inode->i_lock); + nfs_set_timestamps_to_ts(inode, attr); + spin_unlock(&inode->i_lock); + attr->ia_valid &= ~(ATTR_ATIME|ATTR_ATIME_SET); + } } else { nfs_update_delegated_atime(inode); attr->ia_valid &= ~ATTR_ATIME; -- cgit From d3c9c213c0b86ac5dd8fe2c53c24db20f1f510bc Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Mon, 10 Nov 2025 14:30:41 -0700 Subject: io_uring/rw: ensure allocated iovec gets cleared for early failure A previous commit reused the recyling infrastructure for early cleanup, but this is not enough for the case where our internal caches have overflowed. If this happens, then the allocated iovec can get leaked if the request is also aborted early. Reinstate the previous forced free of the iovec for that situation. Cc: stable@vger.kernel.org Reported-by: syzbot+3c93637d7648c24e1fd0@syzkaller.appspotmail.com Tested-by: syzbot+3c93637d7648c24e1fd0@syzkaller.appspotmail.com Fixes: 9ac273ae3dc2 ("io_uring/rw: use io_rw_recycle() from cleanup path") Link: https://lore.kernel.org/io-uring/69122a59.a70a0220.22f260.00fd.GAE@google.com/ Signed-off-by: Jens Axboe --- io_uring/rw.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/io_uring/rw.c b/io_uring/rw.c index 5b2241a5813c..abe68ba9c9dc 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -463,7 +463,10 @@ int io_read_mshot_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) void io_readv_writev_cleanup(struct io_kiocb *req) { + struct io_async_rw *rw = req->async_data; + lockdep_assert_held(&req->ctx->uring_lock); + io_vec_free(&rw->vec); io_rw_recycle(req, 0); } -- cgit From 6a77267d97b5b6cd0e35099ab4eb054e5f965ee6 Mon Sep 17 00:00:00 2001 From: Pavel Begunkov Date: Mon, 10 Nov 2025 13:03:53 +0000 Subject: io_uring/query: return number of available queries It's useful to know which query opcodes are available. Extend the structure and return that. It's a trivial change, and even though it can be painlessly extended later, it'd still require adding a v2 of the structure. Signed-off-by: Pavel Begunkov Signed-off-by: Jens Axboe --- include/uapi/linux/io_uring/query.h | 3 +++ io_uring/query.c | 2 ++ 2 files changed, 5 insertions(+) diff --git a/include/uapi/linux/io_uring/query.h b/include/uapi/linux/io_uring/query.h index 5d754322a27c..3539ccbfd064 100644 --- a/include/uapi/linux/io_uring/query.h +++ b/include/uapi/linux/io_uring/query.h @@ -36,6 +36,9 @@ struct io_uring_query_opcode { __u64 enter_flags; /* Bitmask of all supported IOSQE_* flags */ __u64 sqe_flags; + /* The number of available query opcodes */ + __u32 nr_query_opcodes; + __u32 __pad; }; #endif diff --git a/io_uring/query.c b/io_uring/query.c index 645301bd2c82..cf02893ba911 100644 --- a/io_uring/query.c +++ b/io_uring/query.c @@ -20,6 +20,8 @@ static ssize_t io_query_ops(void *data) e->ring_setup_flags = IORING_SETUP_FLAGS; e->enter_flags = IORING_ENTER_FLAGS; e->sqe_flags = SQE_VALID_FLAGS; + e->nr_query_opcodes = __IO_URING_QUERY_MAX; + e->__pad = 0; return sizeof(*e); } -- cgit From dd4adb986a86727ed8f56c48b6d0695f1e211e65 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Tue, 28 Oct 2025 12:27:24 -0400 Subject: selftests/tracing: Run sample events to clear page cache events The tracing selftest "event-filter-function.tc" was failing because it first runs the "sample_events" function that triggers the kmem_cache_free event and it looks at what function was used during a call to "ls". But the first time it calls this, it could trigger events that are used to pull pages into the page cache. The rest of the test uses the function it finds during that call to see if it will be called in subsequent "sample_events" calls. But if there's no need to pull pages into the page cache, it will not trigger that function and the test will fail. Call the "sample_events" twice to trigger all the page cache work before it calls it to find a function to use in subsequent checks. Cc: stable@vger.kernel.org Fixes: eb50d0f250e96 ("selftests/ftrace: Choose target function for filter test from samples") Signed-off-by: Steven Rostedt (Google) Acked-by: Masami Hiramatsu (Google) Signed-off-by: Shuah Khan --- tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc b/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc index c62165fabd0c..cfa16aa1f39a 100644 --- a/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc +++ b/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc @@ -20,6 +20,10 @@ sample_events() { echo 0 > tracing_on echo 0 > events/enable +# Clear functions caused by page cache; run sample_events twice +sample_events +sample_events + echo "Get the most frequently calling function" echo > trace sample_events -- cgit From 0f559cd91e37b7978e4198ca2fbf7eb95df11361 Mon Sep 17 00:00:00 2001 From: Marc Zyngier Date: Mon, 10 Nov 2025 17:30:10 +0000 Subject: KVM: arm64: Finalize ID registers only once per VM Owing to the ID registers being global to the VM, there is no point in computing them more than once. However, recent changes making use of kvm_set_vm_id_reg() outlined that we repeatedly hammer the ID registers when we shouldn't. Gate the ID reg update on the VM having never run. Fixes: 50e7cce81b9b2 ("KVM: arm64: Limit clearing of ID_{AA64PFR0,PFR1}_EL1.GIC to userspace irqchip") Fixes: 5cb57a1aff755 ("KVM: arm64: Zero ID_AA64PFR0_EL1.GIC when no GICv3 is presented to the guest") Closes: https://lore.kernel.org/r/aRHf6x5umkTYhYJ3@finisterre.sirena.org.uk Reported-by: Mark Brown Tested-by: Mark Brown Link: https://patch.msgid.link/20251110173010.1918424-1-maz@kernel.org Signed-off-by: Marc Zyngier --- arch/arm64/kvm/sys_regs.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 8ae2bca81614..ec3fbe0b8d52 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -5609,7 +5609,11 @@ int kvm_finalize_sys_regs(struct kvm_vcpu *vcpu) guard(mutex)(&kvm->arch.config_lock); - if (!irqchip_in_kernel(kvm)) { + /* + * This hacks into the ID registers, so only perform it when the + * first vcpu runs, or the kvm_set_vm_id_reg() helper will scream. + */ + if (!irqchip_in_kernel(kvm) && !kvm_vm_has_ran_once(kvm)) { u64 val; val = kvm_read_vm_id_reg(kvm, SYS_ID_AA64PFR0_EL1) & ~ID_AA64PFR0_EL1_GIC; -- cgit From fd3ecda38fe0cb713d167b5477d25f6b350f0514 Mon Sep 17 00:00:00 2001 From: Niravkumar L Rabara Date: Tue, 11 Nov 2025 16:08:01 +0800 Subject: EDAC/altera: Handle OCRAM ECC enable after warm reset The OCRAM ECC is always enabled either by the BootROM or by the Secure Device Manager (SDM) during a power-on reset on SoCFPGA. However, during a warm reset, the OCRAM content is retained to preserve data, while the control and status registers are reset to their default values. As a result, ECC must be explicitly re-enabled after a warm reset. Fixes: 17e47dc6db4f ("EDAC/altera: Add Stratix10 OCRAM ECC support") Signed-off-by: Niravkumar L Rabara Signed-off-by: Borislav Petkov (AMD) Acked-by: Dinh Nguyen Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251111080801.1279401-1-niravkumarlaxmidas.rabara@altera.com --- drivers/edac/altera_edac.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/edac/altera_edac.c b/drivers/edac/altera_edac.c index 103b2c2eba2a..a776d61027f2 100644 --- a/drivers/edac/altera_edac.c +++ b/drivers/edac/altera_edac.c @@ -1184,10 +1184,22 @@ altr_check_ocram_deps_init(struct altr_edac_device_dev *device) if (ret) return ret; - /* Verify OCRAM has been initialized */ + /* + * Verify that OCRAM has been initialized. + * During a warm reset, OCRAM contents are retained, but the control + * and status registers are reset to their default values. Therefore, + * ECC must be explicitly re-enabled in the control register. + * Error condition: if INITCOMPLETEA is clear and ECC_EN is already set. + */ if (!ecc_test_bits(ALTR_A10_ECC_INITCOMPLETEA, - (base + ALTR_A10_ECC_INITSTAT_OFST))) - return -ENODEV; + (base + ALTR_A10_ECC_INITSTAT_OFST))) { + if (!ecc_test_bits(ALTR_A10_ECC_EN, + (base + ALTR_A10_ECC_CTRL_OFST))) + ecc_set_bits(ALTR_A10_ECC_EN, + (base + ALTR_A10_ECC_CTRL_OFST)); + else + return -ENODEV; + } /* Enable IRQ on Single Bit Error */ writel(ALTR_A10_ECC_SERRINTEN, (base + ALTR_A10_ECC_ERRINTENS_OFST)); -- cgit From 281326be67252ac5794d1383f67526606b1d6b13 Mon Sep 17 00:00:00 2001 From: Niravkumar L Rabara Date: Tue, 11 Nov 2025 16:13:33 +0800 Subject: EDAC/altera: Use INTTEST register for Ethernet and USB SBE injection The current single-bit error injection mechanism flips bits directly in ECC RAM by performing write and read operations. When the ECC RAM is actively used by the Ethernet or USB controller, this approach sometimes trigger a false double-bit error. Switch both Ethernet and USB EDAC devices to use the INTTEST register (altr_edac_a10_device_inject_fops) for single-bit error injection, similar to the existing double-bit error injection method. Fixes: 064acbd4f4ab ("EDAC, altera: Add Stratix10 peripheral support") Signed-off-by: Niravkumar L Rabara Signed-off-by: Borislav Petkov (AMD) Acked-by: Dinh Nguyen Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251111081333.1279635-1-niravkumarlaxmidas.rabara@altera.com --- drivers/edac/altera_edac.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/edac/altera_edac.c b/drivers/edac/altera_edac.c index a776d61027f2..0c5b94e64ea1 100644 --- a/drivers/edac/altera_edac.c +++ b/drivers/edac/altera_edac.c @@ -1369,7 +1369,7 @@ static const struct edac_device_prv_data a10_enetecc_data = { .ue_set_mask = ALTR_A10_ECC_TDERRA, .set_err_ofst = ALTR_A10_ECC_INTTEST_OFST, .ecc_irq_handler = altr_edac_a10_ecc_irq, - .inject_fops = &altr_edac_a10_device_inject2_fops, + .inject_fops = &altr_edac_a10_device_inject_fops, }; #endif /* CONFIG_EDAC_ALTERA_ETHERNET */ @@ -1459,7 +1459,7 @@ static const struct edac_device_prv_data a10_usbecc_data = { .ue_set_mask = ALTR_A10_ECC_TDERRA, .set_err_ofst = ALTR_A10_ECC_INTTEST_OFST, .ecc_irq_handler = altr_edac_a10_ecc_irq, - .inject_fops = &altr_edac_a10_device_inject2_fops, + .inject_fops = &altr_edac_a10_device_inject_fops, }; #endif /* CONFIG_EDAC_ALTERA_USB */ -- cgit From ed6612165b74f09db00ef0abaf9831895ab28b7f Mon Sep 17 00:00:00 2001 From: Yiqi Sun Date: Tue, 11 Nov 2025 15:05:39 +0800 Subject: smb: fix invalid username check in smb3_fs_context_parse_param() Since the maximum return value of strnlen(..., CIFS_MAX_USERNAME_LEN) is CIFS_MAX_USERNAME_LEN, length check in smb3_fs_context_parse_param() is always FALSE and invalid. Fix the comparison in if statement. Signed-off-by: Yiqi Sun Signed-off-by: Steve French --- fs/smb/client/fs_context.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c index c2d5bb23040c..0f894d09157b 100644 --- a/fs/smb/client/fs_context.c +++ b/fs/smb/client/fs_context.c @@ -1470,7 +1470,7 @@ static int smb3_fs_context_parse_param(struct fs_context *fc, break; } - if (strnlen(param->string, CIFS_MAX_USERNAME_LEN) > + if (strnlen(param->string, CIFS_MAX_USERNAME_LEN) == CIFS_MAX_USERNAME_LEN) { pr_warn("username too long\n"); goto cifs_parse_mount_err; -- cgit From c42af83c59b65d01c0f7a074e450bbbb43b22f0d Mon Sep 17 00:00:00 2001 From: Akinobu Mita Date: Tue, 11 Nov 2025 10:00:10 +0900 Subject: memblock: fix memblock_estimated_nr_free_pages() for soft-reserved memory memblock_estimated_nr_free_pages() returns the difference between the total size of the "memory" memblock type and the "reserved" memblock type. The "soft-reserved" memory regions are added to the "reserved" memblock type, but not to the "memory" memblock type. Therefore, memblock_estimated_nr_free_pages() may return a smaller value than expected, or if it underflows, an extremely large value. /proc/sys/kernel/threads-max is determined by the value of memblock_estimated_nr_free_pages(). This issue was discovered on machines with CXL memory because kernel.threads-max was either smaller than expected or extremely large for the installed DRAM size. This fixes the issue by replacing memblock_reserved_size() with memblock_reserved_kern_size() that tells how much memory was reserved from the actual RAM. Suggested-by: Mike Rapoport Signed-off-by: Akinobu Mita Link: https://patch.msgid.link/20251111010010.7800-1-akinobu.mita@gmail.com Signed-off-by: Mike Rapoport (Microsoft) --- mm/memblock.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/memblock.c b/mm/memblock.c index e23e16618e9b..f0f2dc66e9a2 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -1826,7 +1826,8 @@ phys_addr_t __init_memblock memblock_reserved_kern_size(phys_addr_t limit, int n */ unsigned long __init memblock_estimated_nr_free_pages(void) { - return PHYS_PFN(memblock_phys_mem_size() - memblock_reserved_size()); + return PHYS_PFN(memblock_phys_mem_size() - + memblock_reserved_kern_size(MEMBLOCK_ALLOC_ANYWHERE, NUMA_NO_NODE)); } /* lowest address */ -- cgit From 9e805625218b70d865fcee2105dbf835d473c074 Mon Sep 17 00:00:00 2001 From: Rakuram Eswaran Date: Thu, 23 Oct 2025 20:24:32 +0530 Subject: mmc: pxamci: Simplify pxamci_probe() error handling using devm APIs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This patch refactors pxamci_probe() to use devm-managed resource allocation (e.g. devm_dma_request_chan) and dev_err_probe() for improved readability and automatic cleanup on probe failure. It also removes redundant NULL assignments and manual resource release logic from pxamci_probe(), and eliminates the corresponding release calls from pxamci_remove(). Reported-by: kernel test robot Reported-by: Dan Carpenter Closes: https://lore.kernel.org/r/202510041841.pRlunIfl-lkp@intel.com/ Fixes: 58c40f3faf742c ("mmc: pxamci: Use devm_mmc_alloc_host() helper") Suggested-by: Uwe Kleine-König Signed-off-by: Rakuram Eswaran Reviewed-by: Khalid Aziz Acked-by: Uwe Kleine-König Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson --- drivers/mmc/host/pxamci.c | 56 +++++++++++++++-------------------------------- 1 file changed, 18 insertions(+), 38 deletions(-) diff --git a/drivers/mmc/host/pxamci.c b/drivers/mmc/host/pxamci.c index 26d03352af63..b5ea058ed467 100644 --- a/drivers/mmc/host/pxamci.c +++ b/drivers/mmc/host/pxamci.c @@ -652,10 +652,9 @@ static int pxamci_probe(struct platform_device *pdev) host->clkrt = CLKRT_OFF; host->clk = devm_clk_get(dev, NULL); - if (IS_ERR(host->clk)) { - host->clk = NULL; - return PTR_ERR(host->clk); - } + if (IS_ERR(host->clk)) + return dev_err_probe(dev, PTR_ERR(host->clk), + "Failed to acquire clock\n"); host->clkrate = clk_get_rate(host->clk); @@ -703,46 +702,37 @@ static int pxamci_probe(struct platform_device *pdev) platform_set_drvdata(pdev, mmc); - host->dma_chan_rx = dma_request_chan(dev, "rx"); - if (IS_ERR(host->dma_chan_rx)) { - host->dma_chan_rx = NULL; + host->dma_chan_rx = devm_dma_request_chan(dev, "rx"); + if (IS_ERR(host->dma_chan_rx)) return dev_err_probe(dev, PTR_ERR(host->dma_chan_rx), "unable to request rx dma channel\n"); - } - host->dma_chan_tx = dma_request_chan(dev, "tx"); - if (IS_ERR(host->dma_chan_tx)) { - dev_err(dev, "unable to request tx dma channel\n"); - ret = PTR_ERR(host->dma_chan_tx); - host->dma_chan_tx = NULL; - goto out; - } + + host->dma_chan_tx = devm_dma_request_chan(dev, "tx"); + if (IS_ERR(host->dma_chan_tx)) + return dev_err_probe(dev, PTR_ERR(host->dma_chan_tx), + "unable to request tx dma channel\n"); if (host->pdata) { host->detect_delay_ms = host->pdata->detect_delay_ms; host->power = devm_gpiod_get_optional(dev, "power", GPIOD_OUT_LOW); - if (IS_ERR(host->power)) { - ret = PTR_ERR(host->power); - dev_err(dev, "Failed requesting gpio_power\n"); - goto out; - } + if (IS_ERR(host->power)) + return dev_err_probe(dev, PTR_ERR(host->power), + "Failed requesting gpio_power\n"); /* FIXME: should we pass detection delay to debounce? */ ret = mmc_gpiod_request_cd(mmc, "cd", 0, false, 0); - if (ret && ret != -ENOENT) { - dev_err(dev, "Failed requesting gpio_cd\n"); - goto out; - } + if (ret && ret != -ENOENT) + return dev_err_probe(dev, ret, "Failed requesting gpio_cd\n"); if (!host->pdata->gpio_card_ro_invert) mmc->caps2 |= MMC_CAP2_RO_ACTIVE_HIGH; ret = mmc_gpiod_request_ro(mmc, "wp", 0, 0); - if (ret && ret != -ENOENT) { - dev_err(dev, "Failed requesting gpio_ro\n"); - goto out; - } + if (ret && ret != -ENOENT) + return dev_err_probe(dev, ret, "Failed requesting gpio_ro\n"); + if (!ret) host->use_ro_gpio = true; @@ -759,16 +749,8 @@ static int pxamci_probe(struct platform_device *pdev) if (ret) { if (host->pdata && host->pdata->exit) host->pdata->exit(dev, mmc); - goto out; } - return 0; - -out: - if (host->dma_chan_rx) - dma_release_channel(host->dma_chan_rx); - if (host->dma_chan_tx) - dma_release_channel(host->dma_chan_tx); return ret; } @@ -791,8 +773,6 @@ static void pxamci_remove(struct platform_device *pdev) dmaengine_terminate_all(host->dma_chan_rx); dmaengine_terminate_all(host->dma_chan_tx); - dma_release_channel(host->dma_chan_rx); - dma_release_channel(host->dma_chan_tx); } } -- cgit From 739f04f4a46237536aff07ff223c231da53ed8ce Mon Sep 17 00:00:00 2001 From: Shawn Lin Date: Tue, 4 Nov 2025 11:51:23 +0800 Subject: mmc: dw_mmc-rockchip: Fix wrong internal phase calculate ciu clock is 2 times of io clock, but the sample clk used is derived from io clock provided to the card. So we should use io clock to calculate the phase. Fixes: 59903441f5e4 ("mmc: dw_mmc-rockchip: Add internal phase support") Signed-off-by: Shawn Lin Acked-by: Heiko Stuebner Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson --- drivers/mmc/host/dw_mmc-rockchip.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/mmc/host/dw_mmc-rockchip.c b/drivers/mmc/host/dw_mmc-rockchip.c index 82dd906bb002..681354942e97 100644 --- a/drivers/mmc/host/dw_mmc-rockchip.c +++ b/drivers/mmc/host/dw_mmc-rockchip.c @@ -42,7 +42,7 @@ struct dw_mci_rockchip_priv_data { */ static int rockchip_mmc_get_internal_phase(struct dw_mci *host, bool sample) { - unsigned long rate = clk_get_rate(host->ciu_clk); + unsigned long rate = clk_get_rate(host->ciu_clk) / RK3288_CLKGEN_DIV; u32 raw_value; u16 degrees; u32 delay_num = 0; @@ -85,7 +85,7 @@ static int rockchip_mmc_get_phase(struct dw_mci *host, bool sample) static int rockchip_mmc_set_internal_phase(struct dw_mci *host, bool sample, int degrees) { - unsigned long rate = clk_get_rate(host->ciu_clk); + unsigned long rate = clk_get_rate(host->ciu_clk) / RK3288_CLKGEN_DIV; u8 nineties, remainder; u8 delay_num; u32 raw_value; -- cgit From 632108ec072ad64c8c83db6e16a7efee29ebfb74 Mon Sep 17 00:00:00 2001 From: Haein Lee Date: Wed, 12 Nov 2025 00:37:54 +0900 Subject: ALSA: usb-audio: Fix NULL pointer dereference in snd_usb_mixer_controls_badd In snd_usb_create_streams(), for UAC version 3 devices, the Interface Association Descriptor (IAD) is retrieved via usb_ifnum_to_if(). If this call fails, a fallback routine attempts to obtain the IAD from the next interface and sets a BADD profile. However, snd_usb_mixer_controls_badd() assumes that the IAD retrieved from usb_ifnum_to_if() is always valid, without performing a NULL check. This can lead to a NULL pointer dereference when usb_ifnum_to_if() fails to find the interface descriptor. This patch adds a NULL pointer check after calling usb_ifnum_to_if() in snd_usb_mixer_controls_badd() to prevent the dereference. This issue was discovered by syzkaller, which triggered the bug by sending a crafted USB device descriptor. Fixes: 17156f23e93c ("ALSA: usb: add UAC3 BADD profiles support") Signed-off-by: Haein Lee Link: https://patch.msgid.link/vwhzmoba9j2f.vwhzmob9u9e2.g6@dooray.com Signed-off-by: Takashi Iwai --- sound/usb/mixer.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sound/usb/mixer.c b/sound/usb/mixer.c index 6f00e0d52382..72b900505d2c 100644 --- a/sound/usb/mixer.c +++ b/sound/usb/mixer.c @@ -3086,6 +3086,8 @@ static int snd_usb_mixer_controls_badd(struct usb_mixer_interface *mixer, int i; assoc = usb_ifnum_to_if(dev, ctrlif)->intf_assoc; + if (!assoc) + return -EINVAL; /* Detect BADD capture/playback channels from AS EP descriptors */ for (i = 0; i < assoc->bInterfaceCount; i++) { -- cgit From d93a89684dce949c2ea817b6f07feee9a45241a7 Mon Sep 17 00:00:00 2001 From: Stefan Metzmacher Date: Mon, 10 Nov 2025 16:23:52 +0100 Subject: smb: client: let smbd_disconnect_rdma_connection() turn CREATED into DISCONNECTED When smbd_disconnect_rdma_connection() turns SMBDIRECT_SOCKET_CREATED into SMBDIRECT_SOCKET_ERROR, we'll have the situation that smbd_disconnect_rdma_work() will set SMBDIRECT_SOCKET_DISCONNECTING and call rdma_disconnect(), which likely fails as we never reached the RDMA_CM_EVENT_ESTABLISHED. it means that wait_event(sc->status_wait, sc->status == SMBDIRECT_SOCKET_DISCONNECTED) in smbd_destroy() will hang forever in SMBDIRECT_SOCKET_DISCONNECTING never reaching SMBDIRECT_SOCKET_DISCONNECTED. So we directly go from SMBDIRECT_SOCKET_CREATED to SMBDIRECT_SOCKET_DISCONNECTED. Fixes: ffbfc73e84eb ("smb: client: let smbd_disconnect_rdma_connection() set SMBDIRECT_SOCKET_ERROR...") Cc: Steve French Cc: Tom Talpey Cc: Long Li Cc: Namjae Jeon Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher Signed-off-by: Steve French --- fs/smb/client/smbdirect.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/smb/client/smbdirect.c b/fs/smb/client/smbdirect.c index 85a4c55b61b8..c6c428c2e08d 100644 --- a/fs/smb/client/smbdirect.c +++ b/fs/smb/client/smbdirect.c @@ -290,6 +290,9 @@ static void smbd_disconnect_rdma_connection(struct smbdirect_socket *sc) break; case SMBDIRECT_SOCKET_CREATED: + sc->status = SMBDIRECT_SOCKET_DISCONNECTED; + break; + case SMBDIRECT_SOCKET_CONNECTED: sc->status = SMBDIRECT_SOCKET_ERROR; break; -- cgit From fdf302e6bea1822a9144a0cc2e8e17527e746162 Mon Sep 17 00:00:00 2001 From: Sami Tolvanen Date: Mon, 10 Nov 2025 14:19:13 +0100 Subject: gendwarfksyms: Skip files with no exports Starting with Rust 1.91.0 (released 2025-10-30), in upstream commit ab91a63d403b ("Ignore intrinsic calls in cross-crate-inlining cost model") [1][2], `bindings.o` stops containing DWARF debug information because the `Default` implementations contained `write_bytes()` calls which are now ignored in that cost model (note that `CLIPPY=1` does not reproduce it). This means `gendwarfksyms` complains: RUSTC L rust/bindings.o error: gendwarfksyms: process_module: dwarf_get_units failed: no debugging information? There are several alternatives that would work here: conditionally skipping in the cases needed (but that is subtle and brittle), forcing DWARF generation with e.g. a dummy `static` (ugly and we may need to do it in several crates), skipping the call to the tool in the Kbuild command when there are no exports (fine) or teaching the tool to do so itself (simple and clean). Thus do the last one: don't attempt to process files if we have no symbol versions to calculate. [ I used the commit log of my patch linked below since it explained the root issue and expanded it a bit more to summarize the alternatives. - Miguel ] Cc: stable@vger.kernel.org # Needed in 6.17.y. Reported-by: Haiyue Wang Closes: https://lore.kernel.org/rust-for-linux/b8c1c73d-bf8b-4bf2-beb1-84ffdcd60547@163.com/ Suggested-by: Miguel Ojeda Link: https://lore.kernel.org/rust-for-linux/CANiq72nKC5r24VHAp9oUPR1HVPqT+=0ab9N0w6GqTF-kJOeiSw@mail.gmail.com/ Link: https://github.com/rust-lang/rust/commit/ab91a63d403b0105cacd72809cd292a72984ed99 [1] Link: https://github.com/rust-lang/rust/pull/145910 [2] Signed-off-by: Sami Tolvanen Tested-by: Haiyue Wang Reviewed-by: Alice Ryhl Link: https://patch.msgid.link/20251110131913.1789896-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda --- scripts/gendwarfksyms/gendwarfksyms.c | 3 ++- scripts/gendwarfksyms/gendwarfksyms.h | 2 +- scripts/gendwarfksyms/symbols.c | 4 +++- 3 files changed, 6 insertions(+), 3 deletions(-) diff --git a/scripts/gendwarfksyms/gendwarfksyms.c b/scripts/gendwarfksyms/gendwarfksyms.c index 08ae61eb327e..f5203d1640ee 100644 --- a/scripts/gendwarfksyms/gendwarfksyms.c +++ b/scripts/gendwarfksyms/gendwarfksyms.c @@ -138,7 +138,8 @@ int main(int argc, char **argv) error("no input files?"); } - symbol_read_exports(stdin); + if (!symbol_read_exports(stdin)) + return 0; if (symtypes_file) { symfile = fopen(symtypes_file, "w"); diff --git a/scripts/gendwarfksyms/gendwarfksyms.h b/scripts/gendwarfksyms/gendwarfksyms.h index d9c06d2cb1df..32cec8f7695a 100644 --- a/scripts/gendwarfksyms/gendwarfksyms.h +++ b/scripts/gendwarfksyms/gendwarfksyms.h @@ -123,7 +123,7 @@ struct symbol { typedef void (*symbol_callback_t)(struct symbol *, void *arg); bool is_symbol_ptr(const char *name); -void symbol_read_exports(FILE *file); +int symbol_read_exports(FILE *file); void symbol_read_symtab(int fd); struct symbol *symbol_get(const char *name); void symbol_set_ptr(struct symbol *sym, Dwarf_Die *ptr); diff --git a/scripts/gendwarfksyms/symbols.c b/scripts/gendwarfksyms/symbols.c index 35ed594f0749..ecddcb5ffcdf 100644 --- a/scripts/gendwarfksyms/symbols.c +++ b/scripts/gendwarfksyms/symbols.c @@ -128,7 +128,7 @@ static bool is_exported(const char *name) return for_each(name, NULL, NULL) > 0; } -void symbol_read_exports(FILE *file) +int symbol_read_exports(FILE *file) { struct symbol *sym; char *line = NULL; @@ -159,6 +159,8 @@ void symbol_read_exports(FILE *file) free(line); debug("%d exported symbols", nsym); + + return nsym; } static void get_symbol(struct symbol *sym, void *arg) -- cgit From 22a36e660d014925114feb09a2680bb3c2d1e279 Mon Sep 17 00:00:00 2001 From: Vitaly Prosyak Date: Thu, 6 Nov 2025 12:35:53 -0500 Subject: drm/amdgpu: disable peer-to-peer access for DCC-enabled GC12 VRAM surfaces MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Certain multi-GPU configurations (especially GFX12) may hit data corruption when a DCC-compressed VRAM surface is shared across GPUs using peer-to-peer (P2P) DMA transfers. Such surfaces rely on device-local metadata and cannot be safely accessed through a remote GPU’s page tables. Attempting to import a DCC-enabled surface through P2P leads to incorrect rendering or GPU faults. This change disables P2P for DCC-enabled VRAM buffers that are contiguous and allocated on GFX12+ hardware. In these cases, the importer falls back to the standard system-memory path, avoiding invalid access to compressed surfaces. Future work could consider optional migration (VRAM→System→VRAM) if a performance regression is observed when `attach->peer2peer = false`. Tested on: - Dual RX 9700 XT (Navi4x) setup - GNOME and Wayland compositor scenarios - Confirmed no corruption after disabling P2P under these conditions v2: Remove check TTM_PL_VRAM & TTM_PL_FLAG_CONTIGUOUS. v3: simplify for upsteam and fix ip version check (Alex) Suggested-by: Christian König Signed-off-by: Vitaly Prosyak Reviewed-by: Christian König Signed-off-by: Alex Deucher (cherry picked from commit 9dff2bb709e6fbd97e263fd12bf12802d2b5a0cf) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c index 8561ad7f6180..ed3bef1edfe4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c @@ -82,6 +82,18 @@ static int amdgpu_dma_buf_attach(struct dma_buf *dmabuf, struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev); + /* + * Disable peer-to-peer access for DCC-enabled VRAM surfaces on GFX12+. + * Such buffers cannot be safely accessed over P2P due to device-local + * compression metadata. Fallback to system-memory path instead. + * Device supports GFX12 (GC 12.x or newer) + * BO was created with the AMDGPU_GEM_CREATE_GFX12_DCC flag + * + */ + if (amdgpu_ip_version(adev, GC_HWIP, 0) >= IP_VERSION(12, 0, 0) && + bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC) + attach->peer2peer = false; + if (!amdgpu_dmabuf_is_xgmi_accessible(attach_adev, bo) && pci_p2pdma_distance(adev->pdev, attach->dev, false) < 0) attach->peer2peer = false; -- cgit From 9f8fd538e244b87e4556833da51ddd986f50cc81 Mon Sep 17 00:00:00 2001 From: Pierre-Eric Pelloux-Prayer Date: Tue, 4 Nov 2025 10:42:45 +0100 Subject: drm/amdgpu: jump to the correct label on failure MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit drm_sched_entity_init wasn't called yet, so the only thing to do is to release allocated memory. This doesn't fix any bug since entity is zero allocated and drm_sched_entity_fini does nothing in this case. Signed-off-by: Pierre-Eric Pelloux-Prayer Reviewed-by: Tvrtko Ursulin Acked-by: Christian König Signed-off-by: Alex Deucher (cherry picked from commit ec49374ccb8da86b465beaf09c367f3dfd648d8f) --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c index f5d5c45ddc0d..afedea02188d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c @@ -236,7 +236,7 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip, r = amdgpu_xcp_select_scheds(adev, hw_ip, hw_prio, fpriv, &num_scheds, &scheds); if (r) - goto cleanup_entity; + goto error_free_entity; } /* disable load balance if the hw engine retains context among dependent jobs */ -- cgit From 6623c5f9fd877868fba133b4ae4dab0052e82dad Mon Sep 17 00:00:00 2001 From: "Jesse.Zhang" Date: Fri, 24 Oct 2025 16:09:25 +0800 Subject: drm/amdgpu: fix lock warning in amdgpu_userq_fence_driver_process MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fix a potential deadlock caused by inconsistent spinlock usage between interrupt and process contexts in the userq fence driver. The issue occurs when amdgpu_userq_fence_driver_process() is called from both: - Interrupt context: gfx_v11_0_eop_irq() -> amdgpu_userq_fence_driver_process() - Process context: amdgpu_eviction_fence_suspend_worker() -> amdgpu_userq_fence_driver_force_completion() -> amdgpu_userq_fence_driver_process() In interrupt context, the spinlock was acquired without disabling interrupts, leaving it in {IN-HARDIRQ-W} state. When the same lock is acquired in process context, the kernel detects inconsistent locking since the process context acquisition would enable interrupts while holding a lock previously acquired in interrupt context. Kernel log shows: [ 4039.310790] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. [ 4039.310804] kworker/7:2/409 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 4039.310818] ffff9284e1bed000 (&fence_drv->fence_list_lock){?...}-{3:3}, [ 4039.310993] {IN-HARDIRQ-W} state was registered at: [ 4039.311004] lock_acquire+0xc6/0x300 [ 4039.311018] _raw_spin_lock+0x39/0x80 [ 4039.311031] amdgpu_userq_fence_driver_process.part.0+0x30/0x180 [amdgpu] [ 4039.311146] amdgpu_userq_fence_driver_process+0x17/0x30 [amdgpu] [ 4039.311257] gfx_v11_0_eop_irq+0x132/0x170 [amdgpu] Fix by using spin_lock_irqsave()/spin_unlock_irqrestore() to properly manage interrupt state regardless of calling context. Reviewed-by: Christian König Signed-off-by: Jesse Zhang Signed-off-by: Alex Deucher (cherry picked from commit ded3ad780cf97a04927773c4600823b84f7f3cc2) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c index 761bad98da3e..4d0096d0baa9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c @@ -151,15 +151,16 @@ void amdgpu_userq_fence_driver_process(struct amdgpu_userq_fence_driver *fence_d { struct amdgpu_userq_fence *userq_fence, *tmp; struct dma_fence *fence; + unsigned long flags; u64 rptr; int i; if (!fence_drv) return; + spin_lock_irqsave(&fence_drv->fence_list_lock, flags); rptr = amdgpu_userq_fence_read(fence_drv); - spin_lock(&fence_drv->fence_list_lock); list_for_each_entry_safe(userq_fence, tmp, &fence_drv->fences, link) { fence = &userq_fence->base; @@ -174,7 +175,7 @@ void amdgpu_userq_fence_driver_process(struct amdgpu_userq_fence_driver *fence_d list_del(&userq_fence->link); dma_fence_put(fence); } - spin_unlock(&fence_drv->fence_list_lock); + spin_unlock_irqrestore(&fence_drv->fence_list_lock, flags); } void amdgpu_userq_fence_driver_destroy(struct kref *ref) -- cgit From 33c995709121a3a29d4567a08c943bf7a5b24b78 Mon Sep 17 00:00:00 2001 From: Ivan Lipski Date: Thu, 23 Oct 2025 10:03:59 -0400 Subject: drm/amd/display: Allow VRR params change if unsynced with the stream MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [Why] When changing resolution (e.g., 4K → FHD) in mirror/clone mode with certain monitors, the monitor blanks and loses connection due to an early exit in vrr_settings_require_update(). The function only checks if VRR state, fixed refresh target, or min/max refresh rate range has changed. During mode changes, if the calculated min/max refresh values remain the same even though the stream's v_total changed, the function returns early without updating vrr_params.adjust.v_total_min/max, leaving the monitor's VRR timing parameters unsynced with the new mode, causing it to blank out. [How] Explicitly adjust VRR parameters to the stream's nominal v_total when VRR is supported, but inactive. Fixes: 6d31602a9f57 ("drm/amd/display: more liberal vmin/vmax update for freesync") Reviewed-by: Aurabindo Pillai Signed-off-by: Ivan Lipski Signed-off-by: Fangzhi Zuo Tested-by: Dan Wheeler Signed-off-by: Alex Deucher (cherry picked from commit 607df8248a011524211ee34850345305a1913f9e) --- drivers/gpu/drm/amd/display/modules/freesync/freesync.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c index ce421bcddcb0..1aae46d703ba 100644 --- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c +++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c @@ -1260,6 +1260,17 @@ void mod_freesync_handle_v_update(struct mod_freesync *mod_freesync, update_v_total_for_static_ramp( core_freesync, stream, in_out_vrr); } + + /* + * If VRR is inactive, set vtotal min and max to nominal vtotal + */ + if (in_out_vrr->state == VRR_STATE_INACTIVE) { + in_out_vrr->adjust.v_total_min = + mod_freesync_calc_v_total_from_refresh(stream, + in_out_vrr->max_refresh_in_uhz); + in_out_vrr->adjust.v_total_max = in_out_vrr->adjust.v_total_min; + return; + } } unsigned long long mod_freesync_calc_nominal_field_rate( -- cgit From 7132f7e025f9382157543dd86a62d161335b48b9 Mon Sep 17 00:00:00 2001 From: Sultan Alsawaf Date: Fri, 7 Nov 2025 13:07:13 -0500 Subject: drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO When the BO pointer provided to amdgpu_bo_create_kernel() points to non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin that address rather than allocate a new BO. This functionality is never desired for allocating ISP buffers. A new BO should always be created when isp_kernel_buffer_alloc() is called, per the description for isp_kernel_buffer_alloc(). Ensure this by zeroing *bo right before the amdgpu_bo_create_kernel() call. Fixes: 55d42f616976 ("drm/amd/amdgpu: Add helper functions for isp buffers") Reviewed-by: Mario Limonciello (AMD) Reviewed-by: Pratap Nirujogi Signed-off-by: Sultan Alsawaf Signed-off-by: Alex Deucher (cherry picked from commit 73c8c29baac7f0c7e703d92eba009008cbb5228e) --- drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c index 9cddbf50442a..37270c4dab8d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c @@ -280,6 +280,8 @@ int isp_kernel_buffer_alloc(struct device *dev, u64 size, if (ret) return ret; + /* Ensure *bo is NULL so a new BO will be created */ + *bo = NULL; ret = amdgpu_bo_create_kernel(adev, size, ISP_MC_ADDR_ALIGN, -- cgit From bbe3c115030da431c9ec843c18d5583e59482dd2 Mon Sep 17 00:00:00 2001 From: Sathishkumar S Date: Tue, 7 Oct 2025 13:17:51 +0530 Subject: drm/amdgpu/jpeg: Add parse_cs for JPEG5_0_1 enable parse_cs callback for JPEG5_0_1. Signed-off-by: Sathishkumar S Reviewed-by: Leo Liu Signed-off-by: Alex Deucher (cherry picked from commit 547985579932c1de13f57f8bcf62cd9361b9d3d3) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c index baf097d2e1ac..ab0bf880d3d8 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c @@ -878,6 +878,7 @@ static const struct amdgpu_ring_funcs jpeg_v5_0_1_dec_ring_vm_funcs = { .get_rptr = jpeg_v5_0_1_dec_ring_get_rptr, .get_wptr = jpeg_v5_0_1_dec_ring_get_wptr, .set_wptr = jpeg_v5_0_1_dec_ring_set_wptr, + .parse_cs = amdgpu_jpeg_dec_parse_cs, .emit_frame_size = SOC15_FLUSH_GPU_TLB_NUM_WREG * 6 + SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 8 + -- cgit From d15deafab5d722afb9e2f83c5edcdef9d9d98bd1 Mon Sep 17 00:00:00 2001 From: Jonathan Kim Date: Thu, 6 Nov 2025 10:17:06 -0500 Subject: drm/amdkfd: relax checks for over allocation of save area Over allocation of save area is not fatal, only under allocation is. ROCm has various components that independently claim authority over save area size. Unless KFD decides to claim single authority, relax size checks. Signed-off-by: Jonathan Kim Reviewed-by: Philip Yang Signed-off-by: Alex Deucher (cherry picked from commit 15bd4958fe38e763bc17b607ba55155254a01f55) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_queue.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c index a65c67cf56ff..f1e7583650c4 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c @@ -297,16 +297,16 @@ int kfd_queue_acquire_buffers(struct kfd_process_device *pdd, struct queue_prope goto out_err_unreserve; } - if (properties->ctx_save_restore_area_size != topo_dev->node_props.cwsr_size) { - pr_debug("queue cwsr size 0x%x not equal to node cwsr size 0x%x\n", + if (properties->ctx_save_restore_area_size < topo_dev->node_props.cwsr_size) { + pr_debug("queue cwsr size 0x%x not sufficient for node cwsr size 0x%x\n", properties->ctx_save_restore_area_size, topo_dev->node_props.cwsr_size); err = -EINVAL; goto out_err_unreserve; } - total_cwsr_size = (topo_dev->node_props.cwsr_size + topo_dev->node_props.debug_memory_size) - * NUM_XCC(pdd->dev->xcc_mask); + total_cwsr_size = (properties->ctx_save_restore_area_size + + topo_dev->node_props.debug_memory_size) * NUM_XCC(pdd->dev->xcc_mask); total_cwsr_size = ALIGN(total_cwsr_size, PAGE_SIZE); err = kfd_queue_buffer_get(vm, (void *)properties->ctx_save_restore_area_address, @@ -352,8 +352,8 @@ int kfd_queue_release_buffers(struct kfd_process_device *pdd, struct queue_prope topo_dev = kfd_topology_device_by_id(pdd->dev->id); if (!topo_dev) return -EINVAL; - total_cwsr_size = (topo_dev->node_props.cwsr_size + topo_dev->node_props.debug_memory_size) - * NUM_XCC(pdd->dev->xcc_mask); + total_cwsr_size = (properties->ctx_save_restore_area_size + + topo_dev->node_props.debug_memory_size) * NUM_XCC(pdd->dev->xcc_mask); total_cwsr_size = ALIGN(total_cwsr_size, PAGE_SIZE); kfd_queue_buffer_svm_put(pdd, properties->ctx_save_restore_area_address, total_cwsr_size); -- cgit From eac32ff42393efa6657efc821231b8d802c1d485 Mon Sep 17 00:00:00 2001 From: Harish Kasiviswanathan Date: Tue, 28 Oct 2025 14:37:07 -0400 Subject: drm/amdkfd: Fix GPU mappings for APU after prefetch Fix the following corner case:- Consider a 2M huge page SVM allocation, followed by prefetch call for the first 4K page. The whole range is initially mapped with single PTE. After the prefetch, this range gets split to first page + rest of the pages. Currently, the first page mapping is not updated on MI300A (APU) since page hasn't migrated. However, after range split PTE mapping it not valid. Fix this by forcing page table update for the whole range when prefetch is called. Calling prefetch on APU doesn't improve performance. If all it deteriotes. However, functionality has to be supported. v2: Use apu_prefer_gtt as this issue doesn't apply to APUs with carveout VRAM v3: Simplify by setting the flag for all ASICs as it doesn't affect dGPU v4: Remove v2 and v3 changes. Force update_mapping when range is split at a size that is not aligned to prange granularity Suggested-by: Philip Yang Signed-off-by: Harish Kasiviswanathan Reviewed-by: Philip Yang Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher (cherry picked from commit 076470b9f6f8d9c7c8ca73a9f054942a686f9ba7) --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 9d72411c3379..74a1d3e1d52b 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -3687,6 +3687,8 @@ svm_range_set_attr(struct kfd_process *p, struct mm_struct *mm, svm_range_apply_attrs(p, prange, nattr, attrs, &update_mapping); /* TODO: unmap ranges from GPU that lost access */ } + update_mapping |= !p->xnack_enabled && !list_empty(&remap_list); + list_for_each_entry_safe(prange, next, &remove_list, update_list) { pr_debug("unlink old 0x%p prange 0x%p [0x%lx 0x%lx]\n", prange->svms, prange, prange->start, -- cgit From a3f8f8662771285511ae26c4c8d3ba1cd22159b9 Mon Sep 17 00:00:00 2001 From: Christian Brauner Date: Wed, 5 Nov 2025 14:39:45 +0100 Subject: power: always freeze efivarfs The efivarfs filesystems must always be frozen and thawed to resync variable state. Make it so. Link: https://patch.msgid.link/20251105-vorbild-zutreffen-fe00d1dd98db@brauner Signed-off-by: Christian Brauner --- fs/efivarfs/super.c | 1 + fs/super.c | 13 ++++++++++--- include/linux/fs.h | 3 ++- kernel/power/hibernate.c | 9 +++------ kernel/power/suspend.c | 3 +-- 5 files changed, 17 insertions(+), 12 deletions(-) diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c index 1f4d8ce56667..6de97565d5f7 100644 --- a/fs/efivarfs/super.c +++ b/fs/efivarfs/super.c @@ -533,6 +533,7 @@ static struct file_system_type efivarfs_type = { .init_fs_context = efivarfs_init_fs_context, .kill_sb = efivarfs_kill_sb, .parameters = efivarfs_parameters, + .fs_flags = FS_POWER_FREEZE, }; static __init int efivarfs_init(void) diff --git a/fs/super.c b/fs/super.c index 5bab94fb7e03..277b84e5c279 100644 --- a/fs/super.c +++ b/fs/super.c @@ -1183,11 +1183,14 @@ static inline bool get_active_super(struct super_block *sb) static const char *filesystems_freeze_ptr = "filesystems_freeze"; -static void filesystems_freeze_callback(struct super_block *sb, void *unused) +static void filesystems_freeze_callback(struct super_block *sb, void *freeze_all_ptr) { if (!sb->s_op->freeze_fs && !sb->s_op->freeze_super) return; + if (freeze_all_ptr && !(sb->s_type->fs_flags & FS_POWER_FREEZE)) + return; + if (!get_active_super(sb)) return; @@ -1201,9 +1204,13 @@ static void filesystems_freeze_callback(struct super_block *sb, void *unused) deactivate_super(sb); } -void filesystems_freeze(void) +void filesystems_freeze(bool freeze_all) { - __iterate_supers(filesystems_freeze_callback, NULL, + void *freeze_all_ptr = NULL; + + if (freeze_all) + freeze_all_ptr = &freeze_all; + __iterate_supers(filesystems_freeze_callback, freeze_all_ptr, SUPER_ITER_UNLOCKED | SUPER_ITER_REVERSE); } diff --git a/include/linux/fs.h b/include/linux/fs.h index 3ea98c6cce81..249a1da8440e 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2689,6 +2689,7 @@ struct file_system_type { #define FS_ALLOW_IDMAP 32 /* FS has been updated to handle vfs idmappings. */ #define FS_MGTIME 64 /* FS uses multigrain timestamps */ #define FS_LBS 128 /* FS supports LBS */ +#define FS_POWER_FREEZE 256 /* Always freeze on suspend/hibernate */ #define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */ int (*init_fs_context)(struct fs_context *); const struct fs_parameter_spec *parameters; @@ -3606,7 +3607,7 @@ extern void drop_super_exclusive(struct super_block *sb); extern void iterate_supers(void (*f)(struct super_block *, void *), void *arg); extern void iterate_supers_type(struct file_system_type *, void (*)(struct super_block *, void *), void *); -void filesystems_freeze(void); +void filesystems_freeze(bool freeze_all); void filesystems_thaw(void); extern int dcache_dir_open(struct inode *, struct file *); diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c index 14e85ff23551..1f250ce036a0 100644 --- a/kernel/power/hibernate.c +++ b/kernel/power/hibernate.c @@ -825,8 +825,7 @@ int hibernate(void) goto Restore; ksys_sync_helper(); - if (filesystem_freeze_enabled) - filesystems_freeze(); + filesystems_freeze(filesystem_freeze_enabled); error = freeze_processes(); if (error) @@ -932,8 +931,7 @@ int hibernate_quiet_exec(int (*func)(void *data), void *data) if (error) goto restore; - if (filesystem_freeze_enabled) - filesystems_freeze(); + filesystems_freeze(filesystem_freeze_enabled); error = freeze_processes(); if (error) @@ -1083,8 +1081,7 @@ static int software_resume(void) if (error) goto Restore; - if (filesystem_freeze_enabled) - filesystems_freeze(); + filesystems_freeze(filesystem_freeze_enabled); pm_pr_dbg("Preparing processes for hibernation restore.\n"); error = freeze_processes(); diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c index 4bb4686c1c08..c933a63a9718 100644 --- a/kernel/power/suspend.c +++ b/kernel/power/suspend.c @@ -375,8 +375,7 @@ static int suspend_prepare(suspend_state_t state) if (error) goto Restore; - if (filesystem_freeze_enabled) - filesystems_freeze(); + filesystems_freeze(filesystem_freeze_enabled); trace_suspend_resume(TPS("freeze_processes"), 0, true); error = suspend_freeze_processes(); trace_suspend_resume(TPS("freeze_processes"), 0, false); -- cgit From f6fdd77b3e0d519a2535a1e923558cd07d9acda9 Mon Sep 17 00:00:00 2001 From: Baojun Xu Date: Wed, 12 Nov 2025 17:26:09 +0800 Subject: ALSA: hda/tas2781: Correct the wrong project ID The project hardware ID should be ALC287_FIXUP_TXNW2781_I2C, not ALC287_FIXUP_TAS2781_I2C for HP Lampass projects. Fixes: 7a39c723b747 ("ALSA: hda/tas2781: Add new quirk for HP new projects") Signed-off-by: Baojun Xu Link: https://patch.msgid.link/20251112092609.15865-1-baojun.xu@ti.com Signed-off-by: Takashi Iwai --- sound/hda/codecs/realtek/alc269.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index a9698bf26887..269b6c1e3b6d 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -6700,9 +6700,9 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x8ed8, "HP Merino16", ALC245_FIXUP_TAS2781_SPI_2), SND_PCI_QUIRK(0x103c, 0x8ed9, "HP Merino14W", ALC245_FIXUP_TAS2781_SPI_2), SND_PCI_QUIRK(0x103c, 0x8eda, "HP Merino16W", ALC245_FIXUP_TAS2781_SPI_2), - SND_PCI_QUIRK(0x103c, 0x8f40, "HP Lampas14", ALC287_FIXUP_TAS2781_I2C), - SND_PCI_QUIRK(0x103c, 0x8f41, "HP Lampas16", ALC287_FIXUP_TAS2781_I2C), - SND_PCI_QUIRK(0x103c, 0x8f42, "HP LampasW14", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x103c, 0x8f40, "HP Lampas14", ALC287_FIXUP_TXNW2781_I2C), + SND_PCI_QUIRK(0x103c, 0x8f41, "HP Lampas16", ALC287_FIXUP_TXNW2781_I2C), + SND_PCI_QUIRK(0x103c, 0x8f42, "HP LampasW14", ALC287_FIXUP_TXNW2781_I2C), SND_PCI_QUIRK(0x1043, 0x1032, "ASUS VivoBook X513EA", ALC256_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x1034, "ASUS GU605C", ALC285_FIXUP_ASUS_GU605_SPI_SPEAKER2_TO_DAC1), SND_PCI_QUIRK(0x1043, 0x103e, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC), -- cgit From 78f0e33cd6c939a555aa80dbed2fec6b333a7660 Mon Sep 17 00:00:00 2001 From: Andrei Vagin Date: Tue, 11 Nov 2025 06:28:15 +0000 Subject: fs/namespace: correctly handle errors returned by grab_requested_mnt_ns grab_requested_mnt_ns was changed to return error codes on failure, but its callers were not updated to check for error pointers, still checking only for a NULL return value. This commit updates the callers to use IS_ERR() or IS_ERR_OR_NULL() and PTR_ERR() to correctly check for and propagate errors. This also makes sure that the logic actually works and mount namespace file descriptors can be used to refere to mounts. Christian Brauner says: Rework the patch to be more ergonomic and in line with our overall error handling patterns. Fixes: 7b9d14af8777 ("fs: allow mount namespace fd") Cc: Christian Brauner Signed-off-by: Andrei Vagin Link: https://patch.msgid.link/20251111062815.2546189-1-avagin@google.com Reviewed-by: Jan Kara Signed-off-by: Christian Brauner --- fs/namespace.c | 32 ++++++++++++++++---------------- include/uapi/linux/mount.h | 2 +- 2 files changed, 17 insertions(+), 17 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index cc6e00e72437..2bad25709b2c 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -141,7 +141,8 @@ static void mnt_ns_release(struct mnt_namespace *ns) kfree(ns); } } -DEFINE_FREE(mnt_ns_release, struct mnt_namespace *, if (_T) mnt_ns_release(_T)) +DEFINE_FREE(mnt_ns_release, struct mnt_namespace *, + if (!IS_ERR(_T)) mnt_ns_release(_T)) static void mnt_ns_release_rcu(struct rcu_head *rcu) { @@ -5726,7 +5727,7 @@ static int copy_mnt_id_req(const struct mnt_id_req __user *req, ret = copy_struct_from_user(kreq, sizeof(*kreq), req, usize); if (ret) return ret; - if (kreq->spare != 0) + if (kreq->mnt_ns_fd != 0 && kreq->mnt_ns_id) return -EINVAL; /* The first valid unique mount id is MNT_UNIQUE_ID_OFFSET + 1. */ if (kreq->mnt_id <= MNT_UNIQUE_ID_OFFSET) @@ -5743,16 +5744,12 @@ static struct mnt_namespace *grab_requested_mnt_ns(const struct mnt_id_req *kreq { struct mnt_namespace *mnt_ns; - if (kreq->mnt_ns_id && kreq->spare) - return ERR_PTR(-EINVAL); - - if (kreq->mnt_ns_id) - return lookup_mnt_ns(kreq->mnt_ns_id); - - if (kreq->spare) { + if (kreq->mnt_ns_id) { + mnt_ns = lookup_mnt_ns(kreq->mnt_ns_id); + } else if (kreq->mnt_ns_fd) { struct ns_common *ns; - CLASS(fd, f)(kreq->spare); + CLASS(fd, f)(kreq->mnt_ns_fd); if (fd_empty(f)) return ERR_PTR(-EBADF); @@ -5767,6 +5764,8 @@ static struct mnt_namespace *grab_requested_mnt_ns(const struct mnt_id_req *kreq } else { mnt_ns = current->nsproxy->mnt_ns; } + if (!mnt_ns) + return ERR_PTR(-ENOENT); refcount_inc(&mnt_ns->passive); return mnt_ns; @@ -5791,8 +5790,8 @@ SYSCALL_DEFINE4(statmount, const struct mnt_id_req __user *, req, return ret; ns = grab_requested_mnt_ns(&kreq); - if (!ns) - return -ENOENT; + if (IS_ERR(ns)) + return PTR_ERR(ns); if (kreq.mnt_ns_id && (ns != current->nsproxy->mnt_ns) && !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN)) @@ -5902,8 +5901,8 @@ static void __free_klistmount_free(const struct klistmount *kls) static inline int prepare_klistmount(struct klistmount *kls, struct mnt_id_req *kreq, size_t nr_mnt_ids) { - u64 last_mnt_id = kreq->param; + struct mnt_namespace *ns; /* The first valid unique mount id is MNT_UNIQUE_ID_OFFSET + 1. */ if (last_mnt_id != 0 && last_mnt_id <= MNT_UNIQUE_ID_OFFSET) @@ -5917,9 +5916,10 @@ static inline int prepare_klistmount(struct klistmount *kls, struct mnt_id_req * if (!kls->kmnt_ids) return -ENOMEM; - kls->ns = grab_requested_mnt_ns(kreq); - if (!kls->ns) - return -ENOENT; + ns = grab_requested_mnt_ns(kreq); + if (IS_ERR(ns)) + return PTR_ERR(ns); + kls->ns = ns; kls->mnt_parent_id = kreq->mnt_id; return 0; diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h index 7fa67c2031a5..5d3f8c9e3a62 100644 --- a/include/uapi/linux/mount.h +++ b/include/uapi/linux/mount.h @@ -197,7 +197,7 @@ struct statmount { */ struct mnt_id_req { __u32 size; - __u32 spare; + __u32 mnt_ns_fd; __u64 mnt_id; __u64 param; __u64 mnt_ns_id; -- cgit From 3cd1548a278c7d6a9bdef1f1866e7cf66bfd3518 Mon Sep 17 00:00:00 2001 From: Mike Yuan Date: Sat, 8 Nov 2025 19:09:47 +0000 Subject: shmem: fix tmpfs reconfiguration (remount) when noswap is set In systemd we're trying to switch the internal credentials setup logic to new mount API [1], and I noticed fsconfig(FSCONFIG_CMD_RECONFIGURE) consistently fails on tmpfs with noswap option. This can be trivially reproduced with the following: ``` int fs_fd = fsopen("tmpfs", 0); fsconfig(fs_fd, FSCONFIG_SET_FLAG, "noswap", NULL, 0); fsconfig(fs_fd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); fsmount(fs_fd, 0, 0); fsconfig(fs_fd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0); <------ EINVAL ``` After some digging the culprit is shmem_reconfigure() rejecting !(ctx->seen & SHMEM_SEEN_NOSWAP) && sbinfo->noswap, which is bogus as ctx->seen serves as a mask for whether certain options are touched at all. On top of that, noswap option doesn't use fsparam_flag_no, hence it's not really possible to "reenable" swap to begin with. Drop the check and redundant SHMEM_SEEN_NOSWAP flag. [1] https://github.com/systemd/systemd/pull/39637 Fixes: 2c6efe9cf2d7 ("shmem: add support to ignore swap") Signed-off-by: Mike Yuan Link: https://patch.msgid.link/20251108190930.440685-1-me@yhndnzj.com Cc: Luis Chamberlain Cc: Christian Brauner Cc: Hugh Dickins Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner --- mm/shmem.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/mm/shmem.c b/mm/shmem.c index b9081b817d28..1b976414d6fa 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -131,8 +131,7 @@ struct shmem_options { #define SHMEM_SEEN_INODES 2 #define SHMEM_SEEN_HUGE 4 #define SHMEM_SEEN_INUMS 8 -#define SHMEM_SEEN_NOSWAP 16 -#define SHMEM_SEEN_QUOTA 32 +#define SHMEM_SEEN_QUOTA 16 }; #ifdef CONFIG_TRANSPARENT_HUGEPAGE @@ -4677,7 +4676,6 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param) "Turning off swap in unprivileged tmpfs mounts unsupported"); } ctx->noswap = true; - ctx->seen |= SHMEM_SEEN_NOSWAP; break; case Opt_quota: if (fc->user_ns != &init_user_ns) @@ -4827,14 +4825,15 @@ static int shmem_reconfigure(struct fs_context *fc) err = "Current inum too high to switch to 32-bit inums"; goto out; } - if ((ctx->seen & SHMEM_SEEN_NOSWAP) && ctx->noswap && !sbinfo->noswap) { + + /* + * "noswap" doesn't use fsparam_flag_no, i.e. there's no "swap" + * counterpart for (re-)enabling swap. + */ + if (ctx->noswap && !sbinfo->noswap) { err = "Cannot disable swap on remount"; goto out; } - if (!(ctx->seen & SHMEM_SEEN_NOSWAP) && !ctx->noswap && sbinfo->noswap) { - err = "Cannot enable swap on remount if it was disabled on first mount"; - goto out; - } if (ctx->seen & SHMEM_SEEN_QUOTA && !sb_any_quota_loaded(fc->root->d_sb)) { -- cgit From 12741624645e098b2234a5ae341045a97473caf1 Mon Sep 17 00:00:00 2001 From: Mateusz Guzik Date: Wed, 5 Nov 2025 22:20:24 +0100 Subject: fs: add iput_not_last() Signed-off-by: Mateusz Guzik Link: https://patch.msgid.link/20251105212025.807549-1-mjguzik@gmail.com Reviewed-by: Jan Kara Signed-off-by: Christian Brauner --- fs/inode.c | 12 ++++++++++++ include/linux/fs.h | 1 + 2 files changed, 13 insertions(+) diff --git a/fs/inode.c b/fs/inode.c index ec9339024ac3..cff1d3af0d57 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -1967,6 +1967,18 @@ retry: } EXPORT_SYMBOL(iput); +/** + * iput_not_last - put an inode assuming this is not the last reference + * @inode: inode to put + */ +void iput_not_last(struct inode *inode) +{ + VFS_BUG_ON_INODE(atomic_read(&inode->i_count) < 2, inode); + + WARN_ON(atomic_sub_return(1, &inode->i_count) == 0); +} +EXPORT_SYMBOL(iput_not_last); + #ifdef CONFIG_BLOCK /** * bmap - find a block number in a file diff --git a/include/linux/fs.h b/include/linux/fs.h index 249a1da8440e..dd3b57cfadee 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2824,6 +2824,7 @@ extern int current_umask(void); extern void ihold(struct inode * inode); extern void iput(struct inode *); +void iput_not_last(struct inode *); int inode_update_timestamps(struct inode *inode, int flags); int generic_update_time(struct inode *, int); -- cgit From 56325e8c68c0724d626f665773a5005dcf44e329 Mon Sep 17 00:00:00 2001 From: Mateusz Guzik Date: Wed, 5 Nov 2025 22:20:25 +0100 Subject: landlock: fix splats from iput() after it started calling might_sleep() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit At this point it is guaranteed this is not the last reference. However, a recent addition of might_sleep() at top of iput() started generating false-positives as it was executing for all values. Remedy the problem by using the newly introduced iput_not_last(). Reported-by: syzbot+12479ae15958fc3f54ec@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68d32659.a70a0220.4f78.0012.GAE@google.com/ Fixes: 2ef435a872ab ("fs: add might_sleep() annotation to iput() and more") Signed-off-by: Mateusz Guzik Link: https://patch.msgid.link/20251105212025.807549-2-mjguzik@gmail.com Reviewed-by: Mickaël Salaün Signed-off-by: Christian Brauner --- security/landlock/fs.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/security/landlock/fs.c b/security/landlock/fs.c index 0bade2c5aa1d..d9c12b993fa7 100644 --- a/security/landlock/fs.c +++ b/security/landlock/fs.c @@ -1335,11 +1335,10 @@ static void hook_sb_delete(struct super_block *const sb) * At this point, we own the ihold() reference that was * originally set up by get_inode_object() and the * __iget() reference that we just set in this loop - * walk. Therefore the following call to iput() will - * not sleep nor drop the inode because there is now at - * least two references to it. + * walk. Therefore there are at least two references + * on the inode. */ - iput(inode); + iput_not_last(inode); } else { spin_unlock(&object->lock); rcu_read_unlock(); -- cgit From 85592114ffda568b507bc2b04f5e9afbe7c13b62 Mon Sep 17 00:00:00 2001 From: Alexandru Elisei Date: Wed, 12 Nov 2025 10:28:53 +0000 Subject: KVM: arm64: VHE: Compute fgt traps before activating them On VHE, the Fine Grain Traps registers are written to hardware in kvm_arch_vcpu_load()->..->__activate_traps_hfgxtr(), but the fgt array is computed later, in kvm_vcpu_load_fgt(). This can lead to zero being written to the FGT registers the first time a VCPU is loaded. Also, any changes to the fgt array will be visible only after the VCPU is scheduled out, and then back in, which is not the intended behaviour. Fix it by computing the fgt array just before the fgt traps are written to hardware. Fixes: fb10ddf35c1c ("KVM: arm64: Compute per-vCPU FGTs at vcpu_load()") Signed-off-by: Alexandru Elisei Reviewed-by: Oliver Upton Link: https://patch.msgid.link/20251112102853.47759-1-alexandru.elisei@arm.com Signed-off-by: Marc Zyngier --- arch/arm64/kvm/arm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 870953b4a8a7..052bf0d4d0b0 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -624,6 +624,7 @@ nommu: kvm_timer_vcpu_load(vcpu); kvm_vgic_load(vcpu); kvm_vcpu_load_debug(vcpu); + kvm_vcpu_load_fgt(vcpu); if (has_vhe()) kvm_vcpu_load_vhe(vcpu); kvm_arch_vcpu_load_fp(vcpu); @@ -642,7 +643,6 @@ nommu: vcpu->arch.hcr_el2 |= HCR_TWI; vcpu_set_pauth_traps(vcpu); - kvm_vcpu_load_fgt(vcpu); if (is_protected_kvm_enabled()) { kvm_call_hyp_nvhe(__pkvm_vcpu_load, -- cgit From f2687d3cc9f905505d7b510c50970176115066a2 Mon Sep 17 00:00:00 2001 From: Imre Deak Date: Fri, 7 Nov 2025 14:41:41 +0200 Subject: drm/i915/dp_mst: Disable Panel Replay MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Disable Panel Replay on MST links until it's properly implemented. For instance the required VSC SDP is not programmed on MST and FEC is not enabled if Panel Replay is enabled. Fixes: 3257e55d3ea7 ("drm/i915/panelreplay: enable/disable panel replay") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15174 Cc: Jouni Högander Cc: Animesh Manna Cc: stable@vger.kernel.org # v6.8+ Reviewed-by: Jouni Högander Signed-off-by: Imre Deak Link: https://patch.msgid.link/20251107124141.911895-1-imre.deak@intel.com (cherry picked from commit e109f644b871df8440c886a69cdce971ed533088) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/i915/display/intel_psr.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c index d5e0a1e66944..4619237f1346 100644 --- a/drivers/gpu/drm/i915/display/intel_psr.c +++ b/drivers/gpu/drm/i915/display/intel_psr.c @@ -585,6 +585,10 @@ static void _panel_replay_init_dpcd(struct intel_dp *intel_dp) struct intel_display *display = to_intel_display(intel_dp); int ret; + /* TODO: Enable Panel Replay on MST once it's properly implemented. */ + if (intel_dp->mst_detect == DRM_DP_MST) + return; + ret = drm_dp_dpcd_read_data(&intel_dp->aux, DP_PANEL_REPLAY_CAP_SUPPORT, &intel_dp->pr_dpcd, sizeof(intel_dp->pr_dpcd)); if (ret < 0) -- cgit From 7c44656ab3ea6f8429027ed14c23b314502e2541 Mon Sep 17 00:00:00 2001 From: Alex Mastro Date: Tue, 11 Nov 2025 10:48:24 -0800 Subject: vfio: selftests: add iova range query helpers VFIO selftests need to map IOVAs from legally accessible ranges, which could vary between hardware. Tests in vfio_dma_mapping_test.c are making excessively strong assumptions about which IOVAs can be mapped. Add vfio_iommu_iova_ranges(), which queries IOVA ranges from the IOMMUFD or VFIO container associated with the device. The queried ranges are normalized to IOMMUFD's iommu_iova_range representation so that handling of IOVA ranges up the stack can be implementation-agnostic. iommu_iova_range and vfio_iova_range are equivalent, so bias to using the new interface's struct. Query IOMMUFD's ranges with IOMMU_IOAS_IOVA_RANGES. Query VFIO container's ranges with VFIO_IOMMU_GET_INFO and VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE. The underlying vfio_iommu_type1_info buffer-related functionality has been kept generic so the same helpers can be used to query other capability chain information, if needed. Reviewed-by: David Matlack Tested-by: David Matlack Signed-off-by: Alex Mastro Link: https://lore.kernel.org/r/20251111-iova-ranges-v3-1-7960244642c5@fb.com Signed-off-by: Alex Williamson --- .../testing/selftests/vfio/lib/include/vfio_util.h | 8 +- tools/testing/selftests/vfio/lib/vfio_pci_device.c | 172 +++++++++++++++++++++ 2 files changed, 179 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/vfio/lib/include/vfio_util.h b/tools/testing/selftests/vfio/lib/include/vfio_util.h index 240409bf5f8a..ef8f06ef0c13 100644 --- a/tools/testing/selftests/vfio/lib/include/vfio_util.h +++ b/tools/testing/selftests/vfio/lib/include/vfio_util.h @@ -4,9 +4,12 @@ #include #include -#include + +#include +#include #include #include +#include #include "../../../kselftest.h" @@ -206,6 +209,9 @@ struct vfio_pci_device *vfio_pci_device_init(const char *bdf, const char *iommu_ void vfio_pci_device_cleanup(struct vfio_pci_device *device); void vfio_pci_device_reset(struct vfio_pci_device *device); +struct iommu_iova_range *vfio_pci_iova_ranges(struct vfio_pci_device *device, + u32 *nranges); + int __vfio_pci_dma_map(struct vfio_pci_device *device, struct vfio_dma_region *region); int __vfio_pci_dma_unmap(struct vfio_pci_device *device, diff --git a/tools/testing/selftests/vfio/lib/vfio_pci_device.c b/tools/testing/selftests/vfio/lib/vfio_pci_device.c index a381fd253aa7..11749348f53f 100644 --- a/tools/testing/selftests/vfio/lib/vfio_pci_device.c +++ b/tools/testing/selftests/vfio/lib/vfio_pci_device.c @@ -29,6 +29,178 @@ VFIO_ASSERT_EQ(__ret, 0, "ioctl(%s, %s, %s) returned %d\n", #_fd, #_op, #_arg, __ret); \ } while (0) +static struct vfio_info_cap_header *next_cap_hdr(void *buf, u32 bufsz, + u32 *cap_offset) +{ + struct vfio_info_cap_header *hdr; + + if (!*cap_offset) + return NULL; + + VFIO_ASSERT_LT(*cap_offset, bufsz); + VFIO_ASSERT_GE(bufsz - *cap_offset, sizeof(*hdr)); + + hdr = (struct vfio_info_cap_header *)((u8 *)buf + *cap_offset); + *cap_offset = hdr->next; + + return hdr; +} + +static struct vfio_info_cap_header *vfio_iommu_info_cap_hdr(struct vfio_iommu_type1_info *info, + u16 cap_id) +{ + struct vfio_info_cap_header *hdr; + u32 cap_offset = info->cap_offset; + u32 max_depth; + u32 depth = 0; + + if (!(info->flags & VFIO_IOMMU_INFO_CAPS)) + return NULL; + + if (cap_offset) + VFIO_ASSERT_GE(cap_offset, sizeof(*info)); + + max_depth = (info->argsz - sizeof(*info)) / sizeof(*hdr); + + while ((hdr = next_cap_hdr(info, info->argsz, &cap_offset))) { + depth++; + VFIO_ASSERT_LE(depth, max_depth, "Capability chain contains a cycle\n"); + + if (hdr->id == cap_id) + return hdr; + } + + return NULL; +} + +/* Return buffer including capability chain, if present. Free with free() */ +static struct vfio_iommu_type1_info *vfio_iommu_get_info(struct vfio_pci_device *device) +{ + struct vfio_iommu_type1_info *info; + + info = malloc(sizeof(*info)); + VFIO_ASSERT_NOT_NULL(info); + + *info = (struct vfio_iommu_type1_info) { + .argsz = sizeof(*info), + }; + + ioctl_assert(device->container_fd, VFIO_IOMMU_GET_INFO, info); + VFIO_ASSERT_GE(info->argsz, sizeof(*info)); + + info = realloc(info, info->argsz); + VFIO_ASSERT_NOT_NULL(info); + + ioctl_assert(device->container_fd, VFIO_IOMMU_GET_INFO, info); + VFIO_ASSERT_GE(info->argsz, sizeof(*info)); + + return info; +} + +/* + * Return iova ranges for the device's container. Normalize vfio_iommu_type1 to + * report iommufd's iommu_iova_range. Free with free(). + */ +static struct iommu_iova_range *vfio_iommu_iova_ranges(struct vfio_pci_device *device, + u32 *nranges) +{ + struct vfio_iommu_type1_info_cap_iova_range *cap_range; + struct vfio_iommu_type1_info *info; + struct vfio_info_cap_header *hdr; + struct iommu_iova_range *ranges = NULL; + + info = vfio_iommu_get_info(device); + hdr = vfio_iommu_info_cap_hdr(info, VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE); + VFIO_ASSERT_NOT_NULL(hdr); + + cap_range = container_of(hdr, struct vfio_iommu_type1_info_cap_iova_range, header); + VFIO_ASSERT_GT(cap_range->nr_iovas, 0); + + ranges = calloc(cap_range->nr_iovas, sizeof(*ranges)); + VFIO_ASSERT_NOT_NULL(ranges); + + for (u32 i = 0; i < cap_range->nr_iovas; i++) { + ranges[i] = (struct iommu_iova_range){ + .start = cap_range->iova_ranges[i].start, + .last = cap_range->iova_ranges[i].end, + }; + } + + *nranges = cap_range->nr_iovas; + + free(info); + return ranges; +} + +/* Return iova ranges of the device's IOAS. Free with free() */ +static struct iommu_iova_range *iommufd_iova_ranges(struct vfio_pci_device *device, + u32 *nranges) +{ + struct iommu_iova_range *ranges; + int ret; + + struct iommu_ioas_iova_ranges query = { + .size = sizeof(query), + .ioas_id = device->ioas_id, + }; + + ret = ioctl(device->iommufd, IOMMU_IOAS_IOVA_RANGES, &query); + VFIO_ASSERT_EQ(ret, -1); + VFIO_ASSERT_EQ(errno, EMSGSIZE); + VFIO_ASSERT_GT(query.num_iovas, 0); + + ranges = calloc(query.num_iovas, sizeof(*ranges)); + VFIO_ASSERT_NOT_NULL(ranges); + + query.allowed_iovas = (uintptr_t)ranges; + + ioctl_assert(device->iommufd, IOMMU_IOAS_IOVA_RANGES, &query); + *nranges = query.num_iovas; + + return ranges; +} + +static int iova_range_comp(const void *a, const void *b) +{ + const struct iommu_iova_range *ra = a, *rb = b; + + if (ra->start < rb->start) + return -1; + + if (ra->start > rb->start) + return 1; + + return 0; +} + +/* Return sorted IOVA ranges of the device. Free with free(). */ +struct iommu_iova_range *vfio_pci_iova_ranges(struct vfio_pci_device *device, + u32 *nranges) +{ + struct iommu_iova_range *ranges; + + if (device->iommufd) + ranges = iommufd_iova_ranges(device, nranges); + else + ranges = vfio_iommu_iova_ranges(device, nranges); + + if (!ranges) + return NULL; + + VFIO_ASSERT_GT(*nranges, 0); + + /* Sort and check that ranges are sane and non-overlapping */ + qsort(ranges, *nranges, sizeof(*ranges), iova_range_comp); + VFIO_ASSERT_LT(ranges[0].start, ranges[0].last); + + for (u32 i = 1; i < *nranges; i++) { + VFIO_ASSERT_LT(ranges[i].start, ranges[i].last); + VFIO_ASSERT_LT(ranges[i - 1].last, ranges[i].start); + } + + return ranges; +} + iova_t __to_iova(struct vfio_pci_device *device, void *vaddr) { struct vfio_dma_region *region; -- cgit From a77fa0b9222d2f23a764061a3be18e6bc738672e Mon Sep 17 00:00:00 2001 From: Alex Mastro Date: Tue, 11 Nov 2025 10:48:25 -0800 Subject: vfio: selftests: fix map limit tests to use last available iova Use the newly available vfio_pci_iova_ranges() to determine the last legal IOVA, and use this as the basis for vfio_dma_map_limit_test tests. Fixes: de8d1f2fd5a5 ("vfio: selftests: add end of address space DMA map/unmap tests") Reviewed-by: David Matlack Tested-by: David Matlack Signed-off-by: Alex Mastro Link: https://lore.kernel.org/r/20251111-iova-ranges-v3-2-7960244642c5@fb.com Signed-off-by: Alex Williamson --- tools/testing/selftests/vfio/vfio_dma_mapping_test.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/vfio/vfio_dma_mapping_test.c b/tools/testing/selftests/vfio/vfio_dma_mapping_test.c index 4f1ea79a200c..e1374aab96bd 100644 --- a/tools/testing/selftests/vfio/vfio_dma_mapping_test.c +++ b/tools/testing/selftests/vfio/vfio_dma_mapping_test.c @@ -3,6 +3,8 @@ #include #include +#include +#include #include #include #include @@ -219,7 +221,10 @@ FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES(); FIXTURE_SETUP(vfio_dma_map_limit_test) { struct vfio_dma_region *region = &self->region; + struct iommu_iova_range *ranges; u64 region_size = getpagesize(); + iova_t last_iova; + u32 nranges; /* * Over-allocate mmap by double the size to provide enough backing vaddr @@ -232,8 +237,13 @@ FIXTURE_SETUP(vfio_dma_map_limit_test) MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); ASSERT_NE(region->vaddr, MAP_FAILED); - /* One page prior to the end of address space */ - region->iova = ~(iova_t)0 & ~(region_size - 1); + ranges = vfio_pci_iova_ranges(self->device, &nranges); + VFIO_ASSERT_NOT_NULL(ranges); + last_iova = ranges[nranges - 1].last; + free(ranges); + + /* One page prior to the last iova */ + region->iova = last_iova & ~(region_size - 1); region->size = region_size; } @@ -276,6 +286,7 @@ TEST_F(vfio_dma_map_limit_test, overflow) struct vfio_dma_region *region = &self->region; int rc; + region->iova = ~(iova_t)0 & ~(region->size - 1); region->size = self->mmap_size; rc = __vfio_pci_dma_map(self->device, region); -- cgit From ce0e3c403e00e9e03e80aca6570bf936a44279e2 Mon Sep 17 00:00:00 2001 From: Alex Mastro Date: Tue, 11 Nov 2025 10:48:26 -0800 Subject: vfio: selftests: add iova allocator Add struct iova_allocator, which gives tests a convenient way to generate legally-accessible IOVAs to map. This allocator traverses the sorted available IOVA ranges linearly, requires power-of-two size allocations, and does not support freeing iova allocations. The assumption is that tests are not IOVA space-bounded, and will not need to recycle IOVAs. This is based on Alex Williamson's patch series for adding an IOVA allocator [1]. [1] https://lore.kernel.org/all/20251108212954.26477-1-alex@shazbot.org/ Reviewed-by: David Matlack Tested-by: David Matlack Signed-off-by: Alex Mastro Link: https://lore.kernel.org/r/20251111-iova-ranges-v3-3-7960244642c5@fb.com Signed-off-by: Alex Williamson --- .../testing/selftests/vfio/lib/include/vfio_util.h | 11 ++++ tools/testing/selftests/vfio/lib/vfio_pci_device.c | 74 +++++++++++++++++++++- 2 files changed, 84 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/vfio/lib/include/vfio_util.h b/tools/testing/selftests/vfio/lib/include/vfio_util.h index ef8f06ef0c13..69ec0c856481 100644 --- a/tools/testing/selftests/vfio/lib/include/vfio_util.h +++ b/tools/testing/selftests/vfio/lib/include/vfio_util.h @@ -188,6 +188,13 @@ struct vfio_pci_device { struct vfio_pci_driver driver; }; +struct iova_allocator { + struct iommu_iova_range *ranges; + u32 nranges; + u32 range_idx; + u64 range_offset; +}; + /* * Return the BDF string of the device that the test should use. * @@ -212,6 +219,10 @@ void vfio_pci_device_reset(struct vfio_pci_device *device); struct iommu_iova_range *vfio_pci_iova_ranges(struct vfio_pci_device *device, u32 *nranges); +struct iova_allocator *iova_allocator_init(struct vfio_pci_device *device); +void iova_allocator_cleanup(struct iova_allocator *allocator); +iova_t iova_allocator_alloc(struct iova_allocator *allocator, size_t size); + int __vfio_pci_dma_map(struct vfio_pci_device *device, struct vfio_dma_region *region); int __vfio_pci_dma_unmap(struct vfio_pci_device *device, diff --git a/tools/testing/selftests/vfio/lib/vfio_pci_device.c b/tools/testing/selftests/vfio/lib/vfio_pci_device.c index 11749348f53f..b479a359da12 100644 --- a/tools/testing/selftests/vfio/lib/vfio_pci_device.c +++ b/tools/testing/selftests/vfio/lib/vfio_pci_device.c @@ -12,11 +12,12 @@ #include #include +#include #include #include +#include #include #include -#include #include "../../../kselftest.h" #include @@ -201,6 +202,77 @@ struct iommu_iova_range *vfio_pci_iova_ranges(struct vfio_pci_device *device, return ranges; } +struct iova_allocator *iova_allocator_init(struct vfio_pci_device *device) +{ + struct iova_allocator *allocator; + struct iommu_iova_range *ranges; + u32 nranges; + + ranges = vfio_pci_iova_ranges(device, &nranges); + VFIO_ASSERT_NOT_NULL(ranges); + + allocator = malloc(sizeof(*allocator)); + VFIO_ASSERT_NOT_NULL(allocator); + + *allocator = (struct iova_allocator){ + .ranges = ranges, + .nranges = nranges, + .range_idx = 0, + .range_offset = 0, + }; + + return allocator; +} + +void iova_allocator_cleanup(struct iova_allocator *allocator) +{ + free(allocator->ranges); + free(allocator); +} + +iova_t iova_allocator_alloc(struct iova_allocator *allocator, size_t size) +{ + VFIO_ASSERT_GT(size, 0, "Invalid size arg, zero\n"); + VFIO_ASSERT_EQ(size & (size - 1), 0, "Invalid size arg, non-power-of-2\n"); + + for (;;) { + struct iommu_iova_range *range; + iova_t iova, last; + + VFIO_ASSERT_LT(allocator->range_idx, allocator->nranges, + "IOVA allocator out of space\n"); + + range = &allocator->ranges[allocator->range_idx]; + iova = range->start + allocator->range_offset; + + /* Check for sufficient space at the current offset */ + if (check_add_overflow(iova, size - 1, &last) || + last > range->last) + goto next_range; + + /* Align iova to size */ + iova = last & ~(size - 1); + + /* Check for sufficient space at the aligned iova */ + if (check_add_overflow(iova, size - 1, &last) || + last > range->last) + goto next_range; + + if (last == range->last) { + allocator->range_idx++; + allocator->range_offset = 0; + } else { + allocator->range_offset = last - range->start + 1; + } + + return iova; + +next_range: + allocator->range_idx++; + allocator->range_offset = 0; + } +} + iova_t __to_iova(struct vfio_pci_device *device, void *vaddr) { struct vfio_dma_region *region; -- cgit From d323ad739666761646048fca587734f4ae64f2c8 Mon Sep 17 00:00:00 2001 From: Alex Mastro Date: Tue, 11 Nov 2025 10:48:27 -0800 Subject: vfio: selftests: replace iova=vaddr with allocated iovas vfio_dma_mapping_test and vfio_pci_driver_test currently use iova=vaddr as part of DMA mapping operations. However, not all IOMMUs support the same virtual address width as the processor. For instance, older Intel consumer platforms only support 39-bits of IOMMU address space. On such platforms, using the virtual address as the IOVA fails. Make the tests more robust by using iova_allocator to vend IOVAs, which queries legally accessible IOVAs from the underlying IOMMUFD or VFIO container. Reviewed-by: David Matlack Tested-by: David Matlack Signed-off-by: Alex Mastro Link: https://lore.kernel.org/r/20251111-iova-ranges-v3-4-7960244642c5@fb.com Signed-off-by: Alex Williamson --- tools/testing/selftests/vfio/vfio_dma_mapping_test.c | 5 ++++- tools/testing/selftests/vfio/vfio_pci_driver_test.c | 12 ++++++++---- 2 files changed, 12 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/vfio/vfio_dma_mapping_test.c b/tools/testing/selftests/vfio/vfio_dma_mapping_test.c index e1374aab96bd..102603d4407d 100644 --- a/tools/testing/selftests/vfio/vfio_dma_mapping_test.c +++ b/tools/testing/selftests/vfio/vfio_dma_mapping_test.c @@ -95,6 +95,7 @@ static int iommu_mapping_get(const char *bdf, u64 iova, FIXTURE(vfio_dma_mapping_test) { struct vfio_pci_device *device; + struct iova_allocator *iova_allocator; }; FIXTURE_VARIANT(vfio_dma_mapping_test) { @@ -119,10 +120,12 @@ FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES(anonymous_hugetlb_1gb, SZ_1G, MAP_HUGETLB | FIXTURE_SETUP(vfio_dma_mapping_test) { self->device = vfio_pci_device_init(device_bdf, variant->iommu_mode); + self->iova_allocator = iova_allocator_init(self->device); } FIXTURE_TEARDOWN(vfio_dma_mapping_test) { + iova_allocator_cleanup(self->iova_allocator); vfio_pci_device_cleanup(self->device); } @@ -144,7 +147,7 @@ TEST_F(vfio_dma_mapping_test, dma_map_unmap) else ASSERT_NE(region.vaddr, MAP_FAILED); - region.iova = (u64)region.vaddr; + region.iova = iova_allocator_alloc(self->iova_allocator, size); region.size = size; vfio_pci_dma_map(self->device, ®ion); diff --git a/tools/testing/selftests/vfio/vfio_pci_driver_test.c b/tools/testing/selftests/vfio/vfio_pci_driver_test.c index 2dbd70b7db62..f69eec8b928d 100644 --- a/tools/testing/selftests/vfio/vfio_pci_driver_test.c +++ b/tools/testing/selftests/vfio/vfio_pci_driver_test.c @@ -19,6 +19,7 @@ static const char *device_bdf; } while (0) static void region_setup(struct vfio_pci_device *device, + struct iova_allocator *iova_allocator, struct vfio_dma_region *region, u64 size) { const int flags = MAP_SHARED | MAP_ANONYMOUS; @@ -29,7 +30,7 @@ static void region_setup(struct vfio_pci_device *device, VFIO_ASSERT_NE(vaddr, MAP_FAILED); region->vaddr = vaddr; - region->iova = (u64)vaddr; + region->iova = iova_allocator_alloc(iova_allocator, size); region->size = size; vfio_pci_dma_map(device, region); @@ -44,6 +45,7 @@ static void region_teardown(struct vfio_pci_device *device, FIXTURE(vfio_pci_driver_test) { struct vfio_pci_device *device; + struct iova_allocator *iova_allocator; struct vfio_dma_region memcpy_region; void *vaddr; int msi_fd; @@ -72,14 +74,15 @@ FIXTURE_SETUP(vfio_pci_driver_test) struct vfio_pci_driver *driver; self->device = vfio_pci_device_init(device_bdf, variant->iommu_mode); + self->iova_allocator = iova_allocator_init(self->device); driver = &self->device->driver; - region_setup(self->device, &self->memcpy_region, SZ_1G); - region_setup(self->device, &driver->region, SZ_2M); + region_setup(self->device, self->iova_allocator, &self->memcpy_region, SZ_1G); + region_setup(self->device, self->iova_allocator, &driver->region, SZ_2M); /* Any IOVA that doesn't overlap memcpy_region and driver->region. */ - self->unmapped_iova = 8UL * SZ_1G; + self->unmapped_iova = iova_allocator_alloc(self->iova_allocator, SZ_1G); vfio_pci_driver_init(self->device); self->msi_fd = self->device->msi_eventfds[driver->msi]; @@ -108,6 +111,7 @@ FIXTURE_TEARDOWN(vfio_pci_driver_test) region_teardown(self->device, &self->memcpy_region); region_teardown(self->device, &driver->region); + iova_allocator_cleanup(self->iova_allocator); vfio_pci_device_cleanup(self->device); } -- cgit From 2d0e88f3fd1dcb37072d499c36162baf5b009d41 Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Tue, 11 Nov 2025 12:15:29 -0700 Subject: io_uring/rsrc: don't use blk_rq_nr_phys_segments() as number of bvecs io_buffer_register_bvec() currently uses blk_rq_nr_phys_segments() as the number of bvecs in the request. However, bvecs may be split into multiple segments depending on the queue limits. Thus, the number of segments may overestimate the number of bvecs. For ublk devices, the only current users of io_buffer_register_bvec(), virt_boundary_mask, seg_boundary_mask, max_segments, and max_segment_size can all be set arbitrarily by the ublk server process. Set imu->nr_bvecs based on the number of bvecs the rq_for_each_bvec() loop actually yields. However, continue using blk_rq_nr_phys_segments() as an upper bound on the number of bvecs when allocating imu to avoid needing to iterate the bvecs a second time. Link: https://lore.kernel.org/io-uring/20251111191530.1268875-1-csander@purestorage.com/ Signed-off-by: Caleb Sander Mateos Fixes: 27cb27b6d5ea ("io_uring: add support for kernel registered bvecs") Reviewed-by: Ming Lei Reviewed-by: Chaitanya Kulkarni Signed-off-by: Jens Axboe --- io_uring/rsrc.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c index 2602d76d5ff0..0010c4992490 100644 --- a/io_uring/rsrc.c +++ b/io_uring/rsrc.c @@ -943,8 +943,8 @@ int io_buffer_register_bvec(struct io_uring_cmd *cmd, struct request *rq, struct req_iterator rq_iter; struct io_mapped_ubuf *imu; struct io_rsrc_node *node; - struct bio_vec bv, *bvec; - u16 nr_bvecs; + struct bio_vec bv; + unsigned int nr_bvecs = 0; int ret = 0; io_ring_submit_lock(ctx, issue_flags); @@ -965,8 +965,11 @@ int io_buffer_register_bvec(struct io_uring_cmd *cmd, struct request *rq, goto unlock; } - nr_bvecs = blk_rq_nr_phys_segments(rq); - imu = io_alloc_imu(ctx, nr_bvecs); + /* + * blk_rq_nr_phys_segments() may overestimate the number of bvecs + * but avoids needing to iterate over the bvecs + */ + imu = io_alloc_imu(ctx, blk_rq_nr_phys_segments(rq)); if (!imu) { kfree(node); ret = -ENOMEM; @@ -977,16 +980,15 @@ int io_buffer_register_bvec(struct io_uring_cmd *cmd, struct request *rq, imu->len = blk_rq_bytes(rq); imu->acct_pages = 0; imu->folio_shift = PAGE_SHIFT; - imu->nr_bvecs = nr_bvecs; refcount_set(&imu->refs, 1); imu->release = release; imu->priv = rq; imu->is_kbuf = true; imu->dir = 1 << rq_data_dir(rq); - bvec = imu->bvec; rq_for_each_bvec(bv, rq, rq_iter) - *bvec++ = bv; + imu->bvec[nr_bvecs++] = bv; + imu->nr_bvecs = nr_bvecs; node->buf = imu; data->nodes[index] = node; -- cgit From 5f02151c411dda46efcc5dc57b0845efcdcfc26d Mon Sep 17 00:00:00 2001 From: Zqiang Date: Wed, 12 Nov 2025 15:33:28 +0800 Subject: sched_ext: Fix unsafe locking in the scx_dump_state() For built with CONFIG_PREEMPT_RT=y kernels, the dump_lock will be converted sleepable spinlock and not disable-irq, so the following scenarios occur: inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. irq_work/0/27 [HC0[0]:SC0[0]:HE1:SE1] takes: (&rq->__lock){?...}-{2:2}, at: raw_spin_rq_lock_nested+0x2b/0x40 {IN-HARDIRQ-W} state was registered at: lock_acquire+0x1e1/0x510 _raw_spin_lock_nested+0x42/0x80 raw_spin_rq_lock_nested+0x2b/0x40 sched_tick+0xae/0x7b0 update_process_times+0x14c/0x1b0 tick_periodic+0x62/0x1f0 tick_handle_periodic+0x48/0xf0 timer_interrupt+0x55/0x80 __handle_irq_event_percpu+0x20a/0x5c0 handle_irq_event_percpu+0x18/0xc0 handle_irq_event+0xb5/0x150 handle_level_irq+0x220/0x460 __common_interrupt+0xa2/0x1e0 common_interrupt+0xb0/0xd0 asm_common_interrupt+0x2b/0x40 _raw_spin_unlock_irqrestore+0x45/0x80 __setup_irq+0xc34/0x1a30 request_threaded_irq+0x214/0x2f0 hpet_time_init+0x3e/0x60 x86_late_time_init+0x5b/0xb0 start_kernel+0x308/0x410 x86_64_start_reservations+0x1c/0x30 x86_64_start_kernel+0x96/0xa0 common_startup_64+0x13e/0x148 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&rq->__lock); lock(&rq->__lock); *** DEADLOCK *** stack backtrace: CPU: 0 UID: 0 PID: 27 Comm: irq_work/0 Call Trace: dump_stack_lvl+0x8c/0xd0 dump_stack+0x14/0x20 print_usage_bug+0x42e/0x690 mark_lock.part.44+0x867/0xa70 ? __pfx_mark_lock.part.44+0x10/0x10 ? string_nocheck+0x19c/0x310 ? number+0x739/0x9f0 ? __pfx_string_nocheck+0x10/0x10 ? __pfx_check_pointer+0x10/0x10 ? kvm_sched_clock_read+0x15/0x30 ? sched_clock_noinstr+0xd/0x20 ? local_clock_noinstr+0x1c/0xe0 __lock_acquire+0xc4b/0x62b0 ? __pfx_format_decode+0x10/0x10 ? __pfx_string+0x10/0x10 ? __pfx___lock_acquire+0x10/0x10 ? __pfx_vsnprintf+0x10/0x10 lock_acquire+0x1e1/0x510 ? raw_spin_rq_lock_nested+0x2b/0x40 ? __pfx_lock_acquire+0x10/0x10 ? dump_line+0x12e/0x270 ? raw_spin_rq_lock_nested+0x20/0x40 _raw_spin_lock_nested+0x42/0x80 ? raw_spin_rq_lock_nested+0x2b/0x40 raw_spin_rq_lock_nested+0x2b/0x40 scx_dump_state+0x3b3/0x1270 ? finish_task_switch+0x27e/0x840 scx_ops_error_irq_workfn+0x67/0x80 irq_work_single+0x113/0x260 irq_work_run_list.part.3+0x44/0x70 run_irq_workd+0x6b/0x90 ? __pfx_run_irq_workd+0x10/0x10 smpboot_thread_fn+0x529/0x870 ? __pfx_smpboot_thread_fn+0x10/0x10 kthread+0x305/0x3f0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x40/0x70 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 This commit therefore use rq_lock_irqsave/irqrestore() to replace rq_lock/unlock() in the scx_dump_state(). Fixes: 07814a9439a3 ("sched_ext: Print debug dump after an error exit") Signed-off-by: Zqiang Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 1a019a7728fb..184d562bda5e 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -4276,7 +4276,7 @@ static void scx_dump_state(struct scx_exit_info *ei, size_t dump_len) size_t avail, used; bool idle; - rq_lock(rq, &rf); + rq_lock_irqsave(rq, &rf); idle = list_empty(&rq->scx.runnable_list) && rq->curr->sched_class == &idle_sched_class; @@ -4345,7 +4345,7 @@ static void scx_dump_state(struct scx_exit_info *ei, size_t dump_len) list_for_each_entry(p, &rq->scx.runnable_list, scx.runnable_node) scx_dump_task(&s, &dctx, p, ' '); next: - rq_unlock(rq, &rf); + rq_unlock_irqrestore(rq, &rf); } dump_newline(&s); -- cgit From 4b747cc628d8f500d56cf1338280eacc66362ff3 Mon Sep 17 00:00:00 2001 From: Srinivas Pandruvada Date: Mon, 10 Nov 2025 17:08:40 -0800 Subject: cpufreq: intel_pstate: Check IDA only before MSR_IA32_PERF_CTL writes Commit ac4e04d9e378 ("cpufreq: intel_pstate: Unchecked MSR aceess in legacy mode") introduced a check for feature X86_FEATURE_IDA to verify turbo mode support. Although this is the correct way to check for turbo mode support, it causes issues on some platforms that disable turbo during OS boot, but enable it later [1]. Before adding this feature check, users were able to get turbo mode frequencies by writing 0 to /sys/devices/system/cpu/intel_pstate/no_turbo post-boot. To restore the old behavior on the affected systems while still addressing the unchecked MSR issue on some Skylake-X systems, check X86_FEATURE_IDA only immediately before updates of MSR_IA32_PERF_CTL that may involve setting the Turbo Engage Bit (bit 32). Fixes: ac4e04d9e378 ("cpufreq: intel_pstate: Unchecked MSR aceess in legacy mode") Reported-by: Aaron Rainbolt Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2122531 [1] Tested-by: Aaron Rainbolt Signed-off-by: Srinivas Pandruvada [ rjw: Subject adjustment, changelog edits ] Link: https://patch.msgid.link/20251111010840.141490-1-srinivas.pandruvada@linux.intel.com Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/intel_pstate.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c index 38897bb14a2c..492a10f1bdbf 100644 --- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -603,9 +603,6 @@ static bool turbo_is_disabled(void) { u64 misc_en; - if (!cpu_feature_enabled(X86_FEATURE_IDA)) - return true; - rdmsrq(MSR_IA32_MISC_ENABLE, misc_en); return !!(misc_en & MSR_IA32_MISC_ENABLE_TURBO_DISABLE); @@ -2106,7 +2103,8 @@ static u64 atom_get_val(struct cpudata *cpudata, int pstate) u32 vid; val = (u64)pstate << 8; - if (READ_ONCE(global.no_turbo) && !READ_ONCE(global.turbo_disabled)) + if (READ_ONCE(global.no_turbo) && !READ_ONCE(global.turbo_disabled) && + cpu_feature_enabled(X86_FEATURE_IDA)) val |= (u64)1 << 32; vid_fp = cpudata->vid.min + mul_fp( @@ -2271,7 +2269,8 @@ static u64 core_get_val(struct cpudata *cpudata, int pstate) u64 val; val = (u64)pstate << 8; - if (READ_ONCE(global.no_turbo) && !READ_ONCE(global.turbo_disabled)) + if (READ_ONCE(global.no_turbo) && !READ_ONCE(global.turbo_disabled) && + cpu_feature_enabled(X86_FEATURE_IDA)) val |= (u64)1 << 32; return val; -- cgit From c87488a12393a23f8a1b9850b989b386c58cac3f Mon Sep 17 00:00:00 2001 From: Emil Tsalapatis Date: Wed, 12 Nov 2025 08:42:02 -1000 Subject: sched/ext: convert scx_tasks_lock to raw spinlock Update scx_task_locks so that it's safe to lock/unlock in a non-sleepable context in PREEMPT_RT kernels. scx_task_locks is (non-raw) spinlock used to protect the list of tasks under SCX. This list is updated during from finish_task_switch(), which cannot sleep. Regular spinlocks can be locked in such a context in non-RT kernels, but are sleepable under when CONFIG_PREEMPT_RT=y. Convert scx_task_locks into a raw spinlock, which is not sleepable even on RT kernels. Sample backtrace: dump_stack_lvl+0x83/0xa0 __might_resched+0x14a/0x200 rt_spin_lock+0x61/0x1c0 ? sched_ext_dead+0x2d/0xf0 ? lock_release+0xc6/0x280 sched_ext_dead+0x2d/0xf0 ? srso_alias_return_thunk+0x5/0xfbef5 finish_task_switch.isra.0+0x254/0x360 __schedule+0x584/0x11d0 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? tick_nohz_idle_exit+0x7e/0x120 schedule_idle+0x23/0x40 cpu_startup_entry+0x29/0x30 start_secondary+0xf8/0x100 common_startup_64+0x13e/0x148 Signed-off-by: Emil Tsalapatis Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 184d562bda5e..03d05ea20006 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -25,7 +25,7 @@ static struct scx_sched __rcu *scx_root; * guarantee system safety. Maintain a dedicated task list which contains every * task between its fork and eventual free. */ -static DEFINE_SPINLOCK(scx_tasks_lock); +static DEFINE_RAW_SPINLOCK(scx_tasks_lock); static LIST_HEAD(scx_tasks); /* ops enable/disable */ @@ -476,7 +476,7 @@ static void scx_task_iter_start(struct scx_task_iter *iter) BUILD_BUG_ON(__SCX_DSQ_ITER_ALL_FLAGS & ((1U << __SCX_DSQ_LNODE_PRIV_SHIFT) - 1)); - spin_lock_irq(&scx_tasks_lock); + raw_spin_lock_irq(&scx_tasks_lock); iter->cursor = (struct sched_ext_entity){ .flags = SCX_TASK_CURSOR }; list_add(&iter->cursor.tasks_node, &scx_tasks); @@ -507,14 +507,14 @@ static void scx_task_iter_unlock(struct scx_task_iter *iter) __scx_task_iter_rq_unlock(iter); if (iter->list_locked) { iter->list_locked = false; - spin_unlock_irq(&scx_tasks_lock); + raw_spin_unlock_irq(&scx_tasks_lock); } } static void __scx_task_iter_maybe_relock(struct scx_task_iter *iter) { if (!iter->list_locked) { - spin_lock_irq(&scx_tasks_lock); + raw_spin_lock_irq(&scx_tasks_lock); iter->list_locked = true; } } @@ -2940,9 +2940,9 @@ void scx_post_fork(struct task_struct *p) } } - spin_lock_irq(&scx_tasks_lock); + raw_spin_lock_irq(&scx_tasks_lock); list_add_tail(&p->scx.tasks_node, &scx_tasks); - spin_unlock_irq(&scx_tasks_lock); + raw_spin_unlock_irq(&scx_tasks_lock); percpu_up_read(&scx_fork_rwsem); } @@ -2966,9 +2966,9 @@ void sched_ext_free(struct task_struct *p) { unsigned long flags; - spin_lock_irqsave(&scx_tasks_lock, flags); + raw_spin_lock_irqsave(&scx_tasks_lock, flags); list_del_init(&p->scx.tasks_node); - spin_unlock_irqrestore(&scx_tasks_lock, flags); + raw_spin_unlock_irqrestore(&scx_tasks_lock, flags); /* * @p is off scx_tasks and wholly ours. scx_enable()'s READY -> ENABLED -- cgit From 9efb297c520f392ab04bc45544a03770c98c3798 Mon Sep 17 00:00:00 2001 From: Gopi Krishna Menon Date: Sat, 25 Oct 2025 01:50:40 +0530 Subject: hwmon: (gpd-fan) Fix compilation error in non-ACPI builds MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Building gpd-fan driver without CONFIG_ACPI results in the following build errors: drivers/hwmon/gpd-fan.c: In function ‘gpd_ecram_read’: drivers/hwmon/gpd-fan.c:228:9: error: implicit declaration of function ‘outb’ [-Werror=implicit-function-declaration] 228 | outb(0x2E, addr_port); | ^~~~ drivers/hwmon/gpd-fan.c:241:16: error: implicit declaration of function ‘inb’ [-Werror=implicit-function-declaration] 241 | *val = inb(data_port); The definitions for inb() and outb() come from (specifically through ), which is implicitly included via . When CONFIG_ACPI is not set, is not included resulting in to be omitted as well. Since the driver does not depend on ACPI, remove and add directly to fix the compilation errors. Signed-off-by: Gopi Krishna Menon Link: https://lore.kernel.org/r/20251024202042.752160-1-krishnagopi487@gmail.com Signed-off-by: Guenter Roeck --- drivers/hwmon/gpd-fan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hwmon/gpd-fan.c b/drivers/hwmon/gpd-fan.c index 321794807e8d..48c84e3e9939 100644 --- a/drivers/hwmon/gpd-fan.c +++ b/drivers/hwmon/gpd-fan.c @@ -12,9 +12,9 @@ * Copyright (c) 2024 Cryolitia PukNgae */ -#include #include #include +#include #include #include #include -- cgit From c55a8e24cd129b6d8fed20e3d63c10c2263e2fc8 Mon Sep 17 00:00:00 2001 From: Cryolitia PukNgae Date: Thu, 30 Oct 2025 22:30:06 +0800 Subject: hwmon: (gpd-fan) initialize EC on driver load for Win 4 The original implement will re-init the EC when it reports a zero value, and it's a workaround for the black box buggy firmware. Now a contributer test and report that, the bug is that, the firmware won't initialize the EC on boot, so the EC ramains in unusable status. And it won't need to re-init it during runtime. The original implement is not perfect, any write command will be ignored until we first read it. Just re-init it unconditionally when the driver load could work. Fixes: 0ab88e239439 ("hwmon: add GPD devices sensor driver") Co-developed-by: kylon <3252255+kylon@users.noreply.github.com> Signed-off-by: kylon <3252255+kylon@users.noreply.github.com> Link: https://github.com/Cryolitia/gpd-fan-driver/pull/20 Signed-off-by: Cryolitia PukNgae Link: https://lore.kernel.org/r/20251030-win4-v1-1-c374dcb86985@uniontech.com Signed-off-by: Guenter Roeck --- drivers/hwmon/gpd-fan.c | 52 ++++++++++++++++++++++++------------------------- 1 file changed, 25 insertions(+), 27 deletions(-) diff --git a/drivers/hwmon/gpd-fan.c b/drivers/hwmon/gpd-fan.c index 48c84e3e9939..f81c3bc422f4 100644 --- a/drivers/hwmon/gpd-fan.c +++ b/drivers/hwmon/gpd-fan.c @@ -276,31 +276,6 @@ static int gpd_generic_read_rpm(void) return (u16)high << 8 | low; } -static void gpd_win4_init_ec(void) -{ - u8 chip_id, chip_ver; - - gpd_ecram_read(0x2000, &chip_id); - - if (chip_id == 0x55) { - gpd_ecram_read(0x1060, &chip_ver); - gpd_ecram_write(0x1060, chip_ver | 0x80); - } -} - -static int gpd_win4_read_rpm(void) -{ - int ret; - - ret = gpd_generic_read_rpm(); - - if (ret == 0) - // Re-init EC when speed is 0 - gpd_win4_init_ec(); - - return ret; -} - static int gpd_wm2_read_rpm(void) { for (u16 pwm_ctr_offset = GPD_PWM_CTR_OFFSET; @@ -320,11 +295,10 @@ static int gpd_wm2_read_rpm(void) static int gpd_read_rpm(void) { switch (gpd_driver_priv.drvdata->board) { + case win4_6800u: case win_mini: case duo: return gpd_generic_read_rpm(); - case win4_6800u: - return gpd_win4_read_rpm(); case win_max_2: return gpd_wm2_read_rpm(); } @@ -607,6 +581,28 @@ static struct hwmon_chip_info gpd_fan_chip_info = { .info = gpd_fan_hwmon_channel_info }; +static void gpd_win4_init_ec(void) +{ + u8 chip_id, chip_ver; + + gpd_ecram_read(0x2000, &chip_id); + + if (chip_id == 0x55) { + gpd_ecram_read(0x1060, &chip_ver); + gpd_ecram_write(0x1060, chip_ver | 0x80); + } +} + +static void gpd_init_ec(void) +{ + // The buggy firmware won't initialize EC properly on boot. + // Before its initialization, reading RPM will always return 0, + // and writing PWM will have no effect. + // Initialize it manually on driver load. + if (gpd_driver_priv.drvdata->board == win4_6800u) + gpd_win4_init_ec(); +} + static int gpd_fan_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; @@ -634,6 +630,8 @@ static int gpd_fan_probe(struct platform_device *pdev) return dev_err_probe(dev, PTR_ERR(hwdev), "Failed to register hwmon device\n"); + gpd_init_ec(); + return 0; } -- cgit From 214291cbaaceeb28debd773336642b1fca393ae0 Mon Sep 17 00:00:00 2001 From: Dave Jiang Date: Wed, 5 Nov 2025 16:51:15 -0700 Subject: acpi/hmat: Fix lockdep warning for hmem_register_resource() The following lockdep splat was observed while kernel auto-online a CXL memory region: ====================================================== WARNING: possible circular locking dependency detected 6.17.0djtest+ #53 Tainted: G W ------------------------------------------------------ systemd-udevd/3334 is trying to acquire lock: ffffffff90346188 (hmem_resource_lock){+.+.}-{4:4}, at: hmem_register_resource+0x31/0x50 but task is already holding lock: ffffffff90338890 ((node_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x2e/0x70 which lock already depends on the new lock. [..] Chain exists of: hmem_resource_lock --> mem_hotplug_lock --> (node_chain).rwsem Possible unsafe locking scenario: CPU0 CPU1 ---- ---- rlock((node_chain).rwsem); lock(mem_hotplug_lock); lock((node_chain).rwsem); lock(hmem_resource_lock); The lock ordering can cause potential deadlock. There are instances where hmem_resource_lock is taken after (node_chain).rwsem, and vice versa. Split out the target update section of hmat_register_target() so that hmat_callback() only envokes that section instead of attempt to register hmem devices that it does not need to. [ dj: Fix up comment to be closer to 80cols. (Jonathan) ] Fixes: cf8741ac57ed ("ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device") Reviewed-by: Jonathan Cameron Tested-by: Smita Koralahalli Reviewed-by: Smita Koralahalli Reviewed-by: Dan Williams Link: https://patch.msgid.link/20251105235115.85062-3-dave.jiang@intel.com Signed-off-by: Dave Jiang --- drivers/acpi/numa/hmat.c | 46 +++++++++++++++++++++++++--------------------- 1 file changed, 25 insertions(+), 21 deletions(-) diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c index 5a36d57289b4..11e4483685c9 100644 --- a/drivers/acpi/numa/hmat.c +++ b/drivers/acpi/numa/hmat.c @@ -874,10 +874,32 @@ static void hmat_register_target_devices(struct memory_target *target) } } -static void hmat_register_target(struct memory_target *target) +static void hmat_hotplug_target(struct memory_target *target) { int nid = pxm_to_node(target->memory_pxm); + /* + * Skip offline nodes. This can happen when memory marked EFI_MEMORY_SP, + * "specific purpose", is applied to all the memory in a proximity + * domain leading to * the node being marked offline / unplugged, or if + * memory-only "hotplug" node is offline. + */ + if (nid == NUMA_NO_NODE || !node_online(nid)) + return; + + guard(mutex)(&target_lock); + if (target->registered) + return; + + hmat_register_target_initiators(target); + hmat_register_target_cache(target); + hmat_register_target_perf(target, ACCESS_COORDINATE_LOCAL); + hmat_register_target_perf(target, ACCESS_COORDINATE_CPU); + target->registered = true; +} + +static void hmat_register_target(struct memory_target *target) +{ /* * Devices may belong to either an offline or online * node, so unconditionally add them. @@ -895,25 +917,7 @@ static void hmat_register_target(struct memory_target *target) } mutex_unlock(&target_lock); - /* - * Skip offline nodes. This can happen when memory - * marked EFI_MEMORY_SP, "specific purpose", is applied - * to all the memory in a proximity domain leading to - * the node being marked offline / unplugged, or if - * memory-only "hotplug" node is offline. - */ - if (nid == NUMA_NO_NODE || !node_online(nid)) - return; - - mutex_lock(&target_lock); - if (!target->registered) { - hmat_register_target_initiators(target); - hmat_register_target_cache(target); - hmat_register_target_perf(target, ACCESS_COORDINATE_LOCAL); - hmat_register_target_perf(target, ACCESS_COORDINATE_CPU); - target->registered = true; - } - mutex_unlock(&target_lock); + hmat_hotplug_target(target); } static void hmat_register_targets(void) @@ -939,7 +943,7 @@ static int hmat_callback(struct notifier_block *self, if (!target) return NOTIFY_OK; - hmat_register_target(target); + hmat_hotplug_target(target); return NOTIFY_OK; } -- cgit From 360b3730f8eab6c4467c6cca4cb0e30902174a63 Mon Sep 17 00:00:00 2001 From: Haotian Zhang Date: Wed, 12 Nov 2025 14:57:09 +0800 Subject: ASoC: rsnd: fix OF node reference leak in rsnd_ssiu_probe() rsnd_ssiu_probe() leaks an OF node reference obtained by rsnd_ssiu_of_node(). The node reference is acquired but never released across all return paths. Fix it by declaring the device node with the __free(device_node) cleanup construct to ensure automatic release when the variable goes out of scope. Fixes: 4e7788fb8018 ("ASoC: rsnd: add SSIU BUSIF support") Signed-off-by: Haotian Zhang Acked-by: Kuninori Morimoto Link: https://patch.msgid.link/20251112065709.1522-1-vulab@iscas.ac.cn Signed-off-by: Mark Brown --- sound/soc/renesas/rcar/ssiu.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/sound/soc/renesas/rcar/ssiu.c b/sound/soc/renesas/rcar/ssiu.c index faf351126d57..244fb833292a 100644 --- a/sound/soc/renesas/rcar/ssiu.c +++ b/sound/soc/renesas/rcar/ssiu.c @@ -509,7 +509,7 @@ void rsnd_parse_connect_ssiu(struct rsnd_dai *rdai, int rsnd_ssiu_probe(struct rsnd_priv *priv) { struct device *dev = rsnd_priv_to_dev(priv); - struct device_node *node; + struct device_node *node __free(device_node) = rsnd_ssiu_of_node(priv); struct rsnd_ssiu *ssiu; struct rsnd_mod_ops *ops; const int *list = NULL; @@ -522,7 +522,6 @@ int rsnd_ssiu_probe(struct rsnd_priv *priv) * see * rsnd_ssiu_bufsif_to_id() */ - node = rsnd_ssiu_of_node(priv); if (node) nr = rsnd_node_count(priv, node, SSIU_NAME); else -- cgit From 4495bffd86ba0fdabfaef0c41d12f68ec2a1e05b Mon Sep 17 00:00:00 2001 From: Bjorn Helgaas Date: Mon, 10 Nov 2025 16:22:25 -0600 Subject: PCI/ASPM: Cache L0s/L1 Supported so advertised link states can be overridden Defective devices sometimes advertise support for ASPM L0s or L1 states even if they don't work correctly. Cache the L0s Supported and L1 Supported bits early in enumeration so HEADER quirks can override the ASPM states advertised in Link Capabilities before pcie_aspm_cap_init() enables ASPM. Signed-off-by: Bjorn Helgaas Tested-by: Shawn Lin Reviewed-by: Lukas Wunner Link: https://patch.msgid.link/20251110222929.2140564-2-helgaas@kernel.org --- drivers/pci/pcie/aspm.c | 12 ++++-------- drivers/pci/probe.c | 7 +++++++ include/linux/pci.h | 2 ++ 3 files changed, 13 insertions(+), 8 deletions(-) diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 7cc8281e7011..15d50c089070 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -830,7 +830,6 @@ static void pcie_aspm_override_default_link_state(struct pcie_link_state *link) static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist) { struct pci_dev *child = link->downstream, *parent = link->pdev; - u32 parent_lnkcap, child_lnkcap; u16 parent_lnkctl, child_lnkctl; struct pci_bus *linkbus = parent->subordinate; @@ -845,9 +844,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist) * If ASPM not supported, don't mess with the clocks and link, * bail out now. */ - pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, &parent_lnkcap); - pcie_capability_read_dword(child, PCI_EXP_LNKCAP, &child_lnkcap); - if (!(parent_lnkcap & child_lnkcap & PCI_EXP_LNKCAP_ASPMS)) + if (!(parent->aspm_l0s_support && child->aspm_l0s_support) && + !(parent->aspm_l1_support && child->aspm_l1_support)) return; /* Configure common clock before checking latencies */ @@ -859,8 +857,6 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist) * read-only Link Capabilities may change depending on common clock * configuration (PCIe r5.0, sec 7.5.3.6). */ - pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, &parent_lnkcap); - pcie_capability_read_dword(child, PCI_EXP_LNKCAP, &child_lnkcap); pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &parent_lnkctl); pcie_capability_read_word(child, PCI_EXP_LNKCTL, &child_lnkctl); @@ -880,7 +876,7 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist) * given link unless components on both sides of the link each * support L0s. */ - if (parent_lnkcap & child_lnkcap & PCI_EXP_LNKCAP_ASPM_L0S) + if (parent->aspm_l0s_support && child->aspm_l0s_support) link->aspm_support |= PCIE_LINK_STATE_L0S; if (child_lnkctl & PCI_EXP_LNKCTL_ASPM_L0S) @@ -889,7 +885,7 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist) link->aspm_enabled |= PCIE_LINK_STATE_L0S_DW; /* Setup L1 state */ - if (parent_lnkcap & child_lnkcap & PCI_EXP_LNKCAP_ASPM_L1) + if (parent->aspm_l1_support && child->aspm_l1_support) link->aspm_support |= PCIE_LINK_STATE_L1; if (parent_lnkctl & child_lnkctl & PCI_EXP_LNKCTL_ASPM_L1) diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index c83e75a0ec12..de72ceaea285 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1663,6 +1663,13 @@ void set_pcie_port_type(struct pci_dev *pdev) if (reg32 & PCI_EXP_LNKCAP_DLLLARC) pdev->link_active_reporting = 1; +#ifdef CONFIG_PCIEASPM + if (reg32 & PCI_EXP_LNKCAP_ASPM_L0S) + pdev->aspm_l0s_support = 1; + if (reg32 & PCI_EXP_LNKCAP_ASPM_L1) + pdev->aspm_l1_support = 1; +#endif + parent = pci_upstream_bridge(pdev); if (!parent) return; diff --git a/include/linux/pci.h b/include/linux/pci.h index d1fdf81fbe1e..bf97d49c23cf 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -412,6 +412,8 @@ struct pci_dev { u16 l1ss; /* L1SS Capability pointer */ #ifdef CONFIG_PCIEASPM struct pcie_link_state *link_state; /* ASPM link state */ + unsigned int aspm_l0s_support:1; /* ASPM L0s support */ + unsigned int aspm_l1_support:1; /* ASPM L1 support */ unsigned int ltr_path:1; /* Latency Tolerance Reporting supported from root to here */ #endif -- cgit From 575b98e39d817537afdc6c39d35c1de484a64d42 Mon Sep 17 00:00:00 2001 From: Bjorn Helgaas Date: Mon, 10 Nov 2025 16:22:26 -0600 Subject: PCI/ASPM: Add pcie_aspm_remove_cap() to override advertised link states Add pcie_aspm_remove_cap(). A quirk can use this to prevent use of ASPM L0s or L1 link states, even if the device advertised support for them. Signed-off-by: Bjorn Helgaas Tested-by: Shawn Lin Reviewed-by: Lukas Wunner Link: https://patch.msgid.link/20251110222929.2140564-3-helgaas@kernel.org --- drivers/pci/pci.h | 2 ++ drivers/pci/pcie/aspm.c | 13 +++++++++++++ 2 files changed, 15 insertions(+) diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index 4492b809094b..36f8c0985430 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -958,6 +958,7 @@ void pci_save_aspm_l1ss_state(struct pci_dev *dev); void pci_restore_aspm_l1ss_state(struct pci_dev *dev); #ifdef CONFIG_PCIEASPM +void pcie_aspm_remove_cap(struct pci_dev *pdev, u32 lnkcap); void pcie_aspm_init_link_state(struct pci_dev *pdev); void pcie_aspm_exit_link_state(struct pci_dev *pdev); void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked); @@ -965,6 +966,7 @@ void pcie_aspm_powersave_config_link(struct pci_dev *pdev); void pci_configure_ltr(struct pci_dev *pdev); void pci_bridge_reconfigure_ltr(struct pci_dev *pdev); #else +static inline void pcie_aspm_remove_cap(struct pci_dev *pdev, u32 lnkcap) { } static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { } static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { } static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked) { } diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 15d50c089070..f61d98897503 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -1542,6 +1542,19 @@ int pci_enable_link_state_locked(struct pci_dev *pdev, int state) } EXPORT_SYMBOL(pci_enable_link_state_locked); +void pcie_aspm_remove_cap(struct pci_dev *pdev, u32 lnkcap) +{ + if (lnkcap & PCI_EXP_LNKCAP_ASPM_L0S) + pdev->aspm_l0s_support = 0; + if (lnkcap & PCI_EXP_LNKCAP_ASPM_L1) + pdev->aspm_l1_support = 0; + + pci_info(pdev, "ASPM: Link Capabilities%s%s treated as unsupported to avoid device defect\n", + lnkcap & PCI_EXP_LNKCAP_ASPM_L0S ? " L0s" : "", + lnkcap & PCI_EXP_LNKCAP_ASPM_L1 ? " L1" : ""); + +} + static int pcie_aspm_set_policy(const char *val, const struct kernel_param *kp) { -- cgit From 30579eebba6ae52dc7441479aec9dd8d782256d3 Mon Sep 17 00:00:00 2001 From: Bjorn Helgaas Date: Mon, 10 Nov 2025 16:22:27 -0600 Subject: PCI/ASPM: Convert quirks to override advertised link states Existing quirks to disable ASPM L0s and L1 use pci_disable_link_state(), which disables ASPM states and prevents their use in the future. But since they are FINAL quirks, they happen after ASPM has already been enabled. Here's a typical call path: pci_host_probe pci_scan_root_bus_bridge pci_scan_child_bus pci_scan_slot pci_scan_single_device pci_device_add pci_fixup_device(pci_fixup_header) # HEADER quirks pcie_aspm_init_link_state pcie_config_aspm_path pcie_config_aspm_link pcie_config_aspm_dev # ASPM may be enabled pci_bus_add_devices pci_bus_add_devices pci_fixup_device(pci_fixup_final) # FINAL quirks quirk_disable_aspm_l0s pci_disable_link_state(dev, PCIE_LINK_STATE_L0S) Sometimes enabling ASPM can make the link non-functional, so if we know ASPM is broken on a device, we shouldn't enable it at all, even temporarily. Convert the existing quirks to use pcie_aspm_remove_cap() instead, which overrides the ASPM Support advertised in PCIe Link Capabilities, and make them HEADER quirks so they run before pcie_aspm_init_link_state() has a chance to enable ASPM. Signed-off-by: Bjorn Helgaas Tested-by: Shawn Lin Reviewed-by: Lukas Wunner Link: https://patch.msgid.link/20251110222929.2140564-4-helgaas@kernel.org --- drivers/pci/quirks.c | 39 +++++++++++++++++++-------------------- 1 file changed, 19 insertions(+), 20 deletions(-) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 214ed060ca1b..922c77c627a1 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -2494,28 +2494,27 @@ DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, */ static void quirk_disable_aspm_l0s(struct pci_dev *dev) { - pci_info(dev, "Disabling L0s\n"); - pci_disable_link_state(dev, PCIE_LINK_STATE_L0S); -} -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10a7, quirk_disable_aspm_l0s); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10a9, quirk_disable_aspm_l0s); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10b6, quirk_disable_aspm_l0s); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10c6, quirk_disable_aspm_l0s); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10c7, quirk_disable_aspm_l0s); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10c8, quirk_disable_aspm_l0s); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10d6, quirk_disable_aspm_l0s); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10db, quirk_disable_aspm_l0s); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10dd, quirk_disable_aspm_l0s); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10e1, quirk_disable_aspm_l0s); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10ec, quirk_disable_aspm_l0s); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f1, quirk_disable_aspm_l0s); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f4, quirk_disable_aspm_l0s); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1508, quirk_disable_aspm_l0s); + pcie_aspm_remove_cap(dev, PCI_EXP_LNKCAP_ASPM_L0S); +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10a7, quirk_disable_aspm_l0s); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10a9, quirk_disable_aspm_l0s); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10b6, quirk_disable_aspm_l0s); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10c6, quirk_disable_aspm_l0s); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10c7, quirk_disable_aspm_l0s); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10c8, quirk_disable_aspm_l0s); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10d6, quirk_disable_aspm_l0s); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10db, quirk_disable_aspm_l0s); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10dd, quirk_disable_aspm_l0s); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10e1, quirk_disable_aspm_l0s); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10ec, quirk_disable_aspm_l0s); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10f1, quirk_disable_aspm_l0s); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10f4, quirk_disable_aspm_l0s); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1508, quirk_disable_aspm_l0s); static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev) { - pci_info(dev, "Disabling ASPM L0s/L1\n"); - pci_disable_link_state(dev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1); + pcie_aspm_remove_cap(dev, + PCI_EXP_LNKCAP_ASPM_L0S | PCI_EXP_LNKCAP_ASPM_L1); } /* @@ -2523,7 +2522,7 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev) * upstream PCIe root port when ASPM is enabled. At least L0s mode is affected; * disable both L0s and L1 for now to be safe. */ -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1); /* * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain -- cgit From 5b40a5080c39933e7ce28bcb63cd4c1818d6c873 Mon Sep 17 00:00:00 2001 From: Bjorn Helgaas Date: Mon, 10 Nov 2025 16:22:28 -0600 Subject: PCI/ASPM: Avoid L0s and L1 on Freescale [1957:0451] Root Ports Christian reported that f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms") broke booting on the A-EON X5000. Override the L0s and L1 Support advertised in Link Capabilities by the X5000 Root Ports ([1957:0451]) so we don't try to enable those states. Fixes: f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms") Fixes: df5192d9bb0e ("PCI/ASPM: Enable only L0s and L1 for devicetree platforms") Reported-by: Christian Zigotzky Link: https://lore.kernel.org/r/db5c95a1-cf3e-46f9-8045-a1b04908051a@xenosoft.de Signed-off-by: Bjorn Helgaas Tested-by: Shawn Lin Reviewed-by: Lukas Wunner Link: https://patch.msgid.link/20251110222929.2140564-5-helgaas@kernel.org --- drivers/pci/quirks.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 922c77c627a1..b94264cd3833 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -2523,6 +2523,7 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev) * disable both L0s and L1 for now to be safe. */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_FREESCALE, 0x0451, quirk_disable_aspm_l0s_l1); /* * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain -- cgit From 823576c894d73255d35c0d0dabbb6ffecf1f2667 Mon Sep 17 00:00:00 2001 From: Bjorn Helgaas Date: Wed, 12 Nov 2025 18:36:24 -0600 Subject: PCI/ASPM: Avoid L0s and L1 on PA Semi [1959:a002] Root Ports Christian reported that f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms") broke booting on the A-EON AmigaOne X1000. Override the L0s and L1 Support advertised in Link Capabilities by the X1000 Root Ports ([1959:a002]) so we don't try to enable those states. Fixes: f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms") Fixes: df5192d9bb0e ("PCI/ASPM: Enable only L0s and L1 for devicetree platforms") Reported-by: Christian Zigotzky Link: https://lore.kernel.org/r/a41d2ca1-fcd9-c416-b111-a958e92e94bf@xenosoft.de Signed-off-by: Bjorn Helgaas --- drivers/pci/quirks.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index b94264cd3833..90f6abdb77f4 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -2524,6 +2524,7 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev) */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_FREESCALE, 0x0451, quirk_disable_aspm_l0s_l1); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PASEMI, 0xa002, quirk_disable_aspm_l0s_l1); /* * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain -- cgit From 921b3f59b7b00cd7067ab775b0e0ca4eca436c2f Mon Sep 17 00:00:00 2001 From: Shawn Lin Date: Wed, 12 Nov 2025 18:53:18 -0600 Subject: PCI/ASPM: Avoid L0s and L1 on Hi1105 [19e5:1105] Wi-Fi This Wi-Fi advertises the L0s and L1 capabilities but actually it doesn't support them. This is confirmed by HiSilicon team in actual productization. Signed-off-by: Shawn Lin Signed-off-by: Bjorn Helgaas Link: https://patch.msgid.link/1762916319-139532-1-git-send-email-shawn.lin@rock-chips.com --- drivers/pci/quirks.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 90f6abdb77f4..b9c252aa6fe0 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -2525,6 +2525,7 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev) DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_FREESCALE, 0x0451, quirk_disable_aspm_l0s_l1); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PASEMI, 0xa002, quirk_disable_aspm_l0s_l1); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HUAWEI, 0x1105, quirk_disable_aspm_l0s_l1); /* * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain -- cgit From 0a4a18e888ae8c8004582f665c5792c84a681668 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Tue, 11 Nov 2025 17:09:20 -0800 Subject: drm/client: fix MODULE_PARM_DESC string for "active" The MODULE_PARM_DESC string for the "active" parameter is missing a space and has an extraneous trailing ']' character. Correct these. Before patch: $ modinfo -p ./drm_client_lib.ko active:Choose which drm client to start, default isfbdev] (string) After patch: $ modinfo -p ./drm_client_lib.ko active:Choose which drm client to start, default is fbdev (string) Fixes: f7b42442c4ac ("drm/log: Introduce a new boot logger to draw the kmsg on the screen") Signed-off-by: Randy Dunlap Reviewed-by: Thomas Zimmermann Reviewed-by: Jocelyn Falempe Signed-off-by: Thomas Zimmermann Link: https://patch.msgid.link/20251112010920.2355712-1-rdunlap@infradead.org --- drivers/gpu/drm/clients/drm_client_setup.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/clients/drm_client_setup.c b/drivers/gpu/drm/clients/drm_client_setup.c index 72480db1f00d..515aceac22b1 100644 --- a/drivers/gpu/drm/clients/drm_client_setup.c +++ b/drivers/gpu/drm/clients/drm_client_setup.c @@ -13,8 +13,8 @@ static char drm_client_default[16] = CONFIG_DRM_CLIENT_DEFAULT; module_param_string(active, drm_client_default, sizeof(drm_client_default), 0444); MODULE_PARM_DESC(active, - "Choose which drm client to start, default is" - CONFIG_DRM_CLIENT_DEFAULT "]"); + "Choose which drm client to start, default is " + CONFIG_DRM_CLIENT_DEFAULT); /** * drm_client_setup() - Setup in-kernel DRM clients -- cgit From ebd4469e7af61019daaf904fdcba07a9ecd18440 Mon Sep 17 00:00:00 2001 From: Andrew Donnellan Date: Wed, 5 Nov 2025 14:40:32 +1100 Subject: entry: Fix ifndef around arch_xfer_to_guest_mode_handle_work() stub The stub implementation of arch_xfer_to_guest_mode_handle_work() is guarded by an #ifndef that incorrectly checks for the name arch_xfer_to_guest_mode_work instead. It seems the function was renamed to add "_handle" as a late change to the original patch, and the #ifndef wasn't updated to go with it. Change the #ifndef to match the name of the function. No users right now, so no need to update any architecture code. Fixes: 935ace2fb5cc4 ("entry: Provide infrastructure for work before transitioning to guest mode") Signed-off-by: Andrew Donnellan Signed-off-by: Thomas Gleixner Link: https://patch.msgid.link/20251105-entry-fix-ifndef-v1-1-d8d28045b627@linux.ibm.com --- include/linux/entry-virt.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/entry-virt.h b/include/linux/entry-virt.h index 42c89e3e5ca7..bfa767702d9a 100644 --- a/include/linux/entry-virt.h +++ b/include/linux/entry-virt.h @@ -32,7 +32,7 @@ */ static inline int arch_xfer_to_guest_mode_handle_work(unsigned long ti_work); -#ifndef arch_xfer_to_guest_mode_work +#ifndef arch_xfer_to_guest_mode_handle_work static inline int arch_xfer_to_guest_mode_handle_work(unsigned long ti_work) { return 0; -- cgit From 0a8fb03fe7b0abab0ff16522e2625163183e7ae4 Mon Sep 17 00:00:00 2001 From: Kiryl Shutsemau Date: Thu, 13 Nov 2025 12:10:06 +0000 Subject: MAINTAINERS: Update name spelling Use transliteration from the Belarusian language instead of Russian. Signed-off-by: Kiryl Shutsemau Signed-off-by: Dave Hansen Link: https://patch.msgid.link/20251113121006.651992-1-kas%40kernel.org --- .mailmap | 2 +- MAINTAINERS | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/.mailmap b/.mailmap index 369cfe467932..0f74e16e239c 100644 --- a/.mailmap +++ b/.mailmap @@ -426,7 +426,7 @@ Kenneth W Chen Kenneth Westfield Kiran Gunda Kirill Tkhai -Kirill A. Shutemov +Kiryl Shutsemau Kishon Vijay Abraham I Konrad Dybcio Konrad Dybcio diff --git a/MAINTAINERS b/MAINTAINERS index 46bd8e033042..ffd964b0a9bc 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -27845,7 +27845,7 @@ F: arch/x86/kernel/stacktrace.c F: arch/x86/kernel/unwind_*.c X86 TRUST DOMAIN EXTENSIONS (TDX) -M: Kirill A. Shutemov +M: Kiryl Shutsemau R: Dave Hansen R: Rick Edgecombe L: x86@kernel.org -- cgit From fbade4bd08ba52cbc74a71c4e86e736f059f99f7 Mon Sep 17 00:00:00 2001 From: Jiayuan Chen Date: Tue, 11 Nov 2025 14:02:50 +0800 Subject: mptcp: Disallow MPTCP subflows from sockmap The sockmap feature allows bpf syscall from userspace, or based on bpf sockops, replacing the sk_prot of sockets during protocol stack processing with sockmap's custom read/write interfaces. ''' tcp_rcv_state_process() subflow_syn_recv_sock() tcp_init_transfer(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) bpf_skops_established <== sockops bpf_sock_map_update(sk) <== call bpf helper tcp_bpf_update_proto() <== update sk_prot ''' Consider two scenarios: 1. When the server has MPTCP enabled and the client also requests MPTCP, the sk passed to the BPF program is a subflow sk. Since subflows only handle partial data, replacing their sk_prot is meaningless and will cause traffic disruption. 2. When the server has MPTCP enabled but the client sends a TCP SYN without MPTCP, subflow_syn_recv_sock() performs a fallback on the subflow, replacing the subflow sk's sk_prot with the native sk_prot. ''' subflow_ulp_fallback() subflow_drop_ctx() mptcp_subflow_ops_undo_override() ''' Subsequently, accept::mptcp_stream_accept::mptcp_fallback_tcp_ops() converts the subflow to plain TCP. For the first case, we should prevent it from being combined with sockmap by setting sk_prot->psock_update_sk_prot to NULL, which will be blocked by sockmap's own flow. For the second case, since subflow_syn_recv_sock() has already restored sk_prot to native tcp_prot/tcpv6_prot, no further action is needed. Fixes: cec37a6e41aa ("mptcp: Handle MP_CAPABLE options for outgoing connections") Signed-off-by: Jiayuan Chen Signed-off-by: Martin KaFai Lau Reviewed-by: Matthieu Baerts (NGI0) Cc: Link: https://patch.msgid.link/20251111060307.194196-2-jiayuan.chen@linux.dev --- net/mptcp/subflow.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index e8325890a322..af707ce0f624 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -2144,6 +2144,10 @@ void __init mptcp_subflow_init(void) tcp_prot_override = tcp_prot; tcp_prot_override.release_cb = tcp_release_cb_override; tcp_prot_override.diag_destroy = tcp_abort_override; +#ifdef CONFIG_BPF_SYSCALL + /* Disable sockmap processing for subflows */ + tcp_prot_override.psock_update_sk_prot = NULL; +#endif #if IS_ENABLED(CONFIG_MPTCP_IPV6) /* In struct mptcp_subflow_request_sock, we assume the TCP request sock @@ -2180,6 +2184,10 @@ void __init mptcp_subflow_init(void) tcpv6_prot_override = tcpv6_prot; tcpv6_prot_override.release_cb = tcp_release_cb_override; tcpv6_prot_override.diag_destroy = tcp_abort_override; +#ifdef CONFIG_BPF_SYSCALL + /* Disable sockmap processing for subflows */ + tcpv6_prot_override.psock_update_sk_prot = NULL; +#endif #endif mptcp_diag_subflow_init(&subflow_ulp_ops); -- cgit From a257e974210320ede524f340ffe16bf4bf0dda1e Mon Sep 17 00:00:00 2001 From: Zqiang Date: Thu, 13 Nov 2025 19:43:55 +0800 Subject: sched_ext: Fix possible deadlock in the deferred_irq_workfn() For PREEMPT_RT=y kernels, the deferred_irq_workfn() is executed in the per-cpu irq_work/* task context and not disable-irq, if the rq returned by container_of() is current CPU's rq, the following scenarios may occur: lock(&rq->__lock); lock(&rq->__lock); This commit use IRQ_WORK_INIT_HARD() to replace init_irq_work() to initialize rq->scx.deferred_irq_work, make the deferred_irq_workfn() is always invoked in hard-irq context. Signed-off-by: Zqiang Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 03d05ea20006..07399210ac2d 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -5321,7 +5321,7 @@ void __init init_sched_ext_class(void) BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_kick_if_idle, GFP_KERNEL, n)); BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_preempt, GFP_KERNEL, n)); BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_wait, GFP_KERNEL, n)); - init_irq_work(&rq->scx.deferred_irq_work, deferred_irq_workfn); + rq->scx.deferred_irq_work = IRQ_WORK_INIT_HARD(deferred_irq_workfn); init_irq_work(&rq->scx.kick_cpus_irq_work, kick_cpus_irq_workfn); if (cpu_online(cpu)) -- cgit From cbcff934fa7deb670d9545a3aad4d07e8f1e4f3c Mon Sep 17 00:00:00 2001 From: Harry Yoo Date: Tue, 11 Nov 2025 21:53:31 +0900 Subject: mm/slub: fix memory leak in free_to_pcs_bulk() The commit 989b09b73978 ("slab: skip percpu sheaves for remote object freeing") introduced the remote_objects array in free_to_pcs_bulk() to skip sheaves when objects from a remote node are freed. However, the array is flushed only when: 1) the array becomes full (++remote_nr >= PCS_BATCH_MAX), or 2) slab_free_hook() returns false and size becomes zero. When neither of the conditions is met, objects in the array are leaked. This resulted in a memory leak [1], where 82 GiB of memory was allocated for the maple_node cache. Flush the array after successfully freeing objects to sheaves in the do_free: path. In the meantime, move the snippet if (!size) goto flush_remote; outside the while loop for readability. Let's say all objects in the array are from a remote node: then we acquire s->cpu_sheaves->lock and try to free an object even when size is zero. This doesn't appear to be harmful, but isn't really readable. Reported-by: Tytus Rogalewski Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220765 [1] Closes: https://lore.kernel.org/linux-mm/20251107094809.12e9d705b7bf4815783eb184@linux-foundation.org Closes: https://lore.kernel.org/all/aRGDTwbt2EIz2CYn@hyeyoo Fixes: 989b09b73978 ("slab: skip percpu sheaves for remote object freeing") Signed-off-by: Harry Yoo Link: https://patch.msgid.link/20251111125331.12246-1-harry.yoo@oracle.com Acked-by: Liam R. Howlett Tested-by: Darrick J. Wong Tested-by: Tytus Rogalewski Signed-off-by: Vlastimil Babka --- mm/slub.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index f1a5373eee7b..a787687a0d59 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -6332,8 +6332,6 @@ next_remote_batch: if (unlikely(!slab_free_hook(s, p[i], init, false))) { p[i] = p[--size]; - if (!size) - goto flush_remote; continue; } @@ -6348,6 +6346,9 @@ next_remote_batch: i++; } + if (!size) + goto flush_remote; + next_batch: if (!local_trylock(&s->cpu_sheaves->lock)) goto fallback; @@ -6402,6 +6403,9 @@ do_free: goto next_batch; } + if (remote_nr) + goto flush_remote; + return; no_empty: -- cgit From 85c894a80ac46aa177df04e0a33bcad409b7d64f Mon Sep 17 00:00:00 2001 From: Thomas Falcon Date: Fri, 7 Nov 2025 11:31:50 -0600 Subject: perf header: Write bpf_prog (infos|btfs)_cnt to data file With commit f0d0f978f3f5830a ("perf header: Don't write empty BPF/BTF info"), the write_bpf_( prog_info() | btf() ) functions exit without writing anything if env->bpf_prog.(infos| btfs)_cnt is zero. process_bpf_( prog_info() | btf() ), however, still expect a "count" value to exist in the data file. If btf information is empty, for example, process_bpf_btf will read garbage or some other data as the number of btf nodes in the data file. As a result, the data file will not be processed correctly. Instead, write the count to the data file and exit if it is zero. Fixes: f0d0f978f3f5830a ("perf header: Don't write empty BPF/BTF info") Reviewed-by: Ian Rogers Signed-off-by: Thomas Falcon Acked-by: Namhyung Kim Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ingo Molnar Cc: Jiri Olsa Cc: Mark Rutland Cc: Peter Zijlstra Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/header.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c index 4f2a6e10ed5c..4e12be579140 100644 --- a/tools/perf/util/header.c +++ b/tools/perf/util/header.c @@ -1022,12 +1022,9 @@ static int write_bpf_prog_info(struct feat_fd *ff, down_read(&env->bpf_progs.lock); - if (env->bpf_progs.infos_cnt == 0) - goto out; - ret = do_write(ff, &env->bpf_progs.infos_cnt, sizeof(env->bpf_progs.infos_cnt)); - if (ret < 0) + if (ret < 0 || env->bpf_progs.infos_cnt == 0) goto out; root = &env->bpf_progs.infos; @@ -1067,13 +1064,10 @@ static int write_bpf_btf(struct feat_fd *ff, down_read(&env->bpf_progs.lock); - if (env->bpf_progs.btfs_cnt == 0) - goto out; - ret = do_write(ff, &env->bpf_progs.btfs_cnt, sizeof(env->bpf_progs.btfs_cnt)); - if (ret < 0) + if (ret < 0 || env->bpf_progs.btfs_cnt == 0) goto out; root = &env->bpf_progs.btfs; -- cgit From a09e5967ad6819379fd31894634d7aed29c18409 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Wed, 12 Nov 2025 21:57:08 -0300 Subject: perf build: Don't fail fast path feature detection when binutils-devel is not available This is one more remnant of the BUILD_NONDISTRO series to make building with binutils-devel opt-in due to license incompatibility. In this case just the references at link time were still in place, which make building the test-all.bin file fail, which wasn't detected before probably because the last test was done with binutils-devel available, doh. Now: $ rpm -q binutils-devel package binutils-devel is not installed $ file /tmp/build/perf-tools/feature/test-all.bin /tmp/build/perf-tools/feature/test-all.bin: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=4b5388a346b51f1b993f0b0dbd49f4570769b03c, for GNU/Linux 3.2.0, not stripped $ Fixes: 970ae86307718c34 ("perf build: The bfd features are opt-in, stop testing for them by default") Reviewed-by: Ian Rogers Cc: Adrian Hunter Cc: James Clark Cc: Jiri Olsa Cc: Namhyung Kim Signed-off-by: Arnaldo Carvalho de Melo --- tools/build/feature/Makefile | 4 ++-- tools/perf/Makefile.config | 5 ++--- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile index 49b0add392b1..95646290cb89 100644 --- a/tools/build/feature/Makefile +++ b/tools/build/feature/Makefile @@ -107,7 +107,7 @@ all: $(FILES) __BUILD = $(CC) $(CFLAGS) -MD -Wall -Werror -o $@ $(patsubst %.bin,%.c,$(@F)) $(LDFLAGS) BUILD = $(__BUILD) > $(@:.bin=.make.output) 2>&1 BUILD_BFD = $(BUILD) -DPACKAGE='"perf"' -lbfd -ldl - BUILD_ALL = $(BUILD) -fstack-protector-all -O2 -D_FORTIFY_SOURCE=2 -ldw -lelf -lnuma -lelf -lslang $(FLAGS_PERL_EMBED) $(FLAGS_PYTHON_EMBED) -DPACKAGE='"perf"' -lbfd -ldl -lz -llzma -lzstd + BUILD_ALL = $(BUILD) -fstack-protector-all -O2 -D_FORTIFY_SOURCE=2 -ldw -lelf -lnuma -lelf -lslang $(FLAGS_PERL_EMBED) $(FLAGS_PYTHON_EMBED) -ldl -lz -llzma -lzstd __BUILDXX = $(CXX) $(CXXFLAGS) -MD -Wall -Werror -o $@ $(patsubst %.bin,%.cpp,$(@F)) $(LDFLAGS) BUILDXX = $(__BUILDXX) > $(@:.bin=.make.output) 2>&1 @@ -115,7 +115,7 @@ __BUILDXX = $(CXX) $(CXXFLAGS) -MD -Wall -Werror -o $@ $(patsubst %.bin,%.cpp,$( ############################### $(OUTPUT)test-all.bin: - $(BUILD_ALL) || $(BUILD_ALL) -lopcodes -liberty + $(BUILD_ALL) $(OUTPUT)test-hello.bin: $(BUILD) diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index 5700516aa84a..2dd5f5a60568 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -354,9 +354,6 @@ FEATURE_CHECK_LDFLAGS-libpython := $(PYTHON_EMBED_LDOPTS) FEATURE_CHECK_LDFLAGS-libaio = -lrt -FEATURE_CHECK_LDFLAGS-disassembler-four-args = -lbfd -lopcodes -ldl -FEATURE_CHECK_LDFLAGS-disassembler-init-styled = -lbfd -lopcodes -ldl - CORE_CFLAGS += -fno-omit-frame-pointer CORE_CFLAGS += -Wall CORE_CFLAGS += -Wextra @@ -930,6 +927,8 @@ ifdef BUILD_NONDISTRO ifeq ($(feature-libbfd), 1) EXTLIBS += -lbfd -lopcodes + FEATURE_CHECK_LDFLAGS-disassembler-four-args = -lbfd -lopcodes -ldl + FEATURE_CHECK_LDFLAGS-disassembler-init-styled = -lbfd -lopcodes -ldl else # we are on a system that requires -liberty and (maybe) -lz # to link against -lbfd; test each case individually here -- cgit From 84003ab3d0ca3717e4b36071c3c5f8b3c70e317c Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Thu, 13 Nov 2025 15:03:07 -0300 Subject: tools headers UAPI: Sync KVM's vmx.h with the kernel to pick SEAMCALL exit reason To pick the changes in: 9d7dfb95da2cb5c1 ("KVM: VMX: Inject #UD if guest tries to execute SEAMCALL or TDCALL") The 'perf kvm-stat' tool uses the exit reasons that are included in the VMX_EXIT_REASONS define, this new SEAMCALL isn't included there (TDCALL is), so shouldn't be causing any change in behaviour, this patch ends up being just addressess the following perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/uapi/asm/vmx.h arch/x86/include/uapi/asm/vmx.h Please see tools/include/uapi/README for further details. Cc: Sean Christopherson Signed-off-by: Arnaldo Carvalho de Melo --- tools/arch/x86/include/uapi/asm/vmx.h | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/arch/x86/include/uapi/asm/vmx.h b/tools/arch/x86/include/uapi/asm/vmx.h index 9792e329343e..1baa86dfe029 100644 --- a/tools/arch/x86/include/uapi/asm/vmx.h +++ b/tools/arch/x86/include/uapi/asm/vmx.h @@ -93,6 +93,7 @@ #define EXIT_REASON_TPAUSE 68 #define EXIT_REASON_BUS_LOCK 74 #define EXIT_REASON_NOTIFY 75 +#define EXIT_REASON_SEAMCALL 76 #define EXIT_REASON_TDCALL 77 #define EXIT_REASON_MSR_READ_IMM 84 #define EXIT_REASON_MSR_WRITE_IMM 85 -- cgit From d0206db94b36c998c11458cfdae2f45ba20bc4fb Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Thu, 13 Nov 2025 16:01:23 +0000 Subject: perf lock: Fix segfault due to missing kernel map Kernel maps are encoded in PERF_RECORD_MMAP2 samples but "perf lock report" and "perf lock contention" do not process MMAP2 samples. Because of that, machine->vmlinux_map stays NULL and any later access triggers a segmentation fault. Fix it by adding ->mmap2() callbacks. Fixes: 53b00ff358dc75b1 ("perf record: Make --buildid-mmap the default") Reported-by: Tycho Andersen (AMD) Reviewed-by: Ian Rogers Signed-off-by: Ravi Bangoria Tested-by: Tycho Andersen (AMD) Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ananth Narayan Cc: Ingo Molnar Cc: James Clark Cc: Jiri Olsa Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Sandipan Das Cc: Santosh Shukla Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-lock.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c index 078634461df2..e8962c985d34 100644 --- a/tools/perf/builtin-lock.c +++ b/tools/perf/builtin-lock.c @@ -1867,6 +1867,7 @@ static int __cmd_report(bool display_info) eops.sample = process_sample_event; eops.comm = perf_event__process_comm; eops.mmap = perf_event__process_mmap; + eops.mmap2 = perf_event__process_mmap2; eops.namespaces = perf_event__process_namespaces; eops.tracing_data = perf_event__process_tracing_data; session = perf_session__new(&data, &eops); @@ -2023,6 +2024,7 @@ static int __cmd_contention(int argc, const char **argv) eops.sample = process_sample_event; eops.comm = perf_event__process_comm; eops.mmap = perf_event__process_mmap; + eops.mmap2 = perf_event__process_mmap2; eops.tracing_data = perf_event__process_tracing_data; perf_env__init(&host_env); -- cgit From 3c723f449723db2dc2b75b7efe03c2a76e4c09f0 Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Thu, 13 Nov 2025 16:01:24 +0000 Subject: perf test: Fix lock contention test Couple of independent fixes: 1. Wire in SIGSEGV handler that terminates the test with a failure code. 2. Use "--lock-cgroup" instead of "-g"; "-g" was proposed but never merged. See commit 4d1792d0a2564caf ("perf lock contention: Add --lock-cgroup option") 3. Call cleanup() on every normal exit so trap_cleanup() doesn't mistake it for an unexpected signal and emit a false-negative "Unexpected signal in main" message. Before patch: # ./perf test -vv "lock contention" 85: kernel lock contention analysis test: --- start --- test child forked, pid 610711 Testing perf lock record and perf lock contention Testing perf lock contention --use-bpf Testing perf lock record and perf lock contention at the same time Testing perf lock contention --threads Testing perf lock contention --lock-addr Testing perf lock contention --lock-cgroup Unexpected signal in test_aggr_cgroup ---- end(0) ---- 85: kernel lock contention analysis test : Ok After patch: # ./perf test -vv "lock contention" 85: kernel lock contention analysis test: --- start --- test child forked, pid 602637 Testing perf lock record and perf lock contention Testing perf lock contention --use-bpf Testing perf lock record and perf lock contention at the same time Testing perf lock contention --threads Testing perf lock contention --lock-addr Testing perf lock contention --lock-cgroup Testing perf lock contention --type-filter (w/ spinlock) Testing perf lock contention --lock-filter (w/ tasklist_lock) Testing perf lock contention --callstack-filter (w/ unix_stream) [Skip] Could not find 'unix_stream' Testing perf lock contention --callstack-filter with task aggregation [Skip] Could not find 'unix_stream' Testing perf lock contention --cgroup-filter Testing perf lock contention CSV output ---- end(0) ---- 85: kernel lock contention analysis test : Ok Reviewed-by: Ian Rogers Signed-off-by: Ravi Bangoria Tested-by: Arnaldo Carvalho de Melo Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ananth Narayan Cc: Ingo Molnar Cc: James Clark Cc: Jiri Olsa Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Sandipan Das Cc: Santosh Shukla Cc: Tycho Andersen Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/tests/shell/lock_contention.sh | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/tools/perf/tests/shell/lock_contention.sh b/tools/perf/tests/shell/lock_contention.sh index 7248a74ca2a3..6dd90519f45c 100755 --- a/tools/perf/tests/shell/lock_contention.sh +++ b/tools/perf/tests/shell/lock_contention.sh @@ -13,15 +13,18 @@ cleanup() { rm -f ${perfdata} rm -f ${result} rm -f ${errout} - trap - EXIT TERM INT + trap - EXIT TERM INT ERR } trap_cleanup() { + if (( $? == 139 )); then #SIGSEGV + err=1 + fi echo "Unexpected signal in ${FUNCNAME[1]}" cleanup exit ${err} } -trap trap_cleanup EXIT TERM INT +trap trap_cleanup EXIT TERM INT ERR check() { if [ "$(id -u)" != 0 ]; then @@ -145,7 +148,7 @@ test_aggr_cgroup() fi # the perf lock contention output goes to the stderr - perf lock con -a -b -g -E 1 -q -- perf bench sched messaging -p > /dev/null 2> ${result} + perf lock con -a -b --lock-cgroup -E 1 -q -- perf bench sched messaging -p > /dev/null 2> ${result} if [ "$(cat "${result}" | wc -l)" != "1" ]; then echo "[Fail] BPF result count is not 1:" "$(cat "${result}" | wc -l)" err=1 @@ -271,7 +274,7 @@ test_cgroup_filter() return fi - perf lock con -a -b -g -E 1 -F wait_total -q -- perf bench sched messaging -p > /dev/null 2> ${result} + perf lock con -a -b --lock-cgroup -E 1 -F wait_total -q -- perf bench sched messaging -p > /dev/null 2> ${result} if [ "$(cat "${result}" | wc -l)" != "1" ]; then echo "[Fail] BPF result should have a cgroup result:" "$(cat "${result}")" err=1 @@ -279,7 +282,7 @@ test_cgroup_filter() fi cgroup=$(cat "${result}" | awk '{ print $3 }') - perf lock con -a -b -g -E 1 -G "${cgroup}" -q -- perf bench sched messaging -p > /dev/null 2> ${result} + perf lock con -a -b --lock-cgroup -E 1 -G "${cgroup}" -q -- perf bench sched messaging -p > /dev/null 2> ${result} if [ "$(cat "${result}" | wc -l)" != "1" ]; then echo "[Fail] BPF result should have a result with cgroup filter:" "$(cat "${cgroup}")" err=1 @@ -338,4 +341,5 @@ test_aggr_task_stack_filter test_cgroup_filter test_csv_output +cleanup exit ${err} -- cgit From b72b8132d8fd2d6bf5b420a03d4fc553980c3a92 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Tue, 11 Nov 2025 23:43:11 -0800 Subject: perf libbfd: Ensure libbfd is initialized prior to use Multiple threads may be creating and destroying BFD objects in situations like `perf top`. Without appropriate initialization crashes may occur during libbfd's cache management. BFD's locks require recursive mutexes, add support for these. Committer testing: This happens only when building with 'make BUILD_NONDISTRO=1' and having the binutils-devel package (or equivalent) installed, i.e. linking with binutils devel files, an opt-in perf build. Before: root@x1:~# perf top perf: Segmentation fault -------- backtrace -------- root@x1:~# After this patch it works as before. Closes: https://lore.kernel.org/lkml/aQt66zhfxSA80xwt@gentoo.org/ Fixes: 95931d9a594dd0b5 ("perf libbfd: Move libbfd functionality to its own file") Reported-by: Guilherme Amadio Signed-off-by: Ian Rogers Tested-by: Arnaldo Carvalho de Melo Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ingo Molnar Cc: Jiri Olsa Cc: Namhyung Kim Cc: Peter Zijlstra Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/libbfd.c | 38 ++++++++++++++++++++++++++++++++++++++ tools/perf/util/mutex.c | 14 ++++++++++---- tools/perf/util/mutex.h | 2 ++ 3 files changed, 50 insertions(+), 4 deletions(-) diff --git a/tools/perf/util/libbfd.c b/tools/perf/util/libbfd.c index 01147fbf73b3..6434c2dccd4a 100644 --- a/tools/perf/util/libbfd.c +++ b/tools/perf/util/libbfd.c @@ -38,6 +38,39 @@ struct a2l_data { asymbol **syms; }; +static bool perf_bfd_lock(void *bfd_mutex) +{ + mutex_lock(bfd_mutex); + return true; +} + +static bool perf_bfd_unlock(void *bfd_mutex) +{ + mutex_unlock(bfd_mutex); + return true; +} + +static void perf_bfd_init(void) +{ + static struct mutex bfd_mutex; + + mutex_init_recursive(&bfd_mutex); + + if (bfd_init() != BFD_INIT_MAGIC) { + pr_err("Error initializing libbfd\n"); + return; + } + if (!bfd_thread_init(perf_bfd_lock, perf_bfd_unlock, &bfd_mutex)) + pr_err("Error initializing libbfd threading\n"); +} + +static void ensure_bfd_init(void) +{ + static pthread_once_t bfd_init_once = PTHREAD_ONCE_INIT; + + pthread_once(&bfd_init_once, perf_bfd_init); +} + static int bfd_error(const char *string) { const char *errmsg; @@ -132,6 +165,7 @@ static struct a2l_data *addr2line_init(const char *path) bfd *abfd; struct a2l_data *a2l = NULL; + ensure_bfd_init(); abfd = bfd_openr(path, NULL); if (abfd == NULL) return NULL; @@ -288,6 +322,7 @@ int dso__load_bfd_symbols(struct dso *dso, const char *debugfile) bfd *abfd; u64 start, len; + ensure_bfd_init(); abfd = bfd_openr(debugfile, NULL); if (!abfd) return -1; @@ -393,6 +428,7 @@ int libbfd__read_build_id(const char *filename, struct build_id *bid, bool block if (fd < 0) return -1; + ensure_bfd_init(); abfd = bfd_fdopenr(filename, /*target=*/NULL, fd); if (!abfd) return -1; @@ -421,6 +457,7 @@ int libbfd_filename__read_debuglink(const char *filename, char *debuglink, asection *section; bfd *abfd; + ensure_bfd_init(); abfd = bfd_openr(filename, NULL); if (!abfd) return -1; @@ -480,6 +517,7 @@ int symbol__disassemble_bpf_libbfd(struct symbol *sym __maybe_unused, memset(tpath, 0, sizeof(tpath)); perf_exe(tpath, sizeof(tpath)); + ensure_bfd_init(); bfdf = bfd_openr(tpath, NULL); if (bfdf == NULL) abort(); diff --git a/tools/perf/util/mutex.c b/tools/perf/util/mutex.c index bca7f0717f35..7aa1f3f55a7d 100644 --- a/tools/perf/util/mutex.c +++ b/tools/perf/util/mutex.c @@ -17,7 +17,7 @@ static void check_err(const char *fn, int err) #define CHECK_ERR(err) check_err(__func__, err) -static void __mutex_init(struct mutex *mtx, bool pshared) +static void __mutex_init(struct mutex *mtx, bool pshared, bool recursive) { pthread_mutexattr_t attr; @@ -27,21 +27,27 @@ static void __mutex_init(struct mutex *mtx, bool pshared) /* In normal builds enable error checking, such as recursive usage. */ CHECK_ERR(pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK)); #endif + if (recursive) + CHECK_ERR(pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE)); if (pshared) CHECK_ERR(pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED)); - CHECK_ERR(pthread_mutex_init(&mtx->lock, &attr)); CHECK_ERR(pthread_mutexattr_destroy(&attr)); } void mutex_init(struct mutex *mtx) { - __mutex_init(mtx, /*pshared=*/false); + __mutex_init(mtx, /*pshared=*/false, /*recursive=*/false); } void mutex_init_pshared(struct mutex *mtx) { - __mutex_init(mtx, /*pshared=*/true); + __mutex_init(mtx, /*pshared=*/true, /*recursive=*/false); +} + +void mutex_init_recursive(struct mutex *mtx) +{ + __mutex_init(mtx, /*pshared=*/false, /*recursive=*/true); } void mutex_destroy(struct mutex *mtx) diff --git a/tools/perf/util/mutex.h b/tools/perf/util/mutex.h index 38458f00846f..70232d8d094f 100644 --- a/tools/perf/util/mutex.h +++ b/tools/perf/util/mutex.h @@ -104,6 +104,8 @@ void mutex_init(struct mutex *mtx); * process-private attribute. */ void mutex_init_pshared(struct mutex *mtx); +/* Initializes a mutex that may be recursively held on the same thread. */ +void mutex_init_recursive(struct mutex *mtx); void mutex_destroy(struct mutex *mtx); void mutex_lock(struct mutex *mtx) EXCLUSIVE_LOCK_FUNCTION(*mtx); -- cgit From c77b3b79a92e3345aa1ee296180d1af4e7031f8f Mon Sep 17 00:00:00 2001 From: Jiayuan Chen Date: Tue, 11 Nov 2025 14:02:51 +0800 Subject: mptcp: Fix proto fallback detection with BPF The sockmap feature allows bpf syscall from userspace, or based on bpf sockops, replacing the sk_prot of sockets during protocol stack processing with sockmap's custom read/write interfaces. ''' tcp_rcv_state_process() syn_recv_sock()/subflow_syn_recv_sock() tcp_init_transfer(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) bpf_skops_established <== sockops bpf_sock_map_update(sk) <== call bpf helper tcp_bpf_update_proto() <== update sk_prot ''' When the server has MPTCP enabled but the client sends a TCP SYN without MPTCP, subflow_syn_recv_sock() performs a fallback on the subflow, replacing the subflow sk's sk_prot with the native sk_prot. ''' subflow_syn_recv_sock() subflow_ulp_fallback() subflow_drop_ctx() mptcp_subflow_ops_undo_override() ''' Then, this subflow can be normally used by sockmap, which replaces the native sk_prot with sockmap's custom sk_prot. The issue occurs when the user executes accept::mptcp_stream_accept::mptcp_fallback_tcp_ops(). Here, it uses sk->sk_prot to compare with the native sk_prot, but this is incorrect when sockmap is used, as we may incorrectly set sk->sk_socket->ops. This fix uses the more generic sk_family for the comparison instead. Additionally, this also prevents a WARNING from occurring: result from ./scripts/decode_stacktrace.sh: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 337 at net/mptcp/protocol.c:68 mptcp_stream_accept \ (net/mptcp/protocol.c:4005) Modules linked in: ... PKRU: 55555554 Call Trace: do_accept (net/socket.c:1989) __sys_accept4 (net/socket.c:2028 net/socket.c:2057) __x64_sys_accept (net/socket.c:2067) x64_sys_call (arch/x86/entry/syscall_64.c:41) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) RIP: 0033:0x7f87ac92b83d ---[ end trace 0000000000000000 ]--- Fixes: 0b4f33def7bb ("mptcp: fix tcp fallback crash") Signed-off-by: Jiayuan Chen Signed-off-by: Martin KaFai Lau Reviewed-by: Jakub Sitnicki Reviewed-by: Matthieu Baerts (NGI0) Cc: Link: https://patch.msgid.link/20251111060307.194196-3-jiayuan.chen@linux.dev --- net/mptcp/protocol.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 2d6b8de35c44..90b4aeca2596 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -61,11 +61,13 @@ static u64 mptcp_wnd_end(const struct mptcp_sock *msk) static const struct proto_ops *mptcp_fallback_tcp_ops(const struct sock *sk) { + unsigned short family = READ_ONCE(sk->sk_family); + #if IS_ENABLED(CONFIG_MPTCP_IPV6) - if (sk->sk_prot == &tcpv6_prot) + if (family == AF_INET6) return &inet6_stream_ops; #endif - WARN_ON_ONCE(sk->sk_prot != &tcp_prot); + WARN_ON_ONCE(family != AF_INET); return &inet_stream_ops; } -- cgit From cb730e4ac1b4dca09d364fd83464ebd29547a4ef Mon Sep 17 00:00:00 2001 From: Jiayuan Chen Date: Tue, 11 Nov 2025 14:02:52 +0800 Subject: selftests/bpf: Add mptcp test with sockmap Add test cases to verify that when MPTCP falls back to plain TCP sockets, they can properly work with sockmap. Additionally, add test cases to ensure that sockmap correctly rejects MPTCP sockets as expected. Signed-off-by: Jiayuan Chen Signed-off-by: Martin KaFai Lau Acked-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251111060307.194196-4-jiayuan.chen@linux.dev --- tools/testing/selftests/bpf/prog_tests/mptcp.c | 140 ++++++++++++++++++++++ tools/testing/selftests/bpf/progs/mptcp_sockmap.c | 43 +++++++ 2 files changed, 183 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/mptcp_sockmap.c diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c index f8eb7f9d4fd2..8fade8bdc451 100644 --- a/tools/testing/selftests/bpf/prog_tests/mptcp.c +++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c @@ -6,11 +6,13 @@ #include #include #include +#include #include "cgroup_helpers.h" #include "network_helpers.h" #include "mptcp_sock.skel.h" #include "mptcpify.skel.h" #include "mptcp_subflow.skel.h" +#include "mptcp_sockmap.skel.h" #define NS_TEST "mptcp_ns" #define ADDR_1 "10.0.1.1" @@ -436,6 +438,142 @@ close_cgroup: close(cgroup_fd); } +/* Test sockmap on MPTCP server handling non-mp-capable clients. */ +static void test_sockmap_with_mptcp_fallback(struct mptcp_sockmap *skel) +{ + int listen_fd = -1, client_fd1 = -1, client_fd2 = -1; + int server_fd1 = -1, server_fd2 = -1, sent, recvd; + char snd[9] = "123456789"; + char rcv[10]; + + /* start server with MPTCP enabled */ + listen_fd = start_mptcp_server(AF_INET, NULL, 0, 0); + if (!ASSERT_OK_FD(listen_fd, "sockmap-fb:start_mptcp_server")) + return; + + skel->bss->trace_port = ntohs(get_socket_local_port(listen_fd)); + skel->bss->sk_index = 0; + /* create client without MPTCP enabled */ + client_fd1 = connect_to_fd_opts(listen_fd, NULL); + if (!ASSERT_OK_FD(client_fd1, "sockmap-fb:connect_to_fd")) + goto end; + + server_fd1 = accept(listen_fd, NULL, 0); + skel->bss->sk_index = 1; + client_fd2 = connect_to_fd_opts(listen_fd, NULL); + if (!ASSERT_OK_FD(client_fd2, "sockmap-fb:connect_to_fd")) + goto end; + + server_fd2 = accept(listen_fd, NULL, 0); + /* test normal redirect behavior: data sent by client_fd1 can be + * received by client_fd2 + */ + skel->bss->redirect_idx = 1; + sent = send(client_fd1, snd, sizeof(snd), 0); + if (!ASSERT_EQ(sent, sizeof(snd), "sockmap-fb:send(client_fd1)")) + goto end; + + /* try to recv more bytes to avoid truncation check */ + recvd = recv(client_fd2, rcv, sizeof(rcv), 0); + if (!ASSERT_EQ(recvd, sizeof(snd), "sockmap-fb:recv(client_fd2)")) + goto end; + +end: + if (client_fd1 >= 0) + close(client_fd1); + if (client_fd2 >= 0) + close(client_fd2); + if (server_fd1 >= 0) + close(server_fd1); + if (server_fd2 >= 0) + close(server_fd2); + close(listen_fd); +} + +/* Test sockmap rejection of MPTCP sockets - both server and client sides. */ +static void test_sockmap_reject_mptcp(struct mptcp_sockmap *skel) +{ + int listen_fd = -1, server_fd = -1, client_fd1 = -1; + int err, zero = 0; + + /* start server with MPTCP enabled */ + listen_fd = start_mptcp_server(AF_INET, NULL, 0, 0); + if (!ASSERT_OK_FD(listen_fd, "start_mptcp_server")) + return; + + skel->bss->trace_port = ntohs(get_socket_local_port(listen_fd)); + skel->bss->sk_index = 0; + /* create client with MPTCP enabled */ + client_fd1 = connect_to_fd(listen_fd, 0); + if (!ASSERT_OK_FD(client_fd1, "connect_to_fd client_fd1")) + goto end; + + /* bpf_sock_map_update() called from sockops should reject MPTCP sk */ + if (!ASSERT_EQ(skel->bss->helper_ret, -EOPNOTSUPP, "should reject")) + goto end; + + server_fd = accept(listen_fd, NULL, 0); + err = bpf_map_update_elem(bpf_map__fd(skel->maps.sock_map), + &zero, &server_fd, BPF_NOEXIST); + if (!ASSERT_EQ(err, -EOPNOTSUPP, "server should be disallowed")) + goto end; + + /* MPTCP client should also be disallowed */ + err = bpf_map_update_elem(bpf_map__fd(skel->maps.sock_map), + &zero, &client_fd1, BPF_NOEXIST); + if (!ASSERT_EQ(err, -EOPNOTSUPP, "client should be disallowed")) + goto end; +end: + if (client_fd1 >= 0) + close(client_fd1); + if (server_fd >= 0) + close(server_fd); + close(listen_fd); +} + +static void test_mptcp_sockmap(void) +{ + struct mptcp_sockmap *skel; + struct netns_obj *netns; + int cgroup_fd, err; + + cgroup_fd = test__join_cgroup("/mptcp_sockmap"); + if (!ASSERT_OK_FD(cgroup_fd, "join_cgroup: mptcp_sockmap")) + return; + + skel = mptcp_sockmap__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open_load: mptcp_sockmap")) + goto close_cgroup; + + skel->links.mptcp_sockmap_inject = + bpf_program__attach_cgroup(skel->progs.mptcp_sockmap_inject, cgroup_fd); + if (!ASSERT_OK_PTR(skel->links.mptcp_sockmap_inject, "attach sockmap")) + goto skel_destroy; + + err = bpf_prog_attach(bpf_program__fd(skel->progs.mptcp_sockmap_redirect), + bpf_map__fd(skel->maps.sock_map), + BPF_SK_SKB_STREAM_VERDICT, 0); + if (!ASSERT_OK(err, "bpf_prog_attach stream verdict")) + goto skel_destroy; + + netns = netns_new(NS_TEST, true); + if (!ASSERT_OK_PTR(netns, "netns_new: mptcp_sockmap")) + goto skel_destroy; + + if (endpoint_init("subflow") < 0) + goto close_netns; + + test_sockmap_with_mptcp_fallback(skel); + test_sockmap_reject_mptcp(skel); + +close_netns: + netns_free(netns); +skel_destroy: + mptcp_sockmap__destroy(skel); +close_cgroup: + close(cgroup_fd); +} + void test_mptcp(void) { if (test__start_subtest("base")) @@ -444,4 +582,6 @@ void test_mptcp(void) test_mptcpify(); if (test__start_subtest("subflow")) test_subflow(); + if (test__start_subtest("sockmap")) + test_mptcp_sockmap(); } diff --git a/tools/testing/selftests/bpf/progs/mptcp_sockmap.c b/tools/testing/selftests/bpf/progs/mptcp_sockmap.c new file mode 100644 index 000000000000..d4eef0cbadb9 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/mptcp_sockmap.c @@ -0,0 +1,43 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include "bpf_tracing_net.h" + +char _license[] SEC("license") = "GPL"; + +int sk_index; +int redirect_idx; +int trace_port; +int helper_ret; +struct { + __uint(type, BPF_MAP_TYPE_SOCKMAP); + __uint(key_size, sizeof(__u32)); + __uint(value_size, sizeof(__u32)); + __uint(max_entries, 100); +} sock_map SEC(".maps"); + +SEC("sockops") +int mptcp_sockmap_inject(struct bpf_sock_ops *skops) +{ + struct bpf_sock *sk; + + /* only accept specified connection */ + if (skops->local_port != trace_port || + skops->op != BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) + return 1; + + sk = skops->sk; + if (!sk) + return 1; + + /* update sk handler */ + helper_ret = bpf_sock_map_update(skops, &sock_map, &sk_index, BPF_NOEXIST); + + return 1; +} + +SEC("sk_skb/stream_verdict") +int mptcp_sockmap_redirect(struct __sk_buff *skb) +{ + /* redirect skb to the sk under sock_map[redirect_idx] */ + return bpf_sk_redirect_map(skb, &sock_map, redirect_idx, 0); +} -- cgit From 264152a97edf9f1b7ed5372e4033e46108e41422 Mon Sep 17 00:00:00 2001 From: Chukun Pan Date: Sat, 1 Nov 2025 22:01:01 +0800 Subject: arm64: dts: rockchip: drop reset from rk3576 i2c9 node The reset property is not part of the binding, so drop it. It is also not used by the driver, so it was likely copied from some vendor-kernel node. Fixes: 57b1ce903966 ("arm64: dts: rockchip: Add rk3576 SoC base DT") Signed-off-by: Chukun Pan Link: https://patch.msgid.link/20251101140101.302229-1-amadeus@jmu.edu.cn Signed-off-by: Heiko Stuebner --- arch/arm64/boot/dts/rockchip/rk3576.dtsi | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3576.dtsi b/arch/arm64/boot/dts/rockchip/rk3576.dtsi index f0c3ab00a7f3..a86fc6b4e8c4 100644 --- a/arch/arm64/boot/dts/rockchip/rk3576.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3576.dtsi @@ -2549,8 +2549,6 @@ interrupts = ; pinctrl-names = "default"; pinctrl-0 = <&i2c9m0_xfer>; - resets = <&cru SRST_I2C9>, <&cru SRST_P_I2C9>; - reset-names = "i2c", "apb"; #address-cells = <1>; #size-cells = <0>; status = "disabled"; -- cgit From baa18d577cd445145039e731d3de0fa49ca57204 Mon Sep 17 00:00:00 2001 From: Quentin Schulz Date: Wed, 12 Nov 2025 16:01:53 +0100 Subject: arm64: dts: rockchip: disable HS400 on RK3588 Tiger We've had reports from the field that some RK3588 Tiger have random issues with eMMC errors. Applying commit a28352cf2d2f ("mmc: sdhci-of-dwcmshc: Change DLL_STRBIN_TAPNUM_DEFAULT to 0x4") didn't help and seemed to have made things worse for our board. Our HW department checked the eMMC lines and reported that they are too long and don't look great so signal integrity is probably not the best. Note that not all Tigers with the same eMMC chip have errors, so the suspicion is that we're really on the edge in terms of signal integrity and only a handful devices are failing. Additionally, we have RK3588 Jaguars with the same eMMC chip but the layout is different and we also haven't received reports about those so far. Lowering the max-frequency to 150MHz from 200MHz instead of simply disabling HS400 was briefly tested and seem to work as well. We've disabled HS400 downstream and haven't received reports since so we'll go with that instead of lowering the max-frequency. Signed-off-by: Quentin Schulz Fixes: 6173ef24b35b ("arm64: dts: rockchip: add RK3588-Q7 (Tiger) SoM") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251112-tiger-hs200-v1-1-b50adac107c0@cherry.de [added Fixes tag and stable-cc from 2nd mail] Signed-off-by: Heiko Stuebner --- arch/arm64/boot/dts/rockchip/rk3588-tiger.dtsi | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3588-tiger.dtsi b/arch/arm64/boot/dts/rockchip/rk3588-tiger.dtsi index b44e89e1bb15..365c1d958f2d 100644 --- a/arch/arm64/boot/dts/rockchip/rk3588-tiger.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3588-tiger.dtsi @@ -382,14 +382,12 @@ cap-mmc-highspeed; mmc-ddr-1_8v; mmc-hs200-1_8v; - mmc-hs400-1_8v; - mmc-hs400-enhanced-strobe; mmc-pwrseq = <&emmc_pwrseq>; no-sdio; no-sd; non-removable; pinctrl-names = "default"; - pinctrl-0 = <&emmc_bus8 &emmc_cmd &emmc_clk &emmc_data_strobe>; + pinctrl-0 = <&emmc_bus8 &emmc_cmd &emmc_clk>; vmmc-supply = <&vcc_3v3_s3>; vqmmc-supply = <&vcc_1v8_s3>; status = "okay"; -- cgit From b5414520793e68d266fdd97a84989d9831156aad Mon Sep 17 00:00:00 2001 From: Mykola Kvach Date: Mon, 3 Nov 2025 12:27:40 +0200 Subject: arm64: dts: rockchip: fix PCIe 3.3V regulator voltage on orangepi-5 The vcc3v3_pcie20 fixed regulator powers the PCIe device-side 3.3V rail for pcie2x1l2 via vpcie3v3-supply. The DTS mistakenly set its regulator-min/max-microvolt to 1800000 (1.8 V). Correct both to 3300000 (3.3 V) to match the rail name, the PCIe/M.2 power requirement, and the actual hardware wiring on Orange Pi 5. Fixes: b6bc755d806e ("arm64: dts: rockchip: Add Orange Pi 5") Cc: stable@vger.kernel.org Signed-off-by: Mykola Kvach Reviewed-by: Michael Riesch Link: https://patch.msgid.link/cf6e08dfdfbf1c540685d12388baab1326f95d2c.1762165324.git.xakep.amatop@gmail.com Signed-off-by: Heiko Stuebner --- arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5.dts | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5.dts b/arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5.dts index ad6d04793b0a..83b9b6645a1e 100644 --- a/arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5.dts +++ b/arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5.dts @@ -14,8 +14,8 @@ gpios = <&gpio0 RK_PC5 GPIO_ACTIVE_HIGH>; regulator-name = "vcc3v3_pcie20"; regulator-boot-on; - regulator-min-microvolt = <1800000>; - regulator-max-microvolt = <1800000>; + regulator-min-microvolt = <3300000>; + regulator-max-microvolt = <3300000>; startup-delay-us = <50000>; vin-supply = <&vcc5v0_sys>; }; -- cgit From 7410c86fc05b1423466c1a579bcc994f87822566 Mon Sep 17 00:00:00 2001 From: Baruch Siach Date: Wed, 12 Nov 2025 13:17:03 +0200 Subject: MAINTAINERS: Remove eth bridge website MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Ethernet bridge website URL shows "This page isn’t available". Signed-off-by: Baruch Siach Reviewed-by: Ido Schimmel Acked-by: Nikolay Aleksandrov Link: https://patch.msgid.link/0a32aaf7fa4473e7574f7327480e8fbc4fef2741.1762946223.git.baruch@tkos.co.il Signed-off-by: Jakub Kicinski --- MAINTAINERS | 1 - 1 file changed, 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 8a4d9c0c7b8a..cffd4effa50a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -9266,7 +9266,6 @@ M: Ido Schimmel L: bridge@lists.linux.dev L: netdev@vger.kernel.org S: Maintained -W: http://www.linuxfoundation.org/en/Net:Bridge F: include/linux/if_bridge.h F: include/uapi/linux/if_bridge.h F: include/linux/netfilter_bridge/ -- cgit From f796a8dec9beafcc0f6f0d3478ed685a15c5e062 Mon Sep 17 00:00:00 2001 From: Jiaming Zhang Date: Wed, 12 Nov 2025 01:36:52 +0800 Subject: net: core: prevent NULL deref in generic_hwtstamp_ioctl_lower() The ethtool tsconfig Netlink path can trigger a null pointer dereference. A call chain such as: tsconfig_prepare_data() -> dev_get_hwtstamp_phylib() -> vlan_hwtstamp_get() -> generic_hwtstamp_get_lower() -> generic_hwtstamp_ioctl_lower() results in generic_hwtstamp_ioctl_lower() being called with kernel_cfg->ifr as NULL. The generic_hwtstamp_ioctl_lower() function does not expect a NULL ifr and dereferences it, leading to a system crash. Fix this by adding a NULL check for kernel_cfg->ifr in generic_hwtstamp_ioctl_lower(). If ifr is NULL, return -EINVAL. Fixes: 6e9e2eed4f39 ("net: ethtool: Add support for tsconfig command to get/set hwtstamp config") Closes: https://lore.kernel.org/cd6a7056-fa6d-43f8-b78a-f5e811247ba8@linux.dev Signed-off-by: Jiaming Zhang Reviewed-by: Kory Maincent Link: https://patch.msgid.link/20251111173652.749159-2-r772577952@gmail.com Signed-off-by: Jakub Kicinski --- net/core/dev_ioctl.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c index ad54b12d4b4c..8bb71a10dba0 100644 --- a/net/core/dev_ioctl.c +++ b/net/core/dev_ioctl.c @@ -443,6 +443,9 @@ static int generic_hwtstamp_ioctl_lower(struct net_device *dev, int cmd, struct ifreq ifrr; int err; + if (!kernel_cfg->ifr) + return -EINVAL; + strscpy_pad(ifrr.ifr_name, dev->name, IFNAMSIZ); ifrr.ifr_ifru = kernel_cfg->ifr->ifr_ifru; -- cgit From 407a06507c2358554958e8164dc97176feddcafc Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Wed, 12 Nov 2025 05:21:14 +0000 Subject: mlxsw: spectrum: Fix memory leak in mlxsw_sp_flower_stats() The function mlxsw_sp_flower_stats() calls mlxsw_sp_acl_ruleset_get() to obtain a ruleset reference. If the subsequent call to mlxsw_sp_acl_rule_lookup() fails to find a rule, the function returns an error without releasing the ruleset reference, causing a memory leak. Fix this by using a goto to the existing error handling label, which calls mlxsw_sp_acl_ruleset_put() to properly release the reference. Fixes: 7c1b8eb175b69 ("mlxsw: spectrum: Add support for TC flower offload statistics") Signed-off-by: Zilin Guan Reviewed-by: Ido Schimmel Link: https://patch.msgid.link/20251112052114.1591695-1-zilin@seu.edu.cn Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c index 6a4a81c63451..353fd9ca89a6 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c @@ -830,8 +830,10 @@ int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp, return -EINVAL; rule = mlxsw_sp_acl_rule_lookup(mlxsw_sp, ruleset, f->cookie); - if (!rule) - return -EINVAL; + if (!rule) { + err = -EINVAL; + goto err_rule_get_stats; + } err = mlxsw_sp_acl_rule_get_stats(mlxsw_sp, rule, &packets, &bytes, &drops, &lastuse, &used_hw_stats); -- cgit From a55ef3bff84f11ee8c84a1ae29b071ffd4ccbbd9 Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Mon, 10 Nov 2025 06:41:25 +0000 Subject: xfrm: fix memory leak in xfrm_add_acquire() The xfrm_add_acquire() function constructs an xfrm policy by calling xfrm_policy_construct(). This allocates the policy structure and potentially associates a security context and a device policy with it. However, at the end of the function, the policy object is freed using only kfree() . This skips the necessary cleanup for the security context and device policy, leading to a memory leak. To fix this, invoke the proper cleanup functions xfrm_dev_policy_delete(), xfrm_dev_policy_free(), and security_xfrm_policy_free() before freeing the policy object. This approach mirrors the error handling path in xfrm_add_policy(), ensuring that all associated resources are correctly released. Fixes: 980ebd25794f ("[IPSEC]: Sync series - acquire insert") Signed-off-by: Zilin Guan Reviewed-by: Sabrina Dubroca Signed-off-by: Steffen Klassert --- net/xfrm/xfrm_user.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 9d98cc9daa37..403b5ecac2c5 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -3038,6 +3038,9 @@ static int xfrm_add_acquire(struct sk_buff *skb, struct nlmsghdr *nlh, } xfrm_state_free(x); + xfrm_dev_policy_delete(xp); + xfrm_dev_policy_free(xp); + security_xfrm_policy_free(xp->security); kfree(xp); return 0; -- cgit From f84fd5bec502447df145f31734793714690ce27f Mon Sep 17 00:00:00 2001 From: Luke Wang Date: Fri, 14 Nov 2025 14:53:08 +0800 Subject: pwm: adp5585: Correct mismatched pwm chip info MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The register addresses of ADP5585 and ADP5589 are swapped. Fixes: 75024f97e82e ("pwm: adp5585: add support for adp5589") Signed-off-by: Luke Wang Acked-by: Nuno Sá Tested-by: Liu Ying # ADP5585 PWM Link: https://patch.msgid.link/20251114065308.2074893-1-ziniu.wang_1@nxp.com Signed-off-by: Uwe Kleine-König --- drivers/pwm/pwm-adp5585.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/pwm/pwm-adp5585.c b/drivers/pwm/pwm-adp5585.c index dc2860979e24..806f8d79b0d7 100644 --- a/drivers/pwm/pwm-adp5585.c +++ b/drivers/pwm/pwm-adp5585.c @@ -190,13 +190,13 @@ static int adp5585_pwm_probe(struct platform_device *pdev) return 0; } -static const struct adp5585_pwm_chip adp5589_pwm_chip_info = { +static const struct adp5585_pwm_chip adp5585_pwm_chip_info = { .pwm_cfg = ADP5585_PWM_CFG, .pwm_offt_low = ADP5585_PWM_OFFT_LOW, .pwm_ont_low = ADP5585_PWM_ONT_LOW, }; -static const struct adp5585_pwm_chip adp5585_pwm_chip_info = { +static const struct adp5585_pwm_chip adp5589_pwm_chip_info = { .pwm_cfg = ADP5589_PWM_CFG, .pwm_offt_low = ADP5589_PWM_OFFT_LOW, .pwm_ont_low = ADP5589_PWM_ONT_LOW, -- cgit From e1a97a627cd01d73fac5dd054d8f3de601ef2781 Mon Sep 17 00:00:00 2001 From: Mario Limonciello Date: Thu, 13 Nov 2025 16:35:50 -0600 Subject: x86/CPU/AMD: Add additional fixed RDSEED microcode revisions Microcode that resolves the RDSEED failure (SB-7055 [1]) has been released for additional Zen5 models to linux-firmware [2]. Update the zen5_rdseed_microcode array to cover these new models. Fixes: 607b9fb2ce24 ("x86/CPU/AMD: Add RDSEED fix for Zen5") Signed-off-by: Mario Limonciello Signed-off-by: Borislav Petkov (AMD) Cc: Link: https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7055.html [1] Link: https://gitlab.com/kernel-firmware/linux-firmware/-/commit/6167e5566900cf236f7a69704e8f4c441bc7212a [2] Link: https://patch.msgid.link/20251113223608.1495655-1-mario.limonciello@amd.com --- arch/x86/kernel/cpu/amd.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 2ba9f2d42d8c..5d46709c58d0 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -1037,7 +1037,14 @@ static void init_amd_zen4(struct cpuinfo_x86 *c) static const struct x86_cpu_id zen5_rdseed_microcode[] = { ZEN_MODEL_STEP_UCODE(0x1a, 0x02, 0x1, 0x0b00215a), + ZEN_MODEL_STEP_UCODE(0x1a, 0x08, 0x1, 0x0b008121), ZEN_MODEL_STEP_UCODE(0x1a, 0x11, 0x0, 0x0b101054), + ZEN_MODEL_STEP_UCODE(0x1a, 0x24, 0x0, 0x0b204037), + ZEN_MODEL_STEP_UCODE(0x1a, 0x44, 0x0, 0x0b404035), + ZEN_MODEL_STEP_UCODE(0x1a, 0x44, 0x1, 0x0b404108), + ZEN_MODEL_STEP_UCODE(0x1a, 0x60, 0x0, 0x0b600037), + ZEN_MODEL_STEP_UCODE(0x1a, 0x68, 0x0, 0x0b608038), + ZEN_MODEL_STEP_UCODE(0x1a, 0x70, 0x0, 0x0b700037), {}, }; -- cgit From dd14022a7ce96963aa923e35cf4bcc8c32f95840 Mon Sep 17 00:00:00 2001 From: "Borislav Petkov (AMD)" Date: Fri, 14 Nov 2025 14:01:14 +0100 Subject: x86/microcode/AMD: Add Zen5 model 0x44, stepping 0x1 minrev Add the minimum Entrysign revision for that model+stepping to the list of minimum revisions. Fixes: 50cef76d5cb0 ("x86/microcode/AMD: Load only SHA256-checksummed patches") Reported-by: Andrew Cooper Signed-off-by: Borislav Petkov (AMD) Cc: Link: https://lore.kernel.org/r/e94dd76b-4911-482f-8500-5c848a3df026@citrix.com --- arch/x86/kernel/cpu/microcode/amd.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c index dc82153009da..a881bf4c2011 100644 --- a/arch/x86/kernel/cpu/microcode/amd.c +++ b/arch/x86/kernel/cpu/microcode/amd.c @@ -224,6 +224,7 @@ static bool need_sha_check(u32 cur_rev) case 0xb1010: return cur_rev <= 0xb101046; break; case 0xb2040: return cur_rev <= 0xb204031; break; case 0xb4040: return cur_rev <= 0xb404031; break; + case 0xb4041: return cur_rev <= 0xb404101; break; case 0xb6000: return cur_rev <= 0xb600031; break; case 0xb6080: return cur_rev <= 0xb608031; break; case 0xb7000: return cur_rev <= 0xb700031; break; -- cgit From 21a9ab5b90b3716a631d559e62818029b4e7f5b7 Mon Sep 17 00:00:00 2001 From: Lushih Hsieh Date: Fri, 14 Nov 2025 13:20:53 +0800 Subject: ALSA: usb-audio: Add native DSD quirks for PureAudio DAC series The PureAudio APA DAC and Lotus DAC5 series are USB Audio 2.0 Class devices that support native Direct Stream Digital (DSD) playback via specific vendor protocols. Without these quirks, the devices may only function in standard PCM mode, or fail to correctly report their DSD format capabilities to the ALSA framework, preventing native DSD playback under Linux. This commit adds new quirk entries for the mentioned DAC models based on their respective Vendor/Product IDs (VID:PID), for example: 0x16d0:0x0ab1 (APA DAC), 0x16d0:0xeca1 (DAC5 series), etc. The quirk ensures correct DSD format handling by setting the required SNDRV_PCM_FMTBIT_DSD_U32_BE format bit and defining the DSD-specific Audio Class 2.0 (AC2.0) endpoint configurations. This allows the ALSA DSD API to correctly address the device for high-bitrate DSD streams, bypassing the need for DoP (DSD over PCM). Test on APA DAC and Lotus DAC5 SE under Arch Linux. Tested-by: Lushih Hsieh Signed-off-by: Lushih Hsieh Link: https://patch.msgid.link/20251114052053.54989-1-bruce@mail.kh.edu.tw Signed-off-by: Takashi Iwai --- sound/usb/quirks.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index e5b857129caf..5e30bff69f82 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -2022,6 +2022,8 @@ u64 snd_usb_interface_dsd_format_quirks(struct snd_usb_audio *chip, case USB_ID(0x16d0, 0x09d8): /* NuPrime IDA-8 */ case USB_ID(0x16d0, 0x09db): /* NuPrime Audio DAC-9 */ case USB_ID(0x16d0, 0x09dd): /* Encore mDSD */ + case USB_ID(0x16d0, 0x0ab1): /* PureAudio APA DAC */ + case USB_ID(0x16d0, 0xeca1): /* PureAudio Lotus DAC5, DAC5 SE, DAC5 Pro */ case USB_ID(0x1db5, 0x0003): /* Bryston BDA3 */ case USB_ID(0x20a0, 0x4143): /* WaveIO USB Audio 2.0 */ case USB_ID(0x22e1, 0xca01): /* HDTA Serenade DSD */ @@ -2299,6 +2301,10 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = { QUIRK_FLAG_IGNORE_CLOCK_SOURCE), DEVICE_FLG(0x1686, 0x00dd, /* Zoom R16/24 */ QUIRK_FLAG_TX_LENGTH | QUIRK_FLAG_CTL_MSG_DELAY_1M), + DEVICE_FLG(0x16d0, 0x0ab1, /* PureAudio APA DAC */ + QUIRK_FLAG_DSD_RAW), + DEVICE_FLG(0x16d0, 0xeca1, /* PureAudio Lotus DAC5, DAC5 SE and DAC5 Pro */ + QUIRK_FLAG_DSD_RAW), DEVICE_FLG(0x17aa, 0x1046, /* Lenovo ThinkStation P620 Rear Line-in, Line-out and Microphone */ QUIRK_FLAG_DISABLE_AUTOSUSPEND), DEVICE_FLG(0x17aa, 0x104d, /* Lenovo ThinkStation P620 Internal Speaker + Front Headset */ -- cgit From 37339122a7801660dce11abd817af82cc4bef163 Mon Sep 17 00:00:00 2001 From: Ville Syrjälä Date: Fri, 14 Nov 2025 23:18:50 +0900 Subject: firewire: core: Initialize topology_map.lock MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Lockdep barfs on the new uninitialized spinlock. Initialize it. protip: enable lockdep (CONFIG_PROVE_LOCKING=y) when doing locking changes firewire_ohci 0000:02:01.1: added OHCI v1.10 device as card 0, 4 IR + 4 IT contexts, quirks 0x11 INFO: trying to register non-static key. The code is fine but needs lockdep annotation, or maybe you didn't initialize this object before use? turning off the locking correctness validator. CPU: 0 UID: 0 PID: 1042 Comm: irq/17-firewire Not tainted 6.17.0-rc2-cl-bisect2-00026-g7d138cb269db #136 PREEMPT Hardware name: Dell Inc. Latitude E5400 /0D695C, BIOS A19 06/13/2013 Call Trace: dump_stack_lvl+0x6d/0xa0 register_lock_class+0x783/0x790 ? find_held_lock+0x2b/0x80 ? __mod_timer+0x110/0x320 ? __mod_timer+0x110/0x320 __lock_acquire+0x405/0x2600 lock_acquire+0xca/0x2e0 ? fw_core_handle_bus_reset+0x888/0xca0 [firewire_core] ? fw_core_handle_bus_reset+0x878/0xca0 [firewire_core] ? fw_core_handle_bus_reset+0x878/0xca0 [firewire_core] _raw_spin_lock+0x2e/0x40 ? fw_core_handle_bus_reset+0x888/0xca0 [firewire_core] fw_core_handle_bus_reset+0x888/0xca0 [firewire_core] handle_selfid_complete_event+0x35c/0x7a0 [firewire_ohci] ? irq_thread+0x8d/0x280 irq_thread_fn+0x18/0x50 irq_thread+0x15a/0x280 ? irq_check_status_bit+0x100/0x100 ? lockdep_hardirqs_on+0x78/0x100 ? irq_finalize_oneshot.part.0+0xc0/0xc0 ? irq_forced_thread_fn+0x60/0x60 kthread+0x114/0x200 ? kthreads_online_cpu+0x110/0x110 ret_from_fork+0x158/0x1e0 ? kthreads_online_cpu+0x110/0x110 ret_from_fork_asm+0x11/0x20 Reported-by: Erhard Furtner Fixes: 7d138cb269db ("firewire: core: use spin lock specific to topology map") Signed-off-by: Ville Syrjälä Signed-off-by: Takashi Sakamoto --- drivers/firewire/core-card.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/firewire/core-card.c b/drivers/firewire/core-card.c index e5e0174a0335..66e1106db5e7 100644 --- a/drivers/firewire/core-card.c +++ b/drivers/firewire/core-card.c @@ -577,6 +577,8 @@ void fw_card_initialize(struct fw_card *card, INIT_LIST_HEAD(&card->transactions.list); spin_lock_init(&card->transactions.lock); + spin_lock_init(&card->topology_map.lock); + card->split_timeout.hi = DEFAULT_SPLIT_TIMEOUT / 8000; card->split_timeout.lo = (DEFAULT_SPLIT_TIMEOUT % 8000) << 19; card->split_timeout.cycles = DEFAULT_SPLIT_TIMEOUT; -- cgit From 31475b88110c4725b4f9a79c3a0d9bbf97e69e1c Mon Sep 17 00:00:00 2001 From: Heiko Carstens Date: Thu, 13 Nov 2025 13:21:47 +0100 Subject: s390/mm: Fix __ptep_rdp() inline assembly When a zero ASCE is passed to the __ptep_rdp() inline assembly, the generated instruction should have the R3 field of the instruction set to zero. However the inline assembly is written incorrectly: for such cases a zero is loaded into a register allocated by the compiler and this register is then used by the instruction. This means that selected TLB entries may not be flushed since the specified ASCE does not match the one which was used when the selected TLB entries were created. Fix this by removing the asce and opt parameters of __ptep_rdp(), since all callers always pass zero, and use a hard-coded register zero for the R3 field. Fixes: 0807b856521f ("s390/mm: add support for RDP (Reset DAT-Protection)") Cc: stable@vger.kernel.org Reviewed-by: Gerald Schaefer Signed-off-by: Heiko Carstens --- arch/s390/include/asm/pgtable.h | 12 +++++------- arch/s390/mm/pgtable.c | 4 ++-- 2 files changed, 7 insertions(+), 9 deletions(-) diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h index b7100c6a4054..6663f1619abb 100644 --- a/arch/s390/include/asm/pgtable.h +++ b/arch/s390/include/asm/pgtable.h @@ -1154,17 +1154,15 @@ static inline pte_t pte_mkhuge(pte_t pte) #define IPTE_NODAT 0x400 #define IPTE_GUEST_ASCE 0x800 -static __always_inline void __ptep_rdp(unsigned long addr, pte_t *ptep, - unsigned long opt, unsigned long asce, - int local) +static __always_inline void __ptep_rdp(unsigned long addr, pte_t *ptep, int local) { unsigned long pto; pto = __pa(ptep) & ~(PTRS_PER_PTE * sizeof(pte_t) - 1); - asm volatile(".insn rrf,0xb98b0000,%[r1],%[r2],%[asce],%[m4]" + asm volatile(".insn rrf,0xb98b0000,%[r1],%[r2],%%r0,%[m4]" : "+m" (*ptep) - : [r1] "a" (pto), [r2] "a" ((addr & PAGE_MASK) | opt), - [asce] "a" (asce), [m4] "i" (local)); + : [r1] "a" (pto), [r2] "a" (addr & PAGE_MASK), + [m4] "i" (local)); } static __always_inline void __ptep_ipte(unsigned long address, pte_t *ptep, @@ -1348,7 +1346,7 @@ static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma, * A local RDP can be used to do the flush. */ if (cpu_has_rdp() && !(pte_val(*ptep) & _PAGE_PROTECT)) - __ptep_rdp(address, ptep, 0, 0, 1); + __ptep_rdp(address, ptep, 1); } #define flush_tlb_fix_spurious_fault flush_tlb_fix_spurious_fault diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c index 0fde20bbc50b..05974304d622 100644 --- a/arch/s390/mm/pgtable.c +++ b/arch/s390/mm/pgtable.c @@ -274,9 +274,9 @@ void ptep_reset_dat_prot(struct mm_struct *mm, unsigned long addr, pte_t *ptep, preempt_disable(); atomic_inc(&mm->context.flush_count); if (cpumask_equal(mm_cpumask(mm), cpumask_of(smp_processor_id()))) - __ptep_rdp(addr, ptep, 0, 0, 1); + __ptep_rdp(addr, ptep, 1); else - __ptep_rdp(addr, ptep, 0, 0, 0); + __ptep_rdp(addr, ptep, 0); /* * PTE is not invalidated by RDP, only _PAGE_PROTECT is cleared. That * means it is still valid and active, and must not be changed according -- cgit From 14473a1f88596fd729e892782efc267c0097dd1d Mon Sep 17 00:00:00 2001 From: Nick Hu Date: Fri, 14 Nov 2025 15:28:44 +0800 Subject: irqchip/riscv-intc: Add missing free() callback in riscv_intc_domain_ops The irq_domain_free_irqs() helper requires that the irq_domain_ops->free callback is implemented. Otherwise, the kernel reports the warning message "NULL pointer, cannot free irq" when irq_dispose_mapping() is invoked to release the per-HART local interrupts. Set irq_domain_ops->free to irq_domain_free_irqs_top() to cure that. Fixes: 832f15f42646 ("RISC-V: Treat IPIs as normal Linux IRQs") Signed-off-by: Nick Hu Signed-off-by: Thomas Gleixner Link: https://patch.msgid.link/20251114-rv-intc-fix-v1-1-a3edd1c1a868@sifive.com --- drivers/irqchip/irq-riscv-intc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c index e5805885394e..70290b35b317 100644 --- a/drivers/irqchip/irq-riscv-intc.c +++ b/drivers/irqchip/irq-riscv-intc.c @@ -166,7 +166,8 @@ static int riscv_intc_domain_alloc(struct irq_domain *domain, static const struct irq_domain_ops riscv_intc_domain_ops = { .map = riscv_intc_domain_map, .xlate = irq_domain_xlate_onecell, - .alloc = riscv_intc_domain_alloc + .alloc = riscv_intc_domain_alloc, + .free = irq_domain_free_irqs_top, }; static struct fwnode_handle *riscv_intc_hwnode(void) -- cgit From e0fd4d42e27f761e9cc82801b3f183e658dc749d Mon Sep 17 00:00:00 2001 From: Eslam Khafagy Date: Fri, 14 Nov 2025 14:27:39 +0200 Subject: posix-timers: Plug potential memory leak in do_timer_create() When posix timer creation is set to allocate a given timer ID and the access to the user space value faults, the function terminates without freeing the already allocated posix timer structure. Move the allocation after the user space access to cure that. [ tglx: Massaged change log ] Fixes: ec2d0c04624b3 ("posix-timers: Provide a mechanism to allocate a given timer ID") Reported-by: syzbot+9c47ad18f978d4394986@syzkaller.appspotmail.com Suggested-by: Cyrill Gorcunov Signed-off-by: Eslam Khafagy Signed-off-by: Thomas Gleixner Reviewed-by: Frederic Weisbecker Link: https://patch.msgid.link/20251114122739.994326-1-eslam.medhat1993@gmail.com Closes: https://lore.kernel.org/all/69155df4.a70a0220.3124cb.0017.GAE@google.com/T/ --- kernel/time/posix-timers.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c index aa3120104a51..56e17b625c72 100644 --- a/kernel/time/posix-timers.c +++ b/kernel/time/posix-timers.c @@ -475,12 +475,6 @@ static int do_timer_create(clockid_t which_clock, struct sigevent *event, if (!kc->timer_create) return -EOPNOTSUPP; - new_timer = alloc_posix_timer(); - if (unlikely(!new_timer)) - return -EAGAIN; - - spin_lock_init(&new_timer->it_lock); - /* Special case for CRIU to restore timers with a given timer ID. */ if (unlikely(current->signal->timer_create_restore_ids)) { if (copy_from_user(&req_id, created_timer_id, sizeof(req_id))) @@ -490,6 +484,12 @@ static int do_timer_create(clockid_t which_clock, struct sigevent *event, return -EINVAL; } + new_timer = alloc_posix_timer(); + if (unlikely(!new_timer)) + return -EAGAIN; + + spin_lock_init(&new_timer->it_lock); + /* * Add the timer to the hash table. The timer is not yet valid * after insertion, but has a unique ID allocated. -- cgit From 4ef92743625818932b9c320152b58274c05e5053 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Wed, 12 Nov 2025 12:55:16 +0000 Subject: bpf: Add bpf_prog_run_data_pointers() syzbot found that cls_bpf_classify() is able to change tc_skb_cb(skb)->drop_reason triggering a warning in sk_skb_reason_drop(). WARNING: CPU: 0 PID: 5965 at net/core/skbuff.c:1192 __sk_skb_reason_drop net/core/skbuff.c:1189 [inline] WARNING: CPU: 0 PID: 5965 at net/core/skbuff.c:1192 sk_skb_reason_drop+0x76/0x170 net/core/skbuff.c:1214 struct tc_skb_cb has been added in commit ec624fe740b4 ("net/sched: Extend qdisc control block with tc control block"), which added a wrong interaction with db58ba459202 ("bpf: wire in data and data_end for cls_act_bpf"). drop_reason was added later. Add bpf_prog_run_data_pointers() helper to save/restore the net_sched storage colliding with BPF data_meta/data_end. Fixes: ec624fe740b4 ("net/sched: Extend qdisc control block with tc control block") Reported-by: syzbot Closes: https://lore.kernel.org/netdev/6913437c.a70a0220.22f260.013b.GAE@google.com/ Signed-off-by: Eric Dumazet Signed-off-by: Martin KaFai Lau Reviewed-by: Victor Nogueira Acked-by: Jamal Hadi Salim Link: https://patch.msgid.link/20251112125516.1563021-1-edumazet@google.com --- include/linux/filter.h | 20 ++++++++++++++++++++ net/sched/act_bpf.c | 6 ++---- net/sched/cls_bpf.c | 6 ++---- 3 files changed, 24 insertions(+), 8 deletions(-) diff --git a/include/linux/filter.h b/include/linux/filter.h index f5c859b8131a..973233b82dc1 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -901,6 +901,26 @@ static inline void bpf_compute_data_pointers(struct sk_buff *skb) cb->data_end = skb->data + skb_headlen(skb); } +static inline int bpf_prog_run_data_pointers( + const struct bpf_prog *prog, + struct sk_buff *skb) +{ + struct bpf_skb_data_end *cb = (struct bpf_skb_data_end *)skb->cb; + void *save_data_meta, *save_data_end; + int res; + + save_data_meta = cb->data_meta; + save_data_end = cb->data_end; + + bpf_compute_data_pointers(skb); + res = bpf_prog_run(prog, skb); + + cb->data_meta = save_data_meta; + cb->data_end = save_data_end; + + return res; +} + /* Similar to bpf_compute_data_pointers(), except that save orginal * data in cb->data and cb->meta_data for restore. */ diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c index 396b576390d0..c2b5bc19e091 100644 --- a/net/sched/act_bpf.c +++ b/net/sched/act_bpf.c @@ -47,12 +47,10 @@ TC_INDIRECT_SCOPE int tcf_bpf_act(struct sk_buff *skb, filter = rcu_dereference(prog->filter); if (at_ingress) { __skb_push(skb, skb->mac_len); - bpf_compute_data_pointers(skb); - filter_res = bpf_prog_run(filter, skb); + filter_res = bpf_prog_run_data_pointers(filter, skb); __skb_pull(skb, skb->mac_len); } else { - bpf_compute_data_pointers(skb); - filter_res = bpf_prog_run(filter, skb); + filter_res = bpf_prog_run_data_pointers(filter, skb); } if (unlikely(!skb->tstamp && skb->tstamp_type)) skb->tstamp_type = SKB_CLOCK_REALTIME; diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c index 7fbe42f0e5c2..a32754a2658b 100644 --- a/net/sched/cls_bpf.c +++ b/net/sched/cls_bpf.c @@ -97,12 +97,10 @@ TC_INDIRECT_SCOPE int cls_bpf_classify(struct sk_buff *skb, } else if (at_ingress) { /* It is safe to push/pull even if skb_shared() */ __skb_push(skb, skb->mac_len); - bpf_compute_data_pointers(skb); - filter_res = bpf_prog_run(prog->filter, skb); + filter_res = bpf_prog_run_data_pointers(prog->filter, skb); __skb_pull(skb, skb->mac_len); } else { - bpf_compute_data_pointers(skb); - filter_res = bpf_prog_run(prog->filter, skb); + filter_res = bpf_prog_run_data_pointers(prog->filter, skb); } if (unlikely(!skb->tstamp && skb->tstamp_type)) skb->tstamp_type = SKB_CLOCK_REALTIME; -- cgit From b0c8e6d3d866b6a7f73877f71968dbffd27b7785 Mon Sep 17 00:00:00 2001 From: Eduard Zingerman Date: Thu, 13 Nov 2025 18:57:29 -0800 Subject: bpf: account for current allocated stack depth in widen_imprecise_scalars() The usage pattern for widen_imprecise_scalars() looks as follows: prev_st = find_prev_entry(env, ...); queued_st = push_stack(...); widen_imprecise_scalars(env, prev_st, queued_st); Where prev_st is an ancestor of the queued_st in the explored states tree. This ancestor is not guaranteed to have same allocated stack depth as queued_st. E.g. in the following case: def main(): for i in 1..2: foo(i) // same callsite, differnt param def foo(i): if i == 1: use 128 bytes of stack iterator based loop Here, for a second 'foo' call prev_st->allocated_stack is 128, while queued_st->allocated_stack is much smaller. widen_imprecise_scalars() needs to take this into account and avoid accessing bpf_verifier_state->frame[*]->stack out of bounds. Fixes: 2793a8b015f7 ("bpf: exact states comparison for iterator convergence checks") Reported-by: Emil Tsalapatis Signed-off-by: Eduard Zingerman Link: https://lore.kernel.org/r/20251114025730.772723-1-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov --- kernel/bpf/verifier.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 8314518c8d93..fbe4bb91c564 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -8866,7 +8866,7 @@ static int widen_imprecise_scalars(struct bpf_verifier_env *env, struct bpf_verifier_state *cur) { struct bpf_func_state *fold, *fcur; - int i, fr; + int i, fr, num_slots; reset_idmap_scratch(env); for (fr = old->curframe; fr >= 0; fr--) { @@ -8879,7 +8879,9 @@ static int widen_imprecise_scalars(struct bpf_verifier_env *env, &fcur->regs[i], &env->idmap_scratch); - for (i = 0; i < fold->allocated_stack / BPF_REG_SIZE; i++) { + num_slots = min(fold->allocated_stack / BPF_REG_SIZE, + fcur->allocated_stack / BPF_REG_SIZE); + for (i = 0; i < num_slots; i++) { if (!is_spilled_reg(&fold->stack[i]) || !is_spilled_reg(&fcur->stack[i])) continue; -- cgit From 6c762611fed7365790000925f3d14f20037d0061 Mon Sep 17 00:00:00 2001 From: Eduard Zingerman Date: Thu, 13 Nov 2025 18:57:30 -0800 Subject: selftests/bpf: Test widen_imprecise_scalars() with different stack depth A test case for a situation when widen_imprecise_scalars() is called with old->allocated_stack > cur->allocated_stack. Test structure: def widening_stack_size_bug(): r1 = 0 for r6 in 0..1: iterator_with_diff_stack_depth(r1) r1 = 42 def iterator_with_diff_stack_depth(r1): if r1 != 42: use 128 bytes of stack iterator based loop iterator_with_diff_stack_depth() is verified with r1 == 0 first and r1 == 42 next. Causing stack usage of 128 bytes on a first visit and 8 bytes on a second. Such arrangement triggered a KASAN error in widen_imprecise_scalars(). Signed-off-by: Eduard Zingerman Link: https://lore.kernel.org/r/20251114025730.772723-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov --- tools/testing/selftests/bpf/progs/iters_looping.c | 53 +++++++++++++++++++++++ 1 file changed, 53 insertions(+) diff --git a/tools/testing/selftests/bpf/progs/iters_looping.c b/tools/testing/selftests/bpf/progs/iters_looping.c index 05fa5ce7fc59..d00fd570255a 100644 --- a/tools/testing/selftests/bpf/progs/iters_looping.c +++ b/tools/testing/selftests/bpf/progs/iters_looping.c @@ -161,3 +161,56 @@ int simplest_loop(void *ctx) return 0; } + +__used +static void iterator_with_diff_stack_depth(int x) +{ + struct bpf_iter_num iter; + + asm volatile ( + "if r1 == 42 goto 0f;" + "*(u64 *)(r10 - 128) = 0;" + "0:" + /* create iterator */ + "r1 = %[iter];" + "r2 = 0;" + "r3 = 10;" + "call %[bpf_iter_num_new];" + "1:" + /* consume next item */ + "r1 = %[iter];" + "call %[bpf_iter_num_next];" + "if r0 == 0 goto 2f;" + "goto 1b;" + "2:" + /* destroy iterator */ + "r1 = %[iter];" + "call %[bpf_iter_num_destroy];" + : + : __imm_ptr(iter), ITER_HELPERS + : __clobber_common, "r6" + ); +} + +SEC("socket") +__success +__naked int widening_stack_size_bug(void *ctx) +{ + /* + * Depending on iterator_with_diff_stack_depth() parameter value, + * subprogram stack depth is either 8 or 128 bytes. Arrange values so + * that it is 128 on a first call and 8 on a second. This triggered a + * bug in verifier's widen_imprecise_scalars() logic. + */ + asm volatile ( + "r6 = 0;" + "r1 = 0;" + "1:" + "call iterator_with_diff_stack_depth;" + "r1 = 42;" + "r6 += 1;" + "if r6 < 2 goto 1b;" + "r0 = 0;" + "exit;" + ::: __clobber_all); +} -- cgit From ec0ca4be116ad7efb08cd23acc1ff29b04d9cf52 Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Fri, 31 Oct 2025 14:50:39 +0100 Subject: MAINTAINERS: Update Krzysztof Kozlowski's email Update Krzysztof Kozlowski's email address in mailmap to stay reachable. Link: https://patch.msgid.link/20251021095426.86549-2-krzysztof.kozlowski@linaro.org Signed-off-by: Krzysztof Kozlowski Link: https://lore.kernel.org/r/20251031135041.78789-2-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann --- .mailmap | 1 + MAINTAINERS | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/.mailmap b/.mailmap index 369cfe467932..ccec2aeb28f9 100644 --- a/.mailmap +++ b/.mailmap @@ -437,6 +437,7 @@ Krishna Manikandan Krzysztof Kozlowski Krzysztof Kozlowski Krzysztof Kozlowski +Krzysztof Kozlowski Krzysztof Wilczyński Krzysztof Wilczyński Kshitiz Godara diff --git a/MAINTAINERS b/MAINTAINERS index ddecf1ef3bed..29f86c9aa27b 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -16204,7 +16204,7 @@ MEMORY CONTROLLER DRIVERS M: Krzysztof Kozlowski L: linux-kernel@vger.kernel.org S: Maintained -B: mailto:krzysztof.kozlowski@linaro.org +B: mailto:krzk@kernel.org T: git git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl.git F: Documentation/devicetree/bindings/memory-controllers/ F: drivers/memory/ @@ -21177,7 +21177,7 @@ F: Documentation/devicetree/bindings/i2c/qcom,i2c-cci.yaml F: drivers/i2c/busses/i2c-qcom-cci.c QUALCOMM INTERCONNECT BWMON DRIVER -M: Krzysztof Kozlowski +M: Krzysztof Kozlowski L: linux-arm-msm@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/interconnect/qcom,msm8998-bwmon.yaml -- cgit From e6751b0b19a6baab219a62e1e302b8aa6b5a55b2 Mon Sep 17 00:00:00 2001 From: Pavel Zhigulin Date: Thu, 13 Nov 2025 16:57:44 +0300 Subject: net: dsa: hellcreek: fix missing error handling in LED registration The LED setup routine registered both led_sync_good and led_is_gm devices without checking the return values of led_classdev_register(). If either registration failed, the function continued silently, leaving the driver in a partially-initialized state and leaking a registered LED classdev. Add proper error handling Fixes: 7d9ee2e8ff15 ("net: dsa: hellcreek: Add PTP status LEDs") Signed-off-by: Pavel Zhigulin Reviewed-by: Andrew Lunn Acked-by: Kurt Kanzenbach Link: https://patch.msgid.link/20251113135745.92375-1-Pavel.Zhigulin@kaspersky.com Signed-off-by: Jakub Kicinski --- drivers/net/dsa/hirschmann/hellcreek_ptp.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/net/dsa/hirschmann/hellcreek_ptp.c b/drivers/net/dsa/hirschmann/hellcreek_ptp.c index bfe21f9f7dcd..cb23bea9c21b 100644 --- a/drivers/net/dsa/hirschmann/hellcreek_ptp.c +++ b/drivers/net/dsa/hirschmann/hellcreek_ptp.c @@ -376,8 +376,18 @@ static int hellcreek_led_setup(struct hellcreek *hellcreek) hellcreek_set_brightness(hellcreek, STATUS_OUT_IS_GM, 1); /* Register both leds */ - led_classdev_register(hellcreek->dev, &hellcreek->led_sync_good); - led_classdev_register(hellcreek->dev, &hellcreek->led_is_gm); + ret = led_classdev_register(hellcreek->dev, &hellcreek->led_sync_good); + if (ret) { + dev_err(hellcreek->dev, "Failed to register sync_good LED\n"); + goto out; + } + + ret = led_classdev_register(hellcreek->dev, &hellcreek->led_is_gm); + if (ret) { + dev_err(hellcreek->dev, "Failed to register is_gm LED\n"); + led_classdev_unregister(&hellcreek->led_sync_good); + goto out; + } ret = 0; -- cgit From b0c959fec18f4595a6a6317ffc30615cfa37bf69 Mon Sep 17 00:00:00 2001 From: Pavel Zhigulin Date: Thu, 13 Nov 2025 19:19:21 +0300 Subject: net: mlxsw: linecards: fix missing error check in mlxsw_linecard_devlink_info_get() The call to devlink_info_version_fixed_put() in mlxsw_linecard_devlink_info_get() did not check for errors, although it is checked everywhere in the code. Add missed 'err' check to the mlxsw_linecard_devlink_info_get() Fixes: 3fc0c51905fb ("mlxsw: core_linecards: Expose device PSID over device info") Signed-off-by: Pavel Zhigulin Reviewed-by: Ido Schimmel Link: https://patch.msgid.link/20251113161922.813828-1-Pavel.Zhigulin@kaspersky.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlxsw/core_linecards.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_linecards.c b/drivers/net/ethernet/mellanox/mlxsw/core_linecards.c index b032d5a4b3b8..10f5bc4892fc 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/core_linecards.c +++ b/drivers/net/ethernet/mellanox/mlxsw/core_linecards.c @@ -601,6 +601,8 @@ int mlxsw_linecard_devlink_info_get(struct mlxsw_linecard *linecard, err = devlink_info_version_fixed_put(req, DEVLINK_INFO_VERSION_GENERIC_FW_PSID, info->psid); + if (err) + goto unlock; sprintf(buf, "%u.%u.%u", info->fw_major, info->fw_minor, info->fw_sub_minor); -- cgit From 035bca3f017ee9dea3a5a756e77a6f7138cc6eea Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 13 Nov 2025 10:39:24 +0000 Subject: mptcp: fix race condition in mptcp_schedule_work() syzbot reported use-after-free in mptcp_schedule_work() [1] Issue here is that mptcp_schedule_work() schedules a work, then gets a refcount on sk->sk_refcnt if the work was scheduled. This refcount will be released by mptcp_worker(). [A] if (schedule_work(...)) { [B] sock_hold(sk); return true; } Problem is that mptcp_worker() can run immediately and complete before [B] We need instead : sock_hold(sk); if (schedule_work(...)) return true; sock_put(sk); [1] refcount_t: addition on 0; use-after-free. WARNING: CPU: 1 PID: 29 at lib/refcount.c:25 refcount_warn_saturate+0xfa/0x1d0 lib/refcount.c:25 Call Trace: __refcount_add include/linux/refcount.h:-1 [inline] __refcount_inc include/linux/refcount.h:366 [inline] refcount_inc include/linux/refcount.h:383 [inline] sock_hold include/net/sock.h:816 [inline] mptcp_schedule_work+0x164/0x1a0 net/mptcp/protocol.c:943 mptcp_tout_timer+0x21/0xa0 net/mptcp/protocol.c:2316 call_timer_fn+0x17e/0x5f0 kernel/time/timer.c:1747 expire_timers kernel/time/timer.c:1798 [inline] __run_timers kernel/time/timer.c:2372 [inline] __run_timer_base+0x648/0x970 kernel/time/timer.c:2384 run_timer_base kernel/time/timer.c:2393 [inline] run_timer_softirq+0xb7/0x180 kernel/time/timer.c:2403 handle_softirqs+0x22f/0x710 kernel/softirq.c:622 __do_softirq kernel/softirq.c:656 [inline] run_ktimerd+0xcf/0x190 kernel/softirq.c:1138 smpboot_thread_fn+0x542/0xa60 kernel/smpboot.c:160 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 Cc: stable@vger.kernel.org Fixes: 3b1d6210a957 ("mptcp: implement and use MPTCP-level retransmission") Reported-by: syzbot+355158e7e301548a1424@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6915b46f.050a0220.3565dc.0028.GAE@google.com/T/#u Signed-off-by: Eric Dumazet Reviewed-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251113103924.3737425-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- net/mptcp/protocol.c | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 2d6b8de35c44..e27e0fe2460f 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -935,14 +935,19 @@ static void mptcp_reset_rtx_timer(struct sock *sk) bool mptcp_schedule_work(struct sock *sk) { - if (inet_sk_state_load(sk) != TCP_CLOSE && - schedule_work(&mptcp_sk(sk)->work)) { - /* each subflow already holds a reference to the sk, and the - * workqueue is invoked by a subflow, so sk can't go away here. - */ - sock_hold(sk); + if (inet_sk_state_load(sk) == TCP_CLOSE) + return false; + + /* Get a reference on this socket, mptcp_worker() will release it. + * As mptcp_worker() might complete before us, we can not avoid + * a sock_hold()/sock_put() if schedule_work() returns false. + */ + sock_hold(sk); + + if (schedule_work(&mptcp_sk(sk)->work)) return true; - } + + sock_put(sk); return false; } -- cgit From dfe28c4167a9259fc0c372d9f9473e1ac95cff67 Mon Sep 17 00:00:00 2001 From: Ilya Maximets Date: Wed, 12 Nov 2025 12:14:03 +0100 Subject: net: openvswitch: remove never-working support for setting nsh fields The validation of the set(nsh(...)) action is completely wrong. It runs through the nsh_key_put_from_nlattr() function that is the same function that validates NSH keys for the flow match and the push_nsh() action. However, the set(nsh(...)) has a very different memory layout. Nested attributes in there are doubled in size in case of the masked set(). That makes proper validation impossible. There is also confusion in the code between the 'masked' flag, that says that the nested attributes are doubled in size containing both the value and the mask, and the 'is_mask' that says that the value we're parsing is the mask. This is causing kernel crash on trying to write into mask part of the match with SW_FLOW_KEY_PUT() during validation, while validate_nsh() doesn't allocate any memory for it: BUG: kernel NULL pointer dereference, address: 0000000000000018 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 1c2383067 P4D 1c2383067 PUD 20b703067 PMD 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 8 UID: 0 Kdump: loaded Not tainted 6.17.0-rc4+ #107 PREEMPT(voluntary) RIP: 0010:nsh_key_put_from_nlattr+0x19d/0x610 [openvswitch] Call Trace: validate_nsh+0x60/0x90 [openvswitch] validate_set.constprop.0+0x270/0x3c0 [openvswitch] __ovs_nla_copy_actions+0x477/0x860 [openvswitch] ovs_nla_copy_actions+0x8d/0x100 [openvswitch] ovs_packet_cmd_execute+0x1cc/0x310 [openvswitch] genl_family_rcv_msg_doit+0xdb/0x130 genl_family_rcv_msg+0x14b/0x220 genl_rcv_msg+0x47/0xa0 netlink_rcv_skb+0x53/0x100 genl_rcv+0x24/0x40 netlink_unicast+0x280/0x3b0 netlink_sendmsg+0x1f7/0x430 ____sys_sendmsg+0x36b/0x3a0 ___sys_sendmsg+0x87/0xd0 __sys_sendmsg+0x6d/0xd0 do_syscall_64+0x7b/0x2c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e The third issue with this process is that while trying to convert the non-masked set into masked one, validate_set() copies and doubles the size of the OVS_KEY_ATTR_NSH as if it didn't have any nested attributes. It should be copying each nested attribute and doubling them in size independently. And the process must be properly reversed during the conversion back from masked to a non-masked variant during the flow dump. In the end, the only two outcomes of trying to use this action are either validation failure or a kernel crash. And if somehow someone manages to install a flow with such an action, it will most definitely not do what it is supposed to, since all the keys and the masks are mixed up. Fixing all the issues is a complex task as it requires re-writing most of the validation code. Given that and the fact that this functionality never worked since introduction, let's just remove it altogether. It's better to re-introduce it later with a proper implementation instead of trying to fix it in stable releases. Fixes: b2d0f5d5dc53 ("openvswitch: enable NSH support") Reported-by: Junvy Yang Signed-off-by: Ilya Maximets Acked-by: Eelco Chaudron Reviewed-by: Aaron Conole Link: https://patch.msgid.link/20251112112246.95064-1-i.maximets@ovn.org Signed-off-by: Jakub Kicinski --- net/openvswitch/actions.c | 68 +----------------------------------------- net/openvswitch/flow_netlink.c | 64 +++++---------------------------------- net/openvswitch/flow_netlink.h | 2 -- 3 files changed, 9 insertions(+), 125 deletions(-) diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c index 2832e0794197..792ca44a461d 100644 --- a/net/openvswitch/actions.c +++ b/net/openvswitch/actions.c @@ -572,69 +572,6 @@ static int set_ipv6(struct sk_buff *skb, struct sw_flow_key *flow_key, return 0; } -static int set_nsh(struct sk_buff *skb, struct sw_flow_key *flow_key, - const struct nlattr *a) -{ - struct nshhdr *nh; - size_t length; - int err; - u8 flags; - u8 ttl; - int i; - - struct ovs_key_nsh key; - struct ovs_key_nsh mask; - - err = nsh_key_from_nlattr(a, &key, &mask); - if (err) - return err; - - /* Make sure the NSH base header is there */ - if (!pskb_may_pull(skb, skb_network_offset(skb) + NSH_BASE_HDR_LEN)) - return -ENOMEM; - - nh = nsh_hdr(skb); - length = nsh_hdr_len(nh); - - /* Make sure the whole NSH header is there */ - err = skb_ensure_writable(skb, skb_network_offset(skb) + - length); - if (unlikely(err)) - return err; - - nh = nsh_hdr(skb); - skb_postpull_rcsum(skb, nh, length); - flags = nsh_get_flags(nh); - flags = OVS_MASKED(flags, key.base.flags, mask.base.flags); - flow_key->nsh.base.flags = flags; - ttl = nsh_get_ttl(nh); - ttl = OVS_MASKED(ttl, key.base.ttl, mask.base.ttl); - flow_key->nsh.base.ttl = ttl; - nsh_set_flags_and_ttl(nh, flags, ttl); - nh->path_hdr = OVS_MASKED(nh->path_hdr, key.base.path_hdr, - mask.base.path_hdr); - flow_key->nsh.base.path_hdr = nh->path_hdr; - switch (nh->mdtype) { - case NSH_M_TYPE1: - for (i = 0; i < NSH_MD1_CONTEXT_SIZE; i++) { - nh->md1.context[i] = - OVS_MASKED(nh->md1.context[i], key.context[i], - mask.context[i]); - } - memcpy(flow_key->nsh.context, nh->md1.context, - sizeof(nh->md1.context)); - break; - case NSH_M_TYPE2: - memset(flow_key->nsh.context, 0, - sizeof(flow_key->nsh.context)); - break; - default: - return -EINVAL; - } - skb_postpush_rcsum(skb, nh, length); - return 0; -} - /* Must follow skb_ensure_writable() since that can move the skb data. */ static void set_tp_port(struct sk_buff *skb, __be16 *port, __be16 new_port, __sum16 *check) @@ -1130,10 +1067,6 @@ static int execute_masked_set_action(struct sk_buff *skb, get_mask(a, struct ovs_key_ethernet *)); break; - case OVS_KEY_ATTR_NSH: - err = set_nsh(skb, flow_key, a); - break; - case OVS_KEY_ATTR_IPV4: err = set_ipv4(skb, flow_key, nla_data(a), get_mask(a, struct ovs_key_ipv4 *)); @@ -1170,6 +1103,7 @@ static int execute_masked_set_action(struct sk_buff *skb, case OVS_KEY_ATTR_CT_LABELS: case OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4: case OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6: + case OVS_KEY_ATTR_NSH: err = -EINVAL; break; } diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c index ad64bb9ab5e2..1cb4f97335d8 100644 --- a/net/openvswitch/flow_netlink.c +++ b/net/openvswitch/flow_netlink.c @@ -1305,6 +1305,11 @@ static int metadata_from_nlattrs(struct net *net, struct sw_flow_match *match, return 0; } +/* + * Constructs NSH header 'nh' from attributes of OVS_ACTION_ATTR_PUSH_NSH, + * where 'nh' points to a memory block of 'size' bytes. It's assumed that + * attributes were previously validated with validate_push_nsh(). + */ int nsh_hdr_from_nlattr(const struct nlattr *attr, struct nshhdr *nh, size_t size) { @@ -1314,8 +1319,6 @@ int nsh_hdr_from_nlattr(const struct nlattr *attr, u8 ttl = 0; int mdlen = 0; - /* validate_nsh has check this, so we needn't do duplicate check here - */ if (size < NSH_BASE_HDR_LEN) return -ENOBUFS; @@ -1359,46 +1362,6 @@ int nsh_hdr_from_nlattr(const struct nlattr *attr, return 0; } -int nsh_key_from_nlattr(const struct nlattr *attr, - struct ovs_key_nsh *nsh, struct ovs_key_nsh *nsh_mask) -{ - struct nlattr *a; - int rem; - - /* validate_nsh has check this, so we needn't do duplicate check here - */ - nla_for_each_nested(a, attr, rem) { - int type = nla_type(a); - - switch (type) { - case OVS_NSH_KEY_ATTR_BASE: { - const struct ovs_nsh_key_base *base = nla_data(a); - const struct ovs_nsh_key_base *base_mask = base + 1; - - nsh->base = *base; - nsh_mask->base = *base_mask; - break; - } - case OVS_NSH_KEY_ATTR_MD1: { - const struct ovs_nsh_key_md1 *md1 = nla_data(a); - const struct ovs_nsh_key_md1 *md1_mask = md1 + 1; - - memcpy(nsh->context, md1->context, sizeof(*md1)); - memcpy(nsh_mask->context, md1_mask->context, - sizeof(*md1_mask)); - break; - } - case OVS_NSH_KEY_ATTR_MD2: - /* Not supported yet */ - return -ENOTSUPP; - default: - return -EINVAL; - } - } - - return 0; -} - static int nsh_key_put_from_nlattr(const struct nlattr *attr, struct sw_flow_match *match, bool is_mask, bool is_push_nsh, bool log) @@ -2839,17 +2802,13 @@ static int validate_and_copy_set_tun(const struct nlattr *attr, return err; } -static bool validate_nsh(const struct nlattr *attr, bool is_mask, - bool is_push_nsh, bool log) +static bool validate_push_nsh(const struct nlattr *attr, bool log) { struct sw_flow_match match; struct sw_flow_key key; - int ret = 0; ovs_match_init(&match, &key, true, NULL); - ret = nsh_key_put_from_nlattr(attr, &match, is_mask, - is_push_nsh, log); - return !ret; + return !nsh_key_put_from_nlattr(attr, &match, false, true, log); } /* Return false if there are any non-masked bits set. @@ -2997,13 +2956,6 @@ static int validate_set(const struct nlattr *a, break; - case OVS_KEY_ATTR_NSH: - if (eth_type != htons(ETH_P_NSH)) - return -EINVAL; - if (!validate_nsh(nla_data(a), masked, false, log)) - return -EINVAL; - break; - default: return -EINVAL; } @@ -3437,7 +3389,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr, return -EINVAL; } mac_proto = MAC_PROTO_NONE; - if (!validate_nsh(nla_data(a), false, true, true)) + if (!validate_push_nsh(nla_data(a), log)) return -EINVAL; break; diff --git a/net/openvswitch/flow_netlink.h b/net/openvswitch/flow_netlink.h index fe7f77fc5f18..ff8cdecbe346 100644 --- a/net/openvswitch/flow_netlink.h +++ b/net/openvswitch/flow_netlink.h @@ -65,8 +65,6 @@ int ovs_nla_put_actions(const struct nlattr *attr, void ovs_nla_free_flow_actions(struct sw_flow_actions *); void ovs_nla_free_flow_actions_rcu(struct sw_flow_actions *); -int nsh_key_from_nlattr(const struct nlattr *attr, struct ovs_key_nsh *nsh, - struct ovs_key_nsh *nsh_mask); int nsh_hdr_from_nlattr(const struct nlattr *attr, struct nshhdr *nh, size_t size); -- cgit From 5442a9da69789741bfda39f34ee7f69552bf0c56 Mon Sep 17 00:00:00 2001 From: Jesper Dangaard Brouer Date: Wed, 12 Nov 2025 14:13:52 +0100 Subject: veth: more robust handing of race to avoid txq getting stuck Commit dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops") introduced a race condition that can lead to a permanently stalled TXQ. This was observed in production on ARM64 systems (Ampere Altra Max). The race occurs in veth_xmit(). The producer observes a full ptr_ring and stops the queue (netif_tx_stop_queue()). The subsequent conditional logic, intended to re-wake the queue if the consumer had just emptied it (if (__ptr_ring_empty(...)) netif_tx_wake_queue()), can fail. This leads to a "lost wakeup" where the TXQ remains stopped (QUEUE_STATE_DRV_XOFF) and traffic halts. This failure is caused by an incorrect use of the __ptr_ring_empty() API from the producer side. As noted in kernel comments, this check is not guaranteed to be correct if a consumer is operating on another CPU. The empty test is based on ptr_ring->consumer_head, making it reliable only for the consumer. Using this check from the producer side is fundamentally racy. This patch fixes the race by adopting the more robust logic from an earlier version V4 of the patchset, which always flushed the peer: (1) In veth_xmit(), the racy conditional wake-up logic and its memory barrier are removed. Instead, after stopping the queue, we unconditionally call __veth_xdp_flush(rq). This guarantees that the NAPI consumer is scheduled, making it solely responsible for re-waking the TXQ. This handles the race where veth_poll() consumes all packets and completes NAPI *before* veth_xmit() on the producer side has called netif_tx_stop_queue. The __veth_xdp_flush(rq) will observe rx_notify_masked is false and schedule NAPI. (2) On the consumer side, the logic for waking the peer TXQ is moved out of veth_xdp_rcv() and placed at the end of the veth_poll() function. This placement is part of fixing the race, as the netif_tx_queue_stopped() check must occur after rx_notify_masked is potentially set to false during NAPI completion. This handles the race where veth_poll() consumes all packets, but haven't finished (rx_notify_masked is still true). The producer veth_xmit() stops the TXQ and __veth_xdp_flush(rq) will observe rx_notify_masked is true, meaning not starting NAPI. Then veth_poll() change rx_notify_masked to false and stops NAPI. Before exiting veth_poll() will observe TXQ is stopped and wake it up. Fixes: dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops") Reviewed-by: Toshiaki Makita Signed-off-by: Jesper Dangaard Brouer Link: https://patch.msgid.link/176295323282.307447.14790015927673763094.stgit@firesoul Signed-off-by: Jakub Kicinski --- drivers/net/veth.c | 38 ++++++++++++++++++++------------------ 1 file changed, 20 insertions(+), 18 deletions(-) diff --git a/drivers/net/veth.c b/drivers/net/veth.c index a3046142cb8e..35dd89aff4a9 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -392,14 +392,12 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev) } /* Restore Eth hdr pulled by dev_forward_skb/eth_type_trans */ __skb_push(skb, ETH_HLEN); - /* Depend on prior success packets started NAPI consumer via - * __veth_xdp_flush(). Cancel TXQ stop if consumer stopped, - * paired with empty check in veth_poll(). - */ netif_tx_stop_queue(txq); - smp_mb__after_atomic(); - if (unlikely(__ptr_ring_empty(&rq->xdp_ring))) - netif_tx_wake_queue(txq); + /* Makes sure NAPI peer consumer runs. Consumer is responsible + * for starting txq again, until then ndo_start_xmit (this + * function) will not be invoked by the netstack again. + */ + __veth_xdp_flush(rq); break; case NET_RX_DROP: /* same as NET_XMIT_DROP */ drop: @@ -900,17 +898,9 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget, struct veth_xdp_tx_bq *bq, struct veth_stats *stats) { - struct veth_priv *priv = netdev_priv(rq->dev); - int queue_idx = rq->xdp_rxq.queue_index; - struct netdev_queue *peer_txq; - struct net_device *peer_dev; int i, done = 0, n_xdpf = 0; void *xdpf[VETH_XDP_BATCH]; - /* NAPI functions as RCU section */ - peer_dev = rcu_dereference_check(priv->peer, rcu_read_lock_bh_held()); - peer_txq = peer_dev ? netdev_get_tx_queue(peer_dev, queue_idx) : NULL; - for (i = 0; i < budget; i++) { void *ptr = __ptr_ring_consume(&rq->xdp_ring); @@ -959,9 +949,6 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget, rq->stats.vs.xdp_packets += done; u64_stats_update_end(&rq->stats.syncp); - if (peer_txq && unlikely(netif_tx_queue_stopped(peer_txq))) - netif_tx_wake_queue(peer_txq); - return done; } @@ -969,12 +956,20 @@ static int veth_poll(struct napi_struct *napi, int budget) { struct veth_rq *rq = container_of(napi, struct veth_rq, xdp_napi); + struct veth_priv *priv = netdev_priv(rq->dev); + int queue_idx = rq->xdp_rxq.queue_index; + struct netdev_queue *peer_txq; struct veth_stats stats = {}; + struct net_device *peer_dev; struct veth_xdp_tx_bq bq; int done; bq.count = 0; + /* NAPI functions as RCU section */ + peer_dev = rcu_dereference_check(priv->peer, rcu_read_lock_bh_held()); + peer_txq = peer_dev ? netdev_get_tx_queue(peer_dev, queue_idx) : NULL; + xdp_set_return_frame_no_direct(); done = veth_xdp_rcv(rq, budget, &bq, &stats); @@ -996,6 +991,13 @@ static int veth_poll(struct napi_struct *napi, int budget) veth_xdp_flush(rq, &bq); xdp_clear_return_frame_no_direct(); + /* Release backpressure per NAPI poll */ + smp_rmb(); /* Paired with netif_tx_stop_queue set_bit */ + if (peer_txq && netif_tx_queue_stopped(peer_txq)) { + txq_trans_cond_update(peer_txq); + netif_tx_wake_queue(peer_txq); + } + return done; } -- cgit From 39231e8d6ba7f794b566fd91ebd88c0834a23b98 Mon Sep 17 00:00:00 2001 From: "David Hildenbrand (Red Hat)" Date: Fri, 14 Nov 2025 22:49:20 +0100 Subject: mm: fix MAX_FOLIO_ORDER on powerpc configs with hugetlb In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support runtime allocation of gigantic hugetlb folios. In the meantime it evolved into a generic way for the architecture to state that it supports gigantic hugetlb folios. In commit fae7d834c43c ("mm: add __dump_folio()") we started using CONFIG_ARCH_HAS_GIGANTIC_PAGE to decide MAX_FOLIO_ORDER: whether we could have folios larger than what the buddy can handle. In the context of that commit, we started using MAX_FOLIO_ORDER to detect page corruptions when dumping tail pages of folios. Before that commit, we assumed that we cannot have folios larger than the highest buddy order, which was obviously wrong. In commit 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes when registering hstate"), we used MAX_FOLIO_ORDER to detect inconsistencies, and in fact, we found some now. Powerpc allows for configs that can allocate gigantic folio during boot (not at runtime), that do not set CONFIG_ARCH_HAS_GIGANTIC_PAGE and can exceed PUD_ORDER. To fix it, let's make powerpc select CONFIG_ARCH_HAS_GIGANTIC_PAGE with hugetlb on powerpc, and increase the maximum folio size with hugetlb to 16 GiB on 64bit (possible on arm64 and powerpc) and 1 GiB on 32 bit (powerpc). Note that on some powerpc configurations, whether we actually have gigantic pages depends on the setting of CONFIG_ARCH_FORCE_MAX_ORDER, but there is nothing really problematic about setting it unconditionally: we just try to keep the value small so we can better detect problems in __dump_folio() and inconsistencies around the expected largest folio in the system. Ideally, we'd have a better way to obtain the maximum hugetlb folio size and detect ourselves whether we really end up with gigantic folios. Let's defer bigger changes and fix the warnings first. While at it, handle gigantic DAX folios more clearly: DAX can only end up creating gigantic folios with HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD. Add a new Kconfig option HAVE_GIGANTIC_FOLIOS to make both cases clearer. In particular, worry about ARCH_HAS_GIGANTIC_PAGE only with HUGETLB_PAGE. Note: with enabling CONFIG_ARCH_HAS_GIGANTIC_PAGE on powerpc, we will now also allow for runtime allocations of folios in some more powerpc configs. I don't think this is a problem, but if it is we could handle it through __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED. While __dump_page()/__dump_folio was also problematic (not handling dumping of tail pages of such gigantic folios correctly), it doesn't seem critical enough to mark it as a fix. Link: https://lkml.kernel.org/r/20251114214920.2550676-1-david@kernel.org Fixes: 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes when registering hstate") Reported-by: Christophe Leroy Closes: https://lore.kernel.org/r/3e043453-3f27-48ad-b987-cc39f523060a@csgroup.eu/ Reported-by: Sourabh Jain Closes: https://lore.kernel.org/r/94377f5c-d4f0-4c0f-b0f6-5bf1cd7305b1@linux.ibm.com/ Signed-off-by: David Hildenbrand (Red Hat) Cc: Ritesh Harjani (IBM) Cc: Madhavan Srinivasan Cc: Donet Tom Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Christophe Leroy Cc: Lorenzo Stoakes Cc: "Liam R. Howlett" Cc: Vlastimil Babka Cc: Mike Rapoport Cc: Suren Baghdasaryan Cc: Michal Hocko Cc: Nathan Chancellor Signed-off-by: Andrew Morton --- arch/powerpc/Kconfig | 1 + arch/powerpc/platforms/Kconfig.cputype | 1 - include/linux/mm.h | 13 ++++++++++--- mm/Kconfig | 7 +++++++ 4 files changed, 18 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index e24f4d88885a..9537a61ebae0 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -137,6 +137,7 @@ config PPC select ARCH_HAS_DMA_OPS if PPC64 select ARCH_HAS_FORTIFY_SOURCE select ARCH_HAS_GCOV_PROFILE_ALL + select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS select ARCH_HAS_KCOV select ARCH_HAS_KERNEL_FPU_SUPPORT if PPC64 && PPC_FPU select ARCH_HAS_MEMBARRIER_CALLBACKS diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype index 7b527d18aa5e..4c321a8ea896 100644 --- a/arch/powerpc/platforms/Kconfig.cputype +++ b/arch/powerpc/platforms/Kconfig.cputype @@ -423,7 +423,6 @@ config PPC_64S_HASH_MMU config PPC_RADIX_MMU bool "Radix MMU Support" depends on PPC_BOOK3S_64 - select ARCH_HAS_GIGANTIC_PAGE default y help Enable support for the Power ISA 3.0 Radix style MMU. Currently this diff --git a/include/linux/mm.h b/include/linux/mm.h index d16b33bacc32..7c79b3369b82 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2074,7 +2074,7 @@ static inline unsigned long folio_nr_pages(const struct folio *folio) return folio_large_nr_pages(folio); } -#if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE) +#if !defined(CONFIG_HAVE_GIGANTIC_FOLIOS) /* * We don't expect any folios that exceed buddy sizes (and consequently * memory sections). @@ -2087,10 +2087,17 @@ static inline unsigned long folio_nr_pages(const struct folio *folio) * pages are guaranteed to be contiguous. */ #define MAX_FOLIO_ORDER PFN_SECTION_SHIFT -#else +#elif defined(CONFIG_HUGETLB_PAGE) /* * There is no real limit on the folio size. We limit them to the maximum we - * currently expect (e.g., hugetlb, dax). + * currently expect (see CONFIG_HAVE_GIGANTIC_FOLIOS): with hugetlb, we expect + * no folios larger than 16 GiB on 64bit and 1 GiB on 32bit. + */ +#define MAX_FOLIO_ORDER get_order(IS_ENABLED(CONFIG_64BIT) ? SZ_16G : SZ_1G) +#else +/* + * Without hugetlb, gigantic folios that are bigger than a single PUD are + * currently impossible. */ #define MAX_FOLIO_ORDER PUD_ORDER #endif diff --git a/mm/Kconfig b/mm/Kconfig index 0e26f4fc8717..ca3f146bc705 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -908,6 +908,13 @@ config PAGE_MAPCOUNT config PGTABLE_HAS_HUGE_LEAVES def_bool TRANSPARENT_HUGEPAGE || HUGETLB_PAGE +# +# We can end up creating gigantic folio. +# +config HAVE_GIGANTIC_FOLIOS + def_bool (HUGETLB_PAGE && ARCH_HAS_GIGANTIC_PAGE) || \ + (ZONE_DEVICE && HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) + # TODO: Allow to be enabled without THP config ARCH_SUPPORTS_HUGE_PFNMAP def_bool n -- cgit From 00fbff75c5acb4755f06f08bd1071879c63940c5 Mon Sep 17 00:00:00 2001 From: Sourabh Jain Date: Sun, 2 Nov 2025 01:07:41 +0530 Subject: crash: fix crashkernel resource shrink When crashkernel is configured with a high reservation, shrinking its value below the low crashkernel reservation causes two issues: 1. Invalid crashkernel resource objects 2. Kernel crash if crashkernel shrinking is done twice For example, with crashkernel=200M,high, the kernel reserves 200MB of high memory and some default low memory (say 256MB). The reservation appears as: cat /proc/iomem | grep -i crash af000000-beffffff : Crash kernel 433000000-43f7fffff : Crash kernel If crashkernel is then shrunk to 50MB (echo 52428800 > /sys/kernel/kexec_crash_size), /proc/iomem still shows 256MB reserved: af000000-beffffff : Crash kernel Instead, it should show 50MB: af000000-b21fffff : Crash kernel Further shrinking crashkernel to 40MB causes a kernel crash with the following trace (x86): BUG: kernel NULL pointer dereference, address: 0000000000000038 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI Call Trace: ? __die_body.cold+0x19/0x27 ? page_fault_oops+0x15a/0x2f0 ? search_module_extables+0x19/0x60 ? search_bpf_extables+0x5f/0x80 ? exc_page_fault+0x7e/0x180 ? asm_exc_page_fault+0x26/0x30 ? __release_resource+0xd/0xb0 release_resource+0x26/0x40 __crash_shrink_memory+0xe5/0x110 crash_shrink_memory+0x12a/0x190 kexec_crash_size_store+0x41/0x80 kernfs_fop_write_iter+0x141/0x1f0 vfs_write+0x294/0x460 ksys_write+0x6d/0xf0 This happens because __crash_shrink_memory()/kernel/crash_core.c incorrectly updates the crashk_res resource object even when crashk_low_res should be updated. Fix this by ensuring the correct crashkernel resource object is updated when shrinking crashkernel memory. Link: https://lkml.kernel.org/r/20251101193741.289252-1-sourabhjain@linux.ibm.com Fixes: 16c6006af4d4 ("kexec: enable kexec_crash_size to support two crash kernel regions") Signed-off-by: Sourabh Jain Acked-by: Baoquan He Cc: Zhen Lei Cc: Signed-off-by: Andrew Morton --- kernel/crash_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/crash_core.c b/kernel/crash_core.c index 3b1c43382eec..99dac1aa972a 100644 --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@ -373,7 +373,7 @@ static int __crash_shrink_memory(struct resource *old_res, old_res->start = 0; old_res->end = 0; } else { - crashk_res.end = ram_res->start - 1; + old_res->end = ram_res->start - 1; } crash_free_reserved_phys_range(ram_res->start, ram_res->end); -- cgit From 3470715e5c22578c6ea4098b256d5a904e12eef2 Mon Sep 17 00:00:00 2001 From: "David Hildenbrand (Red Hat)" Date: Mon, 3 Nov 2025 11:36:59 +0100 Subject: MAINTAINERS: update David Hildenbrand's email address Switch to kernel.org email address as I will be leaving Red Hat. The old address will remain active until end of January 2026, so performing the change now should make sure that most mails will reach me. Link: https://lkml.kernel.org/r/20251103103659.379335-1-david@kernel.org Signed-off-by: David Hildenbrand Signed-off-by: David Hildenbrand (Red Hat) Signed-off-by: Andrew Morton --- .mailmap | 1 + MAINTAINERS | 28 ++++++++++++++-------------- 2 files changed, 15 insertions(+), 14 deletions(-) diff --git a/.mailmap b/.mailmap index 369cfe467932..a14166f834a4 100644 --- a/.mailmap +++ b/.mailmap @@ -206,6 +206,7 @@ Danilo Krummrich David Brownell David Collins David Heidelberg +David Hildenbrand David Rheinsberg David Rheinsberg David Rheinsberg diff --git a/MAINTAINERS b/MAINTAINERS index 5b93346f464f..c39701eec3fe 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11526,7 +11526,7 @@ F: include/linux/platform_data/huawei-gaokun-ec.h HUGETLB SUBSYSTEM M: Muchun Song M: Oscar Salvador -R: David Hildenbrand +R: David Hildenbrand L: linux-mm@kvack.org S: Maintained F: Documentation/ABI/testing/sysfs-kernel-mm-hugepages @@ -13733,7 +13733,7 @@ KERNEL VIRTUAL MACHINE for s390 (KVM/s390) M: Christian Borntraeger M: Janosch Frank M: Claudio Imbrenda -R: David Hildenbrand +R: David Hildenbrand L: kvm@vger.kernel.org S: Supported T: git git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git @@ -16220,7 +16220,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git F: drivers/devfreq/tegra30-devfreq.c MEMORY HOT(UN)PLUG -M: David Hildenbrand +M: David Hildenbrand M: Oscar Salvador L: linux-mm@kvack.org S: Maintained @@ -16245,7 +16245,7 @@ F: tools/mm/ MEMORY MANAGEMENT - CORE M: Andrew Morton -M: David Hildenbrand +M: David Hildenbrand R: Lorenzo Stoakes R: Liam R. Howlett R: Vlastimil Babka @@ -16301,7 +16301,7 @@ F: mm/execmem.c MEMORY MANAGEMENT - GUP (GET USER PAGES) M: Andrew Morton -M: David Hildenbrand +M: David Hildenbrand R: Jason Gunthorpe R: John Hubbard R: Peter Xu @@ -16317,7 +16317,7 @@ F: tools/testing/selftests/mm/gup_test.c MEMORY MANAGEMENT - KSM (Kernel Samepage Merging) M: Andrew Morton -M: David Hildenbrand +M: David Hildenbrand R: Xu Xin R: Chengming Zhou L: linux-mm@kvack.org @@ -16333,7 +16333,7 @@ F: mm/mm_slot.h MEMORY MANAGEMENT - MEMORY POLICY AND MIGRATION M: Andrew Morton -M: David Hildenbrand +M: David Hildenbrand R: Zi Yan R: Matthew Brost R: Joshua Hahn @@ -16373,7 +16373,7 @@ F: mm/workingset.c MEMORY MANAGEMENT - MISC M: Andrew Morton -M: David Hildenbrand +M: David Hildenbrand R: Lorenzo Stoakes R: Liam R. Howlett R: Vlastimil Babka @@ -16461,7 +16461,7 @@ F: mm/shuffle.h MEMORY MANAGEMENT - RECLAIM M: Andrew Morton M: Johannes Weiner -R: David Hildenbrand +R: David Hildenbrand R: Michal Hocko R: Qi Zheng R: Shakeel Butt @@ -16474,7 +16474,7 @@ F: mm/workingset.c MEMORY MANAGEMENT - RMAP (REVERSE MAPPING) M: Andrew Morton -M: David Hildenbrand +M: David Hildenbrand M: Lorenzo Stoakes R: Rik van Riel R: Liam R. Howlett @@ -16519,7 +16519,7 @@ F: mm/swapfile.c MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE) M: Andrew Morton -M: David Hildenbrand +M: David Hildenbrand M: Lorenzo Stoakes R: Zi Yan R: Baolin Wang @@ -16621,7 +16621,7 @@ MEMORY MAPPING - MADVISE (MEMORY ADVICE) M: Andrew Morton M: Liam R. Howlett M: Lorenzo Stoakes -M: David Hildenbrand +M: David Hildenbrand R: Vlastimil Babka R: Jann Horn L: linux-mm@kvack.org @@ -27088,7 +27088,7 @@ F: net/vmw_vsock/virtio_transport_common.c VIRTIO BALLOON M: "Michael S. Tsirkin" -M: David Hildenbrand +M: David Hildenbrand L: virtualization@lists.linux.dev S: Maintained F: drivers/virtio/virtio_balloon.c @@ -27243,7 +27243,7 @@ F: drivers/iommu/virtio-iommu.c F: include/uapi/linux/virtio_iommu.h VIRTIO MEM DRIVER -M: David Hildenbrand +M: David Hildenbrand L: virtualization@lists.linux.dev S: Maintained W: https://virtio-mem.gitlab.io/ -- cgit From f1d47cafe513b5552a5b20a7af0936d9070a8a78 Mon Sep 17 00:00:00 2001 From: Zi Yan Date: Wed, 5 Nov 2025 11:29:10 -0500 Subject: mm/huge_memory: fix folio split check for anon folios in swapcache Both uniform and non uniform split check missed the check to prevent splitting anon folios in swapcache to non-zero order. Splitting anon folios in swapcache to non-zero order can cause data corruption since swapcache only support PMD order and order-0 entries. This can happen when one use split_huge_pages under debugfs to split anon folios in swapcache. In-tree callers do not perform such an illegal operation. Only debugfs interface could trigger it. I will put adding a test case on my TODO list. Fix the check. Link: https://lkml.kernel.org/r/20251105162910.752266-1-ziy@nvidia.com Fixes: 58729c04cf10 ("mm/huge_memory: add buddy allocator like (non-uniform) folio_split()") Signed-off-by: Zi Yan Reported-by: "David Hildenbrand (Red Hat)" Closes: https://lore.kernel.org/all/dc0ecc2c-4089-484f-917f-920fdca4c898@kernel.org/ Acked-by: David Hildenbrand (Red Hat) Cc: Baolin Wang Cc: Barry Song Cc: Dev Jain Cc: Lance Yang Cc: Liam Howlett Cc: Lorenzo Stoakes Cc: Nico Pache Cc: Ryan Roberts Cc: Wei Yang Cc: Signed-off-by: Andrew Morton --- mm/huge_memory.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 323654fb4f8c..2f2a521e5d68 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -3522,7 +3522,8 @@ bool non_uniform_split_supported(struct folio *folio, unsigned int new_order, /* order-1 is not supported for anonymous THP. */ VM_WARN_ONCE(warns && new_order == 1, "Cannot split to order-1 folio"); - return new_order != 1; + if (new_order == 1) + return false; } else if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !mapping_large_folio_support(folio->mapping)) { /* @@ -3553,7 +3554,8 @@ bool uniform_split_supported(struct folio *folio, unsigned int new_order, if (folio_test_anon(folio)) { VM_WARN_ONCE(warns && new_order == 1, "Cannot split to order-1 folio"); - return new_order != 1; + if (new_order == 1) + return false; } else if (new_order) { if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !mapping_large_folio_support(folio->mapping)) { -- cgit From a26ec8f3d4e56d4a7ffa301e8032dca9df0bbc05 Mon Sep 17 00:00:00 2001 From: Pasha Tatashin Date: Thu, 6 Nov 2025 17:06:35 -0500 Subject: lib/test_kho: check if KHO is enabled We must check whether KHO is enabled prior to issuing KHO commands, otherwise KHO internal data structures are not initialized. Link: https://lkml.kernel.org/r/20251106220635.2608494-1-pasha.tatashin@soleen.com Fixes: b753522bed0b ("kho: add test for kexec handover") Signed-off-by: Pasha Tatashin Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-lkp/202511061629.e242724-lkp@intel.com Reviewed-by: Pratyush Yadav Reviewed-by: Mike Rapoport (Microsoft) Cc: Alexander Graf Cc: Signed-off-by: Andrew Morton --- lib/test_kho.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/lib/test_kho.c b/lib/test_kho.c index 60cd899ea745..fff018e5548d 100644 --- a/lib/test_kho.c +++ b/lib/test_kho.c @@ -301,6 +301,9 @@ static int __init kho_test_init(void) phys_addr_t fdt_phys; int err; + if (!kho_is_enabled()) + return 0; + err = kho_retrieve_subtree(KHO_TEST_FDT, &fdt_phys); if (!err) return kho_test_restore(fdt_phys); -- cgit From 216158f063fe24fb003bd7da0cd92cd6e2c4d48b Mon Sep 17 00:00:00 2001 From: Ankit Khushwaha Date: Thu, 6 Nov 2025 15:25:32 +0530 Subject: selftests/user_events: fix type cast for write_index packed member in perf_test Accessing 'reg.write_index' directly triggers a -Waddress-of-packed-member warning due to potential unaligned pointer access: perf_test.c:239:38: warning: taking address of packed member 'write_index' of class or structure 'user_reg' may result in an unaligned pointer value [-Waddress-of-packed-member] 239 | ASSERT_NE(-1, write(self->data_fd, ®.write_index, | ^~~~~~~~~~~~~~~ Since write(2) works with any alignment. Casting '®.write_index' explicitly to 'void *' to suppress this warning. Link: https://lkml.kernel.org/r/20251106095532.15185-1-ankitkhushwaha.linux@gmail.com Fixes: 42187bdc3ca4 ("selftests/user_events: Add perf self-test for empty arguments events") Signed-off-by: Ankit Khushwaha Cc: Beau Belgrave Cc: "Masami Hiramatsu (Google)" Cc: Steven Rostedt Cc: sunliming Cc: Wei Yang Cc: Shuah Khan Cc: Signed-off-by: Andrew Morton --- tools/testing/selftests/user_events/perf_test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/user_events/perf_test.c b/tools/testing/selftests/user_events/perf_test.c index 5288e768b207..68625362add2 100644 --- a/tools/testing/selftests/user_events/perf_test.c +++ b/tools/testing/selftests/user_events/perf_test.c @@ -236,7 +236,7 @@ TEST_F(user, perf_empty_events) { ASSERT_EQ(1 << reg.enable_bit, self->check); /* Ensure write shows up at correct offset */ - ASSERT_NE(-1, write(self->data_fd, ®.write_index, + ASSERT_NE(-1, write(self->data_fd, (void *)®.write_index, sizeof(reg.write_index))); val = (void *)(((char *)perf_page) + perf_page->data_offset); ASSERT_EQ(PERF_RECORD_SAMPLE, *val); -- cgit From 1c2a936edd71e133f2806e68324ec81a4eb07588 Mon Sep 17 00:00:00 2001 From: Kairui Song Date: Tue, 11 Nov 2025 21:36:08 +0800 Subject: mm, swap: fix potential UAF issue for VMA readahead Since commit 78524b05f1a3 ("mm, swap: avoid redundant swap device pinning"), the common helper for allocating and preparing a folio in the swap cache layer no longer tries to get a swap device reference internally, because all callers of __read_swap_cache_async are already holding a swap entry reference. The repeated swap device pinning isn't needed on the same swap device. Caller of VMA readahead is also holding a reference to the target entry's swap device, but VMA readahead walks the page table, so it might encounter swap entries from other devices, and call __read_swap_cache_async on another device without holding a reference to it. So it is possible to cause a UAF when swapoff of device A raced with swapin on device B, and VMA readahead tries to read swap entries from device A. It's not easy to trigger, but in theory, it could cause real issues. Make VMA readahead try to get the device reference first if the swap device is a different one from the target entry. Link: https://lkml.kernel.org/r/20251111-swap-fix-vma-uaf-v1-1-41c660e58562@tencent.com Fixes: 78524b05f1a3 ("mm, swap: avoid redundant swap device pinning") Suggested-by: Huang Ying Signed-off-by: Kairui Song Acked-by: Chris Li Cc: Baoquan He Cc: Barry Song Cc: Kemeng Shi Cc: Nhat Pham Cc: Signed-off-by: Andrew Morton --- mm/swap_state.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/mm/swap_state.c b/mm/swap_state.c index b13e9c4baa90..f4980dde5394 100644 --- a/mm/swap_state.c +++ b/mm/swap_state.c @@ -748,6 +748,8 @@ static struct folio *swap_vma_readahead(swp_entry_t targ_entry, gfp_t gfp_mask, blk_start_plug(&plug); for (addr = start; addr < end; ilx++, addr += PAGE_SIZE) { + struct swap_info_struct *si = NULL; + if (!pte++) { pte = pte_offset_map(vmf->pmd, addr); if (!pte) @@ -761,8 +763,19 @@ static struct folio *swap_vma_readahead(swp_entry_t targ_entry, gfp_t gfp_mask, continue; pte_unmap(pte); pte = NULL; + /* + * Readahead entry may come from a device that we are not + * holding a reference to, try to grab a reference, or skip. + */ + if (swp_type(entry) != swp_type(targ_entry)) { + si = get_swap_device(entry); + if (!si) + continue; + } folio = __read_swap_cache_async(entry, gfp_mask, mpol, ilx, &page_allocated, false); + if (si) + put_swap_device(si); if (!folio) continue; if (page_allocated) { -- cgit From 1107aac1ad7f445a83604b14af7be47f1a795c66 Mon Sep 17 00:00:00 2001 From: Takashi Sakamoto Date: Fri, 14 Nov 2025 23:44:21 +0900 Subject: firewire: core: fix to update generation field in topology map The generation field of topology map is updated after initialized by zero. The updated value of generation field is always zero, and is against specification. This commit fixes the bug. Fixes: 7d138cb269db ("firewire: core: use spin lock specific to topology map") Link: https://lore.kernel.org/r/20251114144421.415278-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto --- drivers/firewire/core-topology.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/firewire/core-topology.c b/drivers/firewire/core-topology.c index 2f73bcd5696f..ed3ae8cdb0cd 100644 --- a/drivers/firewire/core-topology.c +++ b/drivers/firewire/core-topology.c @@ -441,12 +441,13 @@ static void update_topology_map(__be32 *buffer, size_t buffer_size, int root_nod const u32 *self_ids, int self_id_count) { __be32 *map = buffer; + u32 next_generation = be32_to_cpu(buffer[1]) + 1; int node_count = (root_node_id & 0x3f) + 1; memset(map, 0, buffer_size); *map++ = cpu_to_be32((self_id_count + 2) << 16); - *map++ = cpu_to_be32(be32_to_cpu(buffer[1]) + 1); + *map++ = cpu_to_be32(next_generation); *map++ = cpu_to_be32((node_count << 16) | self_id_count); while (self_id_count--) -- cgit From 6a23ae0a96a600d1d12557add110e0bb6e32730c Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sun, 16 Nov 2025 14:25:38 -0800 Subject: Linux 6.18-rc6 --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index fb4389aa5d5f..d763c2c75cdb 100644 --- a/Makefile +++ b/Makefile @@ -2,7 +2,7 @@ VERSION = 6 PATCHLEVEL = 18 SUBLEVEL = 0 -EXTRAVERSION = -rc5 +EXTRAVERSION = -rc6 NAME = Baby Opossum Posse # *DOCUMENTATION* -- cgit From 36c6f3c03d104faf1aa90922f2310549c175420f Mon Sep 17 00:00:00 2001 From: Zqiang Date: Mon, 17 Nov 2025 20:53:10 +0800 Subject: sched_ext: Use IRQ_WORK_INIT_HARD() to initialize rq->scx.kick_cpus_irq_work For PREEMPT_RT kernels, the kick_cpus_irq_workfn() be invoked in the per-cpu irq_work/* task context and there is no rcu-read critical section to protect. this commit therefore use IRQ_WORK_INIT_HARD() to initialize the per-cpu rq->scx.kick_cpus_irq_work in the init_sched_ext_class(). Signed-off-by: Zqiang Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 07399210ac2d..7aae1d0ce37e 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -5322,7 +5322,7 @@ void __init init_sched_ext_class(void) BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_preempt, GFP_KERNEL, n)); BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_wait, GFP_KERNEL, n)); rq->scx.deferred_irq_work = IRQ_WORK_INIT_HARD(deferred_irq_workfn); - init_irq_work(&rq->scx.kick_cpus_irq_work, kick_cpus_irq_workfn); + rq->scx.kick_cpus_irq_work = IRQ_WORK_INIT_HARD(kick_cpus_irq_workfn); if (cpu_online(cpu)) cpu_rq(cpu)->scx.flags |= SCX_RQ_ONLINE; -- cgit From da02a1824884d6c84c5e5b5ac373b0c9e3288ec2 Mon Sep 17 00:00:00 2001 From: Aleksei Nikiforov Date: Wed, 12 Nov 2025 19:27:24 +0100 Subject: s390/ctcm: Fix double-kfree The function 'mpc_rcvd_sweep_req(mpcginfo)' is called conditionally from function 'ctcmpc_unpack_skb'. It frees passed mpcginfo. After that a call to function 'kfree' in function 'ctcmpc_unpack_skb' frees it again. Remove 'kfree' call in function 'mpc_rcvd_sweep_req(mpcginfo)'. Bug detected by the clang static analyzer. Fixes: 0c0b20587b9f25a2 ("s390/ctcm: fix potential memory leak") Reviewed-by: Aswin Karuvally Signed-off-by: Aleksei Nikiforov Signed-off-by: Aswin Karuvally Reviewed-by: Simon Horman Link: https://patch.msgid.link/20251112182724.1109474-1-aswin@linux.ibm.com Signed-off-by: Jakub Kicinski --- drivers/s390/net/ctcm_mpc.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/s390/net/ctcm_mpc.c b/drivers/s390/net/ctcm_mpc.c index 0aeafa772fb1..407b7c516658 100644 --- a/drivers/s390/net/ctcm_mpc.c +++ b/drivers/s390/net/ctcm_mpc.c @@ -701,7 +701,6 @@ static void mpc_rcvd_sweep_req(struct mpcg_info *mpcginfo) grp->sweep_req_pend_num--; ctcmpc_send_sweep_resp(ch); - kfree(mpcginfo); return; } -- cgit From bed22c7b90af732978715a1789bca1c3cfa245a6 Mon Sep 17 00:00:00 2001 From: Ido Schimmel Date: Sun, 16 Nov 2025 10:10:29 +0200 Subject: selftests: net: lib: Do not overwrite error messages ret_set_ksft_status() calls ksft_status_merge() with the current return status and the last one. It treats a non-zero return code from ksft_status_merge() as an indication that the return status was overwritten by the last one and therefore overwrites the return message with the last one. Currently, ksft_status_merge() returns a non-zero return code even if the current return status and the last one are equal. This results in return messages being overwritten which is counter-productive since we are more interested in the first failure message and not the last one. Fix by changing ksft_status_merge() to only return a non-zero return code if the current return status was actually changed. Add a test case which checks that the first error message is not overwritten. Before: # ./lib_sh_test.sh [...] TEST: RET tfail2 tfail -> fail [FAIL] retmsg=tfail expected tfail2 [...] # echo $? 1 After: # ./lib_sh_test.sh [...] TEST: RET tfail2 tfail -> fail [ OK ] [...] # echo $? 0 Fixes: 596c8819cb78 ("selftests: forwarding: Have RET track kselftest framework constants") Reviewed-by: Petr Machata Signed-off-by: Ido Schimmel Link: https://patch.msgid.link/20251116081029.69112-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/forwarding/lib_sh_test.sh | 7 +++++++ tools/testing/selftests/net/lib.sh | 2 +- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/net/forwarding/lib_sh_test.sh b/tools/testing/selftests/net/forwarding/lib_sh_test.sh index ff2accccaf4d..b4eda6c6199e 100755 --- a/tools/testing/selftests/net/forwarding/lib_sh_test.sh +++ b/tools/testing/selftests/net/forwarding/lib_sh_test.sh @@ -30,6 +30,11 @@ tfail() do_test "tfail" false } +tfail2() +{ + do_test "tfail2" false +} + txfail() { FAIL_TO_XFAIL=yes do_test "txfail" false @@ -132,6 +137,8 @@ test_ret() ret_subtest $ksft_fail "tfail" txfail tfail ret_subtest $ksft_xfail "txfail" txfail txfail + + ret_subtest $ksft_fail "tfail2" tfail2 tfail } exit_status_tests_run() diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh index feba4ef69a54..f448bafb3f20 100644 --- a/tools/testing/selftests/net/lib.sh +++ b/tools/testing/selftests/net/lib.sh @@ -43,7 +43,7 @@ __ksft_status_merge() weights[$i]=$((weight++)) done - if [[ ${weights[$a]} > ${weights[$b]} ]]; then + if [[ ${weights[$a]} -ge ${weights[$b]} ]]; then echo "$a" return 0 else -- cgit From 8e0a754b0836d996802713bbebc87bc1cc17925c Mon Sep 17 00:00:00 2001 From: Lorenzo Bianconi Date: Thu, 13 Nov 2025 18:19:38 +0100 Subject: net: airoha: Do not loopback traffic to GDM2 if it is available on the device Airoha_eth driver forwards offloaded uplink traffic (packets received on GDM1 and forwarded to GDM{3,4}) to GDM2 in order to apply hw QoS. This is correct if the device does not support a dedicated GDM2 port. In this case, in order to enable hw offloading for uplink traffic, the packets should be sent to GDM{3,4} directly. Fixes: 9cd451d414f6 ("net: airoha: Add loopback support for GDM2") Signed-off-by: Lorenzo Bianconi Link: https://patch.msgid.link/20251113-airoha-hw-offload-gdm2-fix-v1-1-7e4ca300872f@kernel.org Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/airoha/airoha_ppe.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/airoha/airoha_ppe.c b/drivers/net/ethernet/airoha/airoha_ppe.c index 691361b25407..c0e17035db18 100644 --- a/drivers/net/ethernet/airoha/airoha_ppe.c +++ b/drivers/net/ethernet/airoha/airoha_ppe.c @@ -282,7 +282,7 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth, if (!airoha_is_valid_gdm_port(eth, port)) return -EINVAL; - if (dsa_port >= 0) + if (dsa_port >= 0 || eth->ports[1]) pse_port = port->id == 4 ? FE_PSE_PORT_GDM4 : port->id; else -- cgit From 896f1a2493b59beb2b5ccdf990503dbb16cb2256 Mon Sep 17 00:00:00 2001 From: Pavel Zhigulin Date: Thu, 13 Nov 2025 14:27:56 +0300 Subject: net: qlogic/qede: fix potential out-of-bounds read in qede_tpa_cont() and qede_tpa_end() The loops in 'qede_tpa_cont()' and 'qede_tpa_end()', iterate over 'cqe->len_list[]' using only a zero-length terminator as the stopping condition. If the terminator was missing or malformed, the loop could run past the end of the fixed-size array. Add an explicit bound check using ARRAY_SIZE() in both loops to prevent a potential out-of-bounds access. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 55482edc25f0 ("qede: Add slowpath/fastpath support and enable hardware GRO") Signed-off-by: Pavel Zhigulin Link: https://patch.msgid.link/20251113112757.4166625-1-Pavel.Zhigulin@kaspersky.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/qlogic/qede/qede_fp.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qede/qede_fp.c b/drivers/net/ethernet/qlogic/qede/qede_fp.c index 847fa62c80df..e338bfc8b7b2 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_fp.c +++ b/drivers/net/ethernet/qlogic/qede/qede_fp.c @@ -4,6 +4,7 @@ * Copyright (c) 2019-2020 Marvell International Ltd. */ +#include #include #include #include @@ -960,7 +961,7 @@ static inline void qede_tpa_cont(struct qede_dev *edev, { int i; - for (i = 0; cqe->len_list[i]; i++) + for (i = 0; cqe->len_list[i] && i < ARRAY_SIZE(cqe->len_list); i++) qede_fill_frag_skb(edev, rxq, cqe->tpa_agg_index, le16_to_cpu(cqe->len_list[i])); @@ -985,7 +986,7 @@ static int qede_tpa_end(struct qede_dev *edev, dma_unmap_page(rxq->dev, tpa_info->buffer.mapping, PAGE_SIZE, rxq->data_direction); - for (i = 0; cqe->len_list[i]; i++) + for (i = 0; cqe->len_list[i] && i < ARRAY_SIZE(cqe->len_list); i++) qede_fill_frag_skb(edev, rxq, cqe->tpa_agg_index, le16_to_cpu(cqe->len_list[i])); if (unlikely(i > 1)) -- cgit From 0f08f0b0fb5e674b48f30e86c103760204a1d3f3 Mon Sep 17 00:00:00 2001 From: Florian Fuchs Date: Thu, 13 Nov 2025 19:10:00 +0100 Subject: net: ps3_gelic_net: handle skb allocation failures Handle skb allocation failures in RX path, to avoid NULL pointer dereference and RX stalls under memory pressure. If the refill fails with -ENOMEM, complete napi polling and wake up later to retry via timer. Also explicitly re-enable RX DMA after oom, so the dmac doesn't remain stopped in this situation. Previously, memory pressure could lead to skb allocation failures and subsequent Oops like: Oops: Kernel access of bad area, sig: 11 [#2] Hardware name: SonyPS3 Cell Broadband Engine 0x701000 PS3 NIP [c0003d0000065900] gelic_net_poll+0x6c/0x2d0 [ps3_gelic] (unreliable) LR [c0003d00000659c4] gelic_net_poll+0x130/0x2d0 [ps3_gelic] Call Trace: gelic_net_poll+0x130/0x2d0 [ps3_gelic] (unreliable) __napi_poll+0x44/0x168 net_rx_action+0x178/0x290 Steps to reproduce the issue: 1. Start a continuous network traffic, like scp of a 20GB file 2. Inject failslab errors using the kernel fault injection: echo -1 > /sys/kernel/debug/failslab/times echo 30 > /sys/kernel/debug/failslab/interval echo 100 > /sys/kernel/debug/failslab/probability 3. After some time, traces start to appear, kernel Oopses and the system stops Step 2 is not always necessary, as it is usually already triggered by the transfer of a big enough file. Fixes: 02c1889166b4 ("ps3: gigabit ethernet driver for PS3, take3") Signed-off-by: Florian Fuchs Link: https://patch.msgid.link/20251113181000.3914980-1-fuchsfl@gmail.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/toshiba/ps3_gelic_net.c | 45 +++++++++++++++++++++------- drivers/net/ethernet/toshiba/ps3_gelic_net.h | 1 + 2 files changed, 35 insertions(+), 11 deletions(-) diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c index 5ee8e8980393..591866fc9055 100644 --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c @@ -260,6 +260,7 @@ void gelic_card_down(struct gelic_card *card) if (atomic_dec_if_positive(&card->users) == 0) { pr_debug("%s: real do\n", __func__); napi_disable(&card->napi); + timer_delete_sync(&card->rx_oom_timer); /* * Disable irq. Wireless interrupts will * be disabled later if any @@ -970,7 +971,8 @@ static void gelic_net_pass_skb_up(struct gelic_descr *descr, * gelic_card_decode_one_descr - processes an rx descriptor * @card: card structure * - * returns 1 if a packet has been sent to the stack, otherwise 0 + * returns 1 if a packet has been sent to the stack, -ENOMEM on skb alloc + * failure, otherwise 0 * * processes an rx descriptor by iommu-unmapping the data buffer and passing * the packet up to the stack @@ -981,16 +983,18 @@ static int gelic_card_decode_one_descr(struct gelic_card *card) struct gelic_descr_chain *chain = &card->rx_chain; struct gelic_descr *descr = chain->head; struct net_device *netdev = NULL; - int dmac_chain_ended; + int dmac_chain_ended = 0; + int prepare_rx_ret; status = gelic_descr_get_status(descr); if (status == GELIC_DESCR_DMA_CARDOWNED) return 0; - if (status == GELIC_DESCR_DMA_NOT_IN_USE) { + if (status == GELIC_DESCR_DMA_NOT_IN_USE || !descr->skb) { dev_dbg(ctodev(card), "dormant descr? %p\n", descr); - return 0; + dmac_chain_ended = 1; + goto refill; } /* netdevice select */ @@ -1048,9 +1052,10 @@ static int gelic_card_decode_one_descr(struct gelic_card *card) refill: /* is the current descriptor terminated with next_descr == NULL? */ - dmac_chain_ended = - be32_to_cpu(descr->hw_regs.dmac_cmd_status) & - GELIC_DESCR_RX_DMA_CHAIN_END; + if (!dmac_chain_ended) + dmac_chain_ended = + be32_to_cpu(descr->hw_regs.dmac_cmd_status) & + GELIC_DESCR_RX_DMA_CHAIN_END; /* * So that always DMAC can see the end * of the descriptor chain to avoid @@ -1062,10 +1067,11 @@ refill: gelic_descr_set_status(descr, GELIC_DESCR_DMA_NOT_IN_USE); /* - * this call can fail, but for now, just leave this - * descriptor without skb + * this call can fail, propagate the error */ - gelic_descr_prepare_rx(card, descr); + prepare_rx_ret = gelic_descr_prepare_rx(card, descr); + if (prepare_rx_ret) + return prepare_rx_ret; chain->tail = descr; chain->head = descr->next; @@ -1087,6 +1093,13 @@ refill: return 1; } +static void gelic_rx_oom_timer(struct timer_list *t) +{ + struct gelic_card *card = timer_container_of(card, t, rx_oom_timer); + + napi_schedule(&card->napi); +} + /** * gelic_net_poll - NAPI poll function called by the stack to return packets * @napi: napi structure @@ -1099,14 +1112,22 @@ static int gelic_net_poll(struct napi_struct *napi, int budget) { struct gelic_card *card = container_of(napi, struct gelic_card, napi); int packets_done = 0; + int work_result = 0; while (packets_done < budget) { - if (!gelic_card_decode_one_descr(card)) + work_result = gelic_card_decode_one_descr(card); + if (work_result != 1) break; packets_done++; } + if (work_result == -ENOMEM) { + napi_complete_done(napi, packets_done); + mod_timer(&card->rx_oom_timer, jiffies + 1); + return packets_done; + } + if (packets_done < budget) { napi_complete_done(napi, packets_done); gelic_card_rx_irq_on(card); @@ -1576,6 +1597,8 @@ static struct gelic_card *gelic_alloc_card_net(struct net_device **netdev) mutex_init(&card->updown_lock); atomic_set(&card->users, 0); + timer_setup(&card->rx_oom_timer, gelic_rx_oom_timer, 0); + return card; } diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.h b/drivers/net/ethernet/toshiba/ps3_gelic_net.h index f7d7931e51b7..c10f1984a5a1 100644 --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.h +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.h @@ -268,6 +268,7 @@ struct gelic_vlan_id { struct gelic_card { struct napi_struct napi; struct net_device *netdev[GELIC_PORT_MAX]; + struct timer_list rx_oom_timer; /* * hypervisor requires irq_status should be * 8 bytes aligned, but u64 member is -- cgit From 5bebe8de19264946d398ead4e6c20c229454a552 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Tue, 18 Nov 2025 08:21:27 -0800 Subject: mm/huge_memory: Fix initialization of huge zero folio The recent fix to properly initialize the tags of the huge zero folio had an unfortunate not-so-subtle side effect: it caused the actual *contents* of the huge zero folio to not be initialized at all when the hardware didn't support the memory tagging. The reason was the unfortunate semantics of tag_clear_highpage(): on hardware that didn't do the tagging, it would silently just not do anything at all. And since this is done only on arm64 with MTE support, that basically meant most hardware. It wasn't necessarily immediately obvious since the huge zero page isn't necessarily very heavily used - or because it might already be zero because all-zeroes is the most common pattern. But it ends up causing random odd user space failures when you do hit it. The unfortunate semantics have been around for a while, but became a real bug only when we started actively using __GFP_ZEROTAGS in the generic get_huge_zero_folio() function - before that, it had only ever been used in code that checked that the hardware supported it. Fix this by simply changing the semantics of tag_clear_highpage() to return whether it actually successfully did something or not. While at it, also make it initialize multiple pages in one go, since that's actually what the only caller wants it to do and it simplifies the whole logic. Fixes: adfb6609c680 ("mm/huge_memory: initialise the tags of the huge zero folio") Link: https://lore.kernel.org/all/20251117082023.90176-1-00107082@163.com/ Reviewed-by: David Hildenbrand (Red Hat) Reported-and-tested-by: David Wang <00107082@163.com> Reported-and-tested-by: Carlos Llamas Signed-off-by: Linus Torvalds --- arch/arm64/include/asm/page.h | 4 ++-- arch/arm64/mm/fault.c | 21 +++++++++++---------- include/linux/highmem.h | 6 ++++-- mm/page_alloc.c | 9 ++------- 4 files changed, 19 insertions(+), 21 deletions(-) diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h index 2312e6ee595f..258cca4b4873 100644 --- a/arch/arm64/include/asm/page.h +++ b/arch/arm64/include/asm/page.h @@ -33,8 +33,8 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma, unsigned long vaddr); #define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio -void tag_clear_highpage(struct page *to); -#define __HAVE_ARCH_TAG_CLEAR_HIGHPAGE +bool tag_clear_highpages(struct page *to, int numpages); +#define __HAVE_ARCH_TAG_CLEAR_HIGHPAGES #define clear_user_page(page, vaddr, pg) clear_page(page) #define copy_user_page(to, from, vaddr, pg) copy_page(to, from) diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 125dfa6c613b..a193b6a5d1e6 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -967,20 +967,21 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma, return vma_alloc_folio(flags, 0, vma, vaddr); } -void tag_clear_highpage(struct page *page) +bool tag_clear_highpages(struct page *page, int numpages) { /* * Check if MTE is supported and fall back to clear_highpage(). * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and - * post_alloc_hook() will invoke tag_clear_highpage(). + * post_alloc_hook() will invoke tag_clear_highpages(). */ - if (!system_supports_mte()) { - clear_highpage(page); - return; - } + if (!system_supports_mte()) + return false; - /* Newly allocated page, shouldn't have been tagged yet */ - WARN_ON_ONCE(!try_page_mte_tagging(page)); - mte_zero_clear_page_tags(page_address(page)); - set_page_mte_tagged(page); + /* Newly allocated pages, shouldn't have been tagged yet */ + for (int i = 0; i < numpages; i++, page++) { + WARN_ON_ONCE(!try_page_mte_tagging(page)); + mte_zero_clear_page_tags(page_address(page)); + set_page_mte_tagged(page); + } + return true; } diff --git a/include/linux/highmem.h b/include/linux/highmem.h index 105cc4c00cc3..abc20f9810fd 100644 --- a/include/linux/highmem.h +++ b/include/linux/highmem.h @@ -249,10 +249,12 @@ static inline void clear_highpage_kasan_tagged(struct page *page) kunmap_local(kaddr); } -#ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE +#ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGES -static inline void tag_clear_highpage(struct page *page) +/* Return false to let people know we did not initialize the pages */ +static inline bool tag_clear_highpages(struct page *page, int numpages) { + return false; } #endif diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 600d9e981c23..ed82ee55e66a 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1822,14 +1822,9 @@ inline void post_alloc_hook(struct page *page, unsigned int order, * If memory tags should be zeroed * (which happens only when memory should be initialized as well). */ - if (zero_tags) { - /* Initialize both memory and memory tags. */ - for (i = 0; i != 1 << order; ++i) - tag_clear_highpage(page + i); + if (zero_tags) + init = !tag_clear_highpages(page, 1 << order); - /* Take note that memory was initialized by the loop above. */ - init = false; - } if (!should_skip_kasan_unpoison(gfp_flags) && kasan_unpoison_pages(page, order, init)) { /* Take note that memory was initialized by KASAN. */ -- cgit From 3fa05f96fc08dff5e846c2cc283a249c1bf029a1 Mon Sep 17 00:00:00 2001 From: Yosry Ahmed Date: Wed, 12 Nov 2025 01:30:17 +0000 Subject: KVM: SVM: Fix redundant updates of LBR MSR intercepts Don't update the LBR MSR intercept bitmaps if they're already up-to-date, as unconditionally updating the intercepts forces KVM to recalculate the MSR bitmaps for vmcb02 on every nested VMRUN. The redundant updates are functionally okay; however, they neuter an optimization in Hyper-V nested virtualization enlightenments and this manifests as a self-test failure. In particular, Hyper-V lets L1 mark "nested enlightenments" as clean, i.e. tell KVM that no changes were made to the MSR bitmap since the last VMRUN. The hyperv_svm_test KVM selftest intentionally changes the MSR bitmap "without telling KVM about it" to verify that KVM honors the clean hint, correctly fails because KVM notices the changed bitmap anyway: ==== Test Assertion Failure ==== x86/hyperv_svm_test.c:120: vmcb->control.exit_code == 0x081 pid=193558 tid=193558 errno=4 - Interrupted system call 1 0x0000000000411361: assert_on_unhandled_exception at processor.c:659 2 0x0000000000406186: _vcpu_run at kvm_util.c:1699 3 (inlined by) vcpu_run at kvm_util.c:1710 4 0x0000000000401f2a: main at hyperv_svm_test.c:175 5 0x000000000041d0d3: __libc_start_call_main at libc-start.o:? 6 0x000000000041f27c: __libc_start_main_impl at ??:? 7 0x00000000004021a0: _start at ??:? vmcb->control.exit_code == SVM_EXIT_VMMCALL Do *not* fix this by skipping svm_hv_vmcb_dirty_nested_enlightenments() when svm_set_intercept_for_msr() performs a no-op change. changes to the L0 MSR interception bitmap are only triggered by full CPUID updates and MSR filter updates, both of which should be rare. Changing svm_set_intercept_for_msr() risks hiding unintended pessimizations like this one, and is actually more complex than this change. Fixes: fbe5e5f030c2 ("KVM: nSVM: Always recalculate LBR MSR intercepts in svm_update_lbrv()") Cc: stable@vger.kernel.org Signed-off-by: Yosry Ahmed Link: https://patch.msgid.link/20251112013017.1836863-1-yosry.ahmed@linux.dev [Rewritten commit message based on mailing list discussion. - Paolo] Reviewed-by: Sean Christopherson Tested-by: Sean Christopherson Signed-off-by: Paolo Bonzini --- arch/x86/kvm/svm/svm.c | 9 ++++++++- arch/x86/kvm/svm/svm.h | 1 + 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 10c21e4c5406..9d29b2e7e855 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -705,7 +705,11 @@ void *svm_alloc_permissions_map(unsigned long size, gfp_t gfp_mask) static void svm_recalc_lbr_msr_intercepts(struct kvm_vcpu *vcpu) { - bool intercept = !(to_svm(vcpu)->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK); + struct vcpu_svm *svm = to_svm(vcpu); + bool intercept = !(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK); + + if (intercept == svm->lbr_msrs_intercepted) + return; svm_set_intercept_for_msr(vcpu, MSR_IA32_LASTBRANCHFROMIP, MSR_TYPE_RW, intercept); svm_set_intercept_for_msr(vcpu, MSR_IA32_LASTBRANCHTOIP, MSR_TYPE_RW, intercept); @@ -714,6 +718,8 @@ static void svm_recalc_lbr_msr_intercepts(struct kvm_vcpu *vcpu) if (sev_es_guest(vcpu->kvm)) svm_set_intercept_for_msr(vcpu, MSR_IA32_DEBUGCTLMSR, MSR_TYPE_RW, intercept); + + svm->lbr_msrs_intercepted = intercept; } void svm_vcpu_free_msrpm(void *msrpm) @@ -1221,6 +1227,7 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu) } svm->x2avic_msrs_intercepted = true; + svm->lbr_msrs_intercepted = true; svm->vmcb01.ptr = page_address(vmcb01_page); svm->vmcb01.pa = __sme_set(page_to_pfn(vmcb01_page) << PAGE_SHIFT); diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index c856d8e0f95e..dd78e6402345 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -336,6 +336,7 @@ struct vcpu_svm { bool guest_state_loaded; bool x2avic_msrs_intercepted; + bool lbr_msrs_intercepted; /* Guest GIF value, used when vGIF is not enabled */ bool guest_gif; -- cgit From cdcbb8e8d10f656642380ee13516290437b52b36 Mon Sep 17 00:00:00 2001 From: Naoki Ueki Date: Mon, 3 Nov 2025 21:16:45 +0900 Subject: HID: elecom: Add support for ELECOM M-XT3URBK (018F) The ELECOM M-XT3URBK trackball has an additional device ID (0x018F), which shares the same report descriptor as the existing device (0x00FB). However, the driver does not currently recognize this new ID, resulting in only five buttons being functional. This patch adds the new device ID so that all six buttons work properly. Signed-off-by: Naoki Ueki Signed-off-by: Jiri Kosina --- drivers/hid/hid-elecom.c | 6 ++++-- drivers/hid/hid-ids.h | 3 ++- drivers/hid/hid-quirks.c | 3 ++- 3 files changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/hid/hid-elecom.c b/drivers/hid/hid-elecom.c index 69771fd35006..981d1b6e9658 100644 --- a/drivers/hid/hid-elecom.c +++ b/drivers/hid/hid-elecom.c @@ -75,7 +75,8 @@ static const __u8 *elecom_report_fixup(struct hid_device *hdev, __u8 *rdesc, */ mouse_button_fixup(hdev, rdesc, *rsize, 20, 28, 22, 14, 8); break; - case USB_DEVICE_ID_ELECOM_M_XT3URBK: + case USB_DEVICE_ID_ELECOM_M_XT3URBK_00FB: + case USB_DEVICE_ID_ELECOM_M_XT3URBK_018F: case USB_DEVICE_ID_ELECOM_M_XT3DRBK: case USB_DEVICE_ID_ELECOM_M_XT4DRBK: /* @@ -119,7 +120,8 @@ static const __u8 *elecom_report_fixup(struct hid_device *hdev, __u8 *rdesc, static const struct hid_device_id elecom_devices[] = { { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_BM084) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XGL20DLBK) }, - { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3URBK) }, + { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3URBK_00FB) }, + { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3URBK_018F) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3DRBK) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT4DRBK) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_DT1URBK) }, diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index 85db279baa72..c4589075a5ed 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -449,7 +449,8 @@ #define USB_VENDOR_ID_ELECOM 0x056e #define USB_DEVICE_ID_ELECOM_BM084 0x0061 #define USB_DEVICE_ID_ELECOM_M_XGL20DLBK 0x00e6 -#define USB_DEVICE_ID_ELECOM_M_XT3URBK 0x00fb +#define USB_DEVICE_ID_ELECOM_M_XT3URBK_00FB 0x00fb +#define USB_DEVICE_ID_ELECOM_M_XT3URBK_018F 0x018f #define USB_DEVICE_ID_ELECOM_M_XT3DRBK 0x00fc #define USB_DEVICE_ID_ELECOM_M_XT4DRBK 0x00fd #define USB_DEVICE_ID_ELECOM_M_DT1URBK 0x00fe diff --git a/drivers/hid/hid-quirks.c b/drivers/hid/hid-quirks.c index 22760ac50f2d..c89a015686c0 100644 --- a/drivers/hid/hid-quirks.c +++ b/drivers/hid/hid-quirks.c @@ -410,7 +410,8 @@ static const struct hid_device_id hid_have_special_driver[] = { #if IS_ENABLED(CONFIG_HID_ELECOM) { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_BM084) }, { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XGL20DLBK) }, - { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3URBK) }, + { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3URBK_00FB) }, + { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3URBK_018F) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3DRBK) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT4DRBK) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_DT1URBK) }, -- cgit From 4e127a74786fa9573a32c8aa4bbf69ef78c3232a Mon Sep 17 00:00:00 2001 From: Stuart Hayhurst Date: Mon, 3 Nov 2025 14:21:13 +0000 Subject: HID: corsair-void: Use %pe for printing PTR_ERR Use %pe to print a PTR_ERR to silence a cocci warning Reported-by: kernel test robot Reported-by: Julia Lawall Closes: https://lore.kernel.org/r/202510300342.WtPn2jF3-lkp@intel.com/ Signed-off-by: Stuart Hayhurst Signed-off-by: Jiri Kosina --- drivers/hid/hid-corsair-void.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/hid/hid-corsair-void.c b/drivers/hid/hid-corsair-void.c index fee134a7eba3..5e9a5b8f7f16 100644 --- a/drivers/hid/hid-corsair-void.c +++ b/drivers/hid/hid-corsair-void.c @@ -553,9 +553,8 @@ static void corsair_void_add_battery(struct corsair_void_drvdata *drvdata) if (IS_ERR(new_supply)) { hid_err(drvdata->hid_dev, - "failed to register battery '%s' (reason: %ld)\n", - drvdata->battery_desc.name, - PTR_ERR(new_supply)); + "failed to register battery '%s' (reason: %pe)\n", + drvdata->battery_desc.name, new_supply); return; } -- cgit From 9d7b89a1028230315a8999cfea7795fbe84f62cc Mon Sep 17 00:00:00 2001 From: Tomasz Pakuła Date: Mon, 3 Nov 2025 21:02:43 +0100 Subject: HID: pidff: Fix needs_playback check MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit A small bug made it's way here when rewriting code to Linux quality. Currently, if an effect is not infinite and a program requests it's playback with the same number of loops, the play command won't be fired and if an effect is infinite, the spam will continue. We want every playback update for non-infinite effects and only some for infinite (detecting when a program requests stop with 0 which will be different than previous value which is usually 1 or 255). Signed-off-by: Tomasz Pakuła Signed-off-by: Jiri Kosina --- drivers/hid/usbhid/hid-pidff.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hid/usbhid/hid-pidff.c b/drivers/hid/usbhid/hid-pidff.c index edd61ef50e16..95377c5f6335 100644 --- a/drivers/hid/usbhid/hid-pidff.c +++ b/drivers/hid/usbhid/hid-pidff.c @@ -806,8 +806,8 @@ static int pidff_request_effect_upload(struct pidff_device *pidff, int efnum) static int pidff_needs_playback(struct pidff_device *pidff, int effect_id, int n) { - return pidff->effect[effect_id].is_infinite || - pidff->effect[effect_id].loop_count != n; + return !pidff->effect[effect_id].is_infinite || + pidff->effect[effect_id].loop_count != n; } /* -- cgit From 8513c154f8ad7097653dd9bf43d6155e5aad4ab3 Mon Sep 17 00:00:00 2001 From: Abdun Nihaal Date: Mon, 10 Nov 2025 22:45:50 +0530 Subject: HID: playstation: Fix memory leak in dualshock4_get_calibration_data() The memory allocated for buf is not freed in the error paths when ps_get_report() fails. Free buf before jumping to transfer_failed label Fixes: 947992c7fa9e ("HID: playstation: DS4: Fix calibration workaround for clone devices") Signed-off-by: Abdun Nihaal Reviewed-by: Silvan Jegen Signed-off-by: Jiri Kosina --- drivers/hid/hid-playstation.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/hid/hid-playstation.c b/drivers/hid/hid-playstation.c index 63f6eb9030d1..128aa6abd10b 100644 --- a/drivers/hid/hid-playstation.c +++ b/drivers/hid/hid-playstation.c @@ -1942,6 +1942,7 @@ static int dualshock4_get_calibration_data(struct dualshock4 *ds4) "Failed to retrieve DualShock4 calibration info: %d\n", ret); ret = -EILSEQ; + kfree(buf); goto transfer_failed; } else { break; @@ -1959,6 +1960,7 @@ static int dualshock4_get_calibration_data(struct dualshock4 *ds4) if (ret) { hid_warn(hdev, "Failed to retrieve DualShock4 calibration info: %d\n", ret); + kfree(buf); goto transfer_failed; } } -- cgit From a78eb69d60ce893de48dd75f725ba21309131fc2 Mon Sep 17 00:00:00 2001 From: Abdun Nihaal Date: Mon, 10 Nov 2025 22:59:41 +0530 Subject: HID: uclogic: Fix potential memory leak in error path In uclogic_params_ugee_v2_init_event_hooks(), the memory allocated for event_hook is not freed in the next error path. Fix that by freeing it. Fixes: a251d6576d2a ("HID: uclogic: Handle wireless device reconnection") Signed-off-by: Abdun Nihaal Signed-off-by: Jiri Kosina --- drivers/hid/hid-uclogic-params.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/hid/hid-uclogic-params.c b/drivers/hid/hid-uclogic-params.c index ffa14a4621ef..4c4bac6f792b 100644 --- a/drivers/hid/hid-uclogic-params.c +++ b/drivers/hid/hid-uclogic-params.c @@ -1369,8 +1369,10 @@ static int uclogic_params_ugee_v2_init_event_hooks(struct hid_device *hdev, event_hook->hdev = hdev; event_hook->size = ARRAY_SIZE(reconnect_event); event_hook->event = kmemdup(reconnect_event, event_hook->size, GFP_KERNEL); - if (!event_hook->event) + if (!event_hook->event) { + kfree(event_hook); return -ENOMEM; + } list_add_tail(&event_hook->list, &p->event_hooks->list); -- cgit From 118082368c2b6ddefe6cb607efc312285148f044 Mon Sep 17 00:00:00 2001 From: Emil Tantilov Date: Mon, 13 Oct 2025 08:08:24 -0700 Subject: idpf: fix possible vport_config NULL pointer deref in remove Attempting to remove the driver will cause a crash in cases where the vport failed to initialize. Following trace is from an instance where the driver failed during an attempt to create a VF: [ 1661.543624] idpf 0000:84:00.7: Device HW Reset initiated [ 1722.923726] idpf 0000:84:00.7: Transaction timed-out (op:1 cookie:2900 vc_op:1 salt:29 timeout:60000ms) [ 1723.353263] BUG: kernel NULL pointer dereference, address: 0000000000000028 ... [ 1723.358472] RIP: 0010:idpf_remove+0x11c/0x200 [idpf] ... [ 1723.364973] Call Trace: [ 1723.365475] [ 1723.365972] pci_device_remove+0x42/0xb0 [ 1723.366481] device_release_driver_internal+0x1a9/0x210 [ 1723.366987] pci_stop_bus_device+0x6d/0x90 [ 1723.367488] pci_stop_and_remove_bus_device+0x12/0x20 [ 1723.367971] pci_iov_remove_virtfn+0xbd/0x120 [ 1723.368309] sriov_disable+0x34/0xe0 [ 1723.368643] idpf_sriov_configure+0x58/0x140 [idpf] [ 1723.368982] sriov_numvfs_store+0xda/0x1c0 Avoid the NULL pointer dereference by adding NULL pointer check for vport_config[i], before freeing user_config.q_coalesce. Fixes: e1e3fec3e34b ("idpf: preserve coalescing settings across resets") Signed-off-by: Emil Tantilov Reviewed-by: Chittim Madhu Reviewed-by: Simon Horman Tested-by: Samuel Salin Reviewed-by: Aleksandr Loktionov Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf_main.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/intel/idpf/idpf_main.c b/drivers/net/ethernet/intel/idpf/idpf_main.c index 8c46481d2e1f..8cf4ff697572 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_main.c +++ b/drivers/net/ethernet/intel/idpf/idpf_main.c @@ -63,6 +63,8 @@ destroy_wqs: destroy_workqueue(adapter->vc_event_wq); for (i = 0; i < adapter->max_vports; i++) { + if (!adapter->vport_config[i]) + continue; kfree(adapter->vport_config[i]->user_config.q_coalesce); kfree(adapter->vport_config[i]); adapter->vport_config[i] = NULL; -- cgit From 7a601324ac9828468291151d220edb47a6a82449 Mon Sep 17 00:00:00 2001 From: Andreas Kemnade Date: Tue, 18 Nov 2025 11:26:52 -0800 Subject: MAINTAINERS: sync omap devicetree maintainers with omap platform Both used to go through Tony's branches, so lets keep things together. This was missed at the time when Co-Maintainers were added. Signed-off-by: Andreas Kemnade Acked-by: Aaro Koskinen Acked-by: Tony Lindgren Reviewed-by: Roger Quadros Acked-by: Kevin Hilman Link: https://patch.msgid.link/20240915195321.1071967-1-andreas@kemnade.info Signed-off-by: Kevin Hilman Link: https://lore.kernel.org/r/20251118192652.316198-1-khilman@baylibre.com Signed-off-by: Arnd Bergmann --- MAINTAINERS | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 29f86c9aa27b..e661829ff723 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -18779,6 +18779,10 @@ S: Maintained F: arch/arm/*omap*/*clock* OMAP DEVICE TREE SUPPORT +M: Aaro Koskinen +M: Andreas Kemnade +M: Kevin Hilman +M: Roger Quadros M: Tony Lindgren L: linux-omap@vger.kernel.org L: devicetree@vger.kernel.org -- cgit From 23a5b9b12de9dcd15ebae4f1abc8814ec1c51ab0 Mon Sep 17 00:00:00 2001 From: Grzegorz Nitka Date: Mon, 20 Oct 2025 12:02:16 +0200 Subject: ice: fix PTP cleanup on driver removal in error path Improve the cleanup on releasing PTP resources in error path. The error case might happen either at the driver probe and PTP feature initialization or on PTP restart (errors in reset handling, NVM update etc). In both cases, calls to PF PTP cleanup (ice_ptp_cleanup_pf function) and 'ps_lock' mutex deinitialization were missed. Additionally, ptp clock was not unregistered in the latter case. Keep PTP state as 'uninitialized' on init to distinguish between error scenarios and to avoid resource release duplication at driver removal. The consequence of missing ice_ptp_cleanup_pf call is the following call trace dumped when ice_adapter object is freed (port list is not empty, as it is required at this stage): [ T93022] ------------[ cut here ]------------ [ T93022] WARNING: CPU: 10 PID: 93022 at ice/ice_adapter.c:67 ice_adapter_put+0xef/0x100 [ice] ... [ T93022] RIP: 0010:ice_adapter_put+0xef/0x100 [ice] ... [ T93022] Call Trace: [ T93022] [ T93022] ? ice_adapter_put+0xef/0x100 [ice 33d2647ad4f6d866d41eefff1806df37c68aef0c] [ T93022] ? __warn.cold+0xb0/0x10e [ T93022] ? ice_adapter_put+0xef/0x100 [ice 33d2647ad4f6d866d41eefff1806df37c68aef0c] [ T93022] ? report_bug+0xd8/0x150 [ T93022] ? handle_bug+0xe9/0x110 [ T93022] ? exc_invalid_op+0x17/0x70 [ T93022] ? asm_exc_invalid_op+0x1a/0x20 [ T93022] ? ice_adapter_put+0xef/0x100 [ice 33d2647ad4f6d866d41eefff1806df37c68aef0c] [ T93022] pci_device_remove+0x42/0xb0 [ T93022] device_release_driver_internal+0x19f/0x200 [ T93022] driver_detach+0x48/0x90 [ T93022] bus_remove_driver+0x70/0xf0 [ T93022] pci_unregister_driver+0x42/0xb0 [ T93022] ice_module_exit+0x10/0xdb0 [ice 33d2647ad4f6d866d41eefff1806df37c68aef0c] ... [ T93022] ---[ end trace 0000000000000000 ]--- [ T93022] ice: module unloaded Fixes: e800654e85b5 ("ice: Use ice_adapter for PTP shared data instead of auxdev") Signed-off-by: Grzegorz Nitka Reviewed-by: Aleksandr Loktionov Reviewed-by: Paul Menzel Tested-by: Rinitha S (A Contingent worker at Intel) Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/ice/ice_ptp.c | 22 +++++++++++++++++++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c index fb0f6365a6d6..8ec0f7d0fceb 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c @@ -3246,7 +3246,7 @@ void ice_ptp_init(struct ice_pf *pf) err = ice_ptp_init_port(pf, &ptp->port); if (err) - goto err_exit; + goto err_clean_pf; /* Start the PHY timestamping block */ ice_ptp_reset_phy_timestamping(pf); @@ -3263,13 +3263,19 @@ void ice_ptp_init(struct ice_pf *pf) dev_info(ice_pf_to_dev(pf), "PTP init successful\n"); return; +err_clean_pf: + mutex_destroy(&ptp->port.ps_lock); + ice_ptp_cleanup_pf(pf); err_exit: /* If we registered a PTP clock, release it */ if (pf->ptp.clock) { ptp_clock_unregister(ptp->clock); pf->ptp.clock = NULL; } - ptp->state = ICE_PTP_ERROR; + /* Keep ICE_PTP_UNINIT state to avoid ambiguity at driver unload + * and to avoid duplicated resources release. + */ + ptp->state = ICE_PTP_UNINIT; dev_err(ice_pf_to_dev(pf), "PTP failed %d\n", err); } @@ -3282,9 +3288,19 @@ err_exit: */ void ice_ptp_release(struct ice_pf *pf) { - if (pf->ptp.state != ICE_PTP_READY) + if (pf->ptp.state == ICE_PTP_UNINIT) return; + if (pf->ptp.state != ICE_PTP_READY) { + mutex_destroy(&pf->ptp.port.ps_lock); + ice_ptp_cleanup_pf(pf); + if (pf->ptp.clock) { + ptp_clock_unregister(pf->ptp.clock); + pf->ptp.clock = NULL; + } + return; + } + pf->ptp.state = ICE_PTP_UNINIT; /* Disable timestamping for both Tx and Rx */ -- cgit From f94c1a114ac209977bdf5ca841b98424295ab1f0 Mon Sep 17 00:00:00 2001 From: Shay Drory Date: Mon, 17 Nov 2025 14:05:49 +0200 Subject: devlink: rate: Unset parent pointer in devl_rate_nodes_destroy The function devl_rate_nodes_destroy is documented to "Unset parent for all rate objects". However, it was only calling the driver-specific `rate_leaf_parent_set` or `rate_node_parent_set` ops and decrementing the parent's refcount, without actually setting the `devlink_rate->parent` pointer to NULL. This leaves a dangling pointer in the `devlink_rate` struct, which cause refcount error in netdevsim[1] and mlx5[2]. In addition, this is inconsistent with the behavior of `devlink_nl_rate_parent_node_set`, where the parent pointer is correctly cleared. This patch fixes the issue by explicitly setting `devlink_rate->parent` to NULL after notifying the driver, thus fulfilling the function's documented behavior for all rate objects. [1] repro steps: echo 1 > /sys/bus/netdevsim/new_device devlink dev eswitch set netdevsim/netdevsim1 mode switchdev echo 1 > /sys/bus/netdevsim/devices/netdevsim1/sriov_numvfs devlink port function rate add netdevsim/netdevsim1/test_node devlink port function rate set netdevsim/netdevsim1/128 parent test_node echo 1 > /sys/bus/netdevsim/del_device dmesg: refcount_t: decrement hit 0; leaking memory. WARNING: CPU: 8 PID: 1530 at lib/refcount.c:31 refcount_warn_saturate+0x42/0xe0 CPU: 8 UID: 0 PID: 1530 Comm: bash Not tainted 6.18.0-rc4+ #1 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:refcount_warn_saturate+0x42/0xe0 Call Trace: devl_rate_leaf_destroy+0x8d/0x90 __nsim_dev_port_del+0x6c/0x70 [netdevsim] nsim_dev_reload_destroy+0x11c/0x140 [netdevsim] nsim_drv_remove+0x2b/0xb0 [netdevsim] device_release_driver_internal+0x194/0x1f0 bus_remove_device+0xc6/0x130 device_del+0x159/0x3c0 device_unregister+0x1a/0x60 del_device_store+0x111/0x170 [netdevsim] kernfs_fop_write_iter+0x12e/0x1e0 vfs_write+0x215/0x3d0 ksys_write+0x5f/0xd0 do_syscall_64+0x55/0x10f0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 [2] devlink dev eswitch set pci/0000:08:00.0 mode switchdev devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 1000 devlink port function rate add pci/0000:08:00.0/group1 devlink port function rate set pci/0000:08:00.0/32768 parent group1 modprobe -r mlx5_ib mlx5_fwctl mlx5_core dmesg: refcount_t: decrement hit 0; leaking memory. WARNING: CPU: 7 PID: 16151 at lib/refcount.c:31 refcount_warn_saturate+0x42/0xe0 CPU: 7 UID: 0 PID: 16151 Comm: bash Not tainted 6.17.0-rc7_for_upstream_min_debug_2025_10_02_12_44 #1 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 RIP: 0010:refcount_warn_saturate+0x42/0xe0 Call Trace: devl_rate_leaf_destroy+0x8d/0x90 mlx5_esw_offloads_devlink_port_unregister+0x33/0x60 [mlx5_core] mlx5_esw_offloads_unload_rep+0x3f/0x50 [mlx5_core] mlx5_eswitch_unload_sf_vport+0x40/0x90 [mlx5_core] mlx5_sf_esw_event+0xc4/0x120 [mlx5_core] notifier_call_chain+0x33/0xa0 blocking_notifier_call_chain+0x3b/0x50 mlx5_eswitch_disable_locked+0x50/0x110 [mlx5_core] mlx5_eswitch_disable+0x63/0x90 [mlx5_core] mlx5_unload+0x1d/0x170 [mlx5_core] mlx5_uninit_one+0xa2/0x130 [mlx5_core] remove_one+0x78/0xd0 [mlx5_core] pci_device_remove+0x39/0xa0 device_release_driver_internal+0x194/0x1f0 unbind_store+0x99/0xa0 kernfs_fop_write_iter+0x12e/0x1e0 vfs_write+0x215/0x3d0 ksys_write+0x5f/0xd0 do_syscall_64+0x53/0x1f0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fixes: d75559845078 ("devlink: Allow setting parent node of rate objects") Signed-off-by: Shay Drory Reviewed-by: Carolina Jubran Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/1763381149-1234377-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski --- net/devlink/rate.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/devlink/rate.c b/net/devlink/rate.c index 264fb82cba19..d157a8419bca 100644 --- a/net/devlink/rate.c +++ b/net/devlink/rate.c @@ -828,13 +828,15 @@ void devl_rate_nodes_destroy(struct devlink *devlink) if (!devlink_rate->parent) continue; - refcount_dec(&devlink_rate->parent->refcnt); if (devlink_rate_is_leaf(devlink_rate)) ops->rate_leaf_parent_set(devlink_rate, NULL, devlink_rate->priv, NULL, NULL); else if (devlink_rate_is_node(devlink_rate)) ops->rate_node_parent_set(devlink_rate, NULL, devlink_rate->priv, NULL, NULL); + + refcount_dec(&devlink_rate->parent->refcnt); + devlink_rate->parent = NULL; } list_for_each_entry_safe(devlink_rate, tmp, &devlink->rate_list, list) { if (devlink_rate_is_node(devlink_rate)) { -- cgit From 426358d9be7ce3518966422f87b96f1bad27295f Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Mon, 17 Nov 2025 10:07:44 +0000 Subject: mptcp: fix a race in mptcp_pm_del_add_timer() mptcp_pm_del_add_timer() can call sk_stop_timer_sync(sk, &entry->add_timer) while another might have free entry already, as reported by syzbot. Add RCU protection to fix this issue. Also change confusing add_timer variable with stop_timer boolean. syzbot report: BUG: KASAN: slab-use-after-free in __timer_delete_sync+0x372/0x3f0 kernel/time/timer.c:1616 Read of size 4 at addr ffff8880311e4150 by task kworker/1:1/44 CPU: 1 UID: 0 PID: 44 Comm: kworker/1:1 Not tainted syzkaller #0 PREEMPT_{RT,(full)} Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025 Workqueue: events mptcp_worker Call Trace: dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xca/0x240 mm/kasan/report.c:482 kasan_report+0x118/0x150 mm/kasan/report.c:595 __timer_delete_sync+0x372/0x3f0 kernel/time/timer.c:1616 sk_stop_timer_sync+0x1b/0x90 net/core/sock.c:3631 mptcp_pm_del_add_timer+0x283/0x310 net/mptcp/pm.c:362 mptcp_incoming_options+0x1357/0x1f60 net/mptcp/options.c:1174 tcp_data_queue+0xca/0x6450 net/ipv4/tcp_input.c:5361 tcp_rcv_established+0x1335/0x2670 net/ipv4/tcp_input.c:6441 tcp_v4_do_rcv+0x98b/0xbf0 net/ipv4/tcp_ipv4.c:1931 tcp_v4_rcv+0x252a/0x2dc0 net/ipv4/tcp_ipv4.c:2374 ip_protocol_deliver_rcu+0x221/0x440 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x3bb/0x6f0 net/ipv4/ip_input.c:239 NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318 NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318 __netif_receive_skb_one_core net/core/dev.c:6079 [inline] __netif_receive_skb+0x143/0x380 net/core/dev.c:6192 process_backlog+0x31e/0x900 net/core/dev.c:6544 __napi_poll+0xb6/0x540 net/core/dev.c:7594 napi_poll net/core/dev.c:7657 [inline] net_rx_action+0x5f7/0xda0 net/core/dev.c:7784 handle_softirqs+0x22f/0x710 kernel/softirq.c:622 __do_softirq kernel/softirq.c:656 [inline] __local_bh_enable_ip+0x1a0/0x2e0 kernel/softirq.c:302 mptcp_pm_send_ack net/mptcp/pm.c:210 [inline] mptcp_pm_addr_send_ack+0x41f/0x500 net/mptcp/pm.c:-1 mptcp_pm_worker+0x174/0x320 net/mptcp/pm.c:1002 mptcp_worker+0xd5/0x1170 net/mptcp/protocol.c:2762 process_one_work kernel/workqueue.c:3263 [inline] process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3346 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3427 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 Allocated by task 44: kasan_save_stack mm/kasan/common.c:56 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:77 poison_kmalloc_redzone mm/kasan/common.c:400 [inline] __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:417 kasan_kmalloc include/linux/kasan.h:262 [inline] __kmalloc_cache_noprof+0x1ef/0x6c0 mm/slub.c:5748 kmalloc_noprof include/linux/slab.h:957 [inline] mptcp_pm_alloc_anno_list+0x104/0x460 net/mptcp/pm.c:385 mptcp_pm_create_subflow_or_signal_addr+0xf9d/0x1360 net/mptcp/pm_kernel.c:355 mptcp_pm_nl_fully_established net/mptcp/pm_kernel.c:409 [inline] __mptcp_pm_kernel_worker+0x417/0x1ef0 net/mptcp/pm_kernel.c:1529 mptcp_pm_worker+0x1ee/0x320 net/mptcp/pm.c:1008 mptcp_worker+0xd5/0x1170 net/mptcp/protocol.c:2762 process_one_work kernel/workqueue.c:3263 [inline] process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3346 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3427 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 Freed by task 6630: kasan_save_stack mm/kasan/common.c:56 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:77 __kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:587 kasan_save_free_info mm/kasan/kasan.h:406 [inline] poison_slab_object mm/kasan/common.c:252 [inline] __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:284 kasan_slab_free include/linux/kasan.h:234 [inline] slab_free_hook mm/slub.c:2523 [inline] slab_free mm/slub.c:6611 [inline] kfree+0x197/0x950 mm/slub.c:6818 mptcp_remove_anno_list_by_saddr+0x2d/0x40 net/mptcp/pm.c:158 mptcp_pm_flush_addrs_and_subflows net/mptcp/pm_kernel.c:1209 [inline] mptcp_nl_flush_addrs_list net/mptcp/pm_kernel.c:1240 [inline] mptcp_pm_nl_flush_addrs_doit+0x593/0xbb0 net/mptcp/pm_kernel.c:1281 genl_family_rcv_msg_doit+0x215/0x300 net/netlink/genetlink.c:1115 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] genl_rcv_msg+0x60e/0x790 net/netlink/genetlink.c:1210 netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2552 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219 netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline] netlink_unicast+0x846/0xa10 net/netlink/af_netlink.c:1346 netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1896 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg+0x21c/0x270 net/socket.c:742 ____sys_sendmsg+0x508/0x820 net/socket.c:2630 ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2684 __sys_sendmsg net/socket.c:2716 [inline] __do_sys_sendmsg net/socket.c:2721 [inline] __se_sys_sendmsg net/socket.c:2719 [inline] __x64_sys_sendmsg+0x1a1/0x260 net/socket.c:2719 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Cc: stable@vger.kernel.org Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout") Reported-by: syzbot+2a6fbf0f0530375968df@syzkaller.appspotmail.com Closes: https://lore.kernel.org/691ad3c3.a70a0220.f6df1.0004.GAE@google.com Signed-off-by: Eric Dumazet Cc: Geliang Tang Reviewed-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251117100745.1913963-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- net/mptcp/pm.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c index 2ff1b9499568..9604b91902b8 100644 --- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -18,6 +18,7 @@ struct mptcp_pm_add_entry { u8 retrans_times; struct timer_list add_timer; struct mptcp_sock *sock; + struct rcu_head rcu; }; static DEFINE_SPINLOCK(mptcp_pm_list_lock); @@ -155,7 +156,7 @@ bool mptcp_remove_anno_list_by_saddr(struct mptcp_sock *msk, entry = mptcp_pm_del_add_timer(msk, addr, false); ret = entry; - kfree(entry); + kfree_rcu(entry, rcu); return ret; } @@ -345,22 +346,27 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk, { struct mptcp_pm_add_entry *entry; struct sock *sk = (struct sock *)msk; - struct timer_list *add_timer = NULL; + bool stop_timer = false; + + rcu_read_lock(); spin_lock_bh(&msk->pm.lock); entry = mptcp_lookup_anno_list_by_saddr(msk, addr); if (entry && (!check_id || entry->addr.id == addr->id)) { entry->retrans_times = ADD_ADDR_RETRANS_MAX; - add_timer = &entry->add_timer; + stop_timer = true; } if (!check_id && entry) list_del(&entry->list); spin_unlock_bh(&msk->pm.lock); - /* no lock, because sk_stop_timer_sync() is calling timer_delete_sync() */ - if (add_timer) - sk_stop_timer_sync(sk, add_timer); + /* Note: entry might have been removed by another thread. + * We hold rcu_read_lock() to ensure it is not freed under us. + */ + if (stop_timer) + sk_stop_timer_sync(sk, &entry->add_timer); + rcu_read_unlock(); return entry; } @@ -415,7 +421,7 @@ static void mptcp_pm_free_anno_list(struct mptcp_sock *msk) list_for_each_entry_safe(entry, tmp, &free_list, list) { sk_stop_timer_sync(sk, &entry->add_timer); - kfree(entry); + kfree_rcu(entry, rcu); } } -- cgit From d47515af6cccd7484d8b0870376858c9848a18ec Mon Sep 17 00:00:00 2001 From: Pradyumn Rahar Date: Mon, 17 Nov 2025 14:16:08 +0200 Subject: net/mlx5: Clean up only new IRQ glue on request_irq() failure The mlx5_irq_alloc() function can inadvertently free the entire rmap and end up in a crash[1] when the other threads tries to access this, when request_irq() fails due to exhausted IRQ vectors. This commit modifies the cleanup to remove only the specific IRQ mapping that was just added. This prevents removal of other valid mappings and ensures precise cleanup of the failed IRQ allocation's associated glue object. Note: This error is observed when both fwctl and rds configs are enabled. [1] mlx5_core 0000:05:00.0: Successfully registered panic handler for port 1 mlx5_core 0000:05:00.0: mlx5_irq_alloc:293:(pid 66740): Failed to request irq. err = -28 infiniband mlx5_0: mlx5_ib_test_wc:290:(pid 66740): Error -28 while trying to test write-combining support mlx5_core 0000:05:00.0: Successfully unregistered panic handler for port 1 mlx5_core 0000:06:00.0: Successfully registered panic handler for port 1 mlx5_core 0000:06:00.0: mlx5_irq_alloc:293:(pid 66740): Failed to request irq. err = -28 infiniband mlx5_0: mlx5_ib_test_wc:290:(pid 66740): Error -28 while trying to test write-combining support mlx5_core 0000:06:00.0: Successfully unregistered panic handler for port 1 mlx5_core 0000:03:00.0: mlx5_irq_alloc:293:(pid 28895): Failed to request irq. err = -28 mlx5_core 0000:05:00.0: mlx5_irq_alloc:293:(pid 28895): Failed to request irq. err = -28 general protection fault, probably for non-canonical address 0xe277a58fde16f291: 0000 [#1] SMP NOPTI RIP: 0010:free_irq_cpu_rmap+0x23/0x7d Call Trace: ? show_trace_log_lvl+0x1d6/0x2f9 ? show_trace_log_lvl+0x1d6/0x2f9 ? mlx5_irq_alloc.cold+0x5d/0xf3 [mlx5_core] ? __die_body.cold+0x8/0xa ? die_addr+0x39/0x53 ? exc_general_protection+0x1c4/0x3e9 ? dev_vprintk_emit+0x5f/0x90 ? asm_exc_general_protection+0x22/0x27 ? free_irq_cpu_rmap+0x23/0x7d mlx5_irq_alloc.cold+0x5d/0xf3 [mlx5_core] irq_pool_request_vector+0x7d/0x90 [mlx5_core] mlx5_irq_request+0x2e/0xe0 [mlx5_core] mlx5_irq_request_vector+0xad/0xf7 [mlx5_core] comp_irq_request_pci+0x64/0xf0 [mlx5_core] create_comp_eq+0x71/0x385 [mlx5_core] ? mlx5e_open_xdpsq+0x11c/0x230 [mlx5_core] mlx5_comp_eqn_get+0x72/0x90 [mlx5_core] ? xas_load+0x8/0x91 mlx5_comp_irqn_get+0x40/0x90 [mlx5_core] mlx5e_open_channel+0x7d/0x3c7 [mlx5_core] mlx5e_open_channels+0xad/0x250 [mlx5_core] mlx5e_open_locked+0x3e/0x110 [mlx5_core] mlx5e_open+0x23/0x70 [mlx5_core] __dev_open+0xf1/0x1a5 __dev_change_flags+0x1e1/0x249 dev_change_flags+0x21/0x5c do_setlink+0x28b/0xcc4 ? __nla_parse+0x22/0x3d ? inet6_validate_link_af+0x6b/0x108 ? cpumask_next+0x1f/0x35 ? __snmp6_fill_stats64.constprop.0+0x66/0x107 ? __nla_validate_parse+0x48/0x1e6 __rtnl_newlink+0x5ff/0xa57 ? kmem_cache_alloc_trace+0x164/0x2ce rtnl_newlink+0x44/0x6e rtnetlink_rcv_msg+0x2bb/0x362 ? __netlink_sendskb+0x4c/0x6c ? netlink_unicast+0x28f/0x2ce ? rtnl_calcit.isra.0+0x150/0x146 netlink_rcv_skb+0x5f/0x112 netlink_unicast+0x213/0x2ce netlink_sendmsg+0x24f/0x4d9 __sock_sendmsg+0x65/0x6a ____sys_sendmsg+0x28f/0x2c9 ? import_iovec+0x17/0x2b ___sys_sendmsg+0x97/0xe0 __sys_sendmsg+0x81/0xd8 do_syscall_64+0x35/0x87 entry_SYSCALL_64_after_hwframe+0x6e/0x0 RIP: 0033:0x7fc328603727 Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 0b ed ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 44 ed ff ff 48 RSP: 002b:00007ffe8eb3f1a0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fc328603727 RDX: 0000000000000000 RSI: 00007ffe8eb3f1f0 RDI: 000000000000000d RBP: 00007ffe8eb3f1f0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000 R13: 0000000000000000 R14: 00007ffe8eb3f3c8 R15: 00007ffe8eb3f3bc ---[ end trace f43ce73c3c2b13a2 ]--- RIP: 0010:free_irq_cpu_rmap+0x23/0x7d Code: 0f 1f 80 00 00 00 00 48 85 ff 74 6b 55 48 89 fd 53 66 83 7f 06 00 74 24 31 db 48 8b 55 08 0f b7 c3 48 8b 04 c2 48 85 c0 74 09 <8b> 38 31 f6 e8 c4 0a b8 ff 83 c3 01 66 3b 5d 06 72 de b8 ff ff ff RSP: 0018:ff384881640eaca0 EFLAGS: 00010282 RAX: e277a58fde16f291 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ff2335e2e20b3600 RSI: 0000000000000000 RDI: ff2335e2e20b3400 RBP: ff2335e2e20b3400 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 00000000ffffffe4 R12: ff384881640ead88 R13: ff2335c3760751e0 R14: ff2335e2e1672200 R15: ff2335c3760751f8 FS: 00007fc32ac22480(0000) GS:ff2335e2d6e00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f651ab54000 CR3: 00000029f1206003 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Kernel panic - not syncing: Fatal exception Kernel Offset: 0x1dc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) kvm-guest: disable async PF for cpu 0 Fixes: 3354822cde5a ("net/mlx5: Use dynamic msix vectors allocation") Signed-off-by: Mohith Kumar Thummaluru Tested-by: Mohith Kumar Thummaluru Reviewed-by: Moshe Shemesh Reviewed-by: Shay Drori Signed-off-by: Pradyumn Rahar Reviewed-by: Jacob Keller Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/1763381768-1234998-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c index e18a850c615c..aa3b5878e3da 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c @@ -324,10 +324,8 @@ err_xa: free_irq(irq->map.virq, &irq->nh); err_req_irq: #ifdef CONFIG_RFS_ACCEL - if (i && rmap && *rmap) { - free_irq_cpu_rmap(*rmap); - *rmap = NULL; - } + if (i && rmap && *rmap) + irq_cpu_rmap_remove(*rmap, irq->map.virq); err_irq_rmap: #endif if (i && pci_msix_can_alloc_dyn(dev->pdev)) -- cgit From 7bf3a476ce43833c49fceddbe94ff3472e04e9bc Mon Sep 17 00:00:00 2001 From: Kuniyuki Iwashima Date: Mon, 17 Nov 2025 17:47:10 +0000 Subject: af_unix: Read sk_peek_offset() again after sleeping in unix_stream_read_generic(). Miao Wang reported a bug of SO_PEEK_OFF on AF_UNIX SOCK_STREAM socket. The unexpected behaviour is triggered when the peek offset is larger than the recv queue and the thread is unblocked by new data. Let's assume a socket which has "aaaa" in the recv queue and the peek offset is 4. First, unix_stream_read_generic() reads the offset 4 and skips the skb(s) of "aaaa" with the code below: skip = max(sk_peek_offset(sk, flags), 0); /* @skip is 4. */ do { ... while (skip >= unix_skb_len(skb)) { skip -= unix_skb_len(skb); ... skb = skb_peek_next(skb, &sk->sk_receive_queue); if (!skb) goto again; /* @skip is 0. */ } The thread jumps to the 'again' label and goes to sleep since new data has not arrived yet. Later, new data "bbbb" unblocks the thread, and the thread jumps to the 'redo:' label to restart the entire process from the first skb in the recv queue. do { ... redo: ... last = skb = skb_peek(&sk->sk_receive_queue); ... again: if (skb == NULL) { ... timeo = unix_stream_data_wait(sk, timeo, last, last_len, freezable); ... goto redo; /* @skip is 0 !! */ However, the peek offset is not reset in the path. If the buffer size is 8, recv() will return "aaaabbbb" without skipping any data, and the final offset will be 12 (the original offset 4 + peeked skbs' length 8). After sleeping in unix_stream_read_generic(), we have to fetch the peek offset again. Let's move the redo label before mutex_lock(&u->iolock). Fixes: 9f389e35674f ("af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag") Reported-by: Miao Wang Closes: https://lore.kernel.org/netdev/3B969F90-F51F-4B9D-AB1A-994D9A54D460@gmail.com/ Signed-off-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20251117174740.3684604-2-kuniyu@google.com Signed-off-by: Jakub Kicinski --- net/unix/af_unix.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 768098dec231..833c3616d2a2 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2954,6 +2954,7 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state, u = unix_sk(sk); +redo: /* Lock the socket to prevent queue disordering * while sleeps in memcpy_tomsg */ @@ -2965,7 +2966,6 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state, struct sk_buff *skb, *last; int chunk; -redo: unix_state_lock(sk); if (sock_flag(sk, SOCK_DEAD)) { err = -ECONNRESET; @@ -3015,7 +3015,6 @@ again: goto out; } - mutex_lock(&u->iolock); goto redo; unlock: unix_state_unlock(sk); -- cgit From e1bb28bf13f41af5d7cc48359d1755cbcda4d502 Mon Sep 17 00:00:00 2001 From: Kuniyuki Iwashima Date: Mon, 17 Nov 2025 17:47:11 +0000 Subject: selftest: af_unix: Add test for SO_PEEK_OFF. The test covers various cases to verify SO_PEEK_OFF behaviour for all AF_UNIX socket types. two_chunks_blocking and two_chunks_overlap_blocking reproduce the issue mentioned in the previous patch. Without the patch, the two tests fail: # RUN so_peek_off.stream.two_chunks_blocking ... # so_peek_off.c:121:two_chunks_blocking:Expected 'bbbb' == 'aaaabbbb'. # two_chunks_blocking: Test terminated by assertion # FAIL so_peek_off.stream.two_chunks_blocking not ok 3 so_peek_off.stream.two_chunks_blocking # RUN so_peek_off.stream.two_chunks_overlap_blocking ... # so_peek_off.c:159:two_chunks_overlap_blocking:Expected 'bbbb' == 'aaaabbbb'. # two_chunks_overlap_blocking: Test terminated by assertion # FAIL so_peek_off.stream.two_chunks_overlap_blocking not ok 5 so_peek_off.stream.two_chunks_overlap_blocking With the patch, all tests pass: # PASSED: 15 / 15 tests passed. # Totals: pass:15 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20251117174740.3684604-3-kuniyu@google.com Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/.gitignore | 1 + tools/testing/selftests/net/af_unix/Makefile | 1 + tools/testing/selftests/net/af_unix/so_peek_off.c | 162 ++++++++++++++++++++++ 3 files changed, 164 insertions(+) create mode 100644 tools/testing/selftests/net/af_unix/so_peek_off.c diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore index 439101b518ee..8f9850a71f54 100644 --- a/tools/testing/selftests/net/.gitignore +++ b/tools/testing/selftests/net/.gitignore @@ -45,6 +45,7 @@ skf_net_off socket so_incoming_cpu so_netns_cookie +so_peek_off so_txtime so_rcv_listener stress_reuseport_listen diff --git a/tools/testing/selftests/net/af_unix/Makefile b/tools/testing/selftests/net/af_unix/Makefile index de805cbbdf69..528d14c598bb 100644 --- a/tools/testing/selftests/net/af_unix/Makefile +++ b/tools/testing/selftests/net/af_unix/Makefile @@ -6,6 +6,7 @@ TEST_GEN_PROGS := \ scm_inq \ scm_pidfd \ scm_rights \ + so_peek_off \ unix_connect \ # end of TEST_GEN_PROGS diff --git a/tools/testing/selftests/net/af_unix/so_peek_off.c b/tools/testing/selftests/net/af_unix/so_peek_off.c new file mode 100644 index 000000000000..1a77728128e5 --- /dev/null +++ b/tools/testing/selftests/net/af_unix/so_peek_off.c @@ -0,0 +1,162 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright 2025 Google LLC */ + +#include +#include + +#include + +#include "../../kselftest_harness.h" + +FIXTURE(so_peek_off) +{ + int fd[2]; /* 0: sender, 1: receiver */ +}; + +FIXTURE_VARIANT(so_peek_off) +{ + int type; +}; + +FIXTURE_VARIANT_ADD(so_peek_off, stream) +{ + .type = SOCK_STREAM, +}; + +FIXTURE_VARIANT_ADD(so_peek_off, dgram) +{ + .type = SOCK_DGRAM, +}; + +FIXTURE_VARIANT_ADD(so_peek_off, seqpacket) +{ + .type = SOCK_SEQPACKET, +}; + +FIXTURE_SETUP(so_peek_off) +{ + struct timeval timeout = { + .tv_sec = 0, + .tv_usec = 3000, + }; + int ret; + + ret = socketpair(AF_UNIX, variant->type, 0, self->fd); + ASSERT_EQ(0, ret); + + ret = setsockopt(self->fd[1], SOL_SOCKET, SO_RCVTIMEO_NEW, + &timeout, sizeof(timeout)); + ASSERT_EQ(0, ret); + + ret = setsockopt(self->fd[1], SOL_SOCKET, SO_PEEK_OFF, + &(int){0}, sizeof(int)); + ASSERT_EQ(0, ret); +} + +FIXTURE_TEARDOWN(so_peek_off) +{ + close_range(self->fd[0], self->fd[1], 0); +} + +#define sendeq(fd, str, flags) \ + do { \ + int bytes, len = strlen(str); \ + \ + bytes = send(fd, str, len, flags); \ + ASSERT_EQ(len, bytes); \ + } while (0) + +#define recveq(fd, str, buflen, flags) \ + do { \ + char buf[(buflen) + 1] = {}; \ + int bytes; \ + \ + bytes = recv(fd, buf, buflen, flags); \ + ASSERT_NE(-1, bytes); \ + ASSERT_STREQ(str, buf); \ + } while (0) + +#define async \ + for (pid_t pid = (pid = fork(), \ + pid < 0 ? \ + __TH_LOG("Failed to start async {}"), \ + _metadata->exit_code = KSFT_FAIL, \ + __bail(1, _metadata), \ + 0xdead : \ + pid); \ + !pid; exit(0)) + +TEST_F(so_peek_off, single_chunk) +{ + sendeq(self->fd[0], "aaaabbbb", 0); + + recveq(self->fd[1], "aaaa", 4, MSG_PEEK); + recveq(self->fd[1], "bbbb", 100, MSG_PEEK); +} + +TEST_F(so_peek_off, two_chunks) +{ + sendeq(self->fd[0], "aaaa", 0); + sendeq(self->fd[0], "bbbb", 0); + + recveq(self->fd[1], "aaaa", 4, MSG_PEEK); + recveq(self->fd[1], "bbbb", 100, MSG_PEEK); +} + +TEST_F(so_peek_off, two_chunks_blocking) +{ + async { + usleep(1000); + sendeq(self->fd[0], "aaaa", 0); + } + + recveq(self->fd[1], "aaaa", 4, MSG_PEEK); + + async { + usleep(1000); + sendeq(self->fd[0], "bbbb", 0); + } + + /* goto again; -> goto redo; in unix_stream_read_generic(). */ + recveq(self->fd[1], "bbbb", 100, MSG_PEEK); +} + +TEST_F(so_peek_off, two_chunks_overlap) +{ + sendeq(self->fd[0], "aaaa", 0); + recveq(self->fd[1], "aa", 2, MSG_PEEK); + + sendeq(self->fd[0], "bbbb", 0); + + if (variant->type == SOCK_STREAM) { + /* SOCK_STREAM tries to fill the buffer. */ + recveq(self->fd[1], "aabb", 4, MSG_PEEK); + recveq(self->fd[1], "bb", 100, MSG_PEEK); + } else { + /* SOCK_DGRAM and SOCK_SEQPACKET returns at the skb boundary. */ + recveq(self->fd[1], "aa", 100, MSG_PEEK); + recveq(self->fd[1], "bbbb", 100, MSG_PEEK); + } +} + +TEST_F(so_peek_off, two_chunks_overlap_blocking) +{ + async { + usleep(1000); + sendeq(self->fd[0], "aaaa", 0); + } + + recveq(self->fd[1], "aa", 2, MSG_PEEK); + + async { + usleep(1000); + sendeq(self->fd[0], "bbbb", 0); + } + + /* Even SOCK_STREAM does not wait if at least one byte is read. */ + recveq(self->fd[1], "aa", 100, MSG_PEEK); + + recveq(self->fd[1], "bbbb", 100, MSG_PEEK); +} + +TEST_HARNESS_MAIN -- cgit From e31a11be41cd134f245c01d1329e7bc89aba78fb Mon Sep 17 00:00:00 2001 From: Wei Fang Date: Mon, 17 Nov 2025 18:29:43 +0800 Subject: net: phylink: add missing supported link modes for the fixed-link Pause, Asym_Pause and Autoneg bits are not set when pl->supported is initialized, so these link modes will not work for the fixed-link. This leads to a TCP performance degradation issue observed on the i.MX943 platform. The switch CPU port of i.MX943 is connected to an ENETC MAC, this link is a fixed link and the link speed is 2.5Gbps. And one of the switch user ports is the RGMII interface, and its link speed is 1Gbps. If the flow-control of the fixed link is not enabled, we can easily observe the iperf performance of TCP packets is very low. Because the inbound rate on the CPU port is greater than the outbound rate on the user port, the switch is prone to congestion, leading to the loss of some TCP packets and requiring multiple retransmissions. Solving this problem should be as simple as setting the Asym_Pause and Pause bits. The reason why the Autoneg bit needs to be set, Russell has gave a very good explanation in the thread [1], see below. "As the advertising and lp_advertising bitmasks have to be non-empty, and the swphy reports aneg capable, aneg complete, and AN enabled, then for consistency with that state, Autoneg should be set. This is how it was prior to the blamed commit." Fixes: de7d3f87be3c ("net: phylink: Use phy_caps_lookup for fixed-link configuration") Link: https://lore.kernel.org/aRjqLN8eQDIQfBjS@shell.armlinux.org.uk # [1] Signed-off-by: Wei Fang Reviewed-by: Maxime Chevallier Link: https://patch.msgid.link/20251117102943.1862680-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski --- drivers/net/phy/phylink.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c index 9d7799ea1c17..918244308215 100644 --- a/drivers/net/phy/phylink.c +++ b/drivers/net/phy/phylink.c @@ -637,6 +637,9 @@ static int phylink_validate(struct phylink *pl, unsigned long *supported, static void phylink_fill_fixedlink_supported(unsigned long *supported) { + linkmode_set_bit(ETHTOOL_LINK_MODE_Pause_BIT, supported); + linkmode_set_bit(ETHTOOL_LINK_MODE_Asym_Pause_BIT, supported); + linkmode_set_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, supported); linkmode_set_bit(ETHTOOL_LINK_MODE_10baseT_Half_BIT, supported); linkmode_set_bit(ETHTOOL_LINK_MODE_10baseT_Full_BIT, supported); linkmode_set_bit(ETHTOOL_LINK_MODE_100baseT_Half_BIT, supported); -- cgit From e837b9091b277ae6f309d7e9fc93cb0308cf461f Mon Sep 17 00:00:00 2001 From: Bitterblue Smith Date: Fri, 14 Nov 2025 00:54:48 +0200 Subject: wifi: rtw89: hw_scan: Don't let the operating channel be last Scanning can be offloaded to the firmware. To that end, the driver prepares a list of channels to scan, including periodic visits back to the operating channel, and sends the list to the firmware. When the channel list is too long to fit in a single H2C message, the driver splits the list, sends the first part, and tells the firmware to scan. When the scan is complete, the driver sends the next part of the list and tells the firmware to scan. When the last channel that fit in the H2C message is the operating channel something seems to go wrong in the firmware. It will acknowledge receiving the list of channels but apparently it will not do anything more. The AP can't be pinged anymore. The driver still receives beacons, though. One way to avoid this is to split the list of channels before the operating channel. Affected devices: * RTL8851BU with firmware 0.29.41.3 * RTL8832BU with firmware 0.29.29.8 * RTL8852BE with firmware 0.29.29.8 The commit 57a5fbe39a18 ("wifi: rtw89: refactor flow that hw scan handles channel list") is found by git blame, but it is actually to refine the scan flow, but not a culprit, so skip Fixes tag. Reported-by: Bitterblue Smith Closes: https://lore.kernel.org/linux-wireless/0abbda91-c5c2-4007-84c8-215679e652e1@gmail.com/ Cc: stable@vger.kernel.org # 6.16+ Signed-off-by: Bitterblue Smith Acked-by: Ping-Ke Shih Signed-off-by: Ping-Ke Shih Link: https://patch.msgid.link/c1e61744-8db4-4646-867f-241b47d30386@gmail.com --- drivers/net/wireless/realtek/rtw89/fw.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/wireless/realtek/rtw89/fw.c b/drivers/net/wireless/realtek/rtw89/fw.c index ab904a7def1b..080c4f8a655a 100644 --- a/drivers/net/wireless/realtek/rtw89/fw.c +++ b/drivers/net/wireless/realtek/rtw89/fw.c @@ -7694,6 +7694,13 @@ int rtw89_hw_scan_add_chan_list_ax(struct rtw89_dev *rtwdev, INIT_LIST_HEAD(&list); list_for_each_entry_safe(ch_info, tmp, &scan_info->chan_list, list) { + /* The operating channel (tx_null == true) should + * not be last in the list, to avoid breaking + * RTL8851BU and RTL8832BU. + */ + if (list_len + 1 == RTW89_SCAN_LIST_LIMIT_AX && ch_info->tx_null) + break; + list_move_tail(&ch_info->list, &list); list_len++; -- cgit From 5e15395f6d9ec07395866c5511f4b4ac566c0c9b Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Tue, 18 Nov 2025 08:20:19 +0100 Subject: mptcp: fix ack generation for fallback msk mptcp_cleanup_rbuf() needs to know the last most recent, mptcp-level rcv_wnd sent, and such information is tracked into the msk->old_wspace field, updated at ack transmission time by mptcp_write_options(). Fallback socket do not add any mptcp options, such helper is never invoked, and msk->old_wspace value remain stale. That in turn makes ack generation at recvmsg() time quite random. Address the issue ensuring mptcp_write_options() is invoked even for fallback sockets, and just update the needed info in such a case. The issue went unnoticed for a long time, as mptcp currently overshots the fallback socket receive buffer autotune significantly. It is going to change in the near future. Fixes: e3859603ba13 ("mptcp: better msk receive window updates") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/594 Signed-off-by: Paolo Abeni Reviewed-by: Geliang Tang Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-1-806d3781c95f@kernel.org Signed-off-by: Jakub Kicinski --- net/mptcp/options.c | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/net/mptcp/options.c b/net/mptcp/options.c index 1103b3341a70..8a63bd00807d 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -838,8 +838,11 @@ bool mptcp_established_options(struct sock *sk, struct sk_buff *skb, opts->suboptions = 0; + /* Force later mptcp_write_options(), but do not use any actual + * option space. + */ if (unlikely(__mptcp_check_fallback(msk) && !mptcp_check_infinite_map(skb))) - return false; + return true; if (unlikely(skb && TCP_SKB_CB(skb)->tcp_flags & TCPHDR_RST)) { if (mptcp_established_options_fastclose(sk, &opt_size, remaining, opts) || @@ -1319,6 +1322,20 @@ update_wspace: WRITE_ONCE(msk->old_wspace, tp->rcv_wnd); } +static void mptcp_track_rwin(struct tcp_sock *tp) +{ + const struct sock *ssk = (const struct sock *)tp; + struct mptcp_subflow_context *subflow; + struct mptcp_sock *msk; + + if (!ssk) + return; + + subflow = mptcp_subflow_ctx(ssk); + msk = mptcp_sk(subflow->conn); + WRITE_ONCE(msk->old_wspace, tp->rcv_wnd); +} + __sum16 __mptcp_make_csum(u64 data_seq, u32 subflow_seq, u16 data_len, __wsum sum) { struct csum_pseudo_header header; @@ -1611,6 +1628,10 @@ mp_rst: opts->reset_transient, opts->reset_reason); return; + } else if (unlikely(!opts->suboptions)) { + /* Fallback to TCP */ + mptcp_track_rwin(tp); + return; } if (OPTION_MPTCP_PRIO & opts->suboptions) { -- cgit From 4f102d747cadd8f595f2b25882eed9bec1675fb1 Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Tue, 18 Nov 2025 08:20:20 +0100 Subject: mptcp: avoid unneeded subflow-level drops The rcv window is shared among all the subflows. Currently, MPTCP sync the TCP-level rcv window with the MPTCP one at tcp_transmit_skb() time. The above means that incoming data may sporadically observe outdated TCP-level rcv window and being wrongly dropped by TCP. Address the issue checking for the edge condition before queuing the data at TCP level, and eventually syncing the rcv window as needed. Note that the issue is actually present from the very first MPTCP implementation, but backports older than the blamed commit below will range from impossible to useless. Before: $ nstat -n; sleep 1; nstat -z TcpExtBeyondWindow TcpExtBeyondWindow 14 0.0 After: $ nstat -n; sleep 1; nstat -z TcpExtBeyondWindow TcpExtBeyondWindow 0 0.0 Fixes: fa3fe2b15031 ("mptcp: track window announced to peer") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-2-806d3781c95f@kernel.org Signed-off-by: Jakub Kicinski --- net/mptcp/options.c | 31 +++++++++++++++++++++++++++++++ net/mptcp/protocol.h | 1 + 2 files changed, 32 insertions(+) diff --git a/net/mptcp/options.c b/net/mptcp/options.c index 8a63bd00807d..f24ae7d40e88 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -1044,6 +1044,31 @@ static void __mptcp_snd_una_update(struct mptcp_sock *msk, u64 new_snd_una) WRITE_ONCE(msk->snd_una, new_snd_una); } +static void rwin_update(struct mptcp_sock *msk, struct sock *ssk, + struct sk_buff *skb) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk); + struct tcp_sock *tp = tcp_sk(ssk); + u64 mptcp_rcv_wnd; + + /* Avoid touching extra cachelines if TCP is going to accept this + * skb without filling the TCP-level window even with a possibly + * outdated mptcp-level rwin. + */ + if (!skb->len || skb->len < tcp_receive_window(tp)) + return; + + mptcp_rcv_wnd = atomic64_read(&msk->rcv_wnd_sent); + if (!after64(mptcp_rcv_wnd, subflow->rcv_wnd_sent)) + return; + + /* Some other subflow grew the mptcp-level rwin since rcv_wup, + * resync. + */ + tp->rcv_wnd += mptcp_rcv_wnd - subflow->rcv_wnd_sent; + subflow->rcv_wnd_sent = mptcp_rcv_wnd; +} + static void ack_update_msk(struct mptcp_sock *msk, struct sock *ssk, struct mptcp_options_received *mp_opt) @@ -1211,6 +1236,7 @@ bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb) */ if (mp_opt.use_ack) ack_update_msk(msk, sk, &mp_opt); + rwin_update(msk, sk, skb); /* Zero-data-length packets are dropped by the caller and not * propagated to the MPTCP layer, so the skb extension does not @@ -1297,6 +1323,10 @@ static void mptcp_set_rwin(struct tcp_sock *tp, struct tcphdr *th) if (rcv_wnd_new != rcv_wnd_old) { raise_win: + /* The msk-level rcv wnd is after the tcp level one, + * sync the latter. + */ + rcv_wnd_new = rcv_wnd_old; win = rcv_wnd_old - ack_seq; tp->rcv_wnd = min_t(u64, win, U32_MAX); new_win = tp->rcv_wnd; @@ -1320,6 +1350,7 @@ raise_win: update_wspace: WRITE_ONCE(msk->old_wspace, tp->rcv_wnd); + subflow->rcv_wnd_sent = rcv_wnd_new; } static void mptcp_track_rwin(struct tcp_sock *tp) diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 379a88e14e8d..5575ef64ea31 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -509,6 +509,7 @@ struct mptcp_subflow_context { u64 remote_key; u64 idsn; u64 map_seq; + u64 rcv_wnd_sent; u32 snd_isn; u32 token; u32 rel_write_seq; -- cgit From 17393fa7b7086664be519e7230cb6ed7ec7d9462 Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Tue, 18 Nov 2025 08:20:21 +0100 Subject: mptcp: fix premature close in case of fallback I'm observing very frequent self-tests failures in case of fallback when running on a CONFIG_PREEMPT kernel. The root cause is that subflow_sched_work_if_closed() closes any subflow as soon as it is half-closed and has no incoming data pending. That works well for regular subflows - MPTCP needs bi-directional connectivity to operate on a given subflow - but for fallback socket is race prone. When TCP peer closes the connection before the MPTCP one, subflow_sched_work_if_closed() will schedule the MPTCP worker to gracefully close the subflow, and shortly after will do another schedule to inject and process a dummy incoming DATA_FIN. On CONFIG_PREEMPT kernel, the MPTCP worker can kick-in and close the fallback subflow before subflow_sched_work_if_closed() is able to create the dummy DATA_FIN, unexpectedly interrupting the transfer. Address the issue explicitly avoiding closing fallback subflows on when the peer is only half-closed. Note that, when the subflow is able to create the DATA_FIN before the worker invocation, the worker will change the msk state before trying to close the subflow and will skip the latter operation as the msk will not match anymore the precondition in __mptcp_close_subflow(). Fixes: f09b0ad55a11 ("mptcp: close subflow when receiving TCP+FIN") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-3-806d3781c95f@kernel.org Signed-off-by: Jakub Kicinski --- net/mptcp/protocol.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index e27e0fe2460f..e30e9043a694 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2563,7 +2563,8 @@ static void __mptcp_close_subflow(struct sock *sk) if (ssk_state != TCP_CLOSE && (ssk_state != TCP_CLOSE_WAIT || - inet_sk_state_load(sk) != TCP_ESTABLISHED)) + inet_sk_state_load(sk) != TCP_ESTABLISHED || + __mptcp_check_fallback(msk))) continue; /* 'subflow_data_ready' will re-sched once rx queue is empty */ -- cgit From 1bba3f219c5e8c29e63afa3c1fc24f875ebec119 Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Tue, 18 Nov 2025 08:20:22 +0100 Subject: mptcp: do not fallback when OoO is present In case of DSS corruption, the MPTCP protocol tries to avoid the subflow reset if fallback is possible. Such corruptions happen in the receive path; to ensure fallback is possible the stack additionally needs to check for OoO data, otherwise the fallback will break the data stream. Fixes: e32d262c89e2 ("mptcp: handle consistently DSS corruption") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/598 Signed-off-by: Paolo Abeni Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-4-806d3781c95f@kernel.org Signed-off-by: Jakub Kicinski --- net/mptcp/protocol.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index e30e9043a694..6f0e8f670d83 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -76,6 +76,13 @@ bool __mptcp_try_fallback(struct mptcp_sock *msk, int fb_mib) if (__mptcp_check_fallback(msk)) return true; + /* The caller possibly is not holding the msk socket lock, but + * in the fallback case only the current subflow is touching + * the OoO queue. + */ + if (!RB_EMPTY_ROOT(&msk->out_of_order_queue)) + return false; + spin_lock_bh(&msk->fallback_lock); if (!msk->allow_infinite_fallback) { spin_unlock_bh(&msk->fallback_lock); -- cgit From fff0c87996672816a84c3386797a5e69751c5888 Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Tue, 18 Nov 2025 08:20:23 +0100 Subject: mptcp: decouple mptcp fastclose from tcp close With the current fastclose implementation, the mptcp_do_fastclose() helper is in charge of two distinct actions: send the fastclose reset and cleanup the subflows. Formally decouple the two steps, ensuring that mptcp explicitly closes all the subflows after the mentioned helper. This will make the upcoming fix simpler, and allows dropping the 2nd argument from mptcp_destroy_common(). The Fixes tag is then the same as in the next commit to help with the backports. Fixes: d21f83485518 ("mptcp: use fastclose on more edge scenarios") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni Reviewed-by: Geliang Tang Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-5-806d3781c95f@kernel.org Signed-off-by: Jakub Kicinski --- net/mptcp/protocol.c | 13 +++++++++---- net/mptcp/protocol.h | 2 +- 2 files changed, 10 insertions(+), 5 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 6f0e8f670d83..c59246c1fde6 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2808,7 +2808,11 @@ static void mptcp_worker(struct work_struct *work) __mptcp_close_subflow(sk); if (mptcp_close_tout_expired(sk)) { + struct mptcp_subflow_context *subflow, *tmp; + mptcp_do_fastclose(sk); + mptcp_for_each_subflow_safe(msk, subflow, tmp) + __mptcp_close_ssk(sk, subflow->tcp_sock, subflow, 0); mptcp_close_wake_up(sk); } @@ -3233,7 +3237,8 @@ static int mptcp_disconnect(struct sock *sk, int flags) /* msk->subflow is still intact, the following will not free the first * subflow */ - mptcp_destroy_common(msk, MPTCP_CF_FASTCLOSE); + mptcp_do_fastclose(sk); + mptcp_destroy_common(msk); /* The first subflow is already in TCP_CLOSE status, the following * can't overlap with a fallback anymore @@ -3412,7 +3417,7 @@ void mptcp_rcv_space_init(struct mptcp_sock *msk, const struct sock *ssk) msk->rcvq_space.space = TCP_INIT_CWND * TCP_MSS_DEFAULT; } -void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags) +void mptcp_destroy_common(struct mptcp_sock *msk) { struct mptcp_subflow_context *subflow, *tmp; struct sock *sk = (struct sock *)msk; @@ -3421,7 +3426,7 @@ void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags) /* join list will be eventually flushed (with rst) at sock lock release time */ mptcp_for_each_subflow_safe(msk, subflow, tmp) - __mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow), subflow, flags); + __mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow), subflow, 0); __skb_queue_purge(&sk->sk_receive_queue); skb_rbtree_purge(&msk->out_of_order_queue); @@ -3439,7 +3444,7 @@ static void mptcp_destroy(struct sock *sk) /* allow the following to close even the initial subflow */ msk->free_first = 1; - mptcp_destroy_common(msk, 0); + mptcp_destroy_common(msk); sk_sockets_allocated_dec(sk); } diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 5575ef64ea31..6ca97096607c 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -977,7 +977,7 @@ static inline void mptcp_propagate_sndbuf(struct sock *sk, struct sock *ssk) local_bh_enable(); } -void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags); +void mptcp_destroy_common(struct mptcp_sock *msk); #define MPTCP_TOKEN_MAX_RETRIES 4 -- cgit From ae155060247be8dcae3802a95bd1bdf93ab3215d Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Tue, 18 Nov 2025 08:20:24 +0100 Subject: mptcp: fix duplicate reset on fastclose The CI reports sporadic failures of the fastclose self-tests. The root cause is a duplicate reset, not carrying the relevant MPTCP option. In the failing scenario the bad reset is received by the peer before the fastclose one, preventing the reception of the latter. Indeed there is window of opportunity at fastclose time for the following race: mptcp_do_fastclose __mptcp_close_ssk __tcp_close() tcp_set_state() [1] tcp_send_active_reset() [2] After [1] the stack will send reset to in-flight data reaching the now closed port. Such reset may race with [2]. Address the issue explicitly sending a single reset on fastclose before explicitly moving the subflow to close status. Fixes: d21f83485518 ("mptcp: use fastclose on more edge scenarios") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/596 Signed-off-by: Paolo Abeni Reviewed-by: Geliang Tang Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-6-806d3781c95f@kernel.org Signed-off-by: Jakub Kicinski --- net/mptcp/protocol.c | 36 +++++++++++++++++++++++------------- 1 file changed, 23 insertions(+), 13 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index c59246c1fde6..a70267a74e3c 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2409,7 +2409,6 @@ bool __mptcp_retransmit_pending_data(struct sock *sk) /* flags for __mptcp_close_ssk() */ #define MPTCP_CF_PUSH BIT(1) -#define MPTCP_CF_FASTCLOSE BIT(2) /* be sure to send a reset only if the caller asked for it, also * clean completely the subflow status when the subflow reaches @@ -2420,7 +2419,7 @@ static void __mptcp_subflow_disconnect(struct sock *ssk, unsigned int flags) { if (((1 << ssk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) || - (flags & MPTCP_CF_FASTCLOSE)) { + subflow->send_fastclose) { /* The MPTCP code never wait on the subflow sockets, TCP-level * disconnect should never fail */ @@ -2467,14 +2466,8 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, lock_sock_nested(ssk, SINGLE_DEPTH_NESTING); - if ((flags & MPTCP_CF_FASTCLOSE) && !__mptcp_check_fallback(msk)) { - /* be sure to force the tcp_close path - * to generate the egress reset - */ - ssk->sk_lingertime = 0; - sock_set_flag(ssk, SOCK_LINGER); - subflow->send_fastclose = 1; - } + if (subflow->send_fastclose && ssk->sk_state != TCP_CLOSE) + tcp_set_state(ssk, TCP_CLOSE); need_push = (flags & MPTCP_CF_PUSH) && __mptcp_retransmit_pending_data(sk); if (!dispose_it) { @@ -2779,9 +2772,26 @@ static void mptcp_do_fastclose(struct sock *sk) struct mptcp_sock *msk = mptcp_sk(sk); mptcp_set_state(sk, TCP_CLOSE); - mptcp_for_each_subflow_safe(msk, subflow, tmp) - __mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow), - subflow, MPTCP_CF_FASTCLOSE); + + /* Explicitly send the fastclose reset as need */ + if (__mptcp_check_fallback(msk)) + return; + + mptcp_for_each_subflow_safe(msk, subflow, tmp) { + struct sock *ssk = mptcp_subflow_tcp_sock(subflow); + + lock_sock(ssk); + + /* Some subflow socket states don't allow/need a reset.*/ + if ((1 << ssk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)) + goto unlock; + + subflow->send_fastclose = 1; + tcp_send_active_reset(ssk, ssk->sk_allocation, + SK_RST_REASON_TCP_ABORT_ON_CLOSE); +unlock: + release_sock(ssk); + } } static void mptcp_worker(struct work_struct *work) -- cgit From efff6cd53ac52827948298043270bb81ff17fdff Mon Sep 17 00:00:00 2001 From: "Matthieu Baerts (NGI0)" Date: Tue, 18 Nov 2025 08:20:25 +0100 Subject: selftests: mptcp: join: fastclose: remove flaky marks After recent fixes like the parent commit, and "selftests: mptcp: connect: trunc: read all recv data", the two fastclose subtests no longer look flaky any more. It then feels fine to remove these flaky marks, to no longer ignore these subtests in case of errors. Reviewed-by: Geliang Tang Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-7-806d3781c95f@kernel.org Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 2 -- 1 file changed, 2 deletions(-) diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index 41503c241989..303abbca59fc 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -3500,7 +3500,6 @@ fullmesh_tests() fastclose_tests() { if reset_check_counter "fastclose test" "MPTcpExtMPFastcloseTx"; then - MPTCP_LIB_SUBTEST_FLAKY=1 test_linkfail=1024 fastclose=client \ run_tests $ns1 $ns2 10.0.1.1 chk_join_nr 0 0 0 @@ -3509,7 +3508,6 @@ fastclose_tests() fi if reset_check_counter "fastclose server test" "MPTcpExtMPFastcloseRx"; then - MPTCP_LIB_SUBTEST_FLAKY=1 test_linkfail=1024 fastclose=server \ run_tests $ns1 $ns2 10.0.1.1 join_rst_nr=1 \ -- cgit From fb13c6bb810ca871964e062cf91882d1c83db509 Mon Sep 17 00:00:00 2001 From: "Matthieu Baerts (NGI0)" Date: Tue, 18 Nov 2025 08:20:26 +0100 Subject: selftests: mptcp: join: endpoints: longer timeout In rare cases, when the test environment is very slow, some endpoints tests can fail because some expected events have not been seen. Because the tests are expecting a long on-going connection, and they are not waiting for the end of the transfer, it is fine to have a longer timeout, and even go over the default one. This connection will be killed at the end, after the verifications: increasing the timeout doesn't change anything, apart from avoiding it to end before the end of the verifications. To play it safe, all endpoints tests not waiting for the end of the transfer are now having a longer timeout: 2 minutes. The Fixes commit was making the connection longer, but still, the default timeout would have stopped it after 1 minute, which might not be enough in very slow environments. Fixes: 6457595db987 ("selftests: mptcp: join: endpoints: longer transfer") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts (NGI0) Reviewed-by: Geliang Tang Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-8-806d3781c95f@kernel.org Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index 303abbca59fc..93d38ded5e4e 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -3941,7 +3941,7 @@ endpoint_tests() pm_nl_set_limits $ns1 2 2 pm_nl_set_limits $ns2 2 2 pm_nl_add_endpoint $ns1 10.0.2.1 flags signal - { test_linkfail=128 speed=slow \ + { timeout_test=120 test_linkfail=128 speed=slow \ run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null local tests_pid=$! @@ -3968,7 +3968,7 @@ endpoint_tests() pm_nl_set_limits $ns2 0 3 pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow pm_nl_add_endpoint $ns2 10.0.2.2 id 2 dev ns2eth2 flags subflow - { test_linkfail=128 speed=5 \ + { timeout_test=120 test_linkfail=128 speed=5 \ run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null local tests_pid=$! @@ -4046,7 +4046,7 @@ endpoint_tests() # broadcast IP: no packet for this address will be received on ns1 pm_nl_add_endpoint $ns1 224.0.0.1 id 2 flags signal pm_nl_add_endpoint $ns1 10.0.1.1 id 42 flags signal - { test_linkfail=128 speed=5 \ + { timeout_test=120 test_linkfail=128 speed=5 \ run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null local tests_pid=$! @@ -4119,7 +4119,7 @@ endpoint_tests() # broadcast IP: no packet for this address will be received on ns1 pm_nl_add_endpoint $ns1 224.0.0.1 id 2 flags signal pm_nl_add_endpoint $ns2 10.0.3.2 id 3 flags subflow - { test_linkfail=128 speed=20 \ + { timeout_test=120 test_linkfail=128 speed=20 \ run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null local tests_pid=$! -- cgit From 0e4ec14dc1ee4b1ec347729c225c3ca950f2bcf6 Mon Sep 17 00:00:00 2001 From: "Matthieu Baerts (NGI0)" Date: Tue, 18 Nov 2025 08:20:27 +0100 Subject: selftests: mptcp: join: userspace: longer timeout In rare cases, when the test environment is very slow, some userspace tests can fail because some expected events have not been seen. Because the tests are expecting a long on-going connection, and they are not waiting for the end of the transfer, it is fine to have a longer timeout, and even go over the default one. This connection will be killed at the end, after the verifications: increasing the timeout doesn't change anything, apart from avoiding it to end before the end of the verifications. To play it safe, all userspace tests not waiting for the end of the transfer are now having a longer timeout: 2 minutes. The Fixes commit was making the connection longer, but still, the default timeout would have stopped it after 1 minute, which might not be enough in very slow environments. Fixes: 290493078b96 ("selftests: mptcp: join: userspace: longer transfer") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts (NGI0) Reviewed-by: Geliang Tang Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-9-806d3781c95f@kernel.org Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index 93d38ded5e4e..74632beae2c6 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -3804,7 +3804,7 @@ userspace_tests() continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then set_userspace_pm $ns1 pm_nl_set_limits $ns2 2 2 - { test_linkfail=128 speed=5 \ + { timeout_test=120 test_linkfail=128 speed=5 \ run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null local tests_pid=$! wait_mpj $ns1 @@ -3837,7 +3837,7 @@ userspace_tests() continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then set_userspace_pm $ns2 pm_nl_set_limits $ns1 0 1 - { test_linkfail=128 speed=5 \ + { timeout_test=120 test_linkfail=128 speed=5 \ run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null local tests_pid=$! wait_mpj $ns2 @@ -3865,7 +3865,7 @@ userspace_tests() continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then set_userspace_pm $ns2 pm_nl_set_limits $ns1 0 1 - { test_linkfail=128 speed=5 \ + { timeout_test=120 test_linkfail=128 speed=5 \ run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null local tests_pid=$! wait_mpj $ns2 @@ -3886,7 +3886,7 @@ userspace_tests() continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then set_userspace_pm $ns2 pm_nl_set_limits $ns1 0 1 - { test_linkfail=128 speed=5 \ + { timeout_test=120 test_linkfail=128 speed=5 \ run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null local tests_pid=$! wait_mpj $ns2 @@ -3910,7 +3910,7 @@ userspace_tests() continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then set_userspace_pm $ns1 pm_nl_set_limits $ns2 1 1 - { test_linkfail=128 speed=5 \ + { timeout_test=120 test_linkfail=128 speed=5 \ run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null local tests_pid=$! wait_mpj $ns1 -- cgit From 92e239e36d600002559074994a545fcfac9afd2d Mon Sep 17 00:00:00 2001 From: Gang Yan Date: Tue, 18 Nov 2025 08:20:28 +0100 Subject: mptcp: fix address removal logic in mptcp_pm_nl_rm_addr Fix inverted WARN_ON_ONCE condition that prevented normal address removal counter updates. The current code only executes decrement logic when the counter is already 0 (abnormal state), while normal removals (counter > 0) are ignored. Signed-off-by: Gang Yan Fixes: 636113918508 ("mptcp: pm: remove '_nl' from mptcp_pm_nl_rm_addr_received") Cc: stable@vger.kernel.org Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-10-806d3781c95f@kernel.org Signed-off-by: Jakub Kicinski --- net/mptcp/pm_kernel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c index 2ae95476dba3..0a50fd5edc06 100644 --- a/net/mptcp/pm_kernel.c +++ b/net/mptcp/pm_kernel.c @@ -672,7 +672,7 @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk) void mptcp_pm_nl_rm_addr(struct mptcp_sock *msk, u8 rm_id) { - if (rm_id && WARN_ON_ONCE(msk->pm.add_addr_accepted == 0)) { + if (rm_id && !WARN_ON_ONCE(msk->pm.add_addr_accepted == 0)) { u8 limit_add_addr_accepted = mptcp_pm_get_limit_add_addr_accepted(msk); -- cgit From 0eee0fdf9b7b0baf698f9b426384aa9714d76a51 Mon Sep 17 00:00:00 2001 From: Gang Yan Date: Tue, 18 Nov 2025 08:20:29 +0100 Subject: selftests: mptcp: add a check for 'add_addr_accepted' The previous patch fixed an issue with the 'add_addr_accepted' counter. This was not spot by the test suite. Check this counter and 'add_addr_signal' in MPTCP Join 'delete re-add signal' test. This should help spotting similar regressions later on. These counters are crucial for ensuring the MPTCP path manager correctly handles the subflow creation via 'ADD_ADDR'. Signed-off-by: Gang Yan Reviewed-by: Geliang Tang Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-11-806d3781c95f@kernel.org Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index 74632beae2c6..43f31f8d587f 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -4055,38 +4055,45 @@ endpoint_tests() $ns1 10.0.2.1 id 1 flags signal chk_subflow_nr "before delete" 2 chk_mptcp_info subflows 1 subflows 1 + chk_mptcp_info add_addr_signal 2 add_addr_accepted 1 pm_nl_del_endpoint $ns1 1 10.0.2.1 pm_nl_del_endpoint $ns1 2 224.0.0.1 sleep 0.5 chk_subflow_nr "after delete" 1 chk_mptcp_info subflows 0 subflows 0 + chk_mptcp_info add_addr_signal 0 add_addr_accepted 0 pm_nl_add_endpoint $ns1 10.0.2.1 id 1 flags signal pm_nl_add_endpoint $ns1 10.0.3.1 id 2 flags signal wait_mpj $ns2 chk_subflow_nr "after re-add" 3 chk_mptcp_info subflows 2 subflows 2 + chk_mptcp_info add_addr_signal 2 add_addr_accepted 2 pm_nl_del_endpoint $ns1 42 10.0.1.1 sleep 0.5 chk_subflow_nr "after delete ID 0" 2 chk_mptcp_info subflows 2 subflows 2 + chk_mptcp_info add_addr_signal 2 add_addr_accepted 2 pm_nl_add_endpoint $ns1 10.0.1.1 id 99 flags signal wait_mpj $ns2 chk_subflow_nr "after re-add ID 0" 3 chk_mptcp_info subflows 3 subflows 3 + chk_mptcp_info add_addr_signal 3 add_addr_accepted 2 pm_nl_del_endpoint $ns1 99 10.0.1.1 sleep 0.5 chk_subflow_nr "after re-delete ID 0" 2 chk_mptcp_info subflows 2 subflows 2 + chk_mptcp_info add_addr_signal 2 add_addr_accepted 2 pm_nl_add_endpoint $ns1 10.0.1.1 id 88 flags signal wait_mpj $ns2 chk_subflow_nr "after re-re-add ID 0" 3 chk_mptcp_info subflows 3 subflows 3 + chk_mptcp_info add_addr_signal 3 add_addr_accepted 2 mptcp_lib_kill_group_wait $tests_pid kill_events_pids -- cgit From 3ceb6ac2116ecda1c5d779bb73271479e70fccb4 Mon Sep 17 00:00:00 2001 From: Oleksij Rempel Date: Fri, 14 Nov 2025 10:09:51 +0100 Subject: net: dsa: microchip: lan937x: Fix RGMII delay tuning Correct RGMII delay application logic in lan937x_set_tune_adj(). The function was missing `data16 &= ~PORT_TUNE_ADJ` before setting the new delay value. This caused the new value to be bitwise-OR'd with the existing PORT_TUNE_ADJ field instead of replacing it. For example, when setting the RGMII 2 TX delay on port 4, the intended TUNE_ADJUST value of 0 (RGMII_2_TX_DELAY_2NS) was incorrectly OR'd with the default 0x1B (from register value 0xDA3), leaving the delay at the wrong setting. This patch adds the missing mask to clear the field, ensuring the correct delay value is written. Physical measurements on the RGMII TX lines confirm the fix, showing the delay changing from ~1ns (before change) to ~2ns. While testing on i.MX 8MP showed this was within the platform's timing tolerance, it did not match the intended hardware-characterized value. Fixes: b19ac41faa3f ("net: dsa: microchip: apply rgmii tx and rx delay in phylink mac config") Cc: stable@vger.kernel.org Signed-off-by: Oleksij Rempel Link: https://patch.msgid.link/20251114090951.4057261-1-o.rempel@pengutronix.de Signed-off-by: Paolo Abeni --- drivers/net/dsa/microchip/lan937x_main.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/dsa/microchip/lan937x_main.c b/drivers/net/dsa/microchip/lan937x_main.c index b1ae3b9de3d1..5a1496fff445 100644 --- a/drivers/net/dsa/microchip/lan937x_main.c +++ b/drivers/net/dsa/microchip/lan937x_main.c @@ -540,6 +540,7 @@ static void lan937x_set_tune_adj(struct ksz_device *dev, int port, ksz_pread16(dev, port, reg, &data16); /* Update tune Adjust */ + data16 &= ~PORT_TUNE_ADJ; data16 |= FIELD_PREP(PORT_TUNE_ADJ, val); ksz_pwrite16(dev, port, reg, data16); -- cgit From d70b592551ff23747e26e74081205babf8dba9b6 Mon Sep 17 00:00:00 2001 From: David Bauer Date: Tue, 18 Nov 2025 01:16:18 +0100 Subject: l2tp: reset skb control buffer on xmit The L2TP stack did not reset the skb control buffer before sending the encapsulated package. In a setup with an ath10k radio and batman-adv over an L2TP tunnel massive fragmentations happen sporadically if the L2TP tunnel is established over IPv4. L2TP might reset some of the fields in the IP control buffer, but L2TP assumes the type of the control buffer to be of an IPv4 packet. In case the L2TP interface is used as a batadv hardif or the packet is an IPv6 packet, this assumption breaks. Clear the entire control buffer to avoid such mishaps altogether. Fixes: f77ae9390438 ("[PPPOL2TP]: Reset meta-data in xmit function") Signed-off-by: David Bauer Link: https://patch.msgid.link/20251118001619.242107-1-mail@david-bauer.net Signed-off-by: Paolo Abeni --- net/l2tp/l2tp_core.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c index 369a2f2e459c..0710281dd95a 100644 --- a/net/l2tp/l2tp_core.c +++ b/net/l2tp/l2tp_core.c @@ -1246,9 +1246,9 @@ static int l2tp_xmit_core(struct l2tp_session *session, struct sk_buff *skb, uns else l2tp_build_l2tpv3_header(session, __skb_push(skb, session->hdr_len)); - /* Reset skb netfilter state */ - memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt)); - IPCB(skb)->flags &= ~(IPSKB_XFRM_TUNNEL_SIZE | IPSKB_XFRM_TRANSFORMED | IPSKB_REROUTED); + /* Reset control buffer */ + memset(skb->cb, 0, sizeof(skb->cb)); + nf_reset_ct(skb); /* L2TP uses its own lockdep subclass to avoid lockdep splats caused by -- cgit From 7d277a7a58578dd62fd546ddaef459ec24ccae36 Mon Sep 17 00:00:00 2001 From: Andrey Vatoropin Date: Wed, 19 Nov 2025 10:51:12 +0000 Subject: be2net: pass wrb_params in case of OS2BMC MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit be_insert_vlan_in_pkt() is called with the wrb_params argument being NULL at be_send_pkt_to_bmc() call site.  This may lead to dereferencing a NULL pointer when processing a workaround for specific packet, as commit bc0c3405abbb ("be2net: fix a Tx stall bug caused by a specific ipv6 packet") states. The correct way would be to pass the wrb_params from be_xmit(). Fixes: 760c295e0e8d ("be2net: Support for OS2BMC.") Cc: stable@vger.kernel.org Signed-off-by: Andrey Vatoropin Link: https://patch.msgid.link/20251119105015.194501-1-a.vatoropin@crpt.ru Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/emulex/benet/be_main.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c index cb004fd16252..5bb31c8fab39 100644 --- a/drivers/net/ethernet/emulex/benet/be_main.c +++ b/drivers/net/ethernet/emulex/benet/be_main.c @@ -1296,7 +1296,8 @@ static void be_xmit_flush(struct be_adapter *adapter, struct be_tx_obj *txo) (adapter->bmc_filt_mask & BMC_FILT_MULTICAST) static bool be_send_pkt_to_bmc(struct be_adapter *adapter, - struct sk_buff **skb) + struct sk_buff **skb, + struct be_wrb_params *wrb_params) { struct ethhdr *eh = (struct ethhdr *)(*skb)->data; bool os2bmc = false; @@ -1360,7 +1361,7 @@ done: * to BMC, asic expects the vlan to be inline in the packet. */ if (os2bmc) - *skb = be_insert_vlan_in_pkt(adapter, *skb, NULL); + *skb = be_insert_vlan_in_pkt(adapter, *skb, wrb_params); return os2bmc; } @@ -1387,7 +1388,7 @@ static netdev_tx_t be_xmit(struct sk_buff *skb, struct net_device *netdev) /* if os2bmc is enabled and if the pkt is destined to bmc, * enqueue the pkt a 2nd time with mgmt bit set. */ - if (be_send_pkt_to_bmc(adapter, &skb)) { + if (be_send_pkt_to_bmc(adapter, &skb, &wrb_params)) { BE_WRB_F_SET(wrb_params.features, OS2BMC, 1); wrb_cnt = be_xmit_enqueue(adapter, txo, skb, &wrb_params); if (unlikely(!wrb_cnt)) -- cgit From 002541ef650b742a198e4be363881439bb9d86b4 Mon Sep 17 00:00:00 2001 From: Michal Luczaj Date: Wed, 19 Nov 2025 15:02:59 +0100 Subject: vsock: Ignore signal/timeout on connect() if already established During connect(), acting on a signal/timeout by disconnecting an already established socket leads to several issues: 1. connect() invoking vsock_transport_cancel_pkt() -> virtio_transport_purge_skbs() may race with sendmsg() invoking virtio_transport_get_credit(). This results in a permanently elevated `vvs->bytes_unsent`. Which, in turn, confuses the SOCK_LINGER handling. 2. connect() resetting a connected socket's state may race with socket being placed in a sockmap. A disconnected socket remaining in a sockmap breaks sockmap's assumptions. And gives rise to WARNs. 3. connect() transitioning SS_CONNECTED -> SS_UNCONNECTED allows for a transport change/drop after TCP_ESTABLISHED. Which poses a problem for any simultaneous sendmsg() or connect() and may result in a use-after-free/null-ptr-deref. Do not disconnect socket on signal/timeout. Keep the logic for unconnected sockets: they don't linger, can't be placed in a sockmap, are rejected by sendmsg(). [1]: https://lore.kernel.org/netdev/e07fd95c-9a38-4eea-9638-133e38c2ec9b@rbox.co/ [2]: https://lore.kernel.org/netdev/20250317-vsock-trans-signal-race-v4-0-fc8837f3f1d4@rbox.co/ [3]: https://lore.kernel.org/netdev/60f1b7db-3099-4f6a-875e-af9f6ef194f6@rbox.co/ Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") Signed-off-by: Michal Luczaj Reviewed-by: Stefano Garzarella Link: https://patch.msgid.link/20251119-vsock-interrupted-connect-v2-1-70734cf1233f@rbox.co Signed-off-by: Jakub Kicinski --- net/vmw_vsock/af_vsock.c | 40 +++++++++++++++++++++++++++++++--------- 1 file changed, 31 insertions(+), 9 deletions(-) diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 76763247a377..a9ca9c3b87b3 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1661,18 +1661,40 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr, timeout = schedule_timeout(timeout); lock_sock(sk); - if (signal_pending(current)) { - err = sock_intr_errno(timeout); - sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; - sock->state = SS_UNCONNECTED; - vsock_transport_cancel_pkt(vsk); - vsock_remove_connected(vsk); - goto out_wait; - } else if ((sk->sk_state != TCP_ESTABLISHED) && (timeout == 0)) { - err = -ETIMEDOUT; + /* Connection established. Whatever happens to socket once we + * release it, that's not connect()'s concern. No need to go + * into signal and timeout handling. Call it a day. + * + * Note that allowing to "reset" an already established socket + * here is racy and insecure. + */ + if (sk->sk_state == TCP_ESTABLISHED) + break; + + /* If connection was _not_ established and a signal/timeout came + * to be, we want the socket's state reset. User space may want + * to retry. + * + * sk_state != TCP_ESTABLISHED implies that socket is not on + * vsock_connected_table. We keep the binding and the transport + * assigned. + */ + if (signal_pending(current) || timeout == 0) { + err = timeout == 0 ? -ETIMEDOUT : sock_intr_errno(timeout); + + /* Listener might have already responded with + * VIRTIO_VSOCK_OP_RESPONSE. Its handling expects our + * sk_state == TCP_SYN_SENT, which hereby we break. + * In such case VIRTIO_VSOCK_OP_RST will follow. + */ sk->sk_state = TCP_CLOSE; sock->state = SS_UNCONNECTED; + + /* Try to cancel VIRTIO_VSOCK_OP_REQUEST skb sent out by + * transport->connect(). + */ vsock_transport_cancel_pkt(vsk); + goto out_wait; } -- cgit