git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-01-10	Merge tag 'drm-fixes-2025-01-11' of https://gitlab.freedesktop.org/drm/kernel	Linus Torvalds
	Pull drm fixes from Dave Airlie:
	"Regular weekly fixes, this has the usual amdgpu/xe/i915 bits. There is a bigger bunch of mediatek patches that I considered not including at this stage, but all the changes (except for one) were obvious small fixes, and the rotation one is a few lines, and I suppose will help someone have their screen up the right way, so I decided to include it since I expect it got slowed down by holidays etc, and it's not that mainstream a hw platform.
	i915:
	- Revert "drm/i915/hdcp: Don't enable HDCP1.4 directly from check_link"
	amdgpu:
	- Display interrupt fixes
	- Fix display max surface mismatches
	- Fix divide error in DM plane scale calcs
	- Display divide by 0 checks in dml helpers
	- SMU 13 AD/DC interrupt handling fix
	- Fix locking around buddy trim handling
	amdkfd:
	- Fix page fault with shader debugger enabled
	- Fix eviction fence wq handling
	xe:
	- Avoid a NULL ptr deref when wedging
	- Fix power gate sequence on DG1
	mediatek:
	- Revert "drm/mediatek: dsi: Correct calculation formula of PHY Timing"
	- Set private->all_drm_private[i]->drm to NULL if mtk_drm_bind returns err
	- Move mtk_crtc_finish_page_flip() to ddp_cmdq_cb()
	- Only touch DISP_REG_OVL_PITCH_MSB if AFBC is supported
	- Add support for 180-degree rotation in the display driver
	- Stop selecting foreign drivers
	- Revert "drm/mediatek: Switch to for_each_child_of_node_scoped()"
	- Fix YCbCr422 color format issue for DP
	- Fix mode valid issue for dp
	- dp: Reference common DAI properties
	- dsi: Add registers to pdata to fix MT8186/MT8188
	- Remove unneeded semicolon
	- Add return value check when reading DPCD
	- Initialize pointer in mtk_drm_of_ddp_path_build_one()"
	* tag 'drm-fixes-2025-01-11' of https://gitlab.freedesktop.org/drm/kernel: (26 commits)
	drm/xe/dg1: Fix power gate sequence.
	drm/xe: Fix tlb invalidation when wedging
	Revert "drm/i915/hdcp: Don't enable HDCP1.4 directly from check_link"
	drm/amdgpu: Add a lock when accessing the buddy trim function
	drm/amd/pm: fix BUG: scheduling while atomic
	drm/amdkfd: wq_release signals dma_fence only when available
	drm/amd/display: Add check for granularity in dml ceil/floor helpers
	drm/amdkfd: fixed page fault when enable MES shader debugger
	drm/amd/display: fix divide error in DM plane scale calcs
	drm/amd/display: increase MAX_SURFACES to the value supported by hw
	drm/amd/display: fix page fault due to max surface definition mismatch
	drm/amd/display: Remove unnecessary amdgpu_irq_get/put
	drm/mediatek: Initialize pointer in mtk_drm_of_ddp_path_build_one()
	drm/mediatek: Add return value check when reading DPCD
	drm/mediatek: Remove unneeded semicolon
	drm/mediatek: mtk_dsi: Add registers to pdata to fix MT8186/MT8188
	dt-bindings: display: mediatek: dp: Reference common DAI properties
	drm/mediatek: Fix mode valid issue for dp
	drm/mediatek: Fix YCbCr422 color format issue for DP
	Revert "drm/mediatek: Switch to for_each_child_of_node_scoped()"
	...
2025-01-10	Merge tag 'riscv-for-linus-6.13-rc7' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - a handful of selftest fixes - fix a memory leak in relocation processing during module loading - avoid sleeping in die() - fix kprobe instruction slot address calculations - fix DT node reference leak in SBI idle probing - avoid initializing out of bounds pages on sparse vmemmap systems with a gap at the start of their physical memory map - fix backtracing through exceptions - _Q_PENDING_LOOPS is now defined whenever QUEUED_SPINLOCKS=y - local labels in entry.S are now marked with ".L", which prevents them from trashing backtraces - a handful of fixes for SBI-based performance counters * tag 'riscv-for-linus-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: drivers/perf: riscv: Do not allow invalid raw event config drivers/perf: riscv: Return error for default case drivers/perf: riscv: Fix Platform firmware event data tools: selftests: riscv: Add test count for vstate_prctl tools: selftests: riscv: Add pass message for v_initval_nolibc riscv: use local label names instead of global ones in assembly riscv: qspinlock: Fixup _Q_PENDING_LOOPS definition riscv: stacktrace: fix backtracing through exceptions riscv: mm: Fix the out of bound issue of vmemmap address cpuidle: riscv-sbi: fix device node release in early exit of for_each_possible_cpu riscv: kprobes: Fix incorrect address calculation riscv: Fix sleeping in invalid context in die() riscv: module: remove relocation_head rel_entry member allocation riscv: selftests: Fix warnings pointer masking test
2025-01-10	drm/amd/display: Initialize denominator defaults to 1	Alex Hung
	[WHAT & HOW] Variables, used as denominators and maybe not assigned to other values, should be initialized to non-zero to avoid DIVIDE_BY_ZERO, as reported by Coverity. Reviewed-by: Austin Zheng <austin.zheng@amd.com> Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e2c4c6c10542ccfe4a0830bb6c9fd5b177b7bbb7)
2025-01-10	drm/amd/display: Use HW lock mgr for PSR1	Tom Chung
	[Why] Without the dmub hw lock, it may cause the lock timeout issue while do modeset on PSR1 eDP panel. [How] Allow dmub hw lock for PSR1. Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a2b5a9956269f4c1a09537177f18ab0229fe79f7)
2025-01-10	drm/amd/display: Remove unnecessary eDP power down	Yiling Chen
	[why] When the first link training attempt fails, eDP is powered down and is not powered up again for the next link training retry. As a result, every link training retry fails. [how] We have extracted both the power up and power down sequences from the enable/disable link output functions before DCN32. We remove the eDP power down in dcn32_disable_link_output(). Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Yiling Chen <yi-ling.chen2@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f5860c88cdfe7300d08c1aef881bba0cac369e34)
2025-01-10	drm/amd/display: Do not elevate mem_type change to full update	Leo Li
	[Why] There should not be any need to revalidate bandwidth on memory placement change, since the fb is expected to be pinned to DCN-accessable memory before scanout. For APU it's DRAM, and DGPU, it's VRAM. However, async flips + memory type change needs to be rejected. [How] Do not set lock_and_validation_needed on mem_type change. Instead, reject an async_flip request if the crtc's buffer(s) changed mem_type. This may fix stuttering/corruption experienced with PSR SU and PSR1 panels, if the compositor allocates fbs in both VRAM carveout and GTT and flips between them. Fixes: a7c0cad0dc06 ("drm/amd/display: ensure async flips are only accepted for fast updates") Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4caacd1671b7a013ad04cd8b6398f002540bdd4d) Cc: stable@vger.kernel.org
2025-01-10	drm/amd/display: Do not wait for PSR disable on vbl enable	Leo Li
	[Why] Outside of a modeset/link configuration change, we should not have to wait for the panel to exit PSR. Depending on the panel and its state, it may take multiple frames for it to exit PSR. Therefore, waiting in all scenarios may cause perceived stuttering, especially in combination with faster vblank shutdown. [How] PSR1 disable is hooked up to the vblank enable event, and vice versa. In case of vblank enable, do not wait for panel to exit PSR, but still wait in all other cases. We also avoid a call to unnecessarily change power_opts on disable - this ends up sending another command to dmcub fw. When testing against IGT, some crc tests like kms_plane_alpha_blend and amd_hotplug were failing due to CRC timeouts. This was found to be caused by the early return before HW has fully exited PSR1. Fix this by first making sure we grab a vblank reference, then waiting for panel to exit PSR1, before programming hw for CRC generation. Fixes: 58a261bfc967 ("drm/amd/display: use a more lax vblank enable policy for older ASICs") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3743 Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit aa6713fa2046f4c09bf3013dd1420ae15603ca6f) Cc: stable@vger.kernel.org
2025-01-10	Revert "drm/amd/display: Enable urgent latency adjustments for DCN35"	Nicholas Susanto
	Revert commit 284f141f5ce5 ("drm/amd/display: Enable urgent latency adjustments for DCN35") [Why & How] The urgent latency increase caused a problem with a 2.8K OLED monitor, which blocks P0 support for this panel. Reverting this change does not reintroduce the Netflix corruption issue which it fixed. Fixes: 284f141f5ce5 ("drm/amd/display: Enable urgent latency adjustments for DCN35") Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Nicholas Susanto <Nicholas.Susanto@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c7ccfc0d4241a834c25a9a9e1e78b388b4445d23) Cc: stable@vger.kernel.org
2025-01-10	drm/amd/display: Reduce accessing remote DPCD overhead	Wayne Lin
	[Why] The frame rate was observed to drop in tools like glxgears. Even though the output to the monitor is 60Hz, the rendered frame rate drops to 30Hz or lower. This is because some code paths trigger dm_dp_mst_is_port_support_mode() to read out the remote link status to assess the available bandwidth for DSC manipulation. The overhead of repeatedly reading the remote DPCD is considerable. [How] Store the remote link BW in mst_local_bw and use the end-to-end full_pbn as an indicator to decide whether to update the remote link BW or not. Whenever we need the info to assess the BW, visit the stored one first. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3720 Fixes: fa57924c76d9 ("drm/amd/display: Refactor function dm_dp_mst_is_port_support_mode()") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4a9a918545455a5979c6232fcf61ed3d8f0db3ae) Cc: stable@vger.kernel.org
2025-01-10	drm/amd/display: Validate mdoe under MST LCT=1 case as well	Wayne Lin
	[Why & How] Currently in dm_dp_mst_is_port_support_mode(), when validating a mode under DSC decoding at the last DP link config, we only validate the case when there is a UFP. However, if the MSTB LCT=1, there is no UFP. In this case, use root_link_bw_in_kbps as the available bw to compare. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3720 Fixes: fa57924c76d9 ("drm/amd/display: Refactor function dm_dp_mst_is_port_support_mode()") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a04d9534a8a75b2806c5321c387be450c364b55e) Cc: stable@vger.kernel.org
2025-01-10	drm/amdgpu/smu13: update powersave optimizations	Alex Deucher
	Only apply when the compute profile is selected. This is the only supported configuration. Selecting other profiles can lead to performance degradations. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d477e39532d725b1cdb3c8005c689c74ffbf3b94) Cc: stable@vger.kernel.org # 6.12.x
2025-01-10	Merge tag 'platform-drivers-x86-v6.13-5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: "Fixes and new HW support: - amd/pmc: Match IRQ1 wakeup disable with the enable on i8042 side - intel: power-domains: Clearwater Forest support - intel/pmc: Skip SSRAM setup when no additional devices are present - ISST: Clearwater Forest support" * tag 'platform-drivers-x86-v6.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: intel/pmc: Fix ioremap() of bad address platform/x86: ISST: Add Clearwater Forest to support list platform/x86/intel: power-domains: Add Clearwater Forest support platform/x86/amd/pmc: Only disable IRQ1 wakeup where i8042 actually enabled it
2025-01-10	Merge tag 'gpio-fixes-for-v6.13-rc7' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: "There's one small fix for real HW - gpio-loongson. The rest concern two virtual testing drivers in which some issues were recently found and addressed: - fix resource leaks in error path in gpio-virtuser (and one consistent memory leak triggered on every device removal) - fix the use-case of having multiple con_ids in a lookup table in gpio-virtuser which has never worked (despite being advertised) - don't allow rmdir() on configfs directories when they are in use in gpio-sim and gpio-virtuser - fix register offsets in gpio-loongson-64" * tag 'gpio-fixes-for-v6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: loongson: Fix Loongson-2K2000 ACPI GPIO register offset gpio: sim: lock up configfs that an instantiated device depends on gpio: virtuser: lock up configfs that an instantiated device depends on gpio: virtuser: fix handling of multiple conn_ids in lookup table gpio: virtuser: fix missing lookup table cleanups
2025-01-10	Merge tag 'usb-serial-6.13-rc7' of ↵	Greg Kroah-Hartman
	ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial device ids for 6.13-rc7 Here are some new modem and cp210x device ids. All have been in linux-next with no reported issues. * tag 'usb-serial-6.13-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: option: add Neoway N723-EA support USB: serial: option: add MeiG Smart SRM815 USB: serial: cp210x: add Phoenix Contact UPS Device
2025-01-10	Merge tag 'mediatek-drm-fixes-20250104' of ↵	Dave Airlie
	https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20250104 1. Revert "drm/mediatek: dsi: Correct calculation formula of PHY Timing" 2. Set private->all_drm_private[i]->drm to NULL if mtk_drm_bind returns err 3. Move mtk_crtc_finish_page_flip() to ddp_cmdq_cb() 4. Only touch DISP_REG_OVL_PITCH_MSB if AFBC is supported 5. Add support for 180-degree rotation in the display driver 6. Stop selecting foreign drivers 7. Revert "drm/mediatek: Switch to for_each_child_of_node_scoped()" 8. Fix YCbCr422 color format issue for DP 9. Fix mode valid issue for dp 10. dp: Reference common DAI properties 11. dsi: Add registers to pdata to fix MT8186/MT8188 12. Remove unneeded semicolon 13. Add return value check when reading DPCD 14. Initialize pointer in mtk_drm_of_ddp_path_build_one() Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250104124227.45505-1-chunkuang.hu@kernel.org
2025-01-10	Merge tag 'drm-xe-fixes-2025-01-09' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Avoid a NULL ptr deref when wedging (Lucas) - Fix power gate sequence on DG1 (Rodrigo) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z4AcqP3Io_r0pEsR@fedora
2025-01-10	Merge tag 'amd-drm-fixes-6.13-2025-01-09' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.13-2025-01-09: amdgpu: - Display interrupt fixes - Fix display max surface mismatches - Fix divide error in DM plane scale calcs - Display divide by 0 checks in dml helpers - SMU 13 AD/DC interrupt handling fix - Fix locking around buddy trim handling amdkfd: - Fix page fault with shader debugger enabled - Fix eviction fence wq handling Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250109164236.477295-1-alexander.deucher@amd.com
2025-01-09	net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()	Sudheer Kumar Doredla
	CPSW ALE has 75-bit ALE entries stored across three 32-bit words. The cpsw_ale_get_field() and cpsw_ale_set_field() functions support ALE field entries spanning at most two words. They work as expected when an ALE field spans word1 and word2, but fail when an ALE field spans word2 and word3. For example, when reading an ALE field that spans word2 and word3 (i.e. bits 62 to 64), the word3 data is shifted to an incorrect position because the index becomes zero while flipping. The same issue occurs when setting an ALE entry. This issue has not been seen in practice but will become a problem if the driver ever supports accessing ALE fields spanning word2 and word3. Fix the methods to handle getting/setting fields spanning up to two words. Fixes: b685f1a58956 ("net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()/cpsw_ale_set_field()") Signed-off-by: Sudheer Kumar Doredla <s-doredla@ti.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Roger Quadros <rogerq@kernel.org> Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com> Link: https://patch.msgid.link/20250108172433.311694-1-s-doredla@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
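The field access described above can be sketched as follows. This is a minimal illustration and not the driver's code: the function name `get_field`, the helper `pack`, and the word layout (the last list element holds bits 0..31, mirroring the "flipped" index the commit mentions) are assumptions made for the example.

```python
WORD_BITS = 32
NUM_WORDS = 3  # a 75-bit entry stored across three 32-bit words


def pack(value):
    """Split an integer into NUM_WORDS 32-bit words.

    Assumed layout: the last word holds bits 0..31 (flipped index).
    """
    mask = (1 << WORD_BITS) - 1
    return [(value >> (WORD_BITS * (NUM_WORDS - 1 - i))) & mask
            for i in range(NUM_WORDS)]


def get_field(entry, start, bits):
    """Read `bits` bits starting at bit `start`; the field may span two words."""
    if not 0 < bits <= WORD_BITS:
        raise ValueError("a field must be 1..32 bits wide")
    if start < 0 or start + bits > NUM_WORDS * WORD_BITS:
        raise ValueError("field lies outside the entry")
    lo = start // WORD_BITS               # word holding the first bit
    hi = (start + bits - 1) // WORD_BITS  # word holding the last bit
    offset = start - lo * WORD_BITS
    value = entry[NUM_WORDS - 1 - lo] >> offset
    if hi != lo:
        # Pull in the bits that live in the next word, above those already read.
        value |= entry[NUM_WORDS - 1 - hi] << (WORD_BITS - offset)
    return value & ((1 << bits) - 1)
```

Reading bits 62 to 64, the case named in the commit message, takes one bit from the second word and two from the third, so both indices must be flipped independently.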
2025-01-09	scsi: iscsi: Fix redundant response for ISCSI_UEVENT_GET_HOST_STATS request	Xiang Zhang
	The ISCSI_UEVENT_GET_HOST_STATS request is already handled in iscsi_get_host_stats(). This fix ensures that redundant responses are skipped in iscsi_if_rx(). - On success: send reply and stats from iscsi_get_host_stats() within if_recv_msg(). - On error: fall through. Signed-off-by: Xiang Zhang <hawkxiang.cpp@gmail.com> Link: https://lore.kernel.org/r/20250107022432.65390-1-hawkxiang.cpp@gmail.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-09	scsi: core: Fix command pass through retry regression	Mike Christie
	scsi_check_passthrough() is always called, but it doesn't check for if a command completed successfully. As a result, if a command was successful and the caller used SCMD_FAILURE_RESULT_ANY to indicate what failures it wanted to retry, we will end up retrying the command. This will cause delays during device discovery because of the command being sent multiple times. For some USB devices it can also cause the wrong device size to be used. This patch adds a check for if the command was successful. If it is we return immediately instead of trying to match a failure. Fixes: 994724e6b3f0 ("scsi: core: Allow passthrough to request midlayer retries") Reported-by: Kris Karas <bugs-a21@moonlit-rail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219652 Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20250107010220.7215-1-michael.christie@oracle.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-09	Merge tag 'net-6.13-rc7' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
	Pull networking fixes from Jakub Kicinski:
	"Including fixes from netfilter, Bluetooth and WPAN. No outstanding fixes / investigations at this time.
	Current release - new code bugs:
	- eth: fbnic: revert HWMON support, it doesn't work at all and revert is similar size as the fixes
	Previous releases - regressions:
	- tcp: allow a connection when sk_max_ack_backlog is zero
	- tls: fix tls_sw_sendmsg error handling
	Previous releases - always broken:
	- netdev netlink family:
	  - prevent accessing NAPI instances from another namespace
	  - don't dump Tx and uninitialized NAPIs
	- net: sysctl: avoid using current->nsproxy, fix null-deref if task is exiting and stick to opener's netns
	- sched: sch_cake: add bounds checks to host bulk flow fairness counts
	Misc:
	- annual cleanup of inactive maintainers"
	* tag 'net-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
	rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy
	sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy
	sctp: sysctl: udp_port: avoid using current->nsproxy
	sctp: sysctl: auth_enable: avoid using current->nsproxy
	sctp: sysctl: rto_min/max: avoid using current->nsproxy
	sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy
	mptcp: sysctl: blackhole timeout: avoid using current->nsproxy
	mptcp: sysctl: sched: avoid using current->nsproxy
	mptcp: sysctl: avail sched: remove write access
	MAINTAINERS: remove Lars Povlsen from Microchip Sparx5 SoC
	MAINTAINERS: remove Noam Dagan from AMAZON ETHERNET
	MAINTAINERS: remove Ying Xue from TIPC
	MAINTAINERS: remove Mark Lee from MediaTek Ethernet
	MAINTAINERS: mark stmmac ethernet as an Orphan
	MAINTAINERS: remove Andy Gospodarek from bonding
	MAINTAINERS: update maintainers for Microchip LAN78xx
	MAINTAINERS: mark Synopsys DW XPCS as Orphan
	net/mlx5: Fix variable not being completed when function returns
	rtase: Fix a check for error in rtase_alloc_msix()
	net: stmmac: dwmac-tegra: Read iommu stream id from device tree
	...
2025-01-09	Merge patch series "SBI PMU event related fixes"	Palmer Dabbelt
	Atish Patra <atishp@rivosinc.com> says: Here are two minor improvement/fixes in the PMU event path. The first patch was part of the series[1]. The 2nd patch was suggested during the series review. While the series can only be merged once SBI v3.0 is frozen, these two patches can be independent of SBI v3.0 and can be merged sooner. Hence, these two patches are sent as a separate series. * b4-shazam-merge: drivers/perf: riscv: Do not allow invalid raw event config drivers/perf: riscv: Return error for default case drivers/perf: riscv: Fix Platform firmware event data Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-0-813e8a4f5962@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09	drivers/perf: riscv: Do not allow invalid raw event config	Atish Patra
	The SBI specification allows only the lower 48 bits of hpmeventX to be configured via SBI PMU. Currently, the driver masks off the higher bits but doesn't return an error. This leads to an additional SBI call for config matching, which should return an invalid event error in most cases. However, if a platform (i.e. Rocket and SiFive cores) implements a bitmap of all bits in the event encoding, this will lead to an incorrect event being programmed, leading to user confusion. Report the error to the user if higher bits are set during the event mapping itself to avoid the confusion and save an additional SBI call. Suggested-by: Samuel Holland <samuel.holland@sifive.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-3-813e8a4f5962@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
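The difference between silently masking and rejecting an out-of-range raw event can be sketched as below. This is an illustration only: the names `RAW_EVENT_MASK`, `map_raw_event_masking` and `map_raw_event_checked` are hypothetical, and a Python exception stands in for whatever error code the driver actually returns.

```python
RAW_EVENT_BITS = 48  # only the lower 48 bits may be configured, per the commit message
RAW_EVENT_MASK = (1 << RAW_EVENT_BITS) - 1


def map_raw_event_masking(config):
    """Old behaviour as described: drop the high bits without reporting anything."""
    return config & RAW_EVENT_MASK


def map_raw_event_checked(config):
    """New behaviour as described: report an error when any high bit is set."""
    if config & ~RAW_EVENT_MASK:
        raise ValueError("raw event config uses bits above bit 47")
    return config
```

With masking, a config with bit 48 set is quietly turned into a different event; with the check, the caller learns about the invalid config before any further call is made.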
2025-01-09	drivers/perf: riscv: Return error for default case	Atish Patra
	If the upper two bits have an invalid value (0x1), the event mapping is not reliable as it returns an uninitialized variable. Return an appropriate value for the default case. Fixes: f0c9363db2dd ("perf/riscv-sbi: Add platform specific firmware event handling") Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-2-813e8a4f5962@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09	drivers/perf: riscv: Fix Platform firmware event data	Atish Patra
	The platform firmware event data field is allowed to be 62 bits for Linux, as the uppermost two bits are reserved to indicate SBI fw or platform specific firmware events. However, the event data field is masked as per the hardware raw event mask, which is not correct. Fix the platform firmware event data field with the proper mask. Fixes: f0c9363db2dd ("perf/riscv-sbi: Add platform specific firmware event handling") Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-1-813e8a4f5962@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09	net/mlx5: Fix variable not being completed when function returns	Chenguang Zhao
	When cmd_alloc_index() fails, cmd_work_handler() needs to complete ent->slotted before returning early. Otherwise the task which issued the command may hang: mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry INFO: task kworker/13:2:4055883 blocked for more than 120 seconds. Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/13:2 D 0 4055883 2 0x00000228 Workqueue: events mlx5e_tx_dim_work [mlx5_core] Call trace: __switch_to+0xe8/0x150 __schedule+0x2a8/0x9b8 schedule+0x2c/0x88 schedule_timeout+0x204/0x478 wait_for_common+0x154/0x250 wait_for_completion+0x28/0x38 cmd_exec+0x7a0/0xa00 [mlx5_core] mlx5_cmd_exec+0x54/0x80 [mlx5_core] mlx5_core_modify_cq+0x6c/0x80 [mlx5_core] mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core] mlx5e_tx_dim_work+0x54/0x68 [mlx5_core] process_one_work+0x1b0/0x448 worker_thread+0x54/0x468 kthread+0x134/0x138 ret_from_fork+0x10/0x18 Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command queue semaphore") Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250108030009.68520-1-zhaochenguang@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-09	rtase: Fix a check for error in rtase_alloc_msix()	Dan Carpenter
	The pci_irq_vector() function never returns zero. It returns negative error codes or a positive non-zero IRQ number. Fix the error checking to test for negatives. Fixes: a36e9f5cfe9e ("rtase: Add support for a pci table in this module") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/f2ecc88d-af13-4651-9820-7cc665230019@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
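The return convention described above (a negative error code or a positive IRQ number, never zero) can be sketched as follows. `fake_irq_vector` is a hypothetical stand-in for illustration, not the kernel function, and the two checker names are likewise invented for this example.

```python
EINVAL = 22


def fake_irq_vector(index, vectors):
    """Hypothetical stand-in: negative errno on failure, positive IRQ otherwise."""
    if index < 0 or index >= len(vectors):
        return -EINVAL
    return vectors[index]


def zero_check_failed(ret):
    """A test for zero: never fires under this convention, so errors slip through."""
    return ret == 0


def negative_check_failed(ret):
    """A test for negatives: matches the convention the commit message describes."""
    return ret < 0
```

A failed lookup yields a negative value, which the zero test treats as a valid IRQ number while the negative test catches it.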
2025-01-09	net: stmmac: dwmac-tegra: Read iommu stream id from device tree	Parker Newman
	Nvidia's Tegra MGBE controllers require the IOMMU "Stream ID" (SID) to be written to the MGBE_WRAP_AXI_ASID0_CTRL register. The current driver is hard coded to use MGBE0's SID for all controllers. This causes softirq time outs and kernel panics when using controllers other than MGBE0. Example dmesg errors when an ethernet cable is connected to MGBE1: [ 116.133290] tegra-mgbe 6910000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx [ 121.851283] tegra-mgbe 6910000.ethernet eth1: NETDEV WATCHDOG: CPU: 5: transmit queue 0 timed out 5690 ms [ 121.851782] tegra-mgbe 6910000.ethernet eth1: Reset adapter. [ 121.892464] tegra-mgbe 6910000.ethernet eth1: Register MEM_TYPE_PAGE_POOL RxQ-0 [ 121.905920] tegra-mgbe 6910000.ethernet eth1: PHY [stmmac-1:00] driver [Aquantia AQR113] (irq=171) [ 121.907356] tegra-mgbe 6910000.ethernet eth1: Enabling Safety Features [ 121.907578] tegra-mgbe 6910000.ethernet eth1: IEEE 1588-2008 Advanced Timestamp supported [ 121.908399] tegra-mgbe 6910000.ethernet eth1: registered PTP clock [ 121.908582] tegra-mgbe 6910000.ethernet eth1: configuring for phy/10gbase-r link mode [ 125.961292] tegra-mgbe 6910000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx [ 181.921198] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 181.921404] rcu: 7-....: (1 GPs behind) idle=540c/1/0x4000000000000002 softirq=1748/1749 fqs=2337 [ 181.921684] rcu: (detected by 4, t=6002 jiffies, g=1357, q=1254 ncpus=8) [ 181.921878] Sending NMI from CPU 4 to CPUs 7: [ 181.921886] NMI backtrace for cpu 7 [ 181.922131] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.13.0-rc3+ #6 [ 181.922390] Hardware name: NVIDIA CTI Forge + Orin AGX/Jetson, BIOS 202402.1-Unknown 10/28/2024 [ 181.922658] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 181.922847] pc : handle_softirqs+0x98/0x368 [ 181.922978] lr : __do_softirq+0x18/0x20 [ 181.923095] sp : ffff80008003bf50 [ 181.923189] x29: ffff80008003bf50 x28: 0000000000000008 x27: 
0000000000000000 [ 181.923379] x26: ffffce78ea277000 x25: 0000000000000000 x24: 0000001c61befda0 [ 181.924486] x23: 0000000060400009 x22: ffffce78e99918bc x21: ffff80008018bd70 [ 181.925568] x20: ffffce78e8bb00d8 x19: ffff80008018bc20 x18: 0000000000000000 [ 181.926655] x17: ffff318ebe7d3000 x16: ffff800080038000 x15: 0000000000000000 [ 181.931455] x14: ffff000080816680 x13: ffff318ebe7d3000 x12: 000000003464d91d [ 181.938628] x11: 0000000000000040 x10: ffff000080165a70 x9 : ffffce78e8bb0160 [ 181.945804] x8 : ffff8000827b3160 x7 : f9157b241586f343 x6 : eeb6502a01c81c74 [ 181.953068] x5 : a4acfcdd2e8096bb x4 : ffffce78ea277340 x3 : 00000000ffffd1e1 [ 181.960329] x2 : 0000000000000101 x1 : ffffce78ea277340 x0 : ffff318ebe7d3000 [ 181.967591] Call trace: [ 181.970043] handle_softirqs+0x98/0x368 (P) [ 181.974240] __do_softirq+0x18/0x20 [ 181.977743] ____do_softirq+0x14/0x28 [ 181.981415] call_on_irq_stack+0x24/0x30 [ 181.985180] do_softirq_own_stack+0x20/0x30 [ 181.989379] __irq_exit_rcu+0x114/0x140 [ 181.993142] irq_exit_rcu+0x14/0x28 [ 181.996816] el1_interrupt+0x44/0xb8 [ 182.000316] el1h_64_irq_handler+0x14/0x20 [ 182.004343] el1h_64_irq+0x80/0x88 [ 182.007755] cpuidle_enter_state+0xc4/0x4a8 (P) [ 182.012305] cpuidle_enter+0x3c/0x58 [ 182.015980] cpuidle_idle_call+0x128/0x1c0 [ 182.020005] do_idle+0xe0/0xf0 [ 182.023155] cpu_startup_entry+0x3c/0x48 [ 182.026917] secondary_start_kernel+0xdc/0x120 [ 182.031379] __secondary_switched+0x74/0x78 [ 212.971162] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 7-.... } 6103 jiffies s: 417 root: 0x80/. 
[ 212.985935] rcu: blocking rcu_node structures (internal RCU debug): [ 212.992758] Sending NMI from CPU 0 to CPUs 7: [ 212.998539] NMI backtrace for cpu 7 [ 213.004304] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.13.0-rc3+ #6 [ 213.016116] Hardware name: NVIDIA CTI Forge + Orin AGX/Jetson, BIOS 202402.1-Unknown 10/28/2024 [ 213.030817] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 213.040528] pc : handle_softirqs+0x98/0x368 [ 213.046563] lr : __do_softirq+0x18/0x20 [ 213.051293] sp : ffff80008003bf50 [ 213.055839] x29: ffff80008003bf50 x28: 0000000000000008 x27: 0000000000000000 [ 213.067304] x26: ffffce78ea277000 x25: 0000000000000000 x24: 0000001c61befda0 [ 213.077014] x23: 0000000060400009 x22: ffffce78e99918bc x21: ffff80008018bd70 [ 213.087339] x20: ffffce78e8bb00d8 x19: ffff80008018bc20 x18: 0000000000000000 [ 213.097313] x17: ffff318ebe7d3000 x16: ffff800080038000 x15: 0000000000000000 [ 213.107201] x14: ffff000080816680 x13: ffff318ebe7d3000 x12: 000000003464d91d [ 213.116651] x11: 0000000000000040 x10: ffff000080165a70 x9 : ffffce78e8bb0160 [ 213.127500] x8 : ffff8000827b3160 x7 : 0a37b344852820af x6 : 3f049caedd1ff608 [ 213.138002] x5 : cff7cfdbfaf31291 x4 : ffffce78ea277340 x3 : 00000000ffffde04 [ 213.150428] x2 : 0000000000000101 x1 : ffffce78ea277340 x0 : ffff318ebe7d3000 [ 213.162063] Call trace: [ 213.165494] handle_softirqs+0x98/0x368 (P) [ 213.171256] __do_softirq+0x18/0x20 [ 213.177291] ____do_softirq+0x14/0x28 [ 213.182017] call_on_irq_stack+0x24/0x30 [ 213.186565] do_softirq_own_stack+0x20/0x30 [ 213.191815] __irq_exit_rcu+0x114/0x140 [ 213.196891] irq_exit_rcu+0x14/0x28 [ 213.202401] el1_interrupt+0x44/0xb8 [ 213.207741] el1h_64_irq_handler+0x14/0x20 [ 213.213519] el1h_64_irq+0x80/0x88 [ 213.217541] cpuidle_enter_state+0xc4/0x4a8 (P) [ 213.224364] cpuidle_enter+0x3c/0x58 [ 213.228653] cpuidle_idle_call+0x128/0x1c0 [ 213.233993] do_idle+0xe0/0xf0 [ 213.237928] cpu_startup_entry+0x3c/0x48 [ 213.243791] 
secondary_start_kernel+0xdc/0x120 [ 213.249830] __secondary_switched+0x74/0x78 This bug has existed since the dwmac-tegra driver was added in Dec 2022 (see Fixes tag below for commit hash). The Tegra234 SOC has 4 MGBE controllers, however Nvidia's Developer Kit only uses MGBE0 which is why the bug was not found previously. Connect Tech has many products that use 2 (or more) MGBE controllers. The solution is to read the controller's SID from the existing "iommus" device tree property. The 2nd field of the "iommus" device tree property is the controller's SID. Device tree snippet from tegra234.dtsi showing MGBE1's "iommus" property: smmu_niso0: iommu@12000000 { compatible = "nvidia,tegra234-smmu", "nvidia,smmu-500"; ... } /* MGBE1 */ ethernet@6900000 { compatible = "nvidia,tegra234-mgbe"; ... iommus = <&smmu_niso0 TEGRA234_SID_MGBE_VF1>; ... } Nvidia's arm-smmu driver reads the "iommus" property and stores the SID in the MGBE device's "fwspec" struct. The dwmac-tegra driver can access the SID using the tegra_dev_iommu_get_stream_id() helper function found in linux/iommu.h. Calling tegra_dev_iommu_get_stream_id() should not fail unless the "iommus" property is removed from the device tree or the IOMMU is disabled. While the Tegra234 SOC technically supports bypassing the IOMMU, it is not supported by the current firmware, has not been tested, and is not recommended. More detailed discussion with Thierry Reding from Nvidia linked below. Fixes: d8ca113724e7 ("net: stmmac: tegra: Add MGBE support") Link: https://lore.kernel.org/netdev/cover.1731685185.git.pnewman@connecttech.com Signed-off-by: Parker Newman <pnewman@connecttech.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/6fb97f32cf4accb4f7cf92846f6b60064ba0a3bd.1736284360.git.pnewman@connecttech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-09	mctp i3c: fix MCTP I3C driver multi-thread issue	Leo Yang
	We found a timeout problem with the pldm command on our system. The reason is that the MCTP-I3C driver has a race condition when receiving multiple-packet messages in multi-thread, resulting in a wrong packet order problem. We identified this problem by adding a debug message to the mctp_i3c_read function. According to the MCTP spec, a multiple-packet message must be composed in sequence, and if there is a wrong sequence, the whole message will be discarded and wait for the next SOM. For example, SOM → Pkt Seq #2 → Pkt Seq #1 → Pkt Seq #3 → EOM. Therefore, we try to solve this problem by adding a mutex to the mctp_i3c_read function. Before the modification, when a command requesting a multiple-packet message response is sent consecutively, an error usually occurs within 100 loops. After the mutex, it can go through 40000 loops without any error, and it seems to run well. Fixes: c8755b29b58e ("mctp i3c: MCTP I3C driver") Signed-off-by: Leo Yang <Leo-Yang@quantatw.com> Link: https://patch.msgid.link/20250107031529.3296094-1-Leo-Yang@quantatw.com [pabeni@redhat.com: dropped already answered question from changelog] Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-09	i2c: atr: Fix client detach	Tomi Valkeinen
	i2c-atr catches the BUS_NOTIFY_DEL_DEVICE event on the bus and removes the translation by calling i2c_atr_detach_client(). However, BUS_NOTIFY_DEL_DEVICE happens when the device is about to be removed from this bus, i.e. before removal, and thus before calling .remove() on the driver. If the driver happens to do any i2c transactions in its remove(), they will fail. Fix this by catching BUS_NOTIFY_REMOVED_DEVICE instead, thus removing the translation only after the device is actually removed. Fixes: a076a860acae ("media: i2c: add I2C Address Translator (ATR) support") Cc: stable@vger.kernel.org Signed-off-by: Tomi Valkeinen <tomi.valkeinen+renesas@ideasonboard.com> Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Reviewed-by: Romain Gantois <romain.gantois@bootlin.com> Tested-by: Romain Gantois <romain.gantois@bootlin.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2025-01-09	i2c: core: fix reference leak in i2c_register_adapter()	Joe Hattori
	The reference count of the device incremented in device_initialize() is not decremented when device_add() fails. Add a put_device() call before returning from the function. This bug was found by an experimental static analysis tool that I am developing. Fixes: 60f68597024d ("i2c: core: Setup i2c_adapter runtime-pm before calling device_add()") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
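The reference-count rule behind this fix can be sketched in userspace C. This is a toy model, not the i2c core code: `toy_device`, `toy_device_initialize()`, `toy_device_add()`, `toy_put_device()` and `toy_register()` are hypothetical stand-ins for `struct device`, `device_initialize()`, `device_add()`, `put_device()` and the registration path.

```c
#include <stdbool.h>

/* Toy stand-in for struct device: only the reference count is modelled. */
struct toy_device {
    int refcount;
    bool released;
};

/* Models device_initialize(): the caller now holds one reference. */
static void toy_device_initialize(struct toy_device *dev)
{
    dev->refcount = 1;
    dev->released = false;
}

/* Models put_device(): dropping the last reference releases the object. */
static void toy_put_device(struct toy_device *dev)
{
    if (--dev->refcount == 0)
        dev->released = true;
}

/* Models device_add(); 'fail' forces the error path for illustration. */
static int toy_device_add(struct toy_device *dev, bool fail)
{
    (void)dev;
    return fail ? -1 : 0;
}

/* Registration with the fix applied: drop the reference when add fails. */
static int toy_register(struct toy_device *dev, bool fail)
{
    int ret;

    toy_device_initialize(dev);
    ret = toy_device_add(dev, fail);
    if (ret) {
        toy_put_device(dev); /* without this, the reference would leak */
        return ret;
    }
    return 0;
}
```

In this model, omitting the `toy_put_device()` call on the error path leaves `refcount` at 1 forever, which is the leak the commit describes.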
2025-01-09	drm/xe/dg1: Fix power gate sequence.	Rodrigo Vivi
	sub-pipe PG is not present on DG1. Setting these bits can disable other power gates and cause GPU hangs on video playbacks. VLK: 16314, 4304 Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13381 Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241219235536.454270-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 2f12e9c029315c1400059b2e7fdf53117c09c3a9) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-01-09	drm/xe: Fix tlb invalidation when wedging	Lucas De Marchi
	If GuC fails to load, the driver wedges, but in the process it tries to do stuff that may not be initialized yet. This moves the xe_gt_tlb_invalidation_init() to be done earlier: as its own doc says, it's a software-only initialization and should have been named with the _early() suffix. Move it to be called by xe_gt_init_early(), so the locks and seqno are initialized, avoiding a NULL ptr deref when wedging: xe 0000:03:00.0: [drm] ERROR GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01 xe 0000:03:00.0: [drm] ERROR GT0: firmware signature verification failed xe 0000:03:00.0: [drm] ERROR CRITICAL: Xe has declared device 0000:03:00.0 as wedged. ... BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 9 UID: 0 PID: 3908 Comm: modprobe Tainted: G U W 6.13.0-rc4-xe+ #3 Tainted: [U]=USER, [W]=WARN Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-S ADP-S DDR5 UDIMM CRB, BIOS ADLSFWI1.R00.3275.A00.2207010640 07/01/2022 RIP: 0010:xe_gt_tlb_invalidation_reset+0x75/0x110 [xe] This can be easily triggered by poking the GuC binary to force a signature failure. There will still be an extra message, xe 0000:03:00.0: [drm] ERROR GT0: GuC mmio request 0x4100: no reply 0x4100 but that's better than a NULL ptr deref. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3956 Fixes: c9474b726b93 ("drm/xe: Wedge the entire device") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250103001111.331684-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 5001ef3af8f2c972d6fd9c5221a8457556f8bea6) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-01-08	Merge branch '100GbE' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-01-07 (ice, igc) For ice: Arkadiusz corrects mask value being used to determine DPLL phase range. Przemyslaw corrects frequency value for E823 devices. For igc: En-Wei Wu adds a check and an early return for a failed register read. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: igc: return early when failing to read EECD register ice: fix incorrect PHY settings for 100 GB/s ice: fix max values for dpll pin phase adjust ==================== Link: https://patch.msgid.link/20250107190150.1758577-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08	Merge tag 'for-net-2025-01-08' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - btmtk: Fix failed to send func ctrl for MediaTek devices. - hci_sync: Fix not setting Random Address when required - MGMT: Fix Add Device to responding before completing - btnxpuart: Fix driver sending truncated data * tag 'for-net-2025-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: btmtk: Fix failed to send func ctrl for MediaTek devices. Bluetooth: btnxpuart: Fix driver sending truncated data Bluetooth: MGMT: Fix Add Device to responding before completing Bluetooth: hci_sync: Fix not setting Random Address when required ==================== Link: https://patch.msgid.link/20250108162627.1623760-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08	Merge tag 'scsi-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four driver fixes in UFS, mostly to do with power management" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: qcom: Power down the controller/device during system suspend for SM8550/SM8650 SoCs scsi: ufs: qcom: Allow passing platform specific OF data scsi: ufs: core: Honor runtime/system PM levels if set by host controller drivers scsi: ufs: qcom: Power off the PHY if it was already powered on in ufs_qcom_power_up_sequence()
2025-01-08	cpuidle: riscv-sbi: fix device node release in early exit of ↵	Javier Carrasco
	for_each_possible_cpu The 'np' device_node is initialized via of_cpu_device_node_get(), which requires explicit calls to of_node_put() when it is no longer required to avoid leaking the resource. Instead of adding the missing calls to of_node_put() in all execution paths, use the cleanup attribute for 'np' by means of the __free() macro, which automatically calls of_node_put() when the variable goes out of scope. Given that 'np' is only used within the for_each_possible_cpu(), reduce its scope to release the node after every iteration of the loop. Fixes: 6abf32f1d9c5 ("cpuidle: Add RISC-V SBI CPU idle driver") Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://lore.kernel.org/r/20241116-cpuidle-riscv-sbi-cleanup-v3-1-a3a46372ce08@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
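The scope-based release described above relies on the compiler's cleanup attribute (a GCC/Clang extension). A minimal userspace sketch follows, assuming a hypothetical refcounted `toy_node` in place of `struct device_node`; `TOY_FREE_NODE`, `toy_node_get()`, `toy_node_put()` and `toy_scan()` are illustrative names, not kernel APIs.

```c
#include <stddef.h>

/* Hypothetical refcounted node standing in for struct device_node. */
struct toy_node {
    int refcount;
};

static struct toy_node *toy_node_get(struct toy_node *n)
{
    n->refcount++;
    return n;
}

static void toy_node_put(struct toy_node *n)
{
    if (n)
        n->refcount--;
}

/* Cleanup handlers receive a pointer to the annotated variable. */
static void toy_node_put_cleanup(struct toy_node **np)
{
    toy_node_put(*np);
}

/* Plays the role of the kernel's __free(device_node) annotation. */
#define TOY_FREE_NODE __attribute__((cleanup(toy_node_put_cleanup)))

/*
 * Takes a reference on each node in turn and bails out early if a node
 * is already referenced elsewhere. Because 'np' is declared inside the
 * loop body, the reference is dropped at the end of every iteration
 * and also on the early return.
 */
static int toy_scan(struct toy_node *nodes, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        struct toy_node *np TOY_FREE_NODE = toy_node_get(&nodes[i]);

        if (np->refcount != 1)
            return -1; /* early exit: cleanup still runs */
    }
    return 0;
}
```

Declaring the variable in the narrowest scope is what makes the release happen once per iteration rather than once per function call.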
2025-01-08	net: hns3: fix kernel crash when 1588 is sent on HIP08 devices	Jie Wang
	Currently, HIP08 devices does not register the ptp devices, so the hdev->ptp is NULL. But the tx process would still try to set hardware time stamp info with SKBTX_HW_TSTAMP flag and cause a kernel crash. [ 128.087798] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018 ... [ 128.280251] pc : hclge_ptp_set_tx_info+0x2c/0x140 [hclge] [ 128.286600] lr : hclge_ptp_set_tx_info+0x20/0x140 [hclge] [ 128.292938] sp : ffff800059b93140 [ 128.297200] x29: ffff800059b93140 x28: 0000000000003280 [ 128.303455] x27: ffff800020d48280 x26: ffff0cb9dc814080 [ 128.309715] x25: ffff0cb9cde93fa0 x24: 0000000000000001 [ 128.315969] x23: 0000000000000000 x22: 0000000000000194 [ 128.322219] x21: ffff0cd94f986000 x20: 0000000000000000 [ 128.328462] x19: ffff0cb9d2a166c0 x18: 0000000000000000 [ 128.334698] x17: 0000000000000000 x16: ffffcf1fc523ed24 [ 128.340934] x15: 0000ffffd530a518 x14: 0000000000000000 [ 128.347162] x13: ffff0cd6bdb31310 x12: 0000000000000368 [ 128.353388] x11: ffff0cb9cfbc7070 x10: ffff2cf55dd11e02 [ 128.359606] x9 : ffffcf1f85a212b4 x8 : ffff0cd7cf27dab0 [ 128.365831] x7 : 0000000000000a20 x6 : ffff0cd7cf27d000 [ 128.372040] x5 : 0000000000000000 x4 : 000000000000ffff [ 128.378243] x3 : 0000000000000400 x2 : ffffcf1f85a21294 [ 128.384437] x1 : ffff0cb9db520080 x0 : ffff0cb9db500080 [ 128.390626] Call trace: [ 128.393964] hclge_ptp_set_tx_info+0x2c/0x140 [hclge] [ 128.399893] hns3_nic_net_xmit+0x39c/0x4c4 [hns3] [ 128.405468] xmit_one.constprop.0+0xc4/0x200 [ 128.410600] dev_hard_start_xmit+0x54/0xf0 [ 128.415556] sch_direct_xmit+0xe8/0x634 [ 128.420246] __dev_queue_xmit+0x224/0xc70 [ 128.425101] dev_queue_xmit+0x1c/0x40 [ 128.429608] ovs_vport_send+0xac/0x1a0 [openvswitch] [ 128.435409] do_output+0x60/0x17c [openvswitch] [ 128.440770] do_execute_actions+0x898/0x8c4 [openvswitch] [ 128.446993] ovs_execute_actions+0x64/0xf0 [openvswitch] [ 128.453129] ovs_dp_process_packet+0xa0/0x224 [openvswitch] [ 128.459530] 
ovs_vport_receive+0x7c/0xfc [openvswitch] [ 128.465497] internal_dev_xmit+0x34/0xb0 [openvswitch] [ 128.471460] xmit_one.constprop.0+0xc4/0x200 [ 128.476561] dev_hard_start_xmit+0x54/0xf0 [ 128.481489] __dev_queue_xmit+0x968/0xc70 [ 128.486330] dev_queue_xmit+0x1c/0x40 [ 128.490856] ip_finish_output2+0x250/0x570 [ 128.495810] __ip_finish_output+0x170/0x1e0 [ 128.500832] ip_finish_output+0x3c/0xf0 [ 128.505504] ip_output+0xbc/0x160 [ 128.509654] ip_send_skb+0x58/0xd4 [ 128.513892] udp_send_skb+0x12c/0x354 [ 128.518387] udp_sendmsg+0x7a8/0x9c0 [ 128.522793] inet_sendmsg+0x4c/0x8c [ 128.527116] __sock_sendmsg+0x48/0x80 [ 128.531609] __sys_sendto+0x124/0x164 [ 128.536099] __arm64_sys_sendto+0x30/0x5c [ 128.540935] invoke_syscall+0x50/0x130 [ 128.545508] el0_svc_common.constprop.0+0x10c/0x124 [ 128.551205] do_el0_svc+0x34/0xdc [ 128.555347] el0_svc+0x20/0x30 [ 128.559227] el0_sync_handler+0xb8/0xc0 [ 128.563883] el0_sync+0x160/0x180 Fixes: 0bf5eb788512 ("net: hns3: add support for PTP") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250106143642.539698-8-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08	net: hns3: fixed hclge_fetch_pf_reg accesses bar space out of bounds issue	Hao Lan
	The TQP BAR space is divided into two segments. TQPs 0-1023 and TQPs 1024-1279 are in different BAR space addresses. However, hclge_fetch_pf_reg does not distinguish the tqp space information when reading the tqp space information. When the number of TQPs is greater than 1024, access bar space overwriting occurs. The problem of different segments has been considered during the initialization of tqp.io_base. Therefore, tqp.io_base is directly used when the queue is read in hclge_fetch_pf_reg. The error message: Unable to handle kernel paging request at virtual address ffff800037200000 pc : hclge_fetch_pf_reg+0x138/0x250 [hclge] lr : hclge_get_regs+0x84/0x1d0 [hclge] Call trace: hclge_fetch_pf_reg+0x138/0x250 [hclge] hclge_get_regs+0x84/0x1d0 [hclge] hns3_get_regs+0x2c/0x50 [hns3] ethtool_get_regs+0xf4/0x270 dev_ethtool+0x674/0x8a0 dev_ioctl+0x270/0x36c sock_do_ioctl+0x110/0x2a0 sock_ioctl+0x2ac/0x530 __arm64_sys_ioctl+0xa8/0x100 invoke_syscall+0x4c/0x124 el0_svc_common.constprop.0+0x140/0x15c do_el0_svc+0x30/0xd0 el0_svc+0x1c/0x2c el0_sync_handler+0xb0/0xb4 el0_sync+0x168/0x180 Fixes: 939ccd107ffc ("net: hns3: move dump regs function to a separate file") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20250106143642.539698-7-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
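The two-segment layout can be illustrated with a small address calculation. The sketch below is not the hclge code: the base offsets and register stride are made-up values chosen only to show why a single linear formula goes out of bounds once the queue index reaches 1024, and `toy_tqp_io_offset()` and `toy_tqp_linear_offset()` are hypothetical helpers.

```c
#include <stdint.h>

/* Made-up layout for illustration; the real offsets live in the driver. */
#define TOY_TQP_SEG0_BASE   0x080000u /* queues 0..1023 */
#define TOY_TQP_SEG1_BASE   0x200000u /* queues 1024 and up */
#define TOY_TQP_REG_SIZE    0x200u    /* per-queue register stride */
#define TOY_TQP_SEG0_QUEUES 1024u

/* Segment-aware offset, analogous to what tqp.io_base already encodes. */
static uint64_t toy_tqp_io_offset(uint32_t index)
{
    if (index < TOY_TQP_SEG0_QUEUES)
        return TOY_TQP_SEG0_BASE + (uint64_t)index * TOY_TQP_REG_SIZE;

    return TOY_TQP_SEG1_BASE +
           (uint64_t)(index - TOY_TQP_SEG0_QUEUES) * TOY_TQP_REG_SIZE;
}

/* Naive linear offset: correct only inside the first segment. */
static uint64_t toy_tqp_linear_offset(uint32_t index)
{
    return TOY_TQP_SEG0_BASE + (uint64_t)index * TOY_TQP_REG_SIZE;
}
```

With these assumed values the two helpers agree for indexes 0 to 1023 and diverge from index 1024 onwards, which is the range where the linear formula points past the end of the first segment.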
2025-01-08	net: hns3: initialize reset_timer before hclgevf_misc_irq_init()	Jian Shen
	Currently the misc irq is initialized before reset_timer setup. But it will access the reset_timer in the irq handler. So initialize the reset_timer earlier. Fixes: ff200099d271 ("net: hns3: remove unnecessary work in hclgevf_main") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20250106143642.539698-6-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08	net: hns3: don't auto enable misc vector	Jian Shen
	Currently, there is a time window between misc irq enabled and service task inited. If an interrupt is reported at this time, it will cause a warning like the one below: [ 16.324639] Call trace: [ 16.324641] __queue_delayed_work+0xb8/0xe0 [ 16.324643] mod_delayed_work_on+0x78/0xd0 [ 16.324655] hclge_errhand_task_schedule+0x58/0x90 [hclge] [ 16.324662] hclge_misc_irq_handle+0x168/0x240 [hclge] [ 16.324666] __handle_irq_event_percpu+0x64/0x1e0 [ 16.324667] handle_irq_event+0x80/0x170 [ 16.324670] handle_fasteoi_edge_irq+0x110/0x2bc [ 16.324671] __handle_domain_irq+0x84/0xfc [ 16.324673] gic_handle_irq+0x88/0x2c0 [ 16.324674] el1_irq+0xb8/0x140 [ 16.324677] arch_cpu_idle+0x18/0x40 [ 16.324679] default_idle_call+0x5c/0x1bc [ 16.324682] cpuidle_idle_call+0x18c/0x1c4 [ 16.324684] do_idle+0x174/0x17c [ 16.324685] cpu_startup_entry+0x30/0x6c [ 16.324687] secondary_start_kernel+0x1a4/0x280 [ 16.324688] ---[ end trace 6aa0bff672a964aa ]--- So don't auto enable misc vector when requesting the irq. Fixes: 7be1b9f3e99f ("net: hns3: make hclge_service use delayed workqueue") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20250106143642.539698-5-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08	net: hns3: Resolved the issue that the debugfs query result is inconsistent.	Hao Lan
	This patch modifies the implementation of debugfs: When the user process stops unexpectedly, not all data of the file system is read. In this case, the save_buf pointer is not released. When the user process is called next time, save_buf is used to copy the cached data to the user space. As a result, the queried data is stale. To solve this problem, this patch implements .open() and .release() handlers for the debugfs file_operations, moving buffer allocation and execution of the cmd to the .open() handler and freeing the buffer in the .release() handler. A separate buffer is allocated for each reader and associated with the file pointer. Because different user read processes no longer share the buffer, the stale data problem is fixed. Fixes: 5e69ea7ee2a6 ("net: hns3: refactor the debugfs process") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Guangwei Zhang <zhangwangwei6@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250106143642.539698-4-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08	net: hns3: fix missing features due to dev->features configuration too early	Hao Lan
	Currently, the netdev->features is configured in hns3_nic_set_features. As a result, __netdev_update_features considers that there is no feature difference, and the procedures of the real features are missing. Fixes: 2a7556bb2b73 ("net: hns3: implement ndo_features_check ops for hns3 driver") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250106143642.539698-3-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08	net: hns3: fixed reset failure issues caused by the incorrect reset type	Hao Lan
	When a reset type that is not supported by the driver is input, a reset pending flag bit of the HNAE3_NONE_RESET type is generated in reset_pending. The driver does not have a mechanism to clear this type of error. As a result, the driver considers that the reset is not complete. This patch provides a mechanism to clear the HNAE3_NONE_RESET flag and the parameter of hnae3_ae_ops.set_default_reset_request is verified. The error message: hns3 0000:39:01.0: cmd failed -16 hns3 0000:39:01.0: hclge device re-init failed, VF is disabled! hns3 0000:39:01.0: failed to reset VF stack hns3 0000:39:01.0: failed to reset VF(4) hns3 0000:39:01.0: prepare reset(2) wait done hns3 0000:39:01.0 eth4: already uninitialized Use the crash tool to view struct hclgevf_dev: struct hclgevf_dev { ... default_reset_request = 0x20, reset_level = HNAE3_NONE_RESET, reset_pending = 0x100, reset_type = HNAE3_NONE_RESET, ... }; Fixes: 720bd5837e37 ("net: hns3: add set_default_reset_request in the hnae3_ae_ops") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20250106143642.539698-2-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08	Merge tag 'for-6.13/dm-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mikulas Patocka: - dm-array fixes - dm-verity forward error correction fixes - remove the flag DM_TARGET_PASSES_INTEGRITY from dm-ebs - dm-thin RCU list fix * tag 'for-6.13/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm thin: make get_first_thin use rcu-safe list first function dm-ebs: don't set the flag DM_TARGET_PASSES_INTEGRITY dm-verity FEC: Avoid copying RS parity bytes twice. dm-verity FEC: Fix RS FEC repair for roots unaligned to block size (take 2) dm array: fix cursor index when skipping across block boundaries dm array: fix unreleased btree blocks on closing a faulty array cursor dm array: fix releasing a faulty array block twice in dm_array_cursor_end
2025-01-08	Bluetooth: btmtk: Fix failed to send func ctrl for MediaTek devices.	Chris Lu
	Use usb_autopm_get_interface() and usb_autopm_put_interface() in btmtk_usb_shutdown() so that func ctrl can still be sent after autosuspend has been enabled. Bluetooth: btmtk_usb_hci_wmt_sync() hci0: Execution of wmt command timed out Bluetooth: btmtk_usb_shutdown() hci0: Failed to send wmt func ctrl (-110) Fixes: 5c5e8c52e3ca ("Bluetooth: btmtk: move btusb_mtk_[setup, shutdown] to btmtk.c") Signed-off-by: Chris Lu <chris.lu@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-01-08	Bluetooth: btnxpuart: Fix driver sending truncated data	Neeraj Sanjay Kale
	This fixes the apparent controller hang issue seen during stress test where the host sends a truncated payload, followed by HCI commands. The controller treats these HCI commands as a part of previously truncated payload, leading to command timeouts. Adding a serdev_device_wait_until_sent() call after serdev_device_write_buf() fixed the issue. Fixes: 689ca16e5232 ("Bluetooth: NXP: Add protocol support for NXP Bluetooth chipsets") Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-01-08	misc: microchip: pci1xxxx: Resolve return code mismatch during GPIO set config	Rengarajan S
	Driver returns -EOPNOTSUPPORTED on unsupported parameters case in set config. Upper level driver checks for -ENOTSUPP. Because of the return code mismatch, the ioctls from userspace fail. Resolve the issue by passing -ENOTSUPP during unsupported case. Fixes: 7d3e4d807df2 ("misc: microchip: pci1xxxx: load gpio driver for the gpio controller auxiliary device enumerated by the auxiliary bus driver.") Cc: stable <stable@kernel.org> Signed-off-by: Rengarajan S <rengarajan.s@microchip.com> Link: https://lore.kernel.org/r/20241205133626.1483499-3-rengarajan.s@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
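The return-code mismatch can be sketched as follows. This is an assumption-laden model rather than the gpiolib code: `toy_set_config_optional()` is a hypothetical upper layer that treats only `-ENOTSUPP` as "option not supported, carry on". `ENOTSUPP` is a kernel-internal code that userspace `<errno.h>` does not provide, so the sketch defines it locally with the value 524 used by the kernel headers.

```c
#include <errno.h>

/* Kernel-internal error code; not part of userspace <errno.h>. */
#define TOY_ENOTSUPP 524

/*
 * Hypothetical upper layer: an unsupported optional setting is not an
 * error, but it is recognised only when the driver returns -ENOTSUPP.
 * Any other negative value is propagated to the caller as a failure.
 */
static int toy_set_config_optional(int driver_ret)
{
    if (driver_ret == -TOY_ENOTSUPP)
        return 0;
    return driver_ret;
}
```

A driver that returns `-EOPNOTSUPP` for the same situation falls through the check above, so the caller sees a hard failure instead of a tolerated "not supported".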
2025-01-08	misc: microchip: pci1xxxx: Resolve kernel panic during GPIO IRQ handling	Rengarajan S
	Resolve kernel panic caused by improper handling of IRQs while accessing GPIO values. This is done by replacing generic_handle_irq with handle_nested_irq. Fixes: 1f4d8ae231f4 ("misc: microchip: pci1xxxx: Add gpio irq handler and irq helper functions irq_ack, irq_mask, irq_unmask and irq_set_type of irq_chip.") Cc: stable <stable@kernel.org> Signed-off-by: Rengarajan S <rengarajan.s@microchip.com> Link: https://lore.kernel.org/r/20241205133626.1483499-2-rengarajan.s@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08	dm thin: make get_first_thin use rcu-safe list first function	Krister Johansen
	The documentation in rculist.h explains the absence of list_empty_rcu() and cautions programmers against relying on a list_empty() -> list_first() sequence in RCU safe code. This is because each of these functions performs its own READ_ONCE() of the list head. This can lead to a situation where the list_empty() sees a valid list entry, but the subsequent list_first() sees a different view of list head state after a modification. In the case of dm-thin, this author had a production box crash from a GP fault in the process_deferred_bios path. This function saw a valid list head in get_first_thin() but when it subsequently dereferenced that and turned it into a thin_c, it got the inside of the struct pool, since the list was now empty and referring to itself. The kernel on which this occurred printed both a warning about a refcount_t being saturated, and a UBSAN error for an out-of-bounds cpuid access in the queued spinlock, prior to the fault itself. When the resulting kdump was examined, it was possible to see another thread patiently waiting in thin_dtr's synchronize_rcu. The thin_dtr call managed to pull the thin_c out of the active thins list (and have it be the last entry in the active_thins list) at just the wrong moment which led to this crash. Fortunately, the fix here is straightforward. Switch the get_first_thin() function to use list_first_or_null_rcu() which performs just a single READ_ONCE() and returns NULL if the list is already empty. This was run against the devicemapper test suite's thin-provisioning suites for delete and suspend and no regressions were observed. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Fixes: b10ebd34ccca ("dm thin: fix rcu_read_lock being held in code that can sleep") Cc: stable@vger.kernel.org Acked-by: Ming-Hung Tsai <mtsai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
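The single-read idea can be sketched in plain C. This is a single-threaded model with hypothetical names (`toy_list_head`, `toy_first_or_null()`), not the kernel's rculist.h: one volatile load stands in for `READ_ONCE()`, and no real RCU protection is implemented.

```c
#include <stddef.h>

/* Circular doubly linked list head, modelled on the kernel's list_head. */
struct toy_list_head {
    struct toy_list_head *next, *prev;
};

static void toy_list_init(struct toy_list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void toy_list_add_tail(struct toy_list_head *entry,
                              struct toy_list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

static void toy_list_del(struct toy_list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
}

/*
 * Load head->next exactly once and decide from that one value. An
 * "is it empty?" check followed by a separate "get first" load could
 * observe two different list states if a writer ran in between.
 */
static struct toy_list_head *toy_first_or_null(struct toy_list_head *head)
{
    struct toy_list_head *first =
        *(struct toy_list_head * volatile *)&head->next;

    return first != head ? first : NULL;
}
```

Because the emptiness test and the returned pointer come from the same load, the caller can never receive the list head itself disguised as an entry, which is the failure mode described in the commit message.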