summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-06-03Merge tag 'trace-v6.16-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix UAF in module unload in ftrace when there's a bug in the module If a module is buggy and triggers ftrace_disable which is set when an anomaly is detected, when it gets unloaded it doesn't free the hooks into kallsyms, and when a kallsyms lookup is performed it may access the mod->modname field and crash via UAF. Fix this by still freeing the mod_maps that are attached to kallsyms on module unload regardless if ftrace_disable is set or not. - Do not bother allocating mod_maps for kallsyms if ftrace_disable is set - Remove unused trace events When a trace event or tracepoint is created but not used, it still creates the code and data structures needed for that trace event. This just wastes memory. Remove the trace events that are created but not used. This does not remove trace events that are created but are not used due configs not being set. That will be handled later. This only removes events that have no user under any config. * tag 'trace-v6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: fsdax: Remove unused trace events for dax insert mapping genirq/matrix: Remove unused irq_matrix_alloc_reserved tracepoint xdp: Remove unused mem_return_failed event ftrace: Don't allocate ftrace module map if ftrace is disabled ftrace: Fix UAF when lookup kallsym after ftrace disabled
2025-06-03Merge tag 'cgroup-for-6.16-rc1-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fix from Tejun Heo: "The rstat per-subsystem split change skipped per-cpu allocation on UP configs; however even on UP, depending on config options, the size of the percpu struct may not be zero leading to crashes. Fix it by conditionalizing the per-cpu area allocation and usage on the size of the per-cpu struct" * tag 'cgroup-for-6.16-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: adjust criteria for rstat subsystem cpu lock access
2025-06-03Merge tag 'cxl-for-6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull Compute Express Link (CXL) updates from Dave Jiang: - Remove always true condition in cxl features code - Add verification of CHBS length for CXL 2.0 - Ignore interleave granularity when interleave ways is 1 - Add update addressing mising MODULE_DESCRIPTION for cxl_test - A series of cleanups/refactor to prep for AMD Zen5 translate code - Clean %pa debug printk in core/hdm.c - Documentation updates: - Update to CXL Maturity Map - Fixes to source linking in CXL documentation - CXL documentation fixes, spelling corrections - A large collection of CXL documentation for the entire CXL subsystem, including documentation on CXL related platform and firmware notes - Remove redundant code of cxlctl_get_supported_features() - Series to support CXL RAS Features - Including "Patrol Scrub Control", "Error Check Scrub", "Performance Maitenance" and "Memory Sparing". The series connects CXL to EDAC. * tag 'cxl-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (53 commits) cxl/edac: Add CXL memory device soft PPR control feature cxl/edac: Add CXL memory device memory sparing control feature cxl/edac: Support for finding memory operation attributes from the current boot cxl/edac: Add support for PERFORM_MAINTENANCE command cxl/edac: Add CXL memory device ECS control feature cxl/edac: Add CXL memory device patrol scrub control feature cxl: Update prototype of function get_support_feature_info() EDAC: Update documentation for the CXL memory patrol scrub control feature cxl/features: Remove the inline specifier from to_cxlfs() cxl/feature: Remove redundant code of get supported features docs: ABI: Fix "firwmare" to "firmware" cxl/Documentation: Fix typo in sysfs write_bandwidth attribute path cxl: doc/linux/access-coordinates Update access coordinates calculation methods cxl: docs/platform/acpi/srat Add generic target documentation cxl: docs/platform/cdat reference documentation Documentation: Update the CXL Maturity Map cxl: Sync up the driver-api/cxl documentation cxl: docs - add self-referencing cross-links cxl: docs/allocation/hugepages cxl: docs/allocation/reclaim ...
2025-06-03PM: sleep: Add locking to dpm_async_resume_children()Rafael J. Wysocki
Commit 0cbef962ce1f ("PM: sleep: Resume children after resuming the parent") introduced a subtle concurrency issue that may lead to a kernel crash if system suspend is aborted and may also slow down asynchronous device resume otherwise. Namely, the initial list walks in dpm_noirq_resume_devices(), dpm_resume_early(), and dpm_resume() call dpm_clear_async_state() for every device and attempt to asynchronously resume it if it has no children (so it is a "root" device). The asynchronous resume of a root device triggers an attempt to asynchronously resume its children which may take place before calling dpm_clear_async_state() for them due to the lack of synchronization between dpm_async_resume_children() and the code calling dpm_clear_async_state(). If this happens, the dpm_clear_async_state() that comes in late, will clear power.work_in_progress for the given device after it has been set by __dpm_async(), so the suspend callback will be allowed to run once again for the same device during the same transition. This leads to a whole range of interesting breakage. Fortunately, if the suspend transition is not aborted, power.work_in_progress is set by it for all devices, so dpm_async_resume_children() will not schedule asynchronous resume for them until dpm_clear_async_state() clears that flag, but this means missing an opportunity to start the resume of those devices earlier. Address the above issue by adding dpm_list_mtx locking to dpm_async_resume_children(), so it will wait for the entire initial list walk and the invocation of dpm_clear_async_state() for all devices to be completed before scheduling any new asynchronous resume callbacks. Fixes: 0cbef962ce1f ("PM: sleep: Resume children after resuming the parent") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/4280 Reported-and-tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/13779172.uLZWGnKmhe@rjwysocki.net
2025-06-03PM: sleep: Fix power.is_suspended cleanup for direct-complete devicesRafael J. Wysocki
Commit 03f1444016b7 ("PM: sleep: Fix handling devices with direct_complete set on errors") caused power.is_suspended to be set for devices with power.direct_complete set, but it forgot to ensure the clearing of that flag for them in device_resume(), so power.is_suspended is still set for them during the next system suspend-resume cycle. If that cycle is aborted in dpm_suspend(), the subsequent invocation of dpm_resume() will trigger a device_resume() call for every device and because power.is_suspended is set for the devices in question, they will not be skipped by device_resume() as expected which causes scary error messages to be logged (as appropriate). To address this issue, move the clearing of power.is_suspended in device_resume() immediately after the power.is_suspended check so it will be always cleared for all devices processed by that function. Fixes: 03f1444016b7 ("PM: sleep: Fix handling devices with direct_complete set on errors") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4280 Reported-and-tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/4990586.GXAFRqVoOG@rjwysocki.net
2025-06-03PM: sleep: Fix list splicing in device suspend error pathsRafael J. Wysocki
Commits aa7a9275ab81 ("PM: sleep: Suspend async parents after suspending children") and 443046d1ad66 ("PM: sleep: Make suspend of devices more asynchronous") added list splicing to the error paths of dpm_suspend(), dpm_suspend_late(), and dpm_noirq_suspend_devices(), but they should have used the list_splice_init() variant because the emptied list is used going forward in all of these cases. Replace list_splice() with list_splice_init() in the code in question as appropriate. Fixes: aa7a9275ab81 ("PM: sleep: Suspend async parents after suspending children") Fixes: 443046d1ad66 ("PM: sleep: Make suspend of devices more asynchronous") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/4280 Reported-and-tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/4659282.LvFx2qVVIh@rjwysocki.net
2025-06-03Merge tag 'backlight-next-6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight Pull backlight updates from Lee Jones: "Framebuffer Subsystem (fbdev): - The display's blanking status is now tracked in 'struct fb_info' - 'framebuffer_alloc()' initializes the blank state to FB_BLANK_UNBLANK - 'register_framebuffer()' sets the state to 'FB_BLANK_POWERDOWN' if an 'fb_blank' callback exists, ensuring 'FB_EVENT_BLANK' listeners correctly see the display being turned on during the first modeset - The 'FB_EVENT_BLANK' event data now includes both the new and the old blank states - 'fb_blank()' has been reworked to return early on errors, without functional changes, in preparation for further state tracking improvements - Fbdev now calls dedicated functions in the backlight subsystems to notify them of blank state changes, instead of relying on fbdev event notifiers - For LCDs, fbdev also calls a dedicated function to notify of mode changes - Removed the definitions for the unused fbdev event constants 'FB_EVENT_MODE_CHANGE' and 'FB_EVENT_BLANK' from the header file Backlight Subsystem: - Implemented fbdev blank state tracking using the (newly enhanced) blank state information provided directly by 'FB_EVENT_BLANK' - Removed internal blank state tracking fields ('fb_bl_on') from 'struct backlight_device' - Moved the handling of blank-state updates into a separate internal helper function, 'backlight_notify_blank()' - Removed support for fbdev events and replaced it with a dedicated function call interface ('backlight_notify_blank()' and 'backlight_notify_blank_all()') for display drivers to update backlight status LCD Subsystem: - Moved the handling of display updates (blank events and mode changes) from fbdev event notifiers to separate internal helper functions ('lcd_notify_blank', 'lcd_notify_mode_change') - Removed support for fbdev events and replaced it with dedicated function call interfaces ('lcd_notify_blank_all()', 'lcd_notify_mode_change_all()') - The LCD subsystem now maintains its own internal list of LCD devices instead of relying on fbdev notifiers LED Backlight Trigger: - Moved the handling of blank-state updates into a separate internal helper, 'ledtrig_backlight_notify_blank()' - Removed support for fbdev events and replaced it with a dedicated function call, 'ledtrig_backlight_blank()', for fbdev to notify trigger of blank state changes - The LED backlight trigger now maintains its own internal list of triggers instead of relying on fbdev notifiers Qualcomm WLED Backlight: - Added a NULL check after 'devm_kasprintf()' in 'wled_configure()' to prevent a potential NULL pointer dereference if memory allocation fails" * tag 'backlight-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight: backlight: pm8941: Add NULL check in wled_configure() fbdev: Remove constants of unused events leds: backlight trigger: Replace fb events with a dedicated function call leds: backlight trigger: Move blank-state handling into helper backlight: lcd: Replace fb events with a dedicated function call backlight: lcd: Move event handling into helpers backlight: Replace fb events with a dedicated function call backlight: Move blank-state handling into helper backlight: Implement fbdev tracking with blank state from event fbdev: Send old blank state in FB_EVENT_BLANK fbdev: Track display blanking state fbdev: Rework fb_blank()
2025-06-03drm/amd/display: Promote DAL to 3.2.336Taimur Hassan
This version brings along following fixes: - Fix brightness relevant settings - Fix calling blanking stream twice - Extend dc mode validation types to support more scenarios - Update DMCUB loading sequence for DCN3.5 Acked-by: ChiaHsuan Chung <chiahsuan.chung@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: replace fast_validate with enum dc_validate_modeYan Li
[Why] The boolean fast_validate is used as an input parameter in multiple functions. To support more scenarios, we are replacing it with enum dc_validate_mode. [How] The enum dc_validate_mode introduces three possible values: 1) DC_VALIDATE_MODE_AND_PROGRAMMING: Apply the mode to hardware 2) DC_VALIDATE_MODE_ONLY: Check whether the mode can be supported 3) DC_VALIDATE_MODE_AND_STATE_INDEX: Check if the mode can be supported, and determine the optimal voltage level needed to support it. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Yan Li <yan.li@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Update DMCUB loading sequence for DCN3.5Nicholas Kazlauskas
[Why] New sequence from HW for reset and firmware reloading has been provided that aims to stabilize the reload sequence in the case the firmware is hung or has outstanding requests. [How] Update the sequence to remove the DMUIF reset and the redundant writes in the release. Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Promote DAL to 3.2.335Taimur Hassan
This version brings along following fixes: - Fixes for DML21 - Support OLED SDR with AMD ABC - Indirect buffer transport for FAMS2 commands - Correct non-OLED pre_T11_delay - Optime boot-up consuming time - Add support for 2nd sharpening range - Fix on chroma planes scaling Acked-by: ChiaHsuan Chung <chiahsuan.chung@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: [FW Promotion] Release 0.1.12.0Taimur Hassan
Add dmub command to support LSDMA Acked-by: ChiaHsuan Chung <chiahsuan.chung@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Move vmalloc include to header fileRay Wu
[Why & How] Move vmalloc.h include code to header file. Reviewed-by: ChiaHsuan Chung <chiahsuan.chung@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Add support for 2nd sharpening rangeSamson Tam
[Why & How] Add support for 2nd sharpening range for cases where we want override existing DCN sharpening range Reviewed-by: Ilya Bakoulin <ilya.bakoulin@amd.com> Signed-off-by: Samson Tam <Samson.Tam@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Do not bypass chroma scaling in 1:1 caseNavid Assadian
[Why] When doing 2:1 downscaling on a YUV sub-sampled format, the chroma scaling ratio is 1:1. Since chroma has cositing, it is needed to do scaling on the chroma plane(s) and not to bypass chroma scaling. [How] Do not set the chroma taps to one when the chroma ratio is identity and the input format is a sub-sampled YUV format. Reviewed-by: Samson Tam <samson.tam@amd.com> Signed-off-by: Navid Assadian <Navid.Assadian@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Add DML path for FAMS methodsOleh Kuzhylnyi
[Why] DML needs a path for FAMS methods. [How] Apply instance of fams2_stream_sub_params_v2 structure with a FAMS placeholder for DML. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Oleh Kuzhylnyi <okuzhyln@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Add disconnect case on dongle checkJingwen Zhu
[why] In the case of an external monitor disconnection, the kernel mode will attempt to post new timing validation with two path counts (eDP + external monitor removed to virtual). [how] Skip validating color depth and pixel encoding in the scenario involving a DP to HDMI active converter dongle. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Jingwen Zhu <Jingwen.Zhu@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Avoid trying AUX transactions on disconnected portsWayne Lin
[Why & How] Observe that we try to access DPCD 0x600h of disconnected DP ports. In order not to wasting time on retrying these ports, call dpcd_write_rx_power_ctrl() after checking its connection status. Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Drop unnecessary `amdgpu` prefixMario Limonciello
[Why] The `drm_*()` print macros will handle including the driver in the print already. The extra print of the word `amdgpu` is unnecessary. [How] Modify all prints to drop `amdgpu: `. Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Indirect buffer transport for FAMS2 commandsOleh Kuzhylnyi
[Why] The quantity and duration of FAMS2 commands are set to increase in future products. This necessitates the implementation of a new mechanism for chaining commands together, allowing all commands to be processed within a single transaction. [How] The indirect buffer acts as a shared buffer on the driver side, mapped to DMUB's internal CW7 address. Its source address and size are sent through mailbox command to DMUB, triggering the transaction. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Oleh Kuzhylnyi <okuzhyln@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: move RMCM programmingYihan Zhu
[WHY & HOW] Move only RMCM programming outside of dcn401. Extended HW definition in dc for memory layout to extend support. Reviewed-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Support OLED SDR with AMD ABCCamille Cho
[Why] Nits programming for SDR panel is only supported by VESA ABC. [How] 1. Loose nits programming for OLED SDR panel with AMD ABC. 2. We support two ABC methods. Disable one before we program the other in case panel freaks out. 3. Update HDR judgement in setBR with a solider condition. Reviewed-by: Krunoslav Kovac <krunoslav.kovac@amd.com> Signed-off-by: Camille Cho <Camille.Cho@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: DML21 FixesAustin Zheng
- Store state related info inside mode_lib. - Fix bad DCFCLK deep sleep - Update FAMS structure in DMUB header Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Re-order FAMS2 sub commandsAlvin Lee
[Why & How] New enums need to be added to the end to avoid back compat issues. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: [FW Promotion] Release 0.1.11.0Taimur Hassan
Refactoring some DMUB related structs and enum. Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Fix default DC and AC levelsMario Limonciello
[Why] DC and AC levels are advertised in a percentage, not a luminance. [How] Scale DC and AC levels to supported values. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4221 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Add debugging message for brightness capsMario Limonciello
[Why] Default BIOS brightness caps are buried in ACPI. [How] Add extra dynamic debug that can show default brightness caps. Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Use DC log instead of using DM error msgCruise Hung
[Why & How] It sent an error msg when it failed to read the DP tunneling DPCD field. This should just be a warning msg. Use a DC log instead of a DM error msg. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Cruise Hung <Cruise.Hung@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Avoid calling blank_stream() twiceZhongwei Zhang
[Why] We've made fix for garbage in dcn31_reset_back_end_for_pipe(), adding blank_stream() before disable_crtc(). And set_dpms_off() will call blank_stream() again. [How] Add flag to avoid calling blank_stream() twice. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Zhongwei Zhang <Zhongwei.Zhang@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amd/display: Correct non-OLED pre_T11_delay.Zhongwei Zhang
[Why] Only OLED panels require non-zero pre_T11_delay defaultly. Others should be controlled by power sequence. [How] For non OLED, pre_T11_delay delay in code should be zero. Also post_T7_delay. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Zhongwei Zhang <Zhongwei.Zhang@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amdgpu: Add userq fence support to SDMAv7.0Arunpravin Paneer Selvam
- Add userq fence support to SDMAv7.0. - GFX12's user fence irq src id differs from GFX11's, hence we need create a new irq srcid header file for GFX12. User fence irq src id information- GFX11 and SDMA6.0 - 0x43 GFX12 and SDMA7.0 - 0x46 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amdgpu: Fix integer overflow in amdgpu_gem_add_input_fence()Dan Carpenter
The "num_syncobj_handles" is a u32 value that comes from the user via the ioctl. On 32bit systems the "sizeof(uint32_t) * num_syncobj_handles" multiplication can have an integer overflow. Use size_mul() to fix that. Fixes: 38c67ec9aa4b ("drm/amdgpu: Add input fence to sync bo map/unmap") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03fsdax: Remove unused trace events for dax insert mappingSteven Rostedt
When the dax_fault_actor() helper was factored out, it removed the calls to the dax_pmd_insert_mapping and dax_insert_mapping events but never removed the events themselves. As each event created takes up memory (roughly 5K each), this is a waste as it is never used. Remove the unused dax_pmd_insert_mapping and dax_insert_mapping trace events. Link: https://lore.kernel.org/all/20250529130138.544ffec4@gandalf.local.home/ Cc: Shiyang Ruan <ruansy.fnst@fujitsu.com> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Ross Zwisler <zwisler@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250529152211.688800c9@gandalf.local.home Fixes: c2436190e492 ("fsdax: factor out a dax_fault_actor() helper") Acked-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-06-03Merge tag 'leds-next-6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds Pull LED updates from Lee Jones: "LED Triggers: - Allow writing "default" to the sysfs 'trigger' attribute to set an LED to its default trigger - If the default trigger is "none", writing "default" will remove the current trigger - Updated sysfs ABI documentation for the new "default" trigger functionality LED KUnit Testing: - Provide a skeleton KUnit test suite for the LEDs framework - Expand the LED class device registration KUnit test to cover more scenarios, including 'brightness_get' behavior - Add KUnit tests for the LED lookup and get API ('led_add_lookup', 'devm_led_get') LED Flash Class: - Add support for setting flash/strobe duration through a new 'duration_set' op and 'led_set_flash_duration()' function, aligning with 'V4L2_CID_FLASH_DURATION' Texas Instruments TPS6131x: - Add a new driver for the TPS61310/TPS61311 flash LED controllers - The driver supports the device's three constant-current sinks for flash and torch modes LED Core: - Prevent potential 'snprintf()' truncations in LED names by checking for buffer overflows ChromeOS EC LEDs: - Avoid a -Wflex-array-member-not-at-end GCC warning by replacing an on-stack flexible structure definition with a utility function call Multicolor LEDs: - Fix issue where setting multi_intensity while software blinking is active could stop blinking PCA955x LEDs: - Avoid potential buffer overflow when creating default labels by changing a field's type to 'u8' and updating format specifiers PCA995x LEDs: - Fix a typo (stray space) in an 'of_device_id' entry in the 'pca995x_of_match' table Kconfig: - Prevent LED drivers from being enabled by default when 'COMPILE_TEST' is set Device Property API: - Split 'device_get_child_node_count()' into a new helper 'fwnode_get_child_node_count()' that doesn't require a device struct, making the API more symmetrical Driver Modernization (using 'fwnode_get_child_node_count()'): - Update 'leds-pwm-multicolor', 'leds-ncp5623' and 'leds-ncp5623' to use the new 'fwnode_get_child_node_count()' helper, removing their custom implementation - As above in the USB Type-C TCPM driver Driver Modernization (using new GPIO setter callbacks): - Convert 'leds-lgm-sso' to use new GPIO line value setter callbacks which return an integer for error handling - Convert 'leds-pca955x', 'leds-pca9532' and 'leds-tca6507' to use new GPIO setter callbacks Documentation: - Remove the '.rst' extension for 'leds-st1202' in the documentation index for consistency LP8860 LEDs: - Use 'regmap_multi_reg_write()' for EEPROM writes instead of manual looping - Use scoped mutex guards and 'devm_mutex_init()' to simplify function exits and ensure automatic cleanup - Remove default register definitions that are unused when regmap caching is not active - Use 'devm_regulator_get_enable_optional()' to handle the optional regulator, simplifying enabling and removing manual disabling - Refactor 'lp8860_unlock_eeprom()' to only perform the unlock operation, removing the lock part and an unnecessary parameter - Use a 'devm' action to disable the enable-GPIO, simplifying cleanup and error paths, and remove the now-empty '.remove()' function Turris Omnia LEDs: - Drop unnecessary commas in terminator entries of 'struct attribute' and 'struct of_device_id' arrays MT6370 RGB LEDs: - Use the 'LINEAR_RANGE()' for defining 'struct linear_range' entries to improve robustness Texas Instruments TPS6131x: - Add new devicetree bindings for the TI TPS61310/TPS61311 flash LED driver" * tag 'leds-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds: (31 commits) leds: tps6131x: Add support for Texas Instruments TPS6131X flash LED driver dt-bindings: leds: Add Texas Instruments TPS6131x flash LED driver leds: flash: Add support for flash/strobe duration leds: rgb: leds-mt6370-rgb: Improve definition of some struct linear_range leds: led-test: Provide tests for the lookup and get infrastructure leds: led-test: Fill out the registration test to cover more test cases leds: led-test: Remove standard error checking after KUNIT_ASSERT_*() leds: pca995x: Fix typo in pca995x_of_match's of_device_id entry leds: Provide skeleton KUnit testing for the LEDs framework leds: tca6507: Use new GPIO line value setter callbacks leds: pca9532: Use new GPIO line value setter callbacks leds: pca955x: Use new GPIO line value setter callbacks leds: lgm-sso: Use new GPIO line value setter callbacks leds: Do not enable by default during compile testing leds: turris-omnia: Drop commas in the terminator entries leds: lp8860: Disable GPIO with devm action leds: lp8860: Only unlock in lp8860_unlock_eeprom() leds: lp8860: Enable regulator using enable_optional helper leds: lp8860: Remove default regs when not caching leds: lp8860: Use new mutex guards to cleanup function exits ...
2025-06-03drm/amdgpu: Fix integer overflow issues in amdgpu_userq_fence.cDan Carpenter
This patch only affects 32bit systems. There are several integer overflows bugs here but only the "sizeof(u32) * num_syncobj" multiplication is a problem at runtime. (The last lines of this patch). These variables are u32 variables that come from the user. The issue is the multiplications can overflow leading to us allocating a smaller buffer than intended. For the first couple integer overflows, the syncobj_handles = memdup_user() allocation is immediately followed by a kmalloc_array(): syncobj = kmalloc_array(num_syncobj_handles, sizeof(*syncobj), GFP_KERNEL); In that situation the kmalloc_array() works as a bounds check and we haven't accessed the syncobj_handlesp[] array yet so the integer overflow is harmless. But the "num_syncobj" multiplication doesn't have that and the integer overflow could lead to an out of bounds access. Fixes: a292fdecd728 ("drm/amdgpu: Implement userqueue signal/wait IOCTL") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amdgpu: disable workload profile switching when OD is enabledAlex Deucher
Users have reported that they have to reduce the level of undervolting to acheive stability when dynamic workload profiles are enabled on GC 10.3.x. Disable dynamic workload profiles if the user has enabled OD. Fixes: b9467983b774 ("drm/amdgpu: add dynamic workload profile switching for gfx10") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4262 Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.15.x
2025-06-03drm/amdgpu/gfx10: Refine Cleaner Shader for GFX10.1.10Vitaly Prosyak
This patch updates the cleaner shader, which is responsible for initializing GPU resources such as Local Data Share (LDS), Vector General Purpose Registers (VGPRs), and Scalar General Purpose Registers (SGPRs). Changes include adjustments to register clearing and shader configuration. - Updated GPU resource initialization addresses in the cleaner shader from `be803080` to `be803000`. - Simplified the logic in the SGPR clearing section, ensuring all SGPRs are set to zero. Fixes: 25961bad9212 ("drm/amdgpu/gfx10: Add cleaner shader for GFX10.1.10") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Manu Rastogi <manu.rastogi@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amdgpu: Add more checks to discovery fetchLijo Lazar
Add more checks for valid vram size and log error, if any. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03drm/amdkfd: enable kfd on RISCV systemsXuemei Liu
KFD has been confirmed that can run on RISCV systems. It's necessary to support CONFIG_HSA_AMD on RISCV. Signed-off-by: Xuemei Liu <liu.xuemei1@zte.com.cn> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03Merge tag 'mfd-next-6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Pull MFD updates from Lee Jones: "Samsung Exynos ACPM: - Populate child platform devices from device tree data - Introduce a new API, 'devm_acpm_get_by_node()', for child devices to get the ACPM handle ROHM PMICs: - Add support for the ROHM BD96802 scalable companion PMIC to the BD96801 core driver - Add support for controlling the BD96802 using the BD96801 regulator driver - Add support to the BD96805, which is almost identical to the BD96801 - Add support to the BD96806, which is similar to the BD96802 Maxim MAX77759: - Add a core driver for the MAX77759 companion PMIC - Add a GPIO driver for the expander functions on the MAX77759 - Add an NVMEM driver to expose the non-volatile memory on the MAX77759 STMicroelectronics STM32MP25: - Add support for the STM32MP25 SoC to the stm32-lptimer - Add support for the STM32MP25 to the clocksource driver, handling new register access requirements - Add support for the STM32MP25 to the PWM driver, enabling up to two PWM outputs Broadcom BCM590xx: - Add support for the BCM59054 PMU - Parse the PMU ID and revision to support behavioral differences between chip revisions - Add regulator support for the BCM59054 Samsung S2MPG10: - Add support for the S2MPG10 PMIC, which communicates via the Samsung ACPM firmware instead of I2C Exynos ACPM: - Improve timeout detection reliability by using ktime APIs instead of a loop counter assumption - Allow PMIC access during late system shutdown by switching to 'udelay()' instead of a sleeping function - Fix an issue where reading command results longer than 8 bytes would fail - Silence non-error '-EPROBE_DEFER' messages during boot to clean up logs Exynos LPASS: - Fix an error handling path by switching to 'devm_regmap_init_mmio()' to prevent resource leaks - Fix a bug where 'exynos_lpass_disable()' was called twice in the remove function - Fix another resource leak in the probe's error path by using 'devm_add_action_or_reset()' Samsung SEC: - Handle the s2dos05, which does not have IRQ support, explicitly to prevent warnings - Fix the core driver to correctly handle errors from 'sec_irq_init()' instead of ignoring them STMPE-SPI: - Correct an undeclared identifier in the 'MODULE_DEVICE_TABLE' macro MAINTAINERS: - Adjust a file path for the Siemens IPC LED drivers entry to fix a broken reference Maxim Drivers: - Correct the spelling of "Electronics" in Samsung copyright headers across multiple files General: - Fix wakeup source memory leaks on device unbind for 88pm886, as3722, max14577, max77541, max77705, max8925, rt5033, and sprd-sc27xx drivers Samsung SEC Drivers: - Split the driver into a transport-agnostic core ('sec-core') and transport-specific ('sec-i2c', 'sec-acpm') modules to support non-I2C devices - Merge the 'sec-core' and 'sec-irq' modules to reduce memory consumption - Move internal APIs to a private header to clean up the public API - Improve code style by sorting includes, cleaning up headers, sorting device tables, and using helper macros like 'dev_err_probe()', 'MFD_CELL', and 'REGMAP_IRQ_REG' - Make regmap configuration for s2dos05/s2mpu05 explicit to improve clarity - Rework platform data and regmap instantiation to use OF match data instead of a large switch statement ROHM BD96801/2: - Prepare the driver for new models by separating chip-specific data into its own structure - Drop IC name prefix from IRQ resource names in both the MFD and regulator drivers for simplification Broadcom BCM590xx: - Refactor the regulator driver to store descriptions in a table to ease support for new chips - Rename BCM59056-specific data to prepare for the addition of other regulators - Use 'dev_err_probe()' for cleaner error handling Exynos ACPM: - Correct kerneldoc warnings and use the conventional 'np' argument name General MFD: - Convert 'aat2870' and 'tps65010' to use the per-client debugfs directory provided by the I2C core - Convert 'sm501', 'tps65010' and 'ucb1x00' to use the new GPIO line value setter callbacks - Constify 'regmap_irq_chip' and other structures in '88pm886' to move data to read-only sections BCM590xx: - Drop the unused "id" member from the 'bcm590xx' struct in preparation for a replacement Samsung SEC Core: - Remove forward declarations for functions that no longer exist SM501: - Remove the unused 'sm501_find_clock()' function New Compatibles: - Google: Add a PMIC child node to the 'google,gs101-acpm-ipc' binding - ROHM: Add new bindings for 'rohm,bd96802-regulator' and 'rohm,bd96802-pmic', and add compatibles for BD96805 and BD96806 - Maxim: Add new bindings for 'maxim,max77759-gpio', 'maxim,max77759-nvmem', and the top-level 'maxim,max77759' - STM: Add 'stm32mp25' compatible to the 'stm32-lptimer' binding - Broadcom: Add 'bcm59054' compatible - Atmel/Microchip: Add 'microchip,sama7d65-gpbr' and 'microchip,sama7d65-secumod' compatibles - Samsung: Add 's2mpg10' compatible to the 'samsung,s2mps11' MFD binding - MediaTek: Add compatibles for 'mt6893' (scpsys), 'mt7988-topmisc', and 'mt8365-infracfg-nao' - Qualcomm: Add 'qcom,apq8064-mmss-sfpb' and 'qcom,apq8064-sps-sic' syscon compatibles Refactoring & Cleanup: - Convert Broadcom BCM59056 devicetree bindings to YAML and split them into MFD and regulator parts - Convert the Microchip AT91 secumod binding to YAML - Drop unrelated consumer nodes from binding examples to reduce bloat - Correct indentation and style in various DTS examples" * tag 'mfd-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (81 commits) mfd: maxim: Correct Samsung "Electronics" spelling in copyright headers mfd: maxim: Correct Samsung "Electronics" spelling in headers mfd: sm501: Remove unused sm501_find_clock mfd: 88pm886: Constify struct regmap_irq_chip and some other structures dt-bindings: mfd: syscon: Add mediatek,mt8365-infracfg-nao mfd: sprd-sc27xx: Fix wakeup source leaks on device unbind mfd: rt5033: Fix wakeup source leaks on device unbind mfd: max8925: Fix wakeup source leaks on device unbind mfd: max77705: Fix wakeup source leaks on device unbind mfd: max77541: Fix wakeup source leaks on device unbind mfd: max14577: Fix wakeup source leaks on device unbind mfd: as3722: Fix wakeup source leaks on device unbind mfd: 88pm886: Fix wakeup source leaks on device unbind dt-bindings: mfd: Correct indentation and style in DTS example dt-bindings: mfd: Drop unrelated nodes from DTS example dt-bindings: mfd: syscon: Add qcom,apq8064-sps-sic dt-bindings: mfd: syscon: Add qcom,apq8064-mmss-sfpb mfd: stmpe-spi: Correct the name used in MODULE_DEVICE_TABLE dt-bindings: mfd: syscon: Add mt7988-topmisc mfd: exynos-lpass: Fix another error handling path in exynos_lpass_probe() ...
2025-06-03block: drop direction param from bio_integrity_copy_user()Caleb Sander Mateos
direction is determined from bio, which is already passed in. Compute op_is_write(bio_op(bio)) directly instead of converting it to an iter direction and back to a bool. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Link: https://lore.kernel.org/r/20250603183133.1178062-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-03sched_ext: idle: Skip cross-node search with !CONFIG_NUMAAndrea Righi
In the idle CPU selection logic, attempting cross-node searches adds unnecessary complexity when CONFIG_NUMA is disabled. Since there's no meaningful concept of nodes in this case, simplify the logic by restricting the idle CPU search to the current node only. Fixes: 48849271e6611 ("sched_ext: idle: Per-node idle cpumasks") Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-06-03Merge tag 'hid-for-linus-2025060301' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID updates from Jiri Kosina: - support for Apple Magic Mouse 2 USB-C (Aditya Garg) - power management improvement for multitouch devices (Werner Sembach) - fix for ACPI initialization in intel-thc driver (Wentao Guan) - adaptation of HID drivers to use new gpio_chip's line setter callbacks (Bartosz Golaszewski) - fix potential OOB in usbhid_parse() (Terry Junge) - make it possible to set hid_mouse_ignore_list dynamically (the same way we handle other quirks) (Aditya Garg) - other small assorted fixes and device ID additions * tag 'hid-for-linus-2025060301' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: multitouch: Disable touchpad on firmware level while not in use HID: core: Add functions for HID drivers to react on first open and last close call HID: HID_APPLETB_BL should depend on X86 HID: HID_APPLETB_KBD should depend on X86 HID: appletb-kbd: Use secs_to_jiffies() instead of msecs_to_jiffies() HID: intel-thc-hid: intel-thc: make read-only arrays static const HID: magicmouse: Apple Magic Mouse 2 USB-C support HID: mcp2221: use new line value setter callbacks HID: mcp2200: use new line value setter callbacks HID: cp2112: use new line value setter callbacks HID: cp2112: use lock guards HID: cp2112: hold the lock for the entire direction_output() call HID: cp2112: destroy mutex on driver detach HID: intel-thc-hid: intel-quicki2c: pass correct arguments to acpi_evaluate_object HID: corsair-void: Use to_delayed_work() HID: hid-logitech: use sysfs_emit_at() instead of scnprintf() HID: quirks: Add HID_QUIRK_IGNORE_MOUSE quirk HID: usbhid: Eliminate recurrent out-of-bounds bug in usbhid_parse() HID: Kysona: Add periodic online check
2025-06-03dm-stripe: small code cleanupMikulas Patocka
This commit doesn't fix any bug, it is just code cleanup. Use the function format_dev_t instead of sprintf, because format_dev_t does the same thing. Remove the useless memset call. An unsigned integer can take at most 10 digits, so extend the array size to 22. (note that because the range of minor and major numbers is limited, the size 16 could not be exceeded, thus this function couldn't write beyond string end) Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-06-03dm-verity: fix a memory leak if some arguments are specified multiple timesMikulas Patocka
If some of the arguments "check_at_most_once", "ignore_zero_blocks", "use_fec_from_device", "root_hash_sig_key_desc" were specified more than once on the target line, a memory leak would happen. This commit fixes the memory leak. It also fixes error handling in verity_verify_sig_parse_opt_args. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org
2025-06-03dm-mirror: fix a tiny race conditionMikulas Patocka
There's a tiny race condition in dm-mirror. The functions queue_bio and write_callback grab a spinlock, add a bio to the list, drop the spinlock and wake up the mirrord thread that processes bios in the list. It may be possible that the mirrord thread processes the bio just after spin_unlock_irqrestore is called, before wakeup_mirrord. This spurious wake-up is normally harmless, however if the device mapper device is unloaded just after the bio was processed, it may be possible that wakeup_mirrord(ms) uses invalid "ms" pointer. Fix this bug by moving wakeup_mirrord inside the spinlock. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org
2025-06-03iavf: get rid of the crit lockPrzemek Kitszel
Get rid of the crit lock. That frees us from the error prone logic of try_locks. Thanks to netdev_lock() by Jakub it is now easy, and in most cases we were protected by it already - replace crit lock by netdev lock when it was not the case. Lockdep reports that we should cancel the work under crit_lock [splat1], and that was the scheme we have mostly followed since [1] by Slawomir. But when that is done we still got into deadlocks [splat2]. So instead we should look at the bigger problem, namely "weird locking/scheduling" of the iavf. The first step to fix that is to remove the crit lock. I will followup with a -next series that simplifies scheduling/tasks. Cancel the work without netdev lock (weird unlock+lock scheme), to fix the [splat2] (which would be totally ugly if we would kept the crit lock). Extend protected part of iavf_watchdog_task() to include scheduling more work. Note that the removed comment in iavf_reset_task() was misplaced, it belonged to inside of the removed if condition, so it's gone now. [splat1] - w/o this patch - The deadlock during VF removal: WARNING: possible circular locking dependency detected sh/3825 is trying to acquire lock: ((work_completion)(&(&adapter->watchdog_task)->work)){+.+.}-{0:0}, at: start_flush_work+0x1a1/0x470 but task is already holding lock: (&adapter->crit_lock){+.+.}-{4:4}, at: iavf_remove+0xd1/0x690 [iavf] which lock already depends on the new lock. [splat2] - when cancelling work under crit lock, w/o this series, see [2] for the band aid attempt WARNING: possible circular locking dependency detected sh/3550 is trying to acquire lock: ((wq_completion)iavf){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x26/0x90 but task is already holding lock: (&dev->lock){+.+.}-{4:4}, at: iavf_remove+0xa6/0x6e0 [iavf] which lock already depends on the new lock. [1] fc2e6b3b132a ("iavf: Rework mutexes for better synchronisation") [2] https://github.com/pkitszel/linux/commit/52dddbfc2bb60294083f5711a158a Fixes: d1639a17319b ("iavf: fix a deadlock caused by rtnl and driver's lock circular dependencies") Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-03iavf: sprinkle netdev_assert_locked() annotationsPrzemek Kitszel
Lockdep annotations help in general, but here it is extra good, as next commit will remove crit lock. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-03iavf: extract iavf_watchdog_step() out of iavf_watchdog_task()Przemek Kitszel
Finish up easy refactor of watchdog_task, total for this + prev two commits is: 1 file changed, 47 insertions(+), 82 deletions(-) Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-03iavf: simplify watchdog_task in terms of adminq task schedulingPrzemek Kitszel
Simplify the decision whether to schedule adminq task. The condition is the same, but it is executed in more scenarios. Note that movement of watchdog_done label makes this commit a bit surprising. (Hence not squashing it to anything bigger). Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>