linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-02-25	drm/amdgpu: Log the creation of a coredump file	André Almeida
	After a GPU reset happens, the driver creates a coredump file. However, the user might not be aware of it. Log the file creation the user can find more information about the device and add the file to bug reports. This is similar to what the xe driver does. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu/mes: keep enforce isolation up to date	Alex Deucher
	Re-send the mes message on resume to make sure the mes state is up to date. Fixes: 8521e3c5f058 ("drm/amd/amdgpu: limit single process inside MES") Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Shaoyun Liu <shaoyun.liu@amd.com> Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amd/pm: Use separate metrics table for smu_v13_0_12	Asad Kamal
	Use separate metrics table for smu_v13_0_12 and fetch metrics data using that. v2: Fix jpeg busy indexing (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu: Add core reset registers for JPEG5_0_1	Sathishkumar S
	Add core reset control register definitions and align all prior register definitions to end at 100 column length for uniformity. Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu: Per-instance init func for JPEG5_0_1	Sathishkumar S
	Add helper functions to handle per-instance and per-core initialization and deinitialization in JPEG5_0_1. Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amd/display: fix an indent issue in DML21	Aurabindo Pillai
	Remove extraneous tab and newline in dml2_core_dcn4.c that was reported by the bot Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202502211920.txUfwtSj-lkp@intel.com/ Fixes: 70839da6360 ("drm/amd/display: Add new DCN401 sources") Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu: disable BAR resize on Dell G5 SE	Alex Deucher
	There was a quirk added to add a workaround for a Sapphire RX 5600 XT Pulse that didn't allow BAR resizing. However, the quirk caused a regression with runtime pm on Dell laptops using those chips, rather than narrowing the scope of the resizing quirk, add a quirk to prevent amdgpu from resizing the BAR on those Dell platforms unless runtime pm is disabled. v2: update commit message, add runpm check Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1707 Fixes: 907830b0fc9e ("PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse") Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amd/pm: Fetch fru product info for smu_v13_0_12	Asad Kamal
	Fetch fru product info for smu_v13_0_12 from static metrics table v2: Field by field copy for fru info(Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amd/pm: Fetch static metrics table	Asad Kamal
	Fetch clock frequency table from static metrics table for smu_v13_0_12 v2: Move PPTable definition, remove unnecessary checks for getting static metrics table(Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amd/pm: Add GetStaticMetricTable message	Asad Kamal
	Add GetStaticMetricTable message for smu_v13_0_12 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amd/pm: Update pmfw headers for smu_v13_0_12	Asad Kamal
	Update pmfw headers for smu_v13_0_12 new messages & metrics table. Static metrics table for frequency added, Separate metrics table for smu_v13_0_12 added. Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu: Update amdgpu_job_timedout to check if the ring is guilty	Jesse.zhang@amd.com
	This patch updates the `amdgpu_job_timedout` function to check if the ring is actually guilty of causing the timeout. If not, it skips error handling and fence completion. v2: move the is_guilty check down into the queue reset area (Alex) v3: need to call is_guilty before reset (Alex) v4: squash in is_guilty logic fixes (Alex) Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amd/pm: add support for checking SDMA reset capability	Jesse.zhang@amd.com
	This patch introduces a new function to check if the SMU supports resetting the SDMA engine. This capability check ensures that the driver does not attempt to reset the SDMA engine on hardware that does not support it. The following changes are included: - New function `amdgpu_dpm_reset_sdma_is_supported` to check SDMA reset support at the AMDGPU driver level. - New function `smu_reset_sdma_is_supported` to check SDMA reset support at the SMU level. - Implementation of `smu_v13_0_6_reset_sdma_is_supported` for the specific SMU version v13.0.6. - Updated `smu_v13_0_6_reset_sdma` to use the new capability check before attempting to reset the SDMA engine. v2: change smu_reset_sdma_is_supported type to bool (Tim) Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Tim Huang <tim.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu: Add reset function pointer for SDMA v4.4.2 page ring	Jesse.zhang@amd.com
	This patch adds a reset function pointer to the SDMA v4.4.2 page ring functionality. The new function pointer `reset` is set to `sdma_v4_4_2_reset_queue`, which is responsible for resetting the SDMA queue. Changes: - Add `reset` function pointer to `sdma_v4_4_2_page_ring_funcs`. Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu: Improve SDMA reset logic with guilty queue tracking	Jesse.zhang@amd.com
	This patch includes the remaining improvements to the SDMA reset logic: - Added `gfx_guilty` and `page_guilty` flags to track guilty queues. - Updated the reset and resume functions to handle the guilty state. - Cached the `rptr` before reset. v2: 1.replace the caller with a guilty bool. If the queue is the guilty one, set the rptr and wptr to the saved wptr value, else, set the rptr and wptr to the saved rptr value. (Alex) 2. cache the rptr before the reset. (Alex) v3: Keeping intermediate variables like u64 rwptr simplifies resotre rptr/wptr.(Lijo) Suggested-by: Alex Deucher <alexander.deucher@amd.com> Suggested-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu/sdma: Introduce is_guilty callbacks for sdma GFX and PAGE rings	Jesse.zhang@amd.com
	This patch introduces the `is_guilty` callbacks for the GFX and PAGE rings. These callbacks check if a ring is guilty of causing a timeout or error. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu: Introduce cached_rptr and is_guilty callback in amdgpu_ring	Jesse.zhang@amd.com
	This patch introduces the following changes: - Add `cached_rptr` to the `amdgpu_ring` structure to store the read pointer before a reset. - Add `is_guilty` callback to the `amdgpu_ring_funcs` structure to check if a ring is guilty of causing a timeout. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu: Introduce conditional user queue suspension for SDMA resets	Jesse.zhang@amd.com
	- Modify the `amdgpu_sdma_reset_engine` function to accept a `suspend_user_queues` parameter. - This parameter allows the function to conditionally suspend and resume user queues during SDMA resets. - Ensure that user queues are suspended only when necessary to avoid unnecessary overhead and potential deadlocks. - Restart the scheduler's work queue for the GFX and page rings after the reset to allow new tasks to be submitted. This change improves synchronization between the KGD and the KFD during SDMA resets, ensuring proper handling of user queues and avoiding race conditions. V2: replace the ring_lock with the existed the scheduler locks for the queues (ring->sched) on the sdma engine.(Alex) v3: call drm_sched_wqueue_stop() rather than job_list_lock. If a GPU ring reset was already initiated for one ring at amdgpu_job_timedout, skip resetting that ring and call drm_sched_wqueue_stop() for the other rings (Alex) replace the common lock (sdma_reset_lock) with DQM lock to to resolve reset races between the two driver sections during KFD eviction.(Jon) Rename the caller to Reset_src and Change AMDGPU_RESET_SRC_SDMA_KGD/KFD to AMDGPU_RESET_SRC_SDMA_HWS/RING (Jon) v4: restart the wqueue if the reset was successful, or fall back to a full adapter reset. (Alex) move definition of reset source to enumeration AMDGPU_RESET_SRCS, and check reset src in amdgpu_sdma_reset_instance (Jon) v5: Call amdgpu_amdkfd_suspend/resume at the start/end of reset function respectively under !SRC_HWS conditions only (Jon) v6: replace the paramter src with a bool suspend_user_queues, remove the paramter src in pre/post func. (Jon) Suggested-by: Alex Deucher <alexander.deucher@amd.com> Suggested-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Suggested-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Acked-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu: Remove redundant logic in GC v9.4.3	Lijo Lazar
	GFXOFF check is not needed for GC v9.4.3. Also, save/restore list is available by default. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu: Do not poweroff UVDJ in JPEG4_0_3	Sathishkumar S
	Update power gate setting to not poweroff UVDJ in JPEG4_0_3. Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu/sdma: Refactor SDMA reset functionality and add callback support	Jesse.zhang@amd.com
	This patch refactors the SDMA reset functionality in the `sdma_v4_4_2` driver to improve modularity and support shared usage between AMDGPU and KFD. The changes include: 1. Refactored SDMA Reset Logic: - Split the `sdma_v4_4_2_reset_queue` function into two separate functions: - `sdma_v4_4_2_stop_queue`: Stops the SDMA queue before reset. - `sdma_v4_4_2_restore_queue`: Restores the SDMA queue after reset. - These functions are now used as callbacks for the shared reset mechanism. 2. Added Callback Support: - Introduced a new structure `sdma_v4_4_2_reset_funcs` to hold the stop and restore callbacks. - Added `sdma_v4_4_2_set_reset_funcs` to register these callbacks with the shared reset mechanism using `amdgpu_set_on_reset_callbacks`. 3. Fixed Reset Queue Function: - Modified `sdma_v4_4_2_reset_queue` to use the shared `amdgpu_sdma_reset_queue` function, ensuring consistency across the driver. This patch ensures that SDMA reset functionality is more modular, reusable, and aligned with the shared reset mechanism between AMDGPU and KFD. v2: Renamed sdma_v4_4_2_set_reset_funcs to sdma_v4_4_2_set_engine_reset_funcs. Renamed sdma_v4_4_2_reset_funcs to sdma_v4_4_2_engine_reset_funcs.(Alex) Suggested-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu/kfd: Add shared SDMA reset functionality with callback support	Jesse.zhang@amd.com
	This patch introduces shared SDMA reset functionality between AMDGPU and KFD. The implementation includes the following key changes: 1. Added `amdgpu_sdma_reset_queue`: - Resets a specific SDMA queue by instance ID. - Invokes registered pre-reset and post-reset callbacks to allow KFD and AMDGPU to save/restore their state during the reset process. 2. Added `amdgpu_set_on_reset_callbacks`: - Allows KFD and AMDGPU to register callback functions for pre-reset and post-reset operations. - Callbacks are stored in a global linked list and invoked in the correct order during SDMA reset. This patch ensures that both AMDGPU and KFD can handle SDMA reset events gracefully, with proper state saving and restoration. It also provides a flexible callback mechanism for future extensions. v2: fix CamelCase and put the SDMA helper into amdgpu_sdma.c (Alex) v3: rename the `amdgpu_register_on_reset_callbacks` function to `amdgpu_sdma_register_on_reset_callbacks` move global reset_callback_list to struct amdgpu_sdma (Alex) v4: Update the reset callback function description and rename the reset function to amdgpu_sdma_reset_engine (Alex) Suggested-by: Alex Deucher <alexander.deucher@amd.com> Suggested-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu: correct the name of mes_pipe structure	Likun Gao
	Correct the structure name admgpu_mes_pipe to amdgpu_mes_pipe. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdkfd: Preserve cp_hqd_pq_control on update_mqd	David Yat Sin
	When userspace applications call AMDKFD_IOC_UPDATE_QUEUE. Preserve bitfields that do not need to be modified as they contain flags to track queue states that are used by CP FW. Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Jay Cornwall <jay.cornwall@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	amdgpu/pm/legacy: fix suspend/resume issues	chr[]
	resume and irq handler happily races in set_power_state() * amdgpu_legacy_dpm_compute_clocks() needs lock * protect irq work handler * fix dpm_enabled usage v2: fix clang build, integrate Lijo's comments (Alex) Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2524 Fixes: 3712e7a49459 ("drm/amd/pm: unified lock protections in amdgpu_dpm.c") Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Tested-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> # on Oland PRO Signed-off-by: chr[] <chris@rudorff.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/amdgpu: update the handle ptr in is_idle	Sunil Khatri
	Update the *handle to amdgpu_ip_block ptr for all functions pointers of is_idle. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25	drm/msm/dp: Add support for LTTPR handling	Abel Vesa
	Link Training Tunable PHY Repeaters (LTTPRs) are defined in DisplayPort 1.4a specification. As the name suggests, these PHY repeaters are capable of adjusting their output for link training purposes. According to the DisplayPort standard, LTTPRs have two operating modes: - non-transparent - it replies to DPCD LTTPR field specific AUX requests, while passes through all other AUX requests - transparent - it passes through all AUX requests. Switching between these two modes is done by the DPTX by issuing an AUX write to the DPCD PHY_REPEATER_MODE register. The msm DP driver is currently lacking any handling of LTTPRs. This means that if at least one LTTPR is found between DPTX and DPRX, the link training would fail if that LTTPR was not already configured in transparent mode. The section 3.6.6.1 from the DisplayPort v2.0 specification mandates that before link training with the LTTPR is started, the DPTX may place the LTTPR in non-transparent mode by first switching to transparent mode and then to non-transparent mode. This operation seems to be needed only on first link training and doesn't need to be done again until device is unplugged. It has been observed on a few X Elite-based platforms which have such LTTPRs in their board design that the DPTX needs to follow the procedure described above in order for the link training to be successful. So add support for reading the LTTPR DPCD caps to figure out the number of such LTTPRs first. Then, for platforms (or Type-C dongles) that have at least one such an LTTPR, set its operation mode to transparent mode first and then to non-transparent, just like the mentioned section of the specification mandates. Tested-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250203-drm-dp-msm-add-lttpr-transparent-mode-set-v5-4-c865d0e56d6e@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2025-02-25	drm/i915/dp: Use the generic helper to control LTTPR transparent mode	Abel Vesa
	LTTPRs operating modes are defined by the DisplayPort standard and the generic framework now provides a helper to switch between them, which is handling the explicit disabling of non-transparent mode and its disable->enable sequence mentioned in the DP Standard v2.0 section 3.6.6.1. So use the new drm generic helper instead as it makes the code a bit cleaner. Since the driver specific implementation holds the lttrp_common_caps, if the call to the drm generic helper fails, the lttrp_common_caps need to be updated as the helper has already rolled back to transparent mode. Acked-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250203-drm-dp-msm-add-lttpr-transparent-mode-set-v5-3-c865d0e56d6e@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2025-02-25	drm/nouveau/dp: Use the generic helper to control LTTPR transparent mode	Abel Vesa
	LTTPRs operating modes are defined by the DisplayPort standard and the generic framework now provides a helper to switch between them, which is handling the explicit disabling of non-transparent mode and its disable->enable sequence mentioned in the DP Standard v2.0 section 3.6.6.1. So use the new drm generic helper instead as it makes the code a bit cleaner. Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Danilo Krummrich <dakr@kernel.org> # via IRC Link: https://patchwork.freedesktop.org/patch/msgid/20250203-drm-dp-msm-add-lttpr-transparent-mode-set-v5-2-c865d0e56d6e@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2025-02-25	drm/dp: Add helper to set LTTPRs in transparent mode	Abel Vesa
	According to the DisplayPort standard, LTTPRs have two operating modes: - non-transparent - it replies to DPCD LTTPR field specific AUX requests, while passes through all other AUX requests - transparent - it passes through all AUX requests. Switching between this two modes is done by the DPTX by issuing an AUX write to the DPCD PHY_REPEATER_MODE register. Add a generic helper that allows switching between these modes. Also add a generic wrapper for the helper that handles the explicit disabling of non-transparent mode and its disable->enable sequence mentioned in the DP Standard v2.0 section 3.6.6.1. Do this in order to move this handling out of the vendor specific driver implementation into the generic framework. Tested-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250203-drm-dp-msm-add-lttpr-transparent-mode-set-v5-1-c865d0e56d6e@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2025-02-25	drm/vkms: Round fixp2int conversion in lerp_u16	Harry Wentland
	fixp2int always rounds down, fixp2int_ceil rounds up. We need the new fixp2int_round. Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Louis Chauvet <louis.chauvet@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241220043410.416867-3-alex.hung@amd.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-02-25	drm/xe/oa: Allow oa_exponent value of 0	Umesh Nerlige Ramappa
	OA exponent value of 0 is a valid value for periodic reports. Allow user to pass 0 for the OA sampling interval since it gets converted to 2 gt clock ticks. v2: Update the check in xe_oa_stream_init as well (Ashutosh) v3: Fix mi-rpc failure by setting default exponent to -1 (CI) v4: Add the Fixes tag Fixes: b6fd51c62119 ("drm/xe/oa/uapi: Define and parse OA stream properties") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250221213352.1712932-1-umesh.nerlige.ramappa@intel.com (cherry picked from commit 30341f0b8ea71725cc4ab2c43e3a3b749892fc92) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-25	drm/i915/dp_mst: Fix encoder HW state readout for UHBR MST	Imre Deak
	The encoder HW/SW state verification should use a SW state which stays unchanged while the encoder/output is active. The intel_dp::is_mst flag used during state computation to choose between the DP SST/MST modes can change while the output is active, if the sink gets disconnected or the MST topology is removed for another reason. A subsequent state verification using intel_dp::is_mst leads then to a mismatch if the output is disabled/re-enabled without recomputing its state. Use the encoder's active MST link count instead, which will be always non-zero for an active MST output and will be zero for SST. Fixes: 35d2e4b75649 ("drm/i915/ddi: start distinguishing 128b/132b SST and MST at state readout") Fixes: 40d489fac0e8 ("drm/i915/ddi: handle 128b/132b SST in intel_ddi_read_func_ctl()") Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250224093242.1859583-1-imre.deak@intel.com
2025-02-25	Merge drm/drm-next into drm-misc-next	Thomas Zimmermann
	Backmerging to get fixes from v6.14-rc4. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2025-02-25	drm: panel: Add a panel driver for the Summit display	Sasha Finkelstein
	This is the display panel used for the touchbar on laptops that have it. Co-developed-by: Nick Chan <towinchenmi@gmail.com> Signed-off-by: Nick Chan <towinchenmi@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Reviewed-by: Neal Gompa <neal@gompa.dev> Signed-off-by: Sasha Finkelstein <fnkl.kernel@gmail.com> Link: https://lore.kernel.org/r/20250217-adpdrm-v7-3-ca2e44b3c7d8@gmail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250217-adpdrm-v7-3-ca2e44b3c7d8@gmail.com
2025-02-25	drm/panel: simple: Add BOE AV123Z7M-N17 panel	Maud Spierings
	Add support for the BOE AV123Z7M-N17 12.3" LVDS panel. Signed-off-by: Maud Spierings <maudspierings@gocontroll.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250224-initial_display-v1-8-5ccbbf613543@gocontroll.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250224-initial_display-v1-8-5ccbbf613543@gocontroll.com
2025-02-25	drm/panel: simple: add BOE AV101HDT-A10 panel	Maud Spierings
	add support for the BOE AV101HDT-A10 10.1" LVDS panel Signed-off-by: Maud Spierings <maudspierings@gocontroll.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250224-initial_display-v1-7-5ccbbf613543@gocontroll.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250224-initial_display-v1-7-5ccbbf613543@gocontroll.com
2025-02-25	drm/mipi-dsi: extend "multi" functions and use them in sony-td4353-jdi	Tejas Vipin
	Removes mipi_dsi_dcs_set_tear_off and replaces it with a multi version as after replacing it in sony-td4353-jdi, it doesn't appear anywhere else. sony-td4353-jdi is converted to use multi style functions, including mipi_dsi_dcs_set_tear_off_multi. Signed-off-by: Tejas Vipin <tejasvipin76@gmail.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20250220045721.145905-1-tejasvipin76@gmail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250220045721.145905-1-tejasvipin76@gmail.com
2025-02-25	Merge tag 'v6.14-rc4' into drm-next	Dave Airlie
	Backmerge Linux 6.14-rc4 at the request of tzimmermann so misc-next can base on rc4. Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-02-24	drm/repaper: fix integer overflows in repeat functions	Nikita Zhandarovich
	There are conditions, albeit somewhat unlikely, under which right hand expressions, calculating the end of time period in functions like repaper_frame_fixed_repeat(), may overflow. For instance, if 'factor10x' in repaper_get_temperature() is high enough (170), as is 'epd->stage_time' in repaper_probe(), then the resulting value of 'end' will not fit in unsigned int expression. Mitigate this by casting 'epd->factored_stage_time' to wider type before any multiplication is done. Found by Linux Verification Center (linuxtesting.org) with static analysis tool SVACE. Fixes: 3589211e9b03 ("drm/tinydrm: Add RePaper e-ink driver") Cc: stable@vger.kernel.org Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Signed-off-by: Alex Lanzano <lanzano.alex@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250116134801.22067-1-n.zhandarovich@fintech.ru
2025-02-24	drm/bridge: ti-sn65dsi86: Check for CONFIG_PWM using IS_REACHABLE()	Uwe Kleine-König
	Currently CONFIG_PWM is a bool but I intend to change it to tristate. If CONFIG_PWM=m in the configuration, the cpp symbol CONFIG_PWM isn't defined and so the PWM code paths in the ti-sn65dsi86 driver are not used. The correct way to check for CONFIG_PWM is using IS_REACHABLE which does the right thing for all cases CONFIG_DRM_TI_SN65DSI86 ∈ { y, m } x CONFIG_PWM ∈ { y, m, n }. There is no change until CONFIG_PWM actually becomes tristate. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Reviewed-by: Louis Chauvet <louis.chauvet@bootlin.com> Reviewed-by: Robert Foss <rfoss@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250217174936.758420-2-u.kleine-koenig@baylibre.com
2025-02-24	drm/xe/oa: Allow oa_exponent value of 0	Umesh Nerlige Ramappa
	OA exponent value of 0 is a valid value for periodic reports. Allow user to pass 0 for the OA sampling interval since it gets converted to 2 gt clock ticks. v2: Update the check in xe_oa_stream_init as well (Ashutosh) v3: Fix mi-rpc failure by setting default exponent to -1 (CI) v4: Add the Fixes tag Fixes: b6fd51c62119 ("drm/xe/oa/uapi: Define and parse OA stream properties") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250221213352.1712932-1-umesh.nerlige.ramappa@intel.com
2025-02-24	drm/xe/devcoredump: Remove IS_ERR_OR_NULL check for kzalloc	Shuicheng Lin
	kzalloc returns a valid pointer or NULL if the allocation fails. It never returns an error pointer. It is better to check for NULL directly. Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250220001710.1803749-3-shuicheng.lin@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-24	drm/xe/devcoredump: Fix print typo of offset	Shuicheng Lin
	The log should print with "offset" instead of "size". Correct the typo in the comment. v2: split kzalloc change and add typo fix in commit message (Lucas) Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250220001710.1803749-2-shuicheng.lin@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-24	drm/xe/xe_pmu: Acquire forcewake on event init for engine events	Riana Tauro
	When the engine events are created, acquire GT forcewake to read gpm timestamp required for the events and release on event destroy. This cannot be done during read due to the raw spinlock held my pmu. v2: remove forcewake counting (Umesh) v3: remove extra space (Umesh) v4: use event pmu private data (Lucas) free local copy (Umesh) Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250224053903.2253539-6-riana.tauro@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-24	drm/xe/xe_pmu: Add PMU support for engine activity	Riana Tauro
	PMU provides two counters (engine-active-ticks, engine-total-ticks) to calculate engine activity. When querying engine activity, user must group these 2 counters using the perf_event group mechanism to ensure both counters are sampled together. To list the events ./perf list xe_0000_03_00.0/engine-active-ticks/ [Kernel PMU event] xe_0000_03_00.0/engine-total-ticks/ [Kernel PMU event] The formats to be used with the above are engine_instance - config:12-19 engine_class - config:20-27 gt - config:60-63 The events can then be read using perf tool ./perf stat -e xe_0000_03_00.0/engine-active-ticks,gt=0, engine_class=0,engine_instance=0/, xe_0000_03_00.0/engine-total-ticks,gt=0, engine_class=0,engine_instance=0/ -I 1000 Engine activity can then be calculated as below engine activity % = (engine active ticks/engine total ticks) * 100 v2: validate gt rename total-ticks to engine-total-ticks add helper to get hwe (Umesh) v3: fix checkpatch warning add details to documentation (Umesh) remove ascii formats from documentation (Lucas) v4: remove unnecessary warn within raw_spinlock (Lucas) Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250224053903.2253539-5-riana.tauro@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-24	drm/xe/userptr: fix EFAULT handling	Matthew Auld
	Currently we treat EFAULT from hmm_range_fault() as a non-fatal error when called from xe_vm_userptr_pin() with the idea that we want to avoid killing the entire vm and chucking an error, under the assumption that the user just did an unmap or something, and has no intention of actually touching that memory from the GPU. At this point we have already zapped the PTEs so any access should generate a page fault, and if the pin fails there also it will then become fatal. However it looks like it's possible for the userptr vma to still be on the rebind list in preempt_rebind_work_func(), if we had to retry the pin again due to something happening in the caller before we did the rebind step, but in the meantime needing to re-validate the userptr and this time hitting the EFAULT. This explains an internal user report of hitting: [ 191.738349] WARNING: CPU: 1 PID: 157 at drivers/gpu/drm/xe/xe_res_cursor.h:158 xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738551] Workqueue: xe-ordered-wq preempt_rebind_work_func [xe] [ 191.738616] RIP: 0010:xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738690] Call Trace: [ 191.738692] <TASK> [ 191.738694] ? show_regs+0x69/0x80 [ 191.738698] ? __warn+0x93/0x1a0 [ 191.738703] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738759] ? report_bug+0x18f/0x1a0 [ 191.738764] ? handle_bug+0x63/0xa0 [ 191.738767] ? exc_invalid_op+0x19/0x70 [ 191.738770] ? asm_exc_invalid_op+0x1b/0x20 [ 191.738777] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738834] ? ret_from_fork_asm+0x1a/0x30 [ 191.738849] bind_op_prepare+0x105/0x7b0 [xe] [ 191.738906] ? dma_resv_reserve_fences+0x301/0x380 [ 191.738912] xe_pt_update_ops_prepare+0x28c/0x4b0 [xe] [ 191.738966] ? kmemleak_alloc+0x4b/0x80 [ 191.738973] ops_execute+0x188/0x9d0 [xe] [ 191.739036] xe_vm_rebind+0x4ce/0x5a0 [xe] [ 191.739098] ? trace_hardirqs_on+0x4d/0x60 [ 191.739112] preempt_rebind_work_func+0x76f/0xd00 [xe] Followed by NPD, when running some workload, since the sg was never actually populated but the vma is still marked for rebind when it should be skipped for this special EFAULT case. This is confirmed to fix the user report. v2 (MattB): - Move earlier. v3 (MattB): - Update the commit message to make it clear that this indeed fixes the issue. Fixes: 521db22a1d70 ("drm/xe: Invalidate userptr VMA on page pin fault") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250221143840.167150-5-matthew.auld@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 6b93cb98910c826c2e2004942f8b060311e43618) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-24	drm/xe/userptr: restore invalidation list on error	Matthew Auld
	On error restore anything still on the pin_list back to the invalidation list on error. For the actual pin, so long as the vma is tracked on either list it should get picked up on the next pin, however it looks possible for the vma to get nuked but still be present on this per vm pin_list leading to corruption. An alternative might be then to instead just remove the link when destroying the vma. v2: - Also add some asserts. - Keep the overzealous locking so that we are consistent with the docs; updating the docs and related bits will be done as a follow up. Fixes: ed2bdf3b264d ("drm/xe/vm: Subclass userptr vmas") Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250221143840.167150-4-matthew.auld@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 4e37e928928b730de9aa9a2f5dc853feeebc1742) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-24	drm/xe/guc: Expose engine activity only for supported GuC version	Riana Tauro
	Engine activity is supported only on GuC submission version >= 1.14.1 Allow enabling/reading engine activity only on supported GuC versions. Warn once if not supported. v2: use guc interface version (John) v3: use debug log (Umesh) v4: use variable for supported and use gt logs use a friendlier log message (Michal) v5: fix kernel-doc do not continue in init if not supported (Michal) v6: remove hardcoding values (Michal) Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250224053903.2253539-4-riana.tauro@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-24	drm/xe/trace: Add trace for engine activity	Riana Tauro
	Add engine activity related information to trace events for better debuggability v2: add trace for engine activity (Umesh) v3: use hex for quanta_ratio Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250224053903.2253539-3-riana.tauro@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>