Age | Commit message (Collapse) | Author |
|
add new vcn and jpeg msg
v2: squash in updates (Alex)
v3: rework code for better compat with other smu14.x variants (Alex)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: lima1002 <li.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use include guard macro name that follows naming used by the other
GuC ABI files.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213214908.1481-1-michal.wajdeczko@intel.com
|
|
It's better to keep all hardware GGTT definitions separated from
the driver code. It also helps to avoid duplicated definitions.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240326131042.319-1-michal.wajdeczko@intel.com
|
|
The onboard_usb_hub driver has been updated to support non-hub devices,
which has led to some renaming.
Update to the new name (ONBOARD_USB_DEV) accordingly.
Acked-by: Helen Koike <helen.koike@collabora.com>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Javier Carrasco <javier.carrasco@wolfvision.net>
Link: https://lore.kernel.org/r/20240325-onboard_xvf3500-v8-3-29e3f9222922@wolfvision.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There are cases where soft reset seems to succeed, but
does not, so always use mode1/2 for now.
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Currently, with F32 HWS GPU reset is only when unmap queue fails.
However, if compute queue doesn't repond to preemption request in time
unmap will return without any error. In this case, only preemption error
is logged and Reset is not triggered. Call GPU reset in this case also.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
An errant disk backup on my desktop got into debugfs and triggered the
following deadlock scenario in the amdgpu debugfs files. The machine
also hard-resets immediately after those lines are printed (although I
wasn't able to reproduce that part when reading by hand):
[ 1318.016074][ T1082] ======================================================
[ 1318.016607][ T1082] WARNING: possible circular locking dependency detected
[ 1318.017107][ T1082] 6.8.0-rc7-00015-ge0c8221b72c0 #17 Not tainted
[ 1318.017598][ T1082] ------------------------------------------------------
[ 1318.018096][ T1082] tar/1082 is trying to acquire lock:
[ 1318.018585][ T1082] ffff98c44175d6a0 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0x40/0x80
[ 1318.019084][ T1082]
[ 1318.019084][ T1082] but task is already holding lock:
[ 1318.020052][ T1082] ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu]
[ 1318.020607][ T1082]
[ 1318.020607][ T1082] which lock already depends on the new lock.
[ 1318.020607][ T1082]
[ 1318.022081][ T1082]
[ 1318.022081][ T1082] the existing dependency chain (in reverse order) is:
[ 1318.023083][ T1082]
[ 1318.023083][ T1082] -> #2 (reservation_ww_class_mutex){+.+.}-{3:3}:
[ 1318.024114][ T1082] __ww_mutex_lock.constprop.0+0xe0/0x12f0
[ 1318.024639][ T1082] ww_mutex_lock+0x32/0x90
[ 1318.025161][ T1082] dma_resv_lockdep+0x18a/0x330
[ 1318.025683][ T1082] do_one_initcall+0x6a/0x350
[ 1318.026210][ T1082] kernel_init_freeable+0x1a3/0x310
[ 1318.026728][ T1082] kernel_init+0x15/0x1a0
[ 1318.027242][ T1082] ret_from_fork+0x2c/0x40
[ 1318.027759][ T1082] ret_from_fork_asm+0x11/0x20
[ 1318.028281][ T1082]
[ 1318.028281][ T1082] -> #1 (reservation_ww_class_acquire){+.+.}-{0:0}:
[ 1318.029297][ T1082] dma_resv_lockdep+0x16c/0x330
[ 1318.029790][ T1082] do_one_initcall+0x6a/0x350
[ 1318.030263][ T1082] kernel_init_freeable+0x1a3/0x310
[ 1318.030722][ T1082] kernel_init+0x15/0x1a0
[ 1318.031168][ T1082] ret_from_fork+0x2c/0x40
[ 1318.031598][ T1082] ret_from_fork_asm+0x11/0x20
[ 1318.032011][ T1082]
[ 1318.032011][ T1082] -> #0 (&mm->mmap_lock){++++}-{3:3}:
[ 1318.032778][ T1082] __lock_acquire+0x14bf/0x2680
[ 1318.033141][ T1082] lock_acquire+0xcd/0x2c0
[ 1318.033487][ T1082] __might_fault+0x58/0x80
[ 1318.033814][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu]
[ 1318.034181][ T1082] full_proxy_read+0x55/0x80
[ 1318.034487][ T1082] vfs_read+0xa7/0x360
[ 1318.034788][ T1082] ksys_read+0x70/0xf0
[ 1318.035085][ T1082] do_syscall_64+0x94/0x180
[ 1318.035375][ T1082] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[ 1318.035664][ T1082]
[ 1318.035664][ T1082] other info that might help us debug this:
[ 1318.035664][ T1082]
[ 1318.036487][ T1082] Chain exists of:
[ 1318.036487][ T1082] &mm->mmap_lock --> reservation_ww_class_acquire --> reservation_ww_class_mutex
[ 1318.036487][ T1082]
[ 1318.037310][ T1082] Possible unsafe locking scenario:
[ 1318.037310][ T1082]
[ 1318.037838][ T1082] CPU0 CPU1
[ 1318.038101][ T1082] ---- ----
[ 1318.038350][ T1082] lock(reservation_ww_class_mutex);
[ 1318.038590][ T1082] lock(reservation_ww_class_acquire);
[ 1318.038839][ T1082] lock(reservation_ww_class_mutex);
[ 1318.039083][ T1082] rlock(&mm->mmap_lock);
[ 1318.039328][ T1082]
[ 1318.039328][ T1082] *** DEADLOCK ***
[ 1318.039328][ T1082]
[ 1318.040029][ T1082] 1 lock held by tar/1082:
[ 1318.040259][ T1082] #0: ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu]
[ 1318.040560][ T1082]
[ 1318.040560][ T1082] stack backtrace:
[ 1318.041053][ T1082] CPU: 22 PID: 1082 Comm: tar Not tainted 6.8.0-rc7-00015-ge0c8221b72c0 #17 3316c85d50e282c5643b075d1f01a4f6365e39c2
[ 1318.041329][ T1082] Hardware name: Gigabyte Technology Co., Ltd. B650 AORUS PRO AX/B650 AORUS PRO AX, BIOS F20 12/14/2023
[ 1318.041614][ T1082] Call Trace:
[ 1318.041895][ T1082] <TASK>
[ 1318.042175][ T1082] dump_stack_lvl+0x4a/0x80
[ 1318.042460][ T1082] check_noncircular+0x145/0x160
[ 1318.042743][ T1082] __lock_acquire+0x14bf/0x2680
[ 1318.043022][ T1082] lock_acquire+0xcd/0x2c0
[ 1318.043301][ T1082] ? __might_fault+0x40/0x80
[ 1318.043580][ T1082] ? __might_fault+0x40/0x80
[ 1318.043856][ T1082] __might_fault+0x58/0x80
[ 1318.044131][ T1082] ? __might_fault+0x40/0x80
[ 1318.044408][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu 8fe2afaa910cbd7654c8cab23563a94d6caebaab]
[ 1318.044749][ T1082] full_proxy_read+0x55/0x80
[ 1318.045042][ T1082] vfs_read+0xa7/0x360
[ 1318.045333][ T1082] ksys_read+0x70/0xf0
[ 1318.045623][ T1082] do_syscall_64+0x94/0x180
[ 1318.045913][ T1082] ? do_syscall_64+0xa0/0x180
[ 1318.046201][ T1082] ? lockdep_hardirqs_on+0x7d/0x100
[ 1318.046487][ T1082] ? do_syscall_64+0xa0/0x180
[ 1318.046773][ T1082] ? do_syscall_64+0xa0/0x180
[ 1318.047057][ T1082] ? do_syscall_64+0xa0/0x180
[ 1318.047337][ T1082] ? do_syscall_64+0xa0/0x180
[ 1318.047611][ T1082] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[ 1318.047887][ T1082] RIP: 0033:0x7f480b70a39d
[ 1318.048162][ T1082] Code: 91 ba 0d 00 f7 d8 64 89 02 b8 ff ff ff ff eb b2 e8 18 a3 01 00 0f 1f 84 00 00 00 00 00 80 3d a9 3c 0e 00 00 74 17 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 5b c3 66 2e 0f 1f 84 00 00 00 00 00 53 48 83
[ 1318.048769][ T1082] RSP: 002b:00007ffde77f5c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 1318.049083][ T1082] RAX: ffffffffffffffda RBX: 0000000000000800 RCX: 00007f480b70a39d
[ 1318.049392][ T1082] RDX: 0000000000000800 RSI: 000055c9f2120c00 RDI: 0000000000000008
[ 1318.049703][ T1082] RBP: 0000000000000800 R08: 000055c9f2120a94 R09: 0000000000000007
[ 1318.050011][ T1082] R10: 0000000000000000 R11: 0000000000000246 R12: 000055c9f2120c00
[ 1318.050324][ T1082] R13: 0000000000000008 R14: 0000000000000008 R15: 0000000000000800
[ 1318.050638][ T1082] </TASK>
amdgpu_debugfs_mqd_read() holds a reservation when it calls
put_user(), which may fault and acquire the mmap_sem. This violates
the established locking order.
Bounce the mqd data through a kernel buffer to get put_user() out of
the illegal section.
Fixes: 445d85e3c1df ("drm/amdgpu: add debugfs interface for reading MQDs")
Cc: stable@vger.kernel.org # v6.5+
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Share same codes with 4.0.5 and enable collaborate mode for VPE.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Align with FW changes.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHY & HOW]
DCN351 and DCN35 should use the same bounding box and IP settings.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jun Lei <jun.lei@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Xi Liu <xi.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This version brings along following fixes:
- Fix some bound and NULL check
- Fix nonseamless transition from ODM + MPO to ODM + subvp
- Allow Z8 when stutter threshold is not met
- Remove plane and stream pointers from dc scratch
- Remove read/write to external register
- Increase number of hpo dp link encoders
- Increase clock table size
- Add new IPS config mode
- Build scaling params when a new plane is appended
- Refactor DML2 interfaces
- Allow idle opts for no flip case on PSR panel
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
Driver crashes when pipe idx not set properly
[how]
Add code to skip the pipe that idx not set properly
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Muhammad Ahmed <ahmed.ahmed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
- Add Display PHY FSM command interface for automated testing
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Anthony Koo <anthony.koo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHY&HOW]
Converting the watermark set structure to a union and modifying some interfaces
to accommodate future usage.
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why&How]
Remove several plane and stream pointers from dc for code
refactoring.
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Number of hpo dp2 link encoders is increased.
Instances are changed.
[How]
Increased size in resource pool, init for each instance
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Sridevi Arvindekar <sridevi.arvindekar@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why&how]
To prevent out of bounds error, we need
to increase the clock table size.
Reviewed-by: Xi Liu <xi.liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Sung Joon Kim <sungkim@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why&How]
Some panels don't meet the stutter threshold (4k etc), this leads to
power regressions. Allow z8 for panels that don't meet the threshold
but support PSR/replay
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Bhawanpreet Lakha <bhawanpreet.lakha@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why&How]
Some panels don't meet the stutter threshold (4k etc), this leads to
power regressions. Allow z8 for panels that don't meet the threshold
but support PSR/replay
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Bhawanpreet Lakha <bhawanpreet.lakha@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
We don't have a way to specify IPS2 for display off but RCG only for
static screen and local video playback.
[How]
Add a new setting that allows RCG only when displays are active but
IPS2 when all displays are off.
Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
NumFclkLevelsEnabled is used for DcfClocks bounds check
instead of designated NumDcfClkLevelsEnabled.
That can cause array index out-of-bounds access.
[How]
Use designated variable for dcn35 DcfClocks bounds check.
Fixes: a8edc9cc0b14 ("drm/amd/display: Fix array-index-out-of-bounds in dcn35_clkmgr")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
MPC flow rate control is not needed for DCN30 and above. Current logic
that uses it can result in underflow for certain edge cases (such as
DSC N422 + ODM combine + 422 left edge pixel).
[How]
Remove MPC flow rate control logic and programming for DCN30 and above.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why & how]
We are boundling changes in plane state and build scaling params
together. This is to simplify DML code so DML doesn't need to build
scaling params. We are also avoiding rebuilding scaling params for
planes without scaling changes.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
when ODM + MPO is used for all 4 available pipes. Pipe transition will
be nonseamless. Phantom OTG master pipe reuses the secondary OPP head
pipe. There is no possible seamless path to transit to the new
state. The correct logic would be to reuse a secondary DPP pipe as the
phantom OTG master pipe. This way we are able to first transit the
minimal transtion state of new and then transit to new state seamlessly.
current New (nonseamless)
________________________ ________________________
| plane0 slice0 stream0| | plane0 slice0 stream0|
|DPP0----OPP0----OTG0----| |DPP0----OPP0----OTG0----|
| plane1 | | | | plane0 slice1 | |
|DPP2----| | | |DPP2----OPP2----| |
| plane0 slice1 | | | plane0 slice0 stream1|
|DPP1----OPP1----| | |DPP1----OPP1----OTG1----|
| plane1 | | | plane0 slice1 | |
|DPP3----| | |DPP3----OPP3----| |
|________________________| |________________________|
New (seamless) New (minimal transition)
________________________ ________________________
| plane0 slice0 stream0| | plane0 slice0 stream0|
|DPP0----OPP0----OTG0----| |DPP0----OPP0----OTG0----|
| plane0 slice1 | | | plane0 slice1 | |
|DPP1----OPP1----| | |DPP1----OPP1----| |
| plane0 slice0 stream1| |________________________|
|DPP2----OPP2----OTG2----|
| plane0 slice1 | |
|DPP3----OPP3----| |
|________________________|
[how]
Try to acquire free pipes used as secondary DPP pipes from current state
before try to acquire any free pipes for new OTG master pipe.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why&how]
In some platform out_transfer_func may not be popualted. We need to check
for null before dereferencing it.
Fixes: d2dea1f14038 ("drm/amd/display: Generalize new minimal transition path")
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why&how]
Add the missing null check before dereference for dc_stream_status*
Fixes: 2d5bb791e24f ("drm/amd/display: Implement update_planes_and_stream_v3 sequence")
Reviewed-by: Josip Pavic <josip.pavic@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Sohaib Nadeem <sohaib.nadeem@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why&How}
Some interfaces needed changes to support future architectures.
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why&How]
These additional callbacks to DC will be required for the DML2 wrapper. Also
consolidate common callbacks for projects to a single location for maintenance.
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why&how]
We need to remove the reference to these registers to
prevent any usage in the future.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Sung Joon Kim <sungkim@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Previous patch to allow DTBCLK disable didn't address boot case. Driver
thinks DTBCLK is disabled by default, so we don't send disable message to
PMFW. DTBCLK is then enabled at idle desktop on boot, burning power.
[How]
Set dtbclk_en to true on boot so that disable message is sent during first
commit.
Fixes: 27750e176a4f ("drm/amd/display: Allow DTBCLK disable for DCN35")
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Taimur Hassan <syed.hassan@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
transitions.
[WHY]
Previously, we'd disabled HPO whenever an HPO display was disconnected. This
caused other HPO displays to blank whenever one was unplugged.
[HOW]
This change restricts HPO enable/disable to dce110_apply_ctx_to_hw and adds a
helper function (dce110_is_hpo_enabled) that returns true if any HPO displays
are present in a context. We compare the current and previous dc ctx to check
whether HPO is transitioning from on to off or vice versa, and adjust the HPO
state accordingly.
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Reviewed-by: Chris Park <chris.park@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Natanel Roizenman <natanel.roizenman@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why & how]
There were some fixes in dcn35 that need
to be ported over to dcn351 to prevent any
regression.
Signed-off-by: Sung Joon Kim <sungkim@amd.com>
Reviewed-by: Liu, Xi (Alex) <xiliu102@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We need to re-enable idle power optimizations after entering PSR. Since,
we get kicked out of idle power optimizations before entering PSR
(entering PSR requires us to write to DCN registers, which isn't allowed
while we are in IPS).
Fixes: a9b1a4f684b3 ("drm/amd/display: Add more checks for exiting idle in DC")
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why & How]
There is a corner case where a single PSR panel
fails to enter idle optimizations if the panel
is not flipping (no planes or DPMS_OFF == true).
This is because the panel will not enter PSR if it's
not flipping, but this will prevent the FW idle opt
path from being executed. To handle this case we will
allow entry to idle opt from driver side even when a
PSR panel is connected under the following scenarios:
1. Only a single PSR panel is connected
2. PSR panel is not flipping
Reviewed-by: Samson Tam <samson.tam@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Refactor xe_sync_entry_signal so it doesn't have to
modify xe_sched_job struct instead create a new helper function
to set user fence values for a job.
v2: Move the sync type check to xe_sched_job_init_user_fence(Lucas)
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240321161142.4954-1-nirmoy.das@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
SLPC shutdown is called in reset and suspend paths. In the reset
path, it is possible that the H2G call gets lost as GuC is in the
process of being reset. There is no value in stopping SLPC when
it will happen anyways.
In the suspend path, we disable communication with GuC, so there
is no need to explicitly shutdown SLPC.
v2: Rebase
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240325235602.1155486-1-vinay.belgaumkar@intel.com
|
|
Use FIELD_PREP for setting lrc descriptor fields instead
of shifting values to fields.
v2: Use ULL macro variants
v3: Do not use FIELD_PREP for 1-bit values
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240322191455.7613-1-niranjana.vishwanathapura@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
drivers/gpu/drm/i915/gt/intel_workarounds.c.rej was incorrectly added to
the tree after solving a conflict. Remove it.
Fixes: 326e30e4624c ("drm/i915: Drop dead code for pvc")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/r/20240325083435.4f970eec@canb.auug.org.au
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240325144728.537855-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Historically the expectation was to set brightness to max on enable, if
it was zero. However, the policy should be to preserve brightness across
disable/enable, for example the userspace might want to dim the
brightness before disable (e.g. on suspend) and gradually bring it back
up after enable (e.g. on resume). If the brightness gets bumped to max
at enable, this causes flicker as the userspace expects the brightness
to have been preserved across disable/enable.
For example:
(suspend)
[53949.248875] i915 0000:00:02.0: [drm:intel_edp_backlight_off]
[53949.452046] i915 0000:00:02.0: [drm:intel_backlight_set_pwm_level] set backlight PWM = 0
(wakeup)
[53986.067356] i915 0000:00:02.0: [drm:intel_edp_backlight_on]
[53986.067367] i915 0000:00:02.0: [drm:intel_backlight_enable] pipe A
[53986.067476] i915 0000:00:02.0: [drm:intel_backlight_set_pwm_level] set backlight PWM = 96000
[53986.119766] backlight intel_backlight: PM: calling backlight_resume+0x0/0x7a @ 4916, parent: card0-eDP-1
[53986.119781] backlight intel_backlight: PM: backlight_resume+0x0/0x7a returned 0 after 0 usecs
[53986.220068] [drm:intel_backlight_device_update_status] updating intel_backlight, brightness=26321/96000
[53986.220086] i915 0000:00:02.0: [drm:intel_panel_actually_set_backlight] set backlight level = 27961
Reduce the check to respecting the minimum brightness, and avoid bumping
min brightness to max on enable.
Note: It's possible there's still userspace out there that relies on the
old behaviour. It would be unfortunate, but there's really only one way
to find out.
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Gareth Yu <gareth.yu@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[Jani: Rewrote the commit message.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240321045311.2124111-1-gareth.yu@intel.com
|
|
When we have no VBT we currently assume ports A-F are
all pontially valid for every platform. That is nonsense.
Grab the bitmask of valid ports from the runtime info
instead.
Although the defaults we actually fill here look semi-sensible
only for hsw-skl era hardware. Dunno if we should try to do
something more appropriate here for other platforms,
or just try to nuke the whole thing?
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240319092443.15769-2-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
intel_bios_encoder_supports_dp_dual_mode()
If we have no VBT, or the VBT didn't declare the encoder
in question, we won't have the 'devdata' for the encoder.
Instead of oopsing just bail early.
We won't be able to tell whether the port is DP++ or not,
but so be it.
Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10464
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240319092443.15769-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
Calling i915_gem_object_get_dma_address() from the vblank
evade critical section triggers might_sleep().
While we know that we've already pinned the framebuffer
and thus i915_gem_object_get_dma_address() will in fact
not sleep in this case, it seems reasonable to keep the
unconditional might_sleep() for maximum coverage.
So let's instead pre-populate the dma address during
fb pinning, which all happens before we enter the
vblank evade critical section.
We can use u32 for the dma address as this class of
hardware doesn't support >32bit addresses.
Cc: stable@vger.kernel.org
Fixes: 0225a90981c8 ("drm/i915: Make cursor plane registers unlocked")
Reported-by: Borislav Petkov <bp@alien8.de>
Closes: https://lore.kernel.org/intel-gfx/20240227100342.GAZd2zfmYcPS_SndtO@fat_crate.local/
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240325175738.3440-1-ville.syrjala@linux.intel.com
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
|
|
There is a spelling mistake in a drm_err message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240326100219.43989-1-colin.i.king@gmail.com
|
|
The init sequence specifies the 0x11 and 0x29 dsi commands, which are
the exit-sleep and display-on commands.
In the actual prepare step the driver already uses the appropriate
function calls for those, so drop the duplicates.
Fixes: e5f9d543419c ("drm/panel: ltk050h3146w: add support for Leadtek LTK050H3148W-CTA6 variant")
Signed-off-by: Heiko Stuebner <heiko.stuebner@cherry.de>
Reviewed-by: Quentin Schulz <quentin.schulz@theobroma-systems.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240320131232.327196-2-heiko@sntech.de
|
|
Similar to other variants, the LTK050H3148W wants to run in video mode
when displaying data. So far only the Synopsis DSI driver was using this
panel and it is always switching to video mode, independent of this flag
being set.
Other DSI drivers might handle this differently, so add the flag.
Fixes: e5f9d543419c ("drm/panel: ltk050h3146w: add support for Leadtek LTK050H3148W-CTA6 variant")
Signed-off-by: Heiko Stuebner <heiko.stuebner@cherry.de>
Reviewed-by: Quentin Schulz <quentin.schulz@theobroma-systems.com>
Acked-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240320131232.327196-1-heiko@sntech.de
|
|
The Alpha blending for 30 bit RGB/BGR are not
functioning properly for rk3568/rk3588, so remove
it from the format list.
Fixes: bfd8a5c228fa ("drm/rockchip: vop2: Add more supported 10bit formats")
Signed-off-by: Andy Yan <andy.yan@rock-chips.com>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240304100952.3592984-1-andyshrk@163.com
|
|
Originally, with strict in order execution, we could complete execution
only when the queue was empty. Preempt-to-busy allows replacement of an
active request that may complete before the preemption is processed by
HW. If that happens, the request is retired from the queue, but the
queue_priority_hint remains set, preventing direct submission until
after the next CS interrupt is processed.
This preempt-to-busy race can be triggered by the heartbeat, which will
also act as the power-management barrier and upon completion allow us to
idle the HW. We may process the completion of the heartbeat, and begin
parking the engine before the CS event that restores the
queue_priority_hint, causing us to fail the assertion that it is MIN.
<3>[ 166.210729] __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1))
<0>[ 166.210781] Dumping ftrace buffer:
<0>[ 166.210795] ---------------------------------
...
<0>[ 167.302811] drm_fdin-1097 2..s1. 165741070us : trace_ports: 0000:00:02.0 rcs0: promote { ccid:20 1217:2 prio 0 }
<0>[ 167.302861] drm_fdin-1097 2d.s2. 165741072us : execlists_submission_tasklet: 0000:00:02.0 rcs0: preempting last=1217:2, prio=0, hint=2147483646
<0>[ 167.302928] drm_fdin-1097 2d.s2. 165741072us : __i915_request_unsubmit: 0000:00:02.0 rcs0: fence 1217:2, current 0
<0>[ 167.302992] drm_fdin-1097 2d.s2. 165741073us : __i915_request_submit: 0000:00:02.0 rcs0: fence 3:4660, current 4659
<0>[ 167.303044] drm_fdin-1097 2d.s1. 165741076us : execlists_submission_tasklet: 0000:00:02.0 rcs0: context:3 schedule-in, ccid:40
<0>[ 167.303095] drm_fdin-1097 2d.s1. 165741077us : trace_ports: 0000:00:02.0 rcs0: submit { ccid:40 3:4660* prio 2147483646 }
<0>[ 167.303159] kworker/-89 11..... 165741139us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence c90:2, current 2
<0>[ 167.303208] kworker/-89 11..... 165741148us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:c90 unpin
<0>[ 167.303272] kworker/-89 11..... 165741159us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 1217:2, current 2
<0>[ 167.303321] kworker/-89 11..... 165741166us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:1217 unpin
<0>[ 167.303384] kworker/-89 11..... 165741170us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 3:4660, current 4660
<0>[ 167.303434] kworker/-89 11d..1. 165741172us : __intel_context_retire: 0000:00:02.0 rcs0: context:1216 retire runtime: { total:56028ns, avg:56028ns }
<0>[ 167.303484] kworker/-89 11..... 165741198us : __engine_park: 0000:00:02.0 rcs0: parked
<0>[ 167.303534] <idle>-0 5d.H3. 165741207us : execlists_irq_handler: 0000:00:02.0 rcs0: semaphore yield: 00000040
<0>[ 167.303583] kworker/-89 11..... 165741397us : __intel_context_retire: 0000:00:02.0 rcs0: context:1217 retire runtime: { total:325575ns, avg:0ns }
<0>[ 167.303756] kworker/-89 11..... 165741777us : __intel_context_retire: 0000:00:02.0 rcs0: context:c90 retire runtime: { total:0ns, avg:0ns }
<0>[ 167.303806] kworker/-89 11..... 165742017us : __engine_park: __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1))
<0>[ 167.303811] ---------------------------------
<4>[ 167.304722] ------------[ cut here ]------------
<2>[ 167.304725] kernel BUG at drivers/gpu/drm/i915/gt/intel_engine_pm.c:283!
<4>[ 167.304731] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4>[ 167.304734] CPU: 11 PID: 89 Comm: kworker/11:1 Tainted: G W 6.8.0-rc2-CI_DRM_14193-gc655e0fd2804+ #1
<4>[ 167.304736] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022
<4>[ 167.304738] Workqueue: i915-unordered retire_work_handler [i915]
<4>[ 167.304839] RIP: 0010:__engine_park+0x3fd/0x680 [i915]
<4>[ 167.304937] Code: 00 48 c7 c2 b0 e5 86 a0 48 8d 3d 00 00 00 00 e8 79 48 d4 e0 bf 01 00 00 00 e8 ef 0a d4 e0 31 f6 bf 09 00 00 00 e8 03 49 c0 e0 <0f> 0b 0f 0b be 01 00 00 00 e8 f5 61 fd ff 31 c0 e9 34 fd ff ff 48
<4>[ 167.304940] RSP: 0018:ffffc9000059fce0 EFLAGS: 00010246
<4>[ 167.304942] RAX: 0000000000000200 RBX: 0000000000000000 RCX: 0000000000000006
<4>[ 167.304944] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
<4>[ 167.304946] RBP: ffff8881330ca1b0 R08: 0000000000000001 R09: 0000000000000001
<4>[ 167.304947] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8881330ca000
<4>[ 167.304948] R13: ffff888110f02aa0 R14: ffff88812d1d0205 R15: ffff88811277d4f0
<4>[ 167.304950] FS: 0000000000000000(0000) GS:ffff88844f780000(0000) knlGS:0000000000000000
<4>[ 167.304952] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 167.304953] CR2: 00007fc362200c40 CR3: 000000013306e003 CR4: 0000000000770ef0
<4>[ 167.304955] PKRU: 55555554
<4>[ 167.304957] Call Trace:
<4>[ 167.304958] <TASK>
<4>[ 167.305573] ____intel_wakeref_put_last+0x1d/0x80 [i915]
<4>[ 167.305685] i915_request_retire.part.0+0x34f/0x600 [i915]
<4>[ 167.305800] retire_requests+0x51/0x80 [i915]
<4>[ 167.305892] intel_gt_retire_requests_timeout+0x27f/0x700 [i915]
<4>[ 167.305985] process_scheduled_works+0x2db/0x530
<4>[ 167.305990] worker_thread+0x18c/0x350
<4>[ 167.305993] kthread+0xfe/0x130
<4>[ 167.305997] ret_from_fork+0x2c/0x50
<4>[ 167.306001] ret_from_fork_asm+0x1b/0x30
<4>[ 167.306004] </TASK>
It is necessary for the queue_priority_hint to be lower than the next
request submission upon waking up, as we rely on the hint to decide when
to kick the tasklet to submit that first request.
Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10154
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318135906.716055-2-janusz.krzysztofik@linux.intel.com
|
|
This reverts commit 7a2280e8dcd2f1f436db9631287c0b21cf6a92b0, obsoleted
by "drm/i915/vma: Fix UAF on destroy against retire race".
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240305143747.335367-8-janusz.krzysztofik@linux.intel.com
|
|
There was an attempt to fix an issue of illegal attempts to free a still
active i915 VMA object when parking a GT believed to be idle, reported by
CI on 2-GT Meteor Lake. As a solution, an extra wakeref for a Primary GT
was acquired from i915_gem_do_execbuffer() -- see commit f56fe3e91787
("drm/i915: Fix a VMA UAF for multi-gt platform").
However, that fix occurred insufficient -- the issue was still reported by
CI. That wakeref was released on exit from i915_gem_do_execbuffer(), then
potentially before completion of the request and deactivation of its
associated VMAs. Moreover, CI reports indicated that single-GT platforms
also suffered sporadically from the same race.
Since the issue has now been fixed by a preceding patch "drm/i915/vma: Fix
UAF on destroy against retire race", drop the no longer useful changes
introduced by that insufficient fix.
v3: Also drop the no longer used .wakeref_gt0 field from struct
i915_execbuffer.
v2: Avoid the word "revert" in commit message (Rodrigo),
- update commit description reusing relevant chunks dropped from the
description of the proper fix (Rodrigo).
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240305143747.335367-7-janusz.krzysztofik@linux.intel.com
|
|
Object debugging tools were sporadically reporting illegal attempts to
free a still active i915 VMA object when parking a GT believed to be idle.
[161.359441] ODEBUG: free active (active state 0) object: ffff88811643b958 object type: i915_active hint: __i915_vma_active+0x0/0x50 [i915]
[161.360082] WARNING: CPU: 5 PID: 276 at lib/debugobjects.c:514 debug_print_object+0x80/0xb0
...
[161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 Not tainted 6.5.0-rc1-CI_DRM_13375-g003f860e5577+ #1
[161.360314] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022
[161.360322] Workqueue: i915-unordered __intel_wakeref_put_work [i915]
[161.360592] RIP: 0010:debug_print_object+0x80/0xb0
...
[161.361347] debug_object_free+0xeb/0x110
[161.361362] i915_active_fini+0x14/0x130 [i915]
[161.361866] release_references+0xfe/0x1f0 [i915]
[161.362543] i915_vma_parked+0x1db/0x380 [i915]
[161.363129] __gt_park+0x121/0x230 [i915]
[161.363515] ____intel_wakeref_put_last+0x1f/0x70 [i915]
That has been tracked down to be happening when another thread is
deactivating the VMA inside __active_retire() helper, after the VMA's
active counter has been already decremented to 0, but before deactivation
of the VMA's object is reported to the object debugging tool.
We could prevent from that race by serializing i915_active_fini() with
__active_retire() via ref->tree_lock, but that wouldn't stop the VMA from
being used, e.g. from __i915_vma_retire() called at the end of
__active_retire(), after that VMA has been already freed by a concurrent
i915_vma_destroy() on return from the i915_active_fini(). Then, we should
rather fix the issue at the VMA level, not in i915_active.
Since __i915_vma_parked() is called from __gt_park() on last put of the
GT's wakeref, the issue could be addressed by holding the GT wakeref long
enough for __active_retire() to complete before that wakeref is released
and the GT parked.
I believe the issue was introduced by commit d93939730347 ("drm/i915:
Remove the vma refcount") which moved a call to i915_active_fini() from
a dropped i915_vma_release(), called on last put of the removed VMA kref,
to i915_vma_parked() processing path called on last put of a GT wakeref.
However, its visibility to the object debugging tool was suppressed by a
bug in i915_active that was fixed two weeks later with commit e92eb246feb9
("drm/i915/active: Fix missing debug object activation").
A VMA associated with a request doesn't acquire a GT wakeref by itself.
Instead, it depends on a wakeref held directly by the request's active
intel_context for a GT associated with its VM, and indirectly on that
intel_context's engine wakeref if the engine belongs to the same GT as the
VMA's VM. Those wakerefs are released asynchronously to VMA deactivation.
Fix the issue by getting a wakeref for the VMA's GT when activating it,
and putting that wakeref only after the VMA is deactivated. However,
exclude global GTT from that processing path, otherwise the GPU never goes
idle. Since __i915_vma_retire() may be called from atomic contexts, use
async variant of wakeref put. Also, to avoid circular locking dependency,
take care of acquiring the wakeref before VM mutex when both are needed.
v7: Add inline comments with justifications for:
- using untracked variants of intel_gt_pm_get/put() (Nirmoy),
- using async variant of _put(),
- not getting the wakeref in case of a global GTT,
- always getting the first wakeref outside vm->mutex.
v6: Since __i915_vma_active/retire() callbacks are not serialized, storing
a wakeref tracking handle inside struct i915_vma is not safe, and
there is no other good place for that. Use untracked variants of
intel_gt_pm_get/put_async().
v5: Replace "tile" with "GT" across commit description (Rodrigo),
- avoid mentioning multi-GT case in commit description (Rodrigo),
- explain why we need to take a temporary wakeref unconditionally inside
i915_vma_pin_ww() (Rodrigo).
v4: Refresh on top of commit 5e4e06e4087e ("drm/i915: Track gt pm
wakerefs") (Andi),
- for more easy backporting, split out removal of former insufficient
workarounds and move them to separate patches (Nirmoy).
- clean up commit message and description a bit.
v3: Identify root cause more precisely, and a commit to blame,
- identify and drop former workarounds,
- update commit message and description.
v2: Get the wakeref before VM mutex to avoid circular locking dependency,
- drop questionable Fixes: tag.
Fixes: d93939730347 ("drm/i915: Remove the vma refcount")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/8875
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: stable@vger.kernel.org # v5.19+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240305143747.335367-6-janusz.krzysztofik@linux.intel.com
|