summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)Author
2024-03-28drm/xe/vf: Add proper detection of the SR-IOV VF modeMichal Wajdeczko
SR-IOV VF mode detection is based on testing VF capability bit on the register that is accessible from both the PF and enabled VFs. Bspec: 49904, 53227 Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240327182740.407-4-michal.wajdeczko@intel.com
2024-03-28drm/xe: Move SR-IOV probe to xe_device_probe_early()Michal Wajdeczko
SR-IOV mode detection requires access to the MMIO register and this can be done now in xe_device_probe_early(). We can also drop explicit has_sriov parameter as this flag is now already available from xe->info. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240327182740.407-3-michal.wajdeczko@intel.com
2024-03-28drm/xe: Separate pure MMIO init from VRAM checkoutMichal Wajdeczko
We can setup root tile registers mapping at the same time as we do early mapping of the entire MMIO BAR and keep mandatory VRAM checkout as a separate step. This will allow us to perform SR-IOV VF mode detection between those two steps using regular MMIO regs access functions. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240327182740.407-2-michal.wajdeczko@intel.com
2024-03-28drm/i915/dp: Remove support for UHBR13.5Arun R Murthy
UHBR13.5 is not supported in MTL and also the DP2.1 spec says UHBR13.5 is optional. Hence removing UHBR135 from the supported link rates. v2: Reframed the commit message and added link to the issue. Signed-off-by: Arun R Murthy <arun.r.murthy@intel.com> Fixes: 62618c7f117e ("drm/i915/mtl: C20 PLL programming") Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Animesh Manna <animesh.manna@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240228144350.3184930-1-arun.r.murthy@intel.com
2024-03-28drm: DRM_WERROR should depend on DRMGeert Uytterhoeven
There is no point in asking the user about enforcing the DRM compiler warning policy when configuring a kernel without DRM support. Fixes: f89632a9e5fa6c47 ("drm: Add CONFIG_DRM_WERROR") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/631a1f4c066181b54617bfe2f38b0bd0ac865b68.1711474200.git.geert+renesas@glider.be Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-03-28drm/bridge: it6505: Remove useless selectMaxime Ripard
The IT6505 bridge Kconfig symbol selects a Kconfig symbol that doesn't exist. Remove it. Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-13-eafee11b84b3@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm: Switch DRM_DISPLAY_HDMI_HELPER to depends onMaxime Ripard
Most of our helpers have relied on being selected so far through Kconfig, but that creates issues when we have multiple layers of helpers with some depending on others. Indeed, select doesn't select a dependency's dependencies, and thus isn't super intuitive. Depends on however doesn't have that limitation, so we can just switch all the drivers that were selecting DRM_DISPLAY_HDMI_HELPER to depend on it. Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-12-eafee11b84b3@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm: Switch DRM_DISPLAY_HDCP_HELPER to depends onMaxime Ripard
Most of our helpers have relied on being selected so far through Kconfig, but that creates issues when we have multiple layers of helpers with some depending on others. Indeed, select doesn't select a dependency's dependencies, and thus isn't super intuitive. Depends on however doesn't have that limitation, so we can just switch all the drivers that were selecting DRM_DISPLAY_HDCP_HELPER to depend on it. Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-11-eafee11b84b3@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm: Switch DRM_DISPLAY_DP_HELPER to depends onMaxime Ripard
Most of our helpers have relied on being selected so far through Kconfig, but that creates issues when we have multiple layers of helpers with some depending on others. Indeed, select doesn't select a dependency's dependencies, and thus isn't super intuitive. Depends on however doesn't have that limitation, so we can just switch all the drivers that were selecting DRM_DISPLAY_DP_HELPER to depend on it. Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-10-eafee11b84b3@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm: Switch DRM_DISPLAY_DP_AUX_BUS to depends onMaxime Ripard
Most of our helpers have relied on being selected so far through Kconfig, but that creates issues when we have multiple layers of helpers with some depending on others. Indeed, select doesn't select a dependency's dependencies, and thus isn't super intuitive. Depends on however doesn't have that limitation, so we can just switch all the drivers that were selecting DRM_DISPLAY_DP_AUX_BUS to depend on it. Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-9-eafee11b84b3@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm: Switch DRM_DISPLAY_HELPER to depends onMaxime Ripard
Most of our helpers have relied on being selected so far through Kconfig, but that creates issues when we have multiple layers of helpers with some depending on others. Indeed, select doesn't select a dependency's dependencies, and thus isn't super intuitive. Depends on however doesn't have that limitation, so we can just switch all the drivers that were selecting DRM_DISPLAY_HELPER to depend on it. Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-8-eafee11b84b3@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm: Make drivers depends on DRM_DW_HDMIMaxime Ripard
DRM_DW_HDMI has a number of dependencies that might not be enabled. However, drivers were used to selecting it while not enforcing the DRM_DW_HDMI dependencies. This could result in Kconfig warnings (and further build breakages) such as: Kconfig warnings: (for reference only) WARNING: unmet direct dependencies detected for DRM_DW_HDMI Depends on [n]: HAS_IOMEM [=y] && DRM [=m] && DRM_BRIDGE [=y] && DRM_DISPLAY_HELPER [=n] Selected by [m]: - DRM_SUN8I_DW_HDMI [=m] && HAS_IOMEM [=y] && DRM_SUN4I [=m] Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202403262127.kZkttfNz-lkp@intel.com/ Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-7-eafee11b84b3@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm/display: Make all helpers visible and switch to depends onMaxime Ripard
All the helpers Kconfig symbols so far have relied on drivers selecting them, and that's what most drivers did. However, this creates an issue nowadays when helpers depend on each other, and select doesn't transitively select a dependency dependencies. Depends on doesn't have that limitation though, so let's convert those symbols to be dependable and use depends on between them too. Suggested-by: Jani Nikula <jani.nikula@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-6-eafee11b84b3@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm/display: Reorder Kconfig symbolsMaxime Ripard
The display kconfig helpers are not organized in any particular order, so let's move them around to create an alphabetical order. Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-5-eafee11b84b3@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm/display: Make DisplayPort CEC-over-AUX Kconfig name consistentMaxime Ripard
While most display helpers Kconfig symbols have the DRM_DISPLAY prefix, the DisplayPort CEC tunnelling implementation uses CONFIG_DRM_DISPLAY_DP_AUX_CEC. Since the number of users is limited, we can easily rename it to make it consistent. Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-4-eafee11b84b3@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm/display: Make DisplayPort AUX Chardev Kconfig name consistentMaxime Ripard
While most display helpers Kconfig symbols have the DRM_DISPLAY prefix, the DisplayPort-AUX chardev interface uses DRM_DP_AUX_CHARDEV. Since the number of users is limited and it's a selected symbol, we can easily rename it to make it consistent. Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-3-eafee11b84b3@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm/display: Make DisplayPort tunnel debug Kconfig name consistentMaxime Ripard
While most display helpers Kconfig symbols have the DRM_DISPLAY prefix, the DisplayPort Tunnel debugging uses DRM_DISPLAY_DEBUG_DP_TUNNEL_STATE. Since the number of users is limited and it's a selected symbol, we can easily rename it to make it consistent. Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-2-eafee11b84b3@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm/display: Make DisplayPort AUX bus Kconfig name consistentMaxime Ripard
While most display helpers Kconfig symbols have the DRM_DISPLAY prefix, the DisplayPort AUX bus implementation uses DRM_DP_AUX_BUS. Since the number of users is limited and it's a selected symbol, we can easily rename it to make it consistent. Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20240327-kms-kconfig-helpers-v3-1-eafee11b84b3@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm/qxl: remove unused variable from `qxl_process_single_command()`Miguel Ojeda
Clang 14 in an (essentially) defconfig loongarch64 build for next-20240327 reports [1]: drivers/gpu/drm/qxl/qxl_ioctl.c:148:14: error: variable 'num_relocs' set but not used [-Werror,-Wunused-but-set-variable] The variable was originally used in the `out_free_bos` label, but commit 74d9a6335dce ("drm/qxl: Simplify cleaning qxl processing command") removed the use that happened in that label. Thus remove the unused variable. Fixes: 74d9a6335dce ("drm/qxl: Simplify cleaning qxl processing command") Closes: https://lore.kernel.org/lkml/CANiq72kqqQfUxLkHJYqeBAhpc6YcX7bfR96gmmbF=j8hEOykqw@mail.gmail.com/ [1] Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Link: https://lore.kernel.org/r/20240327175556.233126-2-ojeda@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm/qxl: remove unused `count` variable from `qxl_surface_id_alloc()`Miguel Ojeda
Clang 14 in an (essentially) defconfig loongarch64 build for next-20240326 reports [1]: drivers/gpu/drm/qxl/qxl_cmd.c:424:6: error: variable 'count' set but not used [-Werror,-Wunused-but-set-variable] The variable is already unused in the version that got into the tree. Thus remove the unused variable. Fixes: f64122c1f6ad ("drm: add new QXL driver. (v1.4)") Closes: https://lore.kernel.org/lkml/CANiq72mjc5t4n25SQvYSrOEhxxpXYPZ4pPzneSJHEnc3qApu2Q@mail.gmail.com/ [1] Closes: https://lore.kernel.org/all/20240327163331.GB1153323@dev-arch.thelio-3990X/ Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Link: https://lore.kernel.org/r/20240327175556.233126-1-ojeda@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm/xe: Move vma rebinding to the drm_exec locking loopThomas Hellström
Rebinding might allocate page-table bos, causing evictions. To support blocking locking during these evictions, perform the rebinding in the drm_exec locking loop. Also Reserve fence slots where actually needed rather than trying to predict how many fence slots will be needed over a complete wound-wait transaction. v2: - Remove a leftover call to xe_vm_rebind() (Matt Brost) - Add a helper function xe_vm_validate_rebind() (Matt Brost) v3: - Add comments and squash with previous patch (Matt Brost) Fixes: 24f947d58fe5 ("drm/xe: Use DRM GPUVM helpers for external- and evicted objects") Fixes: 29f424eb8702 ("drm/xe/exec: move fence reservation") Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-5-thomas.hellstrom@linux.intel.com
2024-03-28drm/xe: Make TLB invalidation fences unorderedThomas Hellström
They can actually complete out-of-order, so allocate a unique fence context for each fence. Fixes: 5387e865d90e ("drm/xe: Add TLB invalidation fence after rebinds issued from execs") Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-4-thomas.hellstrom@linux.intel.com
2024-03-28drm/xe: Rework rebindingThomas Hellström
Instead of handling the vm's rebind fence separately, which is error prone if they are not strictly ordered, attach rebind fences as kernel fences to the vm's resv. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-3-thomas.hellstrom@linux.intel.com
2024-03-28drm/xe: Use ring ops TLB invalidation for rebindsThomas Hellström
For each rebind we insert a GuC TLB invalidation and add a corresponding unordered TLB invalidation fence. This might add a huge number of TLB invalidation fences to wait for so rather than doing that, defer the TLB invalidation to the next ring ops for each affected exec queue. Since the TLB is invalidated on exec_queue switch, we need to invalidate once for each affected exec_queue. v2: - Simplify if-statements around the tlb_flush_seqno. (Matthew Brost) - Add some comments and asserts. Fixes: 5387e865d90e ("drm/xe: Add TLB invalidation fence after rebinds issued from execs") Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-2-thomas.hellstrom@linux.intel.com
2024-03-28drm/i915: add bug.h include to i915_memcpy.cDave Airlie
This is stopping me building here for some reason, /home/airlied/devel/kernel/dim/drm-fixes/drivers/gpu/drm/i915/i915_memcpy.c: In function ‘i915_unaligned_memcpy_from_wc’: /home/airlied/devel/kernel/dim/drm-fixes/drivers/gpu/drm/i915/i915_memcpy.c:33:25: error: implicit declaration of function ‘BUG_ON’; did you mean ‘CI_BUG_ON’? [-Werror=implicit-function-declaration] 33 | #define CI_BUG_ON(expr) BUG_ON(expr) | ^~~~~~ /home/airlied/devel/kernel/dim/drm-fixes/drivers/gpu/drm/i915/i915_memcpy.c:144:9: note: in expansion of macro ‘CI_BUG_ON’ 144 | CI_BUG_ON(!i915_has_memcpy_from_wc()); | ^~~~~~~~~ engage maintainer overrides :-) Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-03-28Merge tag 'amd-drm-fixes-6.9-2024-03-27' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.9-2024-03-27: amdgpu: - SMU 14.0.1 updates - DCN 3.5.x updates - VPE fix - eDP panel flickering fix - Suspend fix - PSR fix - DCN 3.0+ fix - VCN 4.0.6 updates - debugfs fix amdkfd: - DMA-Buf fix - GFX 9.4.2 TLB flush fix - CP interrupt fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240328025342.8700-1-alexander.deucher@amd.com
2024-03-27drm/xe/guc: Use GuC ID Manager in submission codeMichal Wajdeczko
We are ready to replace private guc_ids management code with separate GuC ID Manager that can be shared with upcoming SR-IOV PF provisioning code. Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240313221112.1089-5-michal.wajdeczko@intel.com
2024-03-27drm/xe/kunit: Add basic tests for GuC context ID ManagerMichal Wajdeczko
Before we switch-over submission code to use new GuC context ID Manager, lets add some kunit tests to make sure that ID manager works as expected. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240313221112.1089-4-michal.wajdeczko@intel.com
2024-03-27drm/xe/guc: Introduce GuC context ID ManagerMichal Wajdeczko
While we are already managing GuC IDs directly in GuC submission code, using bitmap() for MLRC and ida() for SLRC, this code can't be easily extended to meet additional requirements for SR-IOV use cases, like limited number of IDs available on VFs, or ID range reservation for provisioning VFs by the PF. Add a separate component for managing GuC IDs, that will replace existing ID management. Start with bitmap() based implementation that could be optimized later based on perf data. Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240313221112.1089-3-michal.wajdeczko@intel.com
2024-03-27drm/xe/guc: Move GUC_ID_MAX definition to GuC ABI headerMichal Wajdeczko
This macro represents GuC firmware capability and shall be defined in the firmware ABI header. Move it to xe_guc_fwif.h file. Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240313221112.1089-2-michal.wajdeczko@intel.com
2024-03-27drm/vmwgfx: Create debugfs ttm_resource_manager entry only if neededJocelyn Falempe
The driver creates /sys/kernel/debug/dri/0/mob_ttm even when the corresponding ttm_resource_manager is not allocated. This leads to a crash when trying to read from this file. Add a check to create mob_ttm, system_mob_ttm, and gmr_ttm debug file only when the corresponding ttm_resource_manager is allocated. crash> bt PID: 3133409 TASK: ffff8fe4834a5000 CPU: 3 COMMAND: "grep" #0 [ffffb954506b3b20] machine_kexec at ffffffffb2a6bec3 #1 [ffffb954506b3b78] __crash_kexec at ffffffffb2bb598a #2 [ffffb954506b3c38] crash_kexec at ffffffffb2bb68c1 #3 [ffffb954506b3c50] oops_end at ffffffffb2a2a9b1 #4 [ffffb954506b3c70] no_context at ffffffffb2a7e913 #5 [ffffb954506b3cc8] __bad_area_nosemaphore at ffffffffb2a7ec8c #6 [ffffb954506b3d10] do_page_fault at ffffffffb2a7f887 #7 [ffffb954506b3d40] page_fault at ffffffffb360116e [exception RIP: ttm_resource_manager_debug+0x11] RIP: ffffffffc04afd11 RSP: ffffb954506b3df0 RFLAGS: 00010246 RAX: ffff8fe41a6d1200 RBX: 0000000000000000 RCX: 0000000000000940 RDX: 0000000000000000 RSI: ffffffffc04b4338 RDI: 0000000000000000 RBP: ffffb954506b3e08 R8: ffff8fee3ffad000 R9: 0000000000000000 R10: ffff8fe41a76a000 R11: 0000000000000001 R12: 00000000ffffffff R13: 0000000000000001 R14: ffff8fe5bb6f3900 R15: ffff8fe41a6d1200 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffffb954506b3e00] ttm_resource_manager_show at ffffffffc04afde7 [ttm] #9 [ffffb954506b3e30] seq_read at ffffffffb2d8f9f3 RIP: 00007f4c4eda8985 RSP: 00007ffdbba9e9f8 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 000000000037e000 RCX: 00007f4c4eda8985 RDX: 000000000037e000 RSI: 00007f4c41573000 RDI: 0000000000000003 RBP: 000000000037e000 R8: 0000000000000000 R9: 000000000037fe30 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f4c41573000 R13: 0000000000000003 R14: 00007f4c41572010 R15: 0000000000000003 ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Fixes: af4a25bbe5e7 ("drm/vmwgfx: Add debugfs entries for various ttm resource managers") Cc: <stable@vger.kernel.org> Reviewed-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240312093551.196609-1-jfalempe@redhat.com
2024-03-27drm/amdgpu: fix deadlock while reading mqd from debugfsJohannes Weiner
An errant disk backup on my desktop got into debugfs and triggered the following deadlock scenario in the amdgpu debugfs files. The machine also hard-resets immediately after those lines are printed (although I wasn't able to reproduce that part when reading by hand): [ 1318.016074][ T1082] ====================================================== [ 1318.016607][ T1082] WARNING: possible circular locking dependency detected [ 1318.017107][ T1082] 6.8.0-rc7-00015-ge0c8221b72c0 #17 Not tainted [ 1318.017598][ T1082] ------------------------------------------------------ [ 1318.018096][ T1082] tar/1082 is trying to acquire lock: [ 1318.018585][ T1082] ffff98c44175d6a0 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0x40/0x80 [ 1318.019084][ T1082] [ 1318.019084][ T1082] but task is already holding lock: [ 1318.020052][ T1082] ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu] [ 1318.020607][ T1082] [ 1318.020607][ T1082] which lock already depends on the new lock. [ 1318.020607][ T1082] [ 1318.022081][ T1082] [ 1318.022081][ T1082] the existing dependency chain (in reverse order) is: [ 1318.023083][ T1082] [ 1318.023083][ T1082] -> #2 (reservation_ww_class_mutex){+.+.}-{3:3}: [ 1318.024114][ T1082] __ww_mutex_lock.constprop.0+0xe0/0x12f0 [ 1318.024639][ T1082] ww_mutex_lock+0x32/0x90 [ 1318.025161][ T1082] dma_resv_lockdep+0x18a/0x330 [ 1318.025683][ T1082] do_one_initcall+0x6a/0x350 [ 1318.026210][ T1082] kernel_init_freeable+0x1a3/0x310 [ 1318.026728][ T1082] kernel_init+0x15/0x1a0 [ 1318.027242][ T1082] ret_from_fork+0x2c/0x40 [ 1318.027759][ T1082] ret_from_fork_asm+0x11/0x20 [ 1318.028281][ T1082] [ 1318.028281][ T1082] -> #1 (reservation_ww_class_acquire){+.+.}-{0:0}: [ 1318.029297][ T1082] dma_resv_lockdep+0x16c/0x330 [ 1318.029790][ T1082] do_one_initcall+0x6a/0x350 [ 1318.030263][ T1082] kernel_init_freeable+0x1a3/0x310 [ 1318.030722][ T1082] kernel_init+0x15/0x1a0 [ 1318.031168][ T1082] ret_from_fork+0x2c/0x40 [ 1318.031598][ T1082] ret_from_fork_asm+0x11/0x20 [ 1318.032011][ T1082] [ 1318.032011][ T1082] -> #0 (&mm->mmap_lock){++++}-{3:3}: [ 1318.032778][ T1082] __lock_acquire+0x14bf/0x2680 [ 1318.033141][ T1082] lock_acquire+0xcd/0x2c0 [ 1318.033487][ T1082] __might_fault+0x58/0x80 [ 1318.033814][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu] [ 1318.034181][ T1082] full_proxy_read+0x55/0x80 [ 1318.034487][ T1082] vfs_read+0xa7/0x360 [ 1318.034788][ T1082] ksys_read+0x70/0xf0 [ 1318.035085][ T1082] do_syscall_64+0x94/0x180 [ 1318.035375][ T1082] entry_SYSCALL_64_after_hwframe+0x46/0x4e [ 1318.035664][ T1082] [ 1318.035664][ T1082] other info that might help us debug this: [ 1318.035664][ T1082] [ 1318.036487][ T1082] Chain exists of: [ 1318.036487][ T1082] &mm->mmap_lock --> reservation_ww_class_acquire --> reservation_ww_class_mutex [ 1318.036487][ T1082] [ 1318.037310][ T1082] Possible unsafe locking scenario: [ 1318.037310][ T1082] [ 1318.037838][ T1082] CPU0 CPU1 [ 1318.038101][ T1082] ---- ---- [ 1318.038350][ T1082] lock(reservation_ww_class_mutex); [ 1318.038590][ T1082] lock(reservation_ww_class_acquire); [ 1318.038839][ T1082] lock(reservation_ww_class_mutex); [ 1318.039083][ T1082] rlock(&mm->mmap_lock); [ 1318.039328][ T1082] [ 1318.039328][ T1082] *** DEADLOCK *** [ 1318.039328][ T1082] [ 1318.040029][ T1082] 1 lock held by tar/1082: [ 1318.040259][ T1082] #0: ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu] [ 1318.040560][ T1082] [ 1318.040560][ T1082] stack backtrace: [ 1318.041053][ T1082] CPU: 22 PID: 1082 Comm: tar Not tainted 6.8.0-rc7-00015-ge0c8221b72c0 #17 3316c85d50e282c5643b075d1f01a4f6365e39c2 [ 1318.041329][ T1082] Hardware name: Gigabyte Technology Co., Ltd. B650 AORUS PRO AX/B650 AORUS PRO AX, BIOS F20 12/14/2023 [ 1318.041614][ T1082] Call Trace: [ 1318.041895][ T1082] <TASK> [ 1318.042175][ T1082] dump_stack_lvl+0x4a/0x80 [ 1318.042460][ T1082] check_noncircular+0x145/0x160 [ 1318.042743][ T1082] __lock_acquire+0x14bf/0x2680 [ 1318.043022][ T1082] lock_acquire+0xcd/0x2c0 [ 1318.043301][ T1082] ? __might_fault+0x40/0x80 [ 1318.043580][ T1082] ? __might_fault+0x40/0x80 [ 1318.043856][ T1082] __might_fault+0x58/0x80 [ 1318.044131][ T1082] ? __might_fault+0x40/0x80 [ 1318.044408][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu 8fe2afaa910cbd7654c8cab23563a94d6caebaab] [ 1318.044749][ T1082] full_proxy_read+0x55/0x80 [ 1318.045042][ T1082] vfs_read+0xa7/0x360 [ 1318.045333][ T1082] ksys_read+0x70/0xf0 [ 1318.045623][ T1082] do_syscall_64+0x94/0x180 [ 1318.045913][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.046201][ T1082] ? lockdep_hardirqs_on+0x7d/0x100 [ 1318.046487][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.046773][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.047057][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.047337][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.047611][ T1082] entry_SYSCALL_64_after_hwframe+0x46/0x4e [ 1318.047887][ T1082] RIP: 0033:0x7f480b70a39d [ 1318.048162][ T1082] Code: 91 ba 0d 00 f7 d8 64 89 02 b8 ff ff ff ff eb b2 e8 18 a3 01 00 0f 1f 84 00 00 00 00 00 80 3d a9 3c 0e 00 00 74 17 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 5b c3 66 2e 0f 1f 84 00 00 00 00 00 53 48 83 [ 1318.048769][ T1082] RSP: 002b:00007ffde77f5c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 1318.049083][ T1082] RAX: ffffffffffffffda RBX: 0000000000000800 RCX: 00007f480b70a39d [ 1318.049392][ T1082] RDX: 0000000000000800 RSI: 000055c9f2120c00 RDI: 0000000000000008 [ 1318.049703][ T1082] RBP: 0000000000000800 R08: 000055c9f2120a94 R09: 0000000000000007 [ 1318.050011][ T1082] R10: 0000000000000000 R11: 0000000000000246 R12: 000055c9f2120c00 [ 1318.050324][ T1082] R13: 0000000000000008 R14: 0000000000000008 R15: 0000000000000800 [ 1318.050638][ T1082] </TASK> amdgpu_debugfs_mqd_read() holds a reservation when it calls put_user(), which may fault and acquire the mmap_sem. This violates the established locking order. Bounce the mqd data through a kernel buffer to get put_user() out of the illegal section. Fixes: 445d85e3c1df ("drm/amdgpu: add debugfs interface for reading MQDs") Cc: stable@vger.kernel.org # v6.5+ Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amdgpu: enable UMSCH 4.0.6Lang Yu
Share same codes with 4.0.5 and enable collaborate mode for VPE. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amdgpu/umsch: update UMSCH 4.0 FW interfaceLang Yu
Align with FW changes. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: Set DCN351 BB and IP the same as DCN35Xi Liu
[WHY & HOW] DCN351 and DCN35 should use the same bounding box and IP settings. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Jun Lei <jun.lei@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Xi Liu <xi.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: Fix bounds check for dcn35 DcfClocksRoman Li
[Why] NumFclkLevelsEnabled is used for DcfClocks bounds check instead of designated NumDcfClkLevelsEnabled. That can cause array index out-of-bounds access. [How] Use designated variable for dcn35 DcfClocks bounds check. Fixes: a8edc9cc0b14 ("drm/amd/display: Fix array-index-out-of-bounds in dcn35_clkmgr") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: Remove MPC rate control logic from DCN30 and aboveGeorge Shen
[Why] MPC flow rate control is not needed for DCN30 and above. Current logic that uses it can result in underflow for certain edge cases (such as DSC N422 + ODM combine + 422 left edge pixel). [How] Remove MPC flow rate control logic and programming for DCN30 and above. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: fix a dereference of a NULL pointerWenjing Liu
[why&how] In some platform out_transfer_func may not be popualted. We need to check for null before dereferencing it. Fixes: d2dea1f14038 ("drm/amd/display: Generalize new minimal transition path") Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: Send DTBCLK disable message on first commitTaimur Hassan
[Why] Previous patch to allow DTBCLK disable didn't address boot case. Driver thinks DTBCLK is disabled by default, so we don't send disable message to PMFW. DTBCLK is then enabled at idle desktop on boot, burning power. [How] Set dtbclk_en to true on boot so that disable message is sent during first commit. Fixes: 27750e176a4f ("drm/amd/display: Allow DTBCLK disable for DCN35") Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Taimur Hassan <syed.hassan@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: Update dcn351 to latest dcn35 configSung Joon Kim
[why & how] There were some fixes in dcn35 that need to be ported over to dcn351 to prevent any regression. Signed-off-by: Sung Joon Kim <sungkim@amd.com> Reviewed-by: Liu, Xi (Alex) <xiliu102@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: fix IPX enablementHamza Mahfooz
We need to re-enable idle power optimizations after entering PSR. Since, we get kicked out of idle power optimizations before entering PSR (entering PSR requires us to write to DCN registers, which isn't allowed while we are in IPS). Fixes: a9b1a4f684b3 ("drm/amd/display: Add more checks for exiting idle in DC") Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd: Flush GFXOFF requests in prepare stageMario Limonciello
If the system hasn't entered GFXOFF when suspend starts it can cause hangs accessing GC and RLC during the suspend stage. Cc: <stable@vger.kernel.org> # 6.1.y: 5095d5418193 ("drm/amd: Evict resources during PM ops prepare() callback") Cc: <stable@vger.kernel.org> # 6.1.y: cb11ca3233aa ("drm/amd: Add concept of running prepare_suspend() sequence for IP blocks") Cc: <stable@vger.kernel.org> # 6.1.y: 2ceec37b0e3d ("drm/amd: Add missing kernel doc for prepare_suspend()") Cc: <stable@vger.kernel.org> # 6.1.y: 3a9626c816db ("drm/amd: Stop evicting resources on APUs in suspend") Cc: <stable@vger.kernel.org> # 6.6.y: 5095d5418193 ("drm/amd: Evict resources during PM ops prepare() callback") Cc: <stable@vger.kernel.org> # 6.6.y: cb11ca3233aa ("drm/amd: Add concept of running prepare_suspend() sequence for IP blocks") Cc: <stable@vger.kernel.org> # 6.6.y: 2ceec37b0e3d ("drm/amd: Add missing kernel doc for prepare_suspend()") Cc: <stable@vger.kernel.org> # 6.6.y: 3a9626c816db ("drm/amd: Stop evicting resources on APUs in suspend") Cc: <stable@vger.kernel.org> # 6.1+ Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3132 Fixes: ab4750332dbe ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amdkfd: range check cp bad op exception interruptsJonathan Kim
Due to a CP interrupt bug, bad packet garbage exception codes are raised. Do a range check so that the debugger and runtime do not receive garbage codes. Update the user api to guard exception code type checking as well. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Tested-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27Revert "drm/amd/display: Fix sending VSC (+ colorimetry) packets for DP/eDP ↵Harry Wentland
displays without PSR" This causes flicker on a bunch of eDP panels. The info_packet code also caused regressions on other OSes that we haven't' seen on Linux yet, but that is likely due to the fact that we haven't had a chance to test those environments on Linux. We'll need to revisit this. This reverts commit 202260f64519e591b5cd99626e441b6559f571a3. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3207 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3151 Signed-off-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-03-27drm/amdkfd: fix TLB flush after unmap for GFX9.4.2Eric Huang
TLB flush after unmap accidentially was removed on gfx9.4.2. It is to add it back. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-03-27drm/amdgpu/vpe: power on vpe when hw_initPeyton Lee
To fix mode2 reset failure. Should power on VPE when hw_init. Signed-off-by: Peyton Lee <peytolee@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: increase bb clock for DCN351Xi Liu
[Why and how] Bounding box clocks for DCN351 should be increased as per request Reviewed-by: Swapnil Patel <swapnil.patel@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Xi Liu <xi.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: Prevent crash when disable streamChris Park
[Why] Disabling stream encoder invokes a function that no longer exists. [How] Check if the function declaration is NULL in disable stream encoder. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Chris Park <chris.park@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: Increase Z8 watermark times.Natanel Roizenman
Increase Z8 watermark times from 210->250us and 320->350us. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Natanel Roizenman <natanel.roizenman@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amdkfd: Check cgroup when returning DMABuf infoMukul Joshi
Check cgroup permissions when returning DMA-buf info and based on cgroup info return the GPU id of the GPU that have access to the BO. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>