linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-03-20	drm/amdgpu: correct the KGQ fallback message	Prike Liang
	Fix the KGQ fallback function name, as this will help differentiate the failure in the KCQ enablement. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20	drm/amdgpu/pm: Check the validity of overdiver power limit	Ma Jun
	Check the validity of overdriver power limit before using it. Fixes: 7968e9748fbb ("drm/amdgpu/pm: Fix the power1_min_cap value") Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Suggested-by: Lazar Lijo <lijo.lazar@amd.com> Suggested-by: Alex Deucher <Alexander.Deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-03-20	drm/amdgpu/pm: Fix NULL pointer dereference when get power limit	Ma Jun
	Because powerplay_table initialization is skipped under sriov case, We check and set default lower and upper OD value if powerplay_table is NULL. Fixes: 7968e9748fbb ("drm/amdgpu/pm: Fix the power1_min_cap value") Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reported-by: Yin Zhenguo <zhenguo.yin@amd.com> Suggested-by: Lazar Lijo <lijo.lazar@amd.com> Suggested-by: Alex Deucher <Alexander.Deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-03-20	drm/amdgpu: Skip access PF-only registers on gfx10/gfxhub2_1 under SRIOV	ZhenGuo Yin
	[Why] RLCG interface returns "out-of-range" error under SRIOV VF when accessing PF-only registers. [How] Skip access PF-only registers on gfx10/gfxhub2_1 under SRIOV. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20	drm/amdgpu: Init zone device and drm client after mode-1 reset on reload	Ahmad Rehman
	In passthrough environment, when amdgpu is reloaded after unload, mode-1 is triggered after initializing the necessary IPs, That init does not include KFD, and KFD init waits until the reset is completed. KFD init is called in the reset handler, but in this case, the zone device and drm client is not initialized, causing app to create kernel panic. v2: Removing the init KFD condition from amdgpu_amdkfd_drm_client_create. As the previous version has the potential of creating DRM client twice. v3: v2 patch results in SDMA engine hung as DRM open causes VM clear to SDMA before SDMA init. Adding the condition to in drm client creation, on top of v1, to guard against drm client creation call multiple times. Signed-off-by: Ahmad Rehman <Ahmad.Rehman@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20	drm/amdgpu: amdgpu_ttm_gart_bind set gtt bound flag	Philip Yang
	Otherwise after the GTT bo is released, the GTT and gart space is freed but amdgpu_ttm_backend_unbind will not clear the gart page table entry and leave valid mapping entry pointing to the stale system page. Then if GPU access the gart address mistakely, it will read undefined value instead page fault, harder to debug and reproduce the real issue. Cc: stable@vger.kernel.org Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20	drm/amdgpu/vcn: enable vcn1 fw load for VCN 4_0_6	Saleemkhan Jamadar
	v1 - update the fw header for each vcn instance (Veera) VCN1 has different FW binary in VCN v4_0_6. Add changes to load the VCN1 fw binary Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20	drm/amd/display: Enable DML2 debug flags	Aurabindo Pillai
	[WHY & HOW] Enable DML2 related debug config options in DM for testing purposes. Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20	drm/amd/display: Change default size for dummy plane in DML2	Swapnil Patel
	[WHY & HOW] Currently, to map dc states into dml_display_cfg, We create a dummy plane if the stream doesn't have any planes attached to it. This dummy plane uses max addersable width height. This results in certain mode validations failing when they shouldn't. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Swapnil Patel <swapnil.patel@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20	drm/amdgpu: Reset IH OVERFLOW_EN bit for IH 7.0	Friedrich Vock
	IH 7.0 support landed shortly after the original patch for resetting the bit on all other generations, but without that patch applied. Fixes: 12443fc53e7d ("drm/amdgpu: Add ih v7_0 ip block support") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20	drm/amdgpu: fix mmhub client id out-of-bounds access	Lang Yu
	Properly handle cid 0x140. Fixes: aba2be41470a ("drm/amdgpu: add mmhub 3.3.0 support") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20	drm/amdgpu: fix use-after-free bug	Vitaly Prosyak
	The bug can be triggered by sending a single amdgpu_gem_userptr_ioctl to the AMDGPU DRM driver on any ASICs with an invalid address and size. The bug was reported by Joonkyo Jung <joonkyoj@yonsei.ac.kr>. For example the following code: static void Syzkaller1(int fd) { struct drm_amdgpu_gem_userptr arg; int ret; arg.addr = 0xffffffffffff0000; arg.size = 0x80000000; /2 Gb/ arg.flags = 0x7; ret = drmIoctl(fd, 0xc1186451/amdgpu_gem_userptr_ioctl/, &arg); } Due to the address and size are not valid there is a failure in amdgpu_hmm_register->mmu_interval_notifier_insert->__mmu_interval_notifier_insert-> check_shl_overflow, but we even the amdgpu_hmm_register failure we still call amdgpu_hmm_unregister into amdgpu_gem_object_free which causes access to a bad address. The following stack is below when the issue is reproduced when Kazan is enabled: [ +0.000014] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020 [ +0.000009] RIP: 0010:mmu_interval_notifier_remove+0x327/0x340 [ +0.000017] Code: ff ff 49 89 44 24 08 48 b8 00 01 00 00 00 00 ad de 4c 89 f7 49 89 47 40 48 83 c0 22 49 89 47 48 e8 ce d1 2d 01 e9 32 ff ff ff <0f> 0b e9 16 ff ff ff 4c 89 ef e8 fa 14 b3 ff e9 36 ff ff ff e8 80 [ +0.000014] RSP: 0018:ffffc90002657988 EFLAGS: 00010246 [ +0.000013] RAX: 0000000000000000 RBX: 1ffff920004caf35 RCX: ffffffff8160565b [ +0.000011] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff8881a9f78260 [ +0.000010] RBP: ffffc90002657a70 R08: 0000000000000001 R09: fffff520004caf25 [ +0.000010] R10: 0000000000000003 R11: ffffffff8161d1d6 R12: ffff88810e988c00 [ +0.000010] R13: ffff888126fb5a00 R14: ffff88810e988c0c R15: ffff8881a9f78260 [ +0.000011] FS: 00007ff9ec848540(0000) GS:ffff8883cc880000(0000) knlGS:0000000000000000 [ +0.000012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000010] CR2: 000055b3f7e14328 CR3: 00000001b5770000 CR4: 0000000000350ef0 [ +0.000010] Call Trace: [ +0.000006] <TASK> [ +0.000007] ? show_regs+0x6a/0x80 [ +0.000018] ? __warn+0xa5/0x1b0 [ +0.000019] ? mmu_interval_notifier_remove+0x327/0x340 [ +0.000018] ? report_bug+0x24a/0x290 [ +0.000022] ? handle_bug+0x46/0x90 [ +0.000015] ? exc_invalid_op+0x19/0x50 [ +0.000016] ? asm_exc_invalid_op+0x1b/0x20 [ +0.000017] ? kasan_save_stack+0x26/0x50 [ +0.000017] ? mmu_interval_notifier_remove+0x23b/0x340 [ +0.000019] ? mmu_interval_notifier_remove+0x327/0x340 [ +0.000019] ? mmu_interval_notifier_remove+0x23b/0x340 [ +0.000020] ? __pfx_mmu_interval_notifier_remove+0x10/0x10 [ +0.000017] ? kasan_save_alloc_info+0x1e/0x30 [ +0.000018] ? srso_return_thunk+0x5/0x5f [ +0.000014] ? __kasan_kmalloc+0xb1/0xc0 [ +0.000018] ? srso_return_thunk+0x5/0x5f [ +0.000013] ? __kasan_check_read+0x11/0x20 [ +0.000020] amdgpu_hmm_unregister+0x34/0x50 [amdgpu] [ +0.004695] amdgpu_gem_object_free+0x66/0xa0 [amdgpu] [ +0.004534] ? __pfx_amdgpu_gem_object_free+0x10/0x10 [amdgpu] [ +0.004291] ? do_syscall_64+0x5f/0xe0 [ +0.000023] ? srso_return_thunk+0x5/0x5f [ +0.000017] drm_gem_object_free+0x3b/0x50 [drm] [ +0.000489] amdgpu_gem_userptr_ioctl+0x306/0x500 [amdgpu] [ +0.004295] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu] [ +0.004270] ? srso_return_thunk+0x5/0x5f [ +0.000014] ? __this_cpu_preempt_check+0x13/0x20 [ +0.000015] ? srso_return_thunk+0x5/0x5f [ +0.000013] ? sysvec_apic_timer_interrupt+0x57/0xc0 [ +0.000020] ? srso_return_thunk+0x5/0x5f [ +0.000014] ? asm_sysvec_apic_timer_interrupt+0x1b/0x20 [ +0.000022] ? drm_ioctl_kernel+0x17b/0x1f0 [drm] [ +0.000496] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu] [ +0.004272] ? drm_ioctl_kernel+0x190/0x1f0 [drm] [ +0.000492] drm_ioctl_kernel+0x140/0x1f0 [drm] [ +0.000497] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu] [ +0.004297] ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm] [ +0.000489] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? __kasan_check_write+0x14/0x20 [ +0.000016] drm_ioctl+0x3da/0x730 [drm] [ +0.000475] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu] [ +0.004293] ? __pfx_drm_ioctl+0x10/0x10 [drm] [ +0.000506] ? __pfx_rpm_resume+0x10/0x10 [ +0.000016] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? __kasan_check_write+0x14/0x20 [ +0.000010] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? _raw_spin_lock_irqsave+0x99/0x100 [ +0.000015] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ +0.000014] ? srso_return_thunk+0x5/0x5f [ +0.000013] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? preempt_count_sub+0x18/0xc0 [ +0.000013] ? srso_return_thunk+0x5/0x5f [ +0.000010] ? _raw_spin_unlock_irqrestore+0x27/0x50 [ +0.000019] amdgpu_drm_ioctl+0x7e/0xe0 [amdgpu] [ +0.004272] __x64_sys_ioctl+0xcd/0x110 [ +0.000020] do_syscall_64+0x5f/0xe0 [ +0.000021] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ +0.000015] RIP: 0033:0x7ff9ed31a94f [ +0.000012] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00 [ +0.000013] RSP: 002b:00007fff25f66790 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ +0.000016] RAX: ffffffffffffffda RBX: 000055b3f7e133e0 RCX: 00007ff9ed31a94f [ +0.000012] RDX: 000055b3f7e133e0 RSI: 00000000c1186451 RDI: 0000000000000003 [ +0.000010] RBP: 00000000c1186451 R08: 0000000000000000 R09: 0000000000000000 [ +0.000009] R10: 0000000000000008 R11: 0000000000000246 R12: 00007fff25f66ca8 [ +0.000009] R13: 0000000000000003 R14: 000055b3f7021ba8 R15: 00007ff9ed7af040 [ +0.000024] </TASK> [ +0.000007] ---[ end trace 0000000000000000 ]--- v2: Consolidate any error handling into amdgpu_hmm_register which applied to kfd_bo also. (Christian) v3: Improve syntax and comment (Christian) Cc: Christian Koenig <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Joonkyo Jung <joonkyoj@yonsei.ac.kr> Cc: Dokyung Song <dokyungs@yonsei.ac.kr> Cc: <jisoo.jang@yonsei.ac.kr> Cc: <yw9865@yonsei.ac.kr> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20	drm/amdgpu: Handle duplicate BOs during process restore	Mukul Joshi
	In certain situations, some apps can import a BO multiple times (through IPC for example). To restore such processes successfully, we need to tell drm to ignore duplicate BOs. While at it, also add additional logging to prevent silent failures when process restore fails. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20	drm/amd/display: Use freesync when `DRM_EDID_FEATURE_CONTINUOUS_FREQ` found	Mario Limonciello
	The monitor shipped with the Framework 16 supports VRR [1], but it's not being advertised. This is because the detailed timing block doesn't contain `EDID_DETAIL_MONITOR_RANGE` which amdgpu looks for to find min and max frequencies. This check however is superfluous for this case because update_display_info() calls drm_get_monitor_range() to get these ranges already. So if the `DRM_EDID_FEATURE_CONTINUOUS_FREQ` EDID feature is found then turn on freesync without extra checks. v2: squash in fix from Harry Closes: https://www.reddit.com/r/framework/comments/1b4y2i5/no_variable_refresh_rate_on_the_framework_16_on/ Closes: https://www.reddit.com/r/framework/comments/1b6vzcy/framework_16_variable_refresh_rate/ Closes: https://community.frame.work/t/resolved-no-vrr-freesync-with-amd-version/42338 Link: https://gist.github.com/superm1/e8fbacfa4d0f53150231d3a3e0a13faf Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20	drm/xe/gt: Remove continue statement which has no effect	Tejas Upadhyay
	Remove continue statement which does not have real effect as no actions are to be taken post continue. Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318114057.3831274-1-tejas.upadhyay@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-03-20	drm/panel: atna33xc20: Fix unbalanced regulator in the case HPD doesn't assert	Douglas Anderson
	When the atna33xc20 driver was first written the resume code never returned an error. If there was a problem waiting for HPD it just printed a warning and moved on. This changed in response to review feedback [1] on a future patch but I accidentally didn't account for rolling back the regulator enable in the error cases. Do so now. [1] https://lore.kernel.org/all/5f3cf3a6-1cc2-63e4-f76b-4ee686764705@linaro.org/ Fixes: 3b5765df375c ("drm/panel: atna33xc20: Take advantage of wait_hpd_asserted() in struct drm_dp_aux") Acked-by: Jessica Zhang <quic_jesszhan@quicinc.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240313-homestarpanel-regulator-v1-1-b8e3a336da12@chromium.org
2024-03-20	drm/i915: Rename ICL_PORT_TX_DW6 bits	Ville Syrjälä
	Our definitions for bit 7 and bit 0 of ICL_PORT_TX_DW6 are swapped. Functionally it doesn't matter as we always set both bits, but let's rename the bits to match bspec 100%. And while at it, add the definition for bits 1-6 as well, just to have it all fully documented. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240308072400.28918-1-ville.syrjala@linux.intel.com Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-20	drm/ttm: warn when resv objs are mixed in a bulk_move	Christian König
	The BOs in a bulk move must share all the same reservation object to make sure that we lock the whole bulk during eviction. Actually document and enforce that with a warning. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240312105555.3065-1-christian.koenig@amd.com
2024-03-20	drm/xe/display: fix type of intel_uncore_read*() functions	Luca Coelho
	Some of the backported intel_uncore_read*() functions used the wrong types. Change the function declarations accordingly. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240314065221.1181158-1-luciano.coelho@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-03-20	drm/xe: Move xe_ggtt_invalidate out from ggtt->lock	Maarten Lankhorst
	Considering the caller of the GGTT functions should keep the backing storage alive before the function completes, it's not necessary to invalidate with the GGTT lock held. This just adds latency for every user of the GGTT. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240306052002.311196-5-matthew.brost@intel.com
2024-03-20	drm/xe: Add XE_BO_GGTT_INVALIDATE flag	Matthew Brost
	Add XE_BO_GGTT_INVALIDATE flag which indicates the GGTT should be invalidated when a BO is added / removed from the GGTT. This is typically set when a BO is used by the GuC as the GuC has GGTT TLBs. Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> [mlankhorst: Small fix to only inherit GGTT_INVALIDATE from src bo] [mlankhorst: Remove _BIT from name] Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240306052002.311196-4-matthew.brost@intel.com
2024-03-20	drm/xe: Drop ggtt invalidate from display code	Matthew Brost
	Only buffers mapped in the GGTT used by the GuC require an invalidation. Display buffers do not require an invalidation. Delete the invalidatio from display code and make invalidation a static function in xe_ggtt.c. Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240306052002.311196-3-matthew.brost@intel.com
2024-03-20	drm/xe: Add a NULL check in xe_ttm_stolen_mgr_init	Nirmoy Das
	Add an explicit check to ensure that the mgr is not NULL. Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240319130925.22399-1-nirmoy.das@intel.com
2024-03-19	drm/xe: Use correct function pointer type	Niranjana Vishwanathapura
	Use xe_exec_queue_user_extension_fn type for exec_queue_user_extension_funcs.` Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240319174919.1847-1-niranjana.vishwanathapura@intel.com
2024-03-19	drm/xe: Streamline exec queue freeing path	Niranjana Vishwanathapura
	Ensure exec queue freeing happens at one place, that is in __xe_exec_queue_free(). It releases q->vm reference also. Set q->vm before handling extensions as they can potentially reference it. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240319175947.15890-1-niranjana.vishwanathapura@intel.com
2024-03-19	drm/xe: Separate out sched/deregister_done handling	Niranjana Vishwanathapura
	Abstract out the core part of sched_done and deregister_done handlers to separate functions to decouple them from any protocol error handling part and make them more readable. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240319184153.16667-1-niranjana.vishwanathapura@intel.com
2024-03-20	drm/i915/scaler: Update Pipe src size check in skl_update_scaler	Ankit Nautiyal
	For Earlier platforms, the Pipe source size is 12-bits so max pipe source width and height is 4096. For newer platforms it is 13-bits so theoretically max width/height is 8192. For few of the earlier platforms the scaler did not use all bits of the PIPESRC, so max scaler source size was used to make that the pipe source size is programmed within limits, before using scaler. This creates a problem, for MTL where scaler source size is 4096, but max pipe source width can theroretically be 8192. Switch the check to use the max scaler destination size, which closely match the limits. Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240313143825.3461208-1-ankit.k.nautiyal@intel.com
2024-03-20	drm/lcdif: Do not disable clocks on already suspended hardware	Marek Vasut
	In case the LCDIF is enabled in DT but unused, the clocks used by the LCDIF are not enabled. Those clocks may even have a use count of 0 in case there are no other users of those clocks. This can happen e.g. in case the LCDIF drives HDMI bridge which has no panel plugged into the HDMI connector. Do not attempt to disable clocks in the suspend callback and re-enable clocks in the resume callback unless the LCDIF is enabled and was in use before the system entered suspend, otherwise the driver might end up trying to disable clocks which are already disabled with use count 0, and would trigger a warning from clock core about this condition. Note that the lcdif_rpm_suspend() and lcdif_rpm_resume() functions internally perform the clocks disable and enable operations and act as runtime PM hooks too. Reviewed-by: Liu Ying <victor.liu@nxp.com> Fixes: 9db35bb349a0 ("drm: lcdif: Add support for i.MX8MP LCDIF variant") Signed-off-by: Marek Vasut <marex@denx.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240226082644.32603-1-marex@denx.de
2024-03-19	drm/xe/guc: Don't support older GuC 70.x releases	Daniele Ceraolo Spurio
	Supporting older GuC versions comes with baggage, both on the coding side (due to interfaces only being available from a certain version onwards) and on the testing side (due to having to make sure the driver works as expected with older GuCs). Since all of our Xe platform are still under force probe, we haven't committed to support any specific GuC version and we therefore don't need to support the older once, which means that we can force a bottom limit to what GuC we accept. This allows us to remove any conditional statements based on older GuC versions and also to approach newer additions knowing that we'll never attempt to load something older than our minimum requirement. As an initial value, the minimum expected version is set to 70.19, which is the version currently in the firmware table, but the expectation is that this will be bumbed every time we update the table, until we remove the force probe. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240304162616.824884-1-daniele.ceraolospurio@intel.com
2024-03-19	drm/xe: Add dbg messages on the suspend resume functions.	Rodrigo Vivi
	In case of the suspend/resume flow getting locked up we can get reports with some useful hints on where it might get locked and if that has failed. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318180141.267458-2-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-19	drm/xe: Convert gt suspend/resume messages to debug	Rodrigo Vivi
	Let's be quieter on production configuration and let's also print the entry point of the gt suspend when debug messages are enabled. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318180141.267458-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-19	drm/xe/vm : Remove duplicate assignment of XE_VM_FLAG_LR_MODE flag.	Himal Prasad Ghimiray
	vm->flags are already assigned with passed flags. Remove the redundant assignment. Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240307065213.1968688-1-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-19	drm/panel: simple: Add POWERTIP PH128800T006-ZHC01 panel entry	Nathan Morrisson
	Add support for the POWERTIP PH128800T006-ZHC01 10.1" (1280x800) LCD-TFT panel. Signed-off-by: Nathan Morrisson <nmorrisson@phytec.com> Acked-by: Jessica Zhang <quic_jesszhan@quicinc.com> Link: https://lore.kernel.org/r/20240318161708.1415484-3-nmorrisson@phytec.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240318161708.1415484-3-nmorrisson@phytec.com
2024-03-19	drm: bridge: thc63lvd1024: Print error message when DT parsing fails	Laurent Pinchart
	Commit 00084f0c01bf ("drm: bridge: thc63lvd1024: Switch to use of_graph_get_remote_node()") simplified the thc63lvd1024 driver by replacing hand-rolled code with a helper function. While doing so, it created an error code path at probe time without any error message, potentially causing probe issues that get annoying to debug. Fix it by adding an error message. Fixes: 00084f0c01bf ("drm: bridge: thc63lvd1024: Switch to use of_graph_get_remote_node()") Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20240318160601.2813-1-laurent.pinchart+renesas@ideasonboard.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240318160601.2813-1-laurent.pinchart+renesas@ideasonboard.com
2024-03-19	nouveau/gsp: don't check devinit disable on GSP.	Dave Airlie
	GSP should be handling this and I can see no evidence in opengpu driver that this register should be touched. Fixed acceleration on 2080 Ti GPUs. Fixes: 15740541e8f0 ("drm/nouveau/devinit/tu102-: prepare for GSP-RM") Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240314014521.2695233-1-airlied@gmail.com
2024-03-19	drm/xe/display: Mark dpt and related vma as uncached	Juha-Pekka Heikkila
	Mark dpt and related vma as uncached to avoid pipe faults on some devices. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318201850.127785-1-juhapekka.heikkila@gmail.com
2024-03-19	drm/xe/display: mark DPT with XE_BO_PAGETABLE	Matthew Auld
	Otherwise in the case where we use normal system memory, the CPU access will always be cached, like when filling the DPT PTEs, which is likely not what we want since HW access could be incoherent on platforms like LNL. Marking as XE_BO_PAGETABLE will force wc/uc underneath on such platforms. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240314164905.239449-2-matthew.auld@intel.com
2024-03-19	drm/xe: Remove usage of unsafe strcpy	Nirmoy Das
	Remove usage of unsafe strcpy with a helper function to convert engine class to string. Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318091055.638-1-nirmoy.das@intel.com
2024-03-19	drm/xe: Drop bogus vma NULL check	Nirmoy Das
	The vma pointer can't be NULL here. Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318093547.16326-1-nirmoy.das@intel.com
2024-03-19	drm/xe/device: fix XE_MAX_TILES_PER_DEVICE check	Matthew Auld
	Here XE_MAX_TILES_PER_DEVICE is the gt array size, therefore the gt index should always be less than. v2 (Lucas): - Add fixes tag. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318180532.57522-6-matthew.auld@intel.com
2024-03-19	drm/xe/device: fix XE_MAX_GT_PER_TILE check	Matthew Auld
	Here XE_MAX_GT_PER_TILE is the total, therefore the gt index should always be less than. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318180532.57522-5-matthew.auld@intel.com
2024-03-19	drm/xe/queue: fix engine_class bounds check	Matthew Auld
	The engine_class is the index into the user_to_xe_engine_class, therefore it needs to be less than. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318180532.57522-4-matthew.auld@intel.com
2024-03-19	drm/xe: Fix potential integer overflow in page size calculation	Nirmoy Das
	Explicitly cast tbo->page_alignment to u64 before bit-shifting to prevent overflow when assigning to min_page_size. Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318164342.3094-1-nirmoy.das@intel.com
2024-03-19	drm/xe/vm: fix xe_assert()	Matthew Auld
	The region can be used an index into the region_to_mem_type, so we should be asserting that it is less than the ARRAY_SIZE here. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318103616.26240-2-matthew.auld@intel.com
2024-03-19	drm/xe/client: drop bogus bo NULL check	Matthew Auld
	If we fished it out the list then it can't be null; the list entry is embedded in the bo. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318093431.21075-4-matthew.auld@intel.com
2024-03-19	drm/xe/client: remove bogus rcu list usage	Matthew Auld
	We use plain spinlock to protect readers and writers, so there is no actual RCU here. Rather use the more appropriate non-rcu list based API. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318093431.21075-3-matthew.auld@intel.com
2024-03-19	drm/i915: Add includes for BUG_ON/BUILD_BUG_ON in i915_memcpy.c	Joonas Lahtinen
	Add standalone includes for BUG_ON and BUILD_BUG_ON to avoid build failure after linux-next include refactoring. Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tursulin@ursulin.net> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240308144643.137831-1-joonas.lahtinen@linux.intel.com
2024-03-19	Merge tag 'drm-misc-next-fixes-2024-03-14' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/misc/kernel into drm-next Short summary of fixes pull: probe-helper: - never return negative values from .get_modes() plus driver fixes nouveau: - clear bo resource bus after eviction - documentation fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240314082833.GA8761@localhost.localdomain
2024-03-19	Merge tag 'drm-xe-next-fixes-2024-03-14' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Driver changes: - Invalidate userptr VMA on page pin fault, allowing userspace to free userptr while still having bindings - Fail early on sysfs file creation error - Skip VMA pinning on xe_exec with num_batch_buffer == 0 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c4epi2j6anpc77z73zbgibxg7bxsmmkb522aa7tyei6oa6uunn@3oad4cgomd5a
2024-03-18	Merge tag 'trace-v6.9-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: "Main user visible change: - User events can now have "multi formats" The current user events have a single format. If another event is created with a different format, it will fail to be created. That is, once an event name is used, it cannot be used again with a different format. This can cause issues if a library is using an event and updates its format. An application using the older format will prevent an application using the new library from registering its event. A task could also DOS another application if it knows the event names, and it creates events with different formats. The multi-format event is in a different name space from the single format. Both the event name and its format are the unique identifier. This will allow two different applications to use the same user event name but with different payloads. - Added support to have ftrace_dump_on_oops dump out instances and not just the main top level tracing buffer. Other changes: - Add eventfs_root_inode Only the root inode has a dentry that is static (never goes away) and stores it upon creation. There's no reason that the thousands of other eventfs inodes should have a pointer that never gets set in its descriptor. Create a eventfs_root_inode desciptor that has a eventfs_inode descriptor and a dentry pointer, and only the root inode will use this. - Added WARN_ON()s in eventfs There's some conditionals remaining in eventfs that should never be hit, but instead of removing them, add WARN_ON() around them to make sure that they are never hit. - Have saved_cmdlines allocation also include the map_cmdline_to_pid array The saved_cmdlines structure allocates a large amount of data to hold its mappings. Within it, it has three arrays. Two are already apart of it: map_pid_to_cmdline[] and saved_cmdlines[]. More memory can be saved by also including the map_cmdline_to_pid[] array as well. - Restructure __string() and __assign_str() macros used in TRACE_EVENT() Dynamic strings in TRACE_EVENT() are declared with: __string(name, source) And assigned with: __assign_str(name, source) In the tracepoint callback of the event, the __string() is used to get the size needed to allocate on the ring buffer and __assign_str() is used to copy the string into the ring buffer. There's a helper structure that is created in the TRACE_EVENT() macro logic that will hold the string length and its position in the ring buffer which is created by __string(). There are several trace events that have a function to create the string to save. This function is executed twice. Once for __string() and again for __assign_str(). There's no reason for this. The helper structure could also save the string it used in __string() and simply copy that into __assign_str() (it also already has its length). By using the structure to store the source string for the assignment, it means that the second argument to __assign_str() is no longer needed. It will be removed in the next merge window, but for now add a warning if the source string given to __string() is different than the source string given to __assign_str(), as the source to __assign_str() isn't even used and will be going away. - Added checks to make sure that the source of __string() is also the source of __assign_str() so that it can be safely removed in the next merge window. Included fixes that the above check found. - Other minor clean ups and fixes" * tag 'trace-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (34 commits) tracing: Add __string_src() helper to help compilers not to get confused tracing: Use strcmp() in __assign_str() WARN_ON() check tracepoints: Use WARN() and not WARN_ON() for warnings tracing: Use div64_u64() instead of do_div() tracing: Support to dump instance traces by ftrace_dump_on_oops tracing: Remove second parameter to __assign_rel_str() tracing: Add warning if string in __assign_str() does not match __string() tracing: Add __string_len() example tracing: Remove __assign_str_len() ftrace: Fix most kernel-doc warnings tracing: Decrement the snapshot if the snapshot trigger fails to register tracing: Fix snapshot counter going between two tracers that use it tracing: Use EVENT_NULL_STR macro instead of open coding "(null)" tracing: Use ? : shortcut in trace macros tracing: Do not calculate strlen() twice for __string() fields tracing: Rework __assign_str() and __string() to not duplicate getting the string cxl/trace: Properly initialize cxl_poison region name net: hns3: tracing: fix hclgevf trace event strings drm/i915: Add missing ; to __assign_str() macros in tracepoint code NFSD: Fix nfsd_clid_class use of __string_len() macro ...