summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-12-21drm/xe/xe2: Add missing mocs entryLucas De Marchi
Add index 4 so WB on both L3 and L4 can be used by userspace. Bspec: 71582 Link: https://lore.kernel.org/all/7oqovb356dx2hm5muop3xjqr4kv7m5fzjisch3vmsmxm33ygtv@eib4jielia35/ Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231004150317.3473731-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Remove devcoredump readout of IPEIRJosé Roberto de Souza
This register don't exist in gfx12+, so here dropping the readout and print in devcoredump. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Fix devcoredump readout of IPEHRJosé Roberto de Souza
It was reading (base) + 0x8c but that is not a valid register and instead it should read (base) + 0x68. So here reading the correct register and removing the wrong and duplicated. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Fix RING_MI_MODE label in devcoredumpJosé Roberto de Souza
Fix a typo in RING_MI_MODE label. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: fix range printing for debug messagesPaulo Zanoni
We're already using the half-open interval notation "[A, B)", that "- 1" there makes it wrong. Also, getting rid of the "-1" makes it much easier to grep for the logs when you're looking for an address that's the end of a vma and the start of another. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/vm: use list_last_entry() to fetch last_opPaulo Zanoni
I would imagine that it's more efficient to fetch ops_list->prev than to walk the whole list forward. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/vm: print the correct 'keep' when printing gpuva opsPaulo Zanoni
Unions are cool, until they aren't. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/hwmon: Expose hwmon energy attributeBadal Nilawar
Expose hwmon energy attribute to show device level energy usage v2: - %s/hwm_/hwmon_/ - Convert enums to upper case v3: - %s/hwmon_/xe_hwmon - Remove gt specific hwmon attributes v4: - %s/REG_PKG_ENERGY_STATUS/REG_ENERGY_STATUS_ALL (Riana) - %s/hwmon_energy_info/xe_hwmon_energy_info (Riana) Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20230925081842.3566834-5-badal.nilawar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/hwmon: Expose input voltage attributeBadal Nilawar
Use Xe HWMON subsystem to display the input voltage. v2: - Rename hwm_get_vltg to hwm_get_voltage (Riana) - Use scale factor SF_VOLTAGE (Riana) v3: - %s/gt_perf_status/REG_GT_PERF_STATUS/ - Remove platform check from hwmon_get_voltage() v4: - Fix review comments (Andi) Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20230925081842.3566834-4-badal.nilawar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/hwmon: Expose card reactive critical powerBadal Nilawar
Expose the card reactive critical (I1) power. I1 is exposed as power1_crit in microwatts (typically for client products) or as curr1_crit in milliamperes (typically for server). v2: Move PCODE_MBOX macro to pcode file (Riana) v3: s/IS_DG2/(gt_to_xe(gt)->info.platform == XE_DG2) v4: Fix review comments (Andi) Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20230925081842.3566834-3-badal.nilawar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/hwmon: Expose power attributesBadal Nilawar
Expose Card reactive sustained (pl1) power limit as power_max and card default power limit (tdp) as power_rated_max. v2: - Fix review comments (Riana) v3: - Use drmm_mutex_init (Matt Brost) - Print error value (Matt Brost) - Convert enums to uppercase (Matt Brost) - Avoid extra reg read in hwmon_is_visible function (Riana) - Use xe_device_assert_mem_access when applicable (Matt Brost) - Add intel-xe@lists.freedesktop.org in Documentation (Matt Brost) v4: - Use prefix xe_hwmon prefix for all functions (Matt Brost/Andi) - %s/hwmon_reg/xe_hwmon_reg (Andi) - Fix review comments (Guenter/Andi) v5: - Fix review comments (Riana) v6: - Use drm_warn in default case (Rodrigo) - s/ENODEV/EOPNOTSUPP (Andi) Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20230925081842.3566834-2-badal.nilawar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Fix exec queue usage for unbindsMatthew Brost
Passing in a NULL exec queue to __xe_pt_unbind_vma results in the migrate exec queue being used. This is not the intent from the VM bind IOCTL, rather a NULL exec queue should use default VM exec queue. Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Leverage ComputeCS read L3 cachingBalasubramani Vivekanandan
On platforms that support read L3 caching, set the default mocs index in CCS RING_CMD_CTL to leverage the read caching in L3. Currently PVC and Xe2 platforms have the support. Bspec: 72161 Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230929051539.3157441-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/xe2: Update MOCS fields in blitter instructionsHaridhar Kalvala
Xe2 changes or adds bits for mocs in a few BLT instructions: XY_CTRL_SURF_COPY_BLT, XY_FAST_COLOR_BLT, XY_FAST_COPY_BLT, and MEM_SET. Modify the code to deal with the new location. Unlike Xe1, the MOCS field in those instructions is only the MOCS index and not the Structure_MEMORY_OBJECT_CONTROL_STATE anymore. The pxp bit is now explicitly documented separately. Bspec: 57567,57566,57565,57562 Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230929213640.3189912-5-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/xe2: Set tile y type in XY_FAST_COPY_BLT to Tile4Haridhar Kalvala
Set bits 30 and 31 of XY_FAST_COPY_BLT's dword1 for XeHP and above. Destination or source being Y-Major is selected on dword0 and there's nothing to set on dword1. According to the bspec for Xe2, "Behavior is undefined when programmed the value 0". Also for XeHP, the only value allowed in those bits is 0b11, not being possible to select "Legacy Tile-Y" anymore, only the newer Tile4. So, unconditionally set those bits for graphics IP 12.50 and above. v2: Reword commit message and extend it to graphics version >= 12.50 (Matt Roper) Bspec: 57567 Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230929213640.3189912-4-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Rename MEM_SET instructionHaridhar Kalvala
PVC_MS_* doesn't reflect the real name of the instruction. Rename it to follow the name used in the bspec. Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230929213640.3189912-3-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Adjust mocs field mask definitionsHaridhar Kalvala
Instead of using xe_mocs_index_to_value(), simply define the bitmask with the shift left applied. This will make it easier to adapt to new platforms that simply use the index. This also fixes PVC bug in emit_clear_link_copy() where the MOCS was getting shifted both by PVC_MS_MOCS_INDEX_MASK definition and by the xe_moc_index_to_value function. Bspec: 44509 Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230929213640.3189912-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/xe2: Extend reserved stolen sizesLucas De Marchi
For xe2, besides the previous sizes, the reserved portion of stolen can also have 16MB and 32MB. Bspec: 53148 Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230929044959.3149265-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/tuning: Add missing engine class rules for LRC tuningMatt Roper
The LRC tuning settings we have today are modifying registers that are part of the RCS engine's context; they're not part of the general CSFE context that would apply to all engines. Add ENGINE_CLASS(RENDER) to the RTP rules to properly restrict these to the RCS. Bspec: 46255, 46261 Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230929230332.3348841-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: timeout needs to be a signed valueFei Yang
In xe_wait_user_fence_ioctl, the timeout is currently defined as unsigned long. That could potentially pass a negative value to the schedule_timeout() call because nsecs_to_jiffies() returns an unsigned long which gets used as signed long. [ 187.732238] schedule_timeout: wrong timeout value fffffffffffffc18 [ 187.733180] CPU: 0 PID: 792 Comm: test_thread_dim Tainted: G U 6.4.0-xe #1 [ 187.734251] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 187.735019] Call Trace: [ 187.735373] <TASK> [ 187.735687] dump_stack_lvl+0x92/0xb0 [ 187.736193] schedule_timeout+0x348/0x430 [ 187.736739] ? __might_fault+0x67/0xd0 [ 187.737255] ? check_chain_key+0x224/0x2d0 [ 187.737812] ? __pfx_schedule_timeout+0x10/0x10 [ 187.738429] ? __might_fault+0x6b/0xd0 [ 187.738946] ? __pfx_lock_release+0x10/0x10 [ 187.739512] ? __pfx_lock_release+0x10/0x10 [ 187.740080] wait_woken+0x86/0x100 [ 187.740556] xe_wait_user_fence_ioctl+0x34b/0xe00 [xe] [ 187.741281] ? __pfx_xe_wait_user_fence_ioctl+0x10/0x10 [xe] [ 187.742075] ? lock_acquire+0x169/0x3d0 [ 187.742601] ? check_chain_key+0x224/0x2d0 [ 187.743158] ? drm_dev_enter+0x9/0xe0 [drm] [ 187.743740] ? __pfx_woken_wake_function+0x10/0x10 [ 187.744388] ? drm_dev_exit+0x11/0x50 [drm] [ 187.744969] ? __pfx_lock_release+0x10/0x10 [ 187.745536] ? __might_fault+0x67/0xd0 [ 187.746052] ? check_chain_key+0x224/0x2d0 [ 187.746610] drm_ioctl_kernel+0x172/0x250 [drm] [ 187.747242] ? __pfx_xe_wait_user_fence_ioctl+0x10/0x10 [xe] [ 187.748037] ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm] [ 187.748729] ? __pfx_xe_wait_user_fence_ioctl+0x10/0x10 [xe] [ 187.749524] ? __pfx_xe_wait_user_fence_ioctl+0x10/0x10 [xe] [ 187.750319] drm_ioctl+0x35e/0x620 [drm] [ 187.750871] ? __pfx_drm_ioctl+0x10/0x10 [drm] [ 187.751495] ? restore_fpregs_from_fpstate+0x99/0x140 [ 187.752172] ? __pfx_restore_fpregs_from_fpstate+0x10/0x10 [ 187.752901] ? mark_held_locks+0x24/0x90 [ 187.753438] __x64_sys_ioctl+0xb4/0xf0 [ 187.753954] do_syscall_64+0x3f/0x90 [ 187.754450] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 187.755127] RIP: 0033:0x7f4e6651aaff [ 187.755623] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00 [ 187.757995] RSP: 002b:00007fff05f37a50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 187.758995] RAX: ffffffffffffffda RBX: 000055eca47c8130 RCX: 00007f4e6651aaff [ 187.759935] RDX: 00007fff05f37b60 RSI: 00000000c050644b RDI: 0000000000000004 [ 187.760874] RBP: 0000000000000017 R08: 0000000000000017 R09: 7fffffffffffffff [ 187.761814] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 187.762753] R13: 0000000000000000 R14: 0000000000000000 R15: 00007f4e65d19ce0 [ 187.763694] </TASK> Fixes: 5572a0046857 ("drm/xe: Use nanoseconds instead of jiffies in uapi for user fence") Signed-off-by: Fei Yang <fei.yang@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20230921220500.994558-2-fei.yang@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: set PTE_AE for all platforms supporting itFei Yang
Atomic access is supported by PVC, and became a common feature for all platforms starting from Xe2. To enable that XE_VMA_ATOMIC_PTE_BIT needs to be set, then pte encode will eventually set PTE_AE for devmem. Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230928044335.1474903-2-fei.yang@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Add a missing mutex_destroy to xe_ttm_vram_mgrBommithi Sakeena
Ensure that the mutex is destroyed at fini function. Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Bommithi Sakeena <bommithi.sakeena@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Ensure mutex are destroyedBommithi Sakeena
Add missing mutex_destroy calls to fini functions or convert to drmm_mutex_init where fini function is not available. Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Bommithi Sakeena <bommithi.sakeena@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: do not register to PM if GuC is disabledOhad Sharabi
When working without GuC (i.e. working with execlists), the flow attempts to perform suspend operation which is failing due to a lack of support without GuC. If PM ops are not supported without GuC we may as well avoid PM registration rather than returning errors from various PM flows. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Use vfunc for ggtt pte encodingLucas De Marchi
Use 2 different functions for encoding the ggtt's pte, assigning them during initialization. Main difference is that before Xe-LPG, the pte didn't have the cache bits. v2: Re-use xelp_ggtt_pte_encode_bo() for the common part with xelpg_ggtt_pte_encode_bo() (Matt Roper) Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230927193902.2849159-11-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Use pat_index to encode pde/pteLucas De Marchi
Change the xelp_pte_encode() and xelp_pde_encode() functions to use the platform-dependent pat_index. The same function can be used for all platforms as they only need to encode the pat_index bits in the same pte/pde layout. For platforms that don't have the most significant bit, as long as they don't return a bogus index they should be fine. v2: Use the same logic to encode pde as it's compatible with previous logic, it's more future proof and also fixes the cache setting for PVC (Matt Roper) Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230927193902.2849159-10-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/pat: Keep track of relevant indexesLucas De Marchi
Some of the PAT entries are relevant for internal driver use, which varies per platform. Let the PAT early initialization set what they should point to so the rest of the driver can use them where needed. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230927193902.2849159-9-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/pat: Prefer the arch/IP namesLucas De Marchi
Both DG2 and PVC are derived from XeHP, but DG2 should not really re-use something introduced by PVC, so it's odd to have DG2 re-using the PVC programming for PAT. Let's prefer using the architecture and/or IP names. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230927193902.2849159-8-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/dg2: Fix using wrong PAT tableLucas De Marchi
DG2 should use the MCR variant to program the PAT registers, like PVC, but shouldn't use the same table as PVC. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230927193902.2849159-7-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Use vfunc to initialize PATLucas De Marchi
Split the PAT initialization between SW-only and HW. The _early() only sets up the ops and data structure that are used later to program the tables. This allows the PAT to be easily extended to other platforms. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230927193902.2849159-6-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/migrate: Do not hand-encode pteLucas De Marchi
Instead of encoding the pte, call a new vfunc from xe_vm to handle that. The encoding may not be the same on every platform, so keeping it in one place helps to better support them. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230927193902.2849159-5-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Use vfunc for pte/pde ppgtt encodingLucas De Marchi
Move the function to encode pte/pde to be vfuncs inside struct xe_vm. This will allow to easily extend to platforms that don't have a compatible encoding. v2: Fix kunit build Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230927193902.2849159-4-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Remove check for vma == NULLLucas De Marchi
vma at this point can never be NULL as otherwise it would crash earlier in the only caller, xe_pt_stage_bind_entry(). Remove the extra check and avoid adding and removing the bits from the pte. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230927193902.2849159-3-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Normalize pte/pde encodingLucas De Marchi
Split functions that do only part of the pde/pte encoding and that can be called by the different places. This normalizes how pde/pte are encoded so they can be moved elsewhere in a subsequent change. xe_pte_encode() was calling __pte_encode() with a NULL vma, which is the opposite of what xe_pt_stage_bind_entry() does. Stop passing a NULL vma and just split another function that deals with a vma rather than a bo. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230927193902.2849159-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Infer service copy functionality from engine listMatt Roper
On platforms with multiple BCS engines (i.e., PVC and Xe2), not all BCS engines are created equal. The BCS0 engine is what the specs refer to as a "resource copy engine," which supports the platform's full set of copy/fill instructions. In contast, the non-BCS0 "service copy" engines are more streamlined and only support a subset of the GPU instructions supported by the resource copy engine. Platforms with both types of copy engines always support the MEM_COPY and MEM_SET instructions which can be used for simple copy and fill operations on either type of BCS engine. Since the simple MEM_SET instruction meets the needs of Xe's migrate code (and since the more elaborate XY_FAST_COLOR_BLT instruction isn't available to use on service copy engines), we always prefer to use MEM_SET for clearing buffers on our newer platforms. We've been using a 'has_link_copy_engine' feature flag to keep track of which platforms should use MEM_SET for fills. However a feature flag like this is unnecessary since we can already derive the presence of service copy engines (and in turn the MEM_SET instruction) just by looking at the platform's pre-fusing engine list. Utilizing the engine list for this also avoids mistakes like we've made on Xe2 where we forget to set the feature flag in the IP definition. For clarity, "service copy" is a general term that covers any blitter engines that support a limited subset of the overall blitter instruction set (in practice this is any non-BCS0 blitter engine). The "link copy engines" introduced on PVC and the "paging copy engine" present in Xe2 are both instances of service copy engines. v2: - Rewrite / expand the commit message. (Bala) - Fix checkpatch whitespace error. Bspec: 65019 Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Link: https://lore.kernel.org/r/20230927205143.2695089-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/irq: Clear GFX_MSTR_IRQ as part of IRQ resetGustavo Sousa
Starting with Xe_LP+, GFX_MSTR_IRQ contains status bits that have W1C behavior. If we do not properly reset them, we would miss delivery of interrupts if a pending bit is set when enabling IRQs. As an example, the display part of our probe routine contains paths where we wait for vblank interrupts. If a display interrupt was already pending when enabling IRQs, we would time out waiting for the vblank. That in fact happened recently when modprobing Xe on a Lunar Lake with a specific configuration; and that's how we found out we were missing this step in the IRQ enabling logic. Fix the issue by clearing GFX_MSTR_IRQ as part of the IRQ reset. v2: - Make resetting GFX_MSTR_IRQ be the last step to avoid bit re-latching. (Ville) v3: - Swap nesting order: guard loop with the IP version check instead of doing the check at each iteration. (Lucas) v4: - Add braces for the "if" statement guarding the loop to make the compiler happy. (Gustavo) BSpec: 50875, 54028, 62357 Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230926221914.106843-2-gustavo.sousa@intel.com Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Add Wa_18028616096Shekhar Chauhan
Drop UGM per set fragment threshold to 3 BSpec: 54833 Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230925160543.915217-1-shekhar.chauhan@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Align size to PAGE_SIZEPallavi Mishra
Ensure alignment with PAGE_SIZE for the size parameter passed to __xe_bo_create_locked() v2: move size alignment under else condition (Lucas) Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230920213259.3458968-1-pallavi.mishra@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Accept a const xe deviceLucas De Marchi
Depending on the context, it's preferred to have a const pointer to make sure nothing is modified underneath. The assert macros only ever read data from xe/tile/gt for printing, so they can be made const by default. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20230922174320.2372617-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: add msix supportDani Liberman
In future devices we will need to support msix interrupts. Reviewed-by: Ohad Sharabi <osharabi@habana.ai> Signed-off-by: Dani Liberman <dliberman@habana.ai> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: change old msi irq api to a new oneDani Liberman
As a preparation for msix support, changing for new msi irq api which supports both msi and msix. Reviewed-by: Ohad Sharabi <osharabi@habana.ai> Signed-off-by: Dani Liberman <dliberman@habana.ai> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Rebase fixes by Rodrigo]
2023-12-21drm/xe: proper setting of irq enabled flagDani Liberman
IRQ enabled flag should be set only after request irq succeeds. Reviewed-by: Ohad Sharabi <osharabi@habana.ai> Signed-off-by: Dani Liberman <dliberman@habana.ai> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Implement fdinfo memory stats printingTejas Upadhyay
Use the newly added drm_print_memory_stats helper to show memory utilisation of our objects in drm/driver specific fdinfo output. To collect the stats we walk the per memory regions object lists and accumulate object size into the respective drm_memory_stats categories. Objects with multiple possible placements are reported in multiple regions for total and shared sizes, while other categories are counted only for the currently active region. V4: - Remove rcu lock - Auld/Thomas - take refcnt only if its non-zero - Auld - DMA_RESV_USAGE_BOOKKEEP covers all fences - Auld - covert to xe_bo for public objects V3: - dont use xe_bo_get/put, not needed - use designated initializer - Jani - use list_for_each_entry_rcu - Fix Checkpatch err - CI V2: - Use static initializer for mem_type - Himal/Jani Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Account ring buffer and context state storageTejas Upadhyay
Account ring buffers and logical context space against the owning client memory usage stats. Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Track page table memory usage for clientTejas Upadhyay
Account page table memory usage in the owning client memory usage stats. V2: - Minor tweak to if (vm->pt_root[id]) check - Himal Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Record each drm client with its VMTejas Upadhyay
Enable accounting of indirect client memory usage. Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Add tracking support for bos per clientTejas Upadhyay
In order to show per client memory consumption, we need tracking support APIs to add at every bo consumption and removal. Adding APIs here to add tracking calls at places wherever it is applicable. V5: - Rebase V4: - remove client bo before vm_put - spin_lock_irqsave not required - Auld V3: - update .h to return xe_drm_client_remove_bo void - protect xe_drm_client_remove_bo under CONFIG_PROC_FS check - Himal - Fixed Checkpatch error - CI V2: - make xe_drm_client_remove_bo return void - Himal Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Interface xe drm client with fdinfo interfaceTejas Upadhyay
DRM core driver has introduced recently fdinfo interface to show memory stats of individual drm client. Lets interface xe drm client to fdinfo interface. V2: - cover call to xe_drm_client_fdinfo under PROC_FS Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Add drm-client infrastructureTejas Upadhyay
Add drm-client infrastructure to record stats of consumption done by individual drm client. V2: - Typo - CI Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Add child contexts to the GuC context lookupDaniele Ceraolo Spurio
The CAT_ERROR message from the GuC provides the guc id of the context that caused the problem, which can be a child context. We therefore need to be able to match that id to the exec_queue that owns it, which we do by adding child context to the context lookup. While at it, fix the error path of the guc id allocation code to correctly free the ids allocated for parallel queues. v2: rebase on s/XE_WARN_ON/xe_assert Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/590 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>