summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/xe
AgeCommit message (Collapse)Author
2025-07-02drm/xe/xe_pmu: Validate gt in event supportedRiana Tauro
Validate gt instead of checking gt_id is lesser than max gts per tile Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20250630093741.2435281-1-riana.tauro@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-07-02drm/xe/xe_query: Use separate iterator while filling GT listMatt Roper
The 'id' value updated by for_each_gt() is the uapi GT ID of the GTs being iterated over, and may skip over values if a GT is not present on the device. Use a separate iterator for GT list array assignments to ensure that the array will be filled properly on future platforms where index in the GT query list may not match the uapi ID. v2: - Include the missing increment of the iterator. (Jonathan) Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250701201320.2514369-16-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-07-02drm/xe: Don't compare GT ID to GT count when determining valid GTsMatt Roper
On current platforms with multiple GTs, all of the GT IDs are consecutive; as a result we know that the GT IDs range from 0 to gt_count-1 and can determine if a GT ID is valid by comparing against the count. The consecutive nature of GT IDs may not hold true on future platforms if/when we have platforms that are both multi-tile and have multiple GTs within each tile. Once such platforms exist, it's quite possible that we could wind up with something like a GT list composed of IDs 0, 2, and 3 with no GT 1 (which would be a 2-tile platform with media only on the second tile). To future-proof the code we should stop comparing against the GT count to determine whether a GT ID is valid or not. Instead we should do an actual lookup of the ID to determine whether the GT exists. This also means that our GT loop macro should not end at the GT count, but should rather examine the entire space up to (# of tiles) * (max GT per tile) to ensure it doesn't stop prematurely. Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250701201320.2514369-15-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-07-02drm/xe: Assign GT IDs properly on multi-tile + multi-GT platformsMatt Roper
Although "multi-tile" and "multiple GTs per tile" are mutually-exclusive characteristics on all of our platforms today, this may not always be true. Assign GT IDs according to xe->info.max_gt_per_tile in a way that should work even if future platforms have different configurations. This patch should not change the behavior of current platforms; it only future-proofs for potential future designs. v2: - Re-calculate gt_count if tile count gets reduced by MTCFG. (PVC CI) Reviewed-by: Ravi Kumar Vodapalli <ravi.kumar.vodapalli@intel.com> Link: https://lore.kernel.org/r/20250701201320.2514369-14-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-07-02drm/xe/tests/pci: Ensure all platforms have a valid GT/tile countMatt Roper
Add a simple kunit test to ensure each platform's GT per tile count is non-zero and does not exceed the global XE_MAX_GT_PER_TILE definition. We need to move 'struct xe_subplatform_desc' from the .c file to the types header to ensure it is accessible from the kunit test. v2: - Rebase on latest xe_pci test rework from Michal and convert to a parameterized test that runs on each PCI ID supported by the driver. Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Ravi Kumar Vodapalli <ravi.kumar.vodapalli@intel.com> Link: https://lore.kernel.org/r/20250701201320.2514369-13-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-07-02drm/xe: Track maximum GTs per tile on a per-platform basisMatt Roper
Today all of our platforms fall into one of three cases: * Single tile platforms with a single (primary) GT * Single tile platforms with two GTs (primary + media) * Two-tile platforms with a single GT (primary) in each Our numbering of GTs has been a bit inconsistent between platforms (e.g., GT1 is the media GT on some platforms, but the second tile's primary GT on others). In the future we'll likely have platforms that are both multi-tile and multi-GT, which will make the situation more confusing. We could also wind up with more than just two types of GTs at some point in the future. Going forward we should standardize the way we assign uapi GT IDs to internal GT structures. Let's declare that for userspace GT ID n, GT[n]'s tile = n / (max gt per tile) GT[n]'s slot within tile = n % (max gt per tile) We don't want the GT numbering to change for any of our current platforms since the current IDs are part of our ABI contract with userspace so this means we should track the 'max gt per tile' value on a per-platform basis rather than just using a single value across the driver. Encode this into device descriptors in xe_pci.c and use the per-platform number for various checks in the code. Constant XE_MAX_GT_PER_TILE will remain just as the maximum across all platforms for easy of sizing array allocations. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250701201320.2514369-12-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-07-02drm/xe: Export xe_step_name for kunit testsMatt Roper
xe_step_name() is used by xe_assert(), so adding assertions to functions like xe_device_get_gt() will result in ERROR: modpost: "xe_step_name" [drivers/gpu/drm/xe/tests/xe_test.ko] undefined! while building the kunit tests. Export xe_step_name to avoid these build failures when adding assertions. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250701201320.2514369-11-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-07-02drm/xe/hw_engine_group: Fix potential leakMichal Wajdeczko
If we fail to allocate a workqueue we will leak kzalloc'ed group object since it was designed to be kfree'ed in the drmm cleanup action, but we didn't have a chance to register this action yet. To avoid this leak allocate a group object using drmm_kzalloc() and start using predefined drmm action to release the workqueue. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250627184143.1480-1-michal.wajdeczko@intel.com
2025-07-01drm/xe: Allow dropping kunit dependency as built-inHarry Austen
Fix Kconfig symbol dependency on KUNIT, which isn't actually required for XE to be built-in. However, if KUNIT is enabled, it must be built-in too. Fixes: 08987a8b6820 ("drm/xe: Fix build with KUNIT=m") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Harry Austen <hpausten@protonmail.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20250627-xe-kunit-v2-2-756fe5cd56cf@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit a559434880b320b83733d739733250815aecf1b0) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-01drm/xe: Fix kconfig promptLucas De Marchi
The xe driver is the official driver for Intel Xe2 and later, while maintaining experimental support for earlier GPUs. Reword the help message accordingly. Reviewed-by: Maarten Lankhorst <dev@lankhorst.se> Link: https://lore.kernel.org/r/20250611-xe-kconfig-help-v1-1-8bcc6b47d11a@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 1488a3089de3d0bcdc9532da7ce04cf0af9d7dd0) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-01drm/xe/bmg: Update Wa_22019338487Vinay Belgaumkar
Limit GT max frequency to 2600MHz and wait for frequency to reduce before proceeding with a transient flush. This is really only needed for the transient flush: if L2 flush is needed due to 16023588340 then there's no need to do this additional wait since we are already using the bigger hammer. v2: Use generic names, ensure user set max frequency requests wait for flush to complete (Rodrigo) v3: - User requests wait via wait_var_event_timeout (Lucas) - Close races on flush + user requests (Lucas) - Fix xe_guc_pc_remove_flush_freq_limit() being called on last gt rather than root gt (Lucas) v4: - Only apply the freq reducing part if a TDF is needed: L2 flush trumps the need for waiting a lower frequency Fixes: aaa08078e725 ("drm/xe/bmg: Apply Wa_22019338487") Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://lore.kernel.org/r/20250618-wa-22019338487-v5-4-b888388477f2@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit deea6a7d6d803d6bb874a3e6f1b312e560e6c6df) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-01drm/xe/bmg: Update Wa_14022085890Vinay Belgaumkar
Set GT min frequency to 1200Mhz once driver load is complete. v2: Review comments (Rodrigo) v3: Apply Wa earlier so user_req_min is not clobbered. v4: Apply to all GTs (Lucas) Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250612-wa-14022085890-v4-3-94ba5dcc1e30@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit bdde16c9ac5cb56ad2ee19792222fa1853577af7) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-01drm/xe: Split xe_device_td_flush()Lucas De Marchi
xe_device_td_flush() has 2 possible implementations: an entire L2 flush or a transient flush, depending on WA 16023588340. Make this clear by splitting the function so it calls each of them. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250618-wa-22019338487-v5-3-b888388477f2@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 5e300ed8a545bdffc26b579c526b5fef7b2d5365) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-01drm/xe/xe_guc_pc: Lock once to update stashed frequenciesLucas De Marchi
pc_set_mert_freq_cap() currently lock()/unlock() the mutex multiple times to stash the current frequencies. It's not a problem since xe_guc_pc_restore_stashed_freq() is guaranteed to be called only later in the init sequence. However, now that we have _locked() variants for this functions, use them and avoid potential issues when called from other places or using the same pattern. While at it, prefer and early return for the WA check to reduce indentation. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250618-wa-22019338487-v5-2-b888388477f2@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit d878c97daa603573e5af01fd8beec2fffdb42ad1) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-01drm/xe/guc_pc: Add _locked variant for min/max freqLucas De Marchi
There are places in which the getters/setters are called one after the other causing a multiple lock()/unlock(). These are not currently a problem since they are all happening from the same thread, but there's a race possibility as calls are added outside of the early init when the max/min and stashed values need to be correlated. Add the _locked() variants to prepare for that. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250618-wa-22019338487-v5-1-b888388477f2@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 1beae9aa2b88d3a02eb666e7b777eb2d7bc645f4) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-01drm/xe: Make WA BB part of LRC BOMatthew Brost
No idea why, but without this GuC context switches randomly fail when running IGTs in a loop. Need to follow up why this fixes the aforementioned issue but can live with a stable driver for now. Fixes: 617d824c5323 ("drm/xe: Add WA BB to capture active context utilization") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Tested-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://lore.kernel.org/r/20250612031925.4009701-1-matthew.brost@intel.com (cherry picked from commit 3a1edef8f4b58b0ba826bc68bf4bce4bdf59ecf3) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-01drm/xe: Fix out-of-bounds field write in MI_STORE_DATA_IMMJia Yao
According to Bspec, bits 0~9 of MI_STORE_DATA_IMM must not exceed 0x3FE. The macro MI_SDI_NUM_QW(x) evaluates to 2 * x + 1, which means the condition 2 * x + 1 <= 0x3FE must be satisfied. Therefore, the maximum valid value for x is 0x1FE, not 0x1FF. v2 - Replace 0x1fe with macro MAX_PTE_PER_SDI (Auld, Matthew & Patelczyk, Maciej) v3 - Change macro MAX_PTE_PER_SDI from 0x1fe to 0x1feU (De Marchi, Lucas) Bspec: 60246 Fixes: 9c44fd5f6e8a ("drm/xe: Add migrate layer functions for SVM support") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Brian3 Nguyen <brian3.nguyen@intel.com> Cc: Alex Zuo <alex.zuo@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Suggested-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Jia Yao <jia.yao@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Link: https://lore.kernel.org/r/20250612224620.161105-1-jia.yao@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit c038bdba98c9f6a36378044a9d4385531a194d3e) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-01drm/xe: Consolidate LRC offset calculationsTvrtko Ursulin
Attempt to consolidate the LRC offsets calculations by aligning the recently added wa_bb_offset with the naming scheme in the file and also change the size stored in struct xe_lrc to not include the ring buffer. The former makes it somewhat visually easier to follow the layout of the various logical blocks stored in the LRC bo, while the latter reduces the number of sprinkled around calculations. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250630124711.8209-2-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-01drm/xe: Fix typo in KconfigMaarten Lankhorst
doubut -> doubt. Signed-off-by: Maarten Lankhorst <dev@lankhorst.se> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250627074119.347826-1-dev@lankhorst.se Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-06-28drm/xe: Allow dropping kunit dependency as built-inHarry Austen
Fix Kconfig symbol dependency on KUNIT, which isn't actually required for XE to be built-in. However, if KUNIT is enabled, it must be built-in too. Fixes: 08987a8b6820 ("drm/xe: Fix build with KUNIT=m") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Harry Austen <hpausten@protonmail.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20250627-xe-kunit-v2-2-756fe5cd56cf@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-06-28drm/xe: Fix conflicting intel_pcode_* symbolsLucas De Marchi
If CONFIG_DRM_XE_DISPLAY is set, the xe module can only be built as module to avoid duplicate symbols from i915. The interface for pcode was added without considering that, so the build breaks if both xe and i915 are built-in. Since the intel_pcode_* functions should only be called from the display side (xe side should call the xe interface directly) and there's already a protection in Kconfig to avoid the problematic configuration, ifdef it out in case CONFIG_DRM_XE_DISPLAY is disabled. Closes: https://lore.kernel.org/r/3667a992-a24b-4e49-aab2-5ca73f2c0a56@infradead.org Fixes: d9465cc8ac2d ("drm/xe/pcode: add struct drm_device based interface") Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250627-xe-kunit-v2-1-756fe5cd56cf@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-06-27drm/xe: Drop bo->sizeMatthew Brost
bo->size is redundant because the base GEM object already has a size field with the same value. Drop bo->size and use the base GEM object’s size instead. While at it, introduce xe_bo_size() to abstract the BO size. v2: - Fix typo in kernel doc (Ashutosh) - Fix kunit (CI) - Fix line wrap (Checkpatch) v3: - Fix sriov build (CI) v4: - Fix display build (CI) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://lore.kernel.org/r/20250625144128.2827577-1-matthew.brost@intel.com
2025-06-27drm/xe/guc: Enable the Dynamic Inhibit Context Switch optimizationDaniele Ceraolo Spurio
The Dynamic Inhibit Context Switch is an optimization aimed at reducing the amount of time the HW is stuck waiting on an unsatisfied semaphore. When this optimization is enabled, the GuC will dynamically modify the CTX_CTRL_INHIBIT_SYN_CTX_SWITCH in the CTX_CONTEXT_CONTROL register of LRCs to enable immediate switching out on an unsatisfied semaphore wait when multiple contexts are competing for time on the same engine. This feature is available on recent HW from GuC 70.40.1 onwards and it is enabled via a per-VF feature opt-in. v2: rebase v3: switch to using guc_buf_cache instead of dedicated alloc v4: add helper to check for feature availability (Michal), don't enable if multi-lrc is possible. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250625205405.1653212-4-daniele.ceraolospurio@intel.com
2025-06-27drm/xe/guc: Enable extended CAT error reportingDaniele Ceraolo Spurio
On newer HW (Xe2 onwards + PVC) it is possible to get extra information when a CAT error occurs, specifically a dword reporting the error type. To enable this extra reporting, we need to opt-in with the GuC, which is done via a specific per-VF feature opt-in H2G. On platforms where the HW does not support the extra reporting, the GuC will set the type to 0xdeadbeef, so we can keep the code simple and opt-in to the feature on every platform and then just discard the data if it is invalid. Note that on native/PF we're guaranteed that the opt in is available because we don't support any GuC old enough to not have it, but if we're a VF we might be running on a non-XE PF with an older GuC, so we need to handle that case. We can re-use the invalid type above to handle this scenario the same way as if the feature was not supported in HW. Given that this patch is the first user of the guc_buf_cache on native and VF, it also extends that feature to non-PF use-cases. v2: simpler print for the error type (John), rebase v3: use guc_buf_cache instead of new alloc, simpler doc (Michal) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> #v1 Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250625205405.1653212-3-daniele.ceraolospurio@intel.com
2025-06-27drm/i915/flipq: Provide the nuts and bolts code for flip queueVille Syrjälä
Provide the lower level code for PIPEDMC based flip queue. We'll use the so called semi-full flip queue mode where the PIPEDMC will start the provided DSB on a scanline a little ahead of the vblank. We need to program the triggering scanline early enough so that the DSB has enough time to complete writing all the double buffered registers before they get latched (at start of vblank). The firmware implements several queues: - 3 "plane queues" which execute a single DSB per entry - 1 "general queue" which can apparently execute 2 DSBs per entry - 1 vestigial "fast queue" that replaced the "simple flip queue" on ADL+, but this isn't supposed to be used due to issues. But we only need a single plane queue really, and we won't actually use it as a real queue because we don't allow queueing multiple commits ahead of time. So the whole thing is perhaps useless. I suppose there migth be some power saving benefits if we would get the flip scheduled by userspace early and then could keep some hardware powered off a bit longer until the DMC kicks off the flipq programming. But that is pure speculation at this time and needs to be proven. The code to hook up the flip queue into the actual atomic commit path will follow later. TODO: need to think how to do the "wait for DMC firmware load" nicely need to think about VRR and PSR etc. v2: Don't write DMC_FQ_W2_PTS_CFG_SEL on pre-lnl Don't oops at flipq init if there is no dmc v3: Adapt to PTL+ flipq changes (different queue entry layout, different trigger event, need VRR TG) Use the actual CDCLK frequency Ask the DSB code how long things are expected to take v3: Adjust the cdclk rounding (docs are 100% vague, Windows rounds like this) Initialize some undocumented magic DMC variables on PTL v4: Use PIPEDMC_FQ_STATUS for busy check (the busy bit in PIPEDMC_FQ_CTRL is apparently gone on LNL+) Based the preempt timeout on the max exec time Preempt before disabling the flip queue Order the PIPEDMC_SCANLINECMP* writes a bit more carefully Fix some typos v5: Try to deal with some clang-20 div-by-zero false positive (Nathan) Add some docs (Jani) Cc: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> epr Link: https://patchwork.freedesktop.org/patch/msgid/20250624170049.27284-5-ville.syrjala@linux.intel.com
2025-06-27drm/i915/display: Add drm_panic support for Y-tiling with DPTJocelyn Falempe
On Alder Lake and later, it's not possible to disable tiling when DPT is enabled. So this commit implements Y-Tiling support, to still be able to draw the panic screen. Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250624091501.257661-10-jfalempe@redhat.com Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-27drm/i915: Add intel_bo_panic_setup() and intel_bo_panic_finish()Jocelyn Falempe
Implement both functions for i915 and xe, they prepare the work for drm_panic support. They both use kmap_try_from_panic(), and map one page at a time, to write the panic screen on the framebuffer. Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250624091501.257661-8-jfalempe@redhat.com Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-27drm/i915: Add intel_bo_alloc_framebuffer()Jocelyn Falempe
Encapsulate the struct intel_framebuffer into an xe_framebuffer or i915_framebuffer, and allow to add specific fields for each variant for the panic use-case. This is particularly needed to have a struct xe_res_cursor available to support drm panic on discrete GPU. Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250624091501.257661-7-jfalempe@redhat.com Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-27drm/i915/fbdev: Add intel_fbdev_get_map()Jocelyn Falempe
The vaddr of the fbdev framebuffer is private to the struct intel_fbdev, so this function is needed to access it for drm_panic. Also the struct i915_vma is different between i915 and xe, so it requires a few functions to access fbdev->vma->iomap. Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250624091501.257661-3-jfalempe@redhat.com Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26drm/xe: Fix out-of-bounds field write in MI_STORE_DATA_IMMJia Yao
According to Bspec, bits 0~9 of MI_STORE_DATA_IMM must not exceed 0x3FE. The macro MI_SDI_NUM_QW(x) evaluates to 2 * x + 1, which means the condition 2 * x + 1 <= 0x3FE must be satisfied. Therefore, the maximum valid value for x is 0x1FE, not 0x1FF. v2 - Replace 0x1fe with macro MAX_PTE_PER_SDI (Auld, Matthew & Patelczyk, Maciej) v3 - Change macro MAX_PTE_PER_SDI from 0x1fe to 0x1feU (De Marchi, Lucas) Bspec: 60246 Fixes: 9c44fd5f6e8a ("drm/xe: Add migrate layer functions for SVM support") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Brian3 Nguyen <brian3.nguyen@intel.com> Cc: Alex Zuo <alex.zuo@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Suggested-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Jia Yao <jia.yao@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Link: https://lore.kernel.org/r/20250612224620.161105-1-jia.yao@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-06-26drm/xe: Rename xe_uc_init_hw to xe_uc_load_hwMaarten Lankhorst
It feels to me like load is closer to the intention than init_hw. It makes the init calls slightly less confusing to me. :) Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250619104858.418440-24-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26drm/xe: Remove xe_uc_fini_hwMaarten Lankhorst
xe_uc_init_hw() is called multiple times from xe_gt.c, and that makes the name xe_uc_fini_hw(), called for a different reason in xe_guc.c confusing. Remove it and inline the xe_uc_sanitize_reset into xe_guc.c directly. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250619104858.418440-23-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26drm/xe: Remove xe_uc_init_hwconfig()Maarten Lankhorst
This function is called immediately after xe_uc_init(), so just put it there instead. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250619104858.418440-22-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26drm/xe: Move xe_ttm_sys_mgr_init() downwards.Maarten Lankhorst
Now that all previous allocations are gone, ensure no new allocations will ever be done before xe_display_init_early(), by moving the call that allows allocations downwards. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250619104858.418440-21-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26drm/xe: Split init of xe_gt_init_hwconfig to xe_gt_init and *_earlyMaarten Lankhorst
Now that we added the separate step of initialising GUC in xe_gt_init_early, it should be ok to initialise the minimum during early init, and the rest after allocations are allowed. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250619104858.418440-20-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26drm/xe: Rename gt_init sub-functionsMaarten Lankhorst
s/gt_fw_domain_init/gt_init_with_gt_forcewake()/ s/all_fw_domain_init/gt_init_with_all_forcewake()/ Clarify that the functions are the part of gt_init() that are called with the respective power domains held. all_domain() of course only works after discovering and initialisation of force_wake on all engines, that's why the split is needed in the first place. Suggested-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250619104858.418440-19-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26drm/xe: Only dump PAT when xe_hw_engines_init_early failsMaarten Lankhorst
After discussion with Lucas De Marchi, it turns out that is the specific caller requiring a dump. This allows us to cleanup gt_init in a bit. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250619104858.418440-18-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26drm/xe: Make it possible to read instance0 MCR registers after ↵Maarten Lankhorst
xe_gt_mcr_init_early After mcr_init_early, we need to be able to do VRAM and CCS probing without hwconfig probe. Fortunately the relevant registers are all instance 0, which fortunately means no dependencies on further initialization is required. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250619104858.418440-17-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26drm/xe: Simplify GuC early initializationMaarten Lankhorst
Add a 2-stage GuC init. An early one for everything that is needed for VF, and a full one that loads GuC and is allowed to do allocations. Link: https://lore.kernel.org/r/20250619104858.418440-16-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-06-26drm/xe/sriov: Move VF bootstrap and query_config to vf_guc_initMaarten Lankhorst
We want to split up GUC init to an alloc and noalloc part to keep the init path the same for VF and !VF as much as possible. Everything in vf_guc_init should be done as early as possible, otherwise VRAM probing becomes impossible. Also move xe_gt_mmio_init to the end of xe_gt_init_early(), cleaning up the init in xe_device slightly. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250619104858.418440-15-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26drm/xe: Defer memirq init until neededMaarten Lankhorst
memirqs require allocations into GGTT, which we cannot use until after display is enabled. Now that the initialisation of interrupts is postponed, move memirq init too. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Ilia Levi <ilia.levi@intel.com> Link: https://lore.kernel.org/r/20250619104858.418440-14-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26drm/xe: Implement and use the drm_pagemap populate_mm opThomas Hellström
Add runtime PM since we might call populate_mm on a foreign device. v3: - Fix a kerneldoc failure (Matt Brost) - Revert the bo type change from device to kernel (Matt Brost) v4: - Add an assert in xe_svm_alloc_vram (Matt Brost) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250619134035.170086-4-thomas.hellstrom@linux.intel.com
2025-06-26drm/gpusvm, drm/pagemap: Move migration functionality to drm_pagemapMatthew Brost
The migration functionality and track-keeping of per-pagemap VRAM mapped to the CPU mm is not per GPU_vm, but rather per pagemap. This is also reflected by the functions not needing the drm_gpusvm structures. So move to drm_pagemap. With this, drm_gpusvm shouldn't really access the page zone-device-data since its meaning is internal to drm_pagemap. Currently it's used to reject mapping ranges backed by multiple drm_pagemap allocations. For now, make the zone-device-data a void pointer. Alter the interface of drm_gpusvm_migrate_to_devmem() to ensure we don't pass a gpusvm pointer. Rename CONFIG_DRM_XE_DEVMEM_MIRROR to CONFIG_DRM_XE_PAGEMAP. Matt is listed as author of this commit since he wrote most of the code, and it makes sense to retain his git authorship. Thomas mostly moved the code around. v3: - Kerneldoc fixes (CI) - Don't update documentation about how the drm_pagemap migration should be interpreted until upcoming patches where the functionality is implemented. (Matt Brost) v4: - More kerneldoc fixes around timeslice_ms (Himal Ghimiray, Matt Brost) v6: - Fix an uninitialized pagemap pointer (CI) Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250619134035.170086-2-thomas.hellstrom@linux.intel.com
2025-06-26drm/ttm, drm_xe, Implement ttm_lru_walk_for_evict() using the guarded LRU ↵Thomas Hellström
iteration To avoid duplicating the tricky bo locking implementation, Implement ttm_lru_walk_for_evict() using the guarded bo LRU iteration. To facilitate this, support ticketlocking from the guarded bo LRU iteration. v2: - Clean up some static function interfaces (Christian König) - Fix Handling -EALREADY from ticketlocking in the loop by skipping to the next item. (Intel CI) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250623155313.4901-4-thomas.hellstrom@linux.intel.com
2025-06-26drm/ttm, drm/xe: Modify the struct ttm_bo_lru_walk_cursor initializationThomas Hellström
Instead of the struct ttm_operation_ctx, Pass a struct ttm_lru_walk_arg to enable us to easily extend the walk functionality, and to implement ttm_lru_walk_for_evict() using the guarded LRU iteration. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250623155313.4901-3-thomas.hellstrom@linux.intel.com
2025-06-26drm/xe: Process deferred GGTT node removals on device unwindMichal Wajdeczko
While we are indirectly draining our dedicated workqueue ggtt->wq that we use to complete asynchronous removal of some GGTT nodes, this happends as part of the managed-drm unwinding (ggtt_fini_early), which could be later then manage-device unwinding, where we could already unmap our MMIO/GMS mapping (mmio_fini). This was recently observed during unsuccessful VF initialization: [ ] xe 0000:00:02.1: probe with driver xe failed with error -62 [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747340 __xe_bo_unpin_map_no_vm (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747540 __xe_bo_unpin_map_no_vm (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747240 __xe_bo_unpin_map_no_vm (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747040 tiles_fini (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e746840 mmio_fini (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747f40 xe_bo_pinned_fini (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e746b40 devm_drm_dev_init_release (16 bytes) [ ] xe 0000:00:02.1: [drm:drm_managed_release] drmres release begin [ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef81640 __fini_relay (8 bytes) [ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80d40 guc_ct_fini (8 bytes) [ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80040 __drmm_mutex_release (8 bytes) [ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80140 ggtt_fini_early (8 bytes) and this was leading to: [ ] BUG: unable to handle page fault for address: ffffc900058162a0 [ ] #PF: supervisor write access in kernel mode [ ] #PF: error_code(0x0002) - not-present page [ ] Oops: Oops: 0002 [#1] SMP NOPTI [ ] Tainted: [W]=WARN [ ] Workqueue: xe-ggtt-wq ggtt_node_remove_work_func [xe] [ ] RIP: 0010:xe_ggtt_set_pte+0x6d/0x350 [xe] [ ] Call Trace: [ ] <TASK> [ ] xe_ggtt_clear+0xb0/0x270 [xe] [ ] ggtt_node_remove+0xbb/0x120 [xe] [ ] ggtt_node_remove_work_func+0x30/0x50 [xe] [ ] process_one_work+0x22b/0x6f0 [ ] worker_thread+0x1e8/0x3d Add managed-device action that will explicitly drain the workqueue with all pending node removals prior to releasing MMIO/GSM mapping. Fixes: 919bb54e989c ("drm/xe: Fix missing runtime outer protection for ggtt_remove_node") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250612220937.857-2-michal.wajdeczko@intel.com (cherry picked from commit 89d2835c3680ab1938e22ad81b1c9f8c686bd391) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-26drm/xe/guc: Explicitly exit CT safe mode on unwindMichal Wajdeczko
During driver probe we might be briefly using CT safe mode, which is based on a delayed work, but usually we are able to stop this once we have IRQ fully operational. However, if we abort the probe quite early then during unwind we might try to destroy the workqueue while there is still a pending delayed work that attempts to restart itself which triggers a WARN. This was recently observed during unsuccessful VF initialization: [ ] xe 0000:00:02.1: probe with driver xe failed with error -62 [ ] ------------[ cut here ]------------ [ ] workqueue: cannot queue safe_mode_worker_func [xe] on wq xe-g2h-wq [ ] WARNING: CPU: 9 PID: 0 at kernel/workqueue.c:2257 __queue_work+0x287/0x710 [ ] RIP: 0010:__queue_work+0x287/0x710 [ ] Call Trace: [ ] delayed_work_timer_fn+0x19/0x30 [ ] call_timer_fn+0xa1/0x2a0 Exit the CT safe mode on unwind to avoid that warning. Fixes: 09b286950f29 ("drm/xe/guc: Allow CTB G2H processing without G2H IRQ") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250612220937.857-3-michal.wajdeczko@intel.com (cherry picked from commit 2ddbb73ec20b98e70a5200cb85deade22ccea2ec) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-26drm/xe: move DPT l2 flush to a more sensible placeMatthew Auld
Only need the flush for DPT host updates here. Normal GGTT updates don't need special flush. Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: stable@vger.kernel.org # v6.12+ Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250606104546.1996818-4-matthew.auld@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 35db1da40c8cfd7511dc42f342a133601eb45449) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-26drm/xe: Move DSB l2 flush to a more sensible placeMaarten Lankhorst
Flushing l2 is only needed after all data has been written. Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340") Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: stable@vger.kernel.org # v6.12+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/20250606104546.1996818-3-matthew.auld@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 0dd2dd0182bc444a62652e89d08c7f0e4fde15ba) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-25drm/xe: Do not wedge device on killed exec queuesMatthew Brost
When a user closes an exec queue or interrupts an app with Ctrl-C, this does not warrant wedging the device in mode 2. Avoid this by skipping the wedge check for killed exec queues in the TDR and LR exec queue cleanup worker. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250624174103.2707941-1-matthew.brost@intel.com