summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/gt/intel_ggtt.c
AgeCommit message (Collapse)Author
2024-02-07drm/i915: Rename the DSM/GSM registersVille Syrjälä
0x108100 and 0x1080c0 have been around since snb. Rename the defines appropriately. v2: Rebase Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-7-ville.syrjala@linux.intel.com
2024-02-07drm/i915: Bypass LMEMBAR/GTTMMADR for MTL stolen memory accessVille Syrjälä
On MTL accessing stolen memory via the BARs is somehow borked, and it can hang the machine. As a workaround let's bypass the BARs and just go straight to DSMBASE/GSMBASE instead. Note that on every other platform this itself would hang the machine, but on MTL the system firmware is expected to relax the access permission guarding stolen memory to enable this workaround, and thus direct CPU accesses should be fine. The raw stolen memory areas won't be passed to VMs so we'll need to risk using the BAR there for the initial setup. Once command submission is up we should switch to MI_UPDATE_GTT which at least shouldn't hang the whole machine. v2: Don't use direct GSM/DSM access on guests Add w/a number v3: Check register 0x138914 to see if pcode did its job Add some debug prints Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-5-ville.syrjala@linux.intel.com
2023-11-20drm/i915: Track gt pm wakerefsAndrzej Hajda
Track every intel_gt_pm_get() until its corresponding release in intel_gt_pm_put() by returning a cookie to the caller for acquire that must be passed by on released. When there is an imbalance, we can see who either tried to free a stale wakeref, or who forgot to free theirs. v2: track recently added calls in gen8_ggtt_bind_get_ce and destroyed_worker_func Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231030-ref_tracker_i915-v1-2-006fe6b96421@intel.com
2023-10-27drm/i915: Flush WC GGTT only on required platformsNirmoy Das
gen8_ggtt_invalidate() is only needed for limited set of platforms where GGTT is mapped as WC. This was added as way to fix WC based GGTT in commit 0f9b91c754b7 ("drm/i915: flush system agent TLBs on SNB") and there are no reference in HW docs that forces us to use this on non-WC backed GGTT. This can also cause unwanted side-effects on XE_HP platforms where GFX_FLSH_CNTL_GEN6 is not valid anymore. v2: Add a func to detect wc ggtt detection (Ville) v3: Improve commit log and add reference commit (Daniel) Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: John Harrison <john.c.harrison@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: <stable@vger.kernel.org> # v6.2+ Suggested-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231018093815.1349-1-nirmoy.das@intel.com
2023-10-27drm/i915/gt: Remove {} from if-elseSoumya Negi
In accordance to Linux coding style(Documentation/process/4.Coding.rst), remove unneeded braces from if-else block as all arms of this block contain single statements. Suggested-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Soumya Negi <soumya.negi97@gmail.com> Acked-by: Karolina Stolarek <karolina.stolarek@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231026044309.17213-1-soumya.negi97@gmail.com
2023-10-18drm/i915: Define and use GuC and CTB TLB invalidation routinesPrathap Kumar Valsan
The GuC firmware had defined the interface for Translation Look-Aside Buffer (TLB) invalidation. We should use this interface when invalidating the engine and GuC TLBs. Add additional functionality to intel_gt_invalidate_tlb, invalidating the GuC TLBs and falling back to GT invalidation when the GuC is disabled. The invalidation is done by sending a request directly to the GuC tlb_lookup that invalidates the table. The invalidation is submitted as a wait request and is performed in the CT event handler. This means we cannot perform this TLB invalidation path if the CT is not enabled. If the request isn't fulfilled in two seconds, this would constitute an error in the invalidation as that would constitute either a lost request or a severe GuC overload. With this new invalidation routine, we can perform GuC-based GGTT invalidations. GuC-based GGTT invalidation is incompatible with MMIO invalidation so we should not perform MMIO invalidation when GuC-based GGTT invalidation is expected. The additional complexity incurred in this patch will be necessary for range-based tlb invalidations, which will be platformed in the future. Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com> Signed-off-by: Bruce Chang <yu.bruce.chang@intel.com> Signed-off-by: Chris Wilson <chris.p.wilson@intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Fei Yang <fei.yang@intel.com> CC: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231017180806.3054290-4-jonathan.cavitt@intel.com
2023-09-30drm/i915: Implement GGTT update method with MI_UPDATE_GTTNirmoy Das
Implement GGTT update method with blitter command, MI_UPDATE_GTT and install those handlers if a platform requires that. v2: Make sure we hold the GT wakeref and Blitter engine wakeref before we call mutex_lock/intel_context_enter below. When GT/engine are not awake, the intel_context_enter calls into some runtime pm function which can end up with kmalloc/fs_reclaim. But trigger fs_reclaim holding a mutex lock is not allowed because shrinker can also try to hold the same mutex lock. It is a circular lock. So hold the GT/blitter engine wakeref before calling mutex_lock, to fix the circular lock. Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Oak Zeng <oak.zeng@intel.com> Acked-by: Oak Zeng <oak.zeng@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230926083742.14740-6-nirmoy.das@intel.com
2023-09-21drm/i915/gt: Fix reservation address in ggtt_reserve_guc_topJavier Pello
There is an assertion in ggtt_reserve_guc_top that the global GTT is of size at least GUC_GGTT_TOP, which is not the case on a 32-bit platform; see commit 562d55d991b39ce376c492df2f7890fd6a541ffc ("drm/i915/bdw: Only use 2g GGTT for 32b platforms"). If GEM_BUG_ON is enabled, this triggers a BUG(); if GEM_BUG_ON is disabled, the subsequent reservation fails and the driver fails to initialise the device: i915 0000:00:02.0: [drm:i915_init_ggtt [i915]] Failed to reserve top of GGTT for GuC i915 0000:00:02.0: Device initialization failed (-28) i915 0000:00:02.0: Please file a bug on drm/i915; see https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs for details. i915: probe of 0000:00:02.0 failed with error -28 Make the reservation at the top of the available space, whatever that is, instead of assuming that the top will be GUC_GGTT_TOP. Fixes: 911800765ef6 ("drm/i915/uc: Reserve upper range of GGTT") Link: https://gitlab.freedesktop.org/drm/intel/-/issues/9080 Signed-off-by: Javier Pello <devel@otheo.eu> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Fernando Pacheco <fernando.pacheco@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: intel-gfx@lists.freedesktop.org Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230902171039.2229126186d697dbcf62d6d8@otheo.eu
2023-06-05drm/i915/uc: perma-pin firmwaresDaniele Ceraolo Spurio
Now that each FW has its own reserved area, we can keep them always pinned and skip the pin/unpin dance on reset. This will make things easier for the 2-step HuC authentication, which requires the FW to be pinned in GGTT after the xfer is completed. Since the vma is now valid for a long time and not just for the quick pin-load-unpin dance, the name "dummy" is no longer appropriare and has been replaced with vma_res. All the functions have also been updated to operate on vma_res for consistency. Given that we pin the vma behind the allocator's back (which is ok because we do the pinning in an area that was previously reserved for thus purpose), we do need to explicitly re-pin on resume because the automated helper won't cover us. v2: better comments and commit message, s/dummy/vma_res/ Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230531235415.1467475-2-daniele.ceraolospurio@intel.com
2023-06-02drm/i915/gt: Fix second parameter type of pre-gen8 pte_encode callbacksNathan Chancellor
When booting a kernel compiled with CONFIG_CFI_CLANG (kCFI), there is a CFI failure in ggtt_probe_common() when trying to call hsw_pte_encode() via an indirect call: [ 5.030027] CFI failure at ggtt_probe_common+0xd1/0x130 [i915] (target: hsw_pte_encode+0x0/0x30 [i915]; expected type: 0xf5c1d0fc) With kCFI, indirect calls are validated against their expected type versus actual type and failures occur when the two types do not match. clang's -Wincompatible-function-pointer-types-strict can catch this at compile time but it is not enabled for the kernel yet: drivers/gpu/drm/i915/gt/intel_ggtt.c:1155:23: error: incompatible function pointer types assigning to 'u64 (*)(dma_addr_t, unsigned int, u32)' (aka 'unsigned long long (*)(unsigned int, unsigned int, unsigned int)') from 'u64 (dma_addr_t, enum i915_cache_level, u32)' (aka 'unsigned long long (unsigned int, enum i915_cache_level, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict] ggtt->vm.pte_encode = iris_pte_encode; ^ ~~~~~~~~~~~~~~~ drivers/gpu/drm/i915/gt/intel_ggtt.c:1157:23: error: incompatible function pointer types assigning to 'u64 (*)(dma_addr_t, unsigned int, u32)' (aka 'unsigned long long (*)(unsigned int, unsigned int, unsigned int)') from 'u64 (dma_addr_t, enum i915_cache_level, u32)' (aka 'unsigned long long (unsigned int, enum i915_cache_level, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict] ggtt->vm.pte_encode = hsw_pte_encode; ^ ~~~~~~~~~~~~~~ drivers/gpu/drm/i915/gt/intel_ggtt.c:1159:23: error: incompatible function pointer types assigning to 'u64 (*)(dma_addr_t, unsigned int, u32)' (aka 'unsigned long long (*)(unsigned int, unsigned int, unsigned int)') from 'u64 (dma_addr_t, enum i915_cache_level, u32)' (aka 'unsigned long long (unsigned int, enum i915_cache_level, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict] ggtt->vm.pte_encode = byt_pte_encode; ^ ~~~~~~~~~~~~~~ drivers/gpu/drm/i915/gt/intel_ggtt.c:1161:23: error: incompatible function pointer types assigning to 'u64 (*)(dma_addr_t, unsigned int, u32)' (aka 'unsigned long long (*)(unsigned int, unsigned int, unsigned int)') from 'u64 (dma_addr_t, enum i915_cache_level, u32)' (aka 'unsigned long long (unsigned int, enum i915_cache_level, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict] ggtt->vm.pte_encode = ivb_pte_encode; ^ ~~~~~~~~~~~~~~ drivers/gpu/drm/i915/gt/intel_ggtt.c:1163:23: error: incompatible function pointer types assigning to 'u64 (*)(dma_addr_t, unsigned int, u32)' (aka 'unsigned long long (*)(unsigned int, unsigned int, unsigned int)') from 'u64 (dma_addr_t, enum i915_cache_level, u32)' (aka 'unsigned long long (unsigned int, enum i915_cache_level, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict] ggtt->vm.pte_encode = snb_pte_encode; ^ ~~~~~~~~~~~~~~ 5 errors generated. In this case, the pre-gen8 pte_encode functions have a second parameter type of 'enum i915_cache_level' whereas the function pointer prototype in 'struct i915_address_space' expects a second parameter type of 'unsigned int'. Update the second parameter of the callbacks and the comment above them noting that these statements are still valid, which matches other functions and files, to clear up the kCFI failures at run time. Fixes: 9275277d5324 ("drm/i915: use pat_index instead of cache_level") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Fei Yang <fei.yang@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230530-i915-gt-cache_level-wincompatible-function-pointer-types-strict-v1-1-54501d598229@kernel.org
2023-05-11drm/i915: use pat_index instead of cache_levelFei Yang
Currently the KMD is using enum i915_cache_level to set caching policy for buffer objects. This is flaky because the PAT index which really controls the caching behavior in PTE has far more levels than what's defined in the enum. In addition, the PAT index is platform dependent, having to translate between i915_cache_level and PAT index is not reliable, and makes the code more complicated. From UMD's perspective there is also a necessity to set caching policy for performance fine tuning. It's much easier for the UMD to directly use PAT index because the behavior of each PAT index is clearly defined in Bspec. Having the abstracted i915_cache_level sitting in between would only cause more ambiguity. PAT is expected to work much like MOCS already works today, and by design userspace is expected to select the index that exactly matches the desired behavior described in the hardware specification. For these reasons this patch replaces i915_cache_level with PAT index. Also note, the cache_level is not completely removed yet, because the KMD still has the need of creating buffer objects with simple cache settings such as cached, uncached, or writethrough. For kernel objects, cache_level is used for simplicity and backward compatibility. For Pre-gen12 platforms PAT can have 1:1 mapping to i915_cache_level, so these two are interchangeable. see the use of LEGACY_CACHELEVEL. One consequence of this change is that gen8_pte_encode is no longer working for gen12 platforms due to the fact that gen12 platforms has different PAT definitions. In the meantime the mtl_pte_encode introduced specfically for MTL becomes generic for all gen12 platforms. This patch renames the MTL PTE encode function into gen12_pte_encode and apply it to all gen12. Even though this change looks unrelated, but separating them would temporarily break gen12 PTE encoding, thus squash them in one patch. Special note: this patch changes the way caching behavior is controlled in the sense that some objects are left to be managed by userspace. For such objects we need to be careful not to change the userspace settings.There are kerneldoc and comments added around obj->cache_coherent, cache_dirty, and how to bypass the checkings by i915_gem_object_has_cache_level. For full understanding, these changes need to be looked at together with the two follow-up patches, one disables the {set|get}_caching ioctl's and the other adds set_pat extension to the GEM_CREATE uAPI. Bspec: 63019 Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230509165200.1740-3-fei.yang@intel.com
2023-05-11drm/i915: preparation for using PAT indexFei Yang
This patch is a preparation for replacing enum i915_cache_level with PAT index. Caching policy for buffer objects is set through the PAT index in PTE, the old i915_cache_level is not sufficient to represent all caching modes supported by the hardware. Preparing the transition by adding some platform dependent data structures and helper functions to translate the cache_level to pat_index. cachelevel_to_pat: a platform dependent array mapping cache_level to pat_index. max_pat_index: the maximum PAT index recommended in hardware specification Needed for validating the PAT index passed in from user space. i915_gem_get_pat_index: function to convert cache_level to PAT index. obj_to_i915(obj): macro moved to header file for wider usage. I915_MAX_CACHE_LEVEL: upper bound of i915_cache_level for the convenience of coding. Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230509165200.1740-2-fei.yang@intel.com
2023-04-25drm/i915/mtl: Add PTE encode functionFei Yang
PTE encode functions are platform dependent. This patch implements PTE functions for MTL, and ensures the correct PTE encode function is used by calling pte_encode function pointer instead of the hardcoded gen8 version of PTE encode. Fixes: b76c0deef627 ("drm/i915/mtl: Define MOCS and PAT tables for MTL") Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230424182902.3663500-2-fei.yang@intel.com
2023-04-06Merge tag 'drm-intel-gt-next-2023-04-06' of ↵Daniel Vetter
git://anongit.freedesktop.org/drm/drm-intel into drm-next UAPI Changes: - (Build-time only, should not have any impact) drm/i915/uapi: Replace fake flex-array with flexible-array member "Zero-length arrays as fake flexible arrays are deprecated and we are moving towards adopting C99 flexible-array members instead." This is on core kernel request moving towards GCC 13. Driver Changes: - Fix context runtime accounting on sysfs fdinfo for heavy workloads (Tvrtko) - Add support for OA media units on MTL (Umesh) - Add new workarounds for Meteorlake (Daniele, Radhakrishna, Haridhar) - Fix sysfs to read actual frequency for MTL and Gen6 and earlier (Ashutosh) - Synchronize i915/BIOS on C6 enabling on MTL (Vinay) - Fix DMAR error noise due to GPU error capture (Andrej) - Fix forcewake during BAR resize on discrete (Andrzej) - Flush lmem contents after construction on discrete (Chris) - Fix GuC loading timeout on systems where IFWI programs low boot frequency (John) - Fix race condition UAF in i915_perf_add_config_ioctl (Min) - Sanitycheck MMIO access early in driver load and during forcewake (Matt) - Wakeref fixes for GuC RC error scenario and active VM tracking (Chris) - Cancel HuC delayed load timer on reset (Daniele) - Limit double GT reset to pre-MTL (Daniele) - Use i915 instead of dev_priv insied the file_priv structure (Andi) - Improve GuC load error reporting (John) - Simplify VCS/BSD engine selection logic (Tvrtko) - Perform uc late init after probe error injection (Andrzej) - Fix format for perf_limit_reasons in debugfs (Vinay) - Create per-gt debugfs files (Andi) - Documentation and kerneldoc fixes (Nirmoy, Lee) - Selftest improvements (Fei, Jonathan) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZC6APj/feB+jBf2d@jlahtine-mobl.ger.corp.intel.com
2023-03-16drm/i915: add guard page to ggtt->error_captureAndrzej Hajda
Write-combining memory allows speculative reads by CPU. ggtt->error_capture is WC mapped to CPU, so CPU/MMU can try to prefetch memory beyond the error_capture, ie it tries to read memory pointed by next PTE in GGTT. If this PTE points to invalid address DMAR errors will occur. This behaviour was observed on ADL and RPL platforms. To avoid it, guard scratch page should be added after error_capture. The patch fixes the most annoying issue with error capture but since WC reads are used also in other places there is a risk similar problem can affect them as well. v2: - modified commit message (I hope the diagnosis is correct), - added bug checks to ensure scratch is initialized on gen3 platforms. CI produces strange stacktrace for it suggesting scratch[0] is NULL, to be removed after resolving the issue with gen3 platforms. v3: - removed bug checks, replaced with gen check. v4: - change code for scratch page insertion to support all platforms, - add info in commit message there could be more similar issues v5: - check for nop_clear_range instead of gen8 (Tvrtko), - re-insert scratch pages on resume (Tvrtko) v6: - use scratch_range callback to set scratch pages (Chris) Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230308-guard_error_capture-v6-2-1b5f31422563@intel.com
2023-03-16drm/i915/gt: introduce vm->scratch_range callbackAndrzej Hajda
The callback will be responsible for setting scratch page PTEs for specified range. In contrast to clear_range it cannot be optimized to nop. It will be used by code adding guard pages. Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230308-guard_error_capture-v6-1-1b5f31422563@intel.com
2023-01-25drm/i915/display: add intel_display_limits.h for key enumsJani Nikula
Move a handful of key enums to a new file intel_display_limits.h. These are the enum types, and the MAX/NUM enumerations within them, that are used in other headers. Otherwise, there's no common theme between them. Replace intel_display.h include with intel_display_limit.h where relevant, and add the intel_display.h include directly in the .c files where needed. Since intel_display.h is used almost everywhere in display/, include it from intel_display_types.h to avoid massive changes across the board. There are very few files that would need intel_display_types.h but not intel_display.h so this is neglible, and further cleanup between these headers can be left for the future. Overall this change drops the direct and indirect dependencies on intel_display.h from about 300 to about 100 compilation units, because we can drop the include from i915_drv.h. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230116164644.1752009-1-jani.nikula@intel.com
2023-01-25Merge drm/drm-next into drm-intel-nextJani Nikula
Backmerge to get the EDID handling changes. Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2023-01-18drm/i915: drop cast from DEFINE_RES_MEM() usageJani Nikula
Since commit 52c4d11f1dce ("resource: Convert DEFINE_RES_NAMED() to be compound literal") it's no longer necessary to cast DEFINE_RES_MEM() to struct resource. This also fixes sparse warnings "cast from non-scalar" and "cast to non-scalar". Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230116173422.1858527-2-jani.nikula@intel.com
2022-12-06Revert "drm/i915: Improve on suspend / resume time with VT-d enabled"Andi Shyti
This reverts commit 2ef6efa79fecd5e3457b324155d35524d95f2b6b. Checking the presence if the IRST (Intel Rapid Start Technology) through the ACPI to decide whether to rebuild or not the GGTT puts us at the mercy of the boot firmware and we need to unnecessarily rely on third parties. Because now we avoid adding scratch pages to the entire GGTT we don't need this hack anymore. Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221130235805.221010-6-andi.shyti@linux.intel.com
2022-12-06drm/i915: Refine VT-d scanout workaroundChris Wilson
VT-d may cause overfetch of the scanout PTE, both before and after the vma (depending on the scanout orientation). bspec recommends that we provide a tile-row in either directions, and suggests using 168 PTE, warning that the accesses will wrap around the ends of the GGTT. Currently, we fill the entire GGTT with scratch pages when using VT-d to always ensure there are valid entries around every vma, including scanout. However, writing every PTE is slow as on recent devices we perform 8MiB of uncached writes, incurring an extra 100ms during resume. If instead we focus on only putting guard pages around scanout, we can avoid touching the whole GGTT. To avoid having to introduce extra nodes around each scanout vma, we adjust the scanout drm_mm_node to be smaller than the allocated space, and fixup the extra PTE during dma binding. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221130235805.221010-5-andi.shyti@linux.intel.com
2022-12-06drm/i915: Introduce guard pages to i915_vmaChris Wilson
Introduce the concept of padding the i915_vma with guard pages before and after. The major consequence is that all ordinary uses of i915_vma must use i915_vma_offset/i915_vma_size and not i915_vma.node.start/size directly, as the drm_mm_node will include the guard pages that surround our object. The biggest connundrum is how exactly to mix requesting a fixed address with guard pages, particularly through the existing uABI. The user does not know about guard pages, so such must be transparent to the user, and so the execobj.offset must be that of the object itself excluding the guard. So a PIN_OFFSET_FIXED must then be exclusive of the guard pages. The caveat is that some placements will be impossible with guard pages, as wrap arounds need to be avoided, and the vma itself will require a larger node. We must not report EINVAL but ENOSPC as these are unavailable locations within the GTT rather than conflicting user requirements. In the next patch, we start using guard pages for scanout objects. While these are limited to GGTT vma, on a few platforms these vma (or at least an alias of the vma) is shared with userspace, so we may leak the existence of such guards if we are not careful to ensure that the execobj.offset is transparent and excludes the guards. (On such platforms like ivb, without full-ppgtt, userspace has to use relocations so the presence of more untouchable regions within its GTT such be of no further issue.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221201203912.346110-1-andi.shyti@linux.intel.com
2022-12-05drm/i915/guc: enable GuC GGTT invalidation from the startDaniele Ceraolo Spurio
Invalidating the GuC TLBs while GuC is not loaded does not have negative consequences, so if we're starting the driver with GuC enabled we can use the GGTT invalidation function from the get-go, instead of switching to it when we initialize the GuC objects. In MTL, this fixes and issue where we try to overwrite the invalidation function twice (once for each GuC), due to the GGTT being shared between the primary and media GTs Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221110175823.3867135-1-daniele.ceraolospurio@intel.com
2022-11-28drm/i915/mtl: Media GT and Render GT share common GGTTAravind Iddamsetty
On XE_LPM+ platforms the media engines are carved out into a separate GT but have a common GGTMMADR address range which essentially makes the GGTT address space to be shared between media and render GT. As a result any updates in GGTT shall invalidate TLB of GTs sharing it and similarly any operation on GGTT requiring an action on a GT will have to involve all GTs sharing it. setup_private_pat was being done on a per GGTT based as that doesn't touch any GGTT structures moved it to per GT based. BSPEC: 63834 v2: 1. Add details to commit msg 2. includes fix for failure to add item to ggtt->gt_list, as suggested by Lucas 3. as ggtt_flush() is used only for ggtt drop i915_is_ggtt check within it. 4. setup_private_pat moved out of intel_gt_tiles_init v3: 1. Move out for_each_gt from i915_driver.c (Jani Nikula) v4: drop using RCU primitives on ggtt->gt_list as it is not an RCU list (Matt Roper) Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221122070126.4813-1-aravind.iddamsetty@intel.com
2022-11-14drm/i915/mtl: Handle wopcm per-GT and limit calculations.Aravind Iddamsetty
With MTL standalone media architecture the wopcm layout has changed, with separate partitioning in WOPCM for the root GT GuC and the media GT GuC. The size of WOPCM is 4MB with the lower 2MB reserved for the media GT and the upper 2MB for the root GT. Given that MTL has GuC deprivilege, the WOPCM registers are pre-locked by the bios. Therefore, we can skip all the math for the partitioning and just limit ourselves to sanity-checking the values. v2: fix makefile file ordering (Jani) v3: drop XELPM_SAMEDIA_WOPCM_SIZE, check huc instead of VDBOX (John) v4: further clarify commit message, remove blank line (John) Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: John Harrison <john.c.harrison@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-5-daniele.ceraolospurio@intel.com
2022-10-20drm/i915: s/HAS_BAR2_SMEM_STOLEN/HAS_LMEMBAR_SMEM_STOLEN/Ville Syrjälä
The fact that LMEMBAR is BAR2 should be of no real interest to anyone. So use the name of the BAR rather than its index. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221005154159.18750-3-ville.syrjala@linux.intel.com Acked-by: Matthew Auld <matthew.auld@intel.com>
2022-10-20drm/i915: Name our BARs based on the specVille Syrjälä
We use all kinds of weird names for our base address registers. Take the names from the spec and stick to them to avoid confusing everyone. The only exceptions are IOBAR and LMEMBAR since naming them IOBAR_BAR and LMEMBAR_BAR looks too funny, and yet I think that adding the _BAR to GTTMMADR & co. (which don't have one in the spec name) does make it more clear what they are. And IOBAR vs. GTTMMADR_BAR also looks a bit too inconsistent for my taste. v2: Fix gvt build v3: Add GEN2_IO_BAR for completeness Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221005195646.17201-1-ville.syrjala@linux.intel.com Acked-by: Matthew Auld <matthew.auld@intel.com>
2022-10-17drm/i915/xehp: Create separate reg definitions for new MCR registersMatt Roper
Starting in Xe_HP, several registers our driver works with have been converted from singleton registers into replicated registers with multicast behavior. Although the registers are still located at the same MMIO offsets as on previous platforms, let's duplicate the register definitions in preparation for upcoming patches that will handle multicast registers in a special manner. The registers that are now replicated on Xe_HP are: * PAT_INDEX (mslice replication) * FF_MODE2 (gslice replication) * COMMON_SLICE_CHICKEN3 (gslice replication) * SLICE_COMMON_ECO_CHICKEN1 (gslice replication) * SLICE_UNIT_LEVEL_CLKGATE (gslice replication) * LNCFCMOCS (lncf replication) Note that there are a couple places in selftest_mocs.c where the gen9 version of LNCFCMOCS is still used without regards for which platform we're on. Those cases are just doing an offset lookup and not issuing any CPU reads/writes of the register, so the potentially multicast nature of the register doesn't come into play. v2: - Add commit message note about the unconditional GEN9_LNCFCMOCS usage in selftest_mocs. (Bala) - Include some additional TLB registers. Bspec: 66534 Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-3-matthew.d.roper@intel.com
2022-10-10drm/i915: Fix display problems after resumeThomas Hellström
Commit 39a2bd34c933 ("drm/i915: Use the vma resource as argument for gtt binding / unbinding") introduced a regression that due to the vma resource tracking of the binding state, dpt ptes were not correctly repopulated. Fix this by clearing the vma resource state before repopulating. The state will subsequently be restored by the bind_vma operation. Fixes: 39a2bd34c933 ("drm/i915: Use the vma resource as argument for gtt binding / unbinding") Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220912121957.31310-1-thomas.hellstrom@linux.intel.com Cc: Matthew Auld <matthew.auld@intel.com> Cc: intel-gfx@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.18+ Reported-and-tested-by: Kevin Boulain <kevinboulain@gmail.com> Tested-by: David de Sousa <davidesousa@gmail.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221005121159.340245-1-thomas.hellstrom@linux.intel.com
2022-10-03Merge drm/drm-next into drm-intel-gt-nextTvrtko Ursulin
Daniele needs 84d4333c1e28 ("misc/mei: Add NULL check to component match callback functions") in order to merge the DG2 HuC patches. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-09-30drm/i915/mtl: enable local stolen memoryAravind Iddamsetty
As an integrated GPU, MTL does not have local memory and HAS_LMEM() returns false. However the platform's stolen memory is presented via BAR2 (i.e., the BAR we traditionally consider to be the GMADR on IGFX) and should be managed by the driver the same way that local memory is on dgpu platforms (which includes setting the "lmem" bit on page table entries). We use the term "local stolen memory" to refer to this model. The major difference from the traditional BAR2 (GMADR) is that the stolen area is mapped via the BAR2 while in the former BAR2 is an aperture into the GTT VA through which access are made into stolen area. BSPEC: 53098, 63830 v2: 1. dropped is_dsm_invalid, updated valid_stolen_size check from Lucas (Jani, Lucas) 2. drop lmembar_is_igpu_stolen 3. revert to referring GFXMEM_BAR as GEN12_LMEM_BAR (Lucas) v3:(Jani) 1. rename get_mtl_gms_size to mtl_get_gms_size 2. define register for MMIO address v4:(Matt) 1. Use REG_FIELD_GET to read GMS value 2. replace the calculations with SZ_256M/SZ_8M v5: Include more details to commit message on how it is different from earlier platforms (Anshuman) Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: CQ Tang <cq.tang@intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Original-author: CQ Tang Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220929114658.145287-1-aravind.iddamsetty@intel.com
2022-08-10drm/i915: Sanitycheck PCI BARsPiotr Piórkowski
For proper operation of i915 we need usable PCI GTTMMADDR BAR 0 (1 for GEN2). In most cases we also need usable PCI GFXMEM BAR 2. Let's add functions to check if BARs are set, and that it have a size greater than 0. In case GTTMMADDR BAR, let's validate at the beginning of i915 initialization. For other BARs, let's validate before first use. Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220805155959.1983584-3-piotr.piorkowski@intel.com
2022-08-10drm/i915: Use of BARs names instead of numbersPiotr Piórkowski
At the moment, when we refer to some PCI BAR we use the number of this BAR in the code. The meaning of BARs between different platforms may be different. Therefore, in order to organize the code, let's start using defined names instead of numbers. v2: Add lost header in cfg_space.c Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220805155959.1983584-2-piotr.piorkowski@intel.com
2022-06-29drm/i915: Fix a lockdep warning at error captureNirmoy Das
For some platfroms we use stop_machine version of gen8_ggtt_insert_page/gen8_ggtt_insert_entries to avoid a concurrent GGTT access bug but this causes a circular locking dependency warning: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ggtt->error_mutex); lock(dma_fence_map); lock(&ggtt->error_mutex); lock(cpu_hotplug_lock); Fix this by calling gen8_ggtt_insert_page/gen8_ggtt_insert_entries directly at error capture which is concurrent GGTT access safe because reset path make sure of that. v2: Fix rebase conflict and added a comment. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5595 Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Ramalingam C <ramalingam.c@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220624110821.29190-1-nirmoy.das@intel.com
2022-06-22drm/i915/gt: Re-do the intel-gtt splitLucas De Marchi
Re-do what was attempted in commit 7a5c922377b4 ("drm/i915/gt: Split intel-gtt functions by arch"). The goal of that commit was to split the handlers for older hardware that depend on intel-gtt.ko so i915 can be built for non-x86 archs, after some more patches. Other archs do not need intel-gtt.ko. Main issue with the previous approach: it moved all the hooks, including the gen8, which is used by all platforms gen8 and newer. Re-do the split moving only the handlers for gen < 6, which are the only ones calling out to the separate module. While at it do some minor cleanups: - Rename the prefix s/gen5_/gmch_/ to be more accurate what platforms are covered by intel_ggtt_gmch.c - Remove dead code for gen12 out of needs_idle_maps() - Remove TODO comment leftover - Re-order if/else ladder in ggtt_probe_hw() to keep newest platforms first v2: Add minor cleanups (Matt Roper) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220617230559.2109427-2-lucas.demarchi@intel.com
2022-06-20drm/i915: Improve on suspend / resume time with VT-d enabledThomas Hellström
When DMAR / VT-d is enabled, the display engine uses overfetching, presumably to deal with the increased latency. To avoid display engine errors and DMAR faults, as a workaround the GGTT is populated with scatch PTEs when VT-d is enabled. However starting with gen10, Write-combined writing of scratch PTES is no longer possible and as a result, populating the full GGTT with scratch PTEs like on resume becomes very slow as uncached access is needed. Therefore, on integrated GPUs utilize the fact that the PTEs are stored in stolen memory which retain content across S3 suspend. Don't clear the PTEs on suspend and resume. This improves on resume time with around 100 ms. While 100+ms might appear like a short time it's 10% to 20% of total resume time and important in some applications. One notable exception is Intel Rapid Start Technology which may cause stolen memory to be lost across what the OS percieves as S3 suspend. If IRST is enabled or if we can't detect whether IRST is enabled, retain the old workaround, clearing and re-instating PTEs. As an additional measure, if we detect that the last ggtt pte was lost during suspend, print a warning and re-populate the GGTT ptes On discrete GPUs, the display engine scans out from LMEM which isn't subject to DMAR, and presumably the workaround is therefore not needed, but that needs to be verified and disabling the workaround for dGPU, if possible, will be deferred to a follow-up patch. v2: - Rely on retained ptes to also speed up suspend and resume re-binding. - Re-build GGTT ptes if Intel rst is enabled. v3: - Re-build GGTT ptes also if we can't detect whether Intel rst is enabled, and if the guard page PTE and end of GGTT was lost. v4: - Fix some kerneldoc issues (Matthew Auld), rebase. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220617152856.249295-1-thomas.hellstrom@linux.intel.com
2022-04-21Merge drm/drm-next into drm-intel-gt-nextRodrigo Vivi
In order to get the GSC Support merged on drm-intel-gt-next in a clean fashion we needed this ATS-M patch to avoid conflict in i915_pci.c: commit 412c942bdfae ("drm/i915/ats-m: add ATS-M platform info") -- Fixing a silent conflict on drivers/gpu/drm/i915/gt/intel_gt_gmch.c: - if (!intel_vtd_active(i915)) + if (!i915_vtd_active(i915)) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-04-06drm/i915/gt: Split intel-gtt functions by archCasey Bowman
Some functions defined in the intel-gtt module are used in several areas, but is only supported on x86 platforms. By separating these calls and their static underlying functions to another area, we are able to compile out these functions for non-x86 builds and provide stubs for the non-x86 implementations. In addition to the problematic calls, we are moving the gmch-related functions to the new area. Signed-off-by: Casey Bowman <casey.g.bowman@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220330234809.1218210-2-casey.g.bowman@intel.com
2022-03-30drm/i915: Move intel_vtd_active and run_as_guest to i915_utilsTvrtko Ursulin
Continuation of the effort to declutter i915_drv.h. Also, component specific helpers which consult the iommu/virtualization helpers moved to respective component source/header files as appropriate. v2: * s/dev_priv/i915/ in intel_scanout_needs_vtd_wa. (Lucas) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220329090204.2324499-1-tvrtko.ursulin@linux.intel.com [tursulin: fixup conflict in i915_drv.h]
2022-03-07drm/i915: Remove the vm open countThomas Hellström
vms are not getting properly closed. Rather than fixing that, Remove the vm open count and instead rely on the vm refcount. The vm open count existed solely to break the strong references the vmas had on the vms. Now instead make those references weak and ensure vmas are destroyed when the vm is destroyed. Unfortunately if the vm destructor and the object destructor both wants to destroy a vma, that may lead to a race in that the vm destructor just unbinds the vma and leaves the actual vma destruction to the object destructor. However in order for the object destructor to ensure the vma is unbound it needs to grab the vm mutex. In order to keep the vm mutex alive until the object destructor is done with it, somewhat hackishly grab a vm_resv refcount that is released late in the vma destruction process, when the vm mutex is no longer needed. v2: Address review-comments from Niranjana - Clarify that the struct i915_address_space::skip_pte_rewrite is a hack and should ideally be replaced in an upcoming patch. - Remove an unneeded continue in clear_vm_list and update comment. v3: - Documentation update - Commit message formatting Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-2-thomas.hellstrom@linux.intel.com
2022-02-23Merge tag 'drm-intel-gt-next-2022-02-17' of ↵Rodrigo Vivi
git://anongit.freedesktop.org/drm/drm-intel into drm-intel-next UAPI Changes: - Weak parallel submission support for execlists Minimal implementation of the parallel submission support for execlists backend that was previously only implemented for GuC. Support one sibling non-virtual engine. Core Changes: - Two backmerges of drm/drm-next for header file renames/changes and i915_regs reorganization Driver Changes: - Add new DG2 subplatform: DG2-G12 (Matt R) - Add new DG2 workarounds (Matt R, Ram, Bruce) - Handle pre-programmed WOPCM registers for DG2+ (Daniele) - Update guc shim control programming on XeHP SDV+ (Daniele) - Add RPL-S C0/D0 stepping information (Anusha) - Improve GuC ADS initialization to work on ARM64 on dGFX (Lucas) - Fix KMD and GuC race on accessing PMU busyness (Umesh) - Use PM timestamp instead of RING TIMESTAMP for reference in PMU with GuC (Umesh) - Report error on invalid reset notification from GuC (John) - Avoid WARN splat by holding RPM wakelock during PXP unbind (Juston) - Fixes to parallel submission implementation (Matt B.) - Improve GuC loading status check/error reports (John) - Tweak TTM LRU priority hint selection (Matt A.) - Align the plane_vma to min_page_size of stolen mem (Ram) - Introduce vma resources and implement async unbinding (Thomas) - Use struct vma_resource instead of struct vma_snapshot (Thomas) - Return some TTM accel move errors instead of trying memcpy move (Thomas) - Fix a race between vma / object destruction and unbinding (Thomas) - Remove short-term pins from execbuf (Maarten) - Update to GuC version 69.0.3 (John, Michal Wa.) - Improvements to GT reset paths in GuC backend (Matt B.) - Use shrinker_release_pages instead of writeback in shmem object hooks (Matt A., Tvrtko) - Use trylock instead of blocking lock when freeing GEM objects (Maarten) - Allocate intel_engine_coredump_alloc with ALLOW_FAIL (Matt B.) - Fixes to object unmapping and purging (Matt A) - Check for wedged device in GuC backend (John) - Avoid lockdep splat by locking dpt_obj around set_cache_level (Maarten) - Allow dead vm to unbind vma's without lock (Maarten) - s/engine->i915/i915/ for DG2 engine workarounds (Matt R) - Use to_gt() helper for GGTT accesses (Michal Wi.) - Selftest improvements (Matt B., Thomas, Ram) - Coding style and compiler warning fixes (Matt B., Jasmine, Andi, Colin, Gustavo, Dan) From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Yg4i2aCZvvee5Eai@jlahtine-mobl.ger.corp.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Fixed conflicts while applying, using the fixups/drm-intel-gt-next.patch from drm-rerere's 1f2b1742abdd ("2022y-02m-23d-16h-07m-57s UTC: drm-tip rerere cache update")]
2022-02-02drm/i915: Move GT registers to their own header fileMatt Roper
This is a huge, chaotic mass of registers copied over as-is without any real cleanup. We'll come back and organize these better, align on consistent coding style, remove dead code, etc. in separate patches later that will be easier to review. v2: - Add missing include in intel_pxp_irq.c v3: - Correct a few indentation errors (Lucas) - Minor conflict resolution Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-6-matthew.d.roper@intel.com
2022-01-18drm/i915: Add i915_vma_unbind_unlocked, and take obj lock for ↵Maarten Lankhorst
i915_vma_unbind, v2. We want to remove more members of i915_vma, which requires the locking to be held more often. Start requiring gem object lock for i915_vma_unbind, as it's one of the callers that may unpin pages. Some special care is needed when evicting, because the last reference to the object may be held by the VMA, so after __i915_vma_unbind, vma may be garbage, and we need to cache vma->obj before unlocking. Changes since v1: - Make trylock failing a WARN. (Matt) - Remove double i915_vma_wait_for_bind() (Matt) - Move atomic_set to right before mutex_unlock(), to make it more clear they belong together. (Matt) Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-5-maarten.lankhorst@linux.intel.com
2022-01-18drm/i915: Add object locking to i915_gem_evict_for_node and ↵Maarten Lankhorst
i915_gem_evict_something, v2. Because we will start to require the obj->resv lock for unbinding, ensure these vma eviction utility functions also take the lock. This requires some function signature changes, to ensure that the ww context is passed around, but is mostly straightforward. Previously this was split up into several patches, but reworking should allow for easier bisection. Changes since v1: - Handle evicting dead objects better. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-4-maarten.lankhorst@linux.intel.com
2022-01-18Merge drm/drm-next into drm-intel-gt-nextTvrtko Ursulin
Maarten needs backmerge to account for header file renames/changes which landed via drm-intel-next and are interfering with his pinning work. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-01-11drm/i915: Use vma resources for async unbindingThomas Hellström
Implement async (non-blocking) unbinding by not syncing the vma before calling unbind on the vma_resource. Add the resulting unbind fence to the object's dma_resv from where it is picked up by the ttm migration code. Ideally these unbind fences should be coalesced with the migration blit fence to avoid stalling the migration blit waiting for unbind, as they can certainly go on in parallel, but since we don't yet have a reasonable data structure to use to coalesce fences and attach the resulting fence to a timeline, we defer that for now. Note that with async unbinding, even while the unbind waits for the preceding bind to complete before unbinding, the vma itself might have been destroyed in the process, clearing the vma pages. Therefore we can only allow async unbinding if we have a refcounted sg-list and keep a refcount on that for the vma resource pages to stay intact until binding occurs. If this condition is not met, a request for an async unbind is diverted to a sync unbind. v2: - Use a separate kmem_cache for vma resources for now to isolate their memory allocation and aid debugging. - Move the check for vm closed to the actual unbinding thread. Regardless of whether the vm is closed, we need the unbind fence to properly wait for capture. - Clear vma_res::vm on unbind and update its documentation. v4: - Take cache coloring into account when searching for vma resources pending unbind. (Matthew Auld) v5: - Fix timeout and error check in i915_vma_resource_bind_dep_await(). - Avoid taking a reference on the object for async binding if async unbind capable. - Fix braces around a single-line if statement. v6: - Fix up the cache coloring adjustment. (Kernel test robot <lkp@intel.com>) - Don't allow async unbinding if the vma_res pages are not the same as the object pages. (Matthew Auld) v7: - s/unsigned long/u64/ in a number of places (Matthew Auld) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-5-thomas.hellstrom@linux.intel.com
2022-01-11drm/i915: Use the vma resource as argument for gtt binding / unbindingThomas Hellström
When introducing asynchronous unbinding, the vma itself may no longer be alive when the actual binding or unbinding takes place. Update the gtt i915_vma_ops accordingly to take a struct i915_vma_resource instead of a struct i915_vma for the bind_vma() and unbind_vma() ops. Similarly change the insert_entries() op for struct i915_address_space. Replace a couple of i915_vma_snapshot members with their newly introduced i915_vma_resource counterparts, since they have the same lifetime. Also make sure to avoid changing the struct i915_vma_flags (in particular the bind flags) async. That should now only be done sync under the vm mutex. v2: - Update the vma_res::bound_flags when binding to the aliased ggtt v6: - Remove I915_VMA_ALLOC_BIT (Matthew Auld) - Change some members of struct i915_vma_resource from unsigned long to u64 (Matthew Auld) v7: - Fix vma resource size parameters to be u64 rather than unsigned long (Matthew Auld) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-3-thomas.hellstrom@linux.intel.com
2022-01-05drm/i915/gt: Use to_gt() helper for GGTT accessesMichał Winiarski
GGTT is currently available both through i915->ggtt and gt->ggtt, and we eventually want to get rid of the i915->ggtt one. Use to_gt() for all i915->ggtt accesses to help with the future refactoring. During the probe of i915 the early intiialization of the gt (intel_gt_init_hw_early()) is moved prior to any access to the ggtt. This because it's in that moment we assign the ggtt to the gt and we want to do that before using it. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211221195946.3180-1-andi.shyti@linux.intel.com
2021-12-24Merge tag 'drm-intel-gt-next-2021-12-23' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next Driver Changes: - Added bits of DG2 support around page table handling (Stuart Summers, Matthew Auld) - Fixed wakeref leak in PMU busyness during reset in GuC mode (Umesh Nerlige Ramappa) - Fixed debugfs access crash if GuC failed to load (John Harrison) - Bring back GuC error log to error capture, undoing accidental earlier breakage (Thomas Hellström) - Fixed memory leak in error capture caused by earlier refactoring (Thomas Hellström) - Exclude reserved stolen from driver use (Chris Wilson) - Add memory region sanity checking and optional full test (Chris Wilson) - Fixed buffer size truncation in TTM shmemfs backend (Robert Beckett) - Use correct lock and don't overwrite internal data structures when stealing GuC context ids (Matthew Brost) - Don't hog IRQs when destroying GuC contexts (John Harrison) - Make GuC to Host communication more robust (Matthew Brost) - Continuation of locking refactoring around VMA and backing store handling (Maarten Lankhorst) - Improve performance of reading GuC log from debugfs (John Harrison) - Log when GuC fails to reset an engine (John Harrison) - Speed up GuC/HuC firmware loading by requesting RP0 (Vinay Belgaumkar) - Further work on asynchronous VMA unbinding (Thomas Hellström, Christian König) - Refactor GuC/HuC firmware handling to prepare for future platforms (John Harrison) - Prepare for future different GuC/HuC firmware signing key sizes (Daniele Ceraolo Spurio, Michal Wajdeczko) - Add noreclaim annotations (Matthew Auld) - Remove racey GEM_BUG_ON between GPU reset and GuC communication handling (Matthew Brost) - Refactor i915->gt with to_gt(i915) to prepare for future platforms (Michał Winiarski, Andi Shyti) - Increase GuC log size for CONFIG_DEBUG_GEM (John Harrison) - Fixed engine busyness in selftests when in GuC mode (Umesh Nerlige Ramappa) - Make engine parking work with PREEMPT_RT (Sebastian Andrzej Siewior) - Replace X86_FEATURE_PAT with pat_enabled() (Lucas De Marchi) - Selftest for stealing of guc ids (Matthew Brost) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YcRvKO5cyPvIxVCi@tursulin-mobl2
2021-12-20drm/i915: Remove pages_mutex and intel_gtt->vma_ops.set/clear_pages members, v3.Maarten Lankhorst
Big delta, but boils down to moving set_pages to i915_vma.c, and removing the special handling, all callers use the defaults anyway. We only remap in ggtt, so default case will fall through. Because we still don't require locking in i915_vma_unpin(), handle this by using xchg in get_pages(), as it's locked with obj->mutex, and cmpxchg in unpin, which only fails if we race a against a new pin. Changes since v1: - aliasing gtt sets ZERO_SIZE_PTR, not -ENODEV, remove special case from __i915_vma_get_pages(). (Matt) Changes since v2: - Free correct old pages in __i915_vma_get_pages(). (Matt) Remove race of clearing vma->pages accidentally from put, free it but leave it set, as only get has the lock. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-4-maarten.lankhorst@linux.intel.com Reviewed-by: Matthew Auld <matthew.auld@intel.com>