summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/gt/uc/abi
AgeCommit message (Collapse)Author
2025-02-04drm/i915/slpc: Add sysfs for SLPC power profilesVinay Belgaumkar
Default SLPC power profile is Base(0). Power Saving mode(1) has conservative up/down thresholds and is suitable for use with apps that typically need to be power efficient. Selected power profile will be displayed in this format- $ cat slpc_power_profile [base] power_saving $ echo power_saving > slpc_power_profile $ cat slpc_power_profile base [power_saving] v2: Disable waitboost in power saving profile, update sysfs format and add some kernel doc for SLPC (Rodrigo) v3: Update doc with info about power profiles (Rodrigo) v4: Checkpatch warning and remove extra line (Rodrigo) Cc: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250117215753.749906-1-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-18drm/i915/guc: Extend w/a 14019159160John Harrison
There is a new part to an existing workaround, so enable that piece as well. v2: Extend even further. v3: Drop DG2 as there are CI failures still to resolve. Also re-order the parameters to a function to reduce excessive line wrapping. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240622004636.662081-3-John.C.Harrison@Intel.com
2024-06-06drm/i915/guc: Enable w/a 16021333562 for DG2, MTL and ARLJohn Harrison
Enable another workaround that is implemented inside the GuC. v2: Use the correct Gen12 w/a id rather than the Xe version (review feedback from Matthew R) also extend to include ARL. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240528230515.479395-1-John.C.Harrison@Intel.com
2024-05-16drm/i915/guc: avoid FIELD_PREP warningArnd Bergmann
With gcc-7 and earlier, there are lots of warnings like In file included from <command-line>:0:0: In function '__guc_context_policy_add_priority.isra.66', inlined from '__guc_context_set_prio.isra.67' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3292:3, inlined from 'guc_context_set_prio' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3320:2: include/linux/compiler_types.h:399:38: error: call to '__compiletime_assert_631' declared with attribute error: FIELD_PREP: mask is not constant _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ ... drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:2422:3: note: in expansion of macro 'FIELD_PREP' FIELD_PREP(GUC_KLV_0_KEY, GUC_CONTEXT_POLICIES_KLV_ID_##id) | \ ^~~~~~~~~~ Make sure that GUC_KLV_0_KEY is an unsigned value to avoid the warning. Fixes: 77b6f79df66e ("drm/i915/guc: Update to GuC version 69.0.3") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240430164809.482131-1-julia.filipchuk@intel.com
2024-03-07drm/i915/guc: Enable Wa_14019159160John Harrison
Use the new w/a KLV support to enable a MTL w/a. Note, this w/a is a super-set of Wa_16019325821, so requires turning that one as well as setting the new flag for Wa_14019159160 itself. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223205632.1621019-4-John.C.Harrison@Intel.com
2024-03-07drm/i915/guc: Add support for w/a KLVsJohn Harrison
To prevent running out of bits, new w/a enable flags are being added via a KLV system instead of a 32 bit flags word. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223205632.1621019-3-John.C.Harrison@Intel.com
2024-03-07drm/i915/guc: Use context hints for GT frequencyVinay Belgaumkar
Allow user to provide a low latency context hint. When set, KMD sends a hint to GuC which results in special handling for this context. SLPC will ramp the GT frequency aggressively every time it switches to this context. The down freq threshold will also be lower so GuC will ramp down the GT freq for this context more slowly. We also disable waitboost for this context as that will interfere with the strategy. We need to enable the use of SLPC Compute strategy during init, but it will apply only to contexts that set this bit during context creation. Userland can check whether this feature is supported using a new param- I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission enabled platforms as they use SLPC for frequency management. The Mesa usage model for this flag is here - https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint v2: Rename flags as per review suggestions (Rodrigo, Tvrtko). Also, use flag bits in intel_context as it allows finer control for toggling per engine if needed (Tvrtko). v3: Minor review comments (Tvrtko) v4: Update comment (Sushma) Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Ivan Briano <ivan.briano@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240306012759.204938-1-vinay.belgaumkar@intel.com
2023-10-18drm/i915: Define and use GuC and CTB TLB invalidation routinesPrathap Kumar Valsan
The GuC firmware had defined the interface for Translation Look-Aside Buffer (TLB) invalidation. We should use this interface when invalidating the engine and GuC TLBs. Add additional functionality to intel_gt_invalidate_tlb, invalidating the GuC TLBs and falling back to GT invalidation when the GuC is disabled. The invalidation is done by sending a request directly to the GuC tlb_lookup that invalidates the table. The invalidation is submitted as a wait request and is performed in the CT event handler. This means we cannot perform this TLB invalidation path if the CT is not enabled. If the request isn't fulfilled in two seconds, this would constitute an error in the invalidation as that would constitute either a lost request or a severe GuC overload. With this new invalidation routine, we can perform GuC-based GGTT invalidations. GuC-based GGTT invalidation is incompatible with MMIO invalidation so we should not perform MMIO invalidation when GuC-based GGTT invalidation is expected. The additional complexity incurred in this patch will be necessary for range-based tlb invalidations, which will be platformed in the future. Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com> Signed-off-by: Bruce Chang <yu.bruce.chang@intel.com> Signed-off-by: Chris Wilson <chris.p.wilson@intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Fei Yang <fei.yang@intel.com> CC: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231017180806.3054290-4-jonathan.cavitt@intel.com
2023-05-31drm/i915/guc: Drop legacy CTB definitionsMichal Wajdeczko
We've already switched to new HXG definitions some time ago, drop legacy CTB definitions to avoid mistakes. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230509201103.538-1-michal.wajdeczko@intel.com
2023-05-30drm/i915/guc: Use FAST_REQUEST for non-blocking H2G callsMichal Wajdeczko
In addition to the already defined REQUEST HXG message format, which is used when sender expects some confirmation or data, HXG protocol includes definition of the FAST REQUEST message, that may be used when sender does not expect any useful data to be returned. Using this instead of GUC_HXG_TYPE_EVENT for non-blocking CTB requests will allow GuC to send back GUC_HXG_TYPE_RESPONSE_FAILURE in case of errors. Note that it is not possible to return such errors to the caller, since this is for non-blocking calls and the related fence is not stored. Instead such messages are treated as unexpected, which will give an indication of potential GuC misprogramming that warrants extra debugging effort. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230526235538.2230780-2-John.C.Harrison@Intel.com
2023-05-05drm/i915/guc: Decode another GuC load failure caseJohn Harrison
Explain another potential firmware failure mode and early exit the long wait if hit. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230502234007.1762014-2-John.C.Harrison@Intel.com
2023-03-23drm/i915/guc: Improve GuC load error reportingJohn Harrison
There are multiple ways in which the GuC load can fail. The driver was reporting the status register as is, but not everyone can read the matrix unfiltered. So add decoding of the common error cases. Also, remove the comment about interrupt based load completion checking being not recommended. The interrupt was removed from the GuC firmware some time ago so it is no longer an option anyway. While at it, also abort the timeout if a known error code is reported. No need to keep waiting if the GuC has already given up the load. v2: Fix mis-matched case and confusing 'success' variable (Daniele). Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230316220632.3312218-2-John.C.Harrison@Intel.com
2022-10-27drm/i915/guc: Support OA when Wa_16011777198 is enabledVinay Belgaumkar
On DG2, a w/a resets RCS/CCS before it goes into RC6. This breaks OA since OA does not expect engine resets during its use. Fix it by disabling RC6. v2: (Ashutosh) - Bring back slpc_unset_param helper - Update commit msg - Use with_intel_runtime_pm helper for set/unset v3: (Ashutosh) - Just use intel_uc_uses_guc_rc Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-15-umesh.nerlige.ramappa@intel.com
2022-09-27drm/i915/guc: Enable compute scheduling on DG2John Harrison
DG2 has issues. To work around one of these the GuC must schedule apps in an exclusive manner across both RCS and CCS. That is, if a context from app X is running on RCS then all CCS engines must sit idle even if there are contexts from apps Y, Z, ... waiting to run. A certain OS favours RCS to the total starvation of CCS. Linux does not. Hence the GuC now has a scheduling policy setting to control this abitration. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220922201209.1446343-2-John.C.Harrison@Intel.com
2022-07-29drm/i915/guc: Don't abort on CTB_UNUSED statusJohn Harrison
When the KMD sends a CLIENT_RESET request to GuC (as part of the suspend sequence), GuC will mark the CTB buffer as 'UNUSED'. If the KMD then checked the CTB queue, it would see a non-zero status value and report the buffer as corrupted. Technically, no G2H messages should be received once the CLIENT_RESET has been sent. However, if a context was outstanding on an engine then it would get reset and a reset notification would be sent. So, don't actually treat UNUSED as a catastrophic error. Just flag it up as unexpected and keep going. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220728024225.2363663-7-John.C.Harrison@Intel.com
2022-07-19drm/i915/guc: support v69 in parallel to v70Daniele Ceraolo Spurio
This patch re-introduces support for GuC v69 in parallel to v70. As this is a quick fix, v69 has been re-introduced as the single "fallback" guc version in case v70 is not available on disk and only for platforms that are out of force_probe and require the GuC by default. All v69 specific code has been labeled as such for easy identification, and the same was done for all v70 functions for which there is a separate v69 version, to avoid accidentally calling the wrong version via the unlabeled name. When the fallback mode kicks in, a drm_notice message is printed in dmesg to inform the user of the required update. The existing logging of the fetch function has also been updated so that we no longer complain immediately if we can't find a fw and we only throw an error if the fetch of both the base and fallback blobs fails. The plan is to follow this up with a more complex rework to allow for multiple different GuC versions to be supported at the same time. v2: reduce the fallback to platform that require it, switch to firmware_request_nowarn(), improve logs. Fixes: 2584b3549f4c ("drm/i915/guc: Update to GuC version 70.1.1") Link: https://lists.freedesktop.org/archives/intel-gfx/2022-July/301640.html Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Dave Airlie <airlied@gmail.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220718230732.1409641-1-daniele.ceraolospurio@intel.com
2022-05-26drm/i915/gt: Add media freq factor to per-gt sysfsAshutosh Dixit
Expose new sysfs to program and retrieve media freq factor. Factor values of 0 (dynamic), 0.5 and 1.0 are supported via a u8.8 fixed point representation (corresponding to integer values of 0, 128 and 256 respectively). Media freq factor is converted to media_ratio_mode for GuC. It is programmed into GuC using H2G SLPC interface. It is retrieved from GuC through a register read. A cached media_ratio_mode is maintained to preserve set values across GuC resets. This patch adds the following sysfs files to gt/gtN sysfs: * media_freq_factor * media_freq_factor.scale v2: Minor wording change in drm_warn (Tvrtko) Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Dale B Stimson <dale.b.stimson@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/7ad7578335d8af9cba047b4bcf33d1887453d2e1.1653484574.git.ashutosh.dixit@intel.com
2022-05-19drm/i915/uc: Fix undefined behavior due to shift overflowing the constantBorislav Petkov
Fix: In file included from <command-line>:0:0: drivers/gpu/drm/i915/gt/uc/intel_guc.c: In function ‘intel_guc_send_mmio’: ././include/linux/compiler_types.h:352:38: error: call to ‘__compiletime_assert_1047’ \ declared with attribute error: FIELD_PREP: mask is not constant _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) and other build errors due to shift overflowing values. See https://lore.kernel.org/r/YkwQ6%2BtIH8GQpuct@zn.tnic for the gory details as to why it triggers with older gccs only. v2 by Jani: - Drop the i915_reg.h changes Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Ruiqi GONG <gongruiqi1@huawei.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220518113315.1305027-2-jani.nikula@intel.com
2022-04-14drm/i915/guc: Update to GuC version 70.1.1John Harrison
The latest GuC firmware drops the context descriptor pool in favour of passing all creation data in the create H2G. It also greatly simplifies the work queue and removes the process descriptor used for multi-LRC submission. So, remove all mention of LRC and process descriptors and update the registration code accordingly. Unfortunately, the new API also removes the ability to set default values for the scheduling policies at context registration time. Instead, a follow up H2G must be sent. The individual scheduling policy update H2G commands are also dropped in favour of a single KLV based H2G. So, change the update wrappers accordingly and call this during context registration.. Of course, this second H2G per registration might fail due to being backed up. The registration code has a complicated state machine to cope with the actual registration call failing. However, if that works then there is no support for unwinding if a further call should fail. Unwinding would require sending a H2G to de-register - but that can't be done because the CTB is already backed up. So instead, add a new flag to say whether the context has a pending policy update. This is set if the policy H2G fails at registration time. The submission code checks for this flag and retries the policy update if set. If that call fails, the submission path early exists with a retry error. This is something that is already supported for other reasons. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220412225955.1802543-2-John.C.Harrison@Intel.com
2022-03-22drm/i915/guc: Extract GuC error capture lists on G2H notification.Alan Previn
- Upon the G2H Notify-Err-Capture event, parse through the GuC Log Buffer (error-capture-subregion) and generate one or more capture-nodes. A single node represents a single "engine- instance-capture-dump" and contains at least 3 register lists: global, engine-class and engine-instance. An internal link list is maintained to store one or more nodes. - Because the link-list node generation happen before the call to i915_gpu_codedump, duplicate global and engine-class register lists for each engine-instance register dump if we find dependent-engine resets in a engine-capture-group. - When i915_gpu_coredump calls into capture_engine, (in a subsequent patch) we detach the matching node (guc-id, LRCA, etc) from the link list above and attach it to i915_gpu_coredump's intel_engine_coredump structure when have matching LRCA/guc-id/engine-instance. Additional notes to be aware of: - GuC generates the error capture dump into the GuC log buffer but this buffer is one big log buffer with 3 independent subregions within it. Each subregion is populated with different content and used in different ways and timings but all regions operate behave as independent ring buffers. Each guc-log subregion (general-logs, crash-dump and error- capture) has it's own guc_log_buffer_state that contain independent read and write pointers. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-11-alan.previn.teres.alexis@intel.com
2022-03-17drm/i915/guc: Add fetch of hwconfig blobJohn Harrison
Implement support for fetching the hardware description table from the GuC. The call is made twice - once without a destination buffer to query the size and then a second time to fill in the buffer. The table is stored in the GT structure so that it can be fetched once at driver load time. Keeping inside a GuC structure would mean it would be release and reloaded on a GuC reset (part of a full GT reset). However, the table does not change just because the GT has been reset and the GuC reloaded. Also, dynamic memory allocations inside the reset path are a problem. Note that the table is only available on ADL-P and later platforms. v2 (John's v2 patch): * Move to GT level to avoid memory allocation during reset path (and unnecessary re-read of the table on a reset). v5 (of Jordan's posting): * Various changes made by Jordan and recommended by Michal - Makefile ordering - Adjust "struct intel_guc_hwconfig hwconfig" comment - Set Copyright year to 2022 in intel_guc_hwconfig.c/.h - Drop inline from hwconfig_to_guc() - Replace hwconfig param with guc in __guc_action_get_hwconfig() - Move zero size check into guc_hwconfig_discover_size() - Change comment to say zero size offset/size is needed to get size - Add has_guc_hwconfig to devinfo and drop has_table() - Change drm_err to notice in __uc_init_hw() and use %pe v6 (of Jordan's posting): * Added a couple more small changes recommended by Michal * Merge in John's v2 patch, but note: - Using drm_notice as recommended by Michal - Reverted Michal's suggestion of using devinfo v7 (of Jordan's posting): * Change back to drm_err as preferred by John Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Jon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220306232157.1174335-2-jordan.l.justen@intel.com
2022-03-03drm/i915/guc: Drop obsolete H2G definitionsJohn Harrison
The CTB registration process changed significantly a while back using a single KLV based H2G. So drop the original and now obsolete H2G definitions. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220302003357.4188363-8-John.C.Harrison@Intel.com
2022-02-23Merge tag 'drm-intel-gt-next-2022-02-17' of ↵Rodrigo Vivi
git://anongit.freedesktop.org/drm/drm-intel into drm-intel-next UAPI Changes: - Weak parallel submission support for execlists Minimal implementation of the parallel submission support for execlists backend that was previously only implemented for GuC. Support one sibling non-virtual engine. Core Changes: - Two backmerges of drm/drm-next for header file renames/changes and i915_regs reorganization Driver Changes: - Add new DG2 subplatform: DG2-G12 (Matt R) - Add new DG2 workarounds (Matt R, Ram, Bruce) - Handle pre-programmed WOPCM registers for DG2+ (Daniele) - Update guc shim control programming on XeHP SDV+ (Daniele) - Add RPL-S C0/D0 stepping information (Anusha) - Improve GuC ADS initialization to work on ARM64 on dGFX (Lucas) - Fix KMD and GuC race on accessing PMU busyness (Umesh) - Use PM timestamp instead of RING TIMESTAMP for reference in PMU with GuC (Umesh) - Report error on invalid reset notification from GuC (John) - Avoid WARN splat by holding RPM wakelock during PXP unbind (Juston) - Fixes to parallel submission implementation (Matt B.) - Improve GuC loading status check/error reports (John) - Tweak TTM LRU priority hint selection (Matt A.) - Align the plane_vma to min_page_size of stolen mem (Ram) - Introduce vma resources and implement async unbinding (Thomas) - Use struct vma_resource instead of struct vma_snapshot (Thomas) - Return some TTM accel move errors instead of trying memcpy move (Thomas) - Fix a race between vma / object destruction and unbinding (Thomas) - Remove short-term pins from execbuf (Maarten) - Update to GuC version 69.0.3 (John, Michal Wa.) - Improvements to GT reset paths in GuC backend (Matt B.) - Use shrinker_release_pages instead of writeback in shmem object hooks (Matt A., Tvrtko) - Use trylock instead of blocking lock when freeing GEM objects (Maarten) - Allocate intel_engine_coredump_alloc with ALLOW_FAIL (Matt B.) - Fixes to object unmapping and purging (Matt A) - Check for wedged device in GuC backend (John) - Avoid lockdep splat by locking dpt_obj around set_cache_level (Maarten) - Allow dead vm to unbind vma's without lock (Maarten) - s/engine->i915/i915/ for DG2 engine workarounds (Matt R) - Use to_gt() helper for GGTT accesses (Michal Wi.) - Selftest improvements (Matt B., Thomas, Ram) - Coding style and compiler warning fixes (Matt B., Jasmine, Andi, Colin, Gustavo, Dan) From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Yg4i2aCZvvee5Eai@jlahtine-mobl.ger.corp.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Fixed conflicts while applying, using the fixups/drm-intel-gt-next.patch from drm-rerere's 1f2b1742abdd ("2022y-02m-23d-16h-07m-57s UTC: drm-tip rerere cache update")]
2022-02-02drm/i915: Only include i915_reg.h from .c filesMatt Roper
Several of our i915 header files, have been including i915_reg.h. This means that any change to i915_reg.h will trigger a full rebuild of pretty much every file of the driver, even those that don't have any kind of register access. Let's delete the i915_reg.h include from all headers and add an explicit include from the .c files that truly need the register definitions; those that need a definition of i915_reg_t for a function definition can get it from i915_reg_defs.h instead. We also remove two non-register #define's (VLV_DISPLAY_BASE and GEN12_SFC_DONE_MAX) into i915_reg_defs.h to allow us to drop the i915_reg.h include from a couple of headers. There's probably a lot more header dependency optimization possible, but the changes here roughly cut the number of files compiled after 'touch i915_reg.h' in half --- a good first step. Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-7-matthew.d.roper@intel.com
2022-01-11drm/i915/guc: Improve GuC loading status check/error reportsJohn Harrison
If the GuC fails to load, it is useful to know what firmware file / version was attempted. So move the version info report to before the load attempt rather than only after a successful load. If the GuC does fail to load, then make the error messages visible rather than being 'debug' prints that do not appears in dmesg output by default. When waiting for the GuC to load, it used to be necessary to check for two different states - READY and (LAPIC_DONE | MIA_CORE). Apparently the second signified init complete on RC6 exit. However, in more recent GuC versions the RC6 exit sequence now finishes with status READY as well. So the test can be simplified. Also, add an enum giving all the current status codes that GuC loading can report as a reference without having to pull and search through the GuC source files. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220107000622.292081-4-John.C.Harrison@Intel.com
2022-01-11drm/i915/guc: Update to GuC version 69.0.3John Harrison
Update to the latest GuC release. The latest GuC firmware introduces a number of interface changes: GuC may return NO_RESPONSE_RETRY message for requests sent over CTB. Add support for this reply and try resending the request again as a new CTB message. A KLV (key-length-value) mechanism is now used for passing configuration data such as CTB management. With the new KLV scheme, the old CTB management actions are no longer used and are removed. Register capture on hang is now supported by GuC. Full i915 support for this will be added by a later patch. A minimum support of providing capture memory and register lists is required though, so add that in. The device id of the current platform needs to be provided at init time. The 'poll CS' w/a (Wa_22012773006) was blanket enabled by previous versions of GuC. It must now be explicitly requested by the KMD. So, add in the code to turn it on when relevant. The GuC log entry format has changed. This requires adding a new field to the log header structure to mark the wrap point at the end of the buffer (as the buffer size is no longer a multiple of the log entry size). New CTB notification messages are now sent for some things that were previously only sent via MMIO notifications. Of these, the crash dump notification was not really being handled by i915. It called the log flush code but that only flushed the regular debug log and then only if relay logging was enabled. So just report an error message instead. The 'exception' notification was just being ignored completely. So add an error message for that as well. Note that in either the crash dump or the exception case, the GuC is basically dead. The KMD will detect this via the heartbeat and trigger both an error log (which will include the crash dump as part of the GuC log) and a GT reset. So no other processing is really required. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220107000622.292081-3-John.C.Harrison@Intel.com
2021-10-28drm/i915/pmu: Connect engine busyness stats from GuC to pmuUmesh Nerlige Ramappa
With GuC handling scheduling, i915 is not aware of the time that a context is scheduled in and out of the engine. Since i915 pmu relies on this info to provide engine busyness to the user, GuC shares this info with i915 for all engines using shared memory. For each engine, this info contains: - total busyness: total time that the context was running (total) - id: id of the running context (id) - start timestamp: timestamp when the context started running (start) At the time (now) of sampling the engine busyness, if the id is valid (!= ~0), and start is non-zero, then the context is considered to be active and the engine busyness is calculated using the below equation engine busyness = total + (now - start) All times are obtained from the gt clock base. For inactive contexts, engine busyness is just equal to the total. The start and total values provided by GuC are 32 bits and wrap around in a few minutes. Since perf pmu provides busyness as 64 bit monotonically increasing values, there is a need for this implementation to account for overflows and extend the time to 64 bits before returning busyness to the user. In order to do that, a worker runs periodically at frequency = 1/8th the time it takes for the timestamp to wrap. As an example, that would be once in 27 seconds for a gt clock frequency of 19.2 MHz. Note: There might be an over-accounting of busyness due to the fact that GuC may be updating the total and start values while kmd is reading them. (i.e kmd may read the updated total and the stale start). In such a case, user may see higher busyness value followed by smaller ones which would eventually catch up to the higher value. v2: (Tvrtko) - Include details in commit message - Move intel engine busyness function into execlist code - Use union inside engine->stats - Use natural type for ping delay jiffies - Drop active_work condition checks - Use for_each_engine if iterating all engines - Drop seq locking, use spinlock at GuC level to update engine stats - Document worker specific details v3: (Tvrtko/Umesh) - Demarcate GuC and execlist stat objects with comments - Document known over-accounting issue in commit - Provide a consistent view of GuC state - Add hooks to gt park/unpark for GuC busyness - Stop/start worker in gt park/unpark path - Drop inline - Move spinlock and worker inits to GuC initialization - Drop helpers that are called only once v4: (Tvrtko/Matt/Umesh) - Drop addressed opens from commit message - Get runtime pm in ping, remove from the park path - Use cancel_delayed_work_sync in disable_submission path - Update stats during reset prepare - Skip ping if reset in progress - Explicitly name execlists and GuC stats objects - Since disable_submission is called from many places, move resetting stats to intel_guc_submission_reset_prepare v5: (Tvrtko) - Add a trylock helper that does not sleep and synchronize PMU event callbacks and worker with gt reset v6: (CI BAT failures) - DUTs using execlist submission failed to boot since __gt_unpark is called during i915 load. This ends up calling the GuC busyness unpark hook and results in kick-starting an uninitialized worker. Let park/unpark hooks check if GuC submission has been initialized. - drop cant_sleep() from trylock helper since rcu_read_lock takes care of that. v7: (CI) Fix igt@i915_selftest@live@gt_engines - For GuC mode of submission the engine busyness is derived from gt time domain. Use gt time elapsed as reference in the selftest. - Increase busyness calculation to 10ms duration to ensure batch runs longer and falls within the busyness tolerances in selftest. v8: - Use ktime_get in selftest as before - intel_reset_trylock_no_wait results in a lockdep splat that is not trivial to fix since the PMU callback runs in irq context and the reset paths are tightly knit into the driver. The test that uncovers this is igt@perf_pmu@faulting-read. Drop intel_reset_trylock_no_wait, instead use the reset_count to synchronize with gt reset during pmu callback. For the ping, continue to use intel_reset_trylock since ping is not run in irq context. - GuC PM timestamp does not tick when GuC is idle. This can potentially result in wrong busyness values when a context is active on the engine, but GuC is idle. Use the RING TIMESTAMP as GPU timestamp to process the GuC busyness stats. This works since both GuC timestamp and RING timestamp are synced with the same clock. - The busyness stats may get updated after the batch starts running. This delay causes the busyness reported for 100us duration to fall below 95% in the selftest. The only option at this time is to wait for GuC busyness to change from idle to active before we sample busyness over a 100us period. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211027004821.66097-2-umesh.nerlige.ramappa@intel.com
2021-10-15drm/i915/guc: Add multi-lrc context registrationMatthew Brost
Add multi-lrc context registration H2G. In addition a workqueue and process descriptor are setup during multi-lrc context registration as these data structures are needed for multi-lrc submission. v2: (John Harrison) - Move GuC specific fields into sub-struct - Clean up WQ defines - Add comment explaining math to derive WQ / PD address v3: (John Harrison) - Add PARENT_SCRATCH_SIZE define - Update comment explaining multi-lrc register v4: (John Harrison) - Move PARENT_SCRATCH_SIZE to common file Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-9-matthew.brost@intel.com
2021-09-23drm/i915/guc, docs: Fix pdfdocs build error by removing nested gridAkira Yokosawa
Nested grids in grid-table cells are not specified as proper ReST constructs. Commit 572f2a5cd974 ("drm/i915/guc: Update firmware to v62.0.0") added a couple of kerneldoc tables of the form: +---+-------+------------------------------------------------------+ | 1 | 31:0 | +------------------------------------------------+ | +---+-------+ | | | |...| | | Embedded `HXG Message`_ | | +---+-------+ | | | | n | 31:0 | +------------------------------------------------+ | +---+-------+------------------------------------------------------+ For "make htmldocs", they happen to work as one might expect, but they are incompatible with "make latexdocs" and "make pdfdocs", and cause the generated gpu.tex file to become incomplete and unbuildable by xelatex. Restore the compatibility by removing those nested grids in the tables. Size comparison of generated gpu.tex: Sphinx 2.4.4 Sphinx 4.2.0 v5.14: 3238686 3841631 v5.15-rc1: 376270 432729 with this fix: 3377846 3998095 Fixes: 572f2a5cd974 ("drm/i915/guc: Update firmware to v62.0.0") Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/4a227569-074f-c501-58bb-d0d8f60a8ae9@gmail.com
2021-08-03drm/i915/guc/rc: Setup and enable GuCRC featureVinay Belgaumkar
This feature hands over the control of HW RC6 to the GuC. GuC decides when to put HW into RC6 based on it's internal busyness algorithms. GuCRC needs GuC submission to be enabled, and only supported on Gen12+ for now. When GuCRC is enabled, do not set HW RC6. Use a H2G message to tell GuC to enable GuCRC. When disabling RC6, tell GuC to revert RC6 control back to KMD. KMD is still responsible for enabling everything related to Coarse Power Gating though. v2: Address comments (Michal W) v3: Don't set hysterisis values when GuCRC is used (Matt Roper) v4: checkpatch() Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210730202119.23810-15-vinay.belgaumkar@intel.com
2021-08-03drm/i915/guc/slpc: Adding SLPC communication interfacesVinay Belgaumkar
Add constants and params that are needed to configure SLPC. v2: Add a new abi header for SLPC. Replace bitfields with genmasks. Address other comments from Michal W. v3: Add slpc H2G format in abi, other review commments (Michal W) v4: Update status bits according to latest spec v5: checkpatch() Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Sundaresan Sujaritha <sujaritha.sundaresan@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210730202119.23810-4-vinay.belgaumkar@intel.com
2021-07-27drm/i915/guc: Suspend/resume implementation for new interfaceMatthew Brost
The new GuC interface introduces an MMIO H2G command, INTEL_GUC_ACTION_RESET_CLIENT, which is used to implement suspend. This MMIO tears down any active contexts generating a context reset G2H CTB for each. Once that step completes the GuC tears down the CTB channels. It is safe to suspend once this MMIO H2G command completes and all G2H CTBs have been processed. In practice the i915 will likely never receive a G2H as suspend should only be called after the GPU is idle. Resume is implemented in the same manner as before - simply reload the GuC firmware and reinitialize everything (e.g. CTB channels, contexts, etc..). v2: (Michel / John H) - INTEL_GUC_ACTION_RESET_CLIENT 0x5B01 -> 0x5507 Cc: John Harrison <john.c.harrison@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-12-matthew.brost@intel.com
2021-07-22drm/i915/guc: Add new GuC interface defines and structuresMatthew Brost
Add new GuC interface defines and structures while maintaining old ones in parallel. Cc: John Harrison <john.c.harrison@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-2-matthew.brost@intel.com
2021-07-13drm/i915/guc: Add non blocking CTB send functionMatthew Brost
Add non blocking CTB send function, intel_guc_send_nb. GuC submission will send CTBs in the critical path and does not need to wait for these CTBs to complete before moving on, hence the need for this new function. The non-blocking CTB now must have a flow control mechanism to ensure the buffer isn't overrun. A lazy spin wait is used as we believe the flow control condition should be rare with a properly sized buffer. The function, intel_guc_send_nb, is exported in this patch but unused. Several patches later in the series make use of this function. v2: (Michal) - Use define for H2G room calculations - Move INTEL_GUC_SEND_NB define (Daniel Vetter) - Use msleep_interruptible rather than cond_resched v3: (Michal) - Move includes to following patch - s/INTEL_GUC_SEND_NB/INTEL_GUC_CT_SEND_NB/g v4: (John H) - Update comment, add type local variable Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-5-matthew.brost@intel.com
2021-06-18drm/i915/guc: Update firmware to v62.0.0Michal Wajdeczko
Most of the changes to the 62.0.0 firmware revolved around CTB communication channel. Conform to the new (stable) CTB protocol. v2: (Michal) Add values back to kernel DOC for actions (Docs) Add 'CT buffer' back in to fix warning Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> [mattrope: Tweaked kerneldoc while pushing as suggested by Daniele/Michal] Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210616001302.84233-3-matthew.brost@intel.com
2021-06-18drm/i915/guc: Introduce unified HXG messagesMichal Wajdeczko
New GuC firmware will unify format of MMIO and CTB H2G messages. Introduce their definitions now to allow gradual transition of our code to match new changes. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210616001302.84233-2-matthew.brost@intel.com
2021-06-03drm/i915/guc: Stop using fence/status from CTB descriptorMichal Wajdeczko
Stop using fence/status from CTB descriptor as future GuC ABI will no longer support replies over CTB descriptor. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-8-matthew.brost@intel.com
2021-06-03drm/i915/guc: Keep strict GuC ABI definitionsMichal Wajdeczko
Our fwif.h file is now mix of strict firmware ABI definitions and set of our helpers. In anticipation of upcoming changes to the GuC interface try to keep them separate in smaller maintainable files. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-6-matthew.brost@intel.com