Age | Commit message (Collapse) | Author |
|
Stephen Rothwell reported htmldocs warnings:
Documentation/gpu/i915:64: drivers/gpu/drm/i915/gt/intel_workarounds.c:32: WARNING: Inline emphasis start-string without end-string.
Documentation/gpu/i915:64: drivers/gpu/drm/i915/gt/intel_workarounds.c:57: WARNING: Inline emphasis start-string without end-string.
Documentation/gpu/i915:64: drivers/gpu/drm/i915/gt/intel_workarounds.c:66: WARNING: Inline emphasis start-string without end-string.
Escape wildcards in *_ctx_workarounds_init(), *_gt_workarounds_init(), and
*_whitelist_build() to fix above warnings.
Link: https://lore.kernel.org/linux-next/20230203134622.0b6315b9@canb.auug.org.au/
Fixes: 0c3064cf33fbfa ("drm/i915/doc: Document where to implement register workarounds")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230203100215.31852-2-bagasdotme@gmail.com
(cherry picked from commit ec852e3c88d5caa457557406c0c787b56c36dffb)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
The memory frequency detection is a bit spread out here and
there. Consolidate to intel_dram.c.
v2:
- Remove inaccurate comment (Ville)
- Call detect_mem_freq() unconditionally (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8a862eeca8b42a98e04b3c52637851d33531abb6.1676317696.git.jani.nikula@intel.com
|
|
The UNSLICE_UNIT_LEVEL_CLKGATE register programmed by this workaround
has 'BUS' style reset, indicating that it does not lose its value on
engine resets. Furthermore, this register is part of the GT forcewake
domain rather than the RENDER domain, so it should not be impacted by
RCS engine resets. As such, we should implement this on the GT
workaround list rather than an engine list.
Bspec: 19219
Fixes: 3551ff928744 ("drm/i915/gen11: Moving WAs to rcs_engine_wa_init()")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230201222831.608281-2-matthew.d.roper@intel.com
(cherry picked from commit 5f21dc07b52eb54a908e66f5d6e05a87bcb5b049)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Although registers in the L3 bank/node configuration ranges are marked
as having "DEV" reset characteristics in the bspec, this appears to be a
hold-over from pre-Xe_HP platforms. In reality, these registers
maintain their values across engine resets, meaning that workarounds
and tuning settings targeting them should be placed on the GT
workaround list rather than an engine workaround list.
Note that an extra clue here is that these registers moved from the
RENDER forcewake domain to the GT forcewake domain in Xe_HP; generally
RCS/CCS engine resets should not lead to the reset of a register that
lives outside the RENDER domain.
Re-applying these registers on engine resets wouldn't actually hurt
anything, but is unnecessary and just makes it more confusing to anyone
trying to decipher how these registers really work.
v2:
- Also move DG2's Wa_14010648519 to the GT list. (Gustavo)
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230209232228.859317-1-matthew.d.roper@intel.com
|
|
Update a bunch more debug prints to use the new GT based scheme.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207050717.1833718-7-John.C.Harrison@Intel.com
|
|
Update a bunch more debug prints to use the new GT based scheme.
v2: Also change prints to use %pe for error values (MichalW).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207050717.1833718-6-John.C.Harrison@Intel.com
|
|
Update a bunch more debug prints to use the new GT based scheme.
v2: Also change prints to use %pe for error values (MichalW).
Fix a context leak on error due to a -- being too early.
Use the correct header file for the debug macros.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207050717.1833718-5-John.C.Harrison@Intel.com
|
|
Update a bunch more debug prints to use the new GT based scheme.
v2: Upgrade the no node found message to a warning on the grounds of
it being quite important if the error capture can't find any register
state information.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207050717.1833718-4-John.C.Harrison@Intel.com
|
|
Update a bunch more debug prints to use the new GT based scheme.
v2: Also change prints to use %pe for error values (MichalW).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207050717.1833718-3-John.C.Harrison@Intel.com
|
|
Update a bunch more debug prints to use the new GT based scheme.
v2: Also change prints to use %pe for error values (MichalW).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207050717.1833718-2-John.C.Harrison@Intel.com
|
|
INF_UNIT_LEVEL_CLKGATE is not replicated, but since it's not actually
used it can just be removed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230206165410.3056073-2-lucas.demarchi@intel.com
|
|
Register 0x9424 is not replicated on any platform, so it shouldn't be
declared with REG_MCR(). Declaring it with _MMIO() is basically
duplicate of the GEN7 version, so just remove the GEN8 and change all
the callers to use the right functions.
Old versions of the gen8 bspec page used to contain a table with MCR
registers, apparently implying 0x9400 - 0x94ff registers were
replicated. However that table went away and there is no information
related to the ranges for gen8 anymore. Moreover the current behavior of
the driver wouldn't do anything special for 0x9424 since there is no
equivalent table in intel_gt_mcr.c: the driver would just fallback to
intel_uncore_{read,write}(). Therefore, do not care about the possible
special case for gen8 and just use the register as non-MCR for all the
platforms.
One place doing read + write is also converted to intel_uncore_rmw().
v2: Reword commit message adding the justification wrt gen8
Fixes: a9e69428b1b4 ("drm/i915: Define MCR registers explicitly")
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Cc: Matt Atwood <matthew.s.atwood@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230206165410.3056073-1-lucas.demarchi@intel.com
|
|
The UNSLICE_UNIT_LEVEL_CLKGATE register programmed by this workaround
has 'BUS' style reset, indicating that it does not lose its value on
engine resets. Furthermore, this register is part of the GT forcewake
domain rather than the RENDER domain, so it should not be impacted by
RCS engine resets. As such, we should implement this on the GT
workaround list rather than an engine list.
Bspec: 19219
Fixes: 3551ff928744 ("drm/i915/gen11: Moving WAs to rcs_engine_wa_init()")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230201222831.608281-2-matthew.d.roper@intel.com
|
|
XEHPC_LNCFMISCCFGREG0 and XEHPC_L3SCRUB are both in MCR register ranges
on PVC (with HALFBSLICE and L3BANK replication respectively), so they
should be explicitly declared as MCR registers and use MCR-aware
workaround handlers.
The workarounds/tuning settings should still be applied properly on PVC
even without the MCR annotation, but readback verification on
CONFIG_DRM_I915_DEBUG_GEM builds could potentitally give false positive
"workaround lost on load" warnings on parts fused such that a unicast
read targets a terminated register instance.
Fixes: a9e69428b1b4 ("drm/i915: Define MCR registers explicitly")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230201222831.608281-1-matthew.d.roper@intel.com
|
|
Annotate intel_gt_mcr_lock() and intel_gt_mcr_unlock() to fix sparse
warnings:
drivers/gpu/drm/i915/gt/intel_gt_mcr.c:397:9: warning: context imbalance in 'intel_gt_mcr_lock' - wrong count at exit
drivers/gpu/drm/i915/gt/intel_gt_mcr.c:412:6: warning: context imbalance in 'intel_gt_mcr_unlock' - unexpected unlock
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207124026.2105442-1-jani.nikula@intel.com
|
|
Like we did it for GuC, introduce some helper print macros for
HuC to have unified format of messages that also include GT#.
While around improve some messages and use %pe if possible.
v2: update GSC/PXP timeout message
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230203085912.1963-1-michal.wajdeczko@intel.com
|
|
Stephen Rothwell reported htmldocs warnings:
Documentation/gpu/i915:64: drivers/gpu/drm/i915/gt/intel_workarounds.c:32: WARNING: Inline emphasis start-string without end-string.
Documentation/gpu/i915:64: drivers/gpu/drm/i915/gt/intel_workarounds.c:57: WARNING: Inline emphasis start-string without end-string.
Documentation/gpu/i915:64: drivers/gpu/drm/i915/gt/intel_workarounds.c:66: WARNING: Inline emphasis start-string without end-string.
Escape wildcards in *_ctx_workarounds_init(), *_gt_workarounds_init(), and
*_whitelist_build() to fix above warnings.
Link: https://lore.kernel.org/linux-next/20230203134622.0b6315b9@canb.auug.org.au/
Fixes: 0c3064cf33fbfa ("drm/i915/doc: Document where to implement register workarounds")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230203100215.31852-2-bagasdotme@gmail.com
|
|
In the manual fixup of the list_count_nodes() logic in
drivers/gpu/drm/i915/gt/intel_execlists_submission.c in the usb-next
branch, I missed that the print modifier was incorrect, resulting in
loads of build warnings on 32bit systems.
Fix this up by using "%su" instead of "%lu".
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 924fb3ec50f5 ("Merge 6.2-rc7 into usb-next")
Link: https://lore.kernel.org/r/20230206124422.2266892-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We need the USB fixes in here, and this resolves a merge conflict with
the i915 driver as reported in linux-next
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Just recently we switched over to new GuC oriented log macros but in
the meantime yet another message was added that we missed to update.
While around improve that new message by adding engine name and use
existing helpers to check for context state.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230131214413.1879-1-michal.wajdeczko@intel.com
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
Driver Changes:
Fixes/improvements/new stuff:
- Fix bcs default context on Meteorlake (Lucas De Marchi)
- GAM registers don't need to be re-applied on engine resets (Matt Roper)
- Correct implementation of Wa_18018781329 (Matt Roper)
- Avoid potential vm use-after-free (Rob Clark)
- GuC error capture fixes (John Harrison)
- Fix potential bit_17 double-free (Rob Clark)
- Don't complain about missing regs on MTL (John Harrison)
Future platform enablement:
- Convert PSS_MODE2 to multicast register (Gustavo Sousa)
- Move/adjust register definitions related to Wa_22011450934 (Matt Roper)
- Move LSC_CHICKEN_BIT* workarounds to correct function (Gustavo Sousa)
- Document where to implement register workarounds (Gustavo Sousa)
- Use uabi engines for the default engine map (Tvrtko Ursulin)
- Flush all tiles on test exit (Tvrtko Ursulin)
- Annotate a couple more workaround registers as MCR (Matt Roper)
Driver refactors:
- Add and use GuC oriented print macros (Michal Wajdeczko)
Miscellaneous:
- Fix intel_selftest_modify_policy argument types (Arnd Bergmann)
Backmerges:
Merge drm/drm-next into drm-intel-gt-next (for conflict resolution) (Tvrtko Ursulin)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y9pOsq7VKnq7rgnW@tursulin-desk
|
|
Use sysfs_emit() and sysfs_emit_at() in show() callback
as recommended by Documentation/filesystems/sysfs.rst
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230130131358.16800-1-nirmoy.das@intel.com
|
|
Check that we invalidate the TLB cache, the updated physical addresses
are immediately visible to the HW, and there is no retention of the old
physical address for concurrent HW access.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[ahajda: adjust to upstream driver, v2+]
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
[tursulin: Small indentation fix.]
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230130165058.1647414-1-andrzej.hajda@intel.com
|
|
Wa_22011802037 requires waiting for an engine-specific register to
clear. A missing entry for GSC engine in the register table is flagged
as a drm_err. The drm_err was originally intended to catch missing
register entries for newer engines, however, it was later found that the
WA is only required for 'legacy' engines. So just drop the drm_err.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230124231111.1786429-1-umesh.nerlige.ramappa@intel.com
|
|
Use new macros to have common prefix that also include GT#.
v2: pass gt to print_fw_ver
v3: prefer guc_dbg in suspend/resume logs
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-9-michal.wajdeczko@intel.com
|
|
Use new macros to have common prefix that also include GT#.
v2: improve few existing messages
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-8-michal.wajdeczko@intel.com
|
|
Use new macros to have common prefix that also include GT#.
v2: drop redundant GuC strings, minor improvements
v3: more message improvements
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-7-michal.wajdeczko@intel.com
|
|
Use new macros to have common prefix that also include GT#.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-6-michal.wajdeczko@intel.com
|
|
Use new macros to have common prefix that also include GT#.
v2: drop unused helpers
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-5-michal.wajdeczko@intel.com
|
|
Use new macros to have common prefix that also include GT#.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-4-michal.wajdeczko@intel.com
|
|
Use new macros to have common prefix that also include GT#.
v2: drop now redundant "GuC" word from the message
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-3-michal.wajdeczko@intel.com
|
|
While we do have GT oriented print macros, add few more GuC
specific to have common look and feel across all messages
related to the GuC and to avoid chasing the gt pointer.
We will use these macros shortly in upcoming patches.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-2-michal.wajdeczko@intel.com
|
|
Due to holidays we started -next with more -fixes in-flight than
usual, and people have been asking where they are. Backmerge to get
things better in sync.
Conflicts:
- Tiny conflict in drm_fbdev_generic.c between variable rename and
missing error handling that got added.
- Conflict in drm_fb_helper.c between the added call to vgaswitcheroo
in drm_fb_helper_single_fb_probe and a refactor patch that extracted
lots of helpers and incidentally removed the dev local variable.
Readd it to make things compile.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
The debugfs dump of requests was confused about what state requires
the execlist lock versus the GuC lock. There was also a bunch of
duplicated messy code between it and the error capture code.
So refactor the hung request search into a re-usable function. And
reduce the span of the execlist state lock to only the execlist
specific code paths. In order to do that, also move the report of hold
count (which is an execlist only concept) from the top level dump
function to the lower level execlist specific function. Also, move the
execlist specific code into the execlist source file.
v2: Rename some functions and move to more appropriate files (Daniele).
v3: Rename new execlist dump function (Daniele)
Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-4-John.C.Harrison@Intel.com
(cherry picked from commit a4be3dca53172d9d2091e4b474fb795c81ed3d6c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
When GuC support was added to error capture, the reference counting
around the request object was broken. Fix it up.
The context based search manages the spinlocking around the search
internally. So it needs to grab the reference count internally as
well. The execlist only request based search relies on external
locking, so it needs an external reference count but within the
spinlock not outside it.
The only other caller of the context based search is the code for
dumping engine state to debugfs. That code wasn't previously getting
an explicit reference at all as it does everything while holding the
execlist specific spinlock. So, that needs updaing as well as that
spinlock doesn't help when using GuC submission. Rather than trying to
conditionally get/put depending on submission model, just change it to
always do the get/put.
v2: Explicitly document adding an extra blank line in some dense code
(Andy Shevchenko). Fix multiple potential null pointer derefs in case
of no request found (some spotted by Tvrtko, but there was more!).
Also fix a leaked request in case of !started and another in
__guc_reset_context now that intel_context_find_active_request is
actually reference counting the returned request.
v3: Add a _get suffix to intel_context_find_active_request now that it
grabs a reference (Daniele).
v4: Split the intel_guc_find_hung_context change to a separate patch
and rename intel_context_find_active_request_get to
intel_context_get_active_request (Tvrtko).
v5: s/locking/reference counting/ in commit message (Tvrtko)
Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Fixes: 573ba126aef3 ("drm/i915/guc: Capture error state on context reset")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-3-John.C.Harrison@Intel.com
(cherry picked from commit 3700e353781e27f1bc7222f51f2cc36cbeb9b4ec)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
intel_guc_find_hung_context() was not acquiring the correct spinlock
before searching the request list. So fix that up. While at it, add
some extra whitespace padding for readability.
Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-2-John.C.Harrison@Intel.com
(cherry picked from commit d1c3717501bcf56536e8b8c1bdaf5cd5357f6bb2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
All production DG1 hardware has graphics stepping B0; there is no such
thing as C0. As such, we can simplify
IS_DG1_GRAPHICS_STEP(uncore->i915, STEP_A0, STEP_C0)
to just match DG1 in general.
Bspec: 44463
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127224313.4042331-4-matthew.d.roper@intel.com
|
|
Several post-DG1 platforms have been brought up now, so we're well past
the point where we usually drop the workarounds that are only applicable
to internal/pre-production hardware.
Production DG1 hardware always has a B0 stepping for both display and
GT.
Bspec: 44463
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127224313.4042331-3-matthew.d.roper@intel.com
|
|
Several post-TGL platforms have been brought up now, so we're well past
the point where we usually drop the workarounds that are only applicable
to internal/pre-production hardware.
Production TGL hardware always has display stepping C0 or later and GT
stepping B0 or later (this is true for both the original TGL and the U/Y
subplatform).
Bspec 44455
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127224313.4042331-2-matthew.d.roper@intel.com
|
|
The GuC specific register state entry in the error capture object was
just called 'capture'. Although the companion 'node' entry was called
'guc_capture_node'. Rename the base entry to be 'guc_capture' instead
so that it is a) more consistent and b) more obvious what it is.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-9-John.C.Harrison@Intel.com
|
|
For understanding bug reports, it can be useful to have an explicit
dmesg print when a reset notification is received from GuC. As opposed
to simply inferring that this happened from other messages.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-8-John.C.Harrison@Intel.com
|
|
Engine resets are supposed to never fail. But in the case when one
does (due to unknown reasons that normally come down to a missing
w/a), it is useful to get as much information out of the system as
possible. Given that the GuC intentionally dies on such a situation,
it is not possible to get a guilty context notification back. So do a
manual search instead. Given that GuC is dead, this is safe because
GuC won't be changing the engine state asynchronously.
v2: Change comment to be less alarming (Tvrtko)
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-7-John.C.Harrison@Intel.com
|
|
The debugfs dump of requests was confused about what state requires
the execlist lock versus the GuC lock. There was also a bunch of
duplicated messy code between it and the error capture code.
So refactor the hung request search into a re-usable function. And
reduce the span of the execlist state lock to only the execlist
specific code paths. In order to do that, also move the report of hold
count (which is an execlist only concept) from the top level dump
function to the lower level execlist specific function. Also, move the
execlist specific code into the execlist source file.
v2: Rename some functions and move to more appropriate files (Daniele).
v3: Rename new execlist dump function (Daniele)
Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-4-John.C.Harrison@Intel.com
|
|
When GuC support was added to error capture, the reference counting
around the request object was broken. Fix it up.
The context based search manages the spinlocking around the search
internally. So it needs to grab the reference count internally as
well. The execlist only request based search relies on external
locking, so it needs an external reference count but within the
spinlock not outside it.
The only other caller of the context based search is the code for
dumping engine state to debugfs. That code wasn't previously getting
an explicit reference at all as it does everything while holding the
execlist specific spinlock. So, that needs updaing as well as that
spinlock doesn't help when using GuC submission. Rather than trying to
conditionally get/put depending on submission model, just change it to
always do the get/put.
v2: Explicitly document adding an extra blank line in some dense code
(Andy Shevchenko). Fix multiple potential null pointer derefs in case
of no request found (some spotted by Tvrtko, but there was more!).
Also fix a leaked request in case of !started and another in
__guc_reset_context now that intel_context_find_active_request is
actually reference counting the returned request.
v3: Add a _get suffix to intel_context_find_active_request now that it
grabs a reference (Daniele).
v4: Split the intel_guc_find_hung_context change to a separate patch
and rename intel_context_find_active_request_get to
intel_context_get_active_request (Tvrtko).
v5: s/locking/reference counting/ in commit message (Tvrtko)
Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Fixes: 573ba126aef3 ("drm/i915/guc: Capture error state on context reset")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-3-John.C.Harrison@Intel.com
|
|
intel_guc_find_hung_context() was not acquiring the correct spinlock
before searching the request list. So fix that up. While at it, add
some extra whitespace padding for readability.
Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-2-John.C.Harrison@Intel.com
|
|
GAMSTLB_CTRL and GAMCNTRL_CTRL became multicast/replicated registers on
Xe_HP. They should be defined accordingly and use MCR-aware operations.
These registers have only been used for some dg2/xehpsdv workarounds, so
this fix is mostly just for consistency/future-proofing; even lacking
the MCR annotation, workarounds will always be properly applied in a
multicast manner on these platforms.
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Fixes: 58bc2453ab8a ("drm/i915: Define multicast registers as a new type")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125234159.3015385-3-matthew.d.roper@intel.com
|
|
Workaround Wa_18018781329 has applied to several recent Xe_HP-based
platforms. However there are some extra gotchas to implementing this
properly for MTL that we need to take into account:
* Due to the separation of media and render/compute into separate GTs,
this workaround needs to be implemented on each GT, not just the
primary GT. Since each class of register only exists on one of the
two GTs, we should program the appropriate registers on each GT.
* As with past Xe_HP platforms, the registers on the primary GT (Xe_LPG
IP) are multicast/replicated registers and should be handled with the
MCR-aware functions. However the registers on the media GT (Xe_LPM+
IP) are regular singleton registers and should _not_ use MCR
handling. We need to create separate register definitions for the
Xe_HP multicast form and the Xe_LPM+ singleton form and use each in
the appropriate place.
* Starting with MTL, workarounds documented by the hardware teams are
technically associated with IP versions/steppings rather than
top-level platforms. That means we should take care to check the
media IP version rather than the graphics IP version when deciding
whether the workaround is needed on the Xe_LPM+ media GT (in this
case the workaround applies to both IPs and the stepping bounds are
identical, but we should still write the code appropriately to set a
proper precedent for future workaround implementations).
* It's worth noting that the GSC register and the CCS register are
defined with the same MMIO offset (0xCF30). Since the CCS is only
relevant to the primary GT and the GSC is only relevant to the media
GT there isn't actually a clash here (the media GT automatically adds
the additional 0x380000 GSI offset). However there's currently a
glitch in the bspec where the CCS register doesn't show up at all and
the GSC register is listed as existing on both GTs. That's a known
documentation problem for several registers with shared GSC/CCS
offsets; rest assured that the CCS register really does still exist.
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Fixes: 41bb543f5598 ("drm/i915/mtl: Add initial gt workarounds")
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125234159.3015385-2-matthew.d.roper@intel.com
|
|
Register reset characteristics (i.e., whether the register maintains or
loses its value on engine reset) is an important factor that determines
which wa_list we want to add workarounds to. We recently found out that
the bspec documentation for the Xe_HP's "GAM" registers in the 0xC800 -
0xCFFF range was misleading; these registers do not actually lose their
value on engine resets as the documentation implied. This means there's
no need to re-apply workarounds touching these registers after a reset,
and the corresponding workarounds should be moved from the 'engine'
lists back to the 'gt' list.
v2:
- Don't add Wa_18018781329 to xehpsdv; the original condition didn't
include that platform. (Gustavo)
- Move the MTL code to the GT function as-is for now; we'll take care
of the additional fixes needed in a follow-up patch.
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Fixes: edf176f48d87 ("drm/i915/dg2: Move misplaced 'ctx' & 'gt' wa's to engine wa list")
Fixes: b2006061ae28 ("drm/i915/xehpsdv: Move render/compute engine reset domains related workarounds")
Fixes: 41bb543f5598 ("drm/i915/mtl: Add initial gt workarounds")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125234159.3015385-1-matthew.d.roper@intel.com
|
|
Move a handful of key enums to a new file intel_display_limits.h. These
are the enum types, and the MAX/NUM enumerations within them, that are
used in other headers. Otherwise, there's no common theme between them.
Replace intel_display.h include with intel_display_limit.h where
relevant, and add the intel_display.h include directly in the .c files
where needed.
Since intel_display.h is used almost everywhere in display/, include it
from intel_display_types.h to avoid massive changes across the
board. There are very few files that would need intel_display_types.h
but not intel_display.h so this is neglible, and further cleanup between
these headers can be left for the future.
Overall this change drops the direct and indirect dependencies on
intel_display.h from about 300 to about 100 compilation units, because
we can drop the include from i915_drv.h.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230116164644.1752009-1-jani.nikula@intel.com
|
|
Group the GMCH related members together.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/c10c8f16cb5d12041e009f788bd9810225d6962d.1673958757.git.jani.nikula@intel.com
|