summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/gt
AgeCommit message (Collapse)Author
2022-11-17drm/i915/gt: Use RC6 residency types as arguments to residency functionsAshutosh Dixit
Previously RC6 residency functions directly accepted RC6 residency register MMIO offsets (there are four RC6 residency registers). This worked but required an assumption on the residency register layout so was not future proof. Therefore change RC6 residency functions to accept RC6 residency types instead of register MMIO offsets. The knowledge of register offsets as well as ID to offset mapping is now maintained solely in intel_rc6 and can be tailored for different platforms and different register layouts as need arises. v2: Address review comments by Jani N - Change residency functions to accept RC6 residency types instead of register ID's - s/intel_rc6_print_rc5_res/intel_rc6_print_residency/ - Remove "const enum" in function arguments - Naming: intel_rc6_* for enum - Use INTEL_RC6_RES_MAX and other minor changes v3: Don't include intel_rc6_types.h in intel_rc6.h (Jani) Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Suggested-by: Jani Nikula <jani.nikula@linux.intel.com> Reported-by: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221114123348.3474216-5-badal.nilawar@intel.com
2022-11-17drm/i915/mtl: Modify CAGF functions for MTLBadal Nilawar
Update CAGF functions for MTL to get actual resolved frequency of 3D and SAMedia. v2: Update MTL_MIRROR_TARGET_WP1 position/formatting (MattR) Move MTL branches in cagf functions to top (MattR) Fix commit message (Andi) v3: Added comment about registers not needing forcewake for Gen12+ and returning 0 freq in RC6 v4: Use REG_FIELD_GET and uncore (Rodrigo) Bspec: 66300 Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221114123348.3474216-4-badal.nilawar@intel.com
2022-11-17drm/i915: Use GEN12_RPSTAT register for GT freqDon Hiatt
On GEN12+ use GEN12_RPSTAT register to get actual resolved GT freq. GEN12_RPSTAT does not require a forcewake and will return 0 freq if GT is in RC6. v2: - Fixed review comments(Ashutosh) - Added function intel_rps_read_rpstat_fw to read RPSTAT without forcewake, required especially for GEN6_RPSTAT1 (Ashutosh, Tvrtko) v3: - Updated commit title and message for more clarity (Ashutosh) - Replaced intel_rps_read_rpstat with direct read to GEN12_RPSTAT1 in read_cagf (Ashutosh) v4: Remove GEN12_CAGF_SHIFT and use REG_FIELD_GET (Rodrigo) Cc: Don Hiatt <donhiatt@gmail.com> Cc: Andi Shyti <andi.shyti@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221114123348.3474216-3-badal.nilawar@intel.com
2022-11-17drm/i915/rps: Prefer REG_FIELD_GET in intel_rps_get_cagfAshutosh Dixit
Instead of masks/shifts settle on REG_FIELD_GET as the standard way to extract reg fields. This allows future patches touching this code to also consistently use REG_FIELD_GET and friends. Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221114123348.3474216-2-badal.nilawar@intel.com
2022-11-16drm/i915/guc: add the GSC CS to the GuC capture listDaniele Ceraolo Spurio
For the GSC engine we only want to capture the instance regs. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221111001832.4144910-1-daniele.ceraolospurio@intel.com
2022-11-16drm/i915/selftests: add igt_vma_move_to_active_unlockedAndrzej Hajda
All calls to i915_vma_move_to_active are surrounded by vma lock and/or there are multiple local helpers for it in particular tests. Let's replace it by common helper. The patch should not introduce functional changes. Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221019215906.295296-3-andrzej.hajda@intel.com
2022-11-16drm/i915: call i915_request_await_object from _i915_vma_move_to_activeAndrzej Hajda
Since almost all calls to i915_vma_move_to_active are prepended with i915_request_await_object, let's call the latter from _i915_vma_move_to_active by default and add flag allowing bypassing it. Adjust all callers accordingly. The patch should not introduce functional changes. Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221019215906.295296-2-andrzej.hajda@intel.com
2022-11-16drm/i915: Update workaround documentationLucas De Marchi
There were several updates in the driver on how the workarounds are handled since its documentation was written. Update the documentation to reflect the current reality. v2: - Remove footnote that was wrongly referenced, adding back the reference in the correct paragraph. - Remove "Display workarounds" and just mention "display IP" under "Other" category since all of them are peppered around the driver. Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> # v1 Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221115192611.179981-1-lucas.demarchi@intel.com
2022-11-14Merge drm/drm-next into drm-intel-nextRodrigo Vivi
Catch up on 6.1-rc cycle in order to solve the intel_backlight conflict on linux-next. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-11-14drm/i915/guc: handle interrupts from media GuCDaniele Ceraolo Spurio
The render and media GuCs share the same interrupt enable register, so we can no longer disable interrupts when we disable communication for one of the GuCs as this would impact the other GuC. Instead, we keep the interrupts always enabled in HW and use a variable in the GuC structure to determine if we want to service the received interrupts or not. v2: use MTL_ prefix for reg definition (Matt) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-7-daniele.ceraolospurio@intel.com
2022-11-14drm/i915/guc: define media GT GuC send regsDaniele Ceraolo Spurio
The media GT shares the G-unit with the root GT, so a second set of communication registers is required for the media GuC. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-6-daniele.ceraolospurio@intel.com
2022-11-14drm/i915/mtl: Handle wopcm per-GT and limit calculations.Aravind Iddamsetty
With MTL standalone media architecture the wopcm layout has changed, with separate partitioning in WOPCM for the root GT GuC and the media GT GuC. The size of WOPCM is 4MB with the lower 2MB reserved for the media GT and the upper 2MB for the root GT. Given that MTL has GuC deprivilege, the WOPCM registers are pre-locked by the bios. Therefore, we can skip all the math for the partitioning and just limit ourselves to sanity-checking the values. v2: fix makefile file ordering (Jani) v3: drop XELPM_SAMEDIA_WOPCM_SIZE, check huc instead of VDBOX (John) v4: further clarify commit message, remove blank line (John) Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: John Harrison <john.c.harrison@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-5-daniele.ceraolospurio@intel.com
2022-11-14drm/i915/uc: use different ggtt pin offsets for uc loadsDaniele Ceraolo Spurio
Our current FW loading process is the same for all FWs: - Pin FW to GGTT at the start of the ggtt->uc_fw node - Load the FW - Unpin This worked because we didn't have a case where 2 FWs would be loaded on the same GGTT at the same time. On MTL, however, this can happen if both GTs are reset at the same time, so we can't pin everything in the same spot and we need to use separate offset. For simplicity, instead of calculating the exact required size, we reserve a 2MB slot for each fw. v2: fail fetch if FW is > 2MBs, improve comments (John) v3: more comment improvements (John) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <john.c.harrison@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-3-daniele.ceraolospurio@intel.com
2022-11-14drm/i915/huc: only load HuC on GTs that have VCS enginesDaniele Ceraolo Spurio
On MTL the primary GT doesn't have any media capabilities, so no video engines and no HuC. We must therefore skip the HuC fetch and load on that specific case. Given that other multi-GT platforms might have HuC on the primary GT, we can't just check for that and it is easier to instead check for the lack of VCS engines. Based on code from Aravind Iddamsetty v2: clarify which engine_mask is used for each GT and why (Tvrtko) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Cc: John Harrison <john.c.harrison@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-1-daniele.ceraolospurio@intel.com
2022-11-14drm/i915: Simplify internal helper function signatureTvrtko Ursulin
Since we are now storing the GT backpointer in the wa list we can drop the explicit struct intel_gt * argument to wa_list_apply. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221110124633.3135026-1-tvrtko.ursulin@linux.intel.com
2022-11-14drm/i915/mtl: Add Wa_14017073508 for SAMediaBadal Nilawar
This workaround is added for Media tile of MTL A step. It is to help pcode workaround which handles the hardware issue seen during package C2/C3 transitions due to RC6 entry/exit transitions on Media tile. As a part of workaround pcode expect kmd to send mailbox message "media busy" when components of Media tile are in use and "media idle" otherwise. As per workaround description gucrc need to be disabled so enabled host based RC for Media tile. v2: - Correct workaround id (Matt) - Fix review comments (Rodrigo) Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221103184559.2306481-1-badal.nilawar@intel.com
2022-11-11drm/i915/guc/slpc: Add selftest for slpc tile-tile interactionRiana Tauro
Run a workload on tiles simultaneously by requesting for RP0 frequency. Pcode can however limit the frequency being granted due to throttling reasons. This test checks if there is any throttling but does not fail if RP0 is not granted due to throttle reasons v2: Fix build error v3: Use IS_ERR_OR_NULL to check worker Addressed cosmetic review comments (Tvrtko) v4: do not skip test on media engines if gt type is GT_MEDIA. Use correct PERF_LIMIT_REASONS register for MTL (Vinay) Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221109112541.275021-2-riana.tauro@intel.com
2022-11-11drm/i915: stop including i915_irq.h from i915_trace.hJani Nikula
Turns out many of the files that need i915_reg.h get it implicitly via {display/intel_de.h, gt/intel_context.h} -> i915_trace.h -> i915_irq.h -> i915_reg.h. Since i915_trace.h doesn't actually need i915_irq.h, makes sense to drop it, but that requires adding quite a few new includes all over the place. Prefer including i915_reg.h where needed instead of adding another implicit include, because eventually we'll want to split up i915_reg.h and only include the specific registers at each place. Also some places actually needed i915_irq.h too. Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/6e78a2e0ac1bffaf5af3b5ccc21dff05e6518cef.1668008071.git.jani.nikula@intel.com
2022-11-11drm/i915: split out intel_display_reg_defs.hJani Nikula
Split out the display register helper macros to a separate file. For now, include it from i915_reg.h, but note that there are already files that don't need i915_reg.h, such as intel_audio.c. Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/3af47193ff5219b6d2cfe353b752ec4bb44de4f1.1668008071.git.jani.nikula@intel.com
2022-11-10drm/i915: Partial abandonment of legacy DRM logging macrosTvrtko Ursulin
Convert some usages of legacy DRM logging macros into versions which tell us on which device have the events occurred. v2: * Don't have struct drm_device as local. (Jani, Ville) v3: * Store gt, not i915, in workaround list. (John) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221109104633.2579245-1-tvrtko.ursulin@linux.intel.com
2022-11-07drm/i915/guc: don't hardcode BCS0 in guc_hang selftestDaniele Ceraolo Spurio
On MTL there are no BCS engines on the media GT, so we can't always use BCS0 in the test. There is no actual reason to use a BCS engine over an engine of a different class, so switch to using any available engine. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221102214310.2829310-1-daniele.ceraolospurio@intel.com
2022-11-07drm/i915/mtl: don't expose GSC command streamer to the userDaniele Ceraolo Spurio
There is no userspace user for this CS yet, we only need it for internal kernel ops (e.g. HuC, PXP), so don't expose it. v2: even if it's not exposed, rename the engine so it is easier to identify in the debug logs (Matt) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-6-daniele.ceraolospurio@intel.com
2022-11-07drm/i915/mtl: add GSC CS reset supportDaniele Ceraolo Spurio
The GSC CS has its own dedicated bit in the GDRST register. Bspec: 52549 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-5-daniele.ceraolospurio@intel.com
2022-11-07drm/i915/mtl: add GSC CS interrupt supportDaniele Ceraolo Spurio
The GSC CS re-uses the same interrupt bits that the GSC used in older platforms. This means that we can now have an engine interrupt coming out of OTHER_CLASS, so we need to handle that appropriately. v2: clean up the if statement for the engine irq (Tvrtko) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-4-daniele.ceraolospurio@intel.com
2022-11-07drm/i915/mtl: pass the GSC CS info to the GuCDaniele Ceraolo Spurio
We need to tell the GuC that the GSC CS is there. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-3-daniele.ceraolospurio@intel.com
2022-11-07drm/i915/mtl: add initial definitions for GSC CSDaniele Ceraolo Spurio
Starting on MTL, the GSC is no longer managed with direct MMIO access, but we instead have a dedicated command streamer for it. As a first step for adding support for this CS, add the required definitions. Note that, although it is now a CS, the GSC retains its old class:instance value (OTHER_CLASS instance 6) Bspec: 65308, 45605 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-2-daniele.ceraolospurio@intel.com
2022-11-04drm/i915/guc: Don't deadlock busyness stats vs resetJohn Harrison
The engine busyness stats has a worker function to do things like 64bit extend the 32bit hardware counters. The GuC's reset prepare function flushes out this worker function to ensure no corruption happens during the reset. Unforunately, the worker function has an infinite wait for active resets to finish before doing its work. Thus a deadlock would occur if the worker function had actually started just as the reset starts. The function being used to lock the reset-in-progress mutex is called intel_gt_reset_trylock(). However, as noted it does not follow standard 'trylock' conventions and exit if already locked. So rename the current _trylock function to intel_gt_reset_lock_interruptible(), which is the behaviour it actually provides. In addition, add a new implementation of _trylock and call that from the busyness stats worker instead. v2: Rename existing trylock to interruptible rather than trying to preserve the existing (confusing) naming scheme (review comments from Tvrtko). Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221102192109.2492625-3-John.C.Harrison@Intel.com
2022-11-04drm/i915/guc: Properly initialise kernel contextsJohn Harrison
If a context has already been registered prior to first submission then context init code was not being called. The noticeable effect of that was the scheduling priority was left at zero (meaning super high priority) instead of being set to normal. This would occur with kernel contexts at start of day as they are manually pinned up front rather than on first submission. So add a call to initialise those when they are pinned. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221102192109.2492625-2-John.C.Harrison@Intel.com
2022-11-04drm/i915/guc: Remove excessive line feeds in state dumpsJohn Harrison
Some of the GuC state dump messages were adding extra line feeds. When printing via a DRM printer to dmesg, for example, that messes up the log formatting as it loses any prefixing from the printer. Given that the extra line feeds are just in the middle of random bits of GuC state, there isn't any real need for them. So just remove them completely. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221031220007.4176835-1-John.C.Harrison@Intel.com
2022-11-04Merge tag 'drm-intel-gt-next-2022-11-03' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next Driver Changes: - Fix for #7306: [Arc A380] white flickering when using arc as a secondary gpu (Matt A) - Add Wa_18017747507 for DG2 (Wayne) - Avoid spurious WARN on DG1 due to incorrect cache_dirty flag (Niranjana, Matt A) - Corrections to CS timestamp support for Gen5 and earlier (Ville) - Fix a build error used with clang compiler on hwmon (GG) - Improvements to LMEM handling with RPM (Anshuman, Matt A) - Cleanups in dmabuf code (Mike) - Selftest improvements (Matt A) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Y2N11wu175p6qeEN@jlahtine-mobl.ger.corp.intel.com
2022-11-02drm/i915/selftests: Run the perf MI_BB tests on gen4/5Ville Syrjälä
Now that we know the ring timestamp frequency on gen4/5 we can run the perf tests that depend on sampling the timestamp. On g4x/ilk we must read the udw of the 64bit timestamp register. Details in {g4x,gen5)_read_clock_frequency(). When executing the read via the CS i965 doesn't seem to need the double read trick that CPU mmio reads need. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-7-ville.syrjala@linux.intel.com Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2022-11-02drm/i915/selftests: Test RING_TIMESTAMP on gen4/5Ville Syrjälä
Now that we actually know the cs timestamp frequency on gen4/5 let's run the corresponding test. On g4x/ilk we must read the udw of the 64bit timestamp register. Details in {g4x,gen5)_read_clock_frequency(). The one extra caveat is that on i965 (or at least CL, don't recall if I ever tested on BW) we must read the register twice to get an up to date value. For some unknown reason the first read tends to return a stale value. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-6-ville.syrjala@linux.intel.com Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2022-11-02drm/i915/selftests: Run MI_BB perf selftests on SNBVille Syrjälä
SNB does have the RING_TIMESTAMP register on the RCS engine. Run the MI_BB perf tests on it. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-5-ville.syrjala@linux.intel.com Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2022-11-02drm/i915: Fix cs timestamp frequency for cl/bwVille Syrjälä
Despite what the spec says the TIMESTAMP register seems to tick once every hrawclk (confirmed on i965gm and g35). v2: Rebase Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-4-ville.syrjala@linux.intel.com Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2022-11-02drm/i915: Stop claiming cs timestamp frquency on gen2/3Ville Syrjälä
Gen2/3 have no TIMESTAMP registers to sample so no point in thinking we have any frequency for it either. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-3-ville.syrjala@linux.intel.com Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2022-11-02drm/i915: Fix cs timestamp frequency for ctg/elk/ilkVille Syrjälä
On ilk the UDW of TIMESTAMP increments every 1000 ns, LDW is mbz. In order to represent that we'd need 52 bits, but we only have 32 bits. Even worse most things want to only deal with 32 bits of timestamp. So let's just set up the timestamp frequency as if we only had the UDW. On ctg/elk 63:20 of TIMESTAMP increments every 1/4 ns, 19:0 are mbz. To make life simpler let's ignore the LDW and set up timestamp frequency based on the UDW only (increments every 1024 ns). v2: Rebase Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-2-ville.syrjala@linux.intel.com Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2022-11-01drm/i915/dg2: Introduce Wa_18017747507Wayne Boyer
WA 18017747507 applies to all DG2 skus. BSpec: 56035, 46121, 68173 Signed-off-by: Wayne Boyer <wayne.boyer@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221031131509.3411195-1-wayne.boyer@intel.com
2022-11-01Merge tag 'drm-intel-next-2022-10-28' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next - Hotplug code clean-up and organization (Jani, Gustavo) - More VBT specific code clean-up, doc, organization, and improvements (Ville) - More MTL enabling work (Matt, RK, Anusha, Jose) - FBC related clean-ups and improvements (Ville) - Removing unused sw_fence_await_reservation (Niranjana) - Big chunch of display house clean-up (Ville) - Many Watermark fixes and clean-ups (Ville) - Fix device info for devices without display (Jani) - Fix TC port PLLs after readout (Ville) - DPLL ID clean-ups (Ville) - Prep work for finishing (de)gamma readout (Ville) - PSR fixes and improvements (Jouni, Jose) - Reject excessive dotclocks early (Ville) - DRRS related improvements (Ville) - Simplify uncore register updates (Andrzej) - Fix simulated GPU reset wrt. encoder HW readout (Imre) - Add a ADL-P workaround (Jose) - Fix clear mask in GEN7_MISCCPCTL update (Andrzej) - Temporarily disable runtime_pm for discrete (Anshuman) - Improve fbdev debugs (Nirmoy) - Fix DP FRL link training status (Ankit) - Other small display fixes (Ankit, Suraj) - Allow panel fixed modes to have differing sync polarities (Ville) - Clean up crtc state flag checks (Ville) - Fix race conditions during DKL PHY accesses (Imre) - Prep-work for cdclock squash and crawl modes (Anusha) - ELD precompute and readout (Ville) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Y1wd6ZJ8LdJpCfZL@intel.com
2022-10-31drm/i915: Encapsulate lmem rpm stuff in intel_runtime_pmAnshuman Gupta
Runtime pm is not really per GT, therefore it make sense to move lmem_userfault_list, lmem_userfault_lock and userfault_wakeref from intel_gt to intel_runtime_pm structure, which is embedded to i915. No functional change. v2: - Fixes the code comment nit. [Matt Auld] Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221027092242.1476080-2-anshuman.gupta@intel.com
2022-10-28drm/i915/mtl: Add missing steering table terminatorsMatt Roper
The termination entries were missing for a couple of the recently-added MTL steering tables. Fixes: f32898c94a10 ("drm/i915/xelpg: Add multicast steering") Fixes: a7ec65fc7e83 ("drm/i915/xelpmp: Add multicast steering for media GT") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221028224022.964997-1-matthew.d.roper@intel.com
2022-10-27drm/i915/guc: Support OA when Wa_16011777198 is enabledVinay Belgaumkar
On DG2, a w/a resets RCS/CCS before it goes into RC6. This breaks OA since OA does not expect engine resets during its use. Fix it by disabling RC6. v2: (Ashutosh) - Bring back slpc_unset_param helper - Update commit msg - Use with_intel_runtime_pm helper for set/unset v3: (Ashutosh) - Just use intel_uc_uses_guc_rc Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-15-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Save/restore EU flex counters across resetUmesh Nerlige Ramappa
If a drm client is killed, then hw contexts used by the client are reset immediately. This reset clears the EU flex counter configuration. If an OA use case is running in parallel, it would start seeing zeroed eu counter values following the reset even if the drm client is restarted. Save/restore the EU flex counter config so that the EU counters can be monitored continuously across resets. v2: - Save/restore eu flex config only for gen12, as for pre-gen12, these are saved and restored in the context image. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-14-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Add Wa_1508761755:dg2Umesh Nerlige Ramappa
Disable Clock gating in EU when gathering the events so that EU events are not lost. v2: Fix checkpatch issues v3: User MCR helpers to write to MC reg v4: Indent correctly (checkpatch) Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-12-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Move gt-specific data from i915->perf to gt->perfUmesh Nerlige Ramappa
Make perf part of gt as the OAG buffer is specific to a gt. The refactor eventually simplifies programming the right OA buffer and the right HW registers when supporting multiple gts. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-8-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Determine gen12 oa ctx offset at runtimeUmesh Nerlige Ramappa
Some SKUs of same gen12 platform may have different oactxctrl offsets. For gen12, determine oactxctrl offsets at runtime. v2: (Lionel) - Move MI definitions to intel_gpu_commands.h - Ensure __find_reg_in_lri does read past context image size v3: (Ashutosh) - Drop unnecessary use of double underscores - fix find_reg_in_lri - Return error if oa context offset is U32_MAX - Error out if oa_ctx_ctrl_offset does not find offset v4: (Ashutosh) - Warn on odd MI LRI_LEN - Remove unnecessary check for valid_oactxctrl_offset - Drop valid_oactxctrl_offset macro v5: Drop unrelated comment Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-5-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Fix noa wait predication for DG2Umesh Nerlige Ramappa
Predication for batch buffer commands changed in XEHPSDV. MI_BATCH_BUFFER_START predicates based on MI_SET_PREDICATE_RESULT register. The MI_SET_PREDICATE_RESULT register can only be modified with MI_SET_PREDICATE command. When configured, the MI_SET_PREDICATE command sets MI_SET_PREDICATE_RESULT based on bit 0 of MI_PREDICATE_RESULT_2. Use this to configure predication in noa_wait. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-4-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Fix OA filtering logic for GuC modeUmesh Nerlige Ramappa
With GuC mode of submission, GuC is in control of defining the context id field that is part of the OA reports. To filter reports, UMD and KMD must know what sw context id was chosen by GuC. There is not interface between KMD and GuC to determine this, so read the upper-dword of EXECLIST_STATUS to filter/squash OA reports for the specific context. v2: Explain guc id stealing w.r.t OA use case Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-2-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915: Fix CFI violations in gt_sysfsNathan Chancellor
When booting with CONFIG_CFI_CLANG, there are numerous violations when accessing the files under /sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/gt0: $ cd /sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/gt0 $ grep . * id:0 punit_req_freq_mhz:350 rc6_enable:1 rc6_residency_ms:214934 rps_act_freq_mhz:1300 rps_boost_freq_mhz:1300 rps_cur_freq_mhz:350 rps_max_freq_mhz:1300 rps_min_freq_mhz:350 rps_RP0_freq_mhz:1300 rps_RP1_freq_mhz:350 rps_RPn_freq_mhz:350 throttle_reason_pl1:0 throttle_reason_pl2:0 throttle_reason_pl4:0 throttle_reason_prochot:0 throttle_reason_ratl:0 throttle_reason_status:0 throttle_reason_thermal:0 throttle_reason_vr_tdc:0 throttle_reason_vr_thermalert:0 $ sudo dmesg &| grep "CFI failure at" [ 214.595903] CFI failure at kobj_attr_show+0x19/0x30 (target: id_show+0x0/0x70 [i915]; expected type: 0xc527b809) [ 214.596064] CFI failure at kobj_attr_show+0x19/0x30 (target: punit_req_freq_mhz_show+0x0/0x40 [i915]; expected type: 0xc527b809) [ 214.596407] CFI failure at kobj_attr_show+0x19/0x30 (target: rc6_enable_show+0x0/0x40 [i915]; expected type: 0xc527b809) [ 214.596528] CFI failure at kobj_attr_show+0x19/0x30 (target: rc6_residency_ms_show+0x0/0x270 [i915]; expected type: 0xc527b809) [ 214.596682] CFI failure at kobj_attr_show+0x19/0x30 (target: act_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.596792] CFI failure at kobj_attr_show+0x19/0x30 (target: boost_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.596893] CFI failure at kobj_attr_show+0x19/0x30 (target: cur_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.596996] CFI failure at kobj_attr_show+0x19/0x30 (target: max_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597099] CFI failure at kobj_attr_show+0x19/0x30 (target: min_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597198] CFI failure at kobj_attr_show+0x19/0x30 (target: RP0_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597301] CFI failure at kobj_attr_show+0x19/0x30 (target: RP1_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597405] CFI failure at kobj_attr_show+0x19/0x30 (target: RPn_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597538] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.597701] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.597836] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.597952] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598071] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598177] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598307] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598439] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598542] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) With kCFI, indirect calls are validated against their expected type versus actual type and failures occur when the two types do not match. The ultimate issue is that these sysfs functions are expecting to be called via dev_attr_show() but they may also be called via kobj_attr_show(), as certain files are created under two different kobjects that have two different sysfs_ops in intel_gt_sysfs_register(), hence the warnings above. When accessing the gt_ files under /sys/devices/pci0000:00/0000:00:02.0/drm/card0, which are using the same sysfs functions, there are no violations, meaning the functions are being called with the proper type. To make everything work properly, adjust certain functions to match the type of the ->show() and ->store() members in 'struct kobj_attribute'. Add a macro to generate functions for that can be called via both dev_attr_{show,store}() or kobj_attr_{show,store}() so that they can be called through both kobject locations without violating kCFI and adjust the attribute groups to account for this. Link: https://github.com/ClangBuiltLinux/linux/issues/1716 Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221013205909.1282545-1-nathan@kernel.org
2022-10-26drm/i915/guc: Remove intel_context:number_committed_requests counterAlan Previn
With the introduction of the delayed disable-sched behavior, we use the GuC's xarray of valid guc-id's as a way to identify if new requests had been added to a context when the said context is being checked for closure. Additionally that prior change also closes the race for when a new incoming request fails to cancel the pending delayed disable-sched worker. With these two complementary checks, we see no more use for intel_context:guc_state:number_committed_requests. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221006225121.826257-3-alan.previn.teres.alexis@intel.com
2022-10-26drm/i915/guc: Delay disabling guc_id scheduling for better hysteresisMatthew Brost
Add a delay, configurable via debugfs (default 34ms), to disable scheduling of a context after the pin count goes to zero. Disable scheduling is a costly operation as it requires synchronizing with the GuC. So the idea is that a delay allows the user to resubmit something before doing this operation. This delay is only done if the context isn't closed and less than a given threshold (default is 3/4) of the guc_ids are in use. Alan Previn: Matt Brost first introduced this patch back in Oct 2021. However no real world workload with measured performance impact was available to prove the intended results. Today, this series is being republished in response to a real world workload that benefited greatly from it along with measured performance improvement. Workload description: 36 containers were created on a DG2 device where each container was performing a combination of 720p 3d game rendering and 30fps video encoding. The workload density was configured in a way that guaranteed each container to ALWAYS be able to render and encode no less than 30fps with a predefined maximum render + encode latency time. That means the totality of all 36 containers and their workloads were not saturating the engines to their max (in order to maintain just enough headrooom to meet the min fps and max latencies of incoming container submissions). Problem statement: It was observed that the CPU core processing the i915 soft IRQ work was experiencing severe load. Using tracelogs and an instrumentation patch to count specific i915 IRQ events, it was confirmed that the majority of the CPU cycles were caused by the gen11_other_irq_handler() -> guc_irq_handler() code path. The vast majority of the cycles was determined to be processing a specific G2H IRQ: i.e. INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE. These IRQs are sent by GuC in response to i915 KMD sending H2G requests: INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET. Those H2G requests are sent whenever a context goes idle so that we can unpin the context from GuC. The high CPU utilization % symptom was limiting density scaling. Root Cause Analysis: Because the incoming execution buffers were spread across 36 different containers (each with multiple contexts) but the system in totality was NOT saturated to the max, it was assumed that each context was constantly idling between submissions. This was causing a thrashing of unpinning contexts from GuC at one moment, followed quickly by repinning them due to incoming workload the very next moment. These event-pairs were being triggered across multiple contexts per container, across all containers at the rate of > 30 times per sec per context. Metrics: When running this workload without this patch, we measured an average of ~69K INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE events every 10 seconds or ~10 million times over ~25+ mins. With this patch, the count reduced to ~480 every 10 seconds or about ~28K over ~10 mins. The improvement observed is ~99% for the average counts per 10 seconds. Design awareness: Selftest impact. As temporary WA disable this feature for the selftests. Selftests are very timing sensitive and any change in timing can cause failure. A follow up patch will fixup the selftests to understand this delay. Design awareness: Race between guc_request_alloc and guc_context_close. If a context close is issued while there is a request submission in flight and a delayed schedule disable is pending, guc_context_close and guc_request_alloc will race to cancel the delayed disable. To close the race, make sure that guc_request_alloc waits for guc_context_close to finish running before checking any state. Design awareness: GT Reset event. If a gt reset is triggered, as preparation steps, add an additional step to ensure all contexts that have a pending delay-disable-schedule task be flushed of it. Move them directly into the closed state after cancelling the worker. This is okay because the existing flow flushes all yet-to-arrive G2H's dropping them anyway. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221006225121.826257-2-alan.previn.teres.alexis@intel.com