linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2023-12-19	drm/xe: Reformat xe_guc_regs.h	Matt Roper
	Reformat the GuC register header according to the same rules used by other register headers: - Register definitions are ordered by offset - Value of #define's start on column 49 - Lowercase used for hex values No functional change. This header has some things that aren't directly related to register definitions (e.g., number of doorbells, doorbell info structure, GuC interrupt vector layout, etc. These items have been moved to the bottom of the header. Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20230602235210.1314028-1-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Replace deprecated DRM_ERROR()	Gustavo Sousa
	DRM_ERROR() has been deprecated in favor of pr_err(). However, we should prefer to use xe_gt_err() or drm_err() whenever possible so we get gt- or device-specific output with the error message. v2: - Prefer drm_err() over pr_err(). (Matt, Jani) v3: - Prefer xe_gt_err() over drm_err() when possible. (Matt) v4: - Use the already available dev variable instead of xe->drm as parameter to drm_err(). (Matt) Cc: Jani Nikula <jani.nikula@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Haridhar Kalvala <haridhar.kalvala@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230601194419.1179609-1-gustavo.sousa@intel.com Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Add kerneldoc description of multi-tile devices	Matt Roper
	v2: - Fix doubled word. (Lucas) Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-32-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Reinstate media GT support	Matt Roper
	Now that tiles and GTs are handled separately and other prerequisite changes are in place, we're ready to re-enable the media GT. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-31-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Update query uapi to support standalone media	Matt Roper
	Now that a higher GT count can result from either multiple tiles (with one GT each) or an extra media GT within the root tile, we need to update the query code slightly to stop looking at tile_count. FIXME: As noted previously, we need to decide on a formal direction for exposing tiles and/or GTs to userspace. v2: - Drop num_gt() function in favor of stored xe->info.gt_count. (Brian) v3: - Keep XE_QUERY_GT_TYPE_REMOTE around for now. Userspace probably doesn't actually need this, and we may remove it in the future, but for now let's avoid changing uapi. (Brian) Cc: Brian Welty <brian.welty@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-30-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Allow GT looping and lookup on standalone media	Matt Roper
	Allow xe_device_get_gt() and for_each_gt() to operate as expected on platforms with standalone media. FIXME: We need to figure out a consistent ID scheme for GTs. This patch keeps the pre-existing behavior of 0/1 being the GT IDs for both PVC (multi-tile) and MTL (multi-GT), but depending on the direction we decide to go with uapi, we may change this in the future (e.g., to return 0/1 on PVC and 0/2 on MTL). Or if we decide we only need to expose tiles to userspace and not GTs, we may not even need ID numbers for the GTs anymore. v2: - Restructure a bit to make the assertions more clear. - Clarify in commit message that the goal here is to preserve existing behavior; UAPI-visible changes may be introduced in the future once we settle on what we really want. v3: - Store total GT count in xe_device for ease of lookup. (Brian) - s/(id__++)/(id__)++/ (Gustavo) Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Cc: Brian Welty <brian.welty@intel.com> Acked-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-29-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/tlb: Obtain forcewake when doing GGTT TLB invalidations	Matt Roper
	Updates to the GGTT can happen when there are no in-flight jobs keeping the hardware awake. If the GT is powered down when invalidation is requested, we will not be able to communicate with the GuC (or MMIO) and the invalidation request will go missing. Explicitly grab GT forcewake to ensure the GT and GuC are powered up during the TLB invalidation. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-28-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Invalidate TLB on all affected GTs during GGTT updates	Matt Roper
	The GGTT is part of the tile and is shared by the primary and media GTs on platforms with a standalone media architecture. However each of these GTs has its own TLBs caching the page table lookups, and each needs to be invalidated separately. Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-27-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe	Matt Roper
	The majority of xe_gt_irq_postinstall() is really focused on the hardware engine interrupts; other GT-related interrupts such as the GuC are enabled/disabled independently. Renaming the function and making it truly GT-specific will make it more clear what the intended focus is. Disabling/masking of other interrupts (such as GuC interrupts) is unnecessary since that has already happened during the irq_reset stage, and doing so will become harmful once the media GT is re-enabled since calls to xe_gt_irq_postinstall during media GT initialization would incorrectly disable the primary GT's GuC interrupts. Also, since this function is called from gt_fw_domain_init(), it's not necessary to also call it earlier during xe_irq_postinstall; just xe_irq_resume to handle runtime resume should be sufficient. v2: - Drop unnecessary !gt check. (Lucas) - Reword some comments about enable/unmask for clarity. (Lucas) Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-26-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/irq: Untangle postinstall functions	Matt Roper
	The xe_irq_postinstall() never actually gets called after installing the interrupt handler. This oversight seems to get papered over due to the fact that the (misnamed) xe_gt_irq_postinstall does more than it really should and gets called in the middle of the GT initialization. The callstack for postinstall is also a bit muddled with top-level device interrupt enablement happening within platform-specific functions called from the per-tile xe_gt_irq_postinstall() function. Clean this all up by adding the missing call to xe_irq_postinstall() after installing the interrupt handler and pull top-level irq enablement up to xe_irq_postinstall where we'd expect it to be. The xe_gt_irq_postinstall() function is still a bit misnamed here; an upcoming patch will refocus its purpose and rename it. v2: - Squash in patch to actually call xe_irq_postinstall() after installing the interrupt handler. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-25-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/irq: Ensure primary GuC won't clobber media GuC's interrupt mask	Matt Roper
	Although primary and media GuC share a single interrupt enable bit, they each have distinct bits in the mask register. Although we always enable interrupts for the primary GuC before the media GuC today (and never disable either of them), this might not always be the case in the future, so use a RMW when updating the mask register to ensure the other GuC's mask doesn't get clobbered. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-24-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/irq: Move ASLE backlight interrupt logic	Matt Roper
	Our only use of GUnit interrupts is to handle ASLE backlight operations that are reported as GUnit GSE interrupts. Move the enable/disable of these interrupts to a more sensible place, in the same area where we expect display interrupt code to be added by future patches. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-23-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Interrupts are delivered per-tile, not per-GT	Matt Roper
	IRQ delivery and handling needs to be handled on a per-tile basis. Note that this is true even for the "GT interrupts" relating to engines and GuCs --- the interrupts relating to both GTs get raised through a single set of registers in the tile's sgunit range. On true multi-tile platforms, interrupts on remote tiles are internally forwarded to the root tile; the first thing the top-level interrupt handler should do is consult the root tile's instance of DG1_MSTR_TILE_INTR to determine which tile(s) had interrupts. This register is also responsible for enabling/disabling top-level reporting of any interrupts to the OS. Although this register technically exists on all tiles, it should only be used on the root tile. The (mis)use of struct xe_gt as a target for MMIO operations in the driver makes the code somewhat confusing since we wind up needing a GT pointer to handle programming that's unrelated to the GT. To mitigate this confusion, all of the xe_gt structures used solely as an MMIO target in interrupt code are renamed to 'mmio' so that it's clear that the structure being passed does not necessarily relate to any specific GT (primary or media) that we might be dealing with interrupts for. Reworking the driver's MMIO handling to not be dependent on xe_gt is planned as a future patch series. Note that GT initialization code currently calls xe_gt_irq_postinstall() in an attempt to enable the HWE interrupts for the GT being initialized. Unfortunately xe_gt_irq_postinstall() doesn't really match its name and does a bunch of other stuff unrelated to the GT interrupts (such as enabling the top-level device interrupts). That will be addressed in future patches. v2: - Clarify commit message with explanation of why DG1_MSTR_TILE_INTR is only used on the root tile, even though it's an sgunit register that is technically present in each tile's MMIO space. (Aravind) - Also clarify that the xe_gt used as a target for MMIO operations may or may not relate to the GT we're dealing with for interrupts. (Lucas) Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-22-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Add media GT to tile	Matt Roper
	This media_gt pointer isn't actually allocated yet. Future patches will start hooking it up at appropriate places in the code, and then creation of the media GT will be added once those infrastructure changes are in place. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-20-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Allocate GT dynamically	Matt Roper
	In preparation for re-adding media GT support, switch the primary GT within the tile to a dynamic allocation. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-19-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE	Matt Roper
	Now that tiles and GTs are handled separately, extra_gts[] doesn't really provide any useful information that we can't just infer directly. The primary GT of the root tile and of the remote tiles behave the same way and don't need independent handling. When we re-add support for media GTs in a future patch, the presence of media can be determined from MEDIA_VER() (i.e., >= 13) and media's GSI offset handling is expected to remain constant for all forseeable future platforms, so it won't need to be provided in a definition structure either. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-18-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Drop vram_id	Matt Roper
	The VRAM ID is always the tile ID; there's no need to track it separately within a GT. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-17-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Clarify 'gt' retrieval for primary tile	Matt Roper
	There are a bunch of places in the driver where we need to perform non-GT MMIO against the platform's primary tile (display code, top-level interrupt enable/disable, driver initialization, etc.). Rename 'to_gt()' to 'xe_primary_mmio_gt()' to clarify that we're trying to get a primary MMIO handle for these top-level operations. In the future we need to move away from xe_gt as the target for MMIO operations (most of which are completely unrelated to GT). v2: - s/xe_primary_mmio_gt/xe_root_mmio_gt/ for more consistency with how we refer to tile 0. (Lucas) v3: - Tweak comment on xe_root_mmio_gt(). (Lucas) Acked-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-16-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Move migration from GT to tile	Matt Roper
	Migration primarily focuses on the memory associated with a tile, so it makes more sense to track this at the tile level (especially since the driver was already skipping migration operations on media GTs). Note that the blitter engine used to perform the migration always lives in the tile's primary GT today. In theory that could change if media GTs ever start including blitter engines in the future, but we can extend the design if/when that happens in the future. v2: - Fix kunit test build - Kerneldoc parameter name update v3: - Removed leftover prototype for removed function. (Gustavo) - Remove unrelated / unwanted error handling change. (Gustavo) Cc: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-15-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Memory allocations are tile-based, not GT-based	Matt Roper
	Since memory and address spaces are a tile concept rather than a GT concept, we need to plumb tile-based handling through lots of memory-related code. Note that one remaining shortcoming here that will need to be addressed before media GT support can be re-enabled is that although the address space is shared between a tile's GTs, each GT caches the PTEs independently in their own TLB and thus TLB invalidation should be handled at the GT level. v2: - Fix kunit test build. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-13-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Move VRAM from GT to tile	Matt Roper
	On platforms with VRAM, the VRAM is associated with the tile, not the GT. v2: - Unsquash the GGTT handling back into its own patch. - Fix kunit test build v3: - Tweak the "FIXME" comment to clarify that this function will be completely gone by the end of the series. (Lucas) v4: - Move a few changes that were supposed to be part of the GGTT patch back to that commit. (Gustavo) v5: - Kerneldoc parameter name fix. Cc: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-11-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Move GGTT from GT to tile	Matt Roper
	The GGTT exists at the tile level. When a tile contains multiple GTs, they share the same GGTT. v2: - Include some changes that were mis-squashed into the VRAM patch. (Gustavo) Cc: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-9-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Move register MMIO into xe_tile	Matt Roper
	Each tile has its own register region in the BAR, containing instances of all registers for the platform. In contrast, the multiple GTs within a tile share the same MMIO space; there's just a small subset of registers (the GSI registers) which have multiple copies at different offsets (0x0 for primary GT, 0x380000 for media GT). Move the register MMIO region size/pointers to the tile structure, leaving just the GSI offset information in the GT structure. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-7-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Add for_each_tile iterator	Matt Roper
	As we start splitting tile handling out from GT handling, we'll need to be able to iterate over tiles separately from GTs. This iterator will be used in upcoming patches. v2: - s/(id__++)/(id__)++/ (Gustavo) Cc: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-6-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Add backpointer from gt to tile	Matt Roper
	Rather than a backpointer to the xe_device, a GT should have a backpointer to its tile (which can then be used to lookup the device if necessary). The gt_to_xe() helper macro (which moves from xe_gt.h to xe_gt_types.h) can and should still be used to jump directly from an xe_gt to xe_device. v2: - Fix kunit test build - Move a couple changes to the previous patch. (Lucas) Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-4-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Introduce xe_tile	Matt Roper
	Create a new xe_tile structure to begin separating the concept of "tile" from "GT." A tile is effectively a complete GPU, and a GT is just one part of that. On platforms like MTL, there's only a single full GPU (tile) which has its IP blocks provided by two GTs. In contrast, a "multi-tile" platform like PVC is basically multiple complete GPUs packed behind a single PCI device. For now, just create xe_tile as a simple wrapper around xe_gt. The items in xe_gt that are truly tied to the tile rather than the GT will be moved in future patches. Support for multiple GTs per tile (i.e., the MTL standalone media case) will also be re-introduced in a future patch. v2: - Fix kunit test build - Move hunk from next patch to use local tile variable rather than direct xe->tiles[id] accesses. (Lucas) - Mention compute in kerneldoc. (Rodrigo) Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-3-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/mtl: Disable media GT	Matt Roper
	Xe incorrectly conflates the concept of 'tile' and 'GT.' Since MTL's media support is not yet functioning properly, let's just disable it completely for now while we fix the fundamental driver design. Support for media GTs on platforms like MTL will be re-added later. v2: - Drop some unrelated code cleanup that didn't belong in this patch. (Lucas) v3: - Drop unnecessary xe_gt.h include. (Gustavo) Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/vm: fix double list add	Matthew Auld
	It looks like the driver only wants to track one vma for each external object per vm. However it looks like bo_has_vm_references_locked() will ignore any vma that is marked as vma->destroyed (not actually destroyed yet). If we then mark our externally tracked vma as destroyed and then create a new vma for the same object and vm, we can have two externally tracked vma for the same object and vm. When the destroy actually happens it tries to move the external tracking to a different vma, but in this case it is already being tracked, leading to double list add errors. It should be safe to simply drop the destroyed check in bo_has_vm_references(), since the actual destroy will switch the external tracking to the next available vma. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/290 Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Replace PVC check by engine type check	José Roberto de Souza
	__emit_job_gen12_render_compute() masks some PIPE_CONTROL bits that do not exist in platforms without render engine. So here replacing the PVC check by something more generic that will support any future platforms without render engine. Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Rename GPU offset helper to reflect true usage	Michael J. Ruhl
	The _io_offset helper function is returning an offset into the GPU address space. Using the CPU address offset (io_) is not correct. Rename to reflect usage. Update to use GPU offset information. Update PT dma_offset to use the helper Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Size GT device memory correctly	Michael J. Ruhl
	The current method of sizing GT device memory is not quite right. Update the algorithm to use the relevant HW information and offsets to set up the sizing correctly. Update the stolen memory sizing to reflect the changes, and to be GT specific. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Simplify rebar sizing	Michael J. Ruhl
	"Right sizing" the PCI BAR is not necessary. If rebar is needed size to the maximum available. Preserve the force_vram_bar_size sizing. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Rework size helper to be a little more correct	Michael J. Ruhl
	The _total_vram_size helper is device based and is not complete. Teach the helper to be tile aware and add the ability to size DG1 correctly. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Prevent evicting for page tables	Maarten Lankhorst
	When creating page tables from xe_exec_ioctl, we may end up freeing memory we just validated. To be certain this does not happen, do not allow the current reservation to be evicted from the ioctl. Callchain: [ 109.008522] xe_bo_move_notify+0x5c/0xf0 [xe] [ 109.008548] xe_bo_move+0x90/0x510 [xe] [ 109.008573] ttm_bo_handle_move_mem+0xb7/0x170 [ttm] [ 109.008581] ttm_bo_swapout+0x15e/0x360 [ttm] [ 109.008586] ttm_device_swapout+0xc2/0x110 [ttm] [ 109.008592] ttm_global_swapout+0x47/0xc0 [ttm] [ 109.008598] ttm_tt_populate+0x7a/0x130 [ttm] [ 109.008603] ttm_bo_handle_move_mem+0x160/0x170 [ttm] [ 109.008609] ttm_bo_validate+0xe5/0x1d0 [ttm] [ 109.008614] ttm_bo_init_reserved+0xac/0x190 [ttm] [ 109.008620] __xe_bo_create_locked+0x153/0x260 [xe] [ 109.008645] xe_bo_create_locked_range+0x77/0x360 [xe] [ 109.008671] xe_bo_create_pin_map_at+0x33/0x1f0 [xe] [ 109.008695] xe_bo_create_pin_map+0x11/0x20 [xe] [ 109.008721] xe_pt_create+0x69/0xf0 [xe] [ 109.008749] xe_pt_stage_bind_entry+0x208/0x430 [xe] [ 109.008776] xe_pt_walk_range+0xe9/0x2a0 [xe] [ 109.008802] xe_pt_walk_range+0x223/0x2a0 [xe] [ 109.008828] xe_pt_walk_range+0x223/0x2a0 [xe] [ 109.008853] __xe_pt_bind_vma+0x28d/0xbd0 [xe] [ 109.008878] xe_vm_bind_vma+0xc7/0x2f0 [xe] [ 109.008904] xe_vm_rebind+0x72/0x160 [xe] [ 109.008930] xe_exec_ioctl+0x22b/0xa70 [xe] [ 109.008955] drm_ioctl_kernel+0xb9/0x150 [drm] [ 109.008972] drm_ioctl+0x210/0x430 [drm] [ 109.008988] __x64_sys_ioctl+0x85/0xb0 [ 109.008990] do_syscall_64+0x38/0x90 [ 109.008991] entry_SYSCALL_64_after_hwframe+0x72/0xdc Original warning: [ 5613.149126] WARNING: CPU: 3 PID: 45883 at drivers/gpu/drm/xe/xe_vm.c:504 xe_vm_unlock_dma_resv+0x43/0x50 [xe] ... [ 5613.226398] RIP: 0010:xe_vm_unlock_dma_resv+0x43/0x50 [xe] [ 5613.316098] Call Trace: [ 5613.318595] <TASK> [ 5613.320743] xe_exec_ioctl+0x383/0x8a0 [xe] [ 5613.325278] ? __is_insn_slot_addr+0x8e/0x110 [ 5613.329719] ? __is_insn_slot_addr+0x8e/0x110 [ 5613.334116] ? kernel_text_address+0x75/0xf0 [ 5613.338429] ? __pfx_stack_trace_consume_entry+0x10/0x10 [ 5613.343778] ? __kernel_text_address+0x9/0x40 [ 5613.348181] ? unwind_get_return_address+0x1a/0x30 [ 5613.353013] ? __pfx_stack_trace_consume_entry+0x10/0x10 [ 5613.358362] ? arch_stack_walk+0x99/0xf0 [ 5613.362329] ? rcu_read_lock_sched_held+0xb/0x70 [ 5613.366996] ? lock_acquire+0x287/0x2f0 [ 5613.370873] ? rcu_read_lock_sched_held+0xb/0x70 [ 5613.375530] ? rcu_read_lock_sched_held+0xb/0x70 [ 5613.380181] ? lock_release+0x225/0x2e0 [ 5613.384059] ? __pfx_xe_exec_ioctl+0x10/0x10 [xe] [ 5613.389092] drm_ioctl_kernel+0xc0/0x170 [ 5613.393068] drm_ioctl+0x1b7/0x490 [ 5613.396519] ? __pfx_xe_exec_ioctl+0x10/0x10 [xe] [ 5613.401547] ? lock_release+0x225/0x2e0 [ 5613.405432] __x64_sys_ioctl+0x8a/0xb0 [ 5613.409232] do_syscall_64+0x37/0x90 Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/239 Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: keep pulling mem_access_get further back	Matthew Auld
	Lockdep is unhappy about ggtt->lock -> runtime_pm, where it seems to think this can somehow get inverted. The ggtt->lock looks like a potentially sensitive driver lock, so likely a sensible move to never call the runtime_pm routines while holding it. Actually it looks like d3cold wants to grab this, so perhaps this can indeed deadlock. v2: - Don't forget about xe_gt_tlb_invalidation_vma(), which now needs explicit access_get. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: don't allocate under ct->lock	Matthew Auld
	Seems to be a sensitive lock, where ct->lock looks to be primed with fs_reclaim, so holding that and then allocating memory will cause lockdep to complain. We need to change the ordering wrt to grabbing the ct->lock and potentially grabbing the runtime_pm, since some of the runtime_pm routines can allocate memory (or at least that's what lockdep seems to suggest). Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/migrate: retain CCS aux state for vram -> vram	Matthew Auld
	There is no mention that migrate_copy() will skip copying the CCS aux state for all types of vram -> vram transfers. Currently we don't need such a facility but might be surprising if we ever do. v2: (Lucas): - s/lmem/vram/ in the commit message - Tidy up the code a bit; use one emit_copy_ccs() v3: - Reword the commit message Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/bo: further limit where CCS pages are needed	Matthew Auld
	No need to allocate extra pages for this if we know flat-ccs AUX state is not even possible, like for normal system memory objects. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Support copying of data between system memory bos	Thomas Hellström
	Modify the xe_migrate_copy() function somewhat to explicitly allow copying of data between two buffer objects including system memory buffer objects. Update the migrate test accordingly. v2: - Check that buffer object sizes match when copying (Matthew Auld) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Fix the migrate selftest for integrated GPUs	Thomas Hellström
	The TTM resource cursor was set up incorrectly. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/guc: Port Wa_14014475959 to xe_wa and fix it	Lucas De Marchi
	Port Wa_14014475959 to xe_wa fixing its condition. The workaround should only be applied on the primary GT, not on media. So just checking by MTL platform is not enough: checking GT is of the right type is also needed. Since the GRAPHICS_STEP() does checks the GT type, we could leave the first check as a platform one: it'd would be easier to understand and not go out of sync with the graphics_ip_map[] in drivers/gpu/drm/xe/xe_pci.c. However it also means that new platforms using the same IP wouldn't match. Prefer using the IP version. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-22-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/rtp: Also check gt type	Lucas De Marchi
	When running rules on MTL and beyond that have media as a standalone GT, the rule should only match if the gt passed as parameter match the version/range/stepping that the rule is checking. This allows workarounds affecting only the media GT to be applied only on that GT and vice-versa. For platforms before MTL, the GT will not be of media type, even if it includes media engines. Make sure to cover that case by checking if the platforma has standalone media. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-21-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/guc: Port Wa_1509372804 to xe_wa	Lucas De Marchi
	Port Wa_1509372804 to xe_wa so it's reported as active. v2: Match workaround database, starting from A0 stepping (Matt Roper) Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-20-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/guc: Port Wa_16015675438/Wa_18020744125 to xe_wa	Lucas De Marchi
	Wa_16015675438 and Wa_18020744125 apply to DG2 using the same action and conditions. Add both to the oob rules so they are both reported as active. Note that previously they were not checking by platform or IP version, hence making them not future-proof. Those workarounds should only be active in PVC and DG2, besides the check for "no render engine". v2: From current WA database, Wa_16015675438 applies to all DG2 subplatforms except G11. Migrate condition to use subplatform and remove G11 from the match (Matt Roper) Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-19-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/guc: Port Wa_22012727170/Wa_22012727685 to xe_wa	Lucas De Marchi
	Wa_22012727170 and Wa_22012727685 apply to DG2 using the same action and conditions. Add both to the oob rules so they are both reported as active. Do not Wa_22012727170 to PVC and MTL since only early A* steppings are affected. v2: Remove DG2_G10 from Wa_22012727685 to match current WA database (Matt Roper) v3: GRAPHICS_STEP(A0, FOREVER) can be left alone for DG2 as this means all steppings Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-18-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/guc: Port Wa_16011777198 to xe_wa	Lucas De Marchi
	Port Wa_16011777198 to xe_wa so it's reported as active. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-17-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/guc: Port Wa_14012197797/Wa_22011391025 to xe_wa	Lucas De Marchi
	Wa_14012197797 and Wa_22011391025 apply to DG2 using the same action. They apply to slightly different conditions. Add both to the oob rules so they are both reported as active. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-16-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/guc: Port Wa_16011759253 to xe_wa	Lucas De Marchi
	Port Wa_16011759253 to oob. Wa_22011383443, that has the same action, doesn't need to be ported as it targets early PVC steppings. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-15-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe/guc: Port Wa_22012773006 to xe_wa	Lucas De Marchi
	Let xe_guc.c start using XE_WA() for workarounds, starting from a simple one: Wa_22012773006. It's also changed to start with graphics version 12, since that is the first supported by xe. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-14-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19	drm/xe: Add support for OOB workarounds	Lucas De Marchi
	There are WAs that, due to their nature, cannot be applied from a central place like xe_wa.c. Those are peppered around the rest of the code, as needed. Now they have a new name: "out-of-band workarounds". These workarounds have their names and rules still grouped in xe_wa.c, inside the xe_wa_oob array, which is generated at compile time by xe_wa_oob.rules and the hostprog xe_gen_wa_oob. The code generation guarantees that the header xe_wa_oob.h contains the IDs for the workarounds that match the index in the table. This way the runtime checks that are spread throughout the code are simple tests against the bitmap saved during initialization. v2: Fix prev_name tracking not working when it's empty, i.e. when there is more than 1 continuation rule. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-13-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>