author | Linus Torvalds <torvalds@linux-foundation.org> | 2021-07-01 12:53:43 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-07-01 12:53:43 -0700 |
commit | e058a84bfddc42ba356a2316f2cf1141974625c9 (patch) | |
tree | e6a02dd913e83f44ea9f5a779f9b9bd56d06a9e3 /Documentation/gpu/rfc/i915_gem_lmem.rst | |
parent | c288d9cd710433e5991d58a0764c4d08a933b871 (diff) | |
parent | 8a02ea42bc1d4c448caf1bab0e05899dad503f74 (diff) |
Merge tag 'drm-next-2021-07-01' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"Highlights:
- AMD enables two more GPUs, with resulting header files
- i915 has started to move to TTM for discrete GPU and enable DG1
discrete GPU support (not by default yet)
- new HyperV drm driver
- vmwgfx adds arm64 support
- TTM refactoring ongoing
- 16bpc display support for AMD hw
Otherwise it's just the usual insane amounts of work all over the
place in lots of drivers and the core, as mostly summarised below:
Core:
- mark AGP ioctls as legacy
- disable force probing for non-master clients
- HDR metadata property helpers
- HDMI infoframe signal colorimetry support
- remove drm_device.pdev pointer
- remove DRM_KMS_FB_HELPER config option
- remove drm_pci_alloc/free
- drm_err_*/drm_dbg_* helpers
- use drm driver names for fbdev
- leaked DMA handle fix
- 16bpc fixed point format fourcc
- add prefetching memcpy for WC
- Documentation fixes
aperture:
- add aperture ownership helpers
dp:
- aux fixes
- downstream 0 port handling
- use extended base receiver capability DPCD
- Rename DP_PSR_SELECTIVE_UPDATE to better match eDP spec
- mst: use khz as link rate during init
- VCPI fixes for StarTech hub
ttm:
- provide tt_shrink file via debugfs
- warn about freeing pinned BOs
- fix swapping error handling
- move page alignment into BO
- cleanup ttm_agp_backend
- add ttm_sys_manager
- don't override vm_ops
- ttm_bo_mmap removed
- make ttm_resource base of all managers
- remove VM_MIXEDMAP usage
panel:
- sysfs_emit support
- simple: runtime PM support
- simple: power up panel when reading EDID + caching
bridge:
- MHDP8546: HDCP support + DT bindings
- MHDP8546: Register DP AUX channel with userspace
- TI SN65DSI83 + SN65DSI84: add driver
- Sil8620: Fix module dependencies
- dw-hdmi: make CEC driver loading optional
- Ti-sn65dsi86: refclk fixes, subdrivers, runtime pm
- It66121: Add driver + DT bindings
- Adv7511: Support I2S IEC958 encoding
- Anx7625: fix power-on delay
- Nwl-dsi: Modesetting fixes; Cleanups
- lt6911: add missing MODULE_DEVICE_TABLE
- cdns: fix PM reference leak
hyperv:
- add new DRM driver for HyperV graphics
efifb:
- non-PCI device handling fixes
i915:
- refactor IP/device versioning
- XeLPD Display IP preparation work
- ADL-P enablement patches
- DG1 uAPI behind BROKEN
- disable mmap ioctl for discrete GPUs
- start enabling HuC loading for Gen12+
- major GuC backend rework for new platforms
- initial TTM support for Discrete GPUs
- locking rework for TTM prep
- use correct max source link rate for eDP
- %p4cc format printing
- GLK display fixes
- VLV DSI panel power fixes
- PSR2 disabled for RKL and ADL-S
- ACPI _DSM invalid access fixed
- DMC FW path abstraction
- ADL-S PCI ID update
- uAPI headers converted to kerneldoc
- initial LMEM support for DG1
- x86/gpu: add Jasperlake to gen11 early quirks
amdgpu:
- Aldebaran updates + initial SR-IOV
- new GPU: Beige Goby and Yellow Carp support
- more LTTPR display work
- Vangogh updates
- SDMA 5.x GCR fixes
- PCIe ASPM support
- Renoir TMZ enablement
- initial multiple eDP panel support
- use fdinfo to track devices/process info
- pin/unpin TTM fixes
- free resource on fence usage query
- fix fence calculation
- fix hotunplug/suspend issues
- GC/MM register access macro cleanup for SR-IOV
- W=1 fixes
- ACPI ATCS/ATIF handling rework
- 16bpc fixed point format support
- Initial smartshift support
- RV/PCO power tuning fixes
- new INFO query for additional vbios info
amdkfd:
- SR-IOV aldebaran support
- HMM SVM support
radeon:
- SMU regression fixes
- Oland flickering fix
vmwgfx:
- enable console with fbdev emulation
- fix cpu updates of coherent multisample surfaces
- remove reservation semaphore
- add initial SVGA3 support
- support arm64
msm:
- devcoredump support for display errors
- dpu/dsi: yaml bindings conversion
- mdp5: alpha/blend_mode/zpos support
- a6xx: cached coherent buffer support
- gpu iova fault improvement
- a660 support
rockchip:
- RK3036 win1 scaling support
- RK3066/3188 missing register support
- RK3036/3066/3126/3188 alpha support
mediatek:
- MT8167 HDMI support
- MT8183 DPI dual edge support
tegra:
- fixed YUV support/scaling on Tegra186+
ast:
- use pcim_iomap
- fix DP501 EDID
bochs:
- screen blanking support
etnaviv:
- export more GPU ID values to userspace
- add HWDB entry for GPU on i.MX8MP
- rework linear window calcs
exynos:
- pm runtime changes
imx:
- Annotate dma_fence critical section
- fix PRG modifiers after drmm conversion
- Add 8 pixel alignment fix for 1366x768
- fix YUV advertising
- add color properties
ingenic:
- IPU planes fix
panfrost:
- Mediatek MT8183 support + DT bindings
- export AFBC_FEATURES register to userspace
simpledrm:
- %pr for printing resources
nouveau:
- pin/unpin TTM fixes
qxl:
- unpin shadow BO
virtio:
- create dumb BOs as guest blob
vkms:
- drmm_universal_plane_alloc
- add XRGB plane composition
- overlay support"
* tag 'drm-next-2021-07-01' of git://anongit.freedesktop.org/drm/drm: (1570 commits)
drm/i915: Reinstate the mmap ioctl for some platforms
drm/i915/dsc: abstract helpers to get bigjoiner primary/secondary crtc
Revert "drm/msm/mdp5: provide dynamic bandwidth management"
drm/msm/mdp5: provide dynamic bandwidth management
drm/msm/mdp5: add perf blocks for holding fudge factors
drm/msm/mdp5: switch to standard zpos property
drm/msm/mdp5: add support for alpha/blend_mode properties
drm/msm/mdp5: use drm_plane_state for pixel blend mode
drm/msm/mdp5: use drm_plane_state for storing alpha value
drm/msm/mdp5: use drm atomic helpers to handle base drm plane state
drm/msm/dsi: do not enable PHYs when called for the slave DSI interface
drm/msm: Add debugfs to trigger shrinker
drm/msm/dpu: Avoid ABBA deadlock between IRQ modules
drm/msm: devcoredump iommu fault support
iommu/arm-smmu-qcom: Add stall support
drm/msm: Improve the a6xx page fault handler
iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback to get pagefault info
iommu/arm-smmu: Add support for driver IOMMU fault handlers
drm/msm: export hangcheck_period in debugfs
drm/msm/a6xx: add support for Adreno 660 GPU
...
Diffstat (limited to 'Documentation/gpu/rfc/i915_gem_lmem.rst')
-rw-r--r-- | Documentation/gpu/rfc/i915_gem_lmem.rst | 131 |
1 files changed, 131 insertions, 0 deletions
diff --git a/Documentation/gpu/rfc/i915_gem_lmem.rst b/Documentation/gpu/rfc/i915_gem_lmem.rst
new file mode 100644
index 000000000000..675ba8620d66
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_gem_lmem.rst
@@ -0,0 +1,131 @@
+=========================
+I915 DG1/LMEM RFC Section
+=========================
+
+Upstream plan
+=============
+For upstream the overall plan for landing all the DG1 stuff and turning it for
+real, with all the uAPI bits is:
+
+* Merge basic HW enabling of DG1(still without pciid)
+* Merge the uAPI bits behind special CONFIG_BROKEN(or so) flag
+        * At this point we can still make changes, but importantly this lets us
+          start running IGTs which can utilize local-memory in CI
+* Convert over to TTM, make sure it all keeps working. Some of the work items:
+        * TTM shrinker for discrete
+        * dma_resv_lockitem for full dma_resv_lock, i.e. not just trylock
+        * Use TTM CPU pagefault handler
+        * Route shmem backend over to TTM SYSTEM for discrete
+        * TTM purgeable object support
+        * Move i915 buddy allocator over to TTM
+        * MMAP ioctl mode(see `I915 MMAP`_)
+        * SET/GET ioctl caching(see `I915 SET/GET CACHING`_)
+* Send RFC(with mesa-dev on cc) for final sign off on the uAPI
+* Add pciid for DG1 and turn on uAPI for real
+
+New object placement and region query uAPI
+==========================================
+Starting from DG1 we need to give userspace the ability to allocate buffers from
+device local-memory. Currently the driver supports gem_create, which can place
+buffers in system memory via shmem, and the usual assortment of other
+interfaces, like dumb buffers and userptr.
+
+To support this new capability, while also providing a uAPI which will work
+beyond just DG1, we propose to offer three new bits of uAPI:
+
+DRM_I915_QUERY_MEMORY_REGIONS
+-----------------------------
+New query ID which allows userspace to discover the list of supported memory
+regions(like system-memory and local-memory) for a given device. We identify
+each region with a class and instance pair, which should be unique. The class
+here would be DEVICE or SYSTEM, and the instance would be zero, on platforms
+like DG1.
+
+Side note: The class/instance design is borrowed from our existing engine uAPI,
+where we describe every physical engine in terms of its class, and the
+particular instance, since we can have more than one per class.
+
+In the future we also want to expose more information which can further
+describe the capabilities of a region.
+
+.. kernel-doc:: include/uapi/drm/i915_drm.h
+   :functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions
+
+GEM_CREATE_EXT
+--------------
+New ioctl which is basically just gem_create but now allows userspace to provide
+a chain of possible extensions. Note that if we don't provide any extensions and
+set flags=0 then we get the exact same behaviour as gem_create.
+
+Side note: We also need to support PXP[1] in the near future, which is also
+applicable to integrated platforms, and adds its own gem_create_ext extension,
+which basically lets userspace mark a buffer as "protected".
+
+.. kernel-doc:: include/uapi/drm/i915_drm.h
+   :functions: drm_i915_gem_create_ext
+
+I915_GEM_CREATE_EXT_MEMORY_REGIONS
+----------------------------------
+Implemented as an extension for gem_create_ext, we would now allow userspace to
+optionally provide an immutable list of preferred placements at creation time,
+in priority order, for a given buffer object. For the placements we expect
+them each to use the class/instance encoding, as per the output of the regions
+query. Having the list in priority order will be useful in the future when
+placing an object, say during eviction.
+
+.. kernel-doc:: include/uapi/drm/i915_drm.h
+   :functions: drm_i915_gem_create_ext_memory_regions
+
+One fair criticism here is that this seems a little over-engineered[2]. If we
+just consider DG1 then yes, a simple gem_create.flags or something is totally
+all that's needed to tell the kernel to allocate the buffer in local-memory or
+whatever. However looking to the future we need uAPI which can also support
+upcoming Xe HP multi-tile architecture in a sane way, where there can be
+multiple local-memory instances for a given device, and so using both class and
+instance in our uAPI to describe regions is desirable, although specifically
+for DG1 it's uninteresting, since we only have a single local-memory instance.
+
+Existing uAPI issues
+====================
+Some potential issues we still need to resolve.
+
+I915 MMAP
+---------
+In i915 there are multiple ways to MMAP a GEM object, including mapping the same
+object using different mapping types(WC vs WB), i.e. multiple active mmaps per
+object. TTM expects one MMAP at most for the lifetime of the object. If it
+turns out that we have to backpedal here, there might be some potential
+userspace fallout.
+
+I915 SET/GET CACHING
+--------------------
+In i915 we have set/get_caching ioctl. TTM doesn't let us change this, but
+DG1 doesn't support non-snooped pcie transactions, so we can just always
+allocate as WB for smem-only buffers. If/when our hw gains support for
+non-snooped pcie transactions then we must fix this mode at allocation time as
+a new GEM extension.
+
+This is related to the mmap problem, because in general (meaning, when we're
+not running on intel cpus) the cpu mmap must not, ever, be inconsistent with
+allocation mode.
+
+One possible idea is to let the kernel pick the mmap mode for userspace from the
+following table:
+
+smem-only: WB. Userspace does not need to call clflush.
+
+smem+lmem: We only ever allow a single mode, so simply allocate this as uncached
+memory, and always give userspace a WC mapping. GPU still does snooped access
+here(assuming we can't turn it off like on DG1), which is a bit inefficient.
+
+lmem only: always WC
+
+This means on discrete you only get a single mmap mode, all others must be
+rejected. That's probably going to be a new default mode or something like
+that.
+
+Links
+=====
+[1] https://patchwork.freedesktop.org/series/86798/
+
+[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791
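The mmap-mode table proposed in the "I915 SET/GET CACHING" section of the RFC text can be modelled in a few lines of C. The following is only a minimal sketch of that table, not driver code: every name in it (`mem_class`, `mem_region`, `mmap_mode`, `pick_mmap_mode`) is hypothetical and chosen for illustration, and the real uAPI structures in include/uapi/drm/i915_drm.h are deliberately not reproduced. It assumes a placement list given as class/instance pairs, as the RFC describes, and returns the single mode the kernel would hand to userspace.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are NOT the i915
 * uAPI types from include/uapi/drm/i915_drm.h. */
enum mem_class { MEM_CLASS_SYSTEM, MEM_CLASS_DEVICE };

struct mem_region {
	enum mem_class class_id; /* SYSTEM (smem) or DEVICE (lmem) */
	unsigned int instance;   /* zero on platforms like DG1 */
};

enum mmap_mode { MMAP_MODE_INVALID, MMAP_MODE_WB, MMAP_MODE_WC };

/* Pick the one allowed CPU mmap mode for an object from its placement
 * list, following the table in the RFC text:
 *   smem-only -> WB, smem+lmem -> WC, lmem-only -> WC. */
enum mmap_mode pick_mmap_mode(const struct mem_region *placements,
			      size_t count)
{
	int has_smem = 0, has_lmem = 0;
	size_t i;

	assert(placements != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (placements[i].class_id == MEM_CLASS_DEVICE)
			has_lmem = 1;
		else
			has_smem = 1;
	}

	if (!has_smem && !has_lmem)
		return MMAP_MODE_INVALID; /* empty list: nothing to map */
	if (has_lmem)
		return MMAP_MODE_WC;      /* lmem-only and smem+lmem */
	return MMAP_MODE_WB;              /* smem-only */
}
```

Because the mode depends only on which classes appear in the list, the priority order of the placements does not affect the result in this sketch. That is consistent with the RFC's point that a discrete object gets exactly one mmap mode and all others are rejected.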