summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/xe
AgeCommit message (Collapse)Author
2025-06-05drm/xe: Rework eviction rejection of bound external bosThomas Hellström
For preempt_fence mode VM's we're rejecting eviction of shared bos during VM_BIND. However, since we do this in the move() callback, we're getting an eviction failure warning from TTM. The TTM callback intended for these things is eviction_valuable(). However, the latter doesn't pass in the struct ttm_operation_ctx needed to determine whether the caller needs this. Instead, attach the needed information to the vm under the vm->resv, until we've been able to update TTM to provide the needed information. And add sufficient lockdep checks to prevent misuse and races. v2: - Fix a copy-paste error in xe_vm_clear_validating() v3: - Fix kerneldoc errors. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: 0af944f0e308 ("drm/xe: Reject BO eviction if BO is bound to current VM") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250528164105.234718-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 9d5558649f68e2e84a87a909631b30e15ca0f8ec) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05drm/xe/vsec: fix CONFIG_INTEL_VSEC dependencyArnd Bergmann
The XE driver can be built with or without VSEC support, but fails to link as built-in if vsec is in a loadable module: x86_64-linux-ld: vmlinux.o: in function `xe_vsec_init': (.text+0x1e83e16): undefined reference to `intel_vsec_register' The normal fix for this is to add a 'depends on INTEL_VSEC || !INTEL_VSEC', forcing XE to be a loadable module as well, but that causes a circular dependency: symbol DRM_XE depends on INTEL_VSEC symbol INTEL_VSEC depends on X86_PLATFORM_DEVICES symbol X86_PLATFORM_DEVICES is selected by DRM_XE The problem here is selecting a symbol from another subsystem, so change that as well and rephrase the 'select' into the corresponding dependency. Since X86_PLATFORM_DEVICES is 'default y', there is no change to defconfig builds here. Fixes: 0c45e76fcc62 ("drm/xe/vsec: Support BMG devices") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250529172355.2395634-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit e4931f8be347ec5f19df4d6d33aea37145378c42) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05drm/xe: drop redundant conversion to boolRaag Jadav
The result of integer comparison already evaluates to bool. No need for explicit conversion. No functional impact. Fixes: 0e414bf7ad01 ("drm/xe: Expose PCIe link downgrade attributes") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505292205.MoljmkjQ-lkp@intel.com/ Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250529160937.490147-1-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 61761a6b57f2818983466d24aab60baab471ba21) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05drm/xe/hwmon: Move card reactive critical power under channel cardKarthik Poosa
Move power2/curr2_crit to channel 1 i.e power1/curr1_crit as this represents the entire card critical power/current. v2: Update the date of curr1_crit also in hwmon documentation. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: 345dadc4f68b ("drm/xe/hwmon: Add infra to support card power and energy attributes") Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://lore.kernel.org/r/20250529163458.2354509-3-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 25e963a09e059ffdb15c09cc79cfded855b43668) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05drm/xe/hwmon: Add support to manage power limits though mailboxKarthik Poosa
Add support to manage power limits using pcode mailbox commands for supported platforms. v2: - Address review comments. (Badal) - Use mailbox commands instead of registers to manage power limits for BMG. - Clamp the maximum power limit to GPU firmware default value. v3: - Clamp power limit in write also for platforms with mailbox support. v4: - Remove unnecessary debug prints. (Badal) v5: - Update description of variable pl1_on_boot to fix kernel-doc error. v6: - Improve commit message, refer to BIOS as GPU firmware. - Change macro READ_PL_FROM_BIOS to READ_PL_FROM_FW. - Rectify drm_warn to drm_info. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: e90f7a58e659 ("drm/xe/hwmon: Add HWMON support for BMG") Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://lore.kernel.org/r/20250529163458.2354509-2-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 7596d839f6228757fe17a810da2d1c5f3305078c) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05drm/xe/vm: move xe_svm_init() earlierMatthew Auld
In xe_vm_close_and_put() we need to be able to call xe_svm_fini(), however during vm creation we can call this on the error path, before having actually initialised the svm state, leading to various splats followed by a fatal NPD. Fixes: 6fd979c2f331 ("drm/xe: Add SVM init / close / fini to faulting VMs") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4967 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250514152424.149591-4-matthew.auld@intel.com (cherry picked from commit 4f296d77cf49fcb5f90b4674123ad7f3a0676165) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05drm/xe/vm: move rebind_work init earlierMatthew Auld
In xe_vm_close_and_put() we need to be able to call flush_work(rebind_work), however during vm creation we can call this on the error path, before having actually set up the worker, leading to a splat from flush_work(). It looks like we can simply move the worker init step earlier to fix this. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250514152424.149591-3-matthew.auld@intel.com (cherry picked from commit 96af397aa1a2d1032a6e28ff3f4bc0ab4be40e1d) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-04Merge tag 'pci-v6.16-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Print the actual delay time in pci_bridge_wait_for_secondary_bus() instead of assuming it was 1000ms (Wilfred Mallawa) - Revert 'iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices', which broke resume from system sleep on AMD platforms and has been fixed by other commits (Lukas Wunner) Resource management: - Remove mtip32xx use of pcim_iounmap_regions(), which is deprecated and unnecessary (Philipp Stanner) - Remove pcim_iounmap_regions() and pcim_request_region_exclusive() and related flags since all uses have been removed (Philipp Stanner) - Rework devres 'request' functions so they are no longer 'hybrid', i.e., their behavior no longer depends on whether pcim_enable_device or pci_enable_device() was used, and remove related code (Philipp Stanner) - Warn (not BUG()) about failure to assign optional resources (Ilpo Järvinen) Error handling: - Log the DPC Error Source ID only when it's actually valid (when ERR_FATAL or ERR_NONFATAL was received from a downstream device) and decode into bus/device/function (Bjorn Helgaas) - Determine AER log level once and save it so all related messages use the same level (Karolina Stolarek) - Use KERN_WARNING, not KERN_ERR, when logging PCIe Correctable Errors (Karolina Stolarek) - Ratelimit PCIe Correctable and Non-Fatal error logging, with sysfs controls on interval and burst count, to avoid flooding logs and RCU stall warnings (Jon Pan-Doh) Power management: - Increment PM usage counter when probing reset methods so we don't try to read config space of a powered-off device (Alex Williamson) - Set all devices to D0 during enumeration to ensure ACPI opregion is connected via _REG (Mario Limonciello) Power control: - Rename pwrctrl Kconfig symbols from 'PWRCTL' to 'PWRCTRL' to match the filename paths. Retain old deprecated symbols for compatibility, except for the pwrctrl slot driver (PCI_PWRCTRL_SLOT) (Johan Hovold) - When unregistering pwrctrl, cancel outstanding rescan work before cleaning up data structures to avoid use-after-free issues (Brian Norris) Bandwidth control: - Simplify link bandwidth controller by replacing the count of Link Bandwidth Management Status (LBMS) events with a PCI_LINK_LBMS_SEEN flag (Ilpo Järvinen) - Update the Link Speed after retraining, since the Link Speed may have changed (Ilpo Järvinen) PCIe native device hotplug: - Ignore Presence Detect Changed caused by DPC. pciehp already ignores Link Down/Up events caused by DPC, but on slots using in-band presence detect, DPC causes a spurious Presence Detect Changed event (Lukas Wunner) - Ignore Link Down/Up caused by Secondary Bus Reset. On hotplug ports using in-band presence detect, the reset causes a Presence Detect Changed event, which mistakenly caused teardown and re-enumeration of the device. Drivers may need to annotate code that resets their device (Lukas Wunner) Virtualization: - Add an ACS quirk for Loongson Root Ports that don't advertise ACS but don't allow peer-to-peer transactions between Root Ports; the quirk allows each Root Port to be in a separate IOMMU group (Huacai Chen) Endpoint framework: - For fixed-size BARs, retain both the actual size and the possibly larger size allocated to accommodate iATU alignment requirements (Jerome Brunet) - Simplify ctrl/SPAD space allocation and avoid allocating more space than needed (Jerome Brunet) - Correct MSI-X PBA offset calculations for DesignWare and Cadence endpoint controllers (Niklas Cassel) - Align the return value (number of interrupts) encoding for pci_epc_get_msi()/pci_epc_ops::get_msi() and pci_epc_get_msix()/pci_epc_ops::get_msix() (Niklas Cassel) - Align the nr_irqs parameter encoding for pci_epc_set_msi()/pci_epc_ops::set_msi() and pci_epc_set_msix()/pci_epc_ops::set_msix() (Niklas Cassel) Common host controller library: - Convert pci-host-common to a library so platforms that don't need native host controller drivers don't need to include these helper functions (Manivannan Sadhasivam) Apple PCIe controller driver: - Extract ECAM bridge creation helper from pci_host_common_probe() to separate driver-specific things like MSI from PCI things (Marc Zyngier) - Dynamically allocate RID-to_SID bitmap to prepare for SoCs with varying capabilities (Marc Zyngier) - Skip ports disabled in DT when setting up ports (Janne Grunau) - Add t6020 compatible string (Alyssa Rosenzweig) - Add T602x PCIe support (Hector Martin) - Directly set/clear INTx mask bits because T602x dropped the accessors that could do this without locking (Marc Zyngier) - Move port PHY registers to their own reg items to accommodate T602x, which moves them around; retain default offsets for existing DTs that lack phy%d entries with the reg offsets (Hector Martin) - Stop polling for core refclk, which doesn't work on T602x and the bootloader has already done anyway (Hector Martin) - Use gpiod_set_value_cansleep() when asserting PERST# in probe because we're allowed to sleep there (Hector Martin) Cadence PCIe controller driver: - Drop a runtime PM 'put' to resolve a runtime atomic count underflow (Hans Zhang) - Make the cadence core buildable as a module (Kishon Vijay Abraham I) - Add cdns_pcie_host_disable() and cdns_pcie_ep_disable() for use by loadable drivers when they are removed (Siddharth Vadapalli) Freescale i.MX6 PCIe controller driver: - Apply link training workaround only on IMX6Q, IMX6SX, IMX6SP (Richard Zhu) - Remove redundant dw_pcie_wait_for_link() from imx_pcie_start_link(); since the DWC core does this, imx6 only needs it when retraining for a faster link speed (Richard Zhu) - Toggle i.MX95 core reset to align with PHY powerup (Richard Zhu) - Set SYS_AUX_PWR_DET to work around i.MX95 ERR051624 erratum: in some cases, the controller can't exit 'L23 Ready' through Beacon or PERST# deassertion (Richard Zhu) - Clear GEN3_ZRXDC_NONCOMPL to work around i.MX95 ERR051586 erratum: controller can't meet 2.5 GT/s ZRX-DC timing when operating at 8 GT/s, causing timeouts in L1 (Richard Zhu) - Wait for i.MX95 PLL lock before enabling controller (Richard Zhu) - Save/restore i.MX95 LUT for suspend/resume (Richard Zhu) Mobiveil PCIe controller driver: - Return bool (not int) for link-up check in mobiveil_pab_ops.link_up() and layerscape-gen4, mobiveil (Hans Zhang) NVIDIA Tegra194 PCIe controller driver: - Create debugfs directory for 'aspm_state_cnt' only when CONFIG_PCIEASPM is enabled, since there are no other entries (Hans Zhang) Qualcomm PCIe controller driver: - Add OF support for parsing DT 'eq-presets-<N>gts' property for lane equalization presets (Krishna Chaitanya Chundru) - Read Maximum Link Width from the Link Capabilities register if DT lacks 'num-lanes' property (Krishna Chaitanya Chundru) - Add Physical Layer 64 GT/s Capability ID and register offsets for 8, 32, and 64 GT/s lane equalization registers (Krishna Chaitanya Chundru) - Add generic dwc support for configuring lane equalization presets (Krishna Chaitanya Chundru) - Add DT and driver support for PCIe on IPQ5018 SoC (Nitheesh Sekar) Renesas R-Car PCIe controller driver: - Describe endpoint BAR 4 as being fixed size (Jerome Brunet) - Document how to obtain R-Car V4H (r8a779g0) controller firmware (Yoshihiro Shimoda) Rockchip PCIe controller driver: - Reorder rockchip_pci_core_rsts because reset_control_bulk_deassert() deasserts in reverse order, to fix a link training regression (Jensen Huang) - Mark RK3399 as being capable of raising INTx interrupts (Niklas Cassel) Rockchip DesignWare PCIe controller driver: - Check only PCIE_LINKUP, not LTSSM status, to determine whether the link is up (Shawn Lin) - Increase N_FTS (used in L0s->L0 transitions) and enable ASPM L0s for Root Complex and Endpoint modes (Shawn Lin) - Hide the broken ATS Capability in rockchip_pcie_ep_init() instead of rockchip_pcie_ep_pre_init() so it stays hidden after PERST# resets non-sticky registers (Shawn Lin) - Call phy_power_off() before phy_exit() in rockchip_pcie_phy_deinit() (Diederik de Haas) Synopsys DesignWare PCIe controller driver: - Set PORT_LOGIC_LINK_WIDTH to one lane to make initial link training more robust; this will not affect the intended link width if all lanes are functional (Wenbin Yao) - Return bool (not int) for link-up check in dw_pcie_ops.link_up() and armada8k, dra7xx, dw-rockchip, exynos, histb, keembay, keystone, kirin, meson, qcom, qcom-ep, rcar_gen4, spear13xx, tegra194, uniphier, visconti (Hans Zhang) - Add debugfs support for exposing DWC device-specific PTM context (Manivannan Sadhasivam) TI J721E PCIe driver: - Make j721e buildable as a loadable and removable module (Siddharth Vadapalli) - Fix j721e host/endpoint dependencies that result in link failures in some configs (Arnd Bergmann) Device tree bindings: - Add qcom DT binding for 'global' interrupt (PCIe controller and link-specific events) for ipq8074, ipq8074-gen3, ipq6018, sa8775p, sc7280, sc8180x sdm845, sm8150, sm8250, sm8350 (Manivannan Sadhasivam) - Add qcom DT binding for 8 MSI SPI interrupts for msm8998, ipq8074, ipq8074-gen3, ipq6018 (Manivannan Sadhasivam) - Add dw rockchip DT binding for rk3576 and rk3562 (Kever Yang) - Correct indentation and style of examples in brcm,stb-pcie, cdns,cdns-pcie-ep, intel,keembay-pcie-ep, intel,keembay-pcie, microchip,pcie-host, rcar-pci-ep, rcar-pci-host, xilinx-versal-cpm (Krzysztof Kozlowski) - Convert Marvell EBU (dove, kirkwood, armada-370, armada-xp) and armada8k from text to schema DT bindings (Rob Herring) - Remove obsolete .txt DT bindings for content that has been moved to schemas (Rob Herring) - Add qcom DT binding for MHI registers in IPQ5332, IPQ6018, IPQ8074 and IPQ9574 (Varadarajan Narayanan) - Convert v3,v360epc-pci from text to DT schema binding (Rob Herring) - Change microchip,pcie-host DT binding to be 'dma-noncoherent' since PolarFire may be configured that way (Conor Dooley) Miscellaneous: - Drop 'pci' suffix from intel_mid_pci.c filename to match similar files (Andy Shevchenko) - All platforms with PCI have an MMU, so add PCI Kconfig dependency on MMU to simplify build testing and avoid inadvertent build regressions (Arnd Bergmann) - Update Krzysztof Wilczyński's email address in MAINTAINERS (Krzysztof Wilczyński) - Update Manivannan Sadhasivam's email address in MAINTAINERS (Manivannan Sadhasivam)" * tag 'pci-v6.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (147 commits) MAINTAINERS: Update Manivannan Sadhasivam email address PCI: j721e: Fix host/endpoint dependencies PCI: j721e: Add support to build as a loadable module PCI: cadence-ep: Introduce cdns_pcie_ep_disable() helper for cleanup PCI: cadence-host: Introduce cdns_pcie_host_disable() helper for cleanup PCI: cadence: Add support to build pcie-cadence library as a kernel module MAINTAINERS: Update Krzysztof Wilczyński email address PCI: Remove unnecessary linesplit in __pci_setup_bridge() PCI: WARN (not BUG()) when we fail to assign optional resources PCI: Remove unused pci_printk() PCI: qcom: Replace PERST# sleep time with proper macro PCI: dw-rockchip: Replace PERST# sleep time with proper macro PCI: host-common: Convert to library for host controller drivers PCI/ERR: Remove misleading TODO regarding kernel panic PCI: cadence: Remove duplicate message code definitions PCI: endpoint: Align pci_epc_set_msix(), pci_epc_ops::set_msix() nr_irqs encoding PCI: endpoint: Align pci_epc_set_msi(), pci_epc_ops::set_msi() nr_irqs encoding PCI: endpoint: Align pci_epc_get_msix(), pci_epc_ops::get_msix() return value encoding PCI: endpoint: Align pci_epc_get_msi(), pci_epc_ops::get_msi() return value encoding PCI: cadence-ep: Correct PBA offset in .set_msix() callback ...
2025-05-28Merge tag 'drm-next-2025-05-28' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm updates from Dave Airlie: "As part of building up nova-core/nova-drm pieces we've brought in some rust abstractions through this tree, aux bus being the main one, with devres changes also in the driver-core tree. Along with the drm core abstractions and enough nova-core/nova-drm to use them. This is still all stub work under construction, to build the nova driver upstream. The other big NVIDIA related one is nouveau adds support for Hopper/Blackwell GPUs, this required a new GSP firmware update to 570.144, and a bunch of rework in order to support multiple fw interfaces. There is also the introduction of an asahi uapi header file as a precursor to getting the real driver in later, but to unblock userspace mesa packages while the driver is trapped behind rust enablement. Otherwise it's the usual mixture of stuff all over, amdgpu, i915/xe, and msm being the main ones, and some changes to vsprintf. new drivers: - bring in the asahi uapi header standalone - nova-drm: stub driver rust dependencies (for nova-core): - auxiliary - bus abstractions - driver registration - sample driver - devres changes from driver-core - revocable changes core: - add Apple fourcc modifiers - add virtio capset definitions - extend EXPORT_SYNC_FILE for timeline syncobjs - convert to devm_platform_ioremap_resource - refactor shmem helper page pinning - DP powerup/down link helpers - extended %p4cc in vsprintf.c to support fourcc prints - change vsprintf %p4cn to %p4chR, remove %p4cn - Add drm_file_err function - IN_FORMATS_ASYNC property - move sitronix from tiny to their own subdir rust: - add drm core infrastructure rust abstractions (device/driver, ioctl, file, gem) dma-buf: - adjust sg handling to not cache map on attach - allow setting dma-device for import - Add a helper to sort and deduplicate dma_fence arrays docs: - updated drm scheduler docs - fbdev todo update - fb rendering - actual brightness ttm: - fix delayed destroy resv object bridge: - add kunit tests - convert tc358775 to atomic - convert drivers to devm_drm_bridge_alloc - convert rk3066_hdmi to bridge driver scheduler: - add kunit tests panel: - refcount panels to improve lifetime handling - Powertip PH128800T004-ZZA01 - NLT NL13676BC25-03F, Tianma TM070JDHG34-00 - Himax HX8279/HX8279-D DDIC - Visionox G2647FB105 - Sitronix ST7571 - ZOTAC rotation quirk vkms: - allow attaching more displays i915: - xe3lpd display updates - vrr refactor - intel_display struct conversions - xe2hpd memory type identification - add link rate/count to i915_display_info - cleanup VGA plane handling - refactor HDCP GSC - fix SLPC wait boosting reference counting - add 20ms delay to engine reset - fix fence release on early probe errors xe: - SRIOV updates - BMG PCI ID update - support separate firmware for each GT - SVM fix, prelim SVM multi-device work - export fan speed - temp disable d3cold on BMG - backup VRAM in PM notifier instead of suspend/freeze - update xe_ttm_access_memory to use GPU for non-visible access - fix guc_info debugfs for VFs - use copy_from_user instead of __copy_from_user - append PCIe gen5 limitations to xe_firmware document amdgpu: - DSC cleanup - DC Scaling updates - Fused I2C-over-AUX updates - DMUB updates - Use drm_file_err in amdgpu - Enforce isolation updates - Use new dma_fence helpers - USERQ fixes - Documentation updates - SR-IOV updates - RAS updates - PSP 12 cleanups - GC 9.5 updates - SMU 13.x updates - VCN / JPEG SR-IOV updates amdkfd: - Update error messages for SDMA - Userptr updates - XNACK fixes radeon: - CIK doorbell cleanup nouveau: - add support for NVIDIA r570 GSP firmware - enable Hopper/Blackwell support nova-core: - fix task list - register definition infrastructure - move firmware into own rust module - register auxiliary device for nova-drm nova-drm: - initial driver skeleton msm: - GPU: - ACD (adaptive clock distribution) for X1-85 - drop fictional address_space_size - improve GMU HFI response time out robustness - fix crash when throttling during boot - DPU: - use single CTL path for flushing on DPU 5.x+ - improve SSPP allocation code for better sharing - Enabled SmartDMA on SM8150, SC8180X, SC8280XP, SM8550 - Added SAR2130P support - Disabled DSC support on MSM8937, MSM8917, MSM8953, SDM660 - DP: - switch to new audio helpers - better LTTPR handling - DSI: - Added support for SA8775P - Added SAR2130P support - HDMI: - Switched to use new helpers for ACR data - Fixed old standing issue of HPD not working in some cases amdxdna: - add dma-buf support - allow empty command submits renesas: - add dma-buf support - add zpos, alpha, blend support panthor: - fail properly for NO_MMAP bos - add SET_LABEL ioctl - debugfs BO dumping support imagination: - update DT bindings - support TI AM68 GPU hibmc: - improve interrupt handling and HPD support virtio: - add panic handler support rockchip: - add RK3588 support - add DP AUX bus panel support ivpu: - add heartbeat based hangcheck mediatek: - prepares support for MT8195/99 HDMIv2/DDCv2 anx7625: - improve HPD tegra: - speed up firmware loading * tag 'drm-next-2025-05-28' of https://gitlab.freedesktop.org/drm/kernel: (1627 commits) drm/nouveau/tegra: Fix error pointer vs NULL return in nvkm_device_tegra_resource_addr() drm/xe: Default auto_link_downgrade status to false drm/xe/guc: Make creation of SLPC debugfs files conditional drm/i915/display: Add check for alloc_ordered_workqueue() and alloc_workqueue() drm/i915/dp_mst: Work around Thunderbolt sink disconnect after SINK_COUNT_ESI read drm/i915/ptl: Use everywhere the correct DDI port clock select mask drm/nouveau/kms: add support for GB20x drm/dp: add option to disable zero sized address only transactions. drm/nouveau: add support for GB20x drm/nouveau/gsp: add hal for fifo.chan.doorbell_handle drm/nouveau: add support for GB10x drm/nouveau/gf100-: track chan progress with non-WFI semaphore release drm/nouveau/nv50-: separate CHANNEL_GPFIFO handling out from CHANNEL_DMA drm/nouveau: add helper functions for allocating pinned/cpu-mapped bos drm/nouveau: add support for GH100 drm/nouveau: improve handling of 64-bit BARs drm/nouveau/gv100-: switch to volta semaphore methods drm/nouveau/gsp: support deeper page tables in COPY_SERVER_RESERVED_PDES drm/nouveau/gsp: init client VMMs with NV0080_CTRL_DMA_SET_PAGE_DIRECTORY drm/nouveau/gsp: fetch level shift and PDE from BAR2 VMM ...
2025-05-28drm/xe: Add missing documentation of rpa_freqRodrigo Vivi
While at it, already adjust the rpe_freq frequency, to highlight that both are calculated by PCODE at runtime. Fixes: c6aac2fa77a3 ("drm/xe: Introduce the RPa information") Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://lore.kernel.org/r/20250521165146.39616-4-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 39578fa40420fb11dbe4f42225a347e945d8fd0e) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-05-28drm/xe: Make xe_gt_freq part of the DocumentationRodrigo Vivi
The documentation was created with the creation of the component, however it has never been actually shown in the actual Documentation. While doing this, fixes the identation style, to avoid new warnings while building htmldocs. Fixes: bef52b5c7a19 ("drm/xe: Create a xe_gt_freq component for raw management and sysfs") Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250521165146.39616-3-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit af53f0fd99c3bbb3afd29f1612c9e88c5a92cc01) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-05-27Merge tag 'pm-6.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "Once again, the changes are dominated by cpufreq updates, but this time the majority of them are cpufreq core changes, mostly related to the introduction of policy locking guards and __free() usage, and fixes related to boost handling. Still, there is also a significant update of the intel_pstate driver making it register an energy model when running on a hybrid platform which is used for enabling energy-aware scheduling (EAS) if the driver operates in the passive mode (and schedutil is used as the cpufreq governor for all CPUs which is the passive mode default). There are some amd-pstate driver updates too, for a good measure, including the "Requested CPU Min frequency" BIOS option support and new online/offline callbacks. In the cpuidle space, the most significant change is the addition of a C1 demotion on/off sysfs knob to intel_idle which should help some users to configure their systems more precisely. There is also the conversion of the PSCI cpuidle driver to a faux device one and there are two small updates of cpuidle governors. Device power management is also modified quite a bit, especially the handling of devices with asynchronous suspend and resume enabled during system transitions. They are now going to be handled more asynchronously during suspend transitions and somewhat less aggressively during resume transitions. Apart from the above, the operating performance points (OPP) library is now going to use mutex locking guards and scope-based cleanup helpers and there is the usual bunch of assorted fixes and code cleanups. Specifics: - Fix potential division-by-zero error in em_compute_costs() (Yaxiong Tian) - Fix typos in energy model documentation and example driver code (Moon Hee Lee, Atul Kumar Pant) - Rearrange the energy model management code and add a new function for adjusting a CPU energy model after adjusting the capacity of the given CPU to it (Rafael Wysocki) - Refactor cpufreq_online(), add and use cpufreq policy locking guards, use __free() in policy reference counting, and clean up core cpufreq code on top of that (Rafael Wysocki) - Fix boost handling on CPU suspend/resume and sysfs updates (Viresh Kumar) - Fix des_perf clamping with max_perf in amd_pstate_update() (Dhananjay Ugwekar) - Add offline, online and suspend callbacks to the amd-pstate driver, rename and use the existing amd_pstate_epp callbacks in it (Dhananjay Ugwekar) - Add support for the "Requested CPU Min frequency" BIOS option to the amd-pstate driver (Dhananjay Ugwekar) - Reset amd-pstate driver mode after running selftests (Swapnil Sapkal) - Avoid shadowing ret in amd_pstate_ut_check_driver() (Nathan Chancellor) - Add helper for governor checks to the schedutil cpufreq governor and move cpufreq-specific EAS checks to cpufreq (Rafael Wysocki) - Populate the cpu_capacity sysfs entries from the intel_pstate driver after registering asym capacity support (Ricardo Neri) - Add support for enabling Energy-aware scheduling (EAS) to the intel_pstate driver when operating in the passive mode on a hybrid platform (Rafael Wysocki) - Drop redundant cpus_read_lock() from store_local_boost() in the cpufreq core (Seyediman Seyedarab) - Replace sscanf() with kstrtouint() in the cpufreq code and use a symbol instead of a raw number in it (Bowen Yu) - Add support for autonomous CPU performance state selection to the CPPC cpufreq driver (Lifeng Zheng) - OPP: Add dev_pm_opp_set_level() (Praveen Talari) - Introduce scope-based cleanup headers and mutex locking guards in OPP core (Viresh Kumar) - Switch OPP to use kmemdup_array() (Zhang Enpei) - Optimize bucket assignment when next_timer_ns equals KTIME_MAX in the menu cpuidle governor (Zhongqiu Han) - Convert the cpuidle PSCI driver to a faux device one (Sudeep Holla) - Add C1 demotion on/off sysfs knob to the intel_idle driver (Artem Bityutskiy) - Fix typos in two comments in the teo cpuidle governor (Atul Kumar Pant) - Fix denying of auto suspend in pm_suspend_timer_fn() (Charan Teja Kalla) - Move debug runtime PM attributes to runtime_attrs[] (Rafael Wysocki) - Add new devm_ functions for enabling runtime PM and runtime PM reference counting (Bence Csókás) - Remove size arguments from strscpy() calls in the hibernation core code (Thorsten Blum) - Adjust the handling of devices with asynchronous suspend enabled during system suspend and resume to start resuming them immediately after resuming their parents and to start suspending such a device immediately after suspending its first child (Rafael Wysocki) - Adjust messages printed during tasks freezing to avoid using pr_cont() (Andrew Sayers, Paul Menzel) - Clean up unnecessary usage of !! in pm_print_times_init() (Zihuan Zhang) - Add missing wakeup source attribute relax_count to sysfs and remove the space character at the end ofi the string produced by pm_show_wakelocks() (Zijun Hu) - Add configurable pm_test delay for hibernation (Zihuan Zhang) - Disable asynchronous suspend in ucsi_ccg_probe() to prevent the cypd4226 device on Tegra boards from suspending prematurely (Jon Hunter) - Unbreak printing PM debug messages during hibernation and clean up some related code (Rafael Wysocki) - Add a systemd service to run cpupower and change cpupower binding's Makefile to use -lcpupower (John B. Wyatt IV, Francesco Poli)" * tag 'pm-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (72 commits) cpufreq: CPPC: Add support for autonomous selection cpufreq: Update sscanf() to kstrtouint() cpufreq: Replace magic number OPP: switch to use kmemdup_array() PM: freezer: Rewrite restarting tasks log to remove stray *done.* PM: runtime: fix denying of auto suspend in pm_suspend_timer_fn() cpufreq: drop redundant cpus_read_lock() from store_local_boost() cpupower: do not install files to /etc/default/ cpupower: do not call systemctl at install time cpupower: do not write DESTDIR to cpupower.service PM: sleep: Introduce pm_sleep_transition_in_progress() cpufreq/amd-pstate: Avoid shadowing ret in amd_pstate_ut_check_driver() cpufreq: intel_pstate: Document hybrid processor support cpufreq: intel_pstate: EAS: Increase cost for CPUs using L3 cache cpufreq: intel_pstate: EAS support for hybrid platforms PM: EM: Introduce em_adjust_cpu_capacity() PM: EM: Move CPU capacity check to em_adjust_new_capacity() PM: EM: Documentation: Fix typos in example driver code cpufreq: Drop policy locking from cpufreq_policy_is_good_for_eas() PM: sleep: Introduce pm_suspend_in_progress() ...
2025-05-26Merge branches 'pm-runtime' and 'pm-sleep'Rafael J. Wysocki
Merge updates related to system sleep handling and runtime PM for 6.16-rc1: - Fix denying of auto suspend in pm_suspend_timer_fn() (Charan Teja Kalla). - Move debug runtime PM attributes to runtime_attrs[] (Rafael Wysocki). - Add new devm_ functions for enabling runtime PM and runtime PM reference counting (Bence Csókás). - Remove size arguments from strscpy() calls in the hibernation core code (Thorsten Blum). - Adjust the handling of devices with asynchronous suspend enabled during system suspend and resume to start resuming them immediately after resuming their parents and to start suspending such a device immediately after suspending its first child (Rafael Wysocki). - Adjust messages printed during tasks freezing to avoid using pr_cont() (Andrew Sayers, Paul Menzel). - Clean up unnecessary usage of !! in pm_print_times_init() (Zihuan Zhang). - Add missing wakeup source attribute relax_count to sysfs and remove the space character at the end ofi the string produced by pm_show_wakelocks() (Zijun Hu). - Add configurable pm_test delay for hibernation (Zihuan Zhang). - Disable asynchronous suspend in ucsi_ccg_probe() to prevent the cypd4226 device on Tegra boards from suspending prematurely (Jon Hunter). - Unbreak printing PM debug messages during hibernation and clean up some related code (Rafael Wysocki). * pm-runtime: PM: runtime: fix denying of auto suspend in pm_suspend_timer_fn() PM: sysfs: Move debug runtime PM attributes to runtime_attrs[] PM: runtime: Add new devm functions * pm-sleep: PM: freezer: Rewrite restarting tasks log to remove stray *done.* PM: sleep: Introduce pm_sleep_transition_in_progress() PM: sleep: Introduce pm_suspend_in_progress() PM: sleep: Print PM debug messages during hibernation ucsi_ccg: Disable async suspend in ucsi_ccg_probe() PM: hibernate: add configurable delay for pm_test PM: wakeup: Delete space in the end of string shown by pm_show_wakelocks() PM: wakeup: Add missing wakeup source attribute relax_count PM: sleep: Remove unnecessary !! PM: sleep: Use two lines for "Restarting..." / "done" messages PM: sleep: Make suspend of devices more asynchronous PM: sleep: Suspend async parents after suspending children PM: sleep: Resume children after resuming the parent PM: hibernate: Remove size arguments when calling strscpy()
2025-05-23drm/xe: Default auto_link_downgrade status to falseAradhya Bhatia
xe_pcode_read() can return back successfully without updating the variable 'val'. This can cause an arbitrary value to show up in the sysfs file. Allow the auto_link_downgrade_status to default to 0 to avoid any arbitrary value from coming up. Fixes: 0e414bf7ad01 ("drm/xe: Expose PCIe link downgrade attributes") Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com> Link: https://lore.kernel.org/r/20250516124355.4872-1-aradhya.bhatia@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit a7f87deac2295d11865048bcb9c2de369b52ed93) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-05-23drm/xe/guc: Make creation of SLPC debugfs files conditionalAradhya Bhatia
Platforms that do not support SLPC are exempted from the GuC PC support. The GuC PC does not get initialized, and neither do its BOs get created. This causes a problem because the GuC PC debugfs file is still being created. Whenever the file is attempted to read, it causes a NULL pointer dereference on the supposed BO of the GuC PC. So, make the creation of SLPC debugfs files conditional to when SLPC features are supported. Fixes: aaab5404b16f ("drm/xe: Introduce GuC PC debugfs") Suggested-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com> Link: https://lore.kernel.org/r/20250516141902.5614-1-aradhya.bhatia@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 17486cf3df5320752cc67ee8bcb2379d1b9de76c) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-05-22drm/xe: Use xe_mmio_read32() to read mtcfg registerShuicheng Lin
The mtcfg register is a 32-bit register and should therefore be accessed using xe_mmio_read32(). Other 3 changes per codestyle suggestion: " xe_mmio.c:83: CHECK: Alignment should match open parenthesis xe_mmio.c:131: CHECK: Comparison to NULL could be written "!xe->mmio.regs" xe_mmio.c:315: CHECK: line length of 103 exceeds 100 columns " Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://lore.kernel.org/r/20250513153010.3464767-1-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit d2662cf8f44a68deb6c76ad9f1d9f29dbf7ba601) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-22drm/xe/mocs: Check if all domains awakeTejas Upadhyay
Check if all domains are awake specially for LNCF regs Fixes: 298661cd9cea ("drm/xe: Fix MOCS debugfs LNCF readout") Improvements-suggested-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250506142300.1865783-1-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> (cherry picked from commit a383cf218ef8bb35d4c03958bd956573b65cf778) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14drm/xe: Add WA BB to capture active context utilizationUmesh Nerlige Ramappa
Context Timestamp (CTX_TIMESTAMP) in the LRC accumulates the run ticks of the context, but only gets updated when the context switches out. In order to check how long a context has been active before it switches out, two things are required: (1) Determine if the context is running: To do so, we program the WA BB to set an initial value for CTX_TIMESTAMP in the LRC. The value chosen is 1 since 0 is the initial value when the LRC is initialized. During a query, we just check for this value to determine if the context is active. If the context switched out, it would overwrite this location with the actual CTX_TIMESTAMP MMIO value. Note that WA BB runs as the last part of the context restore, so reusing this LRC location will not clobber anything. (2) Calculate the time that the context has been active for: The CTX_TIMESTAMP ticks only when the context is active. If a context is active, we just use the CTX_TIMESTAMP MMIO as the new value of utilization. While doing so, we need to read the CTX_TIMESTAMP MMIO for the specific engine instance. Since we do not know which instance the context is running on until it is scheduled, we also read the ENGINE_ID MMIO in the WA BB and store it in the PPHSWP. Using the above 2 instructions in a WA BB, capture active context utilization. v2: (Matt Brost) - This breaks TDR, fix it by saving the CTX_TIMESTAMP register "drm/xe: Save CTX_TIMESTAMP mmio value instead of LRC value" - Drop tile from LRC if using gt "drm/xe: Save the gt pointer in LRC and drop the tile" v3: - Remove helpers for bb_per_ctx_ptr (Matt) - Add define for context active value (Matt) - Use 64 bit CTX TIMESTAMP for platforms that support it. For platforms that don't, live with the rare race. (Matt, Lucas) - Convert engine id to hwe and get the MMIO value (Lucas) - Correct commit message on when WA BB runs (Lucas) v4: - s/GRAPHICS_VER(...)/xe->info.has_64bit_timestamp/ (Matt) - Drop support for active utilization on a VF (CI failure) - In xe_lrc_init ensure the lrc value is 0 to begin with (CI regression) v5: - Minor checkpatch fix - Squash into previous commit and make TDR use 32-bit time - Update code comment to match commit msg Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4532 Cc: <stable@vger.kernel.org> # v6.13+ Suggested-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250509161159.2173069-8-umesh.nerlige.ramappa@intel.com (cherry picked from commit 82b98cadb01f63cdb159e596ec06866d00f8e8c7) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14drm/xe: Save the gt pointer in lrc and drop the tileUmesh Nerlige Ramappa
Save the gt pointer in the lrc so that it can used for gt based helpers. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250509161159.2173069-7-umesh.nerlige.ramappa@intel.com (cherry picked from commit 741d3ef8b8b88fab2729ca89de1180e49bc9cef0) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14drm/xe: Save CTX_TIMESTAMP mmio value instead of LRC valueUmesh Nerlige Ramappa
For determining actual job execution time, save the current value of the CTX_TIMESTAMP register rather than the value saved in LRC since the current register value is the closest to the start time of the job. v2: Define MI_STORE_REGISTER_MEM to fix compile error v3: Place MI_STORE_REGISTER_MEM sorted by MI_INSTR (Lucas) Fixes: 65921374c48f ("drm/xe: Emit ctx timestamp copy in ring ops") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250509161159.2173069-6-umesh.nerlige.ramappa@intel.com (cherry picked from commit 38b14233e5deff51db8faec287b4acd227152246) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14drm/xe: Timeslice GPU on atomic SVM faultMatthew Brost
Ensure GPU can make forward progress on an atomic SVM GPU fault by giving the GPU a timeslice of 5ms v2: - Reduce timeslice to 5ms - Double timeslice on retry - Split out GPU SVM changes into independent patch v5: - Double timeslice in a few more places Fixes: 2f118c949160 ("drm/xe: Add SVM VRAM migration") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250512135500.1405019-5-matthew.brost@intel.com (cherry picked from commit a5d8d3be1dea8154edbbea481081469627665659) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14drm/xe: Strict migration policy for atomic SVM faultsMatthew Brost
Mixing GPU and CPU atomics does not work unless a strict migration policy of GPU atomics must be device memory. Enforce a policy of must be in VRAM with a retry loop of 3 attempts, if retry loop fails abort fault. Removing always_migrate_to_vram modparam as we now have real migration policy. v2: - Only retry migration on atomics - Drop alway migrate modparam v3: - Only set vram_only on DGFX (Himal) - Bail on get_pages failure if vram_only and retry count exceeded (Himal) - s/vram_only/devmem_only - Update xe_svm_range_is_valid to accept devmem_only argument v4: - Fix logic bug get_pages failure v5: - Fix commit message (Himal) - Mention removing always_migrate_to_vram in commit message (Lucas) - Fix xe_svm_range_is_valid to check for devmem pages - Bail on devmem_only && !migrate_devmem (Thomas) v6: - Add READ_ONCE barriers for opportunistic checks (Thomas) - Pair READ_ONCE with WRITE_ONCE (Thomas) v7: - Adjust comments (Thomas) Fixes: 2f118c949160 ("drm/xe: Add SVM VRAM migration") Cc: stable@vger.kernel.org Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250512135500.1405019-3-matthew.brost@intel.com (cherry picked from commit a9ac0fa455b050d03e3032501368048fb284d318) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14drm/xe/xe2hpg: Add Wa_22021007897Aradhya Bhatia
Add Wa_22021007897 for the Xe2_HPG (graphics version: 20.01) IP. It is a permanent workaround, and applicable on all the steppings. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com> Link: https://lore.kernel.org/r/20250512065004.2576-1-aradhya.bhatia@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit e5c13e2c505b73a8667ef9a0fd5cbd4227e483e6) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-13drm/xe: Fix the gem shrinker nameThomas Hellström
The xe buffer object shrinker name is visible in the <debugfs>/shrinker directory and most if not all other shinkers follow a naming convention that looks like <subsystem>-<driver>_<objects>:<unique> Follow the same convention for xe, changing the name to drm-xe_gem:<unique>. Other shrinkers typically use the device node for <unique> but since drm drivers typically don't have a single unique device- node, instead use the unique name in the drm device. Fixes: 00c8efc3180f ("drm/xe: Add a shrinker for xe bos") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://lore.kernel.org/r/20250508112931.3347-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 243bf99e2fe75edf8df1711c1377b6fc020b806c) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-13PM: sleep: Introduce pm_suspend_in_progress()Rafael J. Wysocki
Introduce pm_suspend_in_progress() to be used for checking if a system- wide suspend or resume transition is in progress, instead of comparing pm_suspend_target_state directly to PM_SUSPEND_ON, and use it where applicable. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/2020901.PYKUYFuaPT@rjwysocki.net
2025-05-10Merge tag 'drm-intel-next-2025-05-08' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next Non-display related: - Fix undefined reference to `intel_pxp_gsccs_is_ready_for_sessions' Display related: - More work towards display separation (Jani) - Stop writing VRR_CTL_IGN_MAX_SHIFT for MTL onwards (Jouni) - DSC checks for 3 engines (Ankit) - Add link rate and lane count to i915_display_info (Khaled) - PSR fixes and workaround for underrun on idle (Jouni) - LOBF enablement and ALMP fixes (Animesh) - Clean up VGA plane handling (Ville) - Use an intel_connector pointer everywhere (Imre) - Fix warning for coffeelake on SunrisePoint PCH (Jiajia) - Rework/Correction on minimum hblank calculation (Arun) - Dmesg clean up (Jani) - Add a couple of simple display workarounds (Ankit, Vinod) - Refactor HDCP GSC (Jani) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/aByyL3bEufPu79OM@intel.com
2025-05-08drm/xe: Add config control for svm flush workShuicheng Lin
Without CONFIG_DRM_XE_GPUSVM set, GPU SVM is not initialized thus below warning pops. Refine the flush work code to be controlled by the config to avoid below warning: " [ 453.132028] ------------[ cut here ]------------ [ 453.132527] WARNING: CPU: 9 PID: 4491 at kernel/workqueue.c:4205 __flush_work+0x379/0x3a0 [ 453.133355] Modules linked in: xe drm_ttm_helper ttm gpu_sched drm_buddy drm_suballoc_helper drm_gpuvm drm_exec [ 453.134352] CPU: 9 UID: 0 PID: 4491 Comm: xe_exec_mix_mod Tainted: G U W 6.15.0-rc3+ #7 PREEMPT(full) [ 453.135405] Tainted: [U]=USER, [W]=WARN ... [ 453.136921] RIP: 0010:__flush_work+0x379/0x3a0 [ 453.137417] Code: 8b 45 00 48 8b 55 08 89 c7 48 c1 e8 04 83 e7 08 83 e0 0f 83 cf 02 89 c6 48 0f ba 6d 00 03 e9 d5 fe ff ff 0f 0b e9 db fd ff ff <0f> 0b 45 31 e4 e9 d1 fd ff ff 0f 0b e9 03 ff ff ff 0f 0b e9 d6 fe [ 453.139250] RSP: 0018:ffffc90000c67b18 EFLAGS: 00010246 [ 453.139782] RAX: 0000000000000000 RBX: ffff888108a24000 RCX: 0000000000002000 [ 453.140521] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8881016d61c8 [ 453.141253] RBP: ffff8881016d61c8 R08: 0000000000000000 R09: 0000000000000000 [ 453.141985] R10: 0000000000000000 R11: 0000000008a24000 R12: 0000000000000001 [ 453.142709] R13: 0000000000000002 R14: 0000000000000000 R15: ffff888107db8c00 [ 453.143450] FS: 00007f44853d4c80(0000) GS:ffff8882f469b000(0000) knlGS:0000000000000000 [ 453.144276] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 453.144853] CR2: 00007f4487629228 CR3: 00000001016aa000 CR4: 00000000000406f0 [ 453.145594] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 453.146320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 453.147061] Call Trace: [ 453.147336] <TASK> [ 453.147579] ? tick_nohz_tick_stopped+0xd/0x30 [ 453.148067] ? xas_load+0x9/0xb0 [ 453.148435] ? xa_load+0x6f/0xb0 [ 453.148781] __xe_vm_bind_ioctl+0xbd5/0x1500 [xe] [ 453.149338] ? dev_printk_emit+0x48/0x70 [ 453.149762] ? _dev_printk+0x57/0x80 [ 453.150148] ? drm_ioctl+0x17c/0x440 [ 453.150544] ? __drm_dev_vprintk+0x36/0x90 [ 453.150983] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.151575] ? drm_ioctl_kernel+0x9f/0xf0 [ 453.151998] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.152560] drm_ioctl_kernel+0x9f/0xf0 [ 453.152968] drm_ioctl+0x20f/0x440 [ 453.153332] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.153893] ? ioctl_has_perm.constprop.0.isra.0+0xae/0x100 [ 453.154489] ? memory_bm_test_bit+0x5/0x60 [ 453.154935] xe_drm_ioctl+0x47/0x70 [xe] [ 453.155419] __x64_sys_ioctl+0x8d/0xc0 [ 453.155824] do_syscall_64+0x47/0x110 [ 453.156228] entry_SYSCALL_64_after_hwframe+0x76/0x7e " v2 (Matt): refine commit message to have more details add Fixes tag move the code to xe_svm.h which already have the config remove a blank line per codestyle suggestion Fixes: 63f6e480d115 ("drm/xe: Add SVM garbage collector") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250502170052.1787973-1-shuicheng.lin@intel.com (cherry picked from commit 9d80698bcd97a5ad1088bcbb055e73fd068895e2) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-08drm/xe: Release force wake first then runtime powerShuicheng Lin
xe_force_wake_get() is dependent on xe_pm_runtime_get(), so for the release path, xe_force_wake_put() should be called first then xe_pm_runtime_put(). Combine the error path and normal path together with goto. Fixes: 85d547608ef5 ("drm/xe/xe_gt_debugfs: Update handling of xe_force_wake_get return") Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250507022302.2187527-1-shuicheng.lin@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 432cd94efdca06296cc5e76d673546f58aa90ee1) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-08drm/xe/gsc: do not flush the GSC worker from the reset pathDaniele Ceraolo Spurio
The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM, while the GSC one isn't (and can't be as we need to do memory allocations in the gsc worker). Therefore, we can't flush the latter from the former. The reason why we had such a flush was to avoid interrupting either the GSC FW load or in progress GSC proxy operations. GSC proxy operations fall into 2 categories: 1) GSC proxy init: this only happens once immediately after GSC FW load and does not support being interrupted. The only way to recover from an interruption of the proxy init is to do an FLR and re-load the GSC. 2) GSC proxy request: this can happen in response to a request that the driver sends to the GSC. If this is interrupted, the GSC FW will timeout and the driver request will be failed, but overall the GSC will keep working fine. Flushing the work allowed us to avoid interruption in both cases (unless the hang came from the GSC engine itself, in which case we're toast anyway). However, a failure on a proxy request is tolerable if we're in a scenario where we're triggering a GT reset (i.e., something is already gone pretty wrong), so what we really need to avoid is interrupting the init flow, which we can do by polling on the register that reports when the proxy init is complete (as that ensure us that all the load and init operations have been completed). Note that during suspend we still want to do a flush of the worker to make sure it completes any operations involving the HW before the power is cut. v2: fix spelling in commit msg, rename waiter function (Julia) Fixes: dd0e89e5edc2 ("drm/xe/gsc: GSC FW load") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://lore.kernel.org/r/20250502155104.2201469-1-daniele.ceraolospurio@intel.com (cherry picked from commit 12370bfcc4f0bdf70279ec5b570eb298963422b5) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-08drm/xe/tests/mocs: Hold XE_FORCEWAKE_ALL for LNCF regsTejas Upadhyay
LNCF registers report wrong values when XE_FORCEWAKE_GT only is held. Holding XE_FORCEWAKE_ALL ensures correct operations on LNCF regs. V2(Himal): - Use xe_force_wake_ref_has_domain Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1999 Fixes: a6a4ea6d7d37 ("drm/xe: Add mocs kunit") Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250428082357.1730068-1-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> (cherry picked from commit 70a2585e582058e94fe4381a337be42dec800337) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-08drm/xe: Add page queue multiplierMatthew Brost
For an unknown reason the math to determine the PF queue size does is not correct - compute UMD applications are overflowing the PF queue which is fatal. A multippier of 8 fixes the problem. Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com> Link: https://lore.kernel.org/r/20250408155915.78770-1-matthew.brost@intel.com (cherry picked from commit 29582e0ea75c95668d168b12406e3c56cf5a73c4) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-07drm/xe: Expose PCIe link downgrade attributesRaag Jadav
Expose sysfs attributes for PCIe link downgrade capability and status. v2: Move from debugfs to sysfs (Lucas, Rodrigo, Badal) Rework macros and their naming (Rodrigo) v3: Use sysfs_create_files() (Riana) Fix checkpatch warning (Riana) v4: s/downspeed/downgrade (Lucas, Rodrigo, Riana) v5: Use PCIe Gen agnostic naming (Rodrigo) v6: s/pcie_gen/auto_link (Lucas) Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Link: https://lore.kernel.org/r/20250506054835.3395220-3-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-05-07drm/xe: Move xe_device_sysfs_init() to xe_device_probe()Raag Jadav
Since xe_device_sysfs_init() exposes device specific attributes, a better place for it is xe_device_probe(). Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Link: https://lore.kernel.org/r/20250506054835.3395220-2-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-05-07drm/xe: Release force wake first then runtime powerShuicheng Lin
xe_force_wake_get() is dependent on xe_pm_runtime_get(), so for the release path, xe_force_wake_put() should be called first then xe_pm_runtime_put(). Combine the error path and normal path together with goto. Fixes: 85d547608ef5 ("drm/xe/xe_gt_debugfs: Update handling of xe_force_wake_get return") Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250507022302.2187527-1-shuicheng.lin@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-05-07drm/xe: Add config control for svm flush workShuicheng Lin
Without CONFIG_DRM_XE_GPUSVM set, GPU SVM is not initialized thus below warning pops. Refine the flush work code to be controlled by the config to avoid below warning: " [ 453.132028] ------------[ cut here ]------------ [ 453.132527] WARNING: CPU: 9 PID: 4491 at kernel/workqueue.c:4205 __flush_work+0x379/0x3a0 [ 453.133355] Modules linked in: xe drm_ttm_helper ttm gpu_sched drm_buddy drm_suballoc_helper drm_gpuvm drm_exec [ 453.134352] CPU: 9 UID: 0 PID: 4491 Comm: xe_exec_mix_mod Tainted: G U W 6.15.0-rc3+ #7 PREEMPT(full) [ 453.135405] Tainted: [U]=USER, [W]=WARN ... [ 453.136921] RIP: 0010:__flush_work+0x379/0x3a0 [ 453.137417] Code: 8b 45 00 48 8b 55 08 89 c7 48 c1 e8 04 83 e7 08 83 e0 0f 83 cf 02 89 c6 48 0f ba 6d 00 03 e9 d5 fe ff ff 0f 0b e9 db fd ff ff <0f> 0b 45 31 e4 e9 d1 fd ff ff 0f 0b e9 03 ff ff ff 0f 0b e9 d6 fe [ 453.139250] RSP: 0018:ffffc90000c67b18 EFLAGS: 00010246 [ 453.139782] RAX: 0000000000000000 RBX: ffff888108a24000 RCX: 0000000000002000 [ 453.140521] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8881016d61c8 [ 453.141253] RBP: ffff8881016d61c8 R08: 0000000000000000 R09: 0000000000000000 [ 453.141985] R10: 0000000000000000 R11: 0000000008a24000 R12: 0000000000000001 [ 453.142709] R13: 0000000000000002 R14: 0000000000000000 R15: ffff888107db8c00 [ 453.143450] FS: 00007f44853d4c80(0000) GS:ffff8882f469b000(0000) knlGS:0000000000000000 [ 453.144276] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 453.144853] CR2: 00007f4487629228 CR3: 00000001016aa000 CR4: 00000000000406f0 [ 453.145594] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 453.146320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 453.147061] Call Trace: [ 453.147336] <TASK> [ 453.147579] ? tick_nohz_tick_stopped+0xd/0x30 [ 453.148067] ? xas_load+0x9/0xb0 [ 453.148435] ? xa_load+0x6f/0xb0 [ 453.148781] __xe_vm_bind_ioctl+0xbd5/0x1500 [xe] [ 453.149338] ? dev_printk_emit+0x48/0x70 [ 453.149762] ? _dev_printk+0x57/0x80 [ 453.150148] ? drm_ioctl+0x17c/0x440 [ 453.150544] ? __drm_dev_vprintk+0x36/0x90 [ 453.150983] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.151575] ? drm_ioctl_kernel+0x9f/0xf0 [ 453.151998] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.152560] drm_ioctl_kernel+0x9f/0xf0 [ 453.152968] drm_ioctl+0x20f/0x440 [ 453.153332] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.153893] ? ioctl_has_perm.constprop.0.isra.0+0xae/0x100 [ 453.154489] ? memory_bm_test_bit+0x5/0x60 [ 453.154935] xe_drm_ioctl+0x47/0x70 [xe] [ 453.155419] __x64_sys_ioctl+0x8d/0xc0 [ 453.155824] do_syscall_64+0x47/0x110 [ 453.156228] entry_SYSCALL_64_after_hwframe+0x76/0x7e " v2 (Matt): refine commit message to have more details add Fixes tag move the code to xe_svm.h which already have the config remove a blank line per codestyle suggestion Fixes: 63f6e480d115 ("drm/xe: Add SVM garbage collector") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250502170052.1787973-1-shuicheng.lin@intel.com
2025-05-07drm/xe: Use copy_from_user() instead of __copy_from_user()Harish Chegondi
copy_from_user() has more checks and is more safer than __copy_from_user() Suggested-by: Kees Cook <kees@kernel.org> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://lore.kernel.org/r/acabf20aa8621c7bc8de09b1bffb8d14b5376484.1746126614.git.harish.chegondi@intel.com
2025-05-07drm/i915/irq: move i915->irq_lock to display->irq.lockJani Nikula
Observe that i915->irq_lock is no longer used to protect anything outside of display. Make it a display thing. This allows us to remove the ugly #define irq_lock irq.lock hack from xe compat header. Note that this is slightly more subtle than it first looks. For i915, there's no functional change here. The lock is moved. However, for xe, we'll now have *two* locks, xe->irq.lock and display->irq.lock. These should protect different things, though. Indeed, nesting in the past would've lead to a deadlock because they were the same lock. With the i915 references gone, we can make a handful more files independent of i915_drv.h. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/6d8d2ce0f34a9c7361a5e2fcf96bb32a34c57e76.1746536745.git.jani.nikula@intel.com [Jani: Fixed a comment while applying.] Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-05-07drm/i915/rps: refactor display rps supportJani Nikula
Make the gt rps code and display irq code interact via intel_display_rps.[ch], instead of direct access. Add no-op static inline stubs for xe instead of having a separate build unit doing nothing. All of this clarifies the interfaces between i915 core and display. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/ef2a46dc8f30b72282494f54e98cb5fed7523b58.1746536745.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-05-05drm/xe/gsc: do not flush the GSC worker from the reset pathDaniele Ceraolo Spurio
The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM, while the GSC one isn't (and can't be as we need to do memory allocations in the gsc worker). Therefore, we can't flush the latter from the former. The reason why we had such a flush was to avoid interrupting either the GSC FW load or in progress GSC proxy operations. GSC proxy operations fall into 2 categories: 1) GSC proxy init: this only happens once immediately after GSC FW load and does not support being interrupted. The only way to recover from an interruption of the proxy init is to do an FLR and re-load the GSC. 2) GSC proxy request: this can happen in response to a request that the driver sends to the GSC. If this is interrupted, the GSC FW will timeout and the driver request will be failed, but overall the GSC will keep working fine. Flushing the work allowed us to avoid interruption in both cases (unless the hang came from the GSC engine itself, in which case we're toast anyway). However, a failure on a proxy request is tolerable if we're in a scenario where we're triggering a GT reset (i.e., something is already gone pretty wrong), so what we really need to avoid is interrupting the init flow, which we can do by polling on the register that reports when the proxy init is complete (as that ensure us that all the load and init operations have been completed). Note that during suspend we still want to do a flush of the worker to make sure it completes any operations involving the HW before the power is cut. v2: fix spelling in commit msg, rename waiter function (Julia) Fixes: dd0e89e5edc2 ("drm/xe/gsc: GSC FW load") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://lore.kernel.org/r/20250502155104.2201469-1-daniele.ceraolospurio@intel.com
2025-05-02drm/i915/hdcp: pass struct drm_device to driver specific HDCP GSC codeJani Nikula
The driver specific HDCP GSC code will eventually be part of the driver cores rather than display. Remove the struct intel_display references from them, and pass struct drm_device instead. Cc: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/bf9aa8e44e18eef41e3077a2966935b4e2649b62.1745524803.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-05-02drm/i915/hdcp: simplify HDCP GSC firmware usage selectionJani Nikula
Just localize the GSC decision inside intel_hdcp.c, and deduplicate the conditions. Cc: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/a1d031bfbff7073e576dfe8d3d3d5a28d7bb2c15.1745524803.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-05-02drm/i915/hdcp: switch the HDCP GSC message interface from u8* to void*Jani Nikula
The in/out buffers are just opaque data, and don't need to be considered u8*. Switching to void* lets us drop a ton of unnecessary casts. Cc: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/ea005adb713e85b797d83204c80de0a2a8e5ab47.1745524803.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-05-02drm/i915/hdcp: pass the context to the HDCP GSC message interfaceJani Nikula
The opaque HDCP GSC context nicely abstracts the differences between drivers. Pass that instead of struct drm_i915_private or struct xe_device to intel_hdcp_gsc_msg_send(). We can store the driver specific data in the context. This lets us drop the dependency on i915_drv.h from intel_hdcp_gsc_message.c. Cc: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/df1653212f9014e717701b017e78e0017884b870.1745524803.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-05-02drm/i915/hdcp: rename HDCP GSC context alloc/free functionsJani Nikula
Name the functions intel_hdcp_gsc_context_alloc() and intel_hdcp_gsc_context_free() for consistency. Cc: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/c6e25686ed20b5fdea9a59faf6a64a7312a075b0.1745524803.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-05-02drm/i915/hdcp: rename struct intel_hdcp_gsc_message to intel_hdcp_gsc_contextJani Nikula
It's really about the context more than about the message. Cc: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/ca0a802a81ba4e96e7c40646a32386d4351d6ff4.1745524803.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-05-02drm/i915/hdcp: split HDCP GSC message alloc/save responsibilitiesJani Nikula
Allocate and initialize the HDCP GSC message in intel_hdcp_gsc_hdcp2_init() as before, but store the pointer to display->hdcp.hdcp_message in the caller. Similarly, pass in the pointer to intel_hdcp_gsc_free_message(). Cc: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/a74fcc941126bf92d12115b5faf4f75099e26242.1745524803.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-05-02drm/i915/hdcp: deduplicate and refactor HDCP GSC ops initializationJani Nikula
The gsc_hdcp_ops is duplicated and initialized exactly the same way in two different places (for i915 and xe), and requires forward declarations for all the hooks. Deduplicate, and make the functions static. There are slight differences in the i915 and xe implementations of intel_hdcp_gsc_init() and intel_hdcp_gsc_fini(). Take the best of both, and improve. We need to expose intel_hdcp_gsc_hdcp2_init() and intel_hdcp_gsc_free_message() for this, and create the latter for xe. Cc: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/21e7871b35d4c7d13f016b5ecb4f10e5be72c531.1745524803.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-05-01drm/xe: Do not print timedout job message on killed exec queuesMatthew Brost
If a user ctrl-c an app while something is running on the GPU, jobs are expected to timeout. Do not spam dmesg with timedout job messages in this case. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250428175505.935694-1-matthew.brost@intel.com
2025-05-01drm/xe/eustall: Do not support EU stall on SRIOV VFHarish Chegondi
EU stall sampling is not supported on SRIOV VF. Do not initialize or open EU stall stream on SRIOV VF. Fixes: 9a0b11d4cf3b ("drm/xe/eustall: Add support to init, enable and disable EU stall sampling") Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://lore.kernel.org/r/10db5d1c7e17aadca7078ff74575b7ffc0d5d6b8.1745215022.git.harish.chegondi@intel.com (cherry picked from commit 6ed20625a4b8189a1bd6598aa58e03147ce378ee) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-01drm/xe/eustall: Resolve a possible circular locking dependencyHarish Chegondi
Use a separate lock in the polling function eu_stall_data_buf_poll() instead of eu_stall->stream_lock. This would prevent a possible circular locking dependency leading to a deadlock as described below. This would also require additional locking with the new lock in the read function. <4> [787.192986] ====================================================== <4> [787.192988] WARNING: possible circular locking dependency detected <4> [787.192991] 6.14.0-rc7-xe+ #1 Tainted: G U <4> [787.192993] ------------------------------------------------------ <4> [787.192994] xe_eu_stall/20093 is trying to acquire lock: <4> [787.192996] ffff88819847e2c0 ((work_completion) (&(&stream->buf_poll_work)->work)), at: __flush_work+0x1f8/0x5e0 <4> [787.193005] but task is already holding lock: <4> [787.193007] ffff88814ce83ba8 (&gt->eu_stall->stream_lock){3:3}, at: xe_eu_stall_stream_ioctl+0x41/0x6a0 [xe] <4> [787.193090] which lock already depends on the new lock. <4> [787.193093] the existing dependency chain (in reverse order) is: <4> [787.193095] -> #1 (&gt->eu_stall->stream_lock){+.+.}-{3:3}: <4> [787.193099] __mutex_lock+0xb4/0xe40 <4> [787.193104] mutex_lock_nested+0x1b/0x30 <4> [787.193106] eu_stall_data_buf_poll_work_fn+0x44/0x1d0 [xe] <4> [787.193155] process_one_work+0x21c/0x740 <4> [787.193159] worker_thread+0x1db/0x3c0 <4> [787.193161] kthread+0x10d/0x270 <4> [787.193164] ret_from_fork+0x44/0x70 <4> [787.193168] ret_from_fork_asm+0x1a/0x30 <4> [787.193172] -> #0 ((work_completion)(&(&stream->buf_poll_work)->work)){+.+.}-{0:0}: <4> [787.193176] __lock_acquire+0x1637/0x2810 <4> [787.193180] lock_acquire+0xc9/0x300 <4> [787.193183] __flush_work+0x219/0x5e0 <4> [787.193186] cancel_delayed_work_sync+0x87/0x90 <4> [787.193189] xe_eu_stall_disable_locked+0x9a/0x260 [xe] <4> [787.193237] xe_eu_stall_stream_ioctl+0x5b/0x6a0 [xe] <4> [787.193285] __x64_sys_ioctl+0xa4/0xe0 <4> [787.193289] x64_sys_call+0x131e/0x2650 <4> [787.193292] do_syscall_64+0x91/0x180 <4> [787.193295] entry_SYSCALL_64_after_hwframe+0x76/0x7e <4> [787.193299] other info that might help us debug this: <4> [787.193302] Possible unsafe locking scenario: <4> [787.193304] CPU0 CPU1 <4> [787.193305] ---- ---- <4> [787.193306] lock(&gt->eu_stall->stream_lock); <4> [787.193308] lock((work_completion) (&(&stream->buf_poll_work)->work)); <4> [787.193311] lock(&gt->eu_stall->stream_lock); <4> [787.193313] lock((work_completion) (&(&stream->buf_poll_work)->work)); <4> [787.193315] *** DEADLOCK *** Fixes: 760edec939685 ("drm/xe/eustall: Add support to read() and poll() EU stall data") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4598 Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://lore.kernel.org/r/c896932fca84f79db2df5942911997ed77b2b9b6.1744934656.git.harish.chegondi@intel.com (cherry picked from commit c2b1f1b8641372bb2e563c49eb25632623a860fc) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>