path: root/drivers
Age         Commit message                                                  Author
2025-03-06  drm/xe: Enable CPU address mirror uAPI  (Matthew Brost)
With support for CPU address mirror bindings in SRAM fully in place, enable the implementation.

v3:
- s/system allocator/CPU address mirror (Thomas)
v7:
- Only enable uAPI if selected by GPU SVM (CI)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-19-matthew.brost@intel.com
2025-03-06  drm/xe: Do not allow CPU address mirror VMA unbind if the GPU has bindings  (Matthew Brost)
The uAPI is designed around the use case that only mapping a BO to a malloc'd address will unbind a CPU-address mirror VMA. Allowing a CPU-address mirror VMA to unbind while the GPU has bindings in the range being unbound therefore does not make much sense. This behavior is not supported, which simplifies the code. The decision can always be revisited if a use case arises.

v3:
- s/arrises/arises (Thomas)
- s/system allocator/GPU address mirror (Thomas)
- Kernel doc (Thomas)
- Newline between function defs (Thomas)
v5:
- Kernel doc (Thomas)
v6:
- Only compile if CONFIG_DRM_GPUSVM selected (CI, Lucas)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-18-matthew.brost@intel.com
2025-03-06  drm/xe: Add unbind to SVM garbage collector  (Matthew Brost)
Add unbind to the SVM garbage collector. To facilitate this, add an unbind support function to the VM layer which unbinds an SVM range. Also teach the PT layer to understand unbinds of SVM ranges.

v3:
- s/INVALID_VMA/XE_INVALID_VMA (Thomas)
- Kernel doc (Thomas)
- New GPU SVM range structure (Thomas)
- s/DRM_GPUVA_OP_USER/DRM_GPUVA_OP_DRIVER (Thomas)
v4:
- Use xe_vma_op_unmap_range (Himal)
v5:
- s/PY/PT (Thomas)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-17-matthew.brost@intel.com
2025-03-06  drm/xe: Add SVM garbage collector  (Matthew Brost)
Add a basic SVM garbage collector which destroys an SVM range upon an MMU UNMAP event. The garbage collector runs on a worker or in the GPU fault handler, and is required because locks in the reclaim path are needed and cannot be taken in the notifier.

v2:
- Flush garbage collector in xe_svm_close
v3:
- Better commit message (Thomas)
- Kernel doc (Thomas)
- Use list_first_entry_or_null for garbage collector loop (Thomas)
- Don't add to garbage collector if VM is closed (Thomas)
v4:
- Use %pe to print error (Thomas)
v5:
- s/visable/visible (Thomas)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-16-matthew.brost@intel.com
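The deferral pattern described above (queue work from the MMU notifier, because reclaim-unsafe locks cannot be taken there, then destroy ranges from a worker) is a common kernel idiom. A minimal sketch with hypothetical names, not the actual xe_svm API:

  /* Hypothetical sketch: defer range destruction from an MMU notifier
   * to a worker; names are illustrative, not the xe driver's. */
  #include <linux/container_of.h>
  #include <linux/list.h>
  #include <linux/slab.h>
  #include <linux/spinlock.h>
  #include <linux/workqueue.h>

  struct gc_range {
          struct list_head gc_link;
          /* ... SVM range state ... */
  };

  struct gc {
          spinlock_t lock;                /* safe to take from the notifier */
          struct list_head gc_list;
          struct work_struct work;
  };

  /* Notifier path: only queue; no GPU unbind, no reclaim-unsafe locks. */
  static void gc_queue_range(struct gc *gc, struct gc_range *range)
  {
          unsigned long flags;

          spin_lock_irqsave(&gc->lock, flags);
          list_add_tail(&range->gc_link, &gc->gc_list);
          spin_unlock_irqrestore(&gc->lock, flags);
          schedule_work(&gc->work);
  }

  /* Worker (or GPU fault handler) context: safe to unbind and free. */
  static void gc_work_fn(struct work_struct *w)
  {
          struct gc *gc = container_of(w, struct gc, work);
          struct gc_range *range;

          spin_lock_irq(&gc->lock);
          while ((range = list_first_entry_or_null(&gc->gc_list,
                                                   struct gc_range, gc_link))) {
                  list_del_init(&range->gc_link);
                  spin_unlock_irq(&gc->lock);
                  /* unbind GPU mappings and destroy the range here */
                  kfree(range);
                  spin_lock_irq(&gc->lock);
          }
          spin_unlock_irq(&gc->lock);
  }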
2025-03-06  drm/xe: Add (re)bind to SVM page fault handler  (Matthew Brost)
Add (re)bind to the SVM page fault handler. To facilitate this, add a support function to the VM layer which (re)binds an SVM range. Also teach the PT layer to understand (re)binds of SVM ranges.

v2:
- Don't assert BO lock held for range binds
- Use xe_svm_notifier_lock/unlock helper in xe_svm_close
- Use drm_pagemap dma cursor
- Take notifier lock in bind code to check range state
v3:
- Use new GPU SVM range structure (Thomas)
- Kernel doc (Thomas)
- s/DRM_GPUVA_OP_USER/DRM_GPUVA_OP_DRIVER (Thomas)
v5:
- Kernel doc (Thomas)
v6:
- Only compile if CONFIG_DRM_GPUSVM selected (CI, Lucas)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Tested-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-15-matthew.brost@intel.com
2025-03-06  drm/xe: Add SVM range invalidation and page fault  (Matthew Brost)
Add an SVM range invalidation vfunc which invalidates PTEs. A new PT layer function which accepts an SVM range is added to support this. In addition, add the basic page fault handler which allocates an SVM range to be used by the SVM range invalidation vfunc.

v2:
- Don't run invalidation if VM is closed
- Cycle notifier lock in xe_svm_close
- Drop xe_gt_tlb_invalidation_fence_fini
v3:
- Better commit message (Thomas)
- Add lockdep asserts (Thomas)
- Add kernel doc (Thomas)
- s/change/changed (Thomas)
- Use new GPU SVM range / notifier structures
- Ensure PTEs are zapped / dma mappings are unmapped on VM close (Thomas)
v4:
- Fix macro (Checkpatch)
v5:
- Use range start/end helpers (Thomas)
- Use notifier start/end helpers (Thomas)
v6:
- Use min/max helpers (Himal)
- Only compile if CONFIG_DRM_GPUSVM selected (CI, Lucas)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-13-matthew.brost@intel.com
2025-03-06  drm/xe: Nuke VM's mapping upon close  (Matthew Brost)
Clear the root PT entry and invalidate the entire VM's address space when closing the VM. This prevents the GPU from accessing any of the VM's memory after it is closed.

v2:
- s/vma/vm in kernel doc (CI)
- Don't nuke migration VM as this occurs at driver unload (CI)
v3:
- Rebase and pull into SVM series (Thomas)
- Wait for pending binds (Thomas)
v5:
- Remove xe_gt_tlb_invalidation_fence_fini in error case (Matt Auld)
- Drop local migration bool (Thomas)
v7:
- Add drm_dev_enter/exit protecting invalidation (CI, Matt Auld)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-12-matthew.brost@intel.com
2025-03-06  drm/xe: Add dma_addr res cursor  (Thomas Hellström)
Add a dma_addr res cursor which walks an array of drm_pagemap_dma_addr. Useful for SVM ranges and programming page tables.

v3:
- Better commit message (Thomas)
- Use new drm_pagemap.h location
v7:
- Fix kernel doc (CI)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-11-matthew.brost@intel.com
2025-03-06  drm/xe: Add SVM init / close / fini to faulting VMs  (Matthew Brost)
Add SVM init / close / fini to faulting VMs. Minimal implementation acting as a placeholder for follow-on patches.

v2:
- Add close function
v3:
- Better commit message (Thomas)
- Kernel doc (Thomas)
- Update chunk array to be unsigned long (Thomas)
- Use new drm_gpusvm.h header location (Thomas)
- Newlines between functions in xe_svm.h (Thomas)
- Call drm_gpusvm_driver_set_lock in init (Thomas)
v6:
- Only compile if CONFIG_DRM_GPUSVM selected (CI, Lucas)
v7:
- Only select CONFIG_DRM_GPUSVM if DEVICE_PRIVATE (CI)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-10-matthew.brost@intel.com
2025-03-06  drm/xe/uapi: Add DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR  (Matthew Brost)
Add the DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag, which is used to create unpopulated virtual memory areas (VMAs) without memory backing or GPU page tables. These VMAs are referred to as CPU address mirror VMAs. The idea is that upon a page fault or prefetch, the memory backing and GPU page tables will be populated. CPU address mirror VMAs only update GPUVM state; they do not have an internal page table (PT) state, nor do they have GPU mappings. It is expected that CPU address mirror VMAs will be mixed with buffer object (BO) VMAs within a single VM. In other words, system allocations and runtime allocations can be mixed within a single user-mode driver (UMD) program.

Expected usage (see the sketch after this entry):
- Bind the entire virtual address (VA) space upon program load using the DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag.
- If a buffer object (BO) requires GPU mapping (runtime allocation), allocate a CPU address using mmap(PROT_NONE), then bind the BO to the mmapped address using the existing bind IOCTLs. If a CPU map of the BO is needed, mmap it again to the same CPU address using mmap(MAP_FIXED).
- If a BO no longer requires GPU mapping, munmap it from the CPU address space and then bind the mapping address with the DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag.
- Any malloc'd or mmapped CPU address accessed by the GPU will be faulted in via the SVM implementation (system allocation).
- Upon freeing any mmapped or malloc'd data, the SVM implementation will remove GPU mappings.

Only a 1:1 mapping between user address space and GPU address space is supported at the moment, as that is the expected use case. The uAPI defines an interface for non-1:1 mappings but enforces 1:1; this restriction can be lifted if use cases arise for non-1:1 mappings.

This patch essentially short-circuits the code in the existing VM bind paths to avoid populating page tables when the DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag is set.

v3:
- Call vm_bind_ioctl_ops_fini on -ENODATA
- Don't allow DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR on non-faulting VMs
- s/DRM_XE_VM_BIND_FLAG_SYSTEM_ALLOCATOR/DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR (Thomas)
- Rework commit message for expected usage (Thomas)
- Describe state of code after patch in commit message (Thomas)
v4:
- Fix alignment (Checkpatch)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-9-matthew.brost@intel.com
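As a rough illustration of the runtime-allocation flow above (not a tested UMD snippet; the xe bind IOCTL call and the BO fd/offset are assumptions left as comments):

  /* Sketch of the mmap(PROT_NONE) / MAP_FIXED dance described above. */
  #include <stddef.h>
  #include <sys/mman.h>

  static void *reserve_cpu_addr_for_bo(size_t size)
  {
          /* Reserve a CPU address range with no backing. */
          void *addr = mmap(NULL, size, PROT_NONE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
          if (addr == MAP_FAILED)
                  return NULL;

          /* Bind the BO to 'addr' with the existing vm_bind IOCTL here
           * (ioctl details elided). */

          /* If a CPU mapping of the BO is also needed, map the BO's fd
           * over the same range with MAP_FIXED, e.g.:
           * mmap(addr, size, PROT_READ | PROT_WRITE,
           *      MAP_SHARED | MAP_FIXED, bo_fd, bo_mmap_offset);
           */
          return addr;
  }

  /* Teardown: munmap(addr, size), then re-bind the range with
   * DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR so it reverts to a mirror VMA. */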
2025-03-06  drm/xe: Select DRM_GPUSVM Kconfig  (Matthew Brost)
Xe depends on DRM_GPUSVM for its SVM implementation, so select it in Kconfig.

v6:
- Don't select DRM_GPUSVM if UML (CI)
v7:
- Only select DRM_GPUSVM if DEVICE_PRIVATE (CI)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-8-matthew.brost@intel.com
2025-03-06  drm/gpusvm: Add support for GPU Shared Virtual Memory  (Matthew Brost)
This patch introduces support for GPU Shared Virtual Memory (SVM) in the Direct Rendering Manager (DRM) subsystem. SVM allows for seamless sharing of memory between the CPU and GPU, enhancing performance and flexibility in GPU computing tasks.

The patch adds the necessary infrastructure for SVM, including data structures and functions for managing SVM ranges and notifiers. It also provides mechanisms for allocating, deallocating, and migrating memory regions between system RAM and GPU VRAM. This is largely inspired by GPUVM.

v2:
- Take order into account in check pages
- Clear range->pages in get pages error
- Drop setting dirty or accessed bit in get pages (Vetter)
- Remove mmap assert for cpu faults
- Drop mmap write lock abuse (Vetter, Christian)
- Decouple zdd from range (Vetter, Oak)
- Add drm_gpusvm_range_evict, make it work with coherent pages
- Export drm_gpusvm_evict_to_sram, only use in BO evict path (Vetter)
- mmget/put in drm_gpusvm_evict_to_sram
- Drop range->vram_allocation variable
- Don't return in drm_gpusvm_evict_to_sram until all pages detached
- Don't warn on mixing sram and device pages
- Update kernel doc
- Add coherent page support to get pages
- Use DMA_FROM_DEVICE rather than DMA_BIDIRECTIONAL
- Add struct drm_gpusvm_vram and ops (Thomas)
- Update the range's seqno if the range is valid (Thomas)
- Remove the is_unmapped check before hmm_range_fault (Thomas)
- Use drm_pagemap (Thomas)
- Drop kfree_mapping (Thomas)
- dma map pages under notifier lock (Thomas)
- Remove ctx.prefault
- Remove ctx.mmap_locked
- Add ctx.check_pages
- s/vram/devmem (Thomas)
v3:
- Fix memory leak in drm_gpusvm_range_get_pages
- Only migrate pages with same zdd on CPU fault
- Loop over all VMAs in drm_gpusvm_range_evict
- Make GPUSVM a drm level module
- GPL or MIT license
- Update main kernel doc (Thomas)
- Prefer foo() vs foo for functions in kernel doc (Thomas)
- Prefer functions over macros (Thomas)
- Use unsigned long vs u64 for addresses (Thomas)
- Use standard interval_tree (Thomas)
- s/drm_gpusvm_migration_put_page/drm_gpusvm_migration_unlock_put_page (Thomas)
- Drop err_out label in drm_gpusvm_range_find_or_insert (Thomas)
- Fix kernel doc in drm_gpusvm_range_free_pages (Thomas)
- Newlines between function defs in header file (Thomas)
- Drop shall language in driver vfunc kernel doc (Thomas)
- Move some static inlines from header to C file (Thomas)
- Don't allocate pages under page lock in drm_gpusvm_migrate_populate_ram_pfn (Thomas)
- Change check_pages to a threshold
v4:
- Fix NULL ptr deref in drm_gpusvm_migrate_populate_ram_pfn (Thomas, Himal)
- Fix check pages threshold
- Check for range being unmapped under notifier lock in get pages (Testing)
- Fix characters per line
- Drop WRITE_ONCE for zdd->devmem_allocation assignment (Thomas)
- Use completion for devmem_allocation->detached (Thomas)
- Make GPU SVM depend on ZONE_DEVICE (CI)
- Use hmm_range_fault for eviction (Thomas)
- Drop zdd worker (Thomas)
v5:
- Select Kconfig deps (CI)
- Set device to NULL in __drm_gpusvm_migrate_to_ram (Matt Auld, G.G.)
- Drop Thomas's SoB (Thomas)
- Add drm_gpusvm_range_start/end/size helpers (Thomas)
- Add drm_gpusvm_notifier_start/end/size helpers (Thomas)
- Absorb drm_pagemap name changes (Thomas)
- Fix driver lockdep assert (Thomas)
- Move driver lockdep assert to static function (Thomas)
- Assert mmap lock held in drm_gpusvm_migrate_to_devmem (Thomas)
- Do not retry forever on eviction (Thomas)
v6:
- Fix drm_gpusvm_get_devmem_page alignment (Checkpatch)
- Modify Kconfig (CI)
- Compile out lockdep asserts (CI)
v7:
- Add kernel doc for flags fields (CI, Auld)

Cc: Simona Vetter <simona.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-7-matthew.brost@intel.com
2025-03-06  drm/xe/bo: Introduce xe_bo_put_async  (Thomas Hellström)
Introduce xe_bo_put_async to put a bo from a context in which the bo destructor can't run, due to lockdep problems or atomic context. If the put is the final put, freeing will be done from a work item.

v5:
- Kernel doc for xe_bo_put_async (Thomas)
v7:
- Fix kernel doc (CI)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Tested-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-6-matthew.brost@intel.com
2025-03-06  drm/xe: Retry BO allocation  (Matthew Brost)
TTM doesn't support fair eviction via WW locking; this is mitigated by using retry loops in the exec and preempt rebind workers. Extend this retry loop to BO allocation. Once TTM supports fair eviction, this patch can be reverted.

v4:
- Keep line break (Stuart)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-2-matthew.brost@intel.com
2025-03-06  Merge tag 'net-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Linus Torvalds)
Pull networking fixes from Paolo Abeni:
 "Including fixes from bluetooth and wireless.

  Current release - new code bugs:
   - wifi: nl80211: disable multi-link reconfiguration

  Previous releases - regressions:
   - gso: fix ownership in __udp_gso_segment
   - wifi: iwlwifi:
      - fix A-MSDU TSO preparation
      - free pages allocated when failing to build A-MSDU
   - ipv6: fix dst ref loop in ila lwtunnel
   - mptcp: fix 'scheduling while atomic' in mptcp_pm_nl_append_new_local_addr
   - bluetooth: add check for mgmt_alloc_skb() in mgmt_device_connected()
   - ethtool: allow NULL nlattrs when getting a phy_device
   - eth: be2net: fix sleeping while atomic bugs in be_ndo_bridge_getlink

  Previous releases - always broken:
   - core: support TCP GSO case for a few missing flags
   - wifi: mac80211:
      - fix vendor-specific inheritance
      - cleanup sta TXQs on flush
   - llc: do not use skb_get() before dev_queue_xmit()
   - eth: ipa: enable checksum for IPA_ENDPOINT_AP_MODEM_{RX,TX} for v4.7"

* tag 'net-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (41 commits)
  net: ipv6: fix missing dst ref drop in ila lwtunnel
  net: ipv6: fix dst ref loop in ila lwtunnel
  mctp i3c: handle NULL header address
  net: dsa: mt7530: Fix traffic flooding for MMIO devices
  net-timestamp: support TCP GSO case for a few missing flags
  vlan: enforce underlying device type
  mptcp: fix 'scheduling while atomic' in mptcp_pm_nl_append_new_local_addr
  net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device
  ppp: Fix KMSAN uninit-value warning with bpf
  net: ipa: Enable checksum for IPA_ENDPOINT_AP_MODEM_{RX,TX} for v4.7
  net: ipa: Fix QSB data for v4.7
  net: ipa: Fix v4.7 resource group names
  net: hns3: make sure ptp clock is unregister and freed if hclge_ptp_get_cycle returns an error
  wifi: nl80211: disable multi-link reconfiguration
  net: dsa: rtl8366rb: don't prompt users for LED control
  be2net: fix sleeping while atomic bugs in be_ndo_bridge_getlink
  llc: do not use skb_get() before dev_queue_xmit()
  wifi: cfg80211: regulatory: improve invalid hints checking
  caif_virtio: fix wrong pointer check in cfv_probe()
  net: gso: fix ownership in __udp_gso_segment
  ...
2025-03-06  drm/msm/dpu: Support YUV formats on writeback for DPU 5.x+  (Jessica Zhang)
Now that CDM_0 has been enabled for DPU 5.x+, add support for YUV formats on writeback Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/641270/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2025-03-06  drm/msm/dpu: Clear perf params before calculating bw  (Jessica Zhang)
To prevent incorrect BW calculation, zero out dpu_core_perf_params before it is passed into dpu_core_perf_aggregate(). Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com> Fixes: 795aef6f3653 ("drm/msm/dpu: remove duplicate code calculating sum of bandwidths") Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/641278/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2025-03-06  Merge branch 'for-6.15/features' into fwctl  (Jason Gunthorpe)
Add enabling of the CXL mailbox Features commands. This is also preparation for CXL fwctl enabling; the same code will also be utilized by the CXL EDAC enabling. The commands 'Get Supported Features', 'Get Feature', and 'Set Feature' are enabled for kernel usage, as required for the CXL fwctl driver.

* branch 'for-6.15/features':
  cxl: Setup exclusive CXL features that are reserved for the kernel
  cxl/mbox: Add SET_FEATURE mailbox command
  cxl/mbox: Add GET_FEATURE mailbox command
  cxl/test: Add Get Supported Features mailbox command support
  cxl: Add Get Supported Features command for kernel usage
  cxl: Enumerate feature commands
  cxl: Refactor user ioctl command path from mds to mailbox

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-06  mlx5: Create an auxiliary device for fwctl_mlx5  (Saeed Mahameed)
If the device supports User Context then it can support fwctl. Create an auxiliary device to allow fwctl to bind to it.

Create a sysfs like:

$ ls /sys/devices/pci0000:00/0000:00:0a.0/mlx5_core.fwctl.0/driver -l
lrwxrwxrwx 1 root root 0 Apr 25 19:46 /sys/devices/pci0000:00/0000:00:0a.0/mlx5_core.fwctl.0/driver -> ../../../../bus/auxiliary/drivers/mlx5_fwctl.mlx5_fwctl

Link: https://patch.msgid.link/r/8-v5-642aa0c94070+4447f-fwctl_jgg@nvidia.com
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-06  fwctl/mlx5: Support for communicating with mlx5 fw  (Saeed Mahameed)
mlx5 FW has a built-in security context called UID. Each UID has a set of permissions controlled by the kernel when it is created, and every command is tagged by the kernel with a particular UID. In general commands cannot reach objects outside of their UID and commands cannot exceed their UID's permissions. These restrictions are enforced by FW.

This mechanism has long been used in RDMA for the devx interface, where RDMA will send commands directly to the FW and the UID limitations restrict those commands to an ib_device/verbs security domain, for instance commands that would affect other VFs or global device resources. The model is suitable for unprivileged userspace to operate the RDMA functionality.

The UID has been extended with a "tools resources" permission which allows additional commands and sub-commands that are intended to match the scope limitations set in FWCTL. This is an alternative design to the "command intent log", where the FW does the enforcement rather than having the FW report the enforcement the kernel should do.

Consistent with the fwctl definitions, the "tools resources" security context is limited to the FWCTL_RPC_CONFIGURATION, FWCTL_RPC_DEBUG_READ_ONLY, FWCTL_RPC_DEBUG_WRITE, and FWCTL_RPC_DEBUG_WRITE_FULL security scopes.

Like RDMA devx, each opened fwctl file descriptor will get a unique UID associated with the file descriptor. The fwctl driver is kept simple and we reject commands that can create objects, as the UID mechanism relies on the kernel to track and destroy objects prior to destroying the UID. Filtering into fwctl sub scopes is done inside the driver with a switch statement; this substantially limits what is possible to primarily query functions and a few limited set operations.

mlx5 already has a robust infrastructure for delivering RPC messages to fw. Trivially connect fwctl's RPC mechanism to mlx5_cmd_do(). Enforce the User Context ID in every RPC header accepted from the FD so the FW knows the security context of the issuing ID.

Link: https://patch.msgid.link/r/7-v5-642aa0c94070+4447f-fwctl_jgg@nvidia.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
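The switch-statement filtering mentioned above can be pictured roughly as follows; the opcodes and the allow/deny policy are purely illustrative, and the assumption that the scope enum values grow with permissiveness is the editor's, not taken from the driver:

  /* Hypothetical shape of the per-command scope filter; not the real
   * mlx5 table. The FWCTL_RPC_* scopes come from the fwctl uAPI. */
  #include <linux/types.h>
  #include <uapi/fwctl/fwctl.h>

  static bool mlx5_cmd_allowed(u16 opcode, enum fwctl_rpc_scope scope)
  {
          switch (opcode) {
          case 0x0101:            /* illustrative: a query command */
                  return true;    /* allowed at any scope */
          case 0x0938:            /* illustrative: a debug-write command */
                  return scope >= FWCTL_RPC_DEBUG_WRITE;
          default:
                  /* object-creating or unknown commands are rejected */
                  return false;
          }
  }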
2025-03-06  fwctl: FWCTL_RPC to execute a Remote Procedure Call to device firmware  (Jason Gunthorpe)
Add the FWCTL_RPC ioctl which allows a request/response RPC call to device firmware. Drivers implementing this call must follow the security guidelines under Documentation/userspace-api/fwctl.rst.

The core code provides some memory management helpers to get the messages copied from and back to userspace. The driver is responsible for allocating the output message memory and delivering the message to the device.

Link: https://patch.msgid.link/r/5-v5-642aa0c94070+4447f-fwctl_jgg@nvidia.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
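For orientation, a userspace call could look like the sketch below; the struct layout and ioctl name are assumptions based on this description (a size field for the zero-pad scheme, a scope, and in/out buffer pointers), so the real fwctl uAPI header should be consulted:

  /* Userspace sketch; field names are assumed, see the fwctl uAPI. */
  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <fwctl/fwctl.h>        /* assumed installed uapi header */

  static int fw_rpc(int fd, void *in, uint32_t in_len,
                    void *out, uint32_t out_len)
  {
          struct fwctl_rpc rpc = {
                  .size = sizeof(rpc),    /* zero-pad extension scheme */
                  .scope = FWCTL_RPC_DEBUG_READ_ONLY,
                  .in_len = in_len,
                  .out_len = out_len,
                  .in = (uintptr_t)in,
                  .out = (uintptr_t)out,
          };

          return ioctl(fd, FWCTL_RPC, &rpc);
  }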
2025-03-06  fwctl: FWCTL_INFO to return basic information about the device  (Jason Gunthorpe)
Userspace will need to know some details about the fwctl interface being used to locate the correct userspace code to communicate with the kernel. Provide a simple device_type enum indicating what the kernel driver is. Allow the device to provide a device specific info struct that contains any additional information that the driver may need to provide to userspace. Link: https://patch.msgid.link/r/3-v5-642aa0c94070+4447f-fwctl_jgg@nvidia.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Tested-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-06  fwctl: Basic ioctl dispatch for the character device  (Jason Gunthorpe)
Each file descriptor gets a chunk of per-FD driver specific context that allows the driver to attach a device specific struct to it. The core code takes care of the memory lifetime for this structure.

The ioctl dispatch and design is based on what was built for iommufd. The ioctls have a struct which has a combined in/out behavior with a typical 'zero pad' scheme for future extension and backwards compatibility. Like iommufd, some shared logic does most of the ioctl marshaling and compatibility work, and a table dispatches to function pointers for each unique ioctl. This approach has proven to work quite well in the iommufd and rdma subsystems.

Allocate an ioctl number space for the subsystem.

Link: https://patch.msgid.link/r/2-v5-642aa0c94070+4447f-fwctl_jgg@nvidia.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
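The zero-pad compatibility check that scheme implies can be sketched as below; this is a schematic of the pattern (as used by iommufd), not the actual fwctl core code:

  /* Schematic: accept a userspace struct that is longer than the
   * kernel's, provided the extra trailing bytes are all zero. */
  #include <linux/errno.h>
  #include <linux/uaccess.h>

  static int check_zero_tail(const void __user *uarg,
                             size_t ksize, size_t usize)
  {
          int ret;

          if (usize <= ksize)
                  return 0;

          ret = check_zeroed_user(uarg + ksize, usize - ksize);
          if (ret < 0)
                  return ret;
          return ret ? 0 : -E2BIG;        /* non-zero tail: unknown input */
  }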
2025-03-06  fwctl: Add basic structure for a class subsystem with a cdev  (Jason Gunthorpe)
Create the class, character device and functions for a fwctl driver to un/register to the subsystem. A typical fwctl driver has a sysfs presence like:

$ ls -l /dev/fwctl/fwctl0
crw------- 1 root root 250, 0 Apr 25 19:16 /dev/fwctl/fwctl0
$ ls /sys/class/fwctl/fwctl0
dev  device  power  subsystem  uevent
$ ls /sys/class/fwctl/fwctl0/device/infiniband/
ibp0s10f0
$ ls /sys/class/infiniband/ibp0s10f0/device/fwctl/
fwctl0/
$ ls /sys/devices/pci0000:00/0000:00:0a.0/fwctl/fwctl0
dev  device  power  subsystem  uevent

This allows userspace to link all the multi-subsystem driver components together and learn the subsystem specific names for the device's components.

Link: https://patch.msgid.link/r/1-v5-642aa0c94070+4447f-fwctl_jgg@nvidia.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-06  cpufreq/amd-pstate: Drop actions in amd_pstate_epp_cpu_offline()  (Mario Limonciello)
When the CPU goes offline there is no need to change the CPPC request because the CPU will go into the deepest C-state it supports already. Actually changing the CPPC request when it goes offline messes up the cached values and can lead to the wrong values being restored when it comes back. Instead drop the actions and if the CPU comes back online let amd_pstate_epp_set_policy() restore it to expected values. Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate: Stop caching EPP  (Mario Limonciello)
EPP values are cached in the cpudata structure per CPU. This is needless though because they are also cached in the CPPC request variable. Drop the separate cache for EPP values and always reference the CPPC request variable when needed. Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate: Rework CPPC enabling  (Mario Limonciello)
The CPPC enable register is configured as "write once". That is, any future writes don't actually do anything. Because of this, all the cleanup paths that currently exist for CPPC disable are non-effective. Rework CPPC enable to only enable after all the CAP registers have been read, to avoid enabling CPPC on CPUs with invalid _CPC or unpopulated MSRs. As the register is write once, remove all cleanup paths as well.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate: Drop debug statements for policy setting  (Mario Limonciello)
There are trace events that exist now for all amd-pstate modes that will output information right before programming to the hardware. This makes the existing debug statements unnecessary, leaving them as pure overhead. Drop them.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate: Update cppc_req_cached for shared mem EPP writes  (Mario Limonciello)
On EPP-only writes, update the cached variable so that the min/max performance controls don't need to be updated again.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate: Move all EPP tracing into *_update_perf and *_set_epp functions  (Mario Limonciello)
The EPP tracing is done by the caller today, but this precludes knowing whether the CPPC request has changed. Move it into the update_perf and set_epp functions and include information about whether the request has changed from the last one. amd_pstate_update_perf() and amd_pstate_set_epp() now require the policy as an argument instead of the cpudata.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate: Cache CPPC request in shared mem case too  (Mario Limonciello)
In order to prevent a potential write in shmem_update_perf(), cache the request into the cppc_req_cached variable normally only used for the MSR case. This adds symmetry into the code and potentially avoids extra writes.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate: Replace all AMD_CPPC_* macros with masks  (Mario Limonciello)
Bitfield masks are easier to follow and less error prone. Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
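For readers less familiar with the mask style, the usual pattern is FIELD_PREP()/FIELD_GET() from <linux/bitfield.h>. A generic before/after sketch; the field layout below is illustrative, not the actual CPPC request register layout:

  #include <linux/bitfield.h>
  #include <linux/bits.h>
  #include <linux/types.h>

  /* Illustrative 8-bit fields of a request register (layout assumed). */
  #define REQ_MIN_PERF    GENMASK_ULL(7, 0)
  #define REQ_MAX_PERF    GENMASK_ULL(15, 8)

  /* Shift-and-mask style, e.g. value |= (min & 0xff) << 0, makes it
   * easy to get a shift or width wrong. The mask style describes each
   * field once and lets the helpers do the shifting: */
  static u64 pack_req(u64 value, u8 min_perf, u8 max_perf)
  {
          value &= ~(REQ_MIN_PERF | REQ_MAX_PERF);
          value |= FIELD_PREP(REQ_MIN_PERF, min_perf);
          value |= FIELD_PREP(REQ_MAX_PERF, max_perf);
          return value;
  }

  static u8 req_min_perf(u64 value)
  {
          return FIELD_GET(REQ_MIN_PERF, value);
  }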
2025-03-06  cpufreq/amd-pstate-ut: Adjust variable scope  (Mario Limonciello)
In amd_pstate_ut_check_freq() and amd_pstate_ut_check_perf() the cpudata variable is only needed in the scope of the for loop. Move it there. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate-ut: Run on all of the correct CPUs  (Mario Limonciello)
If a CPU is missing a policy or one has been offlined, then the unit test is skipped for the rest of the CPUs on the system. Instead, iterate over online CPUs and skip any missing policies, allowing the rest of them to be tested.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate-ut: Drop SUCCESS and FAIL enums  (Mario Limonciello)
Enums are effectively used as a boolean and don't show the return value of the failing call. Instead of using enums switch to returning the actual return code from the unit test. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate-ut: Allow lowest nonlinear and lowest to be the same  (Mario Limonciello)
Several Ryzen AI processors support the exact same value for lowest nonlinear perf and lowest perf. Loosen up the unit tests to allow this scenario. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate-ut: Use _free macro to free put policy  (Mario Limonciello)
Using a scoped cleanup macro simplifies cleanup code. Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
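For context, the scoped cleanup pattern from <linux/cleanup.h> looks like this generic example (shown with the kfree-based helper rather than the cpufreq policy helper the unit test uses):

  #include <linux/cleanup.h>
  #include <linux/errno.h>
  #include <linux/slab.h>

  static int example(size_t len)
  {
          /* Freed automatically when 'buf' goes out of scope on every
           * return path, so no explicit error-unwind labels are needed. */
          void *buf __free(kfree) = kzalloc(len, GFP_KERNEL);

          if (!buf)
                  return -ENOMEM;
          if (len < 16)
                  return -EINVAL;         /* buf is still freed */

          /* ... use buf ... */
          return 0;
  }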
2025-03-06  cpufreq/amd-pstate: Drop `cppc_cap1_cached`  (Mario Limonciello)
The `cppc_cap1_cached` variable isn't used at all, so there is no need to read it at initialization for each CPU.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate: Overhaul locking  (Mario Limonciello)
amd_pstate_cpu_boost_update() and refresh_frequency_limits() both update the policy state and have nothing to do with the amd-pstate driver itself. A global "limits" lock doesn't make sense because each CPU can have policies changed independently. Each time a CPU changes values they will atomically be written to the per-CPU perf member. Drop per CPU locking cases. The remaining "global" driver lock is used to ensure that only one entity can change driver modes at a given time. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate: Move perf values into a union  (Mario Limonciello)
By storing perf values in a union all the writes and reads can be done atomically, removing the need for some concurrency protections. While making this change, also drop the cached frequency values, using inline helpers to calculate them on demand from perf value. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
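The union trick referred to above keeps the individual perf fields alongside a single u64, so a READ_ONCE()/WRITE_ONCE() of the u64 reads or publishes all of them at once. A schematic with illustrative field names:

  #include <linux/compiler.h>
  #include <linux/types.h>

  union perf_cached {
          struct {
                  u8 highest_perf;
                  u8 nominal_perf;
                  u8 lowest_nonlinear_perf;
                  u8 lowest_perf;
                  u8 min_limit_perf;
                  u8 max_limit_perf;
          };
          u64 val;
  };

  /* Readers snapshot every field with one atomic load... */
  static inline union perf_cached perf_read(union perf_cached *p)
  {
          union perf_cached perf;

          perf.val = READ_ONCE(p->val);
          return perf;
  }

  /* ...and writers publish a fully formed value with one store. */
  static inline void perf_write(union perf_cached *p, union perf_cached perf)
  {
          WRITE_ONCE(p->val, perf.val);
  }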
2025-03-06  cpufreq/amd-pstate: Drop min and max cached frequencies  (Mario Limonciello)
Use the perf_to_freq helpers to calculate this on the fly. As the members are no longer cached add an extra check into amd_pstate_epp_update_limit() to avoid unnecessary calls in amd_pstate_update_min_max_limit(). Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate: Show a warning when a CPU fails to setup  (Mario Limonciello)
I came across a system where MSR_AMD_CPPC_CAP1 isn't populated for some CPUs. This is unexpected behavior that is most likely a BIOS bug. In the event it happens, I'd like users to report bugs so we can properly root cause the issue and get it fixed.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate: Invalidate cppc_req_cached during suspend  (Mario Limonciello)
During resume it's possible the firmware didn't restore the CPPC request MSR but the kernel thinks the values line up. This leads to incorrect performance after resume from suspend. To fix the issue invalidate the cached value at suspend. During resume use the saved values programmed as cached limits. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reported-by: Miroslav Pavleski <miroslav@pavleski.net> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06  cpufreq/amd-pstate: Fix the clamping of perf values  (Dhananjay Ugwekar)
The clamping in freq_to_perf() is broken right now, as we first typecast (read: wrap around) the overflowing value into a u8 and then clamp it down. So, use a u32 to store the >255 value in certain edge cases and then clamp it down into a u8. Also, use an "explicit typecast + clamp" instead of just a "clamp_t", as the latter typecasts first and then clamps between the limits, which defeats our purpose.

Fixes: 620136ced35a ("cpufreq/amd-pstate: Modularize perf<->freq conversion")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250222033221.554976-1-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
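The distinction matters because clamp_t() casts to the narrow type before clamping, while the fixed form clamps in the wide type and only then narrows. A minimal illustration (the limits are illustrative):

  #include <linux/minmax.h>
  #include <linux/types.h>

  static u8 clamp_wrong(u32 wide, u8 lo, u8 hi)
  {
          /* clamp_t(u8, ...) truncates 'wide' first (e.g. 300 becomes 44)
           * and only then clamps, so the overflow is never caught. */
          return clamp_t(u8, wide, lo, hi);
  }

  static u8 clamp_right(u32 wide, u8 lo, u8 hi)
  {
          /* Clamp in the wide type, then narrow: 300 becomes 'hi'. */
          return (u8)clamp(wide, (u32)lo, (u32)hi);
  }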
2025-03-06  PCI/DOE: Rename Discovery Response Data Object Contents to type  (Alistair Francis)
PCIe r6.1, sec 6.30.1.1, describes a "Vendor ID", a "Data Object Type" and "Next Index" as the fields in the DOE Discovery Response Data Object. The DOE driver currently uses both the terms 'type' and 'prot' for the second element. Rename all uses of the DOE Discovery Response Data Object to use 'type' as the second element of the object header, instead of type/prot as it currently is. Link: https://lore.kernel.org/r/20250306075211.1855177-2-alistair@alistair23.me Signed-off-by: Alistair Francis <alistair@alistair23.me> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-03-06  PCI/DOE: Rename DOE protocol to feature  (Alistair Francis)
DOE r1.1 replaced all occurrences of "protocol" with the term "feature" or "Data Object Type". PCIe r6.1 incorporated that change. Rename the existing term "protocol" to "feature".

Link: https://lore.kernel.org/r/20250306075211.1855177-1-alistair@alistair23.me
Signed-off-by: Alistair Francis <alistair@alistair23.me>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
2025-03-06  soc/tegra: pmc: Use str_enable_disable-like helpers  (Krzysztof Kozlowski)
Replace ternary (condition ? "enable" : "disable") syntax with helpers from string_choices.h because:
1. A simple function call with one argument is easier to read. The ternary operator has three arguments and with wrapping might lead to quite long code.
2. It is slightly shorter, thus also easier to read.
3. It brings uniformity in the text - same string.
4. It allows deduping by the linker, which results in a smaller binary file.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20250114203638.1013670-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
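That is, each call site changes shape roughly as follows (str_enable_disable() is provided by <linux/string_choices.h>; the message text is illustrative):

  #include <linux/printk.h>
  #include <linux/string_choices.h>
  #include <linux/types.h>

  static void report(bool enabled)
  {
          /* Before: ternary spelled out at every call site. */
          pr_info("wake event %s\n", enabled ? "enable" : "disable");

          /* After: one short helper call; the strings can be deduped. */
          pr_info("wake event %s\n", str_enable_disable(enabled));
  }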
2025-03-06  soc: samsung: include linux/array_size.h where needed  (Arnd Bergmann)
This does not necessarily get included through asm/io.h:

drivers/soc/samsung/exynos3250-pmu.c:120:18: error: use of undeclared identifier 'ARRAY_SIZE'
  120 |         for (i = 0; i < ARRAY_SIZE(exynos3250_list_feed); i++) {
      |                         ^
drivers/soc/samsung/exynos5250-pmu.c:162:18: error: use of undeclared identifier 'ARRAY_SIZE'
  162 |         for (i = 0; i < ARRAY_SIZE(exynos5_list_both_cnt_feed); i++) {
      |                         ^

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250305211446.43772-1-arnd@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
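The fix is just the explicit include; for example (array contents illustrative):

  #include <linux/array_size.h>   /* ARRAY_SIZE() no longer relied on via asm/io.h */

  static const unsigned int list_feed[] = { 1, 2, 3 };

  static void feed_all(void)
  {
          unsigned int i;

          /* ARRAY_SIZE() expands to sizeof(arr) / sizeof(arr[0]) plus a
           * same-type check on the argument. */
          for (i = 0; i < ARRAY_SIZE(list_feed); i++)
                  (void)list_feed[i];     /* program the value here */
  }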
2025-03-06  media: vim2m: print device name after registering device  (Matthew Majewski)
Move the v4l2_info() call displaying the video device name after the device is actually registered. This fixes a bug where the driver was always displaying "/dev/video0" since it was reading from the vfd before it was registered. Fixes: cf7f34777a5b ("media: vim2m: Register video device after setting up internals") Cc: stable@vger.kernel.org Signed-off-by: Matthew Majewski <mattwmajewski@gmail.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2025-03-06  media: vivid: Introduce VIDEO_VIVID_OSD  (Ricardo Ribalda)
vivid-osd depends on CONFIG_FB, which can be a large dependency. Introduce CONFIG_VIDEO_VIVID_OSD to control enabling support for testing output overlay. Suggested-by: Slawomir Rosek <srosek@google.com> Co-developed-by: Slawomir Rosek <srosek@google.com> Signed-off-by: Slawomir Rosek <srosek@google.com> Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> [hverkuil: add newline to squash checkpatch warning]