Age | Commit message (Collapse) | Author |
|
Support for CPU address mirror bindings in SRAM fully in place, enable the
implementation.
v3:
- s/system allocator/CPU address mirror (Thomas)
v7:
- Only enable uAPI if selected by GPU SVM (CI)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-19-matthew.brost@intel.com
|
|
uAPI is designed with the use case that only mapping a BO to a malloc'd
address will unbind a CPU-address mirror VMA. Therefore, allowing a
CPU-address mirror VMA to unbind when the GPU has bindings in the range
being unbound does not make much sense. This behavior is not supported,
as it simplifies the code. This decision can always be revisited if a
use case arises.
v3:
- s/arrises/arises (Thomas)
- s/system allocator/GPU address mirror (Thomas)
- Kernel doc (Thomas)
- Newline between function defs (Thomas)
v5:
- Kernel doc (Thomas)
v6:
- Only compile if CONFIG_DRM_GPUSVM selected (CI, Lucas)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-18-matthew.brost@intel.com
|
|
Add unbind to SVM garbage collector. To facilitate add unbind support
function to VM layer which unbinds a SVM range. Also teach PT layer to
understand unbinds of SVM ranges.
v3:
- s/INVALID_VMA/XE_INVALID_VMA (Thomas)
- Kernel doc (Thomas)
- New GPU SVM range structure (Thomas)
- s/DRM_GPUVA_OP_USER/DRM_GPUVA_OP_DRIVER (Thomas)
v4:
- Use xe_vma_op_unmap_range (Himal)
v5:
- s/PY/PT (Thomas)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-17-matthew.brost@intel.com
|
|
Add basic SVM garbage collector which destroy a SVM range upon a MMU
UNMAP event. The garbage collector runs on worker or in GPU fault
handler and is required as locks in the path of reclaim are required and
cannot be taken the notifier.
v2:
- Flush garbage collector in xe_svm_close
v3:
- Better commit message (Thomas)
- Kernel doc (Thomas)
- Use list_first_entry_or_null for garbage collector loop (Thomas)
- Don't add to garbage collector if VM is closed (Thomas)
v4:
- Use %pe to print error (Thomas)
v5:
- s/visable/visible (Thomas)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-16-matthew.brost@intel.com
|
|
Add (re)bind to SVM page fault handler. To facilitate add support
function to VM layer which (re)binds a SVM range. Also teach PT layer to
understand (re)binds of SVM ranges.
v2:
- Don't assert BO lock held for range binds
- Use xe_svm_notifier_lock/unlock helper in xe_svm_close
- Use drm_pagemap dma cursor
- Take notifier lock in bind code to check range state
v3:
- Use new GPU SVM range structure (Thomas)
- Kernel doc (Thomas)
- s/DRM_GPUVA_OP_USER/DRM_GPUVA_OP_DRIVER (Thomas)
v5:
- Kernel doc (Thomas)
v6:
- Only compile if CONFIG_DRM_GPUSVM selected (CI, Lucas)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Tested-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-15-matthew.brost@intel.com
|
|
Add SVM range invalidation vfunc which invalidates PTEs. A new PT layer
function which accepts a SVM range is added to support this. In
addition, add the basic page fault handler which allocates a SVM range
which is used by SVM range invalidation vfunc.
v2:
- Don't run invalidation if VM is closed
- Cycle notifier lock in xe_svm_close
- Drop xe_gt_tlb_invalidation_fence_fini
v3:
- Better commit message (Thomas)
- Add lockdep asserts (Thomas)
- Add kernel doc (Thomas)
- s/change/changed (Thomas)
- Use new GPU SVM range / notifier structures
- Ensure PTEs are zapped / dma mappings are unmapped on VM close (Thomas)
v4:
- Fix macro (Checkpatch)
v5:
- Use range start/end helpers (Thomas)
- Use notifier start/end helpers (Thomas)
v6:
- Use min/max helpers (Himal)
- Only compile if CONFIG_DRM_GPUSVM selected (CI, Lucas)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-13-matthew.brost@intel.com
|
|
Clear root PT entry and invalidate entire VM's address space when
closing the VM. Will prevent the GPU from accessing any of the VM's
memory after closing.
v2:
- s/vma/vm in kernel doc (CI)
- Don't nuke migration VM as this occur at driver unload (CI)
v3:
- Rebase and pull into SVM series (Thomas)
- Wait for pending binds (Thomas)
v5:
- Remove xe_gt_tlb_invalidation_fence_fini in error case (Matt Auld)
- Drop local migration bool (Thomas)
v7:
- Add drm_dev_enter/exit protecting invalidation (CI, Matt Auld)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-12-matthew.brost@intel.com
|
|
Add dma_addr res cursor which walks an array of drm_pagemap_dma_addr.
Useful for SVM ranges and programing page tables.
v3:
- Better commit message (Thomas)
- Use new drm_pagemap.h location
v7:
- Fix kernel doc (CI)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-11-matthew.brost@intel.com
|
|
Add SVM init / close / fini to faulting VMs. Minimual implementation
acting as a placeholder for follow on patches.
v2:
- Add close function
v3:
- Better commit message (Thomas)
- Kernel doc (Thomas)
- Update chunk array to be unsigned long (Thomas)
- Use new drm_gpusvm.h header location (Thomas)
- Newlines between functions in xe_svm.h (Thomas)
- Call drm_gpusvm_driver_set_lock in init (Thomas)
v6:
- Only compile if CONFIG_DRM_GPUSVM selected (CI, Lucas)
v7:
- Only select CONFIG_DRM_GPUSVM if DEVICE_PRIVATE (CI)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-10-matthew.brost@intel.com
|
|
Add the DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag, which is used to
create unpopulated virtual memory areas (VMAs) without memory backing or
GPU page tables. These VMAs are referred to as CPU address mirror VMAs.
The idea is that upon a page fault or prefetch, the memory backing and
GPU page tables will be populated.
CPU address mirror VMAs only update GPUVM state; they do not have an
internal page table (PT) state, nor do they have GPU mappings.
It is expected that CPU address mirror VMAs will be mixed with buffer
object (BO) VMAs within a single VM. In other words, system allocations
and runtime allocations can be mixed within a single user-mode driver
(UMD) program.
Expected usage:
- Bind the entire virtual address (VA) space upon program load using the
DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag.
- If a buffer object (BO) requires GPU mapping (runtime allocation),
allocate a CPU address using mmap(PROT_NONE), bind the BO to the
mmapped address using existing bind IOCTLs. If a CPU map of the BO is
needed, mmap it again to the same CPU address using mmap(MAP_FIXED)
- If a BO no longer requires GPU mapping, munmap it from the CPU address
space and them bind the mapping address with the
DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag.
- Any malloc'd or mmapped CPU address accessed by the GPU will be
faulted in via the SVM implementation (system allocation).
- Upon freeing any mmapped or malloc'd data, the SVM implementation will
remove GPU mappings.
Only supporting 1 to 1 mapping between user address space and GPU
address space at the moment as that is the expected use case. uAPI
defines interface for non 1 to 1 but enforces 1 to 1, this restriction
can be lifted if use cases arrise for non 1 to 1 mappings.
This patch essentially short-circuits the code in the existing VM bind
paths to avoid populating page tables when the
DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag is set.
v3:
- Call vm_bind_ioctl_ops_fini on -ENODATA
- Don't allow DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR on non-faulting VMs
- s/DRM_XE_VM_BIND_FLAG_SYSTEM_ALLOCATOR/DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR (Thomas)
- Rework commit message for expected usage (Thomas)
- Describe state of code after patch in commit message (Thomas)
v4:
- Fix alignment (Checkpatch)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-9-matthew.brost@intel.com
|
|
Xe depends on DRM_GPUSVM for SVM implementation, select it in Kconfig.
v6:
- Don't select DRM_GPUSVM if UML (CI)
v7:
- Only select DRM_GPUSVM if DEVICE_PRIVATE (CI)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-8-matthew.brost@intel.com
|
|
This patch introduces support for GPU Shared Virtual Memory (SVM) in the
Direct Rendering Manager (DRM) subsystem. SVM allows for seamless
sharing of memory between the CPU and GPU, enhancing performance and
flexibility in GPU computing tasks.
The patch adds the necessary infrastructure for SVM, including data
structures and functions for managing SVM ranges and notifiers. It also
provides mechanisms for allocating, deallocating, and migrating memory
regions between system RAM and GPU VRAM.
This is largely inspired by GPUVM.
v2:
- Take order into account in check pages
- Clear range->pages in get pages error
- Drop setting dirty or accessed bit in get pages (Vetter)
- Remove mmap assert for cpu faults
- Drop mmap write lock abuse (Vetter, Christian)
- Decouple zdd from range (Vetter, Oak)
- Add drm_gpusvm_range_evict, make it work with coherent pages
- Export drm_gpusvm_evict_to_sram, only use in BO evict path (Vetter)
- mmget/put in drm_gpusvm_evict_to_sram
- Drop range->vram_alloation variable
- Don't return in drm_gpusvm_evict_to_sram until all pages detached
- Don't warn on mixing sram and device pages
- Update kernel doc
- Add coherent page support to get pages
- Use DMA_FROM_DEVICE rather than DMA_BIDIRECTIONAL
- Add struct drm_gpusvm_vram and ops (Thomas)
- Update the range's seqno if the range is valid (Thomas)
- Remove the is_unmapped check before hmm_range_fault (Thomas)
- Use drm_pagemap (Thomas)
- Drop kfree_mapping (Thomas)
- dma mapp pages under notifier lock (Thomas)
- Remove ctx.prefault
- Remove ctx.mmap_locked
- Add ctx.check_pages
- s/vram/devmem (Thomas)
v3:
- Fix memory leak drm_gpusvm_range_get_pages
- Only migrate pages with same zdd on CPU fault
- Loop over al VMAs in drm_gpusvm_range_evict
- Make GPUSVM a drm level module
- GPL or MIT license
- Update main kernel doc (Thomas)
- Prefer foo() vs foo for functions in kernel doc (Thomas)
- Prefer functions over macros (Thomas)
- Use unsigned long vs u64 for addresses (Thomas)
- Use standard interval_tree (Thomas)
- s/drm_gpusvm_migration_put_page/drm_gpusvm_migration_unlock_put_page (Thomas)
- Drop err_out label in drm_gpusvm_range_find_or_insert (Thomas)
- Fix kernel doc in drm_gpusvm_range_free_pages (Thomas)
- Newlines between functions defs in header file (Thomas)
- Drop shall language in driver vfunc kernel doc (Thomas)
- Move some static inlines from head to C file (Thomas)
- Don't allocate pages under page lock in drm_gpusvm_migrate_populate_ram_pfn (Thomas)
- Change check_pages to a threshold
v4:
- Fix NULL ptr deref in drm_gpusvm_migrate_populate_ram_pfn (Thomas, Himal)
- Fix check pages threshold
- Check for range being unmapped under notifier lock in get pages (Testing)
- Fix characters per line
- Drop WRITE_ONCE for zdd->devmem_allocation assignment (Thomas)
- Use completion for devmem_allocation->detached (Thomas)
- Make GPU SVM depend on ZONE_DEVICE (CI)
- Use hmm_range_fault for eviction (Thomas)
- Drop zdd worker (Thomas)
v5:
- Select Kconfig deps (CI)
- Set device to NULL in __drm_gpusvm_migrate_to_ram (Matt Auld, G.G.)
- Drop Thomas's SoB (Thomas)
- Add drm_gpusvm_range_start/end/size helpers (Thomas)
- Add drm_gpusvm_notifier_start/end/size helpers (Thomas)
- Absorb drm_pagemap name changes (Thomas)
- Fix driver lockdep assert (Thomas)
- Move driver lockdep assert to static function (Thomas)
- Assert mmap lock held in drm_gpusvm_migrate_to_devmem (Thomas)
- Do not retry forever on eviction (Thomas)
v6:
- Fix drm_gpusvm_get_devmem_page alignment (Checkpatch)
- Modify Kconfig (CI)
- Compile out lockdep asserts (CI)
v7:
- Add kernel doc for flags fields (CI, Auld)
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-7-matthew.brost@intel.com
|
|
Introduce xe_bo_put_async to put a bo where the context is such that
the bo destructor can't run due to lockdep problems or atomic context.
If the put is the final put, freeing will be done from a work item.
v5:
- Kerenl doc for xe_bo_put_async (Thomas)
v7:
- Fix kernel doc (CI)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Tested-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-6-matthew.brost@intel.com
|
|
TTM doesn't support fair eviction via WW locking, this mitigated in by
using retry loops in exec and preempt rebind worker. Extend this retry
loop to BO allocation. Once TTM supports fair eviction this patch can be
reverted.
v4:
- Keep line break (Stuart)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-2-matthew.brost@intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bluetooth and wireless.
Current release - new code bugs:
- wifi: nl80211: disable multi-link reconfiguration
Previous releases - regressions:
- gso: fix ownership in __udp_gso_segment
- wifi: iwlwifi:
- fix A-MSDU TSO preparation
- free pages allocated when failing to build A-MSDU
- ipv6: fix dst ref loop in ila lwtunnel
- mptcp: fix 'scheduling while atomic' in
mptcp_pm_nl_append_new_local_addr
- bluetooth: add check for mgmt_alloc_skb() in
mgmt_device_connected()
- ethtool: allow NULL nlattrs when getting a phy_device
- eth: be2net: fix sleeping while atomic bugs in
be_ndo_bridge_getlink
Previous releases - always broken:
- core: support TCP GSO case for a few missing flags
- wifi: mac80211:
- fix vendor-specific inheritance
- cleanup sta TXQs on flush
- llc: do not use skb_get() before dev_queue_xmit()
- eth: ipa: nable checksum for IPA_ENDPOINT_AP_MODEM_{RX,TX}
for v4.7"
* tag 'net-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (41 commits)
net: ipv6: fix missing dst ref drop in ila lwtunnel
net: ipv6: fix dst ref loop in ila lwtunnel
mctp i3c: handle NULL header address
net: dsa: mt7530: Fix traffic flooding for MMIO devices
net-timestamp: support TCP GSO case for a few missing flags
vlan: enforce underlying device type
mptcp: fix 'scheduling while atomic' in mptcp_pm_nl_append_new_local_addr
net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device
ppp: Fix KMSAN uninit-value warning with bpf
net: ipa: Enable checksum for IPA_ENDPOINT_AP_MODEM_{RX,TX} for v4.7
net: ipa: Fix QSB data for v4.7
net: ipa: Fix v4.7 resource group names
net: hns3: make sure ptp clock is unregister and freed if hclge_ptp_get_cycle returns an error
wifi: nl80211: disable multi-link reconfiguration
net: dsa: rtl8366rb: don't prompt users for LED control
be2net: fix sleeping while atomic bugs in be_ndo_bridge_getlink
llc: do not use skb_get() before dev_queue_xmit()
wifi: cfg80211: regulatory: improve invalid hints checking
caif_virtio: fix wrong pointer check in cfv_probe()
net: gso: fix ownership in __udp_gso_segment
...
|
|
Now that CDM_0 has been enabled for DPU 5.x+, add support for YUV formats
on writeback
Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/641270/
Signed-off-by: Rob Clark <robdclark@chromium.org>
|
|
To prevent incorrect BW calculation, zero out dpu_core_perf_params
before it is passed into dpu_core_perf_aggregate().
Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Fixes: 795aef6f3653 ("drm/msm/dpu: remove duplicate code calculating sum of bandwidths")
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/641278/
Signed-off-by: Rob Clark <robdclark@chromium.org>
|
|
Add CXL mailbox Features commands enabling. This is also preparation for
CXL fwctl enabling. The same code will also be utilized by the CXL EDAC
enabling. The commands 'Get Supported Features', 'Get Feature', and 'Set
Feature' are enabled for kernel usages.
Required for the CXL fwctl driver.
* branch 'for-6.15/features'
cxl: Setup exclusive CXL features that are reserved for the kernel
cxl/mbox: Add SET_FEATURE mailbox command
cxl/mbox: Add GET_FEATURE mailbox command
cxl/test: Add Get Supported Features mailbox command support
cxl: Add Get Supported Features command for kernel usage
cxl: Enumerate feature commands
cxl: Refactor user ioctl command path from mds to mailbox
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
If the device supports User Context then it can support fwctl. Create an
auxiliary device to allow fwctl to bind to it.
Create a sysfs like:
$ ls /sys/devices/pci0000:00/0000:00:0a.0/mlx5_core.fwctl.0/driver -l
lrwxrwxrwx 1 root root 0 Apr 25 19:46 /sys/devices/pci0000:00/0000:00:0a.0/mlx5_core.fwctl.0/driver -> ../../../../bus/auxiliary/drivers/mlx5_fwctl.mlx5_fwctl
Link: https://patch.msgid.link/r/8-v5-642aa0c94070+4447f-fwctl_jgg@nvidia.com
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
mlx5 FW has a built in security context called UID. Each UID has a set of
permissions controlled by the kernel when it is created and every command
is tagged by the kernel with a particular UID. In general commands cannot
reach objects outside of their UID and commands cannot exceed their UID's
permissions. These restrictions are enforced by FW.
This mechanism has long been used in RDMA for the devx interface where
RDMA will sent commands directly to the FW and the UID limitations
restrict those commands to a ib_device/verbs security domain. For instance
commands that would effect other VFs, or global device resources. The
model is suitable for unprivileged userspace to operate the RDMA
functionality.
The UID has been extended with a "tools resources" permission which allows
additional commands and sub-commands that are intended to match with the
scope limitations set in FWCTL. This is an alternative design to the
"command intent log" where the FW does the enforcement rather than having
the FW report the enforcement the kernel should do.
Consistent with the fwctl definitions the "tools resources" security
context is limited to the FWCTL_RPC_CONFIGURATION,
FWCTL_RPC_DEBUG_READ_ONLY, FWCTL_RPC_DEBUG_WRITE, and
FWCTL_RPC_DEBUG_WRITE_FULL security scopes.
Like RDMA devx, each opened fwctl file descriptor will get a unique UID
associated with each file descriptor.
The fwctl driver is kept simple and we reject commands that can create
objects as the UID mechanism relies on the kernel to track and destroy
objects prior to detroying the UID. Filtering into fwctl sub scopes is
done inside the driver with a switch statement. This substantially limits
what is possible to primarily query functions ad a few limited set
operations.
mlx5 already has a robust infrastructure for delivering RPC messages to
fw. Trivially connect fwctl's RPC mechanism to mlx5_cmd_do(). Enforce the
User Context ID in every RPC header accepted from the FD so the FW knows
the security context of the issuing ID.
Link: https://patch.msgid.link/r/7-v5-642aa0c94070+4447f-fwctl_jgg@nvidia.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Add the FWCTL_RPC ioctl which allows a request/response RPC call to device
firmware. Drivers implementing this call must follow the security
guidelines under Documentation/userspace-api/fwctl.rst
The core code provides some memory management helpers to get the messages
copied from and back to userspace. The driver is responsible for
allocating the output message memory and delivering the message to the
device.
Link: https://patch.msgid.link/r/5-v5-642aa0c94070+4447f-fwctl_jgg@nvidia.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Userspace will need to know some details about the fwctl interface being
used to locate the correct userspace code to communicate with the
kernel. Provide a simple device_type enum indicating what the kernel
driver is.
Allow the device to provide a device specific info struct that contains
any additional information that the driver may need to provide to
userspace.
Link: https://patch.msgid.link/r/3-v5-642aa0c94070+4447f-fwctl_jgg@nvidia.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Each file descriptor gets a chunk of per-FD driver specific context that
allows the driver to attach a device specific struct to. The core code
takes care of the memory lifetime for this structure.
The ioctl dispatch and design is based on what was built for iommufd. The
ioctls have a struct which has a combined in/out behavior with a typical
'zero pad' scheme for future extension and backwards compatibility.
Like iommufd some shared logic does most of the ioctl marshaling and
compatibility work and table dispatches to some function pointers for
each unique ioctl.
This approach has proven to work quite well in the iommufd and rdma
subsystems.
Allocate an ioctl number space for the subsystem.
Link: https://patch.msgid.link/r/2-v5-642aa0c94070+4447f-fwctl_jgg@nvidia.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Create the class, character device and functions for a fwctl driver to
un/register to the subsystem.
A typical fwctl driver has a sysfs presence like:
$ ls -l /dev/fwctl/fwctl0
crw------- 1 root root 250, 0 Apr 25 19:16 /dev/fwctl/fwctl0
$ ls /sys/class/fwctl/fwctl0
dev device power subsystem uevent
$ ls /sys/class/fwctl/fwctl0/device/infiniband/
ibp0s10f0
$ ls /sys/class/infiniband/ibp0s10f0/device/fwctl/
fwctl0/
$ ls /sys/devices/pci0000:00/0000:00:0a.0/fwctl/fwctl0
dev device power subsystem uevent
Which allows userspace to link all the multi-subsystem driver components
together and learn the subsystem specific names for the device's
components.
Link: https://patch.msgid.link/r/1-v5-642aa0c94070+4447f-fwctl_jgg@nvidia.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
When the CPU goes offline there is no need to change the CPPC request
because the CPU will go into the deepest C-state it supports already.
Actually changing the CPPC request when it goes offline messes up the
cached values and can lead to the wrong values being restored when
it comes back.
Instead drop the actions and if the CPU comes back online let
amd_pstate_epp_set_policy() restore it to expected values.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
EPP values are cached in the cpudata structure per CPU. This is needless
though because they are also cached in the CPPC request variable.
Drop the separate cache for EPP values and always reference the CPPC
request variable when needed.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
The CPPC enable register is configured as "write once". That is
any future writes don't actually do anything.
Because of this, all the cleanup paths that currently exist for
CPPC disable are non-effective.
Rework CPPC enable to only enable after all the CAP registers have
been read to avoid enabling CPPC on CPUs with invalid _CPC or
unpopulated MSRs.
As the register is write once, remove all cleanup paths as well.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
There are trace events that exist now for all amd-pstate modes that
will output information right before programming to the hardware.
This makes the existing debug statements unnecessary remaining
overhead. Drop them.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
On EPP only writes update the cached variable so that the min/max
performance controls don't need to be updated again.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
functions
The EPP tracing is done by the caller today, but this precludes the
information about whether the CPPC request has changed.
Move it into the update_perf and set_epp functions and include information
about whether the request has changed from the last one.
amd_pstate_update_perf() and amd_pstate_set_epp() now require the policy
as an argument instead of the cpudata.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
In order to prevent a potential write for shmem_update_perf()
cache the request into the cppc_req_cached variable normally only
used for the MSR case.
This adds symmetry into the code and potentially avoids extra writes.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
Bitfield masks are easier to follow and less error prone.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
In amd_pstate_ut_check_freq() and amd_pstate_ut_check_perf() the cpudata
variable is only needed in the scope of the for loop. Move it there.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
If a CPU is missing a policy or one has been offlined then the unit test
is skipped for the rest of the CPUs on the system.
Instead; iterate online CPUs and skip any missing policies to allow
continuing to test the rest of them.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
Enums are effectively used as a boolean and don't show
the return value of the failing call.
Instead of using enums switch to returning the actual return
code from the unit test.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
Several Ryzen AI processors support the exact same value for lowest
nonlinear perf and lowest perf. Loosen up the unit tests to allow this
scenario.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
Using a scoped cleanup macro simplifies cleanup code.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
The `cppc_cap1_cached` variable isn't used at all, there is no
need to read it at initialization for each CPU.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
amd_pstate_cpu_boost_update() and refresh_frequency_limits() both
update the policy state and have nothing to do with the amd-pstate
driver itself.
A global "limits" lock doesn't make sense because each CPU can have
policies changed independently. Each time a CPU changes values they
will atomically be written to the per-CPU perf member. Drop per CPU
locking cases.
The remaining "global" driver lock is used to ensure that only one
entity can change driver modes at a given time.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
By storing perf values in a union all the writes and reads can
be done atomically, removing the need for some concurrency protections.
While making this change, also drop the cached frequency values,
using inline helpers to calculate them on demand from perf value.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
Use the perf_to_freq helpers to calculate this on the fly.
As the members are no longer cached add an extra check into
amd_pstate_epp_update_limit() to avoid unnecessary calls in
amd_pstate_update_min_max_limit().
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
I came across a system that MSR_AMD_CPPC_CAP1 for some CPUs isn't
populated. This is an unexpected behavior that is most likely a
BIOS bug. In the event it happens I'd like users to report bugs
to properly root cause and get this fixed.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
During resume it's possible the firmware didn't restore the CPPC request
MSR but the kernel thinks the values line up. This leads to incorrect
performance after resume from suspend.
To fix the issue invalidate the cached value at suspend. During resume use
the saved values programmed as cached limits.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reported-by: Miroslav Pavleski <miroslav@pavleski.net>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
The clamping in freq_to_perf() is broken right now, as we first typecast
(read wraparound) the overflowing value into a u8 and then clamp it down.
So, use a u32 to store the >255 value in certain edge cases and then clamp
it down into a u8.
Also, use a "explicit typecast + clamp" instead of just a "clamp_t" as the
latter typecasts first and then clamps between the limits, which defeats
our purpose.
Fixes: 620136ced35a ("cpufreq/amd-pstate: Modularize perf<->freq conversion")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250222033221.554976-1-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
PCIe r6.1, sec 6.30.1.1, describes a "Vendor ID", a "Data Object Type" and
"Next Index" as the fields in the DOE Discovery Response Data Object. The
DOE driver currently uses both the terms 'type' and 'prot' for the second
element.
Rename all uses of the DOE Discovery Response Data Object to use 'type' as
the second element of the object header, instead of type/prot as it
currently is.
Link: https://lore.kernel.org/r/20250306075211.1855177-2-alistair@alistair23.me
Signed-off-by: Alistair Francis <alistair@alistair23.me>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
DOE r1.1 replaced all occurrences of "protocol" with the term "feature" or
"Data Object Type". PCIe r6.1 incorporated that change.
Rename the existing terms protocol with feature.
Link: https://lore.kernel.org/r/20250306075211.1855177-1-alistair@alistair23.me
Signed-off-by: Alistair Francis <alistair@alistair23.me>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
|
|
Replace ternary (condition ? "enable" : "disable") syntax with helpers
from string_choices.h because:
1. Simple function call with one argument is easier to read. Ternary
operator has three arguments and with wrapping might lead to quite
long code.
2. Is slightly shorter thus also easier to read.
3. It brings uniformity in the text - same string.
4. Allows deduping by the linker, which results in a smaller binary
file.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20250114203638.1013670-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
This does not necessarily get included through asm/io.h:
drivers/soc/samsung/exynos3250-pmu.c:120:18: error: use of undeclared identifier 'ARRAY_SIZE'
120 | for (i = 0; i < ARRAY_SIZE(exynos3250_list_feed); i++) {
| ^
drivers/soc/samsung/exynos5250-pmu.c:162:18: error: use of undeclared identifier 'ARRAY_SIZE'
162 | for (i = 0; i < ARRAY_SIZE(exynos5_list_both_cnt_feed); i++) {
| ^
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250305211446.43772-1-arnd@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
|
|
Move the v4l2_info() call displaying the video device name after the
device is actually registered.
This fixes a bug where the driver was always displaying "/dev/video0"
since it was reading from the vfd before it was registered.
Fixes: cf7f34777a5b ("media: vim2m: Register video device after setting up internals")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Majewski <mattwmajewski@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
|
|
vivid-osd depends on CONFIG_FB, which can be a large dependency. Introduce
CONFIG_VIDEO_VIVID_OSD to control enabling support for testing output
overlay.
Suggested-by: Slawomir Rosek <srosek@google.com>
Co-developed-by: Slawomir Rosek <srosek@google.com>
Signed-off-by: Slawomir Rosek <srosek@google.com>
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
[hverkuil: add newline to squash checkpatch warning]
|