Age | Commit message (Collapse) | Author |
|
This patch adds code to cleanup any leftover userqueues which
a user might have missed to destroy due to a crash or any other
programming error.
V7: Added Alex's R-B
V8: Rebase
V9: Rebase
V10: Rebase
V11: Rebase
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The userspace sends us the doorbell object and the relative doobell
index in the object to be used for the usermode queue, but the FW
expects the absolute doorbell index on the PCI BAR in the MQD. This
patch adds a function to convert this relative doorbell index to
absolute doorbell index.
V5: Fix the db object reference leak (Christian)
V6: Pin the doorbell bo in userqueue_create() function, and unpin it
in userqueue destoy (Christian)
V7: Added missing kfree for queue in error cases
Added Alex's R-B
V8: Rebase
V9: Changed the function names from gfx_v11* to mes_v11*
V10: Rebase
V11: Rebase
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
To support oversubscription, MES FW expects WPTR BOs to
be mapped into GART, before they are submitted to usermode
queues. This patch adds a function for the same.
V4: fix the wptr value before mapping lookup (Bas, Christian).
V5: Addressed review comments from Christian:
- Either pin object or allocate from GART, but not both.
- All the handling must be done with the VM locks held.
V7: Addressed review comments from Christian:
- Do not take vm->eviction_lock
- Use amdgpu_bo_gpu_offset to get the wptr_bo GPU offset
V8: Rebase
V9: Changed the function names from gfx_v11* to mes_v11*
V10: Remove unused adev (Harish)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch adds new functions to map/unmap a usermode queue into
the FW, using the MES ring. As soon as this mapping is done, the
queue would be considered ready to accept the workload.
V1: Addressed review comments from Alex on the RFC patch series
- Map/Unmap should be IP specific.
V2:
Addressed review comments from Christian:
- Fix the wptr_mc_addr calculation (moved into another patch)
Addressed review comments from Alex:
- Do not add fptrs for map/unmap
V3: Integration with doorbell manager
V4: Rebase
V5: Use gfx_v11_0 for function names (Alex)
V6: Removed queue->proc/gang/fw_ctx_address variables and doing the
address calculations locally to keep the queue structure GEN
independent (Alex)
V7: Added R-B from Alex
V8: Rebase
V9: Rebase
V10: Rebase
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The MES FW expects us to allocate at least one page as context
space to process gang and process related context data. This
patch creates a joint object for the same, and calculates GPU
space offsets of these spaces.
V1: Addressed review comments on RFC patch:
Alex: Make this function IP specific
V2: Addressed review comments from Christian
- Allocate only one object for total FW space, and calculate
offsets for each of these objects.
V3: Integration with doorbell manager
V4: Review comments:
- Remove shadow from FW space list from cover letter (Alex)
- Alignment of macro (Luben)
V5: Merged patches 5 and 6 into this single patch
Addressed review comments:
- Use lower_32_bits instead of mask (Christian)
- gfx_v11_0 instead of gfx_v11 in function names (Alex)
- Shadow and GDS objects are now coming from userspace (Christian,
Alex)
V6:
- Add a comment to replace amdgpu_bo_create_kernel() with
amdgpu_bo_create() during fw_ctx object creation (Christian).
- Move proc_ctx_gpu_addr, gang_ctx_gpu_addr and fw_ctx_gpu_addr out
of generic queue structure and make it gen11 specific (Alex).
V7:
- Using helper function to create/destroy userqueue objects.
- Removed FW object space allocation.
V8:
- Updating FW object address from user values.
V9:
- uppdated function name from gfx_v11_* to mes_v11_*
V10:
- making this patch independent of IP based changes, moving any
GFX object related changes in GFX specific patch (Alex)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Acked-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
A Memory queue descriptor (MQD) of a userqueue defines it in
the hw's context. As MQD format can vary between different
graphics IPs, we need gfx GEN specific handlers to create MQDs.
This patch:
- Adds a new file which will be used for MES based userqueue
functions targeting GFX and SDMA IP.
- Introduces MQD handler functions for the usermode queues.
V1: Worked on review comments from Alex:
- Make MQD functions GEN and IP specific
V2: Worked on review comments from Alex:
- Reuse the existing adev->mqd[ip] for MQD creation
- Formatting and arrangement of code
V3:
- Integration with doorbell manager
V4: Review comments addressed:
- Do not create a new file for userq, reuse gfx_v11_0.c (Alex)
- Align name of structure members (Luben)
- Don't break up the Cc tag list and the Sob tag list in commit
message (Luben)
V5:
- No need to reserve the bo for MQD (Christian).
- Some more changes to support IP specific MQD creation.
V6:
- Add a comment reminding us to replace the amdgpu_bo_create_kernel()
calls while creating MQD object to amdgpu_bo_create() once eviction
fences are ready (Christian).
V7:
- Re-arrange userqueue functions in adev instead of uq_mgr (Alex)
- Use memdup_user instead of copy_from_user (Christian)
V9:
- Moved userqueue code from gfx_v11_0.c to new file mes_v11_0.c so
that it can be reused for SDMA userqueues as well (Shashank, Alex)
V10: Addressed review comments from Alex
- Making this patch independent of IP engine(GFX/SDMA/Compute) and
specific to MES V11 only, using the generic MQD structure.
- Splitting a spearate patch to enabling GFX support from here.
- Verify mqd va address to be non-NULL.
- Add a separate header file.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch introduces amdgpu_userqueue_object and its helper
functions to creates and destroy this object. The helper
functions creates/destroys a base amdgpu_bo, kmap/unmap it and
save the respective GPU and CPU addresses in the encapsulating
userqueue object.
These helpers will be used to create/destroy userqueue MQD, WPTR
and FW areas.
V7:
- Forked out this new patch from V11-gfx-userqueue patch to prevent
that patch from growing very big.
- Using amdgpu_bo_create instead of amdgpu_bo_create_kernel in prep
for eviction fences (Christian)
V9:
- Rebase
V10:
- Added Alex's R-B
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch adds:
- A new IOCTL function to create and destroy
- A new structure to keep all the user queue data in one place.
- A function to generate unique index for the queue.
V1: Worked on review comments from RFC patch series:
- Alex: Keep a list of queues, instead of single queue per process.
- Christian: Use the queue manager instead of global ptrs,
Don't keep the queue structure in amdgpu_ctx
V2: Worked on review comments:
- Christian:
- Formatting of text
- There is no need for queuing of userqueues, with idr in place
- Alex:
- Remove use_doorbell, its unnecessary
- Reuse amdgpu_mqd_props for saving mqd fields
- Code formatting and re-arrangement
V3:
- Integration with doorbell manager
V4:
- Accommodate MQD union related changes in UAPI (Alex)
- Do not set the queue size twice (Bas)
V5:
- Remove wrapper functions for queue indexing (Christian)
- Do not save the queue id/idr in queue itself (Christian)
- Move the idr allocation in the IP independent generic space
(Christian)
V6:
- Check the validity of input IP type (Christian)
V7:
- Move uq_func from uq_mgr to adev (Alex)
- Add missing free(queue) for error cases (Yifan)
V9:
- Rebase
V10: Addressed review comments from Christian, and added R-B:
- Do not initialize the local variable
- Convert DRM_ERROR to DEBUG.
V11:
- check the input flags to be zero (Alex)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch adds IP independent skeleton code for amdgpu
usermode queue. It contains:
- A new files with init functions of usermode queues.
- A queue context manager in driver private data.
V1: Worked on design review comments from RFC patch series:
(https://patchwork.freedesktop.org/series/112214/)
- Alex: Keep a list of queues, instead of single queue per process.
- Christian: Use the queue manager instead of global ptrs,
Don't keep the queue structure in amdgpu_ctx
V2:
- Reformatted code, split the big patch into two
V3:
- Integration with doorbell manager
V4:
- Align the structure member names to the largest member's column
(Luben)
- Added SPDX license (Luben)
V5:
- Do not add amdgpu.h in amdgpu_userqueue.h (Christian).
- Move struct amdgpu_userq_mgr into amdgpu_userqueue.h (Christian).
V6: Rebase
V9: Rebase
V10: Rebase + Alex's R-B
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch intorduces new UAPI/IOCTL for usermode graphics
queue. The userspace app will fill this structure and request
the graphics driver to add a graphics work queue for it. The
output of this UAPI is a queue id.
This UAPI maps the queue into GPU, so the graphics app can start
submitting work to the queue as soon as the call returns.
V2: Addressed review comments from Alex and Christian
- Make the doorbell offset's comment clearer
- Change the output parameter name to queue_id
V3: Integration with doorbell manager
V4:
- Updated the UAPI doc (Pierre-Eric)
- Created a Union for engine specific MQDs (Alex)
- Added Christian's R-B
V5:
- Add variables for GDS and CSA in MQD structure (Alex)
- Make MQD data a ptr-size pair instead of union (Alex)
V9:
- renamed struct drm_amdgpu_userq_mqd_gfx_v11 to struct
drm_amdgpu_userq_mqd as its being used for SDMA and
compute queues as well
V10:
- keeping the drm_amdgpu_userq_mqd IP independent, moving the
_gfx_v11 objects in a separate structure in other patch.
(Alex)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The defines, shifts and masks are already available in dce_6_0_d.h,
dce_6_0_sh_mask.h.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Pretty much was already there, just not ported to amdgpu.
Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Classify DCE6 resource and sequencer as they are for other DCE versions
Put dce60_resource.c and .h under amd/display/dc/resource/dce60
Put and rename dce60_hw_sequencer.c and .h under amd/display/dc/hwss/dce60
v2: fix build when CONFIG_DRM_AMD_DC_SI=n (Alex)
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If a valid header is not found during RAS eeprom init, consider it as
new and reset RAS table info.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Support NPS2 RAS.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
ASPM doesn't need to be disabled if pcie dpm is disabled.
So ASPM can be independantly enabled.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add interface for hardware init by vcn instance.
v2: fix code format
Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
VFs query RAS error counts directly from host with
AMDGPU_RAS_VIRT_ERROR_COUNT_QUERY. When ACA is enabled,
an unusable aca_sysfs is created rather than amdgpu_ras_sysfs_create()
Likewise, VFs depend on host support to query CPERs, rather than ACA component.
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Zhigang Luo <Zhigang.luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In reference to memory carved out for APUs,
s/cave out/carve out/
Reviewed-by: shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
KIQ is replaced with MES on GFX 11 and newer.
Reviewed-by: shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
"interrupt" becomes "irq" in:
dce_vX_0_set_hpd_interrupt_state()
dce_vX_0_set_crtc_interrupt_state()
dce_vX_0_set_pageflip_interrupt_state()
It is easier when going through the code to just change the DCE number in
the functions' name to find and compare them across DCE versions.
Also, it standardizes function mapping inside a given structure where .set
and .process are both set to functions with a "_irq" suffix.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In DCE6, DCE8, DCE10, DCE11, "hdp" is replaced by "hpd" and
replace "type" by "hpd" for a uniform parameter naming usage across DCEs.
In link_factory.c, there is a missing "p" to "types"
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Don't fetch it again if we already have it. It seems the
registers don't reliably have the value at resume in some
cases.
Fixes: 785f0f9fe742 ("drm/amdgpu: Add mes v12_0 ip block support (v4)")
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The user can set any speed value.
If speed is greater than UINT_MAX/8, division by zero is possible.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 1e866f1fe528 ("drm/amd/pm: Prevent divide by zero")
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Rework heuristics for resolving the fault IPA (HPFAR_EL2 v. re-walk
stage-1 page tables) to align with the architecture. This avoids
possibly taking an SEA at EL2 on the page table walk or using an
architecturally UNKNOWN fault IPA
- Use acquire/release semantics in the KVM FF-A proxy to avoid
reading a stale value for the FF-A version
- Fix KVM guest driver to match PV CPUID hypercall ABI
- Use Inner Shareable Normal Write-Back mappings at stage-1 in KVM
selftests, which is the only memory type for which atomic
instructions are architecturally guaranteed to work
s390:
- Don't use %pK for debug printing and tracepoints
x86:
- Use a separate subclass when acquiring KVM's per-CPU posted
interrupts wakeup lock in the scheduled out path, i.e. when adding
a vCPU on the list of vCPUs to wake, to workaround a false positive
deadlock. The schedule out code runs with a scheduler lock that the
wakeup handler takes in the opposite order; but it does so with
IRQs disabled and cannot run concurrently with a wakeup
- Explicitly zero-initialize on-stack CPUID unions
- Allow building irqbypass.ko as as module when kvm.ko is a module
- Wrap relatively expensive sanity check with KVM_PROVE_MMU
- Acquire SRCU in KVM_GET_MP_STATE to protect guest memory accesses
selftests:
- Add more scenarios to the MONITOR/MWAIT test
- Add option to rseq test to override /dev/cpu_dma_latency
- Bring list of exit reasons up to date
- Cleanup Makefile to list once tests that are valid on all
architectures
Other:
- Documentation fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (26 commits)
KVM: arm64: Use acquire/release to communicate FF-A version negotiation
KVM: arm64: selftests: Explicitly set the page attrs to Inner-Shareable
KVM: arm64: selftests: Introduce and use hardware-definition macros
KVM: VMX: Use separate subclasses for PI wakeup lock to squash false positive
KVM: VMX: Assert that IRQs are disabled when putting vCPU on PI wakeup list
KVM: x86: Explicitly zero-initialize on-stack CPUID unions
KVM: Allow building irqbypass.ko as as module when kvm.ko is a module
KVM: x86/mmu: Wrap sanity check on number of TDP MMU pages with KVM_PROVE_MMU
KVM: selftests: Add option to rseq test to override /dev/cpu_dma_latency
KVM: x86: Acquire SRCU in KVM_GET_MP_STATE to protect guest memory accesses
Documentation: kvm: remove KVM_CAP_MIPS_TE
Documentation: kvm: organize capabilities in the right section
Documentation: kvm: fix some definition lists
Documentation: kvm: drop "Capability" heading from capabilities
Documentation: kvm: give correct name for KVM_CAP_SPAPR_MULTITCE
Documentation: KVM: KVM_GET_SUPPORTED_CPUID now exposes TSC_DEADLINE
selftests: kvm: list once tests that are valid on all architectures
selftests: kvm: bring list of exit reasons up to date
selftests: kvm: revamp MONITOR/MWAIT tests
KVM: arm64: Don't translate FAR if invalid/unsafe
...
|
|
HuC delayed loading fence, introduced with commit 27536e03271da
("drm/i915/huc: track delayed HuC load with a fence"), is registered with
object tracker early on driver probe but unregistered only from driver
remove, which is not called on early probe errors. Since its memory is
allocated under devres, then released anyway, it may happen to be
allocated again to the fence and reused on future driver probes, resulting
in kernel warnings that taint the kernel:
<4> [309.731371] ------------[ cut here ]------------
<3> [309.731373] ODEBUG: init destroyed (active state 0) object: ffff88813d7dd2e0 object type: i915_sw_fence hint: sw_fence_dummy_notify+0x0/0x20 [i915]
<4> [309.731575] WARNING: CPU: 2 PID: 3161 at lib/debugobjects.c:612 debug_print_object+0x93/0xf0
...
<4> [309.731693] CPU: 2 UID: 0 PID: 3161 Comm: i915_module_loa Tainted: G U 6.14.0-CI_DRM_16362-gf0fd77956987+ #1
...
<4> [309.731700] RIP: 0010:debug_print_object+0x93/0xf0
...
<4> [309.731728] Call Trace:
<4> [309.731730] <TASK>
...
<4> [309.731949] __debug_object_init+0x17b/0x1c0
<4> [309.731957] debug_object_init+0x34/0x50
<4> [309.732126] __i915_sw_fence_init+0x34/0x60 [i915]
<4> [309.732256] intel_huc_init_early+0x4b/0x1d0 [i915]
<4> [309.732468] intel_uc_init_early+0x61/0x680 [i915]
<4> [309.732667] intel_gt_common_init_early+0x105/0x130 [i915]
<4> [309.732804] intel_root_gt_init_early+0x63/0x80 [i915]
<4> [309.732938] i915_driver_probe+0x1fa/0xeb0 [i915]
<4> [309.733075] i915_pci_probe+0xe6/0x220 [i915]
<4> [309.733198] local_pci_probe+0x44/0xb0
<4> [309.733203] pci_device_probe+0xf4/0x270
<4> [309.733209] really_probe+0xee/0x3c0
<4> [309.733215] __driver_probe_device+0x8c/0x180
<4> [309.733219] driver_probe_device+0x24/0xd0
<4> [309.733223] __driver_attach+0x10f/0x220
<4> [309.733230] bus_for_each_dev+0x7d/0xe0
<4> [309.733236] driver_attach+0x1e/0x30
<4> [309.733239] bus_add_driver+0x151/0x290
<4> [309.733244] driver_register+0x5e/0x130
<4> [309.733247] __pci_register_driver+0x7d/0x90
<4> [309.733251] i915_pci_register_driver+0x23/0x30 [i915]
<4> [309.733413] i915_init+0x34/0x120 [i915]
<4> [309.733655] do_one_initcall+0x62/0x3f0
<4> [309.733667] do_init_module+0x97/0x2a0
<4> [309.733671] load_module+0x25ff/0x2890
<4> [309.733688] init_module_from_file+0x97/0xe0
<4> [309.733701] idempotent_init_module+0x118/0x330
<4> [309.733711] __x64_sys_finit_module+0x77/0x100
<4> [309.733715] x64_sys_call+0x1f37/0x2650
<4> [309.733719] do_syscall_64+0x91/0x180
<4> [309.733763] entry_SYSCALL_64_after_hwframe+0x76/0x7e
<4> [309.733792] </TASK>
...
<4> [309.733806] ---[ end trace 0000000000000000 ]---
That scenario is most easily reproducible with
igt@i915_module_load@reload-with-fault-injection.
Fix the issue by moving the cleanup step to driver release path.
Fixes: 27536e03271da ("drm/i915/huc: track delayed HuC load with a fence")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13592
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://lore.kernel.org/r/20250402172057.209924-2-janusz.krzysztofik@linux.intel.com
|
|
This is normally handled in the gfx IP suspend callbacks, but
for S0ix, those are skipped because we don't want to touch
gfx. So handle it in device suspend.
Fixes: b9467983b774 ("drm/amdgpu: add dynamic workload profile switching for gfx10")
Fixes: 963537ca2325 ("drm/amdgpu: add dynamic workload profile switching for gfx11")
Fixes: 5f95a1549555 ("drm/amdgpu: add dynamic workload profile switching for gfx12")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Pause the workload setting in dm when doing idle optimization
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add the callback for implementation for swsmu.
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
To be used for display idle optimizations when
we want to pause non-default profiles.
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In dev core dump, dump the full header fifo for
each queue. Each FIFO has 8 entries.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In dev core dump, dump the full header fifo for
each queue. Each FIFO has 8 entries.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In dev core dump, dump the full header fifo for
each queue. Each FIFO has 8 entries.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
ANNOTATE_IGNORE_ALTERNATIVE adds additional noise to the code generated
by CLAC/STAC alternatives, hurting readability for those whose read
uaccess-related code generation on a regular basis.
Remove the annotation specifically for the "NOP patched with CLAC/STAC"
case in favor of a manual check.
Leave the other uses of that annotation in place as they're less common
and more difficult to detect.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/fc972ba4995d826fcfb8d02733a14be8d670900b.1744098446.git.jpoimboe@kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- fprobe: remove fprobe_hlist_node when module unloading
When a fprobe target module is removed, the fprobe_hlist_node should
be removed from the fprobe's hash table to prevent reusing
accidentally if another module is loaded at the same address.
- fprobe: lock module while registering fprobe
The module containing the function to be probeed is locked using a
reference counter until the fprobe registration is complete, which
prevents use after free.
- fprobe-events: fix possible UAF on modules
Basically as same as above, but in the fprobe-events layer we also
need to get module reference counter when we find the tracepoint in
the module.
* tag 'probes-fixes-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: fprobe: Cleanup fprobe hash when module unloading
tracing: fprobe events: Fix possible UAF on modules
tracing: fprobe: Fix to lock module while registering fprobe
|
|
With CONFIG_PREFIX_SYMBOLS, objtool adds __pfx prefix symbols
to claim the compiler emitted call padding bytes. When
CONFIG_X86_KERNEL_IBT is not selected, the symbols are added to
individual object files and for Rust objects, they end up being
exported, resulting in warnings with CONFIG_GENDWARFKSYMS as the
symbols have no debugging information:
warning: gendwarfksyms: symbol_print_versions: no information for symbol __pfx_rust_helper_put_task_struct
warning: gendwarfksyms: symbol_print_versions: no information for symbol __pfx_rust_helper_task_euid
warning: gendwarfksyms: symbol_print_versions: no information for symbol __pfx_rust_helper_readq_relaxed
...
Filter out the __pfx prefix from exported symbols similarly to
the existing __cfi and __odr_asan prefixes.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Cc: stable@vger.kernel.org
Fixes: ac61506bf2d1 ("rust: Use gendwarfksyms + extended modversions for CONFIG_MODVERSIONS")
Link: https://lore.kernel.org/r/20250318231815.917621-2-samitolvanen@google.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
When validate_linkmsg() fails in do_setlink(), we jump to the errout
label and calls netdev_unlock_ops() even though we have not called
netdev_lock_ops() as reported by syzbot. [0]
Let's return an error directly in such a case.
[0]
WARNING: bad unlock balance detected!
6.14.0-syzkaller-12504-g8bc251e5d874 #0 Not tainted
syz-executor814/5834 is trying to release lock (&dev_instance_lock_key) at:
[<ffffffff89f41f56>] netdev_unlock include/linux/netdevice.h:2756 [inline]
[<ffffffff89f41f56>] netdev_unlock_ops include/net/netdev_lock.h:48 [inline]
[<ffffffff89f41f56>] do_setlink+0xc26/0x43a0 net/core/rtnetlink.c:3406
but there are no more locks to release!
other info that might help us debug this:
1 lock held by syz-executor814/5834:
#0: ffffffff900fc408 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock net/core/rtnetlink.c:80 [inline]
#0: ffffffff900fc408 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock net/core/rtnetlink.c:341 [inline]
#0: ffffffff900fc408 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0xd68/0x1fe0 net/core/rtnetlink.c:4064
stack backtrace:
CPU: 0 UID: 0 PID: 5834 Comm: syz-executor814 Not tainted 6.14.0-syzkaller-12504-g8bc251e5d874 #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_unlock_imbalance_bug+0x185/0x1a0 kernel/locking/lockdep.c:5296
__lock_release kernel/locking/lockdep.c:5535 [inline]
lock_release+0x1ed/0x3e0 kernel/locking/lockdep.c:5887
__mutex_unlock_slowpath+0xee/0x800 kernel/locking/mutex.c:907
netdev_unlock include/linux/netdevice.h:2756 [inline]
netdev_unlock_ops include/net/netdev_lock.h:48 [inline]
do_setlink+0xc26/0x43a0 net/core/rtnetlink.c:3406
rtnl_group_changelink net/core/rtnetlink.c:3783 [inline]
__rtnl_newlink net/core/rtnetlink.c:3937 [inline]
rtnl_newlink+0x1619/0x1fe0 net/core/rtnetlink.c:4065
rtnetlink_rcv_msg+0x80f/0xd70 net/core/rtnetlink.c:6955
netlink_rcv_skb+0x208/0x480 net/netlink/af_netlink.c:2534
netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
netlink_unicast+0x7f8/0x9a0 net/netlink/af_netlink.c:1339
netlink_sendmsg+0x8c3/0xcd0 net/netlink/af_netlink.c:1883
sock_sendmsg_nosec net/socket.c:712 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:727
____sys_sendmsg+0x523/0x860 net/socket.c:2566
___sys_sendmsg net/socket.c:2620 [inline]
__sys_sendmsg+0x271/0x360 net/socket.c:2652
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f8427b614a9
Code: 48 83 c4 28 c3 e8 37 17 00 00 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff9b59f3a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fff9b59f578 RCX: 00007f8427b614a9
RDX: 0000000000000000 RSI: 0000200000000300 RDI: 0000000000000004
RBP: 00007f8427bd4610 R08: 000000000000000c R09: 00007fff9b59f578
R10: 000000000000001b R11: 0000000000000246 R12: 0000000000000001
R13:
Fixes: 4c975fd70002 ("net: hold instance lock during NETDEV_REGISTER/UP")
Reported-by: syzbot+45016fe295243a7882d3@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=45016fe295243a7882d3
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250407164229.24414-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- A number of cpuset remote partition related fixes and cleanups along
with selftest updates.
- A change from this merge window made cgroup_rstat_updated_list()
called outside cgroup_rstat_lock leading to list corruptions. Fix it
by relocating the call inside the lock.
* tag 'cgroup-for-6.15-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup/cpuset: Fix race between newly created partition and dying one
cgroup: rstat: call cgroup_rstat_updated_list with cgroup_rstat_lock
selftest/cgroup: Add a remote partition transition test to test_cpuset_prs.sh
selftest/cgroup: Clean up and restructure test_cpuset_prs.sh
selftest/cgroup: Update test_cpuset_prs.sh to use | as effective CPUs and state separator
cgroup/cpuset: Remove unneeded goto in sched_partition_write() and rename it
cgroup/cpuset: Code cleanup and comment update
cgroup/cpuset: Don't allow creation of local partition over a remote one
cgroup/cpuset: Remove remote_partition_check() & make update_cpumasks_hier() handle remote partition
cgroup/cpuset: Fix error handling in remote_partition_disable()
cgroup/cpuset: Fix incorrect isolated_cpus update in update_parent_effective_cpumask()
|
|
"Normal" comments in Rust (`//`) are also formatted in Markdown, like
the documentation (`///` and `//!`), see
Documentation/rust/coding-guidelines.rst
Thus use Markdown autolinks for a couple links that were missing it.
It also helps to get proper linking in some software like kitty [1].
Suggested-by: Benno Lossin <benno.lossin@proton.me>
Link: https://github.com/Rust-for-Linux/pin-init/pull/32#discussion_r2023103712 [1]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Link: https://github.com/Rust-for-Linux/pin-init/pull/32/commits/dd230d61bf0538281072fbff4bb71efc58f3420c
Fixes: 84837cf6fa54 ("rust: pin-init: change examples to the user-space version")
Cc: stable@vger.kernel.org
[ Change case in title. Reworded commit message. - Benno ]
Signed-off-by: Benno Lossin <benno.lossin@proton.me>
Link: https://lore.kernel.org/r/20250407201755.649153-3-benno.lossin@proton.me
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Similar to what was done for `Zeroable<NonNull<T>>` in commit
df27cef15360 ("rust: init: fix `Zeroable` implementation for
`Option<NonNull<T>>` and `Option<KBox<T>>`"), the latest Rust
documentation [1] says it guarantees that `transmute::<_,
Option<T>>([0u8; size_of::<T>()])` is sound and produces
`Option::<T>::None` only in some cases. In particular, it says:
`Box<U>` (specifically, only `Box<U, Global>`) when `U: Sized`
Thus restrict the `impl` to `Sized`, and use similar wording as in that
commit too.
Link: https://doc.rust-lang.org/stable/std/option/index.html#representation [1]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Link: https://github.com/Rust-for-Linux/pin-init/pull/32/commits/a6007cf555e5946bcbfafe93a6468c329078acd8
Fixes: 9b2299af3b92 ("rust: pin-init: add `std` and `alloc` support from the user-space version")
Cc: stable@vger.kernel.org
[ Adjust mentioned commit to the one from the kernel. - Benno ]
Signed-off-by: Benno Lossin <benno.lossin@proton.me>
Link: https://lore.kernel.org/r/20250407201755.649153-2-benno.lossin@proton.me
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC cleanups from Eric Biggers:
"Finish cleaning up the CRC kconfig options by removing the remaining
unnecessary prompts and an unnecessary 'default y', removing
CONFIG_LIBCRC32C, and documenting all the CRC library options"
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crc: remove CONFIG_LIBCRC32C
lib/crc: document all the CRC library kconfig options
lib/crc: remove unnecessary prompt for CONFIG_CRC_ITU_T
lib/crc: remove unnecessary prompt for CONFIG_CRC_T10DIF
lib/crc: remove unnecessary prompt for CONFIG_CRC16
lib/crc: remove unnecessary prompt for CONFIG_CRC_CCITT
lib/crc: remove unnecessary prompt for CONFIG_CRC32 and drop 'default y'
|
|
A recent optimization change in LLVM [1] aims to transform certain loop
idioms into calls to strlen() or wcslen(). This change transforms the
first while loop in UniStrcat() into a call to wcslen(), breaking the
build when UniStrcat() gets inlined into alloc_path_with_tree_prefix():
ld.lld: error: undefined symbol: wcslen
>>> referenced by nls_ucs2_utils.h:54 (fs/smb/client/../../nls/nls_ucs2_utils.h:54)
>>> vmlinux.o:(alloc_path_with_tree_prefix)
>>> referenced by nls_ucs2_utils.h:54 (fs/smb/client/../../nls/nls_ucs2_utils.h:54)
>>> vmlinux.o:(alloc_path_with_tree_prefix)
Disable this optimization with '-fno-builtin-wcslen', which prevents the
compiler from assuming that wcslen() is available in the kernel's C
library.
[ More to the point - it's not that we couldn't implement wcslen(), it's
that this isn't an optimization at all in the context of the kernel.
Replacing a simple inlined loop with a function call to the same loop
is just stupid and pointless if you don't have long strings and fancy
libraries with vectorization support etc.
For the regular 'strlen()' cases, we want the compiler to do this in
order to handle the trivial case of constant strings. And we do have
optimized versions of 'strlen()' on some architectures. But for
wcslen? Just no. - Linus ]
Cc: stable@vger.kernel.org
Link: https://github.com/llvm/llvm-project/commit/9694844d7e36fd5e01011ab56b64f27b867aa72d [1]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Perf can hang while freeing a sigtrap event if a related deferred
signal hadn't managed to be sent before the file got closed:
perf_event_overflow()
task_work_add(perf_pending_task)
fput()
task_work_add(____fput())
task_work_run()
____fput()
perf_release()
perf_event_release_kernel()
_free_event()
perf_pending_task_sync()
task_work_cancel() -> FAILED
rcuwait_wait_event()
Once task_work_run() is running, the list of pending callbacks is
removed from the task_struct and from this point on task_work_cancel()
can't remove any pending and not yet started work items, hence the
task_work_cancel() failure and the hang on rcuwait_wait_event().
Task work could be changed to remove one work at a time, so a work
running on the current task can always cancel a pending one, however
the wait / wake design is still subject to inverted dependencies when
remote targets are involved, as pictured by Oleg:
T1 T2
fd = perf_event_open(pid => T2->pid); fd = perf_event_open(pid => T1->pid);
close(fd) close(fd)
<IRQ> <IRQ>
perf_event_overflow() perf_event_overflow()
task_work_add(perf_pending_task) task_work_add(perf_pending_task)
</IRQ> </IRQ>
fput() fput()
task_work_add(____fput()) task_work_add(____fput())
task_work_run() task_work_run()
____fput() ____fput()
perf_release() perf_release()
perf_event_release_kernel() perf_event_release_kernel()
_free_event() _free_event()
perf_pending_task_sync() perf_pending_task_sync()
rcuwait_wait_event() rcuwait_wait_event()
Therefore the only option left is to acquire the event reference count
upon queueing the perf task work and release it from the task work, just
like it was done before 3a5465418f5f ("perf: Fix event leak upon exec and file release")
but without the leaks it fixed.
Some adjustments are necessary to make it work:
* A child event might dereference its parent upon freeing. Care must be
taken to release the parent last.
* Some places assuming the event doesn't have any reference held and
therefore can be freed right away must instead put the reference and
let the reference counting to its job.
Reported-by: "Yi Lai" <yi1.lai@linux.intel.com>
Closes: https://lore.kernel.org/all/Zx9Losv4YcJowaP%2F@ly-workstation/
Reported-by: syzbot+3c4321e10eea460eb606@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/673adf75.050a0220.87769.0024.GAE@google.com/
Fixes: 3a5465418f5f ("perf: Fix event leak upon exec and file release")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250304135446.18905-1-frederic@kernel.org
|
|
SCX_OPS_HAS_CGROUP_WEIGHT was only used to suppress the missing cgroup
weight support warnings. Now that the warnings are removed, the flag doesn't
do anything. Mark it for deprecation and remove its usage from scx_flatcg.
v2: Actually include the scx_flatcg update.
Signed-off-by: Tejun Heo <tj@kernel.org>
Suggested-and-reviewed-by: Andrea Righi <arighi@nvidia.com>
|
|
sched_ext generates warnings when cpu.weight / cpu.idle are set to
non-default values if the BPF scheduler doesn't implement weight support.
These warnings don't provide much value while adding constant annoyance. A
BPF scheduler may not implement any particular behavior and there's nothing
particularly special about missing cgroup weight support. Drop the warnings.
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Replace kzalloc with kvzalloc for the exit_dump buffer allocation, which
can require large contiguous memory depending on the implementation.
This change prevents allocation failures by allowing the system to fall
back to vmalloc when contiguous memory allocation fails.
Since this buffer is only used for debugging purposes, physical memory
contiguity is not required, making vmalloc a suitable alternative.
Cc: stable@vger.kernel.org
Fixes: 07814a9439a3b0 ("sched_ext: Print debug dump after an error exit")
Suggested-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Commit d072acda4862 ("rust: use custom FFI integer types") did not
update rust-analyzer to include the new crate.
To enable rust-analyzer support for these custom ffi types, add the
`ffi` crate as a dependency to the `bindings`, `uapi` and `kernel`
crates, which all directly depend on it.
Fixes: d072acda4862 ("rust: use custom FFI integer types")
Signed-off-by: Lukas Fischer <kernel@o1oo11oo.de>
Reviewed-by: Tamir Duberstein <tamird@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250404125150.85783-2-kernel@o1oo11oo.de
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Some operations require checking, or ignoring, specific bits in an address
value. For example, this can be comparing address values to identify unique
structures.
Currently, the full address value is compared when filtering for duplicates.
This results in over counting and creation of extra records. This gives the
impression that more unique events occurred than did in reality.
Mask the address for physical rows on MI300.
[ bp: Simplify. ]
Fixes: 6f15e617cc99 ("RAS: Introduce a FRU memory poison manager")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
|
|
landlock_put_hierarchy() can be called when an error occurs in
landlock_merge_ruleset() due to insufficient memory. In this case, the
domain's audit details might not have been allocated yet, which would
cause landlock_free_hierarchy_details() to print a warning (but still
safely handle this case).
We could keep the WARN_ON_ONCE(!hierarchy) but it's not worth it for
this kind of function, so let's remove it entirely.
Cc: Paul Moore <paul@paul-moore.com>
Reported-by: syzbot+8bca99e91de7e060e4ea@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20250331104709.897062-1-mic@digikod.net
Reviewed-by: Günther Noack <gnoack@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Move helpers outside of CONFIG_OF, so basic allocation also works
without it.
Fixes: ed9c594d495d ("drm/panel: Add new helpers for refcounted panel allocatons")
Fixes: dcba396f6907 ("drm/panel: Add refcount support")
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/nyrjnvctqnk6f3x5q7rlmy5nb7iopoti56pgh43zqknici5ms4@cibpldh7epra
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|