Age | Commit message (Collapse) | Author |
|
Add intel_bw_pmdemand_needs_update() helper to avoid looking at struct
intel_bw_state internals outside of intel_bw.c.
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://lore.kernel.org/r/163fda39da2e1cf0f0c4fcb9c71103c98863179e.1750847509.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
With all the code touching struct intel_dbuf_state moved inside
skl_watermark.c, we move the struct definition there too, and make the
type opaque. This nicely reduces includes from skl_watermark.h.
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://lore.kernel.org/r/83ae5f022a1d6d83c031e5c079b04dc739102565.1750847509.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Add intel_dbuf_num_enabled_slices() and intel_dbuf_num_active_pipes()
helpers to avoid looking at struct intel_dbuf_state internals outside of
skl_watermark.c.
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://lore.kernel.org/r/7d555e7b4e93632b732b8b5a3cd4076baf781bee.1750847509.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Add intel_dbuf_pmdemand_needs_update() helper to avoid looking at struct
intel_dbuf_state internals outside of skl_watermark.c.
With this, we can also move to_intel_dbuf_state(),
intel_atomic_get_old_dbuf_state(), and intel_atomic_get_new_dbuf_state()
inside skl_watermark.c.
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://lore.kernel.org/r/b493f259d0d3db047151fee18d7e801ad469fa88.1750847509.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
DISPLAY_PLANE_FLIP_PENDING() has been unused since commit fd3a40242e87
("drm/i915: Rip out legacy page_flip completion/irq handling"). Remove.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250625132140.1564473-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
While doing voltage swing for type-c phy
for DP 1.62 and HDMI write the
LOADGEN_SHARING_PMD_DISABLE bit to 1.
-v2: Update commit.
Add bspec[Suraj]
-v3: Move w/a before DKL_TX_PMD_LANE_SUS.
Use DKL_TX_DPCNTL2[Ville]
-v4: Use intel_encoder_is_dp and
intel_encoder_is_hdmi. [Suraj]
Bspec: 55359
Signed-off-by: Nemesa Garg <nemesa.garg@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://lore.kernel.org/r/20250625074911.194085-1-nemesa.garg@intel.com
|
|
Allocate and register drm_panel to allow the panel_follower framework to
detect the eDP panel and pass drm_connector::kdev device to drm_panel
allocation for matching.
Call drm_panel_prepare/unprepare in ddi_enable for eDP to allow the
followers to get notified of the panel power state changes.
Note: This is for eDP with DDI platforms only.
v2: remove backlight setup from panel_register (Jani)
v3: Updated the commit message (Jani)
Signed-off-by: Arun R Murthy <arun.r.murthy@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://lore.kernel.org/r/20250624-edp_panel-v3-1-e8197b6d9fde@intel.com
|
|
The local variable dev points to drm->dev already, use dev directly.
Link: https://lore.kernel.org/r/20250409103344.3661603-1-sakari.ailus@linux.intel.com
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
When a user closes an exec queue or interrupts an app with Ctrl-C,
this does not warrant wedging the device in mode 2.
Avoid this by skipping the wedge check for killed exec queues in
the TDR and LR exec queue cleanup worker.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250624174103.2707941-1-matthew.brost@intel.com
|
|
This reverts commit 3972872e459d812ab5e481a231a6066cf4f4d0f4.
There are several things wrong with the way this WA was implemented:
- The KLV is only supported on GuC 70.47.0 or newer, so we shouldn't
apply it unconditionally.
- The KLV requires 2 DWs of data, which are not currently provided.
The GuC currently ignores any unknown KLVs, so on versions older that
70.47.0 nothing happens. However, starting on 70.47.0 the GuC attempts
to parse the KLV and fails due to the missing data, causing a GuC load
abort.
Given that 70.47.0 is the first GuC version approved for public release
for PTL, let's revert this patch so it doesn't cause the GuC load to
fail with that blob. We can then re-apply it properly fixed after the
GuC definition is merged, which will also have the added benefit of
running the KLV addition through CI with the right GuC version.
Fixes: 3972872e459d ("drm/xe/ptl: Apply Wa_16026007364")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: sanirban <sk.anirban@intel.com>
Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250625001202.1616606-2-daniele.ceraolospurio@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Commit 37d078e51b4c ("drm/xe/uapi: Split xe_sync types from flags") renamed some DRM_XE_SYNC_*
defines but later commits kept using the old names. Correct them with the new definition.
v2: correct fixes tag and update commit message to explain why (Lucas)
Fixes: 9329f0667215 ("drm/xe/uapi: Use LR abbrev for long-running vms")
Fixes: 4b437893a826 ("drm/xe/uapi: More uAPI documentation additions and cosmetic updates")
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Zongyao Bai <zongyao.bai@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://lore.kernel.org/r/20250608230133.1250849-1-shuicheng.lin@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
The igt_vma_pin1() function has a rather high stack usage, which gets
in the way of reducing the default warning limit:
In file included from drivers/gpu/drm/i915/i915_vma.c:2285:
drivers/gpu/drm/i915/selftests/i915_vma.c:257:12: error: stack frame size (1288) exceeds limit (1280) in 'igt_vma_pin1' [-Werror,-Wframe-larger-than]
There are two things going on here:
- The on-stack modes[] array is really large itself and gets constructed
for every call, using around 1000 bytes itself depending on the configuration.
- The call to i915_vma_pin() gets inlined and adds another 200 bytes for
the i915_gem_ww_ctx structure since commit 7d1c2618eac5 ("drm/i915: Take
reservation lock around i915_vma_pin.")
The second one is easy enough to change, by moving the function into the
appropriate .c file. Since it is already large enough to not always be
inlined, this seems like a good idea regardless, reducing both the code size
and the internal stack usage of each of its 67 callers.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250620113644.3844552-1-arnd@kernel.org
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
An earlier patch fixed a build failure with clang, but I still see the
same problem with some configurations using gcc:
drivers/gpu/drm/i915/i915_pmu.c: In function 'config_mask':
include/linux/compiler_types.h:568:38: error: call to '__compiletime_assert_462' declared with attribute error: BUILD_BUG_ON failed: bit > BITS_PER_TYPE(typeof_member(struct i915_pmu, enable)) - 1
drivers/gpu/drm/i915/i915_pmu.c:116:3: note: in expansion of macro 'BUILD_BUG_ON'
116 | BUILD_BUG_ON(bit >
As I understand it, the problem is that the function is not always fully
inlined, but the __builtin_constant_p() can still evaluate the argument
as being constant.
Marking it as __always_inline so far works for me in all configurations.
Fixes: 686d773186bf ("drm/i915/pmu: Fix build error with GCOV and AutoFDO enabled")
Fixes: a644fde77ff7 ("drm/i915/pmu: Change bitmask of enabled events to u32")
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250620111824.3395007-1-arnd@kernel.org
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
When KMSAN is enabled, this function causes has a rather excessive stack usage:
drivers/gpu/drm/i915/display/skl_watermark.c:2977:1: error: stack frame size (1432) exceeds limit (1408) in 'skl_compute_wm' [-Werror,-Wframe-larger-than]
This is apparently all caused by the varargs calls to drm_dbg_kms(). Inlining
this into skl_compute_wm() means that any function called by skl_compute_wm()
has its own stack on top of that.
Move the worst bit into a separate function marked as noinline_for_stack to
limit that to the one code path that actually needs it.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250620113748.3869160-1-arnd@kernel.org
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
During driver probe we might be briefly using CT safe mode, which
is based on a delayed work, but usually we are able to stop this
once we have IRQ fully operational. However, if we abort the probe
quite early then during unwind we might try to destroy the workqueue
while there is still a pending delayed work that attempts to restart
itself which triggers a WARN.
This was recently observed during unsuccessful VF initialization:
[ ] xe 0000:00:02.1: probe with driver xe failed with error -62
[ ] ------------[ cut here ]------------
[ ] workqueue: cannot queue safe_mode_worker_func [xe] on wq xe-g2h-wq
[ ] WARNING: CPU: 9 PID: 0 at kernel/workqueue.c:2257 __queue_work+0x287/0x710
[ ] RIP: 0010:__queue_work+0x287/0x710
[ ] Call Trace:
[ ] delayed_work_timer_fn+0x19/0x30
[ ] call_timer_fn+0xa1/0x2a0
Exit the CT safe mode on unwind to avoid that warning.
Fixes: 09b286950f29 ("drm/xe/guc: Allow CTB G2H processing without G2H IRQ")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250612220937.857-3-michal.wajdeczko@intel.com
|
|
While we are indirectly draining our dedicated workqueue ggtt->wq
that we use to complete asynchronous removal of some GGTT nodes,
this happends as part of the managed-drm unwinding (ggtt_fini_early),
which could be later then manage-device unwinding, where we could
already unmap our MMIO/GMS mapping (mmio_fini).
This was recently observed during unsuccessful VF initialization:
[ ] xe 0000:00:02.1: probe with driver xe failed with error -62
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747340 __xe_bo_unpin_map_no_vm (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747540 __xe_bo_unpin_map_no_vm (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747240 __xe_bo_unpin_map_no_vm (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747040 tiles_fini (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e746840 mmio_fini (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747f40 xe_bo_pinned_fini (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e746b40 devm_drm_dev_init_release (16 bytes)
[ ] xe 0000:00:02.1: [drm:drm_managed_release] drmres release begin
[ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef81640 __fini_relay (8 bytes)
[ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80d40 guc_ct_fini (8 bytes)
[ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80040 __drmm_mutex_release (8 bytes)
[ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80140 ggtt_fini_early (8 bytes)
and this was leading to:
[ ] BUG: unable to handle page fault for address: ffffc900058162a0
[ ] #PF: supervisor write access in kernel mode
[ ] #PF: error_code(0x0002) - not-present page
[ ] Oops: Oops: 0002 [#1] SMP NOPTI
[ ] Tainted: [W]=WARN
[ ] Workqueue: xe-ggtt-wq ggtt_node_remove_work_func [xe]
[ ] RIP: 0010:xe_ggtt_set_pte+0x6d/0x350 [xe]
[ ] Call Trace:
[ ] <TASK>
[ ] xe_ggtt_clear+0xb0/0x270 [xe]
[ ] ggtt_node_remove+0xbb/0x120 [xe]
[ ] ggtt_node_remove_work_func+0x30/0x50 [xe]
[ ] process_one_work+0x22b/0x6f0
[ ] worker_thread+0x1e8/0x3d
Add managed-device action that will explicitly drain the workqueue
with all pending node removals prior to releasing MMIO/GSM mapping.
Fixes: 919bb54e989c ("drm/xe: Fix missing runtime outer protection for ggtt_remove_node")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250612220937.857-2-michal.wajdeczko@intel.com
|
|
Only need the flush for DPT host updates here. Normal GGTT updates don't
need special flush.
Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250606104546.1996818-4-matthew.auld@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Flushing l2 is only needed after all data has been written.
Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340")
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://lore.kernel.org/r/20250606104546.1996818-3-matthew.auld@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Limit GT max frequency to 2600MHz and wait for frequency to reduce
before proceeding with a transient flush. This is really only needed for
the transient flush: if L2 flush is needed due to 16023588340 then
there's no need to do this additional wait since we are already using
the bigger hammer.
v2: Use generic names, ensure user set max frequency requests wait
for flush to complete (Rodrigo)
v3:
- User requests wait via wait_var_event_timeout (Lucas)
- Close races on flush + user requests (Lucas)
- Fix xe_guc_pc_remove_flush_freq_limit() being called on last gt
rather than root gt (Lucas)
v4:
- Only apply the freq reducing part if a TDF is needed: L2 flush trumps
the need for waiting a lower frequency
Fixes: aaa08078e725 ("drm/xe/bmg: Apply Wa_22019338487")
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://lore.kernel.org/r/20250618-wa-22019338487-v5-4-b888388477f2@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
xe_device_td_flush() has 2 possible implementations: an entire L2 flush
or a transient flush, depending on WA 16023588340. Make this clear by
splitting the function so it calls each of them.
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20250618-wa-22019338487-v5-3-b888388477f2@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
pc_set_mert_freq_cap() currently lock()/unlock() the mutex multiple times
to stash the current frequencies. It's not a problem since
xe_guc_pc_restore_stashed_freq() is guaranteed to be called only later
in the init sequence. However, now that we have _locked() variants for
this functions, use them and avoid potential issues when called from
other places or using the same pattern.
While at it, prefer and early return for the WA check to reduce
indentation.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250618-wa-22019338487-v5-2-b888388477f2@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
There are places in which the getters/setters are called one after the
other causing a multiple lock()/unlock(). These are not currently a
problem since they are all happening from the same thread, but there's a
race possibility as calls are added outside of the early init when the
max/min and stashed values need to be correlated.
Add the _locked() variants to prepare for that.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250618-wa-22019338487-v5-1-b888388477f2@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
When EDID is retrieved via drm_edid_raw(), it doesn't guarantee to
return proper EDID bytes the caller wants: it may be either NULL (that
leads to an Oops) or with too long bytes over the fixed size raw_edid
array (that may lead to memory corruption). The latter was reported
actually when connected with a bad adapter.
Add sanity checks for drm_edid_raw() to address the above corner
cases, and return EDID_BAD_INPUT accordingly.
Fixes: 48edb2a4256e ("drm/amd/display: switch amdgpu_dm_connector to use struct drm_edid")
Link: https://bugzilla.suse.com/show_bug.cgi?id=1236415
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
revise the pcie dpm parameters
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Brightness programming may involve a conversion of a user requested
brightness against what was in a custom brightness curve. The values
might not match what a user programmed.
[How]
Add a new trace event to show specific converted brightness values.
Reviewed-by: Alex Hung <alex.hung@amd.com>
Link: https://lore.kernel.org/r/20250623171114.1156451-2-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
commit 16dc8bc27c2a ("drm/amd/display: Export full brightness range to
userspace") adjusted the brightness range to scale to larger values, but
missed updating AMDGPU_MAX_BL_LEVEL which is needed to make sure that
scaling works properly with custom brightness curves.
[How]
As the change for max brightness of 0xFFFF only applies to devices
supporting DC, use existing DC define MAX_BACKLIGHT_LEVEL.
Fixes: 16dc8bc27c2a ("drm/amd/display: Export full brightness range to userspace")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250623171114.1156451-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There is a spelling mistake in a pr_info message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
SDMA 7.0.0/1: 7836028
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
SDMA 6.0.0 version 24
SDMA 6.0.2 version 21
SDMA 6.0.3 version 25
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Instead of checking the response flag, use status mask also to check
against any unexpected failures like a device drop. Also, log error if
waiting on a psp response fails/times out.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
They can be shared across multiple products
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
It is not necessary to be ip generation specific
v2: rename the helper to is_multi_aid (Lijo)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The query_memory_partition does not need to remain
as soc specific callbacks. They can be shared across
multiple products
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This relocation allows MAX_MEM_RANGES to be shared
across multiple products
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
So they can be reused for future products
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
So it can be used for future products
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The update_partition_sched_list function does not
need to remain as a soc specific callback. It can
be reused for future products.
v2: bypass the function if xcp_mgr is not available (Likun)
v3: Let caller check the availability of xcp_mgr (Lijo)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The xcp select_sched function does not need to
remain as a soc specific callback. It can be reused
for future products
v2: bypass the function if xcp_mgr is not available (Likun)
v3: Let caller check the availability of xcp mgr (Lijo)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Transfer to use function amdgpu_ip_map_init to map ip
instance for aqua_vanjaram instead of operation on
different ASIC.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
IP instance map init function can be an common function
instead of operation on different ASIC.
V2: Create amdgpu_ip.[ch] file for ip related functions.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
restriction
This change removes the restriction when palign=64 and nbx=32.
This makes two piglit tests working. This is discussed on the
thread linked below.
Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9056
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This change implements the support of PACKET3_COND_EXEC and
PACKET3_COND_WRITE which are required to implement
ARB_indirect_parameters. ARB_indirect_parameters is
part of OpenGL 4.6.
Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34726
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Replace DRM_ERROR with drm_err function and update log
messages to drop __func__ and print return value.
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
AMDISP I2C device requires to power on ISP HW to probe the sensor
device. Instead of using the exported symbols from ISP driver to
control the power and clocks remotely,added Generic PM Domain (genpd)
support in amdgpu_isp device for its child devices (amd_isp_capture,
amd_isp_i2c_designware) to set power and clocks using PM methods.
Co-developed-by: Bin Du <bin.du@amd.com>
Signed-off-by: Bin Du <bin.du@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add support to set ISP clocks for SMU v14.0.0. ISP driver
uses amdgpu_dpm_set_soft_freq_range() API to set clocks via
SMU interface than communicating with PMFW directly.
amdgpu_dpm_set_soft_freq_range() is updated to take in any
pp_clock_type than limiting to support only PP_SCLK to allow
ISP and other driver modules to set the min/max clocks. Any
clock specific restrictions are expected to be taken care in
SOC specific SMU implementations instead of generic amdgpu_dpm
and amdgpu_smu interfaces.
Reviewed-by: Xiaojian Du <xiaojian.du@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add support to set ISP power for SMU v14.0.0. ISP driver
uses amdgpu_dpm_set_powergating_by_smu() API to
enable / disable power via SMU interface than communicating
with PMFW directly.
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The issue was reproduced on NV10 using IGT pci_unplug test.
It is expected that `amdgpu_driver_postclose_kms()` is called prior to `amdgpu_drm_release()`.
However, the bug is that `amdgpu_fpriv` was freed in `amdgpu_driver_postclose_kms()`, and then
later accessed in `amdgpu_drm_release()` via a call to `amdgpu_userq_mgr_fini()`.
As a result, KASAN detected a use-after-free condition, as shown in the log below.
The proposed fix is to move the calls to `amdgpu_eviction_fence_destroy()` and
`amdgpu_userq_mgr_fini()` into `amdgpu_driver_postclose_kms()`, so they are invoked before
`amdgpu_fpriv` is freed.
This also ensures symmetry with the initialization path in `amdgpu_driver_open_kms()`,
where the following components are initialized:
- `amdgpu_userq_mgr_init()`
- `amdgpu_eviction_fence_init()`
- `amdgpu_ctx_mgr_init()`
Correspondingly, in `amdgpu_driver_postclose_kms()` we should clean up using:
- `amdgpu_userq_mgr_fini()`
- `amdgpu_eviction_fence_destroy()`
- `amdgpu_ctx_mgr_fini()`
This change eliminates the use-after-free and improves consistency in resource management between open and close paths.
[ +0.094367] ==================================================================
[ +0.000026] BUG: KASAN: slab-use-after-free in amdgpu_userq_mgr_fini+0x70c/0x730 [amdgpu]
[ +0.000866] Write of size 8 at addr ffff88811c068c60 by task amd_pci_unplug/1737
[ +0.000026] CPU: 3 UID: 0 PID: 1737 Comm: amd_pci_unplug Not tainted 6.14.0+ #2
[ +0.000008] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020
[ +0.000004] Call Trace:
[ +0.000004] <TASK>
[ +0.000003] dump_stack_lvl+0x76/0xa0
[ +0.000010] print_report+0xce/0x600
[ +0.000009] ? amdgpu_userq_mgr_fini+0x70c/0x730 [amdgpu]
[ +0.000790] ? srso_return_thunk+0x5/0x5f
[ +0.000007] ? kasan_complete_mode_report_info+0x76/0x200
[ +0.000008] ? amdgpu_userq_mgr_fini+0x70c/0x730 [amdgpu]
[ +0.000684] kasan_report+0xbe/0x110
[ +0.000007] ? amdgpu_userq_mgr_fini+0x70c/0x730 [amdgpu]
[ +0.000601] __asan_report_store8_noabort+0x17/0x30
[ +0.000007] amdgpu_userq_mgr_fini+0x70c/0x730 [amdgpu]
[ +0.000801] ? __pfx_amdgpu_userq_mgr_fini+0x10/0x10 [amdgpu]
[ +0.000819] ? srso_return_thunk+0x5/0x5f
[ +0.000008] amdgpu_drm_release+0xa3/0xe0 [amdgpu]
[ +0.000604] __fput+0x354/0xa90
[ +0.000010] __fput_sync+0x59/0x80
[ +0.000005] __x64_sys_close+0x7d/0xe0
[ +0.000006] x64_sys_call+0x2505/0x26f0
[ +0.000006] do_syscall_64+0x7c/0x170
[ +0.000004] ? kasan_record_aux_stack+0xae/0xd0
[ +0.000005] ? srso_return_thunk+0x5/0x5f
[ +0.000004] ? kmem_cache_free+0x398/0x580
[ +0.000006] ? __fput+0x543/0xa90
[ +0.000006] ? srso_return_thunk+0x5/0x5f
[ +0.000004] ? __fput+0x543/0xa90
[ +0.000004] ? __kasan_check_read+0x11/0x20
[ +0.000007] ? srso_return_thunk+0x5/0x5f
[ +0.000004] ? __kasan_check_read+0x11/0x20
[ +0.000003] ? srso_return_thunk+0x5/0x5f
[ +0.000004] ? fpregs_assert_state_consistent+0x21/0xb0
[ +0.000006] ? srso_return_thunk+0x5/0x5f
[ +0.000004] ? syscall_exit_to_user_mode+0x4e/0x240
[ +0.000005] ? srso_return_thunk+0x5/0x5f
[ +0.000004] ? do_syscall_64+0x88/0x170
[ +0.000003] ? srso_return_thunk+0x5/0x5f
[ +0.000004] ? do_syscall_64+0x88/0x170
[ +0.000004] ? srso_return_thunk+0x5/0x5f
[ +0.000004] ? irqentry_exit+0x43/0x50
[ +0.000004] ? srso_return_thunk+0x5/0x5f
[ +0.000004] ? exc_page_fault+0x7c/0x110
[ +0.000006] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000005] RIP: 0033:0x7ffff7b14f67
[ +0.000005] Code: ff e8 0d 16 02 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 73 ba f7 ff
[ +0.000004] RSP: 002b:00007fffffffe358 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[ +0.000006] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffff7b14f67
[ +0.000003] RDX: 0000000000000000 RSI: 00007ffff7f5755a RDI: 0000000000000003
[ +0.000003] RBP: 00007fffffffe380 R08: 0000555555568170 R09: 0000000000000000
[ +0.000003] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffffffe5c8
[ +0.000003] R13: 00005555555552a9 R14: 0000555555557d48 R15: 00007ffff7ffd040
[ +0.000007] </TASK>
[ +0.000286] Allocated by task 425 on cpu 11 at 29.751192s:
[ +0.000013] kasan_save_stack+0x28/0x60
[ +0.000008] kasan_save_track+0x18/0x70
[ +0.000006] kasan_save_alloc_info+0x38/0x60
[ +0.000006] __kasan_kmalloc+0xc1/0xd0
[ +0.000005] __kmalloc_cache_noprof+0x1bd/0x430
[ +0.000006] amdgpu_driver_open_kms+0x172/0x760 [amdgpu]
[ +0.000521] drm_file_alloc+0x569/0x9a0
[ +0.000008] drm_client_init+0x1b7/0x410
[ +0.000007] drm_fbdev_client_setup+0x174/0x470
[ +0.000007] drm_client_setup+0x8a/0xf0
[ +0.000006] amdgpu_pci_probe+0x50b/0x10d0 [amdgpu]
[ +0.000482] local_pci_probe+0xe7/0x1b0
[ +0.000008] pci_device_probe+0x5bf/0x890
[ +0.000005] really_probe+0x1fd/0x950
[ +0.000007] __driver_probe_device+0x307/0x410
[ +0.000005] driver_probe_device+0x4e/0x150
[ +0.000006] __driver_attach+0x223/0x510
[ +0.000005] bus_for_each_dev+0x102/0x1a0
[ +0.000006] driver_attach+0x3d/0x60
[ +0.000005] bus_add_driver+0x309/0x650
[ +0.000005] driver_register+0x13d/0x490
[ +0.000006] __pci_register_driver+0x1ee/0x2b0
[ +0.000006] xfrm_ealg_get_byidx+0x43/0x50 [xfrm_algo]
[ +0.000008] do_one_initcall+0x9c/0x3e0
[ +0.000007] do_init_module+0x29e/0x7f0
[ +0.000006] load_module+0x5c75/0x7c80
[ +0.000006] init_module_from_file+0x106/0x180
[ +0.000007] idempotent_init_module+0x377/0x740
[ +0.000006] __x64_sys_finit_module+0xd7/0x180
[ +0.000006] x64_sys_call+0x1f0b/0x26f0
[ +0.000006] do_syscall_64+0x7c/0x170
[ +0.000005] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000013] Freed by task 1737 on cpu 9 at 76.455063s:
[ +0.000010] kasan_save_stack+0x28/0x60
[ +0.000006] kasan_save_track+0x18/0x70
[ +0.000005] kasan_save_free_info+0x3b/0x60
[ +0.000006] __kasan_slab_free+0x54/0x80
[ +0.000005] kfree+0x127/0x470
[ +0.000006] amdgpu_driver_postclose_kms+0x455/0x760 [amdgpu]
[ +0.000485] drm_file_free.part.0+0x5b1/0xba0
[ +0.000007] drm_file_free+0x13/0x30
[ +0.000006] drm_client_release+0x1c4/0x2b0
[ +0.000006] drm_fbdev_ttm_fb_destroy+0xd2/0x120 [drm_ttm_helper]
[ +0.000007] put_fb_info+0x97/0xe0
[ +0.000006] unregister_framebuffer+0x197/0x380
[ +0.000005] drm_fb_helper_unregister_info+0x94/0x100
[ +0.000005] drm_fbdev_client_unregister+0x3c/0x80
[ +0.000007] drm_client_dev_unregister+0x144/0x330
[ +0.000006] drm_dev_unregister+0x49/0x1b0
[ +0.000006] drm_dev_unplug+0x4c/0xd0
[ +0.000006] amdgpu_pci_remove+0x58/0x130 [amdgpu]
[ +0.000482] pci_device_remove+0xae/0x1e0
[ +0.000006] device_remove+0xc7/0x180
[ +0.000006] device_release_driver_internal+0x3d4/0x5a0
[ +0.000007] device_release_driver+0x12/0x20
[ +0.000006] pci_stop_bus_device+0x104/0x150
[ +0.000006] pci_stop_and_remove_bus_device_locked+0x1b/0x40
[ +0.000005] remove_store+0xd7/0xf0
[ +0.000007] dev_attr_store+0x3f/0x80
[ +0.000006] sysfs_kf_write+0x125/0x1d0
[ +0.000005] kernfs_fop_write_iter+0x2ea/0x490
[ +0.000007] vfs_write+0x90d/0xe70
[ +0.000006] ksys_write+0x119/0x220
[ +0.000006] __x64_sys_write+0x72/0xc0
[ +0.000006] x64_sys_call+0x18ab/0x26f0
[ +0.000005] do_syscall_64+0x7c/0x170
[ +0.000005] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000013] The buggy address belongs to the object at ffff88811c068000
which belongs to the cache kmalloc-rnd-01-4k of size 4096
[ +0.000016] The buggy address is located 3168 bytes inside of
freed 4096-byte region [ffff88811c068000, ffff88811c069000)
[ +0.000022] The buggy address belongs to the physical page:
[ +0.000010] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff88811c06e000 pfn:0x11c068
[ +0.000006] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[ +0.000006] flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
[ +0.000007] page_type: f5(slab)
[ +0.000007] raw: 0017ffffc0000040 ffff88810004c140 dead000000000122 0000000000000000
[ +0.000005] raw: ffff88811c06e000 0000000080040002 00000000f5000000 0000000000000000
[ +0.000006] head: 0017ffffc0000040 ffff88810004c140 dead000000000122 0000000000000000
[ +0.000005] head: ffff88811c06e000 0000000080040002 00000000f5000000 0000000000000000
[ +0.000006] head: 0017ffffc0000003 ffffea0004701a01 ffffffffffffffff 0000000000000000
[ +0.000005] head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
[ +0.000004] page dumped because: kasan: bad access detected
[ +0.000011] Memory state around the buggy address:
[ +0.000009] ffff88811c068b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ +0.000012] ffff88811c068b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ +0.000011] >ffff88811c068c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ +0.000011] ^
[ +0.000010] ffff88811c068c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ +0.000011] ffff88811c068d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ +0.000011] ==================================================================
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Cc: Jesse Zhang <Jesse.Zhang@amd.com>
Cc: Arvind Yadav <arvind.yadav@amd.com>
v2: drop amdgpu_drm_release() and assign drm_release()
as the callback directly.(Alex)
Fixes: adba0929736a ("drm/amdgpu: Fix Illegal opcode in command stream Error")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The `complete` callback should be described in kernel doc.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/linux-next/20250619205931.41cf9332@canb.auug.org.au/
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250620041420.3585005-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Just use kmalloc for the fences in the rare case we need
an independent fence.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
commit 017fbb6690c2 ("drm/amdgpu/discovery: check ip_discovery fw file
available") added support for reading an amdgpu IP discovery bin file
for some specific products. If it's not found then it will fallback to
hardcoded values. However if it's not found there is also a lot of noise
about missing files and errors.
Adjust the error handling to decrease most messages to DEBUG and to show
users less about missing files.
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reported-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4312
Tested-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Fixes: 017fbb6690c2 ("drm/amdgpu/discovery: check ip_discovery fw file available")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250617183052.1692059-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|