path: root/drivers/gpu/drm/msm/msm_gpu.h
2023-02-22  Merge tag 'drm-msm-fixes-2023-01-16' into msm-fixes  (Rob Clark)

Back-merge of the previous cycle's msm-fixes for the kexec fix (to avoid a merge conflict).

Signed-off-by: Rob Clark <robdclark@chromium.org>

2023-01-16  drm/msm/gpu: Bypass PM QoS constraint for idle clamp  (Rob Clark)

Change idle freq clamping back to the direct method, bypassing PM QoS requests. The problem with using PM QoS requests is that they (indirectly) call the governor's ->get_target_freq(), which goes through a get_dev_status() cycle. When the GPU becomes active again and we remove the idle-clamp request, we go through another get_dev_status() cycle covering the period that the GPU was idle, which triggers the governor to lower the target freq excessively.

This partially reverts commit 7c0ffcd40b16 ("drm/msm/gpu: Respect PM QoS constraints"), but preserves the use of the boost QoS request, so that it will continue to play nicely with other QoS requests such as a cooling device. This also mostly undoes commit 78f815c1cf8f ("drm/msm: return the average load over the polling period").

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/517785/
Link: https://lore.kernel.org/r/20230110231447.1939101-3-robdclark@gmail.com
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>

2023-01-16  drm/msm/gpu: Add devfreq tuning debugfs  (Rob Clark)

Make the handful of tuning knobs visible via debugfs.

v2: select DEVFREQ_GOV_SIMPLE_ONDEMAND, because for some reason struct devfreq_simple_ondemand_data depends on it

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/517784/
Link: https://lore.kernel.org/r/20230110231447.1939101-2-robdclark@gmail.com
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>

2023-01-11  drm/msm/gpu: Fix potential double-free  (Rob Clark)

If userspace was calling the MSM_SET_PARAM ioctl on multiple threads to set the COMM or CMDLINE param, it could trigger a race causing the previous value to be kfree'd multiple times. Fix this by serializing on the gpu lock.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Fixes: d4726d770068 ("drm/msm: Add a way to override processes comm/cmdline")
Patchwork: https://patchwork.freedesktop.org/patch/517778/
Link: https://lore.kernel.org/r/20230110212903.1925878-1-robdclark@gmail.com

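A minimal sketch of the serialization described above (lock and field names assumed for illustration, not the literal patch):

    mutex_lock(&gpu->lock);    /* serialize concurrent SET_PARAM calls */
    kfree(ctx->comm);          /* without the lock, two racing threads could
                                * both observe and free the same old value */
    ctx->comm = kstrdup(new_comm, GFP_KERNEL);
    mutex_unlock(&gpu->lock);
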
2022-11-30  Merge tag 'drm-msm-next-2022-11-28' of https://gitlab.freedesktop.org/drm/msm into drm-next  (Dave Airlie)

msm-next for v6.2 (the gpu/gem bits):

- Remove exclusive-fence hack that caused over-synchronization
- Fix speed-bin detection vs. probe-defer
- Enable clamp_to_idle on 7c3
- Improved hangcheck detection

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvT1h_S4d=YRgphgR8i7aMaxQaNW8mru7QaoUo9uiUk2A@mail.gmail.com

2022-11-17  drm/msm: Hangcheck progress detection  (Rob Clark)

If the hangcheck timer expires, check whether the fw's position in the cmdstream has advanced (changed) since the last timer expiration, and allow up to three additional "extensions" to its allotted time. The intention is to continue to catch "shader stuck in a loop" type hangs quickly, but allow more time for things that are actually making forward progress.

Because we need to sample the CP state twice to detect a lack of progress, this also cuts the timer's duration in half.

v2: Fix typo (REG_A6XX_CP_CSQ_IB2_STAT), add comment
v3: Only halve hangcheck timer duration for generations which support progress detection (hdanton); removed unused a5xx progress (without knowing how to adjust for data buffered in ROQ it is too likely to report a false negative)
v4: Comment updates to better describe the total hangcheck duration when progress detection is applied

Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Tested-by: Chia-I Wu <olvaffe@gmail.com> # dEQP-GLES2.functional.flush_finish.wait
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/511584/
Link: https://lore.kernel.org/r/20221114193049.1533391-3-robdclark@gmail.com

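Conceptually, the check looks something like the sketch below (field names and the read_cp_position() helper are assumptions; the real code samples CP registers such as CP_CSQ_IB2_STAT):

    /* Sketch: returns true if the hangcheck timer should be re-armed
     * instead of declaring a hang. */
    static bool ring_made_progress(struct msm_ringbuffer *ring)
    {
            u32 pos = read_cp_position(ring);   /* hypothetical helper */

            if (pos != ring->last_cp_pos) {
                    /* fw advanced since the last expiration, so reset
                     * the extension budget */
                    ring->last_cp_pos = pos;
                    ring->num_extensions = 0;
                    return true;
            }

            /* no progress: allow up to three extensions */
            return ring->num_extensions++ < 3;
    }
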
2022-11-17  drm/msm/adreno: Simplify read64/write64 helpers  (Rob Clark)

The _HI reg always follows the _LO reg, so there is no need to pass the two offsets separately.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/511581/
Link: https://lore.kernel.org/r/20221114193049.1533391-2-robdclark@gmail.com

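After the change, a helper needs only the _LO offset; a sketch of the idea (register offsets in the msm driver are word-indexed, so the _HI register is at reg + 1):

    static inline u64 gpu_read64(struct msm_gpu *gpu, u32 reg)
    {
            u64 lo = gpu_read(gpu, reg);       /* _LO */
            u64 hi = gpu_read(gpu, reg + 1);   /* _HI always follows _LO */

            return lo | (hi << 32);
    }
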
2022-09-30  drm/msm/gpu: Fix crash during system suspend after unbind  (Akhil P Oommen)

In adreno_unbind, we should clear the gpu device's drvdata to avoid accessing a stale pointer during system suspend. Also, check for a NULL ptr in both the system suspend and resume callbacks.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/505075/
Link: https://lore.kernel.org/r/20220928124830.2.I5ee0ac073ccdeb81961e5ec0cce5f741a7207a71@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>

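The shape of the fix, as a sketch (callback name assumed):

    /* in adreno_unbind(): */
    dev_set_drvdata(dev, NULL);   /* so later PM callbacks see no GPU */

    /* in the system suspend (and, analogously, resume) callback: */
    static int adreno_system_suspend(struct device *dev)
    {
            struct msm_gpu *gpu = dev_to_gpu(dev);

            if (!gpu)
                    return 0;   /* already unbound, nothing to do */

            return pm_runtime_force_suspend(dev);
    }
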
2022-08-28  drm/msm/a6xx: Ensure CX collapse during gpu recovery  (Akhil P Oommen)

Because there could be transient votes from other drivers/tz/hyp which may keep the cx gdsc enabled, we should poll until the cx gdsc collapses. We can use the reset framework to poll for cx gdsc collapse from the gpucc clk driver. This feature requires support from the platform's gpucc driver.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Patchwork: https://patchwork.freedesktop.org/patch/498397/
Link: https://lore.kernel.org/r/20220819015030.v5.5.I176567525af2b9439a7e485d0ca130528666a55c@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>

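A sketch of how a consumer might use the reset framework for this (the reset line name and field are assumptions; reset_control_reset() blocks until the provider, here the gpucc driver, reports completion):

    /* at probe time: */
    gpu->cx_collapse = devm_reset_control_get_optional_exclusive(&pdev->dev,
                                                                 "cx_collapse");

    /* during recovery, after dropping our own votes: */
    reset_control_reset(gpu->cx_collapse);
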
2022-08-27  drm/msm/gem: Convert to using drm_gem_lru  (Rob Clark)

This converts over to use the shared GEM LRU/shrinker helpers. Note that it means we are no longer separately tracking purgeable or willneed buffers that are active. But the most recently pinned buffers should be at the tail of the various LRUs, and the shrinker is already prepared to encounter objects which are still active.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/496131/
Link: https://lore.kernel.org/r/20220802155152.1727594-11-robdclark@gmail.com

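The shared helpers move objects between LRUs as their state changes; for example (LRU names are assumptions):

    /* e.g. when an object is pinned, move it to the tail of the pinned
     * LRU; the shrinker scans from the head, so the most recently
     * pinned (likely active) objects are considered last: */
    drm_gem_lru_move_tail(&priv->lru.pinned, obj);
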
2022-08-27  drm/msm: Split out idr_lock  (Rob Clark)

Otherwise, if we hit reclaim pinning objects in the submit path, we'll be blocking retire_worker trying to free a submit.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/496116/
Link: https://lore.kernel.org/r/20220802155152.1727594-4-robdclark@gmail.com

2022-07-06  drm/msm/gpu: Add GEM debug label to devcore  (Rob Clark)

When trying to understand an iova fault devcore, once you figure out which buffer was accessed beyond its end, it is useful to see that buffer's debug label.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/491910/
Link: https://lore.kernel.org/r/20220629211919.563585-3-robdclark@gmail.com

2022-07-06  drm/msm: Avoid unclocked GMU register access in 6xx gpu_busy  (Douglas Anderson)

From testing on sc7180-trogdor devices, reading the GMU registers needs the GMU clocks to be enabled. Those clocks get turned on in a6xx_gmu_resume(). Confusingly enough, that function is called as a result of the runtime_pm of the GPU "struct device", not the GMU "struct device". Unfortunately the current a6xx_gpu_busy() grabs a reference to the GMU's "struct device".

The fact that we were grabbing the wrong reference was easily seen to cause crashes if we change the GPU's pm_runtime usage to not use autosuspend. It's also believed to cause some long-tail GPU crashes even with autosuspend.

We could look at changing it so that we do pm_runtime_get_if_in_use() on the GPU's "struct device", but then we run into a different problem: pm_runtime_get_if_in_use() will return 0 for the GPU's "struct device" the whole time we're in the "autosuspend delay". That is, when we drop the last reference to the GPU but are waiting a period before actually suspending, we'll think the GPU is off. One reason that's bad is that if the GPU didn't actually turn off then the cycle counter doesn't lose state, and that throws off all of our calculations.

Let's change the code to keep track of the suspend state of devfreq. msm_devfreq_suspend() is always called before we actually suspend the GPU and msm_devfreq_resume() after we resume it. This means we can use the suspended state to know whether we're powered or not.

NOTE: one might wonder when exactly our status function is called when devfreq is supposed to be disabled. The stack crawl I captured was:

    msm_devfreq_get_dev_status
    devfreq_simple_ondemand_func
    devfreq_update_target
    qos_notifier_call
    qos_max_notifier_call
    blocking_notifier_call_chain
    pm_qos_update_target
    freq_qos_apply
    apply_constraint
    __dev_pm_qos_update_request
    dev_pm_qos_update_request
    msm_devfreq_idle_work

Fixes: eadf79286a4b ("drm/msm: Check for powered down HW in the devfreq callbacks")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/489124/
Link: https://lore.kernel.org/r/20220610124639.v4.1.Ie846c5352bc307ee4248d7cab998ab3016b85d06@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>

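The essence of the fix, sketched (field names and the query_busy_time() helper are assumptions):

    /* msm_devfreq_suspend() sets df->suspended before the GPU can power
     * down; msm_devfreq_resume() clears it after power up. */
    static int msm_devfreq_get_dev_status(struct device *dev,
                                          struct devfreq_dev_status *status)
    {
            struct msm_gpu *gpu = dev_to_gpu(dev);
            struct msm_gpu_devfreq *df = &gpu->devfreq;

            if (df->suspended) {
                    /* GPU may be unclocked: don't touch GMU registers,
                     * just report no busy time */
                    status->busy_time = 0;
                    status->total_time = 0;
                    return 0;
            }

            return query_busy_time(gpu, status);   /* hypothetical helper */
    }
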
2022-07-05  drm/msm: Expose client engine utilization via fdinfo  (Rob Clark)

Similar to AMD commit 874442541133 ("drm/amdgpu: Add show_fdinfo() interface"), using the infrastructure added in previous patches, we add basic client info and GPU engine utilisation for msm.

Example output:

    # cat /proc/`pgrep glmark2`/fdinfo/6
    pos:    0
    flags:  02400002
    mnt_id: 21
    ino:    162
    drm-driver:     msm
    drm-client-id:  7
    drm-engine-gpu: 1734371319 ns
    drm-cycles-gpu: 1153645024
    drm-maxfreq-gpu: 800000000 Hz

See also: https://patchwork.freedesktop.org/patch/468505/

v2: Add dev-maxfreq-$engine and update drm-usage-stats.rst
v3: spelling and compiler warning

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Patchwork: https://patchwork.freedesktop.org/patch/488906/
Link: https://lore.kernel.org/r/20220609174213.2265938-2-robdclark@gmail.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

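A sketch of the fdinfo hook behind that output (the per-client field names are assumptions; the key format itself is specified by drm-usage-stats.rst):

    static void msm_show_fdinfo(struct drm_printer *p, struct drm_file *file)
    {
            struct msm_file_private *ctx = file->driver_priv;

            drm_printf(p, "drm-driver:\tmsm\n");
            drm_printf(p, "drm-client-id:\t%u\n", ctx->seqno);
            drm_printf(p, "drm-engine-gpu:\t%llu ns\n", ctx->elapsed_ns);
            drm_printf(p, "drm-cycles-gpu:\t%llu\n", ctx->cycles);
    }
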
2022-04-21  drm/msm: return the average load over the polling period  (Chia-I Wu)

simple_ondemand interacts poorly with clamp_to_idle. It only looks at the load since the last get_dev_status call, while it should really look at the load over polling_ms. When clamp_to_idle is true, it almost always picks the lowest frequency on active because the gpu is idle between msm_devfreq_idle/msm_devfreq_active.

This logic could potentially be moved into devfreq core.

Fixes: 7c0ffcd40b16 ("drm/msm/gpu: Respect PM QoS constraints")
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Cc: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220416003314.59211-3-olvaffe@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

2022-04-21  drm/msm: simplify gpu_busy callback  (Chia-I Wu)

Move tracking and busy time calculation to msm_devfreq_get_dev_status.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Cc: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220416003314.59211-2-olvaffe@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

2022-04-21  drm/msm/gpu: Drop duplicate fence counter  (Rob Clark)

The ring seqno counter duplicates the fence-context last_fence counter. They end up getting incremented in lock-step, on the same scheduler thread, but the split just makes things less obvious.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220411215849.297838-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

2022-04-21  drm/msm: Add a way to override processes comm/cmdline  (Rob Clark)

In the case of using the GPU via virtgpu, the host side process is really a sort of proxy, and not terribly interesting from the PoV of crash/fault logging. Add a way to override these per process so that we can see the guest process's name.

v2: Handle kmalloc failure, add comment to explain kstrdup returns NULL if passed NULL [Dan Carpenter]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220317165144.222101-4-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

2022-04-21  drm/msm: Add support for pointer params  (Rob Clark)

The 64b value field is already sufficient to hold a pointer instead of an immediate, but we also need a length field.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220317165144.222101-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

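The resulting UAPI struct would look roughly like this (layout shown for illustration; see include/uapi/drm/msm_drm.h for the authoritative definition):

    struct drm_msm_param {
            __u32 pipe;    /* in, MSM_PIPE_x */
            __u32 param;   /* in, MSM_PARAM_x */
            __u64 value;   /* out (get_param) or in (set_param); can
                            * now also carry a userspace pointer */
            __u32 len;     /* zero for int/immediate params, size of
                            * the pointed-to buffer otherwise */
            __u32 pad;
    };
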
2022-04-21  drm/msm: Remove unused field in submit  (Rob Clark)

Noticed this was unused and never set.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220225202614.225197-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

2022-03-04  drm/msm: Add SYSPROF param (v2)  (Rob Clark)

Add a SYSPROF param for system profiling tools like Mesa's pps-producer (perfetto) to control behavior related to system-wide performance counter collection. In particular, for profiling one wants to ensure that GPU context switches do not affect perfcounter state, and might want to suppress suspend (which would cause counters to lose state).

v2: Swap the order in msm_file_private_set_sysprof() [sboyd] and initialize the sysprof_active refcount to one (because the under/overflow checking in refcount_t doesn't expect a 0->1 transition), meaning that values greater than 1 mean sysprof is active.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220304005317.776110-4-robdclark@gmail.com

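So the active check reduces to a refcount read; a sketch under those assumptions (helper name assumed):

    static inline bool msm_gpu_sysprof_active(struct msm_gpu *gpu)
    {
            /* initialized to 1 because refcount_t forbids 0->1
             * transitions, so >1 means at least one profiler holds a
             * sysprof reference */
            return refcount_read(&gpu->sysprof_active) > 1;
    }
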
2022-03-04  drm/msm: Add SET_PARAM ioctl  (Rob Clark)

It was always expected to have a use for this some day, so we left a placeholder. Now we do. (And I expect another use in the not too distant future when we start allowing userspace to allocate GPU iova.)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220304005317.776110-3-robdclark@gmail.com

2022-02-20  drm/msm/gpu: Track global faults per address-space  (Rob Clark)

Other processes don't need to know about faults that they are isolated from by virtue of address space isolation. They are only interested in whether some of their state might have been corrupted. But to be safe, also track unattributed faults. This case should really never happen unless there is a kernel bug (and that would never happen, right?)

v2: Instead of adding a new param, just change the behavior of the existing param to match what userspace actually wants [anholt]

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5934
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220201161618.778455-3-robdclark@gmail.com
Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Rob Clark <robdclark@chromium.org>

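The param behavior ends up looking roughly like this sketch (field names assumed):

    case MSM_PARAM_FAULTS:
            /* faults the process is isolated from don't count, but
             * unattributed faults are visible to everyone, to be safe */
            if (ctx->aspace)
                    *value = gpu->global_faults + ctx->aspace->faults;
            else
                    *value = gpu->global_faults;
            break;
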
2022-02-20  drm/msm/gpu: Add ctx to get_param()  (Rob Clark)

Prep work for next patch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220201161618.778455-2-robdclark@gmail.com
Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Rob Clark <robdclark@chromium.org>

2022-01-25  drm/msm/gpu: Wait for idle before suspending  (Rob Clark)

System suspend uses pm_runtime_force_suspend(), which cheekily bypasses the runpm reference counts. This doesn't actually work so well when the GPU is active. So add a reasonable delay waiting for the GPU to become idle. Alternatively we could just return -EBUSY in this case, but that has the disadvantage of causing system suspend to fail.

v2: s/ret/remaining [sboyd], and switch to using active_submits count to ensure we aren't racing with submit cleanup (and devfreq idle work getting scheduled, etc)
v3: fix inverted logic

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20220108180913.814448-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

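A sketch of the wait (names assumed; wait_event_timeout() returns the remaining jiffies, or 0 on timeout):

    remaining = wait_event_timeout(gpu->retire_event,
                                   gpu->active_submits == 0,
                                   msecs_to_jiffies(1000));
    if (remaining == 0) {
            dev_err(dev, "timeout waiting for GPU to idle\n");
            return -EBUSY;
    }

    return pm_runtime_force_suspend(dev);
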
2021-11-28  drm/msm/gpu: Respect PM QoS constraints  (Rob Clark)

Re-work the boost and idle clamping to use PM QoS requests instead, so they get aggregated with other requests (such as a cooling device).

This does have the minor side-effect that the devfreq sysfs min_freq/max_freq files now reflect the boost and idle clamping, as they show (despite what they are documented to show) the aggregated min/max freq. Fixing that in devfreq does not look straightforward after considering that OPPs can be dynamically added/removed. However, writes to the sysfs files still behave as expected.

v2: Use 64b math to avoid potential 32b overflow

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211120200103.1051459-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

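With the dev_pm_qos API, a boost becomes a min-frequency request that the core aggregates with everyone else's; a sketch (field names assumed; DEV_PM_QOS_MIN_FREQUENCY requests are in kHz):

    /* once, at setup: */
    dev_pm_qos_add_request(dev, &df->boost_freq,
                           DEV_PM_QOS_MIN_FREQUENCY, 0);

    /* on an idle->active transition that deserves a boost: */
    dev_pm_qos_update_request(&df->boost_freq, boost_khz);
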
2021-11-28  drm/msm: Handle fence rollover  (Rob Clark)

Add some helpers for fence comparison, which handle rollover properly, and stop open-coding fence seqno comparisons.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20211109181117.591148-5-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

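Rollover-safe comparison follows the same pattern as the kernel's time_after(); a sketch:

    static inline bool fence_after(uint32_t a, uint32_t b)
    {
            /* signed subtraction handles wraparound correctly as long
             * as the two seqnos are less than 2^31 apart */
            return (int32_t)(a - b) > 0;
    }

    static inline bool fence_before(uint32_t a, uint32_t b)
    {
            return fence_after(b, a);
    }
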
2021-11-28  drm/msm: Remove struct_mutex usage  (Rob Clark)

The remaining struct_mutex usage is just to serialize various gpu related things (submit/retire/recover/fault/etc), so replace struct_mutex with gpu->lock.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211109181117.591148-4-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

2021-11-28  drm/msm: Drop priv->lastctx  (Rob Clark)

cur_ctx_seqno already does the same thing, but handles the edge cases where a refcnt'd context can live after lastclose. So let's not have two ways to do the same thing.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20211109181117.591148-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

2021-11-21  drm/msm: Restore error return on invalid fence  (Rob Clark)

When converting to use an idr to map userspace fence seqno values back to a dma_fence, we lost the error return when userspace passes a seqno that is larger than the last submitted fence. Restore this check.

Reported-by: Akhil P Oommen <akhilpo@codeaurora.org>
Fixes: a61acbbe9cf8 ("drm/msm: Track "seqno" fences by idr")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20211111192457.747899-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

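Using the rollover-safe helper from the "Handle fence rollover" commit above, the restored check is roughly (names assumed):

    /* reject fence ids beyond anything actually submitted */
    if (fence_after(fence_id, queue->last_fence))
            return -EINVAL;
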
2021-10-28  Merge tag 'drm-msm-next-2021-10-26' of https://gitlab.freedesktop.org/drm/msm into drm-next  (Dave Airlie)

* eDP support in DP sub-driver (for newer SoCs with native eDP output)
* dpu irq handling cleanup
* CRC support for making igt happy
* Support for NO_CONNECTOR bridges
* dsi: 14nm phy support for msm8953
* mdp5: support for msm8x53, sdm450, sdm632
* various smaller fixes and cleanups

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGsH9EwcpqGNNRJeL99NvFFjHX3SUg+nTYu0dHG5U9+QuA@mail.gmail.com

2021-10-18  drm/msm/devfreq: Restrict idle clamping to a618 for now  (Rob Clark)

Until we better understand the stability issues caused by frequent frequency changes, let's limit them to a618.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Tested-by: Caleb Connolly <caleb.connolly@linaro.org>
Link: https://lore.kernel.org/r/20211018153627.2787882-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

2021-10-15  drm/msm/devfreq: Add 1ms delay before clamping freq  (Rob Clark)

Add a short delay before clamping to the idle frequency on an active->idle transition. It takes ~0.5ms to increase the freq again on the next idle->active transition, so this helps avoid extra freq transitions on workloads that bounce between CPU and GPU.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20210927230455.1066297-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

2021-10-01  drm/msm: One sched entity per process per priority  (Rob Clark)

Some userspace apps assume that rendering against multiple contexts within the same process (from the same thread, with appropriate MakeCurrent() calls) provides sufficient synchronization without any external synchronization (ie. glFenceSync()/glWaitSync()). Since a submitqueue maps to a gl/vk context, having multiple sched entities of the same priority only works with implicit sync enabled.

To fix this, limit things to a single sched entity per priority level per process, as sketched below. An alternative would be sharing submitqueues between contexts in userspace, but tracking of per-context faults (ie. GL_EXT_robustness) is already done at the submitqueue level, so this is not an option.

Signed-off-by: Rob Clark <robdclark@chromium.org>

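A sketch of the resulting ownership (names assumed): the per-file ctx owns the entities, and every submitqueue of a given priority shares one:

    struct msm_file_private {
            /* ...other fields... */

            /* one entity per (ring, sched-priority) pair, lazily
             * created and shared by all of the process's submitqueues
             * at that priority: */
            struct drm_sched_entity *entities[NR_SCHED_PRIORITIES *
                                              MSM_GPU_MAX_RINGS];
    };
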
2021-10-01  drm/msm: A bit more docs + cleanup  (Rob Clark)

msm_file_private is more gpu related, and in the next commit it will need access to other GPU specific #defines. While we're at it, add some comments.

Signed-off-by: Rob Clark <robdclark@chromium.org>

2021-07-28  drm/msm: Utilize gpu scheduler priorities  (Rob Clark)

The drm/scheduler provides additional prioritization on top of that provided by the number of ringbuffers (each with its own priority level) supported on a given generation. Expose the additional levels of priority to userspace, and map the userspace priority back to a ring (first level of priority) and scheduler priority (additional priority levels within the ring).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-13-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

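The mapping from the single userspace priority value back to (ring, sched priority) is simple division/modulo; a sketch (the driver's actual helper is msm_gpu_convert_priority(); details here are assumed):

    static int convert_priority(struct msm_gpu *gpu, int prio,
                                unsigned *ring_nr, int *sched_prio)
    {
            if (prio / NR_SCHED_PRIORITIES >= gpu->nr_rings)
                    return -EINVAL;

            *ring_nr = prio / NR_SCHED_PRIORITIES;     /* first level */
            *sched_prio = prio % NR_SCHED_PRIORITIES;  /* within ring */
            return 0;
    }

The real helper also maps the result onto enum drm_sched_priority, whose numbering is inverted relative to the userspace convention (where 0 is highest).
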
2021-07-28  drm/msm: Conversion to drm scheduler  (Rob Clark)

For existing adrenos, there are one or more ringbuffers, depending on whether preemption is supported. When preemption is supported, each ringbuffer has its own priority. A submitqueue (which maps to a gl context or vk queue in userspace) is mapped to a specific ringbuffer at creation time, based on the submitqueue's priority.

Each ringbuffer has its own drm_gpu_scheduler. Each submitqueue maps to a drm_sched_entity. And each submit maps to a drm_sched_job.

Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/4
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-10-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

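The mapping, sketched (field names assumed; the three-argument drm_sched_job_init() matches the scheduler API of that era):

    /* ringbuffer  -> drm_gpu_scheduler  (one per ring)
     * submitqueue -> drm_sched_entity   (per gl/vk context)
     * submit      -> drm_sched_job */
    ret = drm_sched_job_init(&submit->base, queue->entity, queue);
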
2021-07-27  drm/msm: Track "seqno" fences by idr  (Rob Clark)

Previously the (non-fd) fence returned from the submit ioctl was a raw seqno, which is scoped to the ring. But from a UABI standpoint, the ioctls related to seqno fences all specify a submitqueue. We can take advantage of that to replace the seqno fences with a cyclic idr handle.

This is in preparation for moving to the drm scheduler, at which point the submit ioctl will return after queuing the submit job to the scheduler, but before the submit is written into the ring (and therefore before a ring seqno has been assigned). Which means we need to replace the dma_fence that userspace may need to wait on with a scheduler fence.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-8-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

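A sketch of the cyclic allocation (names assumed):

    /* hand userspace a small cyclic handle instead of the raw ring
     * seqno; it maps back to the dma_fence when userspace waits */
    submit->fence_id = idr_alloc_cyclic(&queue->fence_idr,
                                        submit->user_fence,
                                        1, INT_MAX, GFP_KERNEL);
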
2021-07-27  drm/msm: Docs and misc cleanup  (Rob Clark)

Fix a couple incorrect or misspelt comments, and add a submitqueue doc comment.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

2021-07-27  drm/msm: Devfreq tuning  (Rob Clark)

This adds a few things to try and make frequency scaling better match the workload:

1) Longer polling interval, to avoid whip-lashing between too-high and too-low frequencies in certain workloads, like mobile games which throttle themselves to 30fps. Previously our polling interval was short enough to let things ramp down to minimum freq in the "off" frame, but long enough to not react quickly enough when rendering started on the next frame, leading to uneven frame times. (Ie. rather than a consistent 33ms it would alternate between 16/33/48ms.)

2) Awareness of when the GPU is active vs idle. Since we know when the GPU is active vs idle, we can clamp the frequency down to the minimum while it is idle. (If it is idle for long enough, then the autosuspend delay will eventually kick in and power down the GPU.) Since devfreq has no knowledge of powered-but-idle, this takes a small bit of trickery to maintain a "fake" frequency while idle. This, combined with the longer polling period, allows devfreq to arrive at a reasonable "active" frequency, while still clamping to minimum freq when idle to reduce power draw.

3) Boost. Because simple_ondemand needs to see a certain threshold of busyness to ramp up, we could end up needing multiple polling cycles before it reacts appropriately on interactive workloads (ex. scrolling a web page after reading for some time), on top of the already lengthened polling interval. So when we see an idle to active transition after a period of idle time, we boost the frequency that we return to.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20210726144653.2180096-4-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

2021-07-27  drm/msm: Split out devfreq handling  (Rob Clark)

Before we start adding more cleverness, split it into its own file.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210726144653.2180096-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

2021-06-23  drm/msm: devcoredump iommu fault support  (Rob Clark)

Wire up support to stall the SMMU on iova fault, and collect a devcoredump snapshot for easier debugging of faults. Currently this is a6xx-only, but mostly only because so far it is the only one using adreno-smmu-priv.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Link: https://lore.kernel.org/r/20210610214431.539029-6-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

2021-06-23  drm/msm: export hangcheck_period in debugfs  (Samuel Iglesias Gonsalvez)

While keeping the previous default value for the hangcheck period, we now allow its value to be configured via debugfs.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Link: https://lore.kernel.org/r/20210607104441.184700-1-siglesias@igalia.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

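A sketch of how such a knob is typically exported (the exact entry name and debugfs location are assumptions):

    debugfs_create_u32("hangcheck_period_ms", 0600,
                       minor->debugfs_root, &priv->hangcheck_period);
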
2021-06-23  drm/msm: remove unused icc_path/ocmem_icc_path  (Jonathan Marek)

These aren't used by anything anymore.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20210608172808.11803-2-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>

2021-04-07  drm/msm: Add param for userspace to query suspend count  (Rob Clark)

Performance counters, and the ALWAYS_ON counters used for capturing GPU timestamps, lose their state across suspend/resume cycles. Userspace tooling for performance monitoring needs to be aware of this. For example, after a suspend userspace needs to recalibrate its offset between CPU and GPU time.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Link: https://lore.kernel.org/r/20210325012358.1759770-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>

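Exposed as a get_param, roughly (names assumed):

    case MSM_PARAM_SUSPENDS:
            /* lets userspace detect that counters lost state and
             * recalibrate its CPU/GPU time offset */
            *value = gpu->suspend_count;
            break;
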
2020-11-29  drm/msm: rearrange the gpu_rmw() function  (Sharat Masetty)

The register read-modify-write construct is generic enough that it can be used by other subsystems as needed. Create a more generic rmw() function and have gpu_rmw() use it.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>

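The resulting split, sketched (helper placement assumed):

    /* generic: usable by any subsystem with an ioremapped region */
    static inline void msm_rmw(void __iomem *addr, u32 mask, u32 or)
    {
            u32 val = readl(addr);

            val &= ~mask;
            writel(val | or, addr);
    }

    /* gpu wrapper: registers are word-indexed, hence reg << 2 */
    static inline void gpu_rmw(struct msm_gpu *gpu, u32 reg, u32 mask, u32 or)
    {
            msm_rmw(gpu->mmio + (reg << 2), mask, or);
    }
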
2020-11-05  drm/msm: Add support for GPU cooling  (Akhil P Oommen)

Register GPU as a devfreq cooling device so that it can be passively cooled by the thermal framework.

Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org>
Tested-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>

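Registration goes through the devfreq cooling API; a sketch (field names assumed):

    gpu->cooling = of_devfreq_cooling_register(dev->of_node,
                                               gpu->devfreq.devfreq);
    if (IS_ERR(gpu->cooling))
            DRM_DEV_ERROR(dev, "could not register cooling device\n");
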
2020-11-04  drm/msm: Add priv->mm_lock to protect active/inactive lists  (Rob Clark)

Rather than relying on the big dev->struct_mutex hammer, introduce a more specific lock for protecting the bo lists.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>

2020-11-01  drm/msm/gpu: Convert retire/recover work to kthread_worker  (Rob Clark)

Signed-off-by: Rob Clark <robdclark@chromium.org>

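The conversion follows the standard kthread_worker pattern; a sketch (names assumed):

    gpu->worker = kthread_create_worker(0, "%s-worker", gpu->name);
    kthread_init_work(&gpu->retire_work, retire_worker);
    kthread_init_work(&gpu->recover_work, recover_worker);

    /* then, instead of queue_work(), e.g. from the retire path: */
    kthread_queue_work(gpu->worker, &gpu->retire_work);
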
2020-09-15  drm/msm: Allow a5xx to mark the RPTR shadow as privileged  (Jordan Crouse)

Newer microcode versions have support for the CP_WHERE_AM_I opcode, which allows the RPTR shadow memory to be marked as privileged to protect it from corruption. Move the RPTR shadow into its own buffer and protect it if the current microcode version supports the new feature.

We can also re-enable preemption for those targets that support CP_WHERE_AM_I. Start out by preemptively assuming that we can enable preemption, and disable it in a5xx_hw_init if the microcode version comes back as too old.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>