summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)Author
2020-10-08drm/amd/display: Change ABM config init interfaceYongqiang Sun
[Why & How] change abm config init interface to support multiple ABMs. Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Reviewed-by: Chris Park <Chris.Park@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-07drm/amdgpu/swsmu: fix ARC build errorsAlex Deucher
We want to use the dev_* functions here rather than the pr_* variants. Switch to using dev_warn() which mirrors what we do on other asics. Fixes the following build errors on ARC: ../drivers/gpu/drm/amd/amdgpu/../powerplay/navi10_ppt.c: In function 'navi10_fill_i2c_req': ../arch/arc/include/asm/bug.h:24:2: error: implicit declaration of function 'pr_warn'; did you mean 'drm_warn'? [-Werror=implicit-function-declaration] ../drivers/gpu/drm/amd/amdgpu/../powerplay/sienna_cichlid_ppt.c: In function 'sienna_cichlid_fill_i2c_req': ../arch/arc/include/asm/bug.h:24:2: error: implicit declaration of function 'pr_warn'; did you mean 'drm_warn'? [-Werror=implicit-function-declaration] Reported-by: kernel test robot <lkp@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Evan Quan <evan.quan@amd.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: linux-snps-arc@lists.infradead.org Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-07drm/amdgpu: fix NULL pointer dereference for RenoirDirk Gouders
Commit c1cf79ca5ced46 ("drm/amdgpu: use IP discovery table for renoir") introduced a NULL pointer dereference when booting with amdgpu.discovery=0, because it removed the call of vega10_reg_base_init() for that case. Fix this by calling that funcion if amdgpu_discovery == 0 in addition to the case that amdgpu_discovery_reg_base_init() failed. Fixes: c1cf79ca5ced46 ("drm/amdgpu: use IP discovery table for renoir") Signed-off-by: Dirk Gouders <dirk@gouders.net> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-07drm/nouveau/mem: guard against NULL pointer access in mem_delKarol Herbst
other drivers seems to do something similar Signed-off-by: Karol Herbst <kherbst@redhat.com> Cc: dri-devel <dri-devel@lists.freedesktop.org> Cc: Dave Airlie <airlied@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201006220528.13925-2-kherbst@redhat.com
2020-10-07drm/nouveau/device: return error for unknown chipsetsKarol Herbst
Previously the code relied on device->pri to be NULL and to fail probing later. We really should just return an error inside nvkm_device_ctor for unsupported GPUs. Fixes: 24d5ff40a732 ("drm/nouveau/device: rework mmio mapping code to get rid of second map") Signed-off-by: Karol Herbst <kherbst@redhat.com> Cc: dann frazier <dann.frazier@canonical.com> Cc: dri-devel <dri-devel@lists.freedesktop.org> Cc: Dave Airlie <airlied@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Jeremy Cline <jcline@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201006220528.13925-1-kherbst@redhat.com
2020-10-01Merge tag 'amd-drm-fixes-5.9-2020-09-30' of ↵Dave Airlie
git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.9-2020-09-30: amdgpu: - Fix potential double free in userptr handling - Sienna Cichlid and Navy Flounder udpates - Add Sienna Cichlid PCI IDs - Drop experimental flag for navi12 - Raven fixes - Renoir fixes - HDCP fix - DCN3 fix for clang and older versions of gcc - Fix a runtime pm refcount issue Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200930161326.4243-1-alexander.deucher@amd.com
2020-09-30drm/amdgpu: disable gfxoff temporarily for navy_flounderJiansong Chen
gfxoff is temporarily disabled for navy_flounder, since at present the feature caused some tdr when performing display operations. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30drm/amd/pm: setup APU dpm clock table in SMU HW initializationEvan Quan
As the dpm clock table is needed during DC HW initialization. And that (DC HW initialization) comes before smu_late_init() where current APU dpm clock table setup is performed. So, NULL pointer dereference will be triggered. By moving APU dpm clock table setup to smu_hw_init(), this can be avoided. Fixes: 02cf91c113ea ("drm/amd/powerplay: postpone operations not required for hw setup to late_init") Acked-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Evan Quan <evan.quan@amd.com> Reported-by: Dirk Gouders <dirk@gouders.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30drm/vmwgfx: Fix error handling in get_nodeZack Rusin
ttm_mem_type_manager_func.get_node was changed to return -ENOSPC instead of setting the node pointer to NULL. Unfortunately vmwgfx still had two places where it was explicitly converting -ENOSPC to 0 causing regressions. This fixes those spots by allowing -ENOSPC to be returned. That seems to fix recent regressions with vmwgfx. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Martin Krastev <krastevm@vmware.com> Sigend-off-by: Roland Scheidegger <sroland@vmware.com>
2020-09-29drm/amd/display: remove duplicate call to rn_vbios_smu_get_smu_version()Dirk Gouders
Commit 78fe9f63947a2b ("drm/amd/display: Remove DISPCLK Limit Floor for Certain SMU Versions") added a call to rn_vbios_smu_get_smu_version() to set clk_mgr->smu_ver. That field is initialized prior to the if-statement, already. Fixes: 78fe9f63947a2b (drm/amd/display: Remove DISPCLK Limit Floor for Certain SMU Versions) Signed-off-by: Dirk Gouders <dirk@gouders.net> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Sung Lee <sung.lee@amd.com> Cc: Yongqiang Sun <yongqiang.sun@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29drm/amdgpu/swsmu/smu12: fix force clock handling for mclkAlex Deucher
The state array is in the reverse order compared to other asics (high to low rather than low to high). Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1313 Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_configJean Delvare
A recent attempt to fix a ref count leak in amdgpu_display_crtc_set_config() turned out to be doing too much and "fixed" an intended decrease as if it were a leak. Undo that part to restore the proper balance. This is the very nature of this function to increase or decrease the power reference count depending on the situation. Consequences of this bug is that the power reference would eventually get down to 0 while the display was still in use, resulting in that display switching off unexpectedly. Signed-off-by: Jean Delvare <jdelvare@suse.de> Fixes: e008fa6fb415 ("drm/amdgpu: fix ref count leak in amdgpu_display_crtc_set_config") Cc: stable@vger.kernel.org Cc: Navid Emamdoost <navid.emamdoost@gmail.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29drm/amdgpu/display: fix CFLAGS setup for DCN30Alex Deucher
Properly handle clang and older versions of gcc. Fixes: e77165bf7b02a3 ("drm/amd/display: Add DCN3 blocks to Makefile") Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29drm/amd/display: fix return value check for hdcp_workFlora Cui
max_caps might be 0, thus hdcp_work might be ZERO_SIZE_PTR Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29drm/amdgpu: remove gpu_info fw support for sienna_cichlid etc.Jiansong Chen
Remove gpu_info fw support for sienna_cichlid etc., since the information can be retrieved from discovery binary. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29drm/amd/pm: Removed fixed clock in auto mode DPMSudheesh Mavila
SMU10_UMD_PSTATE_PEAK_FCLK value should not be used to set the DPM. Suggested-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25Merge tag 'drm-intel-fixes-2020-09-24' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v5.9-rc7: - Fix selftest reference to stack data out of scope - Fix GVT null pointer dereference - Backmerge from Linus' master to fix build Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87zh5fpmha.fsf@intel.com
2020-09-23drm/i915/selftests: Push the fake iommu device from the stack to dataChris Wilson
Since we store a pointer to the fake iommu device that is allocated on the stack, as soon as we leave the function it goes out of scope and any future dereference is undefined behaviour. Just in case we may need to look at the fake iommu device after initialiation, move the allocation from the stack into the data. Fixes: 01b9d4e21148 ("iommu/vt-d: Use dev_iommu_priv_get/set()") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200916105022.28316-2-chris@chris-wilson.co.uk (cherry picked from commit 9f9f4101fc98db56714e71676d5a1e2d27e01f7e) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-09-23Merge tag 'drm-misc-fixes-2020-09-18' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes for v5.9-rc6: - Fill asoc card owner in vc4. - Program secondary CSC correctly in sun4i, and extend register mapping to cover secondary CSC registers. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/e3ab56cf-3b8e-9b21-f1b6-9a4989a52996@linux.intel.com
2020-09-22Merge tag 'gvt-fixes-2020-09-17' of https://github.com/intel/gvt-linux into ↵Jani Nikula
drm-intel-fixes gvt-fixes-2020-09-17 - Fix kernel oops for VFIO edid on BDW (Zhenyu) Signed-off-by: Jani Nikula <jani.nikula@intel.com> From: Zhenyu Wang <zhenyuw@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200917064208.GF11592@zhen-hp.sh.intel.com
2020-09-17drm/amdgpu: remove experimental flag from navi12Alex Deucher
Navi12 has worked fine for a while now. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17drm/amdgpu: add device ID for sienna_cichlid (v2)Likun Gao
Add device ID for sienna_cichlid. v2: squash in additional device ids. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17drm/amdgpu: use the AV1 defines for VCN 3.0Alex Deucher
Switch from magic numbers to defines for AV1 clockgating. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17drm/amdgpu: add VCN 3.0 AV1 registersAlex Deucher
This adds the AV1 registers. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17drm/amdgpu: add the GC 10.3 VRS registersAlex Deucher
Add the VRS registers. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17drm/amdgpu: prevent double kfree ttm->sgPhilip Yang
Set ttm->sg to NULL after kfree, to avoid memory corruption backtrace: [ 420.932812] kernel BUG at /build/linux-do9eLF/linux-4.15.0/mm/slub.c:295! [ 420.934182] invalid opcode: 0000 [#1] SMP NOPTI [ 420.935445] Modules linked in: xt_conntrack ipt_MASQUERADE [ 420.951332] Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS 1.5.4 07/09/2020 [ 420.952887] RIP: 0010:__slab_free+0x180/0x2d0 [ 420.954419] RSP: 0018:ffffbe426291fa60 EFLAGS: 00010246 [ 420.955963] RAX: ffff9e29263e9c30 RBX: ffff9e29263e9c30 RCX: 000000018100004b [ 420.957512] RDX: ffff9e29263e9c30 RSI: fffff3d33e98fa40 RDI: ffff9e297e407a80 [ 420.959055] RBP: ffffbe426291fb00 R08: 0000000000000001 R09: ffffffffc0d39ade [ 420.960587] R10: ffffbe426291fb20 R11: ffff9e49ffdd4000 R12: ffff9e297e407a80 [ 420.962105] R13: fffff3d33e98fa40 R14: ffff9e29263e9c30 R15: ffff9e2954464fd8 [ 420.963611] FS: 00007fa2ea097780(0000) GS:ffff9e297e840000(0000) knlGS:0000000000000000 [ 420.965144] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 420.966663] CR2: 00007f16bfffefb8 CR3: 0000001ff0c62000 CR4: 0000000000340ee0 [ 420.968193] Call Trace: [ 420.969703] ? __page_cache_release+0x3c/0x220 [ 420.971294] ? amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu] [ 420.972789] kfree+0x168/0x180 [ 420.974353] ? amdgpu_ttm_tt_set_user_pages+0x64/0xc0 [amdgpu] [ 420.975850] ? kfree+0x168/0x180 [ 420.977403] amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu] [ 420.978888] ttm_tt_unpopulate.part.10+0x53/0x60 [amdttm] [ 420.980357] ttm_tt_destroy.part.11+0x4f/0x60 [amdttm] [ 420.981814] ttm_tt_destroy+0x13/0x20 [amdttm] [ 420.983273] ttm_bo_cleanup_memtype_use+0x36/0x80 [amdttm] [ 420.984725] ttm_bo_release+0x1c9/0x360 [amdttm] [ 420.986167] amdttm_bo_put+0x24/0x30 [amdttm] [ 420.987663] amdgpu_bo_unref+0x1e/0x30 [amdgpu] [ 420.989165] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x9ca/0xb10 [amdgpu] [ 420.990666] kfd_ioctl_alloc_memory_of_gpu+0xef/0x2c0 [amdgpu] Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-18Merge tag 'mediatek-drm-fixes-5.9' of ↵Dave Airlie
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes for Linux 5.9 1. Fix scrolling of panel 2. Remove duplicated include 3. Use CPU when fail to get cmdq event 4. Add missing put_device() call Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200916231724.30571-1-chunkuang.hu@kernel.org
2020-09-18Merge tag 'drm-intel-fixes-2020-09-17' of ↵Dave Airlie
ssh://git.freedesktop.org/git/drm/drm-intel into drm-fixes drm/i915 fixes for v5.9-rc6: - Avoid exposing a partially constructed context - Use RCU instead of mutex for context termination list iteration - Avoid data race reported by KCSAN - Filter wake_flags passed to default_wake_function Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87y2l8vlj3.fsf@intel.com
2020-09-17drm/amd/display: Don't log hdcp module warnings in dmesgBhawanpreet Lakha
[Why] DTM topology updates happens by default now. This results in DTM warnings when hdcp is not even being enabled. This spams the dmesg and doesn't effect normal display functionality so it is better to log it using DRM_DEBUG_KMS() [How] Change the DRM_WARN() to DRM_DEBUG_KMS() Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17drm/amdgpu: declare ta firmware for navy_flounderJiansong Chen
The firmware provided via MODULE_FIRMWARE appears in the module information. External tools(eg. dracut) may use the list of fw files to include them as appropriate in an initramfs, thus missing declaration will lead to request firmware failure in boot time. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tianci Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17drm/mediatek: Add missing put_device() call in mtk_hdmi_dt_parse_pdata()Yu Kuai
if of_find_device_by_node() succeed, mtk_drm_kms_init() doesn't have a corresponding put_device(). Thus add jump target to fix the exception handling for this function implementation. Fixes: 8f83f26891e1 ("drm/mediatek: Add HDMI support") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2020-09-17drm/mediatek: Add missing put_device() call in mtk_drm_kms_init()Yu Kuai
if of_find_device_by_node() succeed, mtk_drm_kms_init() doesn't have a corresponding put_device(). Thus add jump target to fix the exception handling for this function implementation. Fixes: 119f5173628a ("drm/mediatek: Add DRM Driver for Mediatek SoC MT8173.") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2020-09-17drm/mediatek: Add exception handing in mtk_drm_probe() if component init failYu Kuai
mtk_ddp_comp_init() is called in a loop in mtk_drm_probe(), if it fail, previous successive init component is not proccessed. Thus uninitialize valid component and put their device if component init failed. Fixes: 119f5173628a ("drm/mediatek: Add DRM Driver for Mediatek SoC MT8173.") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2020-09-17drm/mediatek: Add missing put_device() call in mtk_ddp_comp_init()Yu Kuai
if of_find_device_by_node() succeed, mtk_ddp_comp_init() doesn't have a corresponding put_device(). Thus add put_device() to fix the exception handling for this function implementation. Fixes: d0afe37f5209 ("drm/mediatek: support CMDQ interface in ddp component") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2020-09-17drm/mediatek: Use CPU when fail to get cmdq eventChun-Kuang Hu
Even though cmdq client is created successfully, without the cmdq event, cmdq could not work correctly, so use CPU when fail to get cmdq event. Fixes: 60fa8c13ab1a ("drm/mediatek: Move gce event property to mutex device node") Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2020-09-17drm/mediatek: Remove duplicated includeWang Hai
Remove mtk_drm_ddp.h which is included more than once Fixes: 9aef5867c86c ("drm/mediatek: drop use of drmP.h") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2020-09-16drm/i915: Filter wake_flags passed to default_wake_functionChris Wilson
(NOTE: This is the minimal backportable fix, a full fix is being developed at https://patchwork.freedesktop.org/patch/388048/) The flags passed to the wait_entry.func are passed onwards to try_to_wake_up(), which has a very particular interpretation for its wake_flags. In particular, beyond the published WF_SYNC, it has a few internal flags as well. Since we passed the fence->error down the chain via the flags argument, these ended up in the default_wake_function confusing the kernel/sched. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2110 Fixes: ef4688497512 ("drm/i915: Propagate fence errors") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v5.4+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200728152144.1100-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Joonas: Rebased and reordered into drm-intel-gt-next branch] [Joonas: Added a note and link about more complete fix] Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit f4b3c395540aa3d4f5a6275c5bdd83ab89034806) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-09-16drm/i915: Be wary of data races when reading the active execlistsChris Wilson
To implement preempt-to-busy (and so efficient timeslicing and best utilization of the hardware submission ports) we let the GPU run asynchronously in respect to the ELSP submission queue. This created challenges in keeping and accessing the driver state mirroring the asynchronous GPU execution. The latest occurence of this was spotted by KCSAN: [ 1413.563200] BUG: KCSAN: data-race in __await_execution+0x217/0x370 [i915] [ 1413.563221] [ 1413.563236] race at unknown origin, with read to 0xffff88885bb6c478 of 8 bytes by task 9654 on cpu 1: [ 1413.563548] __await_execution+0x217/0x370 [i915] [ 1413.563891] i915_request_await_dma_fence+0x4eb/0x6a0 [i915] [ 1413.564235] i915_request_await_object+0x421/0x490 [i915] [ 1413.564577] i915_gem_do_execbuffer+0x29b7/0x3c40 [i915] [ 1413.564967] i915_gem_execbuffer2_ioctl+0x22f/0x5c0 [i915] [ 1413.564998] drm_ioctl_kernel+0x156/0x1b0 [ 1413.565022] drm_ioctl+0x2ff/0x480 [ 1413.565046] __x64_sys_ioctl+0x87/0xd0 [ 1413.565069] do_syscall_64+0x4d/0x80 [ 1413.565094] entry_SYSCALL_64_after_hwframe+0x44/0xa9 To complicate matters, we have to both avoid the read tearing of *active and avoid any write tearing as perform the pending[] -> inflight[] promotion of the execlists. This is because we cannot rely on the memcpy doing u64 aligned copies on all kernels/platforms and so we opt to open-code it with explicit WRITE_ONCE annotations to satisfy KCSAN. v2: When in doubt, write the same comment again. v3: Expanded commit message. Fixes: b55230e5e800 ("drm/i915: Check for awaits on still currently executing requests") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200716142207.13003-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Joonas: Rebased and reordered into drm-intel-gt-next branch] [Joonas: Added expanded commit message from Tvrtko and Chris] Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit b4d9145b0154f8c71dafc2db5fd445f1f3db9426) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-09-16drm/i915/gem: Reduce context termination list iteration guard to RCUChris Wilson
As we now protect the timeline list using RCU, we can drop the timeline->mutex for guarding the list iteration during context close, as we are searching for an inflight request. Any new request will see the context is banned and not be submitted. In doing so, pull the checks for a concurrent submission of the request (notably the i915_request_completed()) under the engine spinlock, to fully serialise with __i915_request_submit()). That is in the case of preempt-to-busy where the request may be completed during the __i915_request_submit(), we need to be careful that we sample the request status after serialising so that we don't miss the request the engine is actually submitting. Fixes: 4a3174152147 ("drm/i915/gem: Refine occupancy test in kill_context()") References: d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from early waits") # rcu protection of timeline->requests References: https://gitlab.freedesktop.org/drm/intel/-/issues/1622 References: https://gitlab.freedesktop.org/drm/intel/-/issues/2158 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200806105954.7766-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit 736e785f9b28cd9ef2d16a80960a04fd00e64b22) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-09-16drm/i915/gem: Delay tracking the GEM context until it is registeredChris Wilson
Avoid exposing a partially constructed context by deferring the list_add() from the initial construction to the end of registration. Otherwise, if we peek into the list of contexts from inside debugfs, we may see the partially constructed context and chase down some dangling incomplete pointers. Reported-by: CQ Tang <cq.tang@intel.com> Fixes: 3aa9945a528e ("drm/i915: Separate GEM context construction and registration to userspace") References: f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: CQ Tang <cq.tang@intel.com> Cc: <stable@vger.kernel.org> # v5.2+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200730092856.23615-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit eb4dedae920a07c485328af3da2202ec5184fb17) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-09-15drm/amdgpu/dc: Require primary plane to be enabled whenever the CRTC isMichel Dänzer
Don't check drm_crtc_state::active for this either, per its documentation in include/drm/drm_crtc.h: * Hence drivers must not consult @active in their various * &drm_mode_config_funcs.atomic_check callback to reject an atomic * commit. atomic_remove_fb disables the CRTC as needed for disabling the primary plane. This prevents at least the following problems if the primary plane gets disabled (e.g. due to destroying the FB assigned to the primary plane, as happens e.g. with mutter in Wayland mode): * The legacy cursor ioctl returned EINVAL for a non-0 cursor FB ID (which enables the cursor plane). * If the cursor plane was enabled, changing the legacy DPMS property value from off to on returned EINVAL. v2: * Minor changes to code comment and commit log, per review feedback. GitLab: https://gitlab.gnome.org/GNOME/mutter/-/issues/1108 GitLab: https://gitlab.gnome.org/GNOME/mutter/-/issues/1165 GitLab: https://gitlab.gnome.org/GNOME/mutter/-/issues/1344 Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/radeon: revert "Prefer lower feedback dividers"Christian König
Turns out this breaks a lot of different hardware. This reverts commit fc8c70526bd30733ea8667adb8b8ffebea30a8ed. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/amdgpu: Include sienna_cichlid in USBC PD FW support.Andrey Grodzovsky
Create sysfs interface also for sienna_cichlid. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/amd/display: update nv1x stutter latenciesJun Lei
[why] Recent characterization shows increased stutter latencies on some SKUs, leading to underflow. [how] Update SOC params to account for this worst case latency. Signed-off-by: Jun Lei <jun.lei@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/amd/display: Don't use DRM_ERROR() for DTM add topologyBhawanpreet Lakha
[Why] Previously we were only calling add_topology when hdcp was being enabled. Now we call add_topology by default so the ERROR messages are printed if the firmware is not loaded. This error message is not relevant for normal display functionality so no need to print a ERROR message. [How] Change DRM_ERROR to DRM_INFO Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/amd/pm: support runtime pptable update for sienna_cichlid etc.Jiansong Chen
This avoids smu issue when enabling runtime pptable update for sienna_cichlid and so on. Runtime pptable udpate is needed for test and debug purpose. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/amdkfd: fix a memory leak issueDennis Li
In the resume stage of GPU recovery, start_cpsch will call pm_init which set pm->allocated as false, cause the next pm_release_ib has no chance to release ib memory. Add pm_release_ib in stop_cpsch which will be called in the suspend stage of GPU recovery. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/kfd: fix a system crash issue during GPU recoveryDennis Li
The crash log as the below: [Thu Aug 20 23:18:14 2020] general protection fault: 0000 [#1] SMP NOPTI [Thu Aug 20 23:18:14 2020] CPU: 152 PID: 1837 Comm: kworker/152:1 Tainted: G OE 5.4.0-42-generic #46~18.04.1-Ubuntu [Thu Aug 20 23:18:14 2020] Hardware name: GIGABYTE G482-Z53-YF/MZ52-G40-00, BIOS R12 05/13/2020 [Thu Aug 20 23:18:14 2020] Workqueue: events amdgpu_ras_do_recovery [amdgpu] [Thu Aug 20 23:18:14 2020] RIP: 0010:evict_process_queues_cpsch+0xc9/0x130 [amdgpu] [Thu Aug 20 23:18:14 2020] Code: 49 8d 4d 10 48 39 c8 75 21 eb 44 83 fa 03 74 36 80 78 72 00 74 0c 83 ab 68 01 00 00 01 41 c6 45 41 00 48 8b 00 48 39 c8 74 25 <80> 78 70 00 c6 40 6d 01 74 ee 8b 50 28 c6 40 70 00 83 ab 60 01 00 [Thu Aug 20 23:18:14 2020] RSP: 0018:ffffb29b52f6fc90 EFLAGS: 00010213 [Thu Aug 20 23:18:14 2020] RAX: 1c884edb0a118914 RBX: ffff8a0d45ff3c00 RCX: ffff8a2d83e41038 [Thu Aug 20 23:18:14 2020] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff8a0e2e4178c0 [Thu Aug 20 23:18:14 2020] RBP: ffffb29b52f6fcb0 R08: 0000000000001b64 R09: 0000000000000004 [Thu Aug 20 23:18:14 2020] R10: ffffb29b52f6fb78 R11: 0000000000000001 R12: ffff8a0d45ff3d28 [Thu Aug 20 23:18:14 2020] R13: ffff8a2d83e41028 R14: 0000000000000000 R15: 0000000000000000 [Thu Aug 20 23:18:14 2020] FS: 0000000000000000(0000) GS:ffff8a0e2e400000(0000) knlGS:0000000000000000 [Thu Aug 20 23:18:14 2020] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [Thu Aug 20 23:18:14 2020] CR2: 000055c783c0e6a8 CR3: 00000034a1284000 CR4: 0000000000340ee0 [Thu Aug 20 23:18:14 2020] Call Trace: [Thu Aug 20 23:18:14 2020] kfd_process_evict_queues+0x43/0xd0 [amdgpu] [Thu Aug 20 23:18:14 2020] kfd_suspend_all_processes+0x60/0xf0 [amdgpu] [Thu Aug 20 23:18:14 2020] kgd2kfd_suspend.part.7+0x43/0x50 [amdgpu] [Thu Aug 20 23:18:14 2020] kgd2kfd_pre_reset+0x46/0x60 [amdgpu] [Thu Aug 20 23:18:14 2020] amdgpu_amdkfd_pre_reset+0x1a/0x20 [amdgpu] [Thu Aug 20 23:18:14 2020] amdgpu_device_gpu_recover+0x377/0xf90 [amdgpu] [Thu Aug 20 23:18:14 2020] ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu] [Thu Aug 20 23:18:14 2020] amdgpu_ras_do_recovery+0x159/0x190 [amdgpu] [Thu Aug 20 23:18:14 2020] process_one_work+0x20f/0x400 [Thu Aug 20 23:18:14 2020] worker_thread+0x34/0x410 When GPU hang, user process will fail to create a compute queue whose struct object will be freed later, but driver wrongly add this queue to queue list of the proccess. And then kfd_process_evict_queues will access a freed memory, which cause a system crash. v2: The failure to execute_queues should probably not be reported to the caller of create_queue, because the queue was already created. Therefore change to ignore the return value from execute_queues. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-09-14drm/i915/gvt: Fix port number for BDW on EDID region setupZhenyu Wang
Current BDW virtual display port is initialized as PORT_B, so need to use same port for VFIO EDID region, otherwise invalid EDID blob pointer is assigned which caused kernel null pointer reference. We might evaluate actual display hotplug for BDW to make this function work as expected, anyway this is always required to be fixed first. Reported-by: Alejandro Sior <aho@sior.be> Cc: Alejandro Sior <aho@sior.be> Fixes: 0178f4ce3c3b ("drm/i915/gvt: Enable vfio edid for all GVT supported platform") Reviewed-by: Hang Yuan <hang.yuan@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20200914030302.2775505-1-zhenyuw@linux.intel.com
2020-09-11Merge tag 'drm-misc-fixes-2020-09-09' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes for v5.9-rc5: - Fix double free in virtio. - Add missing put_device in sun4i, and other fixes. - Small ingenic fixes. - Handle sun4i alpha on lowest plane correctly. - Remove output->enabled from virtio, as it should use crtc_state. - Fix tve200 enable/disable. - Documentation fix. - Fix virtio unblank. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/478b49d1-b1b3-c983-7056-8a89249be435@mblankhorst.nl