Age / Commit message / Author
2020-10-19Merge tag 'amd-drm-fixes-5.10-2020-10-14' of ↵Dave Airlie
git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-fixes-5.10-2020-10-14: amdgpu: - eDP fix - BACO fix - Kernel documentation fixes - SMU7 mclk fix - VCN1 hw bug workaround amdkfd: - kvfree vs kfree fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201014195403.4558-1-alexander.deucher@amd.com
2020-10-14drm/amdkfd: Use kvfree in destroy_crat_imageKent Russell
Now that we use kvmalloc for the crat_image, we need to use kvfree when we destroy it. Fixes: d0e63b343e575e ("drm/amdkfd: Use kvmalloc instead of kmalloc for VCRAT") Reported-by: Morris Zhang <shiwu.zhang@amd.com> Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-14drm/amdgpu: vcn and jpeg ring synchronizationVeerabadhran G
Synchronize the ring usage for vcn1 and jpeg1 to work around a hardware bug. Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-10-14drm/amd/pm: increase mclk switch threshold to 200 usEvan Quan
This avoids the underflow seen on Polaris10 with some 3440x1440 144Hz displays, where the threshold of 190 us cuts too close to the minVBlankTime of 192 us. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
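The margin involved can be illustrated with a small sketch. It assumes, purely for illustration, that a memory clock switch is permitted whenever the vertical blank time is at least as long as the threshold; the function names and the comparison are hypothetical and are not taken from the driver. Only the 190 us, 192 us and 200 us figures come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model, not the driver's code: allow an mclk switch only
 * when the vblank window is at least as long as the switch threshold.
 */
static bool mclk_switch_allowed(unsigned int vblank_time_us,
                                unsigned int threshold_us)
{
    return vblank_time_us >= threshold_us;
}

/* Usage example with the numbers quoted in the commit message. */
static void check_mclk_examples(void)
{
    /* Old threshold: a 192 us vblank passes with only 2 us to spare. */
    assert(mclk_switch_allowed(192, 190));
    /* New threshold: the same vblank is now treated as too short. */
    assert(!mclk_switch_allowed(192, 200));
}
```

Under this assumption, raising the threshold trades some power saving on such displays for the guarantee that a switch is never attempted in a window that is barely long enough.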
2020-10-14docs: amdgpu: fix a warning when building the documentationMauro Carvalho Chehab
As reported by Sphinx: Documentation/gpu/amdgpu.rst:200: WARNING: Inline emphasis start-string without end-string. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-14drm/amd/display: kernel-doc: document force_timing_syncMauro Carvalho Chehab
As warned when running "make htmldocs": ./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:345: warning: Function parameter or member 'force_timing_sync' not described in 'amdgpu_display_manager' This new struct member was not documented at kernel-doc markup. Fixes: 3d4e52d0cf24 ("drm/amd/display: Add debugfs for forcing stream timing sync") Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-14drm/amdgpu/swsmu: init the baco mutex in early_initAlex Deucher
GPU reset might get called during init time, before sw_init has been called. Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-14drm/amd/display: Fix module load hangs when connected to an eDPRodrigo Siqueira
A recent change enables the driver to disable streams if the pixel clock changes. Consequently, the code path executed in the disable vbios function expanded to an encoder verification part. The encoder loop is nested inside the pipe count loop, and both loops share the 'i' variable in control of their flow. This situation may lead to an infinite loop because the encoder loop constantly updates the `i` variable, so the outer loop's condition always holds. As a result, we can see a soft hang during the module load (modprobe amdgpu) and a series of dmesg logs that look like this: kernel:[ 124.538727] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [modprobe:1000] RSP: 0018:ffffabbf419bf0e8 EFLAGS: 00000282 RAX: ffffffffc0809de0 RBX: ffff93b35ccc0000 RCX: ffff93b366c21800 RDX: 0000000000000000 RSI: 0000000000000141 RDI: ffff93b35ccc0000 RBP: ffffabbf419bf108 R08: ffffabbf419bf164 R09: 0000000000000001 R10: 0000000000000003 R11: 0000000000000003 R12: 0000000008677d40 R13: 0000000000000141 R14: ffff93b35cfc0000 R15: ffff93b35abc0000 FS: 00007f1400717540(0000) GS:ffff93b37f680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005649b66b0968 CR3: 00000003e0fec000 CR4: 0000000000350ee0 Call Trace: amdgpu_device_rreg+0x17/0x20 [amdgpu] amdgpu_cgs_read_register+0x14/0x20 [amdgpu] dm_read_reg_func+0x3a/0xb0 [amdgpu] get_pixel_clk_frequency_100hz+0x30/0x50 [amdgpu] dc_commit_state+0x8f1/0xae0 [amdgpu] ? drm_calc_timestamping_constants+0x101/0x160 [drm] amdgpu_dm_atomic_commit_tail+0x39d/0x21a0 [amdgpu] ? dcn21_validate_bandwidth+0xe5/0x290 [amdgpu] ? kfree+0xc3/0x390 ? dcn21_validate_bandwidth+0xe5/0x290 [amdgpu] ... 
RSP: 002b:00007fff26009bd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 RAX: ffffffffffffffda RBX: 000055a8025bea50 RCX: 00007f140085c89d RDX: 0000000000000000 RSI: 000055a8025b8290 RDI: 000000000000000c RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000 R10: 000000000000000c R11: 0000000000000246 R12: 000055a8025b8290 R13: 0000000000000000 R14: 000055a8025bead0 R15: 000055a8025bea50 This issue was fixed by introducing a second variable for the internal loop. Fixes: 8353d30e747f4e ("drm/amd/display: disable stream if pixel clock changed with link active") Reviewed-by: Roman Li <Roman.Li@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
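The shared-index problem described in this commit can be reproduced in a few lines of plain C. The sketch below is a standalone model, not the display code: the function names, the loop bounds and the `cap` guard (added so the buggy variant always returns) are all hypothetical.

```c
#include <assert.h>

/*
 * Buggy shape: the inner loop reuses 'i'. Whenever enc_count + 1 is
 * below pipe_count, the outer condition never becomes false. 'cap'
 * bounds the number of outer passes so this demo terminates.
 */
static int outer_passes_shared_index(int pipe_count, int enc_count, int cap)
{
    int passes = 0;
    int i;

    for (i = 0; i < pipe_count; i++) {
        if (++passes >= cap)
            break;
        for (i = 0; i < enc_count; i++)
            ; /* per-encoder work would go here */
    }
    return passes;
}

/* Fixed shape: the inner loop has its own variable 'j'. */
static int outer_passes_separate_index(int pipe_count, int enc_count)
{
    int passes = 0;
    int i, j;

    for (i = 0; i < pipe_count; i++) {
        passes++;
        for (j = 0; j < enc_count; j++)
            ; /* per-encoder work would go here */
    }
    return passes;
}

/* Usage example. */
static void check_loop_examples(void)
{
    /* 4 pipes, 2 encoders: the shared index spins until the cap. */
    assert(outer_passes_shared_index(4, 2, 100) == 100);
    /* With a separate index the outer loop runs once per pipe. */
    assert(outer_passes_separate_index(4, 2) == 4);
}
```

Note that the shared index does not always hang: with 2 pipes and 2 encoders the buggy variant exits after a single pass, silently skipping the second pipe, which is arguably harder to notice than a soft lockup.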
2020-10-14Merge tag 'drm-misc-next-fixes-2020-10-13' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next One fix for a bad revert in ingenic-drm, and one fix for panfrost to increase a timeout at power up. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20201013065709.lwjw3fthoxwsbqsl@gilmour.lan
2020-10-12drm/ingenic: Fix bad revertPaul Cercueil
Fix a badly reverted commit. The revert commit was cherry-picked from drm-misc-next to drm-misc-next-fixes, and in the process some unrelated code was added. Fixes: a3fb64c00d44 ("Revert "gpu/drm: ingenic: Add option to mmap GEM buffers cached"") Signed-off-by: Paul Cercueil <paul@crapouillou.net> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20201012102509.10690-1-paul@crapouillou.net
2020-10-12Merge tag 'amd-drm-fixes-5.10-2020-10-09' of ↵Dave Airlie
git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-fixes-5.10-2020-10-09: amdgpu: - Clean up indirect register access - Navy Flounder fixes - SMU11 AC/DC interrupt fixes - GPUVM alignment fix - Display fixes - Misc other fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201009222810.4030-1-alexander.deucher@amd.com
2020-10-12Merge tag 'drm-intel-next-fixes-2020-10-02' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next Propagated from drm-intel-next-queued: - Fix CRTC state checker (Ville) Propagated from drm-intel-gt-next: - Avoid implicit vmap for highmem on 32b (Chris) - Prevent PAT attributes for writecombine if CPU doesn't support PAT (Chris) - Clear the buffer pool age before use. (Chris) - Fix error code (Dan) - Break up error capture compression loops (Chris) - Fix uninitialized variable in context_create_request (Maarten) - Check for errors on i915_vm_alloc_pt_stash to avoid NULL dereference (Matt) - Serialize debugfs i915_gem_objects with ctx->mutex (Chris) - Fix a rebase mistake caused during drm-intel-gt-next creation (Chris) - Hold request reference for canceling an active context (Chris) - Heartbeats fixes (Chris) - Use unsigned during batch copies (Chris) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201002182610.GA2204465@intel.com
2020-10-10Merge tag 'drm-misc-next-fixes-2020-10-02' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next Three fixes for vc4 that address dual-display breakages Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20201002065243.ry7gp4or3ywhluer@gilmour.lan
2020-10-09drm/amdgpu: Fix invalid number of character '{' in amdgpu_acpi_initYe Bin
Fix the following warning: Checking drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c... [drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:770]: (error) Invalid number of character '{' when these macros are defined: ''. Checking drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c: CONFIG_ACPI... [drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:770]: (error) Invalid number of character '{' when these macros are defined: 'CONFIG_ACPI'. ...... Checking drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c: CONFIG_X86... [drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:770]: (error) Invalid number of character '{' when these macros are defined: 'CONFIG_X86'. Checking drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c: _X86_... [drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:770]: (error) Invalid number of character '{' when these macros are defined: '_X86_'. Checking drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c: __linux__... [drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:770]: (error) Invalid number of character '{' when these macros are defined: '__linux__'. Fixes: 97d798b276e9 ("drm/amdgpu: simplify ATIF backlight handling") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-09drm/amdgpu: Remove warning for virtual_displayEmily.Deng
Remove the virtual_display warning in drm_crtc_vblank_off when dev->num_crtcs is null. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-09drm/amdgpu: kfd_initialized can be statickernel test robot
Fixes: c7651b73586600 ("drm/amdgpu: Fix handling of KFD initialization failures") Signed-off-by: kernel test robot <lkp@intel.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-09drm/amd/pm: setup APU dpm clock table in SMU HW initializationEvan Quan
The dpm clock table is needed during DC HW initialization, which comes before smu_late_init(), where the APU dpm clock table setup is currently performed. As a result, a NULL pointer dereference is triggered. Moving the APU dpm clock table setup to smu_hw_init() avoids this. Fixes: 02cf91c113ea ("drm/amd/powerplay: postpone operations not required for hw setup to late_init") Signed-off-by: Evan Quan <evan.quan@amd.com> Reported-by: Dirk Gouders <dirk@gouders.net> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-09drm/amdgpu: prevent spurious warningAlex Deucher
The default auto setting for kcq should not generate a warning. Fixes: a300de40f66b ("drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v5)") Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-09drm/amdgpu/swsmu: fix ARC build errorsAlex Deucher
We want to use the dev_* functions here rather than the pr_* variants. Switch to using dev_warn() which mirrors what we do on other asics. Fixes the following build errors on ARC: ../drivers/gpu/drm/amd/amdgpu/../powerplay/navi10_ppt.c: In function 'navi10_fill_i2c_req': ../arch/arc/include/asm/bug.h:24:2: error: implicit declaration of function 'pr_warn'; did you mean 'drm_warn'? [-Werror=implicit-function-declaration] ../drivers/gpu/drm/amd/amdgpu/../powerplay/sienna_cichlid_ppt.c: In function 'sienna_cichlid_fill_i2c_req': ../arch/arc/include/asm/bug.h:24:2: error: implicit declaration of function 'pr_warn'; did you mean 'drm_warn'? [-Werror=implicit-function-declaration] Reported-by: kernel test robot <lkp@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Evan Quan <evan.quan@amd.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: linux-snps-arc@lists.infradead.org Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-09drm/amd/display: Fix OPTC_DATA_FORMAT programmingDmytro Laktyushkin
This should be programmed with timing rather than with odm. Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-09drm/amd/display: Don't allow pstate if no support in blankAlvin Lee
[Why] We will hang if we report switch in VACTIVE but not in VBLANK and DPG_EN = 1 [How] Block switch in ACTIVE if not supported in BLANK Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-09drm/panfrost: increase readl_relaxed_poll_timeout valuesChristian Hewitt
Amlogic SoC devices frequently report the following errors, causing excessive dmesg log spam and early log rotation, although the errors appear to be harmless as everything works fine: [ 7.202702] panfrost ffe40000.gpu: error powering up gpu L2 [ 7.203760] panfrost ffe40000.gpu: error powering up gpu shader ARM staff have advised increasing the timeout values to eliminate the errors in most normal scenarios, and testing with several different G31/G52 devices shows 20000 to be a reliable value. Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver") Suggested-by: Steven Price <steven.price@arm.com> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201008141738.13560-1-christianshewitt@gmail.com
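Why a longer timeout silences these messages can be shown with a simulated poll loop. This is a userspace model with a fake clock, not the kernel's readl_relaxed_poll_timeout(): the struct, the function names, the 100 us polling step, the 1000 us "short" timeout and the 5000 us power-up latency are all assumptions made for the illustration. Only the value 20000 comes from the commit message.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical device that reports ready once ready_after_us has passed. */
struct fake_gpu {
    unsigned long now_us;          /* simulated clock */
    unsigned long ready_after_us;  /* power-up latency being modelled */
};

static int fake_gpu_ready(const struct fake_gpu *gpu)
{
    return gpu->now_us >= gpu->ready_after_us;
}

/*
 * Poll every sleep_us until the device is ready or timeout_us has
 * elapsed. Returns 0 on success and -ETIMEDOUT otherwise.
 */
static int poll_ready_timeout(struct fake_gpu *gpu, unsigned long sleep_us,
                              unsigned long timeout_us)
{
    unsigned long deadline = gpu->now_us + timeout_us;

    for (;;) {
        if (fake_gpu_ready(gpu))
            return 0;
        if (gpu->now_us >= deadline)
            return -ETIMEDOUT;
        gpu->now_us += sleep_us;   /* simulated sleep between polls */
    }
}

/* Usage example: the same slow device under a short and a long timeout. */
static void check_poll_examples(void)
{
    struct fake_gpu slow_a = { 0, 5000 };
    struct fake_gpu slow_b = { 0, 5000 };

    assert(poll_ready_timeout(&slow_a, 100, 1000) == -ETIMEDOUT);
    assert(poll_ready_timeout(&slow_b, 100, 20000) == 0);
}
```

The device behaves identically in both calls; only the caller's patience differs, which matches the observation that the errors were harmless.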
2020-10-09MAINTAINERS: Update entry for st7703 driver after the renameOndrej Jirman
The driver was renamed, change the path in the MAINTAINERS file. Signed-off-by: Ondrej Jirman <megous@megous.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://lore.kernel.org/lkml/20200701184640.1674969-1-megous@megous.com/#t
2020-10-07Revert "gpu/drm: ingenic: Add option to mmap GEM buffers cached"Paul Cercueil
This reverts commit 37054fc81443 ("gpu/drm: ingenic: Add option to mmap GEM buffers cached") At the very moment this commit was created, the DMA API it relied on was modified in the DMA tree, which caused the driver to break in linux-next. Revert it for now, and it will be resubmitted later to work with the new DMA API. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Acked-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20201004141758.1013317-1-paul@crapouillou.net
2020-10-05drm/amd/display: HDMI remote sink need mode validation for LinuxFangzhi Zuo
[Why] Currently mode validation is bypassed if a remote sink exists. That leads to mode set issues when a BW bottleneck exists in the link path, e.g., a DP-to-HDMI converter that only supports HDMI 1.4. Any invalid mode passed to Linux user space will cause the modeset to fail due to a limitation of the Linux user space implementation. [How] Mode validation is skipped only in the edid override case. For a real remote sink, the clock limit check should be done for an HDMI remote sink. Have HDMI related remote sinks go through mode validation to eliminate modes whose pixel clock exceeds the BW limitation. Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05drm/amd/display: Change to correct unit on audio rateChris Park
[Why] The formula expects kHz, while our driver operates in Hz. [How] Divide the audio rate by 1000 in the initial variable that is entered into the formula. Signed-off-by: Chris Park <Chris.Park@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Acked-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05drm/amd/display: Avoid set zero in the requested clkRodrigo Siqueira
[Why] Sometimes CRTCs can be disabled due to display unplugging or a temporary transition in userspace; in these circumstances, DCE tries to set the minimum clock threshold. When we have this situation, the function bw_calcs is invoked with number_of_displays set to zero, making DCE set dispclk_khz and sclk_khz to zero. For these reasons, we have seen some ATOM bios errors that look like: [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 5secs aborting [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing EA8A (len 761, WS 0, PS 0) @ 0xEABA [How] This error happens due to an attempt to optimize the bandwidth using the sclk, and the dispclk clock set to zero. Technically we handle this in the function dce112_set_clock, but we are not considering the case that this value is set to zero. This commit fixes this issue by ensuring that we never set a value below the minimum clock threshold. Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
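The guard described in the [How] part amounts to a lower-bound clamp. The sketch below shows that idea in isolation; the helper name and the kHz values in the usage example are hypothetical and do not come from the driver.

```c
#include <assert.h>

/* Hypothetical helper: never request a clock below the minimum threshold. */
static unsigned int clamp_requested_clk_khz(unsigned int requested_khz,
                                            unsigned int min_threshold_khz)
{
    return requested_khz < min_threshold_khz ? min_threshold_khz
                                             : requested_khz;
}

/* Usage example with made-up clock values. */
static void check_clamp_examples(void)
{
    /* With no displays the computed request is 0; it is raised to the floor. */
    assert(clamp_requested_clk_khz(0, 300000) == 300000);
    /* Requests above the floor pass through unchanged. */
    assert(clamp_requested_clk_khz(600000, 300000) == 600000);
}
```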
2020-10-05drm/amdgpu: align frag_end to covered address spaceAlex Sierra
Align frag_end to the next pd when there are no page table entries on the current pde. This fixes invalidation of larger address space areas where some page tables are allocated and others aren't. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05drm/amdgpu: fix NULL pointer dereference for RenoirDirk Gouders
Commit c1cf79ca5ced46 ("drm/amdgpu: use IP discovery table for renoir") introduced a NULL pointer dereference when booting with amdgpu.discovery=0, because it removed the call of vega10_reg_base_init() for that case. Fix this by calling that function if amdgpu_discovery == 0 in addition to the case that amdgpu_discovery_reg_base_init() failed. Fixes: c1cf79ca5ced46 ("drm/amdgpu: use IP discovery table for renoir") Signed-off-by: Dirk Gouders <dirk@gouders.net> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
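The control flow of this fix can be modelled outside the kernel. In the sketch below every name is a hypothetical stand-in (the real functions are vega10_reg_base_init() and amdgpu_discovery_reg_base_init(), whose signatures are not reproduced here); it only shows the "fall back when discovery is disabled or failed" condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state: true once a register base table is installed. */
struct fake_adev {
    bool reg_base_ready;
};

/* Stand-in for the discovery-table path; fails if the table is unusable. */
static int fake_discovery_reg_base_init(struct fake_adev *adev, bool table_valid)
{
    if (!table_valid)
        return -1;
    adev->reg_base_ready = true;
    return 0;
}

/* Stand-in for the static, hard-coded register base path. */
static void fake_static_reg_base_init(struct fake_adev *adev)
{
    adev->reg_base_ready = true;
}

static void fake_reg_base_init(struct fake_adev *adev, bool use_discovery,
                               bool table_valid)
{
    int r = -1;   /* counts as "not initialized" unless discovery succeeds */

    if (use_discovery)
        r = fake_discovery_reg_base_init(adev, table_valid);

    /* Fall back when discovery is disabled OR when it failed. */
    if (!use_discovery || r)
        fake_static_reg_base_init(adev);
}

/* Usage example: the discovery=0 case that used to be missed. */
static void check_reg_base_examples(void)
{
    struct fake_adev adev = { false };

    fake_reg_base_init(&adev, false, true);
    assert(adev.reg_base_ready);
}
```

Dropping the `!use_discovery` half of the condition reproduces the regression in this model: with discovery disabled nothing installs the register bases, and later accesses would go through uninitialized pointers.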
2020-10-02drm/vmwgfx: fix regression in thp code due to ttm init refactor.Dave Airlie
When I refactored this code with the new init paths, I failed to set the funcs back up properly, which caused a failure to bring up gdm properly. Fixes: 252f8d7b9174 ("drm/vmwgfx/ttm: convert vram mm init to new code paths") Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201001042012.13114-1-airlied@gmail.com
2020-10-01drm/amdgpu/swsmu: add interrupt work handler for smu11 partsAlex Deucher
We need to schedule the smu AC/DC interrupt ack to avoid potentially sleeping if the smu message mutex is contended. Fixes: e1188aacad1730 ("drm/amdgpu/smu11: add support for SMU AC/DC interrupts") Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-01drm/amdgpu/swsmu: add interrupt work functionAlex Deucher
So we can schedule work from interrupts. This might include long tasks or things that could sleep. Fixes: e1188aacad1730 ("drm/amdgpu/smu11: add support for SMU AC/DC interrupts") Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-01drm/amdgpu: enable GDDR6 save-restore support for navy_flounderHawking Zhang
Add mp0 11_0_11 for navy_flounder to the mem training supported list, otherwise modprobe would fail on navy_flounder with the latest vbios. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-01drm/amdgpu: support indirect access reg outside of mmio bar (v2)Hawking Zhang
support both direct and indirect accessor in unified helper functions. v2: Retire indirect mmio access via mm_index/data Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-01drm/amdgpu: switch to indirect reg access helperHawking Zhang
Switch WREG32/RREG32_PCIE to use indirect reg access helper for soc15 and onwards Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-01drm/amdgpu: add helper function for indirect reg access (v3)Hawking Zhang
Add helper functions in order to remove RREG32/WREG32 in the current pcie_rreg/wreg functions for soc15 and onwards adapters. PCIE_INDEX/DATA pairs are used to access registers outside of the mmio bar in the helper functions. The new helper functions help remove the recursion of amdgpu_mm_rreg/wreg from pcie_rreg/wreg and provide the opportunity to centralize direct and indirect access in a single function. v2: Fixed typo and refine the comments v3: Remove unnecessary volatile local variable Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
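The index/data access pattern mentioned here can be sketched with a simulated device. This is a minimal userspace model, not the amdgpu helpers: the struct, the function names and the 16-entry register file are hypothetical, and a real driver would presumably also need to serialize each index/data pair against concurrent users.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_REG_COUNT 16u

/* Hypothetical device whose registers are reachable only through a window. */
struct fake_asic {
    uint32_t index;                 /* models the PCIE_INDEX register */
    uint32_t regs[FAKE_REG_COUNT];  /* registers outside the MMIO BAR */
};

static uint32_t fake_indirect_rreg(struct fake_asic *asic, uint32_t reg)
{
    asic->index = reg % FAKE_REG_COUNT;   /* 1. select the target via INDEX */
    return asic->regs[asic->index];       /* 2. read its value via DATA */
}

static void fake_indirect_wreg(struct fake_asic *asic, uint32_t reg,
                               uint32_t val)
{
    asic->index = reg % FAKE_REG_COUNT;   /* 1. select the target via INDEX */
    asic->regs[asic->index] = val;        /* 2. write the value via DATA */
}

/* Usage example: a write followed by a read of the same register. */
static void check_indirect_examples(void)
{
    struct fake_asic asic = { 0, { 0 } };

    fake_indirect_wreg(&asic, 5, 0xdeadbeefu);
    assert(fake_indirect_rreg(&asic, 5) == 0xdeadbeefu);
}
```

Because each access is two steps that share the index register, the pair is not atomic by itself, which is one reason to keep all such accesses inside a single helper.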
2020-10-01drm: bridge: cdns-mhdp8546: fix compile warningTomi Valkeinen
On x64 we get: drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-core.c:751:10: warning: conversion from 'long unsigned int' to 'unsigned int' changes value from '18446744073709551613' to '4294967293' [-Woverflow] The registers are 32 bit, so fix by casting to u32. Fixes: fb43aa0acdfd ("drm: bridge: Add support for Cadence MHDP8546 DPI/DP bridge") Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Swapnil Jakhade <sjakhade@cadence.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200929091918.24813-1-tomi.valkeinen@ti.com
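The two numbers in the warning are consistent with a bit mask being inverted in a 64-bit type and then stored in a 32-bit one. The sketch below reproduces that arithmetic; that the mask is bit 1 is an inference from the value 18446744073709551613 (2^64 - 3), and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Clear bit 1 of a 32-bit register value. */
static uint32_t clear_bit1(uint32_t reg)
{
    /*
     * ~(1 << 1) evaluated in 64 bits is 18446744073709551613. The
     * explicit cast keeps the low 32 bits, 4294967293 (0xFFFFFFFD),
     * which is the value a 32-bit register actually needs.
     */
    uint32_t mask = (uint32_t)~(UINT64_C(1) << 1);

    return reg & mask;
}

/* Usage example. */
static void check_mask_examples(void)
{
    assert((uint32_t)~(UINT64_C(1) << 1) == UINT32_C(4294967293));
    assert(clear_bit1(UINT32_C(0xFFFFFFFF)) == UINT32_C(0xFFFFFFFD));
}
```

The cast does not change the value that ends up in the register; it makes the intended truncation explicit so the compiler no longer has to warn about it.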
2020-09-30drm/amd/amdkfd: Surface files in Sysfs to allow users to get number ofRamesh Errabolu
compute units that are in use. [Why] Allow user to know how many compute units (CU) are in use at any given moment. [How] Surface files in Sysfs that allow user to determine the number of compute units that are in use for a given process. One Sysfs file is used per device. Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-By: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30drm/amd/amdgpu: Define and implement a function that collects number ofRamesh Errabolu
waves that are in flight. [Why] Allow user to know how many compute units (CU) are in use at any given moment. [How] Read registers of SQ that give number of waves that are in flight of various queues. Use this information to determine number of CU's in use. Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-By: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30drm/i915: Avoid mixing integer types during batch copiesChris Wilson
Be consistent and use unsigned long throughout the chunk copies to avoid the inherent clumsiness of mixing integer types of different widths and signs. Failing to take account of a wider unsigned type when using min_t can lead to treating it as a negative, only for it to flip back to a large unsigned value after passing a boundary check. Fixes: ed13033f0287 ("drm/i915/cmdparser: Only cache the dst vmap") Testcase: igt/gen9_exec_parse/bb-large Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: "Candelaria, Jared" <jared.candelaria@intel.com> Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com> Cc: <stable@vger.kernel.org> # v4.9+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk (cherry picked from commit b7eeb2b4132ccf1a7d38f434cde7043913d1ed3c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
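The failure mode described here can be reproduced in userspace. The sketch below uses a simplified stand-in for the kernel's min_t() macro and a hypothetical 4096-byte chunk size; the function names are illustrative and the code is not taken from the i915 command parser.

```c
#include <assert.h>

/* Simplified userspace stand-in for the kernel's min_t(): cast both sides. */
#define min_t(type, a, b) ((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/*
 * Bug pattern: a length above INT_MAX turns negative when cast to int
 * (implementation-defined in ISO C; GCC and Clang wrap), wins the
 * "minimum", and becomes a huge value once it is unsigned again.
 */
static unsigned long chunk_len_mixed(unsigned long remaining)
{
    int len = min_t(int, remaining, 4096);

    return (unsigned long)len;
}

/* Consistent pattern: compare and return in the wide unsigned type. */
static unsigned long chunk_len_consistent(unsigned long remaining)
{
    return min_t(unsigned long, remaining, 4096);
}

/* Usage example with a length just above INT_MAX. */
static void check_min_t_examples(void)
{
    assert(chunk_len_mixed(0x80000001UL) > 4096UL);        /* bound bypassed */
    assert(chunk_len_consistent(0x80000001UL) == 4096UL);  /* bound enforced */
}
```

For small lengths both variants agree, which is why this kind of bug tends to surface only with unusually large batches such as the bb-large test case named in the commit.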
2020-09-30drm/i915/gem: Always test execution status on closing the contextChris Wilson
Verify that if a context is active at the time it is closed, that it is either persistent and preemptible (with hangcheck running) or it shall be removed from execution. Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs") Testcase: igt/gem_ctx_persistence/heartbeat-close Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-3-chris@chris-wilson.co.uk (cherry picked from commit d3bb2f9b5ee66d5e000293edd6b6575e59d11db9) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915/gt: Always send a pulse down the engine after disabling heartbeatChris Wilson
Currently, we check we can send a pulse prior to disabling the heartbeat to verify that we can change the heartbeat, but since we may re-evaluate execution upon changing the heartbeat interval we need another pulse afterwards to refresh execution. v2: Tvrtko asked if we could reduce the double pulse to a single, which opened up a discussion of how we should handle the pulse-error after attempting to change the property, and the desire to serialise adjustment of the property with its validating pulse, and unwind upon failure. Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-2-chris@chris-wilson.co.uk (cherry picked from commit 3dd66a94de59d7792e7917eb3075342e70f06f44) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915: Cancel outstanding work after disabling heartbeats on an engineChris Wilson
We only allow persistent requests to remain on the GPU past the closure of their containing context (and process) so long as they are continuously checked for hangs or allow other requests to preempt them, as we need to ensure forward progress of the system. If we allow persistent contexts to remain on the system after the hangcheck mechanism is disabled, the system may grind to a halt. On disabling the mechanism, we sent a pulse along the engine to remove all executing contexts from the engine which would check for hung contexts -- but we did not prevent those contexts from being resubmitted if they survived the final hangcheck. Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs") Testcase: igt/gem_ctx_persistence/heartbeat-stop Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-1-chris@chris-wilson.co.uk (cherry picked from commit 7a991cd3e3da9a56d5616b62d425db000a3242f2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915/gem: Hold request reference for canceling an active contextChris Wilson
We have to be very careful while walking the timeline->requests list under the RCU guard, as the requests (and so rq->link) use SLAB_TYPESAFE_BY_RCU and so the requests may be reallocated within an rcu grace period. As the requests are reallocated, they are removed from one list and placed on another, and if we are iterating over that request at that moment, the list iteration jumps from one list to the next and promptly gets confused. Verify we hold the request reference to ensure that the request is not added to a new list behind our backs. <4> [582.745252] general protection fault, probably for non-canonical address 0xcccccccccccccd5c: 0000 [#1] PREEMPT SMP PTI <4> [582.745297] CPU: 0 PID: 1475 Comm: gem_ctx_persist Not tainted 5.9.0-rc1-CI-CI_DRM_8908+ #1 <4> [582.745304] Hardware name: Intel Corporation NUC7CJYH/NUC7JYB, BIOS JYGLKCPX.86A.0027.2018.0125.1347 01/25/2018 <4> [582.745317] RIP: 0010:__lock_acquire+0x2c3/0x1f40 <4> [582.745323] Code: 00 65 8b 05 c7 8a ef 7e 85 c0 0f 85 b4 07 00 00 44 8b 9d c4 08 00 00 45 85 db 0f 84 0f 01 00 00 ba 05 00 00 00 e9 c8 06 00 00 <48> 81 3f c0 89 c7 82 b8 00 00 00 00 41 0f 45 c0 83 fe 01 41 89 c3 <4> [582.745334] RSP: 0018:ffffc9000461bc40 EFLAGS: 00010002 <4> [582.745340] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 <4> [582.745345] RDX: 0000000000000000 RSI: 0000000000000000 RDI: cccccccccccccd5c <4> [582.745350] RBP: ffff8881ec4a2880 R08: 0000000000000001 R09: 0000000000000001 <4> [582.745356] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 <4> [582.745361] R13: 0000000000000000 R14: 0000000000000000 R15: cccccccccccccd5c <4> [582.745367] FS: 00007fb44da78e40(0000) GS:ffff888278000000(0000) knlGS:0000000000000000 <4> [582.745373] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [582.745378] CR2: 00007fb44daad040 CR3: 0000000268428000 CR4: 0000000000350ef0 <4> [582.745383] Call Trace: <4> [582.745390] ? 
__lock_acquire+0x913/0x1f40
<4> [582.745397] lock_acquire+0xb5/0x3c0
<4> [582.745526] ? kill_engines+0x19a/0x4b0 [i915]
<4> [582.745533] ? find_held_lock+0x2d/0x90
<4> [582.745541] _raw_spin_lock_irq+0x30/0x40
<4> [582.745635] ? kill_engines+0x19a/0x4b0 [i915]
<4> [582.745727] kill_engines+0x19a/0x4b0 [i915]
<4> [582.745820] context_close+0x195/0x410 [i915]
<4> [582.745912] i915_gem_context_close+0x5b/0x160 [i915]
<4> [582.745994] i915_driver_postclose+0x14/0x40 [i915]
<4> [582.746003] drm_file_free.part.13+0x240/0x290
<4> [582.746009] drm_release_noglobal+0x16/0x50
<4> [582.746016] __fput+0xa5/0x250
<4> [582.746021] task_work_run+0x6e/0xb0
<4> [582.746028] exit_to_user_mode_prepare+0x178/0x180
<4> [582.746034] syscall_exit_to_user_mode+0x36/0x220
<4> [582.746040] entry_SYSCALL_64_after_hwframe+0x44/0xa9
<4> [582.746045] RIP: 0033:0x7fb44d1dc421
<4> [582.746050] Code: f7 d8 64 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 8b 05 ea cf 20 00 85 c0 75 16 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3f f3 c3 0f 1f 44 00 00 53 89 fb 48 83 ec 10
<4> [582.746062] RSP: 002b:00007ffed2e83818 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
<4> [582.746069] RAX: 0000000000000000 RBX: 0000556410bfe840 RCX: 00007fb44d1dc421
<4> [582.746075] RDX: 000000000000000a RSI: 00000000c0406469 RDI: 0000000000000008
<4> [582.746080] RBP: 0000000000000008 R08: 00007fb44d1c51cc R09: 00007fb44d1c5240
<4> [582.746086] R10: 0000000000000001 R11: 0000000000000246 R12: 00000000fffffffb
<4> [582.746091] R13: 0000000000000006 R14: 0000000000000000 R15: 000000000000000a
<4> [582.746099] Modules linked in: vgem mei_hdcp snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio btusb btrtl btbcm btintel x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul bluetooth ghash_clmulni_intel ecdh_generic ecc i915 r8169 realtek mei_me mei snd_hda_intel i2c_hid snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm pinctrl_geminilake pinctrl_intel prime_numbers [last unloaded: test_drm_mm]

Fixes: 736e785f9b28 ("drm/i915/gem: Reduce context termination list iteration guard to RCU")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200925101107.27869-2-chris@chris-wilson.co.uk
(cherry picked from commit badef44deff1fae8d21c5c1cfc4dde95fb5bf993)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915: Redo "Remove i915_request.lock requirement for execution callbacks"Chris Wilson
The reordering and rebasing of commit 2e4c6c1a9db5 ("drm/i915: Remove i915_request.lock requirement for execution callbacks") caused it to revert an earlier correction. Let us restore commit 99f0a640d464 ("drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs").

Fixes: 2e4c6c1a9db5 ("drm/i915: Remove i915_request.lock requirement for execution callbacks")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200925101107.27869-1-chris@chris-wilson.co.uk
(cherry picked from commit 35faeb7de9ef83da510a048f2016061f1e31d5fc)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915/gem: Serialise debugfs i915_gem_objects with ctx->mutexChris Wilson
Since the debugfs may peek into the GEM contexts as the corresponding client/fd is being closed, we may try and follow a dangling pointer. However, the context closure itself is serialised with the ctx->mutex, so if we hold that mutex as we inspect the state coupled in the context, we know the pointers within the context are stable and will remain valid as we inspect their tables.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200723172119.17649-3-chris@chris-wilson.co.uk
(cherry picked from commit 102f5aa491f262c818e607fc4fee08a724a76c69)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
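The serialisation described above can be sketched in userspace with a pthread mutex. This is only an illustration of the locking pattern, not the i915 code: `struct ctx`, `ctx_init()`, `ctx_close()` and `ctx_inspect_sum()` are hypothetical names, and the integer table stands in for whatever state the debugfs reader walks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a GEM context; not the i915 structure. */
struct ctx {
	pthread_mutex_t mutex;	/* plays the role of ctx->mutex */
	const int *table;	/* state torn down on close; NULL once closed */
	size_t count;
};

static void ctx_init(struct ctx *c, const int *table, size_t count)
{
	assert(c != NULL);
	pthread_mutex_init(&c->mutex, NULL);
	c->table = table;
	c->count = count;
}

/* Close path: tears the table down while holding the mutex. */
static void ctx_close(struct ctx *c)
{
	pthread_mutex_lock(&c->mutex);
	c->table = NULL;
	c->count = 0;
	pthread_mutex_unlock(&c->mutex);
}

/*
 * Inspection path: takes the same mutex, so the table either is fully
 * present or already gone, and never disappears in the middle of the walk.
 */
static long ctx_inspect_sum(struct ctx *c)
{
	long sum = 0;

	pthread_mutex_lock(&c->mutex);
	for (size_t i = 0; i < c->count; i++)
		sum += c->table[i];
	pthread_mutex_unlock(&c->mutex);

	return sum;
}
```

Because both paths take the same lock, a reader that races with the close sees either the complete table or an empty one.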
2020-09-30drm/i915: check i915_vm_alloc_pt_stash for errorsMatthew Auld
If we are really unlucky and encounter an error during i915_vm_alloc_pt_stash, we end up passing an empty pt/pd stash all the way down into the low-level ppgtt alloc code, leading to explosions, since it expects at least the required number of pt/pd for the va range.

[ 211.981418] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 211.981421] #PF: supervisor read access in kernel mode
[ 211.981422] #PF: error_code(0x0000) - not-present page
[ 211.981424] PGD 80000008439cb067 P4D 80000008439cb067 PUD 84a37f067 PMD 0
[ 211.981427] Oops: 0000 [#1] SMP PTI
[ 211.981428] CPU: 1 PID: 1301 Comm: i915_selftest Tainted: G U I 5.9.0-rc5+ #3
[ 211.981430] Hardware name: /NUC6i7KYB, BIOS KYSKLi70.86A.0050.2017.0831.1924 08/31/2017
[ 211.981521] RIP: 0010:__gen8_ppgtt_alloc+0x1ed/0x3c0 [i915]
[ 211.981523] Code: c1 48 c7 c7 5d 5d fe c0 65 ff 0d ee 1d 03 3f e8 d9 91 1f e2 8b 55 c4 31 c0 48 8b 75 b8 85 d2 0f 95 c0 48 8b 1c c6 48 89 45 98 <48> 8b 03 48 8b 90 58 02 00 00 48 85 d2 0f 84 07 ea 15 00 48 81 fa
[ 211.981526] RSP: 0018:ffffba2cc0eb3970 EFLAGS: 00010202
[ 211.981527] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000004
[ 211.981529] RDX: 0000000000000002 RSI: ffff9be998bdb8c0 RDI: ffff9be99c844300
[ 211.981530] RBP: ffffba2cc0eb39d8 R08: 0000000000000640 R09: ffff9be97cdfd000
[ 211.981531] R10: ffff9be97cdfd614 R11: 0000000000000000 R12: 0000000000000000
[ 211.981532] R13: ffff9be98607ba20 R14: ffff9be995a0b400 R15: ffffba2cc0eb39e8
[ 211.981534] FS: 00007f0f10b31000(0000) GS:ffff9be99fc40000(0000) knlGS:0000000000000000
[ 211.981536] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 211.981538] CR2: 0000000000000000 CR3: 000000084d74e006 CR4: 00000000003706e0
[ 211.981539] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 211.981541] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 211.981542] Call Trace:
[ 211.981609] gen8_ppgtt_alloc+0x79/0x90 [i915]
[ 211.981678] ppgtt_bind_vma+0x36/0x80 [i915]
[ 211.981756] __vma_bind+0x39/0x40 [i915]
[ 211.981818] fence_work+0x21/0x98 [i915]
[ 211.981879] fence_notify+0x8d/0x128 [i915]
[ 211.981939] __i915_sw_fence_complete+0x62/0x240 [i915]
[ 211.982018] i915_vma_pin_ww+0x1ee/0x9c0 [i915]

Fixes: cd0452aa2a0d ("drm/i915: Preallocate stashes for vma page-directories")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921160844.73186-1-matthew.auld@intel.com
(cherry picked from commit 1604cb2aa7fafd83e11f9257f765a5f5dd7c19d3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915: Fix uninitialised variable in intel_context_create_request.Maarten Lankhorst
If the backoff fails with an error, we return an undefined rq. Assign err to rq so that the error is returned correctly.

Fixes: 8a929c9eb1c2 ("drm/i915: Use ww pinning for intel_context_create_request()")
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200918111208.1392128-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 4316b19dee27cc5cd34a95fdbc0a3a5237507701)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915: Break up error capture compression loops with cond_resched()Chris Wilson
As the error capture will compress user buffers as directed by the user, it can take an arbitrary amount of time and space. Break up the compression loops with a call to cond_resched(), which will allow other processes to schedule (avoiding the soft lockups) and also serve as a warning should we try to make this loop atomic in the future.

Testcase: igt/gem_exec_capture/many-*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200916090059.3189-2-chris@chris-wilson.co.uk
(cherry picked from commit 293f43c80c0027ff9299036c24218ac705ce584e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
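As a loose userspace analogue of breaking up a long loop, the sketch below processes a buffer page by page and offers the CPU to other tasks after each page. It is not the i915 compression code: `checksum_in_chunks()` is a hypothetical name, the checksum stands in for the real per-page work, and `sched_yield()` stands in for cond_resched(), which in the kernel only reschedules when a reschedule is pending.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Sum the bytes of buf in chunks of `page` bytes, yielding between chunks
 * so that one very large buffer cannot monopolise the CPU.
 */
static unsigned long checksum_in_chunks(const unsigned char *buf, size_t len,
					size_t page)
{
	unsigned long sum = 0;

	assert(page > 0);	/* a zero chunk size would never make progress */

	for (size_t off = 0; off < len; off += page) {
		size_t end = len - off < page ? len : off + page;

		for (size_t i = off; i < end; i++)
			sum += buf[i];

		sched_yield();	/* userspace stand-in for cond_resched() */
	}

	return sum;
}
```

The yield sits between chunks rather than inside the inner loop, so the scheduling overhead is paid once per page instead of once per byte.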
2020-09-30drm/i915: Fix an error code in i915_gem_object_copy_blt()Dan Carpenter
This code should use "vma[1]" instead of "vma". The "vma" variable itself is a valid pointer, not an error pointer, so it does not carry the error code.

Fixes: 6b05030496f7 ("drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200911075243.GG12635@kadam
(cherry picked from commit 68ba71e3ae6dd86a23486655e33c5f8c9bd90777)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
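The class of bug fixed here can be reproduced in a small userspace sketch. It assumes, for illustration only, that "vma" is a two-element array of pointers; `pin_pair()`, `lookup_vma()` and `struct vma` are hypothetical names, and the `ERR_PTR()` helpers are simplified stand-ins for the kernel's.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct vma { int id; };	/* hypothetical placeholder type */

/* Hypothetical lookup that fails with -ENOMEM on request. */
static struct vma *lookup_vma(struct vma *backing, int fail)
{
	return fail ? ERR_PTR(-ENOMEM) : backing;
}

/* Returns 0 on success or a negative errno. */
static long pin_pair(int fail_second)
{
	static struct vma src, dst;
	struct vma *vma[2];

	vma[0] = lookup_vma(&src, 0);
	if (IS_ERR(vma[0]))
		return PTR_ERR(vma[0]);

	vma[1] = lookup_vma(&dst, fail_second);

	/* The array itself is an ordinary address, never an error pointer. */
	assert(!IS_ERR(vma));

	if (IS_ERR(vma[1]))
		return PTR_ERR(vma[1]);	/* PTR_ERR(vma) would return a bogus value */

	return 0;
}
```

Returning `PTR_ERR(vma)` in the last check would hand the caller the array's address reinterpreted as an error code, which is why the element has to be named explicitly.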