path: root/drivers
Age | Commit message | Author
2023-10-20 | drm/amd/display: clean up some inconsistent indenting | Jiapeng Chong
No functional modification involved. drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:2902 dm_resume() warn: inconsistent indenting. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6940 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20 | drm/amd/display: Simplify bool conversion | Yang Li
./drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c:4802:84-89: WARNING: conversion to bool not needed here Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6901 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20 | drm/amd/display: Remove unneeded semicolon | Yang Li
./drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c:464:3-4: Unneeded semicolon Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6900 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20 | drm/amd/display: Remove duplicated include in dce110_hwseq.c | Yang Li
./drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c: dce110_hwseq.h is included more than once. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6897 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20 | drm/amd/display: clean up some inconsistent indentings | Yang Li
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn35/dcn35_fpu.c:261 dcn35_update_bw_bounding_box_fpu() warn: inconsistent indenting Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20 | drm/amd/pm: Handle non-terminated overdrive commands. | Bas Nieuwenhuizen
The incoming strings might not be terminated by a newline or a 0. (found while testing a program that just wrote the string itself, causing a crash) Cc: stable@vger.kernel.org Fixes: e3933f26b657 ("drm/amd/pp: Add edit/commit/show OD clock/voltage support in sysfs") Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
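The failure mode described here is typical of handlers that receive a buffer plus an explicit byte count: nothing guarantees a trailing newline or a terminating 0. The following is a minimal userspace sketch of a defensive copy, not the driver's actual fix; `copy_command` and its return convention are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the amdgpu code): copy a length-delimited
 * input buffer into dst and always NUL-terminate the result, even when
 * src carries neither a trailing newline nor a terminating 0.
 * Returns the length of the string stored in dst, or -1 if it does not fit.
 */
long copy_command(char *dst, size_t dst_size, const char *src, size_t count)
{
    /* Only look at the first `count` bytes; stop early at an embedded NUL. */
    const char *nul = memchr(src, '\0', count);
    size_t len = nul ? (size_t)(nul - src) : count;

    if (len >= dst_size)
        return -1;              /* no room for the data plus the NUL */

    memcpy(dst, src, len);
    dst[len] = '\0';            /* terminate explicitly; never trust src */

    /* Drop a single trailing newline if the writer supplied one. */
    if (len > 0 && dst[len - 1] == '\n')
        dst[--len] = '\0';

    return (long)len;
}
```

A caller would parse `dst` only after this returns a non-negative length, so the parser never reads past the bytes that were actually written.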
2023-10-20 | drm/amdgpu: Enable software RAS in vcn v4_0_3 | Hawking Zhang
Set VCN/JPEG RAS masks to enable software RAS for VCN and JPEG. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20 | drm/amdgpu: define ras_reset_error_count function | Tao Zhou
Simplify the code architecture. v2: reuse ras_reset_error_count in ras_reset_error_status. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20 | drm/amdkfd:remove unused code | Jesse Zhang
Function svm_range_split_by_granularity is not used, so it is removed. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Suggested-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20 | drm/amd/pm: Support for getting power1_cap_min value | Ma Jun
Support for getting power1_cap_min value on smu13 and smu11. For other ASICs, we still use 0 as the default value. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20 | drm/amdgpu: Log UE corrected by replay as correctable error | Candice Li
Support replay mode where UE could be converted to CE. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amdgpu: Reserve fences for VM update | Felix Kuehling
In amdgpu_dma_buf_move_notify reserve fences for the page table updates in amdgpu_vm_clear_freed and amdgpu_vm_handle_moved. This fixes a BUG_ON in dma_resv_add_fence when using SDMA for page table updates. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amdgpu: Fix possible null pointer dereference | Felix Kuehling
abo->tbo.resource may be NULL in amdgpu_vm_bo_update. Fixes: 180253782038 ("drm/ttm: stop allocating dummy resources during BO creation") Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amdgpu: Workaround to skip kiq ring test during ras gpu recovery | Stanley.Yang
This is a workaround: the kiq ring test fails in the suspend stage during RAS recovery. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amd/display: Fix a handful of spelling mistakes in dml_print output | Colin Ian King
There are a few spelling mistakes and a minor grammatical issue in some dml_print messages. Fix these. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amdkfd: clean up some inconsistent indenting | Jiapeng Chong
No functional modification involved. drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:305 svm_range_free() warn: inconsistent indenting. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6804 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amd/display: Remove brackets in macro to conform to coding style | Stylon Wang
[Why] Many of the register macros defined in dcn32_resource.h have extra brackets. This does not conform to the style of those defined in other DC header files. [How] Remove these brackets in dcn32_resource.h Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amd: Read IMU FW version from scratch register during hw_init | Mario Limonciello
If the IMU version wasn't discovered from the header, such as when the firmware was directly loaded by PSP then there is no firmware version to show to userspace from sysfs or IOCTL. The IMU F/W stores the version in the first scratch register though, so fetch it in these cases to let the driver export. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amd: Don't parse IMU ucode version if it won't be loaded | Mario Limonciello
When the IMU ucode is loaded by the PSP, the version parsed from the copy that comes from Linux may not match. Rather than showing the wrong data to kernel interface consumers, avoid populating it in this case. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amd: Move microcode init step to early_init() | Mario Limonciello
The intention for early init is to find any missing microcode early and fail the driver load if it's missing. Move this step to earlier in driver init to match other IP blocks. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amdgpu: update retry times for psp BL wait | Asad Kamal
Increase the retry time for the PSP BL wait to compensate for the longer time needed to set the c2pmsg 35 ready bit during mode1 with RAS. Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
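The change above is about a bounded polling loop: the driver reads a status register until a ready bit appears or the retry budget runs out. A rough sketch of that pattern follows, with simulated hardware. Every name here (`wait_for_ready`, `fake_reg`, `fake_read`, `READY_BIT`) is hypothetical and does not come from the PSP code.

```c
#include <stdint.h>

#define READY_BIT 0x80000000u   /* assumption: the ready flag is the top bit */

typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Poll until (register & ready_mask) is non-zero or max_retries reads
 * have been made. Returns 0 on success, -1 on timeout.
 */
int wait_for_ready(read_reg_fn read_reg, void *ctx,
                   uint32_t ready_mask, unsigned int max_retries)
{
    for (unsigned int i = 0; i < max_retries; i++) {
        if (read_reg(ctx) & ready_mask)
            return 0;
        /* A real driver would delay or sleep here between polls. */
    }
    return -1;
}

/* Simulated register: reports "not ready" for a fixed number of reads. */
struct fake_reg {
    unsigned int polls_until_ready;
};

uint32_t fake_read(void *ctx)
{
    struct fake_reg *reg = ctx;

    if (reg->polls_until_ready > 0) {
        reg->polls_until_ready--;
        return 0;
    }
    return READY_BIT;
}
```

With a device that needs five polls before becoming ready, a budget of three retries times out while a budget of ten succeeds, which is the kind of gap a larger retry limit closes.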
2023-10-19 | drm/amd: Add missing kernel doc for prepare_suspend() | Mario Limonciello
prepare_suspend() is intended to be used for any IP blocks that must allocate memory during the suspend sequence. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/all/20231017143555.6a6450fc@canb.auug.org.au/ Fixes: cb11ca3233aa ("drm/amd: Add concept of running prepare_suspend() sequence for IP blocks") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amdgpu: update to the latest GC 11.5 headers | Alex Deucher
Add some additional bitfields. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amd/pm: Fix a memory leak on an error path | Kunwu.Chan
Add missing free on an error path. Fixes: 511a95552ec8 ("drm/amd/pm: Add SMU 13.0.6 support") Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Kunwu.Chan <chentao@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
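Leaks of this kind usually come from an early return that skips freeing an earlier allocation. The sketch below shows the common goto-based unwind pattern in plain C. It illustrates the general idiom rather than the SMU 13.0.6 code, and `struct tables`, `tables_init` and `tables_free` are hypothetical names.

```c
#include <stdlib.h>

struct tables {
    int *metrics;
    int *limits;
};

/*
 * Allocate both tables. Returns 0 on success and -1 on failure.
 * On failure, nothing is left allocated.
 */
int tables_init(struct tables *t, size_t n_metrics, size_t n_limits)
{
    t->metrics = NULL;
    t->limits = NULL;

    t->metrics = calloc(n_metrics, sizeof(*t->metrics));
    if (!t->metrics)
        goto err;

    t->limits = calloc(n_limits, sizeof(*t->limits));
    if (!t->limits)
        goto err_free_metrics;   /* the step a leaky version forgets */

    return 0;

err_free_metrics:
    free(t->metrics);
    t->metrics = NULL;
err:
    return -1;
}

void tables_free(struct tables *t)
{
    free(t->limits);
    free(t->metrics);
    t->limits = NULL;
    t->metrics = NULL;
}
```

Each label undoes exactly the allocations that succeeded before the failing step, so adding a third allocation later only requires one more label.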
2023-10-19 | drm/amdgpu/mes11: remove aggregated doorbell code | Alex Deucher
It's not enabled in hardware so the code is dead. Remove it. Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amdgpu : Add hive ras recovery check | Asad Kamal
If one of the devices in the hive detects a fatal error, a RAS recovery reset message needs to be sent to the PMFW of all devices in the hive. For that, add a flag in the hive to indicate that it is undergoing RAS recovery. Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amd/display: Add missing lines of code in dc.c | Stylon Wang
[Why & How] A critical part of "drm/amd/display: Fix windowed MPO video with ODM combine for DCN32" was lost during promotion to upstream. This patch adds the code back to dc.c. Signed-off-by: Stylon Wang <stylon.wang@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | Revert "drm/amdgpu: Program xcp_ctl registers as needed" | Mangesh Gadre
This reverts commit 0bdebfef3fb2b6291000765eaa9c6c8030293fce. XCP_CTL register is programmed by firmware and register access is protected. Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amdgpu/umsch: add suspend and resume callback | Lang Yu
Add missing IP callbacks. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 | drm/amdgpu/pm: update SMU 13.0.0 PMFW version check | Alex Deucher
Update the PMFW version check for the ROCm optimizations. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-18 | Merge tag 'amd-drm-next-6.7-2023-10-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-next | Dave Airlie
amd-drm-next-6.7-2023-10-13:

amdgpu:
- DC replay fixes
- Misc code cleanups and spelling fixes
- Documentation updates
- RAS EEPROM Updates
- FRU EEPROM Updates
- IP discovery updates
- SR-IOV fixes
- RAS updates
- DC PQ fixes
- SMU 13.0.6 updates
- GC 11.5 Support
- NBIO 7.11 Support
- GMC 11 Updates
- Reset fixes
- SMU 11.5 Updates
- SMU 13.0 OD support
- Use flexible arrays for bo list handling
- W=1 Fixes
- SubVP fixes
- DPIA fixes
- DCN 3.5 Support
- Devcoredump fixes
- VPE 6.1 support
- VCN 4.0 Updates
- S/G display fixes
- DML fixes
- DML2 Support
- MST fixes
- VRR fixes
- Enable seamless boot in more cases
- Enable content type property for HDMI
- OLED fixes
- Rework and clean up GPUVM TLB flushing
- DC ODM fixes
- DP 2.x fixes
- AGP aperture fixes
- SDMA firmware loading cleanups
- Cyan Skillfish GPU clock counter fix
- GC 11 GART fix
- Cache GPU fault info for userspace queries
- DC cursor check fixes
- eDP fixes
- DC FP handling fixes
- Variable sized array fixes
- SMU 13.0.x fixes
- IB start and size alignment fixes for VCN
- SMU 14 Support
- Suspend and resume sequence rework
- vkms fix

amdkfd:
- GC 11 fixes
- GC 10 fixes
- Doorbell fixes
- CWSR fixes
- SVM fixes
- Clean up GC info enumeration
- Rework memory limit handling
- Coherent memory handling fixes
- Use partial migrations in GPU faults
- TLB flush fixes
- DMA unmap fixes
- GC 9.4.3 fixes
- SQ interrupt fix
- GTT mapping fix
- GC 11.5 Support

radeon:
- Misc code cleanups
- W=1 Fixes
- Fix possible buffer overflow
- Fix possible NULL pointer dereference

UAPI:
- Add EXT_COHERENT memory allocation flags. These allow for system scope atomics.
  Proposed userspace: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88
- Add support for new VPE engine. This is a memory to memory copy engine with advanced scaling, CSC, and color management features
  Proposed mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25713
- Add INFO IOCTL interface to query GPU faults
  Proposed Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238
  Proposed libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/298

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231013175758.1735031-1-alexander.deucher@amd.com
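The amdgpu item "Use flexible arrays for bo list handling" in this pull refers to a C99 feature: a struct whose last member is an unsized array, so that the header and its elements live in one allocation. The sketch below is a userspace illustration with hypothetical types (`bo_entry`, `bo_list`, `bo_list_create`), not the amdgpu structures. Kernel code would typically size such an allocation with the `struct_size()` helper instead of the manual overflow check used here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct bo_entry {
    unsigned int handle;
    unsigned int priority;
};

struct bo_list {
    size_t num_entries;
    struct bo_entry entries[];   /* flexible array member, sized at allocation */
};

/* Allocate a zeroed list with room for n entries, or return NULL. */
struct bo_list *bo_list_create(size_t n)
{
    struct bo_list *list;

    /* Reject sizes whose byte count would overflow size_t. */
    if (n > (SIZE_MAX - sizeof(*list)) / sizeof(list->entries[0]))
        return NULL;

    list = calloc(1, sizeof(*list) + n * sizeof(list->entries[0]));
    if (!list)
        return NULL;

    list->num_entries = n;
    return list;
}
```

Compared with a one-element placeholder array, the flexible member lets the compiler and bounds-checking tools know that the array's real length is not fixed at compile time.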
2023-10-17 | Merge tag 'drm-habanalabs-next-2023-10-10' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into drm-next | Dave Airlie
This tag contains habanalabs driver changes for v6.7.

The notable changes are:

- uAPI changes:
  - Expose tsc clock sampling to better sync clock information in profiler.
  - Enhance engine error reporting in the info ioctl.
  - Block access to the eventfd operations through the control device.
  - Disable the option of the user to register multiple times with the same offset for timestamp dump by the driver. If a user wants to use the same offset in the timestamp buffer for different interrupt, it needs to first de-register the offset.
  - When exporting dma-buf (for p2p), force the user to specify size/offset in multiples of PAGE_SIZE. This is instead of the driver doing the rounding to PAGE_SIZE, which has caused the driver to map more memory than was intended by the user.

- New features and improvements:
  - Complete the move of the driver to the accel subsystem by removing the custom habanalabs class and major and registering to accel subsystem.
  - Move the firmware interface files to include/linux/habanalabs. This is a pre-requisite for upstreaming the NIC drivers of Gaudi (as they need to include those files).
  - Perform device hard-reset upon PCIe AXI drain event to prevent the failure from cascading to different IP blocks in the SoC. In secured environments, this is done automatically by the firmware.
  - Print device name when it is removed for better debuggability.
  - Add support for trace of dma map sgtable operations.
  - Optimize handling of user interrupts by splitting the interrupts to two lists. One list for fast handling and second list for handling with timestamp recording, which is slower.
  - Prevent double device hard-reset due to 2 adjacent H/W events.
  - Set device status 'malfunction' while in rmmod.

- Firmware related fixes:
  - Extend preboot timeout because preboot loading might take longer than expected in certain cases.
  - Add a protection mechanism for the Event Queue. In case it is full, the firmware will be able to notify about it through a dedicated interrupt.
  - Perform device hard-reset in case scrubbing of memory has failed.

- Bug fixes and code cleanups:
  - Small fixes of dma-buf handling in Gaudi2, such as handling an offset != 0, using the correct exported size, creation of sg table.
  - Fix spmu mask creation.
  - Fix bug in wait for cs completion for decoder workloads.
  - Cleanup Greco name from documentation.
  - Fix bug in recording timestamp during cs completion interrupt handling.
  - Fix CoreSight ETF configuration and flush logic.
  - Fix small bug in hpriv_list handling (the list that contains the private data per process that opens our device).

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Oded Gabbay <ogabbay@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/ZSUfiX4J7v4Wn0cU@ogabbay-vm-u22.habana-labs.com
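The dma-buf export change in the uAPI list above replaces silent rounding with validation. A small sketch of the difference follows, assuming 4 KiB pages. `EXAMPLE_PAGE_SIZE`, `export_range_is_aligned` and `round_up_to_page` are illustrative names and are not part of the habanalabs uAPI.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/* New behaviour: accept only page-multiple offset and size. */
bool export_range_is_aligned(uint64_t offset, uint64_t size)
{
    return (offset % EXAMPLE_PAGE_SIZE == 0) &&
           (size % EXAMPLE_PAGE_SIZE == 0);
}

/* Old behaviour: round the size up, which can map more than was asked for. */
uint64_t round_up_to_page(uint64_t size)
{
    return (size + EXAMPLE_PAGE_SIZE - 1) / EXAMPLE_PAGE_SIZE
           * EXAMPLE_PAGE_SIZE;
}
```

A request for 5000 bytes shows the problem with rounding: the rounded size is 8192 bytes, so 3192 bytes the caller never asked for would be mapped, whereas the validating check simply rejects the request.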
2023-10-17 | Merge tag 'drm-intel-gt-next-2023-10-12' of git://anongit.freedesktop.org/drm/drm-intel into drm-next | Dave Airlie
Driver Changes:

Fixes/improvements/new stuff:
- Register engines early to avoid type confusion (Mathias Krause)
- Suppress 'ignoring reset notification' message [guc] (John Harrison)
- Update 'recommended' version to 70.12.1 for DG2/ADL-S/ADL-P/MTL [guc] (John Harrison)
- Enable WA 14018913170 [guc, dg2] (Daniele Ceraolo Spurio)

Future platform enablement:
- Clean steer semaphore on resume (Nirmoy Das)
- Skip MCR ops for ring fault register [mtl] (Nirmoy Das)
- Make i915_gem_shrinker multi-gt aware [gem] (Jonathan Cavitt)
- Enable GGTT updates with binder in MTL (Nirmoy Das, Chris Wilson)
- Invalidate the TLBs on each GT (Chris Wilson)

Miscellaneous:
- Clarify type evolution of uabi_node/uabi_engines (Mathias Krause)
- Annotate struct ct_incoming_msg with __counted_by [guc] (Kees Cook)
- More use of GT specific print helpers [gt] (John Harrison)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZSfKotZVdypU6NaX@tursulin-desk
2023-10-16 | Merge tag 'drm-intel-next-2023-10-12' of git://anongit.freedesktop.org/drm/drm-intel into drm-next | Dave Airlie
drm/i915 feature pull #2 for v6.7:

Features and functionality:
- Preparation for i915 display code reuse in upcoming Xe driver (Jani)
- Drop the fastboot module parameter and use the platform defaults (Arun)
- Enable new LNL FBC features (Vinod)
- Add LNL display feature capability reads (Vinod)

Refactoring and cleanups:
- Locally enable W=1 warnings by default in i915 (Jani)
- Move HDCP GSC message code to a separate file (Suraj)
- GVT include cleanups (Jani)
- Move more display init under display/ (Jani)
- DPLL ID refactoring (Ville)
- Better abstraction of GT0 (Jani)
- Move VGA decode function to GMCH code (Uma)
- Use local64_try_cmpxchg() to optimize PMU event read (Uros Bizjak)
- Clean up FBC checks (Ville)
- Constify and unify state checker calling conventions (Ville)
- Add display step name helper (Chaitanya)

Documentation:
- Update CCS and GSC CS documentation (Rodrigo)
- Fix a number of documentation typos (Randy Dunlap)

Fixes:
- VLV DSI fixes and quirks (Hans)
- Fix crtc state memory leaks (Suraj)
- Increase LSPCON mode settle timeout (Niko Tsirakis)
- Stop clobbering old crtc state during state check (Ville)
- Fix VLV color state readout (Ville)
- Fix cx0 PHY pipe reset to allow S0iX (Khaled)
- Ensure DP MST pbn_div is up-to-date after sink reconnect (Imre)
- Drop an unnecessary NULL check to fix static analyzer warning (Suraj)
- Use an explicit rather than implicit include for frontbuffer tracking (Jouni)

Merges:
- Backmerge drm-next to fix a conflict (Jani)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87r0m00xew.fsf@intel.com
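The "Use local64_try_cmpxchg() to optimize PMU event read" item in this pull relies on a property of the try_cmpxchg family: on failure, the helper writes the value it observed back into the caller's "old" variable, so a retry loop needs no separate re-read. C11's `atomic_compare_exchange_weak` has the same contract, which makes a userspace sketch possible. The function name and the counter semantics below are assumptions for illustration, not the i915 PMU code.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Publish new_raw as the "previous" sample and return how far the
 * counter advanced since the last published sample.
 */
uint64_t update_prev_and_get_delta(_Atomic uint64_t *prev_count,
                                   uint64_t new_raw)
{
    uint64_t prev = atomic_load(prev_count);

    /*
     * On failure, compare_exchange refreshes `prev` with the current
     * value, so the loop body has nothing to reload.
     */
    while (!atomic_compare_exchange_weak(prev_count, &prev, new_raw))
        ;

    return new_raw - prev;
}
```

With a plain cmpxchg that only returns the old value, the loop would have to compare and reassign `prev` by hand on every iteration; the try variant folds that bookkeeping into the call.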
2023-10-13 | drm/amdgpu/vkms: fix a possible null pointer dereference | Ma Ke
In amdgpu_vkms_conn_get_modes(), the return value of drm_cvt_mode() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_cvt_mode(). Add a check to avoid null pointer dereference. Signed-off-by: Ma Ke <make_ruc2021@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
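The bug class here is an allocating constructor whose NULL return is used without a check. The sketch below shows the guarded pattern with stand-in types. `make_mode` is a hypothetical substitute for an allocator such as drm_cvt_mode(), and skipping a failed entry with `continue` is an assumption about how a caller might react, not a description of the actual patch.

```c
#include <stdlib.h>

struct mode {
    int hdisplay;
    int vdisplay;
};

/* Stand-in allocator that may return NULL; failure can be simulated. */
struct mode *make_mode(int width, int height, int simulate_failure)
{
    struct mode *m;

    if (simulate_failure)
        return NULL;

    m = malloc(sizeof(*m));
    if (!m)
        return NULL;

    m->hdisplay = width;
    m->vdisplay = height;
    return m;
}

/*
 * Build up to `max` modes into `out` and return how many were added.
 * The entry at index fail_index (if any) is made to fail.
 */
int add_modes(struct mode **out, int max, int fail_index)
{
    static const int sizes[][2] = { {640, 480}, {800, 600}, {1024, 768} };
    int added = 0;

    for (int i = 0; i < 3 && added < max; i++) {
        struct mode *m = make_mode(sizes[i][0], sizes[i][1],
                                   i == fail_index);
        if (!m)
            continue;   /* the added check: never touch a failed allocation */

        out[added++] = m;
    }
    return added;
}
```

Without the check, the code following the allocation would dereference `m` to fill in or register the mode, which is where the crash would occur.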
2023-10-13 | drm/amd/swsmu: update smu v14_0_0 header files and metrics table | Li Ma
Update driver if, pmfw and ppsmc header files. Add new gpu_metrics_v3_0 for metrics table updated in driver if and reserve legacy metrics table to maintain backward compatibility. --- v1: Update header files and add gpu_metrics_v3_0. v2: Update smu_types.h, smu headers and drop smu_cmn_get_smc_version in smu v14_0_0. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 | drm/amdgpu: add RAS error info support for umc_v12_0 | Yang Wang
add RAS error info support for umc_v12_0. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 | drm/amdgpu: add RAS error info support for mmhub_v1_8 | Yang Wang
add RAS error info support for mmhub_v1_8. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 | drm/amdgpu: add RAS error info support for gfx_v9_4_3 | Yang Wang
add RAS error info support for gfx_v9_4_3. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 | drm/amdgpu: add RAS error info support for sdma_v4_4_2. | Yang Wang
add RAS error info support for sdma_v4_4_2. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 | drm/amdgpu: add ras_err_info to identify RAS error source | Yang Wang
introduced "ras_err_info" to better identify a RAS ERROR source. NOTE: For legacy chips, keep the original RAS error print format. v1: RAS errors may come from different dies during a RAS error query, therefore, need a new data structure to identify the source of RAS ERROR. v2: - use new data structure 'amdgpu_smuio_mcm_config_info' instead of ras_err_id (in v1 patch) - refine ras error dump function name - refine ras error dump log format Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 | drm/amdgpu: flush the correct vmid tlb for specific pasid | Yifan Zhang
flush the correct vmid tlb for specific pasid on gmc 11. Fixes: 041a5743883d ("drm/amdgpu: fix and cleanup gmc_v11_0_flush_gpu_tlb_pasid") Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 | drm/amdgpu: make err_data structure built-in for ras_manager | Yang Wang
(No effect outside the ras_mgr data structure) Since a new member was added to the ras_err_data data structure, it becomes unreasonable for the ras_mgr instance to contain this data, because ras mgr only uses the 2 member information of ue_count/ce_count in err_data. This patch changes the code err_data into built-in structure members, making the code directly compatible. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 | drm/amdgpu: disable GFXOFF and PG during compute for GFX9 | Jesse Zhang
Temporary workaround to fix issues observed in some compute applications when GFXOFF is enabled on GFX9. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 | drm/amdgpu/umsch: fix missing stuff during rebase | Lang Yu
These are missed during rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 | drm/amdgpu/umsch: correct IP version format | Lang Yu
FW uses IP_VERSION_MAJ_MIN_REV format. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 | drm/amdgpu: don't use legacy invalidation on MMHUB v3.3 | Lang Yu
Legacy invalidation is not supported. This is missed during rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 | drm/amdgpu: correct NBIO v7.11 programing | Lang Yu
v7.7 was used before; switch to v7.11 now. Fix incorrect programming. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 | drm/radeon: fix a possible null pointer dereference | Ma Ke
In radeon_tv_get_modes(), the return value of drm_cvt_mode() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_cvt_mode(). Add a check to avoid null pointer dereference. Signed-off-by: Ma Ke <make_ruc2021@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 | drm/amdgpu: Correctly use bo_va->ref_count in compute VMs | Xiaogang Chen
This is needed to correctly handle BOs imported into compute VM from gfx. Both kfd and gfx should use the same bo_va and set bo_va->ref_count correctly when mapping the BOs into the same VM; otherwise we may trigger a kernel general protection fault when iterating mappings over bo_va's valids or invalids list. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Xiaogang Chen <Xiaogang.Chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Tested-by: Xiaogang Chen <Xiaogang.Chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>