summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-03-22drm/amdgpu/nv: Apply ASPM quirk on Intel ADL + AMD NaviKai-Heng Feng
S2idle resume freeze can be observed on Intel ADL + AMD WX5500. This is caused by commit 0064b0ce85bb ("drm/amd/pm: enable ASPM by default"). The root cause is still not clear for now. So extend and apply the ASPM quirk from commit e02fe3bc7aba ("drm/amdgpu: vi: disable ASPM on Intel Alder Lake based systems"), to workaround the issue on Navi cards too. Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2458 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display: remove outdated 8bpc commentsAlex Hung
[Why] The commit c76e483cd916 ("drm/amd/display: Don't restrict bpc to 8 bpc") removes the historical 8bpc dependency and sets max_bpc to 16. [How] The comment that states "8bpc for non-edp" needs to be removed as well. Reviewed-by: Harry Wentland <Harry.Wentland@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display/dc/core/dc_stat: Convert a couple of doc headers to ↵Lee Jones
kerneldoc format Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stat.c:38: warning: Cannot understand ***************************************************************************** drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stat.c:76: warning: Cannot understand ***************************************************************************** Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Mustapha Ghaddar <mghaddar@amd.com> Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Cc: Jasdeep Dhillon <jdhillon@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display: Implement workaround for writing to OTG_PIXEL_RATE_DIV registerSaaem Rizvi
[Why and How] Current implementation requires FPGA builds to take a different code path from DCN32 to write to OTG_PIXEL_RATE_DIV. Now that we have a workaround to write to OTG_PIXEL_RATE_DIV register without blanking display on hotplug on DCN32, we can allow the code paths for FPGA to be exactly the same allowing for more consistent testing. Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Saaem Rizvi <SyedSaaem.Rizvi@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amdgpu: Initialize umc ras callbackHawking Zhang
Fix a coding error which results to null interrupt handler for umc ras. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display/dc/link/link_detection: Demote a couple of kerneldoc abusesLee Jones
Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_detection.c:877: warning: Function parameter or member 'link' not described in 'detect_link_and_local_sink' drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_detection.c:877: warning: Function parameter or member 'reason' not described in 'detect_link_and_local_sink' drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_detection.c:1232: warning: Function parameter or member 'link' not described in 'dc_link_detect_connection_type' Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Lee Jones <lee@kernel.org> Cc: Wenjing Liu <wenjing.liu@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display/dc/dce60/Makefile: Fix previous attempt to silence known ↵Lee Jones
override-init warnings Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:157:21: note: in expansion of macro ‘mmCRTC1_DCFE_MEM_LIGHT_SLEEP_CNTL’ drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_transform.h:170:9: note: in expansion of macro ‘SRI’ drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:183:17: note: in expansion of macro ‘XFM_COMMON_REG_LIST_DCE60’ drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:188:17: note: in expansion of macro ‘transform_regs’ drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_6_0_d.h:722:43: warning: initialized field overwritten [-Woverride-init] drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:157:21: note: in expansion of macro ‘mmCRTC2_DCFE_MEM_LIGHT_SLEEP_CNTL’ drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_transform.h:170:9: note: in expansion of macro ‘SRI’ drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:183:17: note: in expansion of macro ‘XFM_COMMON_REG_LIST_DCE60’ drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:189:17: note: in expansion of macro ‘transform_regs’ drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_6_0_d.h:722:43: note: (near initialization for ‘xfm_regs[2].DCFE_MEM_LIGHT_SLEEP_CN [100 lines snipped for brevity] Fixes: ceb3cf476a441 ("drm/amd/display/dc/dce60/Makefile: Ignore -Woverride-init warning") Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Mauro Rossi <issor.oruam@gmail.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display/dc/link/protocols/link_dp_capability: Demote non-compliant ↵Lee Jones
kerneldoc Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_capability.c:2190: warning: Function parameter or member 'link' not described in 'dc_link_is_dp_sink_present' Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display/dc/link/protocols/link_dp_capability: Remove unused variable ↵Lee Jones
and mark another as __maybe_unused ‘ds_port’ is clearly not used anywhere and ‘result_write_min_hblank’ is only utilised when debugging is enabled. The alternative would be to allocate the variable under the same clause as the debugging code, but that would become very messy, very quickly. Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_capability.c: In function ‘dp_wa_power_up_0010FA’: drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_capability.c:280:42: warning: variable ‘ds_port’ set but not used [-Wunused-but-set-variable] drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_capability.c: In function ‘dpcd_set_source_specific_data’: drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_capability.c:1296:32: warning: variable ‘result_write_min_hblank’ set but not used [-Wunused-but-set-variable] Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Wenjing Liu <wenjing.liu@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display/dc/link/protocols/link_dp_training: Remove set but unused ↵Lee Jones
variable 'result' Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_training.c: In function ‘perform_link_training_with_retries’: drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_training.c:1586:38: warning: variable ‘result’ set but not used [-Wunused-but-set-variable] Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Wenjing Liu <wenjing.liu@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display/dc/link/link_detection: Remove unused variable 'status'Lee Jones
Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_detection.c: In function ‘query_hdcp_capability’: drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_detection.c:501:42: warning: variable ‘status’ set but not used [-Wunused-but-set-variable] Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Wenjing Liu <wenjing.liu@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display/dc/dce/dmub_psr: Demote kerneldoc abuseLee Jones
Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dmub_psr.c:257: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: David Zhang <dingchen.zhang@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display/amdgpu_dm/amdgpu_dm_helpers: Move defines out to where they ↵Lee Jones
are actually used Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/../display/include/ddc_service_types.h: At top level: drivers/gpu/drm/amd/amdgpu/../display/include/ddc_service_types.h:143:22: warning: ‘SYNAPTICS_DEVICE_ID’ defined but not used [-Wunused-const-variable=] drivers/gpu/drm/amd/amdgpu/../display/include/ddc_service_types.h:140:22: warning: ‘DP_VGA_LVDS_CONVERTER_ID_3’ defined but not used [-Wunused-const-variable=] drivers/gpu/drm/amd/amdgpu/../display/include/ddc_service_types.h:138:22: warning: ‘DP_VGA_LVDS_CONVERTER_ID_2’ defined but not used [-Wunused-const-variable=] drivers/gpu/drm/amd/amdgpu/../display/include/ddc_service_types.h:133:22: warning: ‘DP_SINK_DEVICE_STR_ID_2’ defined but not used [-Wunused-const-variable=] drivers/gpu/drm/amd/amdgpu/../display/include/ddc_service_types.h:132:22: warning: ‘DP_SINK_DEVICE_STR_ID_1’ defined but not used [-Wunused-const-variable=] [snip 400 similar lines brevity] Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/pm/swsmu/smu11/vangogh_ppt: Provide a couple of missing parameter ↵Lee Jones
descriptions Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:2381: warning: Function parameter or member 'residency' not described in 'vangogh_get_gfxoff_residency' drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:2399: warning: Function parameter or member 'entrycount' not described in 'vangogh_get_gfxoff_entrycount' Cc: Evan Quan <evan.quan@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Li Ma <li.ma@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/amdgpu/amdgpu_vce: Provide description for ↵Lee Jones
amdgpu_vce_validate_bo()'s 'p' param Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:599: warning: Function parameter or member 'p' not described in 'amdgpu_vce_validate_bo' v2: reword (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/amdgpu/amdgpu_mes: Ensure amdgpu_bo_create_kernel()'s return value ↵Lee Jones
is checked Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c: In function ‘amdgpu_mes_ctx_alloc_meta_data’: drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1099:13: warning: variable ‘r’ set but not used [-Wunused-but-set-variable] Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Jack Xiao <Jack.Xiao@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/amdgpu/ih_v6_0: Repair misspelling and provide descriptions for 'ih'Lee Jones
Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/ih_v6_0.c:392: warning: Function parameter or member 'ih' not described in 'ih_v6_0_get_wptr' drivers/gpu/drm/amd/amdgpu/ih_v6_0.c:432: warning: Function parameter or member 'ih' not described in 'ih_v6_0_irq_rearm' drivers/gpu/drm/amd/amdgpu/ih_v6_0.c:458: warning: Function parameter or member 'ih' not described in 'ih_v6_0_set_rptr' Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/amdgpu/gmc_v11_0: Provide a few missing param descriptions relating ↵Lee Jones
to hubs and flushes Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:282: warning: Function parameter or member 'vmhub' not described in 'gmc_v11_0_flush_gpu_tlb' drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:282: warning: Function parameter or member 'flush_type' not described in 'gmc_v11_0_flush_gpu_tlb' drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:322: warning: Function parameter or member 'flush_type' not described in 'gmc_v11_0_flush_gpu_tlb_pasid' drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:322: warning: Function parameter or member 'all_hub' not described in 'gmc_v11_0_flush_gpu_tlb_pasid' Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/amdgpu/amdgpu_vm_pt: Supply description for ↵Lee Jones
amdgpu_vm_pt_free_dfs()'s unlocked param Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c:683: warning: Function parameter or member 'unlocked' not described in 'amdgpu_vm_pt_free_dfs' Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/amdgpu/amdgpu_ucode: Remove unused function ↵Lee Jones
‘amdgpu_ucode_print_imu_hdr’ Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c:129:6: warning: no previous prototype for ‘amdgpu_ucode_print_imu_hdr’ [-Wmissing-prototypes] Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/amdgpu/amdgpu_device: Provide missing kerneldoc entry for ↵Lee Jones
'reset_context' Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:5152: warning: Function parameter or member 'reset_context' not described in 'amdgpu_device_gpu_recover' Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display/dc/dc_hdmi_types: Move string definition to the only file ↵Lee Jones
it's used in Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/../display/dc/dc_hdmi_types.h:53:22: warning: ‘dp_hdmi_dongle_signature_str’ defined but not used [-Wunused-const-variable=] [snipped 400 similar lines for brevity] Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Wenjing Liu <wenjing.liu@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amdgpu/jpeg: enable jpeg v4_0 for sriovJane Jian
- skip direct jpeg registers read&write since it is not allowed - reset Doorbell range layout for sriov Signed-off-by: Jane Jian <Jane.Jian@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amdgpu: Adding CAP firmware initializationBill Liu
Added CAP firmware initialization for PSP v13.0.10 under psp_init_sriov_microcode Signed-off-by: Bill Liu <Bill.Liu@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display: Remove the unused function link_timing_bandwidth_kbps()Jiapeng Chong
The function link_timing_bandwidth_kbps is defined in the link_validation.c file, but not called elsewhere, so remove this unused function. drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_validation.c:258:10: warning: no previous prototype for ‘link_timing_bandwidth_kbps’. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4547 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display: make is_synaptics_cascaded_panamera staticJiapeng Chong
This symbol is not used outside of amdgpu_dm_mst_types.c, so marks it static. drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:211:6: warning: no previous prototype for ‘is_synaptics_cascaded_panamera’. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4548 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display/amdgpu_dm: Refactor register_backlight_device()Hans de Goede
Refactor register_backlight_device(): 1) Turn the connector-type + signal check into an early exit condition to avoid the indentation level of the rest of the code 2) Add an array bounds check for the arrays indexed by dm->num_of_edps 3) register_backlight_device() always increases dm->num_of_edps if amdgpu_dm_register_backlight_device() has assigned a backlight_dev to the current dm->backlight_link[dm->num_of_edps] slot. So on its next call dm->backlight_dev[dm->num_of_edps] always point to the next empty slot and the "if (!dm->backlight_dev[dm->num_of_edps])" check will thus always succeed and can be removed. 4) Add a bl_idx local variable to use as array index, rather then using dm->num_of_edps to improve the code readability. Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amd/display/amdgpu_dm: Fix backlight_device_register() error handlingHans de Goede
backlight_device_register() returns an ERR_PTR on error, but other code such as amdgpu_dm_connector_destroy() assumes dm->backlight_dev[i] is NULL if no backlight is registered. Clear dm->backlight_dev[i] on registration failure, to avoid other code trying to deref an ERR_PTR pointer. Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amdgpu/gfx: set cg flags to enter/exit safe modeJane Jian
sriov needs to enter/exit safe mode in update umd p state add the cg flag to let it enter or exit while needed Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobsYuBiao Wang
[Why] For engines not supporting soft reset, i.e. VCN, there will be a failed ib test before mode 1 reset during asic reset. The fences in this case are never signaled and next time when we try to free the sa_bo, kernel will hang. [How] During pre_asic_reset, driver will clear job fences and afterwards the fences' refcount will be reduced to 1. For drm_sched_jobs it will be released in job_free_cb, and for non-sched jobs like ib_test, it's meant to be released in sa_bo_free but only when the fences are signaled. So we have to force signal the non_sched bad job's fence during pre_asic_reset or the clear is not complete. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amdgpu: add mes resume when do gfx post soft resetTong Liu01
[why] when gfx do soft reset, mes will also do reset, if mes is not resumed when do recover from soft reset, mes is unable to respond in later sequence [how] resume mes when do gfx post soft reset Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amdgpu: skip ASIC reset for APUs when go to S4Tim Huang
For GC IP v11.0.4/11, PSP TMR need to be reserved for ASIC mode2 reset. But for S4, when psp suspend, it will destroy the TMR that fails the ASIC reset. [ 96.006101] amdgpu 0000:62:00.0: amdgpu: MODE2 reset [ 100.409717] amdgpu 0000:62:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000011 SMN_C2PMSG_82:0x00000002 [ 100.411593] amdgpu 0000:62:00.0: amdgpu: Mode2 reset failed! [ 100.412470] amdgpu 0000:62:00.0: PM: pci_pm_freeze(): amdgpu_pmops_freeze+0x0/0x50 [amdgpu] returns -62 [ 100.414020] amdgpu 0000:62:00.0: PM: dpm_run_callback(): pci_pm_freeze+0x0/0xd0 returns -62 [ 100.415311] amdgpu 0000:62:00.0: PM: pci_pm_freeze+0x0/0xd0 returned -62 after 4623202 usecs [ 100.416608] amdgpu 0000:62:00.0: PM: failed to freeze async: error -62 We can skip the reset on APUs, assuming we can resume them properly. Verified on some GFX11, GFX10 and old GFX9 APUs. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amdgpu: reposition the gpu reset checking for reuseTim Huang
Move the amdgpu_acpi_should_gpu_reset out of CONFIG_SUSPEND to share it with hibernate case. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22drm/amdgpu: drop the extra sign extensionAlex Deucher
amdgpu_bo_gpu_offset_no_check() already calls amdgpu_gmc_sign_extend() so no need to call it twice. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-21net/sonic: use dma_mapping_error() for error checkZhang Changzhong
The DMA address returned by dma_map_single() should be checked with dma_mapping_error(). Fix it accordingly. Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update") Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@linux-m68k.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/6645a4b5c1e364312103f48b7b36783b94e197a2.1679370343.git.fthain@linux-m68k.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21Merge branch 'fix-trainwreck-with-ocelot-switch-statistics-counters'Jakub Kicinski
Vladimir Oltean says: ==================== Fix trainwreck with Ocelot switch statistics counters While testing the patch set for preemptible traffic classes with some controlled traffic and measuring counter deltas: https://lore.kernel.org/netdev/20230220122343.1156614-1-vladimir.oltean@nxp.com/ I noticed that in the output of "ethtool -S swp0 --groups eth-mac eth-phy eth-ctrl rmon -- --src emac | grep -v ': 0'", the TX counters were off. Quickly I realized that their values were permutated by 1 compared to their names, and that for example tx-rmon-etherStatsPkts64to64Octets was incrementing when tx-rmon-etherStatsPkts65to127Octets should have. Initially I suspected something having to do with the bulk reading logic, and indeed I found a bug there (fixed as 1/3), but that was not the source of the problems. Instead it revealed other problems. While dumping the regions created by the driver on my switch, I figured out that it sees a discontinuity which shouldn't have existed between reg 0x278 and reg 0x280. Discontinuity between last reg 0x0 and new reg 0x0, creating new region Discontinuity between last reg 0x108 and new reg 0x200, creating new region Discontinuity between last reg 0x278 and new reg 0x280, creating new region Discontinuity between last reg 0x2b0 and new reg 0x400, creating new region region of 67 contiguous counters starting with SYS:STAT:CNT[0x000] region of 31 contiguous counters starting with SYS:STAT:CNT[0x080] region of 13 contiguous counters starting with SYS:STAT:CNT[0x0a0] region of 18 contiguous counters starting with SYS:STAT:CNT[0x100] That is where TX_MM_HOLD should have been, and that was the bug, since it was missing. After adding it, the regions look like this and the off-by-one issue is resolved: Discontinuity between last reg 0x000000 and new reg 0x000000, creating new region Discontinuity between last reg 0x000108 and new reg 0x000200, creating new region Discontinuity between last reg 0x0002b0 and new reg 0x000400, creating new region region of 67 contiguous counters starting with SYS:STAT:CNT[0x000] region of 45 contiguous counters starting with SYS:STAT:CNT[0x080] region of 18 contiguous counters starting with SYS:STAT:CNT[0x100] However, as I am thinking out loud, it should have not reported the other counters as off by one even when skipping TX_MM_HOLD... after all, on Ocelot/Seville, there are more counters which need to be skipped. Which is when I investigated and noticed the bug solved in 2/3. I've validated that both on native VSC9959 (which uses ocelot_mm_stats_layout) as well as by faking the other switches by making VSC9959 use the plain ocelot_stats_layout. To summarize: on all Ocelot switches, the TX counters and drop counters are completely broken. The RX counters are mostly fine. With this occasion, I have collected more cleanup patches in this area, which I'm going to submit after the net -> net-next merge. ==================== Link: https://lore.kernel.org/r/20230321010325.897817-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21net: mscc: ocelot: add TX_MM_HOLD to ocelot_mm_stats_layoutVladimir Oltean
The lack of a definition for this counter is what initially prompted me to investigate a problem which really manifested itself as the previous change, "net: mscc: ocelot: fix transfer from region->buf to ocelot->stats". When TX_MM_HOLD is defined in enum ocelot_stat but not in struct ocelot_stat_layout ocelot_mm_stats_layout, this creates a hole, which due to the aforementioned bug, makes all counters following TX_MM_HOLD be recorded off by one compared to their correct position. So for example, a non-zero TX_PMAC_OCTETS would be reported as TX_MERGE_FRAGMENTS, TX_PMAC_UNICAST would be reported as TX_PMAC_OCTETS, TX_PMAC_64 would be reported as TX_PMAC_PAUSE, etc etc. This is because the size of the hole (1) is much smaller than the size of the region, so the phenomenon where the stats are off-by-one, rather than lost, prevails. However, the phenomenon where stats are lost can be seen too, for example with DROP_LOCAL, which is at the beginning of its own region (offset 0x000400 vs the previous 0x0002b0 constitutes a discontinuity). This is also reported as off by one and saved to TX_PMAC_1527_MAX, but that counter is not reported to the unstructured "ethtool -S", as opposed to DROP_LOCAL which is (as "drop_local"). Fixes: ab3f97a9610a ("net: mscc: ocelot: export ethtool MAC Merge stats for Felix VSC9959") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21net: mscc: ocelot: fix transfer from region->buf to ocelot->statsVladimir Oltean
To understand the problem, we need some definitions. The driver is aware of multiple counters (enum ocelot_stat), yet not all switches supported by the driver implement all counters. There are 2 statistics layouts: ocelot_stats_layout and ocelot_mm_stats_layout, the latter having 36 counters more than the former. ocelot->stats[] is not a compact array, i.e. there are elements within it which are not going to be populated for ocelot_stats_layout. On the other hand, ocelot->stats[] is easily indexable, for example "tx_octets" for port 3 can be found at ocelot->stats[3 * OCELOT_NUM_STATS + OCELOT_STAT_TX_OCTETS], and that is why we keep it sparse. Regions, as created by ocelot_prepare_stats_regions(), are compact (every element from region->buf will correspond to a counter that is present in this switch's layout) but are not easily indexable. Let's define holes as the ranges of values of enum ocelot_stat for which ocelot_stats_layout doesn't have a "reg" defined. For example, there is a hole between OCELOT_STAT_RX_GREEN_PRIO_7 and OCELOT_STAT_TX_OCTETS which is of 23 elements that are only present on ocelot_mm_stats_layout, and as such, they are also present in enum ocelot_stat. Let's define the left extremity of the hole - the last enum ocelot_stat still defined - as A (in this case OCELOT_STAT_RX_GREEN_PRIO_7) and the right extremity - the first enum ocelot_stat that is defined after a series of undefined ones - as B (in this case OCELOT_STAT_TX_OCTETS). There is a bug in the procedure which transfers stats from region->buf[] to ocelot->stats[]. For each hole in the ocelot_stats_layout, the logic transfers the stats starting with enum ocelot_stat B to ocelot->stats[] index A + 1. So all stats after a hole are saved to a position which is off by B - A + 1 elements. This causes 2 kinds of issues: (a) counters which shouldn't increment increment (b) counters which should increment don't Holes in the ocelot_stat_layout automatically imply the end of a region and the beginning of a new one; however the reverse is not necessarily true. For example, for ocelot_mm_stat_layout, there could be multiple regions (which indicate discontinuities in register addresses) while there is no hole (which indicates discontinuities in enum ocelot_stat values). In the example above, the stats from the second region->buf[] are not transferred to ocelot->stats starting with index "port * OCELOT_NUM_STATS + OCELOT_STAT_TX_OCTETS" as they should, but rather, starting with element "port * OCELOT_NUM_STATS + OCELOT_STAT_RX_GREEN_PRIO_7 + 1". That stats[] array element is not reported to user space for switches that use ocelot_stat_layout, and that is how issue (b) occurs. However, if the length of the second region is larger than the hole, then some stats will start to be transferred to the ocelot->stats[] indices which *are* reported to user space, but those indices contain wrong values (corresponding to unexpected counters). This is how issue (a) occurs. The procedure, as it was introduced in commit d87b1c08f38a ("net: mscc: ocelot: use bulk reads for stats"), was not buggy, because there were no holes in the struct ocelot_stat_layout instances at that time. The problem is that when those holes were introduced, the function was not updated to take them into consideration. To update the procedure, we need to know, for each region, which enum ocelot_stat corresponds to its region->base. We have no way of deducing that based on the contents of struct ocelot_stats_region, so we need to add this information. Fixes: ab3f97a9610a ("net: mscc: ocelot: export ethtool MAC Merge stats for Felix VSC9959") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21net: mscc: ocelot: fix stats region batchingVladimir Oltean
The blamed commit changed struct ocelot_stat_layout :: "u32 offset" to "u32 reg". However, "u32 reg" is not quite a register address, but an enum ocelot_reg, which in itself encodes an enum ocelot_target target in the upper bits, and an index into the ocelot->map[target][] array in the lower bits. So, whereas the previous code comparison between stats_layout[i].offset and last + 1 was correct (because those "offsets" at the time were 32-bit relative addresses), the new code, comparing layout[i].reg to last + 4 is not correct, because the "reg" here is an enum/index, not an actual register address. What we want to compare are indeed register addresses, but to do that, we need to actually go through the same motions as __ocelot_bulk_read_ix() itself. With this bug, all statistics counters are deemed by ocelot_prepare_stats_regions() as constituting their own region. (Truncated) log on VSC9959 (Felix) below (prints added by me): Before: region of 1 contiguous counters starting with SYS:STAT:CNT[0x000] region of 1 contiguous counters starting with SYS:STAT:CNT[0x001] region of 1 contiguous counters starting with SYS:STAT:CNT[0x002] ... region of 1 contiguous counters starting with SYS:STAT:CNT[0x041] region of 1 contiguous counters starting with SYS:STAT:CNT[0x042] region of 1 contiguous counters starting with SYS:STAT:CNT[0x080] region of 1 contiguous counters starting with SYS:STAT:CNT[0x081] ... region of 1 contiguous counters starting with SYS:STAT:CNT[0x0ac] region of 1 contiguous counters starting with SYS:STAT:CNT[0x100] region of 1 contiguous counters starting with SYS:STAT:CNT[0x101] ... region of 1 contiguous counters starting with SYS:STAT:CNT[0x111] After: region of 67 contiguous counters starting with SYS:STAT:CNT[0x000] region of 45 contiguous counters starting with SYS:STAT:CNT[0x080] region of 18 contiguous counters starting with SYS:STAT:CNT[0x100] Since commit d87b1c08f38a ("net: mscc: ocelot: use bulk reads for stats") intended bulking as a performance improvement, and since now, with trivial-sized regions, performance is even worse than without bulking at all, this could easily qualify as a performance regression. Fixes: d4c367650704 ("net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Colin Foster <colin.foster@in-advantage.com> Tested-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21erspan: do not use skb_mac_header() in ndo_start_xmit()Eric Dumazet
Drivers should not assume skb_mac_header(skb) == skb->data in their ndo_start_xmit(). Use skb_network_offset() and skb_transport_offset() which better describe what is needed in erspan_fb_xmit() and ip6erspan_tunnel_xmit() syzbot reported: WARNING: CPU: 0 PID: 5083 at include/linux/skbuff.h:2873 skb_mac_header include/linux/skbuff.h:2873 [inline] WARNING: CPU: 0 PID: 5083 at include/linux/skbuff.h:2873 ip6erspan_tunnel_xmit+0x1d9c/0x2d90 net/ipv6/ip6_gre.c:962 Modules linked in: CPU: 0 PID: 5083 Comm: syz-executor406 Not tainted 6.3.0-rc2-syzkaller-00866-gd4671cb96fa3 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023 RIP: 0010:skb_mac_header include/linux/skbuff.h:2873 [inline] RIP: 0010:ip6erspan_tunnel_xmit+0x1d9c/0x2d90 net/ipv6/ip6_gre.c:962 Code: 04 02 41 01 de 84 c0 74 08 3c 03 0f 8e 1c 0a 00 00 45 89 b4 24 c8 00 00 00 c6 85 77 fe ff ff 01 e9 33 e7 ff ff e8 b4 27 a1 f8 <0f> 0b e9 b6 e7 ff ff e8 a8 27 a1 f8 49 8d bf f0 0c 00 00 48 b8 00 RSP: 0018:ffffc90003b2f830 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 000000000000ffff RCX: 0000000000000000 RDX: ffff888021273a80 RSI: ffffffff88e1bd4c RDI: 0000000000000003 RBP: ffffc90003b2f9d8 R08: 0000000000000003 R09: 000000000000ffff R10: 000000000000ffff R11: 0000000000000000 R12: ffff88802b28da00 R13: 00000000000000d0 R14: ffff88807e25b6d0 R15: ffff888023408000 FS: 0000555556a61300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055e5b11eb6e8 CR3: 0000000027c1b000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __netdev_start_xmit include/linux/netdevice.h:4900 [inline] netdev_start_xmit include/linux/netdevice.h:4914 [inline] __dev_direct_xmit+0x504/0x730 net/core/dev.c:4300 dev_direct_xmit include/linux/netdevice.h:3088 [inline] packet_xmit+0x20a/0x390 net/packet/af_packet.c:285 packet_snd net/packet/af_packet.c:3075 [inline] packet_sendmsg+0x31a0/0x5150 net/packet/af_packet.c:3107 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg+0xde/0x190 net/socket.c:747 __sys_sendto+0x23a/0x340 net/socket.c:2142 __do_sys_sendto net/socket.c:2154 [inline] __se_sys_sendto net/socket.c:2150 [inline] __x64_sys_sendto+0xe1/0x1b0 net/socket.c:2150 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f123aaa1039 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 b1 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffc15d12058 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f123aaa1039 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000020000040 R09: 0000000000000014 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f123aa648c0 R13: 431bde82d7b634db R14: 0000000000000000 R15: 0000000000000000 Fixes: 1baf5ebf8954 ("erspan: auto detect truncated packets.") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230320163427.8096-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21atm: idt77252: fix kmemleak when rmmod idt77252Li Zetao
There are memory leaks reported by kmemleak: unreferenced object 0xffff888106500800 (size 128): comm "modprobe", pid 1017, jiffies 4297787785 (age 67.152s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000970ce626>] __kmem_cache_alloc_node+0x20c/0x380 [<00000000fb5f78d9>] kmalloc_trace+0x2f/0xb0 [<000000000e947e2a>] idt77252_init_one+0x2847/0x3c90 [idt77252] [<000000006efb048e>] local_pci_probe+0xeb/0x1a0 ... unreferenced object 0xffff888106500b00 (size 128): comm "modprobe", pid 1017, jiffies 4297787785 (age 67.152s) hex dump (first 32 bytes): 00 20 3d 01 80 88 ff ff 00 20 3d 01 80 88 ff ff . =...... =..... f0 23 3d 01 80 88 ff ff 00 20 3d 01 00 00 00 00 .#=...... =..... backtrace: [<00000000970ce626>] __kmem_cache_alloc_node+0x20c/0x380 [<00000000fb5f78d9>] kmalloc_trace+0x2f/0xb0 [<00000000f451c5be>] alloc_scq.constprop.0+0x4a/0x400 [idt77252] [<00000000e6313849>] idt77252_init_one+0x28cf/0x3c90 [idt77252] The root cause is traced to the vc_maps which alloced in open_card_oam() are not freed in close_card_oam(). The vc_maps are used to record open connections, so when close a vc_map in close_card_oam(), the memory should be freed. Moreover, the ubr0 is not closed when close a idt77252 device, leading to the memory leak of vc_map and scq_info. Fix them by adding kfree in close_card_oam() and implementing new close_card_ubr0() to close ubr0. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Francois Romieu <romieu@fr.zoreil.com> Link: https://lore.kernel.org/r/20230320143318.2644630-1-lizetao1@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21hwmon (it87): Fix voltage scaling for chips with 10.9mV ADCsFrank Crawford
Fix voltage scaling for chips that have 10.9mV ADCs, where scaling was not performed. Fixes: ead8080351c9 ("hwmon: (it87) Add support for IT8732F") Signed-off-by: Frank Crawford <frank@crawford.emu.id.au> Link: https://lore.kernel.org/r/20230318080543.1226700-2-frank@crawford.emu.id.au [groeck: Update subject and description to focus on bug fix] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-03-22Merge tag 'drm-habanalabs-next-2023-03-20' of ↵Dave Airlie
https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into drm-next This tag contains habanalabs driver and accel changes for v6.4: - uAPI changes: - Add opcodes to the CS ioctl to allow user to stall/resume specific engines inside Gaudi2. This is to allow the user to perform power testing/measurements when training different topologies. - Expose in the INFO ioctl the amount of device memory that the driver and f/w reserve for themselves. - Expose in the INFO ioctl a bit-mask of the available rotator engines in Gaudi2. This is to align with other engines that are already exposed. - Expose in the INFO ioctl the register's address of the f/w that should be used to trigger interrupts from within the user's code running in the compute engines. - Add a critical-event bit in the eventfd bitmask so the user will know the event that was received was critical, and a reset will now occur - Expose in the INFO ioctl two new opcodes to fetch information on h/w and f/w events. The events recorded are the events that were reported in the eventfd. - New features and improvements: - Add a dedicated interrupt ID in MSI-X in the device to the notification of an unexpected user-related event in Gaudi2. Handle it in the driver by reporting this event. - Allow the user to fetch the device memory current usage even when the device is undergoing compute-reset (a reset type that only clears the compute engines). - Enable graceful reset mechanism for compute-reset. This will give the user a few seconds before the device is reset. For example, the user can, during that time, perform certain device operations (dump data for debug) or close the device in an orderly fashion. - Align the decoder with the rest of the engines in regard to notification to the user about interrupts and in regard to performing graceful reset when needed (instead of immediate reset). - Add support for assert interrupt from the TPC engine. - Get the reset type that is necessary to perform per event from the auto-generated irq_map array. - Print the specific reason why a device is still in use when notifying to the user about it (after the user closed the device's FD). - Move to threaded IRQ when handling interrupts of workload completions. - Firmware related fixes: - Fix RAZWI event handler to match newest f/w version. - Read error cause register in dma core events because the f/w doesn't do that. - Increase maximum time to wait for completion of Gaudi2 reset due to f/w bug. - Align to the latest firmware specs. - Enforce the release order of the compute device and dma-buf. i.e increment the device file refcount for any dma-buf that was exported for that device. This will make sure the compute device release function won't be called until the user closes all the FDs of the relevant dma-bufs. Without this change, closing the device's FD before/without closing the dma-buf's FD would always lead to hard-reset of the device. - Fix a link in the drm documentation to correctly point to the accel section. - Compilation warnings cleanups - Misc bug fixes and code cleanups Signed-off-by: Dave Airlie <airlied@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQEzBAABCgAdFiEE7TEboABC71LctBLFZR1NuKta54AFAmQYfcAACgkQZR1NuKta # 54DB4Af/SuiHZkVXwr+yHPv9El726rz9ZQD7mQtzNmehWGonwAvz15yqocNMUSbF # JbqE/vrZjvbXrP1Uv5UrlRVdnFHSPV18VnHU4BMS/WOm19SsR6vZ0QOXOoa6/AUb # w+kF3D//DbFI4/mTGfpH5/pzwu51ti8aVktosPFlHIa8iI8CB4/4IV+ivQ8UW4oK # HyDRkIvHdRmER7vGOfhwhsr4zdqSlJBYrv3C3Z1dkSYBPW/5ICbiM1UlKycwdYKI # cajQBSdUQwUCWnI+i8RmSy3kjNO6OE4XRUvTv89F2bQeyK/1rJLG2m2xZR/Ml/o5 # 7Cgvbn0hWZyeqe7OObYiBlSOBSehCA== # =wclm # -----END PGP SIGNATURE----- # gpg: Signature made Tue 21 Mar 2023 01:37:36 AEST # gpg: using RSA key ED311BA00042EF52DCB412C5651D4DB8AB5AE780 # gpg: Can't check signature: No public key From: Oded Gabbay <ogabbay@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20230320154026.GA766126@ogabbay-vm-u20.habana-labs.com
2023-03-21net: dsa: tag_brcm: legacy: fix daisy-chained switchesÁlvaro Fernández Rojas
When BCM63xx internal switches are connected to switches with a 4-byte Broadcom tag, it does not identify the packet as VLAN tagged, so it adds one based on its PVID (which is likely 0). Right now, the packet is received by the BCM63xx internal switch and the 6-byte tag is properly processed. The next step would to decode the corresponding 4-byte tag. However, the internal switch adds an invalid VLAN tag after the 6-byte tag and the 4-byte tag handling fails. In order to fix this we need to remove the invalid VLAN tag after the 6-byte tag before passing it to the 4-byte tag decoding. Fixes: 964dbf186eaa ("net: dsa: tag_brcm: add support for legacy tags") Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230319095540.239064-1-noltari@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22Merge tag 'drm-intel-gt-next-2023-03-16' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next Driver Changes: - Fix issue #6333: "list_add corruption" and full system lockup from performance monitoring (Janusz) - Give the punit time to settle before fatally failing (Aravind, Chris) - Don't use stolen memory or BAR for ring buffers on LLC platforms (John) - Add missing ecodes and correct timeline seqno on GuC error captures (John) - Make sure DSM size has correct 1MiB granularity on Gen12+ (Nirmoy, Lucas) - Fix potential SSEU max_subslices array-index-out-of-bounds access on Gen11 (Andrea) - Whitelist COMMON_SLICE_CHICKEN3 for UMD access on Gen12+ (Matt R.) - Apply Wa_1408615072/Wa_1407596294 correctly on Gen11 (Matt R) - Apply LNCF/LBCF workarounds correctly on XeHP SDV/PVC/DG2 (Matt R) - Implement Wa_1606376872 for Xe_LP (Gustavo) - Consider GSI offset when doing MCR lookups on Meteorlake+ (Matt R.) - Add engine TLB invalidation for Meteorlake (Matt R.) - Fix GSC Driver-FLR completion on Meteorlake (Alan) - Fix GSC races on driver load/unload on Meteorlake+ (Daniele) - Disable MC6 for MTL A step (Badal) - Consolidate TLB invalidation flow (Tvrtko) - Improve debug GuC/HuC debug messages (Michal Wa., John) - Move fd_install after last use of fence (Rob) - Initialize the obj flags for shmem objects (Aravind) - Fix missing debug object activation (Nirmoy) - Probe lmem before the stolen portion (Matt A) - Improve clean up of GuC busyness stats worker (John) - Fix missing return code checks in GuC submission init (John) - Annotate two more workaround/tuning registers as MCR on PVC (Matt R) - Fix GEN8_MISCCPCTL definition and remove unused INF_UNIT_LEVEL_CLKGATE (Lucas) - Use sysfs_emit() and sysfs_emit_at() (Nirmoy) - Make kobj_type structures constant (Thomas W.) - make kobj attributes const on gt/ (Jani) - Remove the unused virtualized start hack on buddy allocator (Matt A) - Remove redundant check for DG1 (Lucas) - Move DG2 tuning to the right function (Lucas) - Rename dev_priv to i915 for private data naming consistency in gt/ (Andi) - Remove unnecessary whitelisting of CS_CTX_TIMESTAMP on Xe_HP platforms (Matt R.) - - Escape wildcard in method names in kerneldoc (Bagas) - Selftest improvements (Chris, Jonathan, Tvrtko, Anshuman, Tejas) - Fix sparse warnings (Jani) [airlied: fix unused variable in intel_workarounds] Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZBMSb42yjjzczRhj@jlahtine-mobl.ger.corp.intel.com
2023-03-21riscv: mm: Fix incorrect ASID argument when flushing TLBDylan Jhong
Currently, we pass the CONTEXTID instead of the ASID to the TLB flush function. We should only take the ASID field to prevent from touching the reserved bit field. Fixes: 3f1e782998cd ("riscv: add ASID-based tlbflushing methods") Signed-off-by: Dylan Jhong <dylan@andestech.com> Reviewed-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com> Link: https://lore.kernel.org/r/20230313034906.2401730-1-dylan@andestech.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-21Merge tag 'vfio-v6.3-rc4' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO fix from Alex Williamson: - Fix dirty size reporting for pre-copy in mlx5 variant driver (Yishai Hadas) * tag 'vfio-v6.3-rc4' of https://github.com/awilliam/linux-vfio: vfio/mlx5: Fix the report of dirty_bytes upon pre-copy
2023-03-21Merge tag 'nfsd-6.3-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix a crash during NFS READs from certain client implementations - Address a minor kbuild regression in v6.3 * tag 'nfsd-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: don't replace page in rq_pages if it's a continuation of last page NFS & NFSD: Update GSS dependencies
2023-03-21drm/lima: Use drm_sched_job_add_syncobj_dependency()Maíra Canal
As lima_gem_add_deps() performs the same steps as drm_sched_job_add_syncobj_dependency(), replace the open-coded implementation in Lima in order to simply use the DRM function. Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Maíra Canal <mairacanal@riseup.net> Link: https://patchwork.freedesktop.org/patch/msgid/20230224214133.411966-1-mcanal@igalia.com
2023-03-21net/mlx5: E-Switch, Fix an Oops in error handling codeDan Carpenter
The error handling dereferences "vport". There is nothing we can do if it is an error pointer except returning the error code. Fixes: 133dcfc577ea ("net/mlx5: E-Switch, Alloc and free unique metadata for match") Signed-off-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>