summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-12-20Merge tag 'drm-intel-next-2023-12-18' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next - Drop pointless null checks and fix a scaler bug (Ville) - Meteor Lake display fixes and clean-ups (RK, Jani, Andrzej, Mika, Imre) - Clean-up around flip done IRQ (Ville) - Fix eDP Meteor Lake bug (Jani) - Bigjoiner fixes (Ankit, Ville) - Cdclk/voltage_level cleanups and fixes (Ville) - DMC event stuff (Ville) - Remove dead code around intel_atomic_helper->free_list (Jouni) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZYB5XBRdWWgWoMKc@intel.com
2023-12-20Merge tag 'mediatek-drm-next-6.8' of ↵Dave Airlie
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-next Mediatek DRM Next for Linux 6.8 1. Use devm_platform_ioremap_resource() 2. Stop using iommu_present() 3. Add display driver for MT8188 VDOSYS1 4. Add phy_mtk_dp module as pre-dependency Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231218145826.5643-1-chunkuang.hu@kernel.org
2023-12-20Merge tag 'drm-msm-next-2023-12-15' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-next Updates for v6.8: Core: - Add support for SDM670, SM8650 - Handle the CFG interconnect to fix the obscure hangs / timeouts on register write - Kconfig fix for QMP dependency - DT schema fixes DPU: - Add support for SDM670, SM8650 - Enable SmartDMA on SM8350 and SM8450 - Correct UBWC settings for SC8280XP - Fix catalog settings for SC8180X - Actually make use of the version to switch between QSEED3/3LITE/4 scalers - Use devres-managed and drm-managed allocations where appropriate - misc other fixes - Enabled YUV writeback on SC7280, SM8250 - Enabled writeback on SM8350, SM8450 - CRC fix when encoder is selected as the input source - other misc fixes MDP4: - Use devres-managed and drm-managed allocations where appropriate - flush vblank event on CRTC disable MDP5: - Use devres-managed and drm-managed allocations where appropriate DP: - Add support for SM8650 - Enable PM runtime support - Merge msm-specific debugfs dir with the generic one - Described DisplayPort on SM8150 in DeviceTree bindings - Moved dp_display_get_next_bridge() to probe() DSI: - Add support for SM8650 - Enable PM runtime support GPU/GEM: - demote userspace triggerable warnings to debug - add GEM object metadata UAPI - move GPU devcoredumps to GPU device - fix hangcheck to skip retired submits - expose UBWC config to userspace - fix a680 chip-id - drm_exec conversion - drm/ci: remove rebase-merge directory (to unblock CI) [airlied: fix drm_exec/amd interaction] Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGs9auYqmo-7NSd9FsbNBCDf7aBevd=4xkcF3A5G_OGvMQ@mail.gmail.com
2023-12-20Merge tag 'amd-drm-next-6.8-2023-12-15' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.8-2023-12-15: amdgpu: - Suspend fixes - Misc code cleanups - JPEG fix - Add AMD specific color management (protected by AMD_PRIVATE_COLOR) - UHBR13.5 cable fixes - Misc display fixes - Display WB fixes - PSR fixes - XGMI fix - ACPI WBRF support for handling potential RF interference from GPU clocks - Enable tunneling on high priority compute queues - drm_edid.h include cleanup - VPE DPM support - SMU 13 fixes - Fix possible double frees in error paths - Misc fixes amdkfd: - Support import and export of dma-bufs using GEM handles - MES shader debugger fixes - SVM fixes radeon: - drm_edid.h include cleanup - Misc code cleanups - Fix possible memory leak in error path drm: - Increase max objects to accomodate new color props - Make replace_property_blob_from_id a DRM helper - Track color management changes per plane platform-x86: - Merge immutable branch from Hans for platform dependencies for WBRF to coordinate merge of WBRF feature across wifi, platform, and GPU Signed-off-by: Dave Airlie <airlied@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZXygTgAKCRC93/aFa7yZ # 2EW1AQCILfGTtDWXzgLSpUBtt9jOooHqaSrah19Cfw0HlA3QIQD+OCohXH1LLZo1 # tYHyfsLv0LsNawI198qABzB1PwptSAI= # =M1AO # -----END PGP SIGNATURE----- # gpg: Signature made Sat 16 Dec 2023 04:51:58 AEST # gpg: using EDDSA key 203B921D836B5735349902BDBDDFF6856BBC99D8 # gpg: Can't check signature: No public key From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231215193519.5040-1-alexander.deucher@amd.com
2023-12-19drm: using mul_u32_u32() requires linux/math64.hStephen Rothwell
Some pending include file cleanups produced this error: In file included from include/linux/kernel.h:27, from drivers/gpu/ipu-v3/ipu-dp.c:7: include/drm/drm_color_mgmt.h: In function 'drm_color_lut_extract': include/drm/drm_color_mgmt.h:45:46: error: implicit declaration of function 'mul_u32_u32' [-Werror=implicit-function-declaration] 45 | return DIV_ROUND_CLOSEST_ULL(mul_u32_u32(user_input, (1 << bit_precision) - 1), | ^~~~~~~~~~~ Fixes: c6fbb6bca108 ("drm: Fix color LUT rounding") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231219145734.13e40e1e@canb.auug.org.au
2023-12-19accel/habanalabs: fix information leak in sec_attest_info()Xingyuan Mo
This function may copy the pad0 field of struct hl_info_sec_attest to user mode which has not been initialized, resulting in leakage of kernel heap data to user mode. To prevent this, use kzalloc() to allocate and zero out the buffer, which can also eliminate other uninitialized holes, if any. Fixes: 0c88760f8f5e ("habanalabs/gaudi2: add secured attestation info uapi") Signed-off-by: Xingyuan Mo <hdthky0@gmail.com> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs/gaudi2: avoid overriding existing undefined opcode dataTomer Tayar
Part of the undefined opcode data is updated in gaudi2_handle_qman_err_generic() and some in handle_lower_qman_data_on_err(). However, the 'write_enable' flag is checked only in gaudi2_handle_qman_err_generic(), and information of more than a single error can be mixed there. Moreover, handle_lower_qman_data_on_err() is called only for the lower QMAN, so for an error in the upper QMAN there is only a partial info. Move all the data update to be done in a single place, protected by the 'write_enable' flag. As mainly the lower QMAN's info is interesting, avoid saving the partial info for the upper QMAN. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs: add parent_device sysfs attributeTomer Tayar
The device debugfs directory was modified to be named as the device-name. This name is the parent device name, i.e. either the PCI address in case of an ASIC, or the simulator device name in case of a simulator. This change makes it more difficult for a user to access the debugfs directory for a specific accel device, because he can't just use the accel minor id, but he needs to do more device-dependent operations to get the device name. To make it easier to get this name, add a 'parent_device' sysfs attribute that the user can read using the minor id before accessing debugfs. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs: update debugfs-driver-habanalabs with the device-name ↵Tomer Tayar
directory The device debugfs directory was modified to be named as the parent device name. Update the paths accordingly. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs/gaudi2: add zero padding when printing QM CP instructionTomer Tayar
QM instructions are in multiples of 64 bits and the command type is in the upper bits of first QWORD. To make it clearer that an undefined command is due to a type of 0x0, always print all 64 bits and add a zero padding if needed. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs: report 3 instances of Infineon second stageAriel Suller
Infineon controller second stage has 3 instances that their version need to be reported by driver. Signed-off-by: Ariel Suller <asuller@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs/gaudi2: add signed dev info uAPIMoti Haimovski
User will provide a nonce via the INFO ioctl, and will retrieve the signed device info generated using given nonce. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs/gaudi2: use correct registers to dump QM CQ infoTomer Tayar
The QM CQ PTR_LO/PTR_HI/TSIZE registers are for pushing a CQ entry, and although they are updated by HW even when descriptors are fetched by PQ and CB addresses are fed into CQ, the correct registers to use when dumping the CQ info are the ones with the _STS suffix. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs: expose module id through sysfsDani Liberman
Module ID exposes the physical location of the device in the server, from the pov of the devices in regard to how they are connected by internal fabric. This information is already exposed in our INFO ioctl, but there are utilities and scripts running in data-center which are already accessing sysfs for topology information and it is easier for them to continue getting that information from sysfs instead of opening a file descriptor. Signed-off-by: Dani Liberman <dliberman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs: print error code when mapping failsDani Liberman
Failure to map is considered a non-trivial error and we need to notify the user about it. Signed-off-by: Dani Liberman <dliberman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs/gaudi2: get the correct QM CQ info upon an errorTomer Tayar
Upon a QM error, the address/size from both the CQ and the ARC_CQ are printed, although the instruction that led to the error was received from only one of them. Moreover, in case of a QM undefined opcode, only one of these address/size sets will be captured based on the value of ARC_CQ_PTR. However, this value can be non-zero even if currently the CQ is used, in case the CQ/ARC_CQ are alternately used. Under the assumption of having a stop-on-error configuration, modify to use CP_STS.CUR_CQ field to get the relevant CQ for the QM error. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs: set hard reset flag if graceful reset is skippedTomer Tayar
hl_device_cond_reset() might be called with the hard reset flag unset, because a compute reset upon device release as part of a graceful reset is valid. If the conditions for graceful reset are not met, hl_device_reset() will be called for an immediate reset. In this case a compute reset is not valid, so it will be replaced with a hard reset together with a debug message about it. This message might be confusing, as it implies that a compute reset was requested when it shouldn't. To prevent this confusion, set the hard reset flag in hl_device_cond_reset() if going to an immediate reset. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs: remove 'get temperature' debug printOfir Bitton
The print was added long back for a specific debug and can now be removed. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs/gaudi2: fix undef opcode reportingDafna Hirschfeld
currently the undefined opcode event bit in set only for lower cp and only if 'write_enable' is true. It should be set anyway and for all streams in order to report that event to userspace. Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs: fix EQ heartbeat mechanismFarah Kassabri
Stop rescheduling another heartbeat check when EQ heartbeat check fails as it generates confusing logs in dmesg that the heartbeat fails. Signed-off-by: Farah Kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs: add support for Gaudi2C deviceOded Gabbay
Gaudi2 with PCI revision ID with the value of '3' represents Gaudi2C device and should be detected and initialized as Gaudi2. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs: add log when eq event is not receivedFarah Kassabri
Add error log when no eq event is received from FW, to cover a scenario when FW is stuck for some reason. In such case driver will not receive neither the eq error interrupt or the eq heartbeat event, and will just initiate a reset without indication in the dmesg about the reason. Signed-off-by: Farah Kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs/gaudi2: assume hard-reset by FW upon PCIe AXI drainTomer Tayar
When a PCIe AXI drain event happens, it is possible that the driver cannot access the device through PCIe, and therefore cannot send a hard-reset request to FW. Starting from FW version 1.13, FW will initiate a hard-reset in such a case without waiting for a reset request from the driver. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs: update device boot error checkFarah Kassabri
Use a predefined mask which set the device critical boot errors. Driver will fail and stop its loading, only upon detecting at least one of those errors defined in this mask. Signed-off-by: Farah Kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19accel/habanalabs: add pcie reset prepare/done hooksfarah kassabri
When working on a bare-metal system, if FLR will happen the firmware will handle it and driver will have no knowledge of it, and this will cause two issues: 1.The driver will be in operational state while it should be in reset. This will cause the heartbeat mechanism to keep sending messages to FW while pci device is in reset. Eventually heartbeat will fail and the device will end up in non-operational state. 2. After FW handles the FLR, and due to the reset it'll go back to preboot stage, and driver need to perform hard reset in order to load the boot fit binary. This patch will add reset_prepare hook that will set the device to be in disabled state, so it'll be not operational, and also reset_done hook which will be called after the actual FLR handling, then it will perform hard reset. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-12-19Merge tag 'drm-misc-next-2023-12-14' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for $kernel-version: UAPI Changes: Cross-subsystem Changes: - A few fixes for usb/typec Core Changes: - ci: Updates to the defconfig, igt version, etc. - writeback: Move the atomic_check helper from the encoder to connector Driver Changes: - rockchip: Add support for rk3588 - xe: Update the TODO list - panel: - nv3052c: Register documentation, init sequence improvements and support for the Fascontek FS035VG158 - st7701: Add support for the Anbernic RG-ARC - new driver: Synaptics R63353 panel controller, Ilitek ILI9805 panel controller - new panel: AUO G156HAN04.0 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/aqpn5miejmkks7pbcfex7b6u63uwsruywxsnr3x5ljs45qatin@nbkkej2elk46
2023-12-19drm/bridge: properly refcount DT nodes in aux bridge driversDmitry Baryshkov
The aux-bridge and aux-hpd-bridge drivers didn't call of_node_get() on the device nodes further used for dev->of_node and platform data. When bridge devices are released, the reference counts are decreased, resulting in refcount underflow / use-after-free warnings. Get corresponding refcounts during AUX bridge allocation. Reported-by: Luca Weiss <luca.weiss@fairphone.com> Fixes: 2a04739139b2 ("drm/bridge: add transparent bridge helper") Fixes: 26f4bac3d884 ("drm/bridge: aux-hpd: Replace of_device.h with explicit include") Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231216235910.911958-1-dmitry.baryshkov@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2023-12-18drm/mediatek: dp: Add phy_mtk_dp module as pre-dependencyNícolas F. R. A. Prado
The mtk_dp driver registers a phy device which is handled by the phy_mtk_dp driver and assumes that the phy probe will complete synchronously, proceeding to make use of functionality exposed by that driver right away. This assumption however is false when the phy driver is built as a module, causing the mtk_dp driver to fail probe in this case. Add the phy_mtk_dp module as a pre-dependency to the mtk_dp module to ensure the phy module has been loaded before the dp, so that the phy probe happens synchrounously and the mtk_dp driver can probe successfully even with the phy driver built as a module. Suggested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver") Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Guillaume Ranquet <granquet@baylibre.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20231121142938.460846-1-nfraprado@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2023-12-18drm/i915/display: Remove dead code around intel_atomic_helper->free_listJouni Högander
After switching to directly using dma_fence instead of i915_sw_fence we have left some dead code around intel_atomic_helper->free_list. Remove that dead code. v2: Remove intel_atomic_state->freed as well Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231114134141.2527694-1-jouni.hogander@intel.com
2023-12-18drm/i915/dmc: Print out the DMC mmio register list at fw load timeVille Syrjälä
To help with debugging print out the mmio list contained in the DMC firmware. Also highlight the event registers, and whether we're going to disable them or not. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231211213750.27109-5-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com>
2023-12-18drm/i915/dmc: Also disable HRR event on TGL/ADLS main DMCVille Syrjälä
Unlike later platforms TGL/ADLS has the half refresh rate (HRR) event on the main DMC (as opposed to the pipe DMC). Since we're disabling that event on all later platforms already let's do the same on TGL/ADLS as well. There is supposedly a bit somewhere (DMC_CHICKEN on TGL) to make the handler not do anything, but we don't currently have code to frob it. Though that bit should be off by default, the ADL+ experience has shown us that trusting any of this isn't a good idea. So seems safer to just disable all event handlers we know that we don't need. Also the TGL/ADLS DMC firmware is apparently using the wrong event (undelayed vblank) here anyway. It should be using the delayed vblank event instead (like ADL+ firmware does), but they didn't release a firmware fix for this and instead just hacked around this in the Windows driver code :/ v2: Also disable the event on ADLS (Imre) Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231213150807.21331-1-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com>
2023-12-18drm/i915/dmc: Also disable the flip queue event on TGL main DMCVille Syrjälä
Unlike later platforms TGL has its flip queue event (CLK_MSEC) on the main DMC (as opposed to the pipe DMC). Currently we're doing a second pass to disable that, but let's just follow the same approach as the later platforms and never even enable the event in the first place. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231211213750.27109-3-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com>
2023-12-18drm/i915/dmc: Don't enable any pipe DMC eventsVille Syrjälä
The pipe DMC seems to be making a mess of things in ADL. Various weird symptoms have been observed such as missing vblank irqs, typicalle happening when using multiple displays. Keep all pipe DMC event handlers disabled until needed (which is never atm). This is also what Windows does on ADL+. We can also drop DG2 from disable_all_flip_queue_events() since on DG2 the pipe DMC is the one that handles the flip queue events. Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8685 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231211213750.27109-2-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com>
2023-12-15drm/amd/display: Revert " drm/amd/display: Use channel_width = 2 for vram ↵Alvin Lee
table 3.0" [Description] Revert commit fec05adc40c2 ("drm/amd/display: Use channel_width = 2 for vram table 3.0") Because the issue is being fixed from VBIOS side. Reviewed-by: Samson Tam <samson.tam@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-15drm/amd/display: remove HPO PG in driver sideMuhammad Ahmed
[why & how] Add config to make HPO PG handled in dmubfw ips entry/exit Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Muhammad Ahmed <ahmed.ahmed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-15drm/amd/display: do not send commands to DMUB if DMUB is inactive from S3Samson Tam
[Why] On resume from S3, may get apply_idle_optimizations call while DMUB is inactive which will just time out. [How] Set and track power state in dmub_srv and check power state before sending commands to DMUB. Add interface in both dmub_srv and dc_dmub_srv Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Samson Tam <samson.tam@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-15drm/amd/swsmu: remove duplicate definition of smu v14_0_0 driver if versionLi Ma
There is a repeated define of smu v14_0_0 driver if version, so delete one in driver if header. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-15drm/amdgpu: make an improvement on amdgpu_hmm_range_get_pagesJames Zhu
Only schedule when hmm_range_fault returns error. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-15drm/amdgpu: increase hmm range get pages timeoutJames Zhu
When application tries to allocate all system memory and cause memory to swap out. Needs more time for hmm_range_fault to validate the remaining page for allocation. To be safe, increase timeout value to 1 second for 64MB range. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-15drm/amdkfd: svm range always mapped flag not working on APUPhilip Yang
On gfx943 APU there is no VRAM and page migration, queue CWSR area, svm range with always mapped flag, is not mapped to GPU correctly. This works fine if retry fault on CWSR area can be recovered, but could cause deadlock if there is another retry fault recover waiting for CWSR to finish. Fix this by mapping svm range with always mapped flag to GPU with ACCESS attribute if XNACK ON. There is side effect, because all GPUs have ACCESS attribute by default on new svm range with XNACK on, the CWSR area will be mapped to all GPUs after this change. This side effect will be fixed with Thunk change to set CWSR svm range with ACCESS_IN_PLACE attribute on the GPU that user queue is created. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-15drm/amdkfd: only flush mes process context if mes support is thereJonathan Kim
Fix up on mes process context flush to prevent non-mes devices from spamming error messages or running into undefined behaviour during process termination. Fixes: bd33bb1409b4 ("drm/amdkfd: fix mes set shader debugger process management") Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Eric Huang <jinhuieric.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-15drm/imagination: Fix error path in pvr_vm_create_contextDonald Robson
It is possible to double free the vm_ctx->mmu_ctx object in this function.     630 err_page_table_destroy: --> 631         pvr_mmu_context_destroy(vm_ctx->mmu_ctx); The pvr_vm_context_put() function does:         kref_put(&vm_ctx->ref_count, pvr_vm_context_release); Here the pvr_vm_context_release() will call:         pvr_mmu_context_destroy(vm_ctx->mmu_ctx); Refactor to an unwind style. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Donald Robson <donald.robson@imgtec.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231213144431.94956-2-donald.robson@imgtec.com
2023-12-15drm/imagination: Fix ERR_PTR test on pointer to pointer.Donald Robson
drivers/gpu/drm/imagination/pvr_vm.c:631 pvr_vm_create_context() error: 'vm_ctx->mmu_ctx' dereferencing possible ERR_PTR() 612 vm_ctx->mmu_ctx = pvr_mmu_context_create(pvr_dev); 613 err = PTR_ERR_OR_ZERO(&vm_ctx->mmu_ctx); ^ The address is never an error pointer so this will always return 0. Remove the ampersand. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Donald Robson <donald.robson@imgtec.com> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231213144431.94956-1-donald.robson@imgtec.com
2023-12-15drm/imagination: Fixed oops when misusing ioctl CREATE_HWRT_DATASETDonald Robson
While writing the matching IGT suite I discovered that it's possible to cause a kernel oops when using DRM_IOCTL_PVR_CREATE_HWRT_DATASET when the call to hwrt_init_common_fw_structure() fails. Use an unwind-type error path to avoid cleaning up the object using the the release function before it is fully resolved. Signed-off-by: Donald Robson <donald.robson@imgtec.com> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231208163019.95913-1-donald.robson@imgtec.com
2023-12-15drm/imagination: Fixed infinite loop in pvr_vm_mips_map()Donald Robson
Unwinding loop in error path for this function uses unsigned limit variable, causing the promotion of the signed counter variable. --> 204 for (; pfn >= start_pfn; pfn--) ^^^^^^^^^^^^^^^^ If start_pfn can be zero then this is an endless loop. I've seen this code in other places as well. This loop is slightly off as well. It should decrement pfn on the first iteration. Fix by making the loop limit variables signed. Also fix missing predecrement by modifying to while loop. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Donald Robson <donald.robson@imgtec.com> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231208160825.92933-1-donald.robson@imgtec.com
2023-12-15drm/i915/mtl: Fix HDMI/DP PLL clock selectionImre Deak
Select the HDMI specific PLL clock only for HDMI outputs. Fixes: 62618c7f117e ("drm/i915/mtl: C20 PLL programming") Cc: Mika Kahola <mika.kahola@intel.com> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231213220526.1828827-1-imre.deak@intel.com
2023-12-14drm/amd/display: fix documentation for dm_crtc_additional_color_mgmt()Melissa Wen
warning: expecting prototype for drm_crtc_additional_color_mgmt(). Prototype was for dm_crtc_additional_color_mgmt() instead Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312141801.o9eBCxt9-lkp@intel.com/ Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-14drm/amd/display: fix documentation for amdgpu_dm_verify_lut3d_size()Alex Deucher
It takes the plane state rather than the crtc state. Fixes: aba8b76baabd ("drm/amd/display: add plane shaper LUT support") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Melissa Wen <mwen@igalia.com> Cc: Harry.Wentland@amd.com
2023-12-14drm/amd/pm: fix a double-free in amdgpu_parse_extended_power_tableZhipeng Lu
The amdgpu_free_extended_power_table is called in every error-handling paths of amdgpu_parse_extended_power_table. However, after the following call chain of returning: amdgpu_parse_extended_power_table |-> kv_dpm_init / si_dpm_init (the only two caller of amdgpu_parse_extended_power_table) |-> kv_dpm_sw_init / si_dpm_sw_init (the only caller of kv_dpm_init / si_dpm_init, accordingly) |-> kv_dpm_fini / si_dpm_fini (goto dpm_failed in xx_dpm_sw_init) |-> amdgpu_free_extended_power_table As above, the amdgpu_free_extended_power_table is called twice in this returning chain and thus a double-free is triggered. Similarily, the last kfree in amdgpu_parse_extended_power_table also cause a double free with amdgpu_free_extended_power_table in kv_dpm_fini. Fixes: 84176663e70d ("drm/amd/pm: create a new holder for those APIs used only by legacy ASICs(si/kv)") Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-14gpu/drm/radeon: fix two memleaks in radeon_vm_initZhipeng Lu
When radeon_bo_create and radeon_vm_clear_bo fail, the vm->page_tables allocated before need to be freed. However, neither radeon_vm_init itself nor its caller have done such deallocation. Fixes: 6d2f2944e95e ("drm/radeon: use normal BOs for the page tables v4") Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>