linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-10-25	scsi: mpt3sas: re-do lost mpt3sas DMA mask fix	Sreekanth Reddy
	This is a re-do of commit e0e0747de0ea ("scsi: mpt3sas: Fix return value check of dma_get_required_mask()"), which I ended up undoing in a mis-merge in commit 62e6e5940c0c ("Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi"). The original commit message was scsi: mpt3sas: Fix return value check of dma_get_required_mask() Fix the incorrect return value check of dma_get_required_mask(). Due to this incorrect check, the driver was always setting the DMA mask to 63 bit. Link: https://lore.kernel.org/r/20220913120538.18759-2-sreekanth.reddy@broadcom.com Fixes: ba27c5cf286d ("scsi: mpt3sas: Don't change the DMA coherent mask after allocations") Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> and this fix was lost when I mis-merged the conflict with commit 9df650963bf6 ("scsi: mpt3sas: Don't change DMA mask while reallocating pools"). Reported-by: Juergen Gross <jgross@suse.com> Fixes: 62e6e5940c0c ("Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi") Link: https://lore.kernel.org/all/CAHk-=wjaK-TxrNaGtFDpL9qNHL1MVkWXO1TT6vObD5tXMSC4Zg@mail.gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-25	can: mcp251x: mcp251x_can_probe(): add missing unregister_candev() in error path	Dongliang Mu
	In mcp251x_can_probe(), if mcp251x_gpio_setup() fails, it forgets to unregister the CAN device. Fix this by unregistering can device in mcp251x_can_probe(). Fixes: 2d52dabbef60 ("can: mcp251x: add GPIO support") Signed-off-by: Dongliang Mu <dzm91@hust.edu.cn> Link: https://lore.kernel.org/all/20221024090256.717236-1-dzm91@hust.edu.cn [mkl: adjust label] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-10-25	can: mscan: mpc5xxx: mpc5xxx_can_probe(): add missing put_clock() in error path	Dongliang Mu
	The commit 1149108e2fbf ("can: mscan: improve clock API use") only adds put_clock() in mpc5xxx_can_remove() function, forgetting to add put_clock() in the error handling code. Fix this bug by adding put_clock() in the error handling code. Fixes: 1149108e2fbf ("can: mscan: improve clock API use") Signed-off-by: Dongliang Mu <dzm91@hust.edu.cn> Link: https://lore.kernel.org/all/20221024133828.35881-1-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-10-25	RDMA/rxe: Fix mr leak in RESPST_ERR_RNR	Li Zhijian
	rxe_recheck_mr() will increase mr's ref_cnt, so we should call rxe_put(mr) to drop mr's ref_cnt in RESPST_ERR_RNR to avoid below warning: WARNING: CPU: 0 PID: 4156 at drivers/infiniband/sw/rxe/rxe_pool.c:259 __rxe_cleanup+0x1df/0x240 [rdma_rxe] ... Call Trace: rxe_dereg_mr+0x4c/0x60 [rdma_rxe] ib_dereg_mr_user+0xa8/0x200 [ib_core] ib_mr_pool_destroy+0x77/0xb0 [ib_core] nvme_rdma_destroy_queue_ib+0x89/0x240 [nvme_rdma] nvme_rdma_free_queue+0x40/0x50 [nvme_rdma] nvme_rdma_teardown_io_queues.part.0+0xc3/0x120 [nvme_rdma] nvme_rdma_error_recovery_work+0x4d/0xf0 [nvme_rdma] process_one_work+0x582/0xa40 ? pwq_dec_nr_in_flight+0x100/0x100 ? rwlock_bug.part.0+0x60/0x60 worker_thread+0x2a9/0x700 ? process_one_work+0xa40/0xa40 kthread+0x168/0x1a0 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x22/0x30 Link: https://lore.kernel.org/r/20221024052049.20577-1-lizhijian@fujitsu.com Fixes: 8a1a0be894da ("RDMA/rxe: Replace mr by rkey in responder resources") Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-10-24	net: lan966x: Stop replacing tx dcbs and dcbs_buf when changing MTU	Horatiu Vultur
	When a frame is sent using FDMA, the skb is mapped and then the mapped address is given to an tx dcb that is different than the last used tx dcb. Once the HW finish with this frame, it would generate an interrupt and then the dcb can be reused and memory can be freed. For each dcb there is an dcb buf that contains some meta-data(is used by PTP, is it free). There is 1 to 1 relationship between dcb and dcb_buf. The following issue was observed. That sometimes after changing the MTU to allocate new tx dcbs and dcbs_buf, two frames were not transmitted. The frames were not transmitted because when reloading the tx dcbs, it was always presuming to use the first dcb but that was not always happening. Because it could be that the last tx dcb used before changing MTU was first dcb and then when it tried to get the next dcb it would take dcb 1 instead of 0. Because it is supposed to take a different dcb than the last used one. This can be fixed simply by changing tx->last_in_use to -1 when the fdma is disabled to reload the new dcb and dcbs_buff. But there could be a different issue. For example, right after the frame is sent, the MTU is changed. Now all the dcbs and dcbs_buf will be cleared. And now get the interrupt from HW that it finished with the frame. So when we try to clear the skb, it is not possible because we lost all the dcbs_buf. The solution here is to stop replacing the tx dcbs and dcbs_buf when changing MTU because the TX doesn't care what is the MTU size, it is only the RX that needs this information. Fixes: 2ea1cbac267e ("net: lan966x: Update FDMA to change MTU.") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://lore.kernel.org/r/20221021090711.3749009-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-25	Merge tag 'drm-misc-next-2022-10-20' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 6.2: UAPI Changes: - Documentation for page-flip flags Cross-subsystem Changes: - dma-buf: Add unlocked variant of vmapping and attachment-mapping functions Core Changes: - atomic-helpers: CRTC primary plane test fixes - connector: TV API consistency improvements, cmdline parsing improvements - crtc-helpers: Introduce drm_crtc_helper_atomic_check() helper - edid: Fixes for HFVSDB parsing, - fourcc: Addition of the Vivante tiled modifier - makefile: Sort and reorganize the objects files - mode_config: Remove fb_base from drm_mode_config_funcs - sched: Add a module parameter to change the scheduling policy, refcounting fix for fences - tests: Sort the Kunit tests in the Makefile, improvements to the DP-MST tests - ttm: Remove unnecessary drm_mm_clean() call Driver Changes: - New driver: ofdrm - Move all drivers to a common dma-buf locking convention - bridge: - adv7533: Remove dynamic lane switching - it6505: Runtime PM support - ps8640: Handle AUX defer messages - tc358775: Drop soft-reset over I2C - ast: Atomic Gamma LUT Support, Convert to SHMEM, various improvements - lcdif: Support for YUV planes - mgag200: Fix PLL Setup on some revisions - udl: Modesetting improvements, hot-unplug support - vc4: Fix support for PAL-M Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20221020072405.g3o4hxuk75gmeumw@houat
2022-10-24	drm/i915/guc: Add compute reglist for guc err capture	Alan Previn
	We missed this at initial upstream because at that time none of the GuC enabled platforms had a compute engine. Add this now. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221019072930.17755-3-alan.previn.teres.alexis@intel.com
2022-10-24	drm/i915/guc: Add error-capture init warnings when needed	Alan Previn
	If GuC is being used and we initialized GuC-error-capture, we need to be warning if we don't provide an error-capture register list in the firmware ADS, for valid GT engines. A warning makes sense as this would impact debugability without realizing why a reglist wasn't retrieved and reported by GuC. However, depending on the platform, we might have certain engines that have a register list for engine instance error state but not for engine class. Thus, add a check only to warn if the register list was non existent vs an empty list (use the empty lists to skip the warning). NOTE: if a future platform were to introduce new registers in place of what was an empty list on existing / legacy hardware engines no warning is provided as the empty list is meant to be used intentionally. As an example, if a future hardware were to add blitter engine-class-registers (new) on top of the legacy blitter engine-instance-register (HEAD, TAIL, etc.), no warning is generated. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221019072930.17755-2-alan.previn.teres.alexis@intel.com
2022-10-24	Merge tag 'net-6.1-rc3-1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf. The net-memcg fix stands out, the rest is very run-off-the-mill. Maybe I'm biased. Current release - regressions: - eth: fman: re-expose location of the MAC address to userspace, apparently some udev scripts depended on the exact value Current release - new code bugs: - bpf: - wait for busy refill_work when destroying bpf memory allocator - allow bpf_user_ringbuf_drain() callbacks to return 1 - fix dispatcher patchable function entry to 5 bytes nop Previous releases - regressions: - net-memcg: avoid stalls when under memory pressure - tcp: fix indefinite deferral of RTO with SACK reneging - tipc: fix a null-ptr-deref in tipc_topsrv_accept - eth: macb: specify PHY PM management done by MAC - tcp: fix a signed-integer-overflow bug in tcp_add_backlog() Previous releases - always broken: - eth: amd-xgbe: SFP fixes and compatibility improvements Misc: - docs: netdev: offer performance feedback to contributors" * tag 'net-6.1-rc3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits) net-memcg: avoid stalls when under memory pressure tcp: fix indefinite deferral of RTO with SACK reneging tcp: fix a signed-integer-overflow bug in tcp_add_backlog() net: lantiq_etop: don't free skb when returning NETDEV_TX_BUSY net: fix UAF issue in nfqnl_nf_hook_drop() when ops_init() failed docs: netdev: offer performance feedback to contributors kcm: annotate data-races around kcm->rx_wait kcm: annotate data-races around kcm->rx_psock net: fman: Use physical address for userspace interfaces net/mlx5e: Cleanup MACsec uninitialization routine atlantic: fix deadlock at aq_nic_stop nfp: only clean `sp_indiff` when application firmware is unloaded amd-xgbe: add the bit rate quirk for Molex cables amd-xgbe: fix the SFP compliance codes check for DAC cables amd-xgbe: enable PLL_CTL for fixed PHY modes only amd-xgbe: use enums for mailbox cmd and sub_cmds amd-xgbe: Yellow carp devices do not need rrc bpf: Use __llist_del_all() whenever possbile during memory draining bpf: Wait for busy refill_work when destroying bpf memory allocator MAINTAINERS: add keyword match on PTP ...
2022-10-24	drm/i915: Improve long running compute w/a for GuC submission	John Harrison
	A workaround was added to the driver to allow compute workloads to run 'forever' by disabling pre-emption on the RCS engine for Gen12. It is not totally unbound as the heartbeat will kick in eventually and cause a reset of the hung engine. However, this does not work well in GuC submission mode. In GuC mode, the pre-emption timeout is how GuC detects hung contexts and triggers a per engine reset. Thus, disabling the timeout means also losing all per engine reset ability. A full GT reset will still occur when the heartbeat finally expires, but that is a much more destructive and undesirable mechanism. The purpose of the workaround is actually to give compute tasks longer to reach a pre-emption point after a pre-emption request has been issued. This is necessary because Gen12 does not support mid-thread pre-emption and compute tasks can have long running threads. So, rather than disabling the timeout completely, just set it to a 'long' value. v2: Review feedback from Tvrtko - must hard code the 'long' value instead of determining it algorithmically. So make it an extra CONFIG definition. Also, remove the execlist centric comment from the existing pre-emption timeout CONFIG option given that it applies to more than just execlists. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Michal Mrozek <michal.mrozek@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221006213813.1563435-5-John.C.Harrison@Intel.com
2022-10-24	drm/i915: Make the heartbeat play nice with long pre-emption timeouts	John Harrison
	Compute workloads are inherently not pre-emptible for long periods on current hardware. As a workaround for this, the pre-emption timeout for compute capable engines was disabled. This is undesirable with GuC submission as it prevents per engine reset of hung contexts. Hence the next patch will re-enable the timeout but bumped up by an order of magnitude. However, the heartbeat might not respect that. Depending upon current activity, a pre-emption to the heartbeat pulse might not even be attempted until the last heartbeat period. Which means that only one period is granted for the pre-emption to occur. With the aforesaid bump, the pre-emption timeout could be significantly larger than this heartbeat period. So adjust the heartbeat code to take the pre-emption timeout into account. When it reaches the final (high priority) period, it now ensures the delay before hitting reset is bigger than the pre-emption timeout. v2: Fix for selftests which adjust the heartbeat period manually. v3: Add FIXME comment about selftests. Add extra FIXME comment and drm_notices when setting heartbeat to a non-default value (review feedback from Tvrtko) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221006213813.1563435-4-John.C.Harrison@Intel.com
2022-10-24	drm/i915: Fix compute pre-emption w/a to apply to compute engines	John Harrison
	An earlier patch added support for compute engines. However, it missed enabling the anti-pre-emption w/a for the new engine class. So move the 'compute capable' flag earlier and use it for the pre-emption w/a test. Fixes: c674c5b9342e ("drm/i915/xehp: CCS should use RCS setup functions") Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: "Michał Winiarski" <michal.winiarski@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221006213813.1563435-3-John.C.Harrison@Intel.com
2022-10-24	drm/i915/guc: Limit scheduling properties to avoid overflow	John Harrison
	GuC converts the pre-emption timeout and timeslice quantum values into clock ticks internally. That significantly reduces the point of 32bit overflow. On current platforms, worst case scenario is approximately 110 seconds. Rather than allowing the user to set higher values and then get confused by early timeouts, add limits when setting these values. v2: Add helper functions for clamping (review feedback from Tvrtko). v3: Add a bunch of BUG_ON range checks in addition to the checks already in the clamping functions (Tvrtko) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221006213813.1563435-2-John.C.Harrison@Intel.com
2022-10-24	drm/client: Switch drm_client_buffer_delete() to unlocked drm_gem_vunmap	Dmitry Osipenko
	The drm_client_buffer_delete() wasn't switched to unlocked GEM vunmapping by accident when rest of drm_client code transitioned to the unlocked variants of the vmapping functions. Make drm_client_buffer_delete() use the unlocked variant. This fixes lockdep warning splat about missing reservation lock when framebuffer is released. Reported-by: kernel test robot <yujie.liu@intel.com> Link: https://lore.kernel.org/dri-devel/890f70db-68b0-8456-ca3c-c5496ef90517@collabora.com/T/ Fixes: 79e2cf2e7a19 ("drm/gem: Take reservation lock for vmap/vunmap operations") Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221020213335.309092-1-dmitry.osipenko@collabora.com
2022-10-24	Merge tag 'pinctrl-v6.1-3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - Fix typos in UART1 and MMC in the Ingenic driver - A really well researched glitch bug fix to the Qualcomm driver that was tracked down and fixed by Dough Anderson from Chromium. Hats off for this one! - Revert two patches on the Xilinx ZynqMP driver: this needs a proper solution making use of firmware version information to adapt to different firmware releases - Fix interrupt triggers in the Ocelot driver * tag 'pinctrl-v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: ocelot: Fix incorrect trigger of the interrupt. Revert "dt-bindings: pinctrl-zynqmp: Add output-enable configuration" Revert "pinctrl: pinctrl-zynqmp: Add support for output-enable and bias-high-impedance" pinctrl: qcom: Avoid glitching lines when we first mux to output pinctrl: Ingenic: JZ4755 bug fixes
2022-10-24	drm/amd/display: Revert logic for plane modifiers	Joaquín Ignacio Aramendía
	This file was split in commit 5d945cbcd4b16a29d6470a80dfb19738f9a4319f ("drm/amd/display: Create a file dedicated to planes") and the logic in dm_plane_format_mod_supported() function got changed by a switch logic. That change broke drm_plane modifiers setting on series 5000 APUs (tested on OXP mini AMD 5800U and HP Dev One 5850U PRO) leading to Gamescope not working as reported on GitHub[1] To reproduce the issue, enter a TTY and run: $ gamescope -- vkcube With said commit applied it will abort. This one restores the old logic, fixing the issue that affects Gamescope. [1](https://github.com/Plagman/gamescope/issues/624) Cc: <stable@vger.kernel.org> # 6.0.x Signed-off-by: Joaquín Ignacio Aramendía <samsagax@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amdkfd: correct the cache info for gfx1036	Jesse Zhang
	correct the cache information for gfx1036 Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-10-24	drm/amdkfd: update gfx1037 Lx cache setting	Prike Liang
	Update the gfx1037 L1/L2 cache setting. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-10-24	drm/amdgpu: skip mes self test for gc 11.0.3 in recover	YuBiao Wang
	Temporary disable mes self teset for gc 11.0.3 during gpu_recovery. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd: Add IMU fw version to fw version queries	David Francis
	IMU is a new firmware for GFX11. There are four means by which firmware version can be queried from the driver: device attributes, vf2pf, debugfs, and the AMDGPU_INFO_FW_VERSION option in the amdgpu info ioctl. Add IMU as an option for those four methods. V2: Added debugfs Reviewed-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: Don't return false if no stream	Alvin Lee
	pipe_ctx[i] exists even if the pipe is not in use. If the pipe is not in use it will always have a null stream, so don't return false in this case. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: Remove wrong pipe control lock	Rodrigo Siqueira
	When using a device based on DCN32/321, we have an issue where a second 4k@60Hz display does not light up, and the system becomes unresponsive for a few minutes. In the debug process, it was possible to see a hang in the function dcn20_post_unlock_program_front_end in this part: for (j = 0; j < TIMEOUT_FOR_PIPE_ENABLE_MS*1000 && hubp->funcs->hubp_is_flip_pending(hubp); j++) mdelay(1); } The hubp_is_flip_pending always returns positive for waiting pending flips which is a symptom of pipe hang. Additionally, the dmesg log shows this message after a few minutes: BUG: soft lockup - CPU#4 stuck for 26s! ... [ +0.000003] dcn20_post_unlock_program_front_end+0x112/0x340 [amdgpu] [ +0.000171] dc_commit_state_no_check+0x63d/0xbf0 [amdgpu] [ +0.000155] ? dc_validate_global_state+0x358/0x3d0 [amdgpu] [ +0.000154] dc_commit_state+0xe2/0xf0 [amdgpu] This confirmed the hypothesis that we had a pipe hanging somewhere. Next, after checking the ftrace entries, we have the below weird sequence: [..] 2) \| dcn10_lock_all_pipes [amdgpu]() { 2) 0.120 us \| optc1_is_tg_enabled [amdgpu](); 2) \| dcn20_pipe_control_lock [amdgpu]() { 2) \| dc_dmub_srv_clear_inbox0_ack [amdgpu]() { 2) 0.121 us \| amdgpu_dm_dmub_reg_write [amdgpu](); 2) 0.551 us \| } 2) \| dc_dmub_srv_send_inbox0_cmd [amdgpu]() { 2) 0.110 us \| amdgpu_dm_dmub_reg_write [amdgpu](); 2) 0.511 us \| } 2) \| dc_dmub_srv_wait_for_inbox0_ack [amdgpu]() { 2) 0.110 us \| amdgpu_dm_dmub_reg_read [amdgpu](); 2) 0.110 us \| amdgpu_dm_dmub_reg_read [amdgpu](); 2) 0.110 us \| amdgpu_dm_dmub_reg_read [amdgpu](); 2) 0.110 us \| amdgpu_dm_dmub_reg_read [amdgpu](); 2) 0.110 us \| amdgpu_dm_dmub_reg_read [amdgpu](); 2) 0.110 us \| amdgpu_dm_dmub_reg_read [amdgpu](); 2) 0.110 us \| amdgpu_dm_dmub_reg_read [amdgpu](); [..] We are not expected to read from dmub register so many times and for so long. From the trace log, it was possible to identify that the function dcn20_pipe_control_lock was triggering the dmub operation when it was unnecessary and causing the hang issue. This commit drops the unnecessary dmub code and, consequently, fixes the second display not lighting up the issue. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/pm: allow gfxoff on gc_11_0_3	Kenneth Feng
	allow gfxoff on gc_11_0_3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amdkfd: Fix memory leak in kfd_mem_dmamap_userptr()	Rafael Mendonca
	If the number of pages from the userptr BO differs from the SG BO then the allocated memory for the SG table doesn't get freed before returning -EINVAL, which may lead to a memory leak in some error paths. Fix this by checking the number of pages before allocating memory for the SG table. Fixes: 264fb4d332f5 ("drm/amdgpu: Add multi-GPU DMA mapping helpers") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amdgpu: Remove ATC L2 access for MMHUB 2.1.x	Lijo Lazar
	MMHUB 2.1.x versions don't have ATCL2. Remove accesses to ATCL2 registers. Since they are non-existing registers, read access will cause a 'Completer Abort' and gets reported when AER is enabled with the below patch. Tagging with the patch so that this is backported along with it. v2: squash in uninitialized warning fix (Nathan Chancellor) Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-10-24	drm/amd/display: Revert logic for plane modifiers	Joaquín Ignacio Aramendía
	This file was split in commit 5d945cbcd4b16a29d6470a80dfb19738f9a4319f ("drm/amd/display: Create a file dedicated to planes") and the logic in dm_plane_format_mod_supported() function got changed by a switch logic. That change broke drm_plane modifiers setting on series 5000 APUs (tested on OXP mini AMD 5800U and HP Dev One 5850U PRO) leading to Gamescope not working as reported on GitHub[1] To reproduce the issue, enter a TTY and run: $ gamescope -- vkcube With said commit applied it will abort. This one restores the old logic, fixing the issue that affects Gamescope. [1](https://github.com/Plagman/gamescope/issues/624) Cc: <stable@vger.kernel.org> # 6.0.x Signed-off-by: Joaquín Ignacio Aramendía <samsagax@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/radeon: fix repeated words in comments	wangjianli
	Delete the redundant word 'the'. Signed-off-by: wangjianli <wangjianli@cdjrlc.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	amd/amdgpu: fix repeated words in comments	wangjianli
	Delete the redundant word 'the'. Signed-off-by: wangjianli <wangjianli@cdjrlc.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: refine wake up aux in retrieve link caps	Lewis Huang
	[Why] Read set_power_state dpcd after HPD cause USB4 CTS 4.2.1.1 [How] Read LTTPR caps first. If aux channel not ready, wake up aux channel. If wake up aux channel return pass, retrieve lttpr caps again. If wake up aux channel return false, register a detection retry timer. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Lewis Huang <Lewis.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: Add events log to trace OPTC lock and unlock	Rodrigo Siqueira
	As an attempt to offer more DCN debug tools for cases where the OPTC can hang, this commit introduces a trace event responsible for showing OPTC status when it requests lock and unlock. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Aurabindo Pillai <Aurabindo.Pillai@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: Convert documentation to a kernel-doc	Rodrigo Siqueira
	The dc_dmub_srv file has a lot of documentation associated with SubVP that could be converted to a kernel-doc. This commit just changes the comment style to a kernel-doc. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Aurabindo Pillai <Aurabindo.Pillai@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: Refactor eDP PSR codes	Ian Chen
	We split out PSR config from "global" to "per-panel" config settings. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Robin Chen <robin.chen@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Ian Chen <ian.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: correctly populate dcn315 clock table	Dmytro Laktyushkin
	Fix incorrect pstate read order as well as min and max state logic. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: remove audio mute control in hpo dp	Wenjing Liu
	VPG doesn't have the ability to mute audio output by sending all 0s in audio SDP. The existing implemention is disabling audio SDP instead. This is same as what dp_audio_enable does. Since it is no longer referenced by any callers, we decided to remove this interface for simplicity. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: move stream encoder audio setup to link_hwss	Wenjing Liu
	Unify stream encoder audio setup interface. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: refactor enable/disable audio stream hw sequence	Wenjing Liu
	[why] 1. As recommended by hardware team, don't enable APG when stream is not enabled. 2. Move audio stream encoder programming into link_hwss. [how] 1. Merge dp_audio_enable into enable audio stream hw sequence. 2. Move stream encoder programming into link hwss level to unify stream encoder programming interface. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: Remove FPU guards from the DML folder	Rodrigo Siqueira
	As part of the programming expectation for using DML functions, DC requires that any DML function invoked outside DML uses: DC_FP_START(); ... dml function ... DC_FP_END(); Additionally, all the DML functions that can be invoked outside the DML folder call the function dc_assert_fp_enabled(), which is responsible for triggering a warning in the case that the DML function was not guarded by the DC_FP_START/END. For this reason, call DC_FP_START/END inside DML is wrong, and this commit removes all of those references. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Nevenko Stupar <Nevenko.Stupar@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: Add UHBR135 and UHBR20 into debugfs	Fangzhi Zuo
	Add support to manually force link rate to UHBR135 (0x546) and UHBR20 (0x7d0). Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Roman Li <Roman.Li@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amdgpu: disallow gfxoff until GC IP blocks complete s2idle resume	Prike Liang
	In the S2idle suspend/resume phase the gfxoff is keeping functional so some IP blocks will be likely to reinitialize at gfxoff entry and that will result in failing to program GC registers.Therefore, let disallow gfxoff until AMDGPU IPs reinitialized completely. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amdgpu: skip mes self test for gc 11.0.3 in recover	YuBiao Wang
	Temporary disable mes self teset for gc 11.0.3 during gpu_recovery. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amdkfd: introduce dummy cache info for property asic	Prike Liang
	This dummy cache info will enable kfd base function support. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amdkfd: correct the cache info for gfx1036	Jesse Zhang
	correct the cache information for gfx1036 Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amdkfd: update gfx1037 Lx cache setting	Prike Liang
	Update the gfx1037 L1/L2 cache setting. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd: Add IMU fw version to fw version queries	David Francis
	IMU is a new firmware for GFX11. There are four means by which firmware version can be queried from the driver: device attributes, vf2pf, debugfs, and the AMDGPU_INFO_FW_VERSION option in the amdgpu info ioctl. Add IMU as an option for those four methods. V2: Added debugfs Reviewed-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amdkfd: use vma_lookup() instead of find_vma()	Deming Wang
	Using vma_lookup() verifies the start address is contained in the found vma. This results in easier to read the code. Signed-off-by: Deming Wang <wangdeming@inspur.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: make dcn32_mpc_funcs static	ruanjinjie
	The symbol is not used outside of the file, so mark it static. Fixes the following warning: drivers/gpu/drm/amd/amdgpu/../display/dc/dcn32/dcn32_mpc.c:985:24: warning: symbol 'dcn32_mpc_funcs' was not declared. Should it be static? Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: ruanjinjie <ruanjinjie@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: make dcn32_mmhubbub_funcs static	ruanjinjie
	The symbol is not used outside of the file, so mark it static. Fixes the following warning: drivers/gpu/drm/amd/amdgpu/../display/dc/dcn32/dcn32_mmhubbub.c:214:28: warning: symbol 'dcn32_mmhubbub_funcs' was not declared. Should it be static? Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: ruanjinjie <ruanjinjie@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amd/display: Make some symbols static	ruanjinjie
	These symbols qp_table_422_10bpc_min, qp_table_444_8bpc_max, qp_table_420_12bpc_max, qp_table_444_10bpc_min, qp_table_420_8bpc_max, qp_table_444_8bpc_min, qp_table_444_12bpc_min, qp_table_420_12bpc_min, qp_table_422_12bpc_min, qp_table_422_12bpc_max, qp_table_444_12bpc_max, qp_table_420_8bpc_min, qp_table_422_8bpc_min, qp_table_422_10bpc_max, qp_table_420_10bpc_max, qp_table_420_10bpc_min, qp_table_444_10bpc_max, qp_table_422_8bpc_max are not used outside of the file, so mark them static. ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:28:18: warning: symbol 'qp_table_422_10bpc_min' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:61:18: warning: symbol 'qp_table_444_8bpc_max' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:102:18: warning: symbol 'qp_table_420_12bpc_max' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:135:18: warning: symbol 'qp_table_444_10bpc_min' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:188:18: warning: symbol 'qp_table_420_8bpc_max' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:209:18: warning: symbol 'qp_table_444_8bpc_min' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:250:18: warning: symbol 'qp_table_444_12bpc_min' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:315:18: warning: symbol 'qp_table_420_12bpc_min' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:348:18: warning: symbol 'qp_table_422_12bpc_min' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:389:18: warning: symbol 'qp_table_422_12bpc_max' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:430:18: warning: symbol 'qp_table_444_12bpc_max' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:495:18: warning: symbol 'qp_table_420_8bpc_min' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:516:18: warning: symbol 'qp_table_422_8bpc_min' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:541:18: warning: symbol 'qp_table_422_10bpc_max' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:574:16: warning: symbol 'qp_table_420_10bpc_max' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:601:18: warning: symbol 'qp_table_420_10bpc_min' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:628:18: warning: symbol 'qp_table_444_10bpc_max' was not declared. Should it be static? ./drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/qp_tables.h:681:18: warning: symbol 'qp_table_422_8bpc_max' was not declared. Should it be static? Signed-off-by: ruanjinjie <ruanjinjie@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24	drm/amdgpu: fix sdma doorbell init ordering on APUs	Alex Deucher
	Commit 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") uncovered a bug in amdgpu that required a reordering of the driver init sequence to avoid accessing a special register on the GPU before it was properly set up leading to an PCI AER error. This reordering uncovered a different hw programming ordering dependency in some APUs where the SDMA doorbells need to be programmed before the GFX doorbells. To fix this, move the SDMA doorbell programming back into the soc15 common code, but use the actual doorbell range values directly rather than the values stored in the ring structure since those will not be initialized at this point. This is a partial revert, but with the doorbell assignment fixed so the proper doorbell index is set before it's used. Fixes: e3163bc8ffdfdb ("drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega") Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: skhan@linuxfoundation.org
2022-10-24	drm/amd/display: Don't return false if no stream	Alvin Lee
	pipe_ctx[i] exists even if the pipe is not in use. If the pipe is not in use it will always have a null stream, so don't return false in this case. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>