summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2024-11-04block: pre-calculate max_zone_append_sectorsChristoph Hellwig
max_zone_append_sectors differs from all other queue limits in that the final value used is not stored in the queue_limits but needs to be obtained using queue_limits_max_zone_append_sectors helper. This not only adds (tiny) extra overhead to the I/O path, but also can be easily forgotten in file system code. Add a new max_hw_zone_append_sectors value to queue_limits which is set by the driver, and calculate max_zone_append_sectors from that and the other inputs in blk_validate_zoned_limits, similar to how max_sectors is calculated to fix this. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241104073955.112324-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-04drm/amd/pm: correct the workload settingKenneth Feng
Correct the workload setting in order not to mix the setting with the end user. Update the workload mask accordingly. v2: changes as below: 1. the end user can not erase the workload from driver except default workload. 2. always shows the real highest priority workoad to the end user. 3. the real workload mask is combined with driver workload mask and end user workload mask. v3: apply this to the other ASICs as well. v4: simplify the code v5: refine the code based on the review comments. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amdgpu: Add compatible NPS mode infoLijo Lazar
Populate the compatible NPS modes also for providing partition configuration details through sysfs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amdgpu: Skip IP coredump for RAS errorsLijo Lazar
For RAS errors, source of error is known. Skip the core dump of IP states. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amdgpu: Group gfx sysfs functionsLijo Lazar
Make amdgpu_gfx_sysfs_init/fini functions as common entry points for all gfx related sysfs nodes. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amdgpu: Add nps_mode in RAS init_flagCandice Li
Add nps_mode in RAS init_flag. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amdgpu: add amdgpu_sdma_sched_mask debugfsJesse Zhang
Userspace wants to run jobs on a specific sdma ring for verification purposes. This debugfs entry helps to disable or enable submitting jobs to a specific ring. This entry is populated only if there are at least two or more cores in the sdma ip. Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tim Huang <tim.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amdgpu: add amdgpu_gfx_sched_mask and amdgpu_compute_sched_mask debugfsJesse Zhang
compute/gfx may have multiple rings on some hardware. In some cases, userspace wants to run jobs on a specific ring for validation purposes. This debugfs entry helps to disable or enable submitting jobs to a specific ring. This entry is populated only if there are at least two or more cores in the gfx/compute ip. Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tim Huang <tim.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amdgpu: Fix dummy_read_page overlapping mappingsPrike Liang
Use the dma_map_page_attrs() with DMA_ATTR_SKIP_CPU_SYNC attribute setting to handle the dummy page overlapping mappings. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amdgpu: skip amdgpu_device_cache_pci_state under sriovVictor Zhao
Under sriov, host driver will save and restore vf pci cfg space during reset. And during device init, under sriov, pci_restore_state happens after fullaccess released, and it can have race condition with mmio protection enable from host side leading to missing interrupts. So skip amdgpu_device_cache_pci_state for sriov. Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amdkfd: Use dynamic allocation for CU occupancy array in ↵Srinivasan Shanmugam
'kfd_get_cu_occupancy()' The `kfd_get_cu_occupancy` function previously declared a large `cu_occupancy` array as a local variable, which could lead to stack overflows due to excessive stack usage. This commit replaces the static array allocation with dynamic memory allocation using `kcalloc`, thereby reducing the stack size. This change avoids the risk of stack overflows in kernel space, in scenarios where `AMDGPU_MAX_QUEUES` is large. The allocated memory is freed using `kfree` before the function returns to prevent memory leaks. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c: In function ‘kfd_get_cu_occupancy’: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:322:1: warning: the frame size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=] 322 | } | ^ Fixes: 6ae9e1aba97e ("drm/amdkfd: Update logic for CU occupancy calculations") Cc: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Suggested-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/pm: always pick the pptable from IFWIKenneth Feng
always pick the pptable from IFWI on smu v14.0.2/3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amdgpu: prevent NULL pointer dereference if ATIF is not supportedAntonio Quartulli
acpi_evaluate_object() may return AE_NOT_FOUND (failure), which would result in dereferencing buffer.pointer (obj) while being NULL. Although this case may be unrealistic for the current code, it is still better to protect against possible bugs. Bail out also when status is AE_NOT_FOUND. This fixes 1 FORWARD_NULL issue reported by Coverity Report: CID 1600951: Null pointer dereferences (FORWARD_NULL) Signed-off-by: Antonio Quartulli <antonio@mandelbit.com> Fixes: c9b7c809b89f ("drm/amd: Guard against bad data for ATIF ACPI method") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20241031152848.4716-1-antonio@mandelbit.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04can: mcp251xfd: mcp251xfd_get_tef_len(): fix length calculationMarc Kleine-Budde
Commit b8e0ddd36ce9 ("can: mcp251xfd: tef: prepare to workaround broken TEF FIFO tail index erratum") introduced mcp251xfd_get_tef_len() to get the number of unhandled transmit events from the Transmit Event FIFO (TEF). As the TEF has no head pointer, the driver uses the TX FIFO's tail pointer instead, assuming that send frames are completed. However the check for the TEF being full was not correct. This leads to the driver stop working if the TEF is full. Fix the TEF full check by assuming that if, from the driver's point of view, there are no free TX buffers in the chip and the TX FIFO is empty, all messages must have been sent and the TEF must therefore be full. Reported-by: Sven Schuchmann <schuchmann@schleissheimer.de> Closes: https://patch.msgid.link/FR3P281MB155216711EFF900AD9791B7ED9692@FR3P281MB1552.DEUP281.PROD.OUTLOOK.COM Fixes: b8e0ddd36ce9 ("can: mcp251xfd: tef: prepare to workaround broken TEF FIFO tail index erratum") Tested-by: Sven Schuchmann <schuchmann@schleissheimer.de> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20241104-mcp251xfd-fix-length-calculation-v3-1-608b6e7e2197@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04can: mcp251xfd: mcp251xfd_ring_alloc(): fix coalescing configuration when ↵Marc Kleine-Budde
switching CAN modes Since commit 50ea5449c563 ("can: mcp251xfd: fix ring configuration when switching from CAN-CC to CAN-FD mode"), the current ring and coalescing configuration is passed to can_ram_get_layout(). That fixed the issue when switching between CAN-CC and CAN-FD mode with configured ring (rx, tx) and/or coalescing parameters (rx-frames-irq, tx-frames-irq). However 50ea5449c563 ("can: mcp251xfd: fix ring configuration when switching from CAN-CC to CAN-FD mode"), introduced a regression when switching CAN modes with disabled coalescing configuration: Even if the previous CAN mode has no coalescing configured, the new mode is configured with active coalescing. This leads to delayed receiving of CAN-FD frames. This comes from the fact, that ethtool uses usecs = 0 and max_frames = 1 to disable coalescing, however the driver uses internally priv->{rx,tx}_obj_num_coalesce_irq = 0 to indicate disabled coalescing. Fix the regression by assigning struct ethtool_coalesce ec->{rx,tx}_max_coalesced_frames_irq = 1 if coalescing is disabled in the driver as can_ram_get_layout() expects this. Reported-by: https://github.com/vdh-robothania Closes: https://github.com/raspberrypi/linux/issues/6407 Fixes: 50ea5449c563 ("can: mcp251xfd: fix ring configuration when switching from CAN-CC to CAN-FD mode") Cc: stable@vger.kernel.org Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241025-mcp251xfd-fix-coalesing-v1-1-9d11416de1df@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04can: rockchip_canfd: Drop obsolete dependency on COMPILE_TESTJean Delvare
Since commit 0166dc11be91 ("of: make CONFIG_OF user selectable"), OF can be enabled on all architectures. Therefore depending on COMPILE_TEST as an alternative is no longer needed. Signed-off-by: Jean Delvare <jdelvare@suse.de> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://patch.msgid.link/20241022130439.70d016e9@endymion.delvare Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04can: rockchip_canfd: CAN_ROCKCHIP_CANFD should depend on ARCH_ROCKCHIPGeert Uytterhoeven
The Rockchip CAN-FD controller is only present on Rockchip SoCs. Hence add a dependency on ARCH_ROCKCHIP, to prevent asking the user about this driver when configuring a kernel without Rockchip platform support. Fixes: ff60bfbaf67f219c ("can: rockchip_canfd: add driver for Rockchip CAN-FD controller") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Link: https://patch.msgid.link/a4b3c8c1cca9515e67adac83af5ba1b1fab2fcbc.1727169288.git.geert+renesas@glider.be Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04can: c_can: fix {rx,tx}_errors statisticsDario Binacchi
The c_can_handle_bus_err() function was incorrectly incrementing only the receive error counter, even in cases of bit or acknowledgment errors that occur during transmission. The patch fixes the issue by incrementing the appropriate counter based on the type of error. Fixes: 881ff67ad450 ("can: c_can: Added support for Bosch C_CAN controller") Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Link: https://patch.msgid.link/20241014135319.2009782-1-dario.binacchi@amarulasolutions.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04can: m_can: m_can_close(): don't call free_irq() for IRQ-less devicesMarc Kleine-Budde
In commit b382380c0d2d ("can: m_can: Add hrtimer to generate software interrupt") support for IRQ-less devices was added. Instead of an interrupt, the interrupt routine is called by a hrtimer-based polling loop. That patch forgot to change free_irq() to be only called for devices with IRQs. Fix this, by calling free_irq() conditionally only if an IRQ is available for the device (and thus has been requested previously). Fixes: b382380c0d2d ("can: m_can: Add hrtimer to generate software interrupt") Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com> Link: https://patch.msgid.link/20240930-m_can-cleanups-v1-1-001c579cdee4@pengutronix.de Cc: <stable@vger.kernel.org> # v6.6+ Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04ACPI: battery: Check for error code from devm_mutex_init() callAndy Shevchenko
Even if it's not critical, the avoidance of checking the error code from devm_mutex_init() call today diminishes the point of using devm variant of it. Tomorrow it may even leak something. Add the missed check. Fixes: 0710c1ce5045 ("ACPI: battery: initialize mutexes through devm_ APIs") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20241030162754.2110946-1-andriy.shevchenko@linux.intel.com [ rjw: Added 2 empty code lines ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-11-04can: {cc770,sja1000}_isa: allow building on x86_64Thomas Mühlbacher
The ISA variable is only defined if X86_32 is also defined. However, these drivers are still useful and in use on at least some modern 64-bit x86 industrial systems as well. With the correct module parameters, they work as long as IO port communication is possible, despite their name having ISA in them. Fixes: a29689e60ed3 ("net: handle HAS_IOPORT dependencies") Signed-off-by: Thomas Mühlbacher <tmuehlbacher@posteo.net> Link: https://patch.msgid.link/20240919174151.15473-2-tmuehlbacher@posteo.net Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04drm/amd/display: 3.2.308Aric Cyr
This version brings along following fixes: - Prune Invalid Modes for HDMI Output - SPL Cleanup - Fix brightness level not retained over reboot - Remove inaccessible registers from DMU diagnostics Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/display: Prune Invalid Modes For HDMI OutputFangzhi Zuo
[Why] 1. HDMI does not have 6 bpc support. Having 6 bpc pass validation does not comply with spec. 2. Validate 420 only for native HDMI, but not apply to pcon use case. 3. Current mode validation log is not readable. [how] 1. Cap 8 bpc for dp-hdmi converter. 2. Validate yuv420 for pcon use case as well, if rgb/yuv444 8bpc cannot fit into pcon bw limitation of the link from the converter to HDMI sink. 3. Add readable pixel_format and color_depth into debug log. Reviewed-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/display: Implement new backlight_level_params structureKaitlyn Tse
[Why] Implement the new backlight_level_params structure as part of the VBAC framework, the information in this structure is needed to be passed down to the DMCUB to identify the backlight control type, to adjust the backlight of the panel and to perform any required conversions from PWM to nits or vice versa. [How] Modified existing functions to include the new backlight_level_params structure. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Kaitlyn Tse <Kaitlyn.Tse@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/display: [FW Promotion] Release 0.0.241.0Taimur Hassan
- Add DPCS health check - Update USB4 PHY SSC - Fix FAMS2 SubVP Close to VBlank changes - Create VESA Aux-based backlight control path - Fix PSR1 CRC error during CTS test Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04EDAC/ie31200: Add Kaby Lake-S dual-core host bridge IDJames Ye
Add device ID for dual-core Kaby Lake-S processors e.g. i3-7100. Signed-off-by: James Ye <jye836@gmail.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Jason Baron <jbaron@akamai.com> Link: https://lore.kernel.org/r/20240824120622.46226-1-jye836@gmail.com
2024-11-04drm/amd/display: Add a missing DCN401 reg definitionAurabindo Pillai
Add a mising reg field to the autogenerated header for future use Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/pm: add inst to dpm_set_vcn_enableBoyuan Zhang
Add an instance parameter to the existing function dpm_set_vcn_enable() for future implementation. Re-write all pptable functions accordingly. v2: Remove duplicated dpm_set_vcn_enable() functions in v1. Instead, adding instance parameter to existing functions. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04dm cache: fix potential out-of-bounds access on the first resumeMing-Hung Tsai
Out-of-bounds access occurs if the fast device is expanded unexpectedly before the first-time resume of the cache table. This happens because expanding the fast device requires reloading the cache table for cache_create to allocate new in-core data structures that fit the new size, and the check in cache_preresume is not performed during the first resume, leading to the issue. Reproduce steps: 1. prepare component devices: dmsetup create cmeta --table "0 8192 linear /dev/sdc 0" dmsetup create cdata --table "0 65536 linear /dev/sdc 8192" dmsetup create corig --table "0 524288 linear /dev/sdc 262144" dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct 2. load a cache table of 512 cache blocks, and deliberately expand the fast device before resuming the cache, making the in-core data structures inadequate. dmsetup create cache --notable dmsetup reload cache --table "0 524288 cache /dev/mapper/cmeta \ /dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0" dmsetup reload cdata --table "0 131072 linear /dev/sdc 8192" dmsetup resume cdata dmsetup resume cache 3. suspend the cache to write out the in-core dirty bitset and hint array, leading to out-of-bounds access to the dirty bitset at offset 0x40: dmsetup suspend cache KASAN reports: BUG: KASAN: vmalloc-out-of-bounds in is_dirty_callback+0x2b/0x80 Read of size 8 at addr ffffc90000085040 by task dmsetup/90 (...snip...) The buggy address belongs to the virtual mapping at [ffffc90000085000, ffffc90000087000) created by: cache_ctr+0x176a/0x35f0 (...snip...) Memory state around the buggy address: ffffc90000084f00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ffffc90000084f80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 >ffffc90000085000: 00 00 00 00 00 00 00 00 f8 f8 f8 f8 f8 f8 f8 f8 ^ ffffc90000085080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ffffc90000085100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 Fix by checking the size change on the first resume. Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com> Fixes: f494a9c6b1b6 ("dm cache: cache shrinking support") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Joe Thornber <thornber@redhat.com>
2024-11-04dm cache: optimize dirty bit checking with find_next_bit when resizingMing-Hung Tsai
When shrinking the fast device, dm-cache iteratively searches for a dirty bit among the cache blocks to be dropped, which is less efficient. Use find_next_bit instead, as it is twice as fast as the iterative approach with test_bit. Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com> Fixes: f494a9c6b1b6 ("dm cache: cache shrinking support") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Joe Thornber <thornber@redhat.com>
2024-11-04dm cache: fix out-of-bounds access to the dirty bitset when resizingMing-Hung Tsai
dm-cache checks the dirty bits of the cache blocks to be dropped when shrinking the fast device, but an index bug in bitset iteration causes out-of-bounds access. Reproduce steps: 1. create a cache device of 1024 cache blocks (128 bytes dirty bitset) dmsetup create cmeta --table "0 8192 linear /dev/sdc 0" dmsetup create cdata --table "0 131072 linear /dev/sdc 8192" dmsetup create corig --table "0 524288 linear /dev/sdc 262144" dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct dmsetup create cache --table "0 524288 cache /dev/mapper/cmeta \ /dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0" 2. shrink the fast device to 512 cache blocks, triggering out-of-bounds access to the dirty bitset (offset 0x80) dmsetup suspend cache dmsetup reload cdata --table "0 65536 linear /dev/sdc 8192" dmsetup resume cdata dmsetup resume cache KASAN reports: BUG: KASAN: vmalloc-out-of-bounds in cache_preresume+0x269/0x7b0 Read of size 8 at addr ffffc900000f3080 by task dmsetup/131 (...snip...) The buggy address belongs to the virtual mapping at [ffffc900000f3000, ffffc900000f5000) created by: cache_ctr+0x176a/0x35f0 (...snip...) Memory state around the buggy address: ffffc900000f2f80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ffffc900000f3000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffffc900000f3080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ^ ffffc900000f3100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ffffc900000f3180: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 Fix by making the index post-incremented. Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com> Fixes: f494a9c6b1b6 ("dm cache: cache shrinking support") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Joe Thornber <thornber@redhat.com>
2024-11-04dm cache: fix flushing uninitialized delayed_work on cache_ctr errorMing-Hung Tsai
An unexpected WARN_ON from flush_work() may occur when cache creation fails, caused by destroying the uninitialized delayed_work waker in the error path of cache_create(). For example, the warning appears on the superblock checksum error. Reproduce steps: dmsetup create cmeta --table "0 8192 linear /dev/sdc 0" dmsetup create cdata --table "0 65536 linear /dev/sdc 8192" dmsetup create corig --table "0 524288 linear /dev/sdc 262144" dd if=/dev/urandom of=/dev/mapper/cmeta bs=4k count=1 oflag=direct dmsetup create cache --table "0 524288 cache /dev/mapper/cmeta \ /dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0" Kernel logs: (snip) WARNING: CPU: 0 PID: 84 at kernel/workqueue.c:4178 __flush_work+0x5d4/0x890 Fix by pulling out the cancel_delayed_work_sync() from the constructor's error path. This patch doesn't affect the use-after-free fix for concurrent dm_resume and dm_destroy (commit 6a459d8edbdb ("dm cache: Fix UAF in destroy()")) as cache_dtr is not changed. Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com> Fixes: 6a459d8edbdb ("dm cache: Fix UAF in destroy()") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Joe Thornber <thornber@redhat.com>
2024-11-04dm cache: correct the number of origin blocks to match the target lengthMing-Hung Tsai
When creating a cache device, the actual size of the cache origin might be greater than the specified cache target length. In such case, the number of origin blocks should match the cache target length, not the full size of the origin device, since access beyond the cache target is not possible. This issue occurs when reducing the origin device size using lvm, as lvreduce preloads the new cache table before resuming the cache origin, which can result in incorrect sizes for the discard bitset and smq hotspot blocks. Reproduce steps: 1. create a cache device consists of 4096 origin blocks dmsetup create cmeta --table "0 8192 linear /dev/sdc 0" dmsetup create cdata --table "0 65536 linear /dev/sdc 8192" dmsetup create corig --table "0 524288 linear /dev/sdc 262144" dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct dmsetup create cache --table "0 524288 cache /dev/mapper/cmeta \ /dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0" 2. reduce the cache origin to 2048 oblocks, in lvreduce's approach dmsetup reload corig --table "0 262144 linear /dev/sdc 262144" dmsetup reload cache --table "0 262144 cache /dev/mapper/cmeta \ /dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0" dmsetup suspend cache dmsetup suspend corig dmsetup suspend cdata dmsetup suspend cmeta dmsetup resume corig dmsetup resume cdata dmsetup resume cmeta dmsetup resume cache 3. shutdown the cache, and check the number of discard blocks in superblock. The value is expected to be 2048, but actually is 4096. dmsetup remove cache corig cdata cmeta dd if=/dev/sdc bs=1c count=8 skip=224 2>/dev/null | hexdump -e '1/8 "%u\n"' Fix by correcting the origin_blocks initialization in cache_create and removing the unused origin_sectors from struct cache_args accordingly. Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com> Fixes: c6b4fcbad044 ("dm: add cache target") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Joe Thornber <thornber@redhat.com>
2024-11-04dm-verity: don't crash if panic_on_corruption is not selectedMikulas Patocka
If the user sets panic_on_error and doesn't set panic_on_corruption, dm-verity should not panic on data mismatch. But, currently it panics, because it treats data mismatch as I/O error. This commit fixes the logic so that if there is data mismatch and panic_on_corruption or restart_on_corruption is not selected, the system won't restart or panic. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Fixes: f811b83879fb ("dm-verity: introduce the options restart_on_error and panic_on_error")
2024-11-04drm/amd/display: Don't write DP_MSTM_CTRL after LTWayne Lin
[Why] Observe after suspend/resme, we can't light up mst monitors under specific mst hub. The reason is that driver still writes DPCD DP_MSTM_CTRL after LT. It's forbidden even we write the same value for that dpcd register. [How] We already resume the mst branch device dpcd settings during resume_mst_branch_status(). Leverage drm_dp_mst_topology_queue_probe() to only probe the topology, not calling drm_dp_mst_topology_mgr_resume() which will set DP_MSTM_CTRL as well. Reviewed-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/display: Minimize wait for pending updatesIlya Bakoulin
[Why/How] Move the wait for pending updates past prepare_bandwidth if the previous update was not a full update to reduce the average time it takes to complete a full update. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/display: parse umc_info or vram_info based on ASICAurabindo Pillai
An upstream bug report suggests that there are production dGPUs that are older than DCN401 but still have a umc_info in VBIOS tables with the same version as expected for a DCN401 product. Hence, reading this tables should be guarded with a version check. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3678 Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/display: Remove otg w/a toggling on HPO interfacesAusef Yousof
[why&how] Adjust otg w/a disable condition to include HPO explicitly rather than assuming it is implicitly used through DP2. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/display: Remove hw w/a toggle if on DP2/HPOAusef Yousof
[why&how] Applying a hw w/a only relevant to DIG FIFO causing corruption using HPO, do not apply the w/a if on DP2/HPO Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/display: SPL cleanupSamson Tam
[Why & How] Move from pointer to callback to reference callback directly Missed renaming fixpt functions with spl prefix Reviewed-by: Navid Assadian <Navid.Assadian@amd.com> Signed-off-by: Samson Tam <Samson.Tam@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/display: Fix brightness level not retained over rebootTom Chung
[Why] During boot up and resume the DC layer will reset the panel brightness to fix a flicker issue. It will cause the dm->actual_brightness is not the current panel brightness level. (the dm->brightness is the correct panel level) [How] Set the backlight level after do the set mode. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Fixes: d9e865826c20 ("drm/amd/display: Simplify brightness initialization") Reported-by: Mark Herbert <mark.herbert42@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3655 Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/display: fix asserts in SPL during bootupSamson Tam
[Why] During mode validation, there maybe modes that fail max_downscale_src_width check and scaling_quality taps are 0. This will cause an assert to trigger in spl_set_filters_data() because taps are 0. [How] Move taps calculation for non-adaptive scaling mode to separate function and call it if max_downscale_src_width fails. This will populate taps if scaling_quality taps are 0. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Samson Tam <Samson.Tam@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/display: Remove inaccessible registers from DMU diagnosticsNicholas Kazlauskas
[Why] SEC_CNTL isn't readable by x86 and can block Z8 entry if read. [How] Remove the read. Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/display: fix rxstatus_msg_sz type narrowingDominik Kaszewski
[Why] Code reading rxstatus message size was incorrectly assigning it to uint8_t, despite the value being 10 bits long (lower byte plus lowest 2 bits from upper byte). This caused the highest 2 bits to be ignored, potentially missing invalid values. [How] Change all local variables holding rxstatus message size from uint8_t to uint16_t, as in mod_hdcp_message_hdcp2::rx_id_list_size. Replaced untyped HDCP_2_2_HMID_RXSTATUS_MSG_SZ_HI macro with function hdcp_2_2_hmid_rxstatus_msg_sz(const uint8_t[2]) to encapsulate entire calculation and return a typed result. Removed spaces mixed with tabs to fix indentation on modified lines. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Dominik Kaszewski <dominik.kaszewski@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04dm-unstriped: cast an operand to sector_t to prevent potential uint32_t overflowZichen Xie
This was found by a static analyzer. There may be a potential integer overflow issue in unstripe_ctr(). uc->unstripe_offset and uc->unstripe_width are defined as "sector_t"(uint64_t), while uc->unstripe, uc->chunk_size and uc->stripes are all defined as "uint32_t". The result of the calculation will be limited to "uint32_t" without correct casting. So, we recommend adding an extra cast to prevent potential integer overflow. Fixes: 18a5bf270532 ("dm: add unstriped target") Signed-off-by: Zichen Xie <zichenxie0106@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org
2024-11-04Revert "drm/amd/display: Block UHBR Based On USB-C PD Cable ID"Ausef Yousof
This reverts commit 4f01a68751194d05280d659a65758c09e4af04d6. [why & how] The offending commit caused a lighting issue for Samsung Odyssey G9 monitors when connecting via USB-C. The commit was intended to block certain UHBR rates. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/display: avoid divided by zeroCharlene Liu
[why] insert divided by zero protection Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/display: Do Not Fallback To SW Cursor If HW Cursor RequiredAustin Zheng
[Why/How] Tearing can occur if there is a flip immediate plane and SW cursor. check_subvp_sw_cursor_fallback_req falls back to SW cursor if the stream has the potential to use subVP. Check for fallback not needed if HW cursor is required. e.g. Fullscreen gaming Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amdgpu: use string choice helpersR Sundar
Use string choice helpers for better readability. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@inria.fr> Closes: https://lore.kernel.org/r/202410161814.I6p2Nnux-lkp@intel.com/ Signed-off-by: R Sundar <prosunofficial@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amdgpu: fix comment about amdgpu.abmlevel defaultsjeffbai@aosc.io
Since 040fdcde288a ("drm/amdgpu: respect the abmlevel module parameter value if it is set"), the default value for amdgpu.abmlevel was set to -1, or auto. However, the comment explaining the default value was not updated to reflect the change (-1, or auto; not -1, or disabled). Clarify that the default value (-1) means auto. Fixes: 040fdcde288a ("drm/amdgpu: respect the abmlevel module parameter value if it is set") Reported-by: Ruikai Liu <rickliu2000@outlook.com> Signed-off-by: Mingcong Bai <jeffbai@aosc.io> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>