summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2024-03-20drm/amdgpu: Skip access PF-only registers on gfx10/gfxhub2_1 under SRIOVZhenGuo Yin
[Why] RLCG interface returns "out-of-range" error under SRIOV VF when accessing PF-only registers. [How] Skip access PF-only registers on gfx10/gfxhub2_1 under SRIOV. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20drm/amdgpu: Init zone device and drm client after mode-1 reset on reloadAhmad Rehman
In passthrough environment, when amdgpu is reloaded after unload, mode-1 is triggered after initializing the necessary IPs, That init does not include KFD, and KFD init waits until the reset is completed. KFD init is called in the reset handler, but in this case, the zone device and drm client is not initialized, causing app to create kernel panic. v2: Removing the init KFD condition from amdgpu_amdkfd_drm_client_create. As the previous version has the potential of creating DRM client twice. v3: v2 patch results in SDMA engine hung as DRM open causes VM clear to SDMA before SDMA init. Adding the condition to in drm client creation, on top of v1, to guard against drm client creation call multiple times. Signed-off-by: Ahmad Rehman <Ahmad.Rehman@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20drm/amdgpu: amdgpu_ttm_gart_bind set gtt bound flagPhilip Yang
Otherwise after the GTT bo is released, the GTT and gart space is freed but amdgpu_ttm_backend_unbind will not clear the gart page table entry and leave valid mapping entry pointing to the stale system page. Then if GPU access the gart address mistakely, it will read undefined value instead page fault, harder to debug and reproduce the real issue. Cc: stable@vger.kernel.org Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20drm/amdgpu/vcn: enable vcn1 fw load for VCN 4_0_6Saleemkhan Jamadar
v1 - update the fw header for each vcn instance (Veera) VCN1 has different FW binary in VCN v4_0_6. Add changes to load the VCN1 fw binary Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20drm/amd/display: Enable DML2 debug flagsAurabindo Pillai
[WHY & HOW] Enable DML2 related debug config options in DM for testing purposes. Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20drm/amd/display: Change default size for dummy plane in DML2Swapnil Patel
[WHY & HOW] Currently, to map dc states into dml_display_cfg, We create a dummy plane if the stream doesn't have any planes attached to it. This dummy plane uses max addersable width height. This results in certain mode validations failing when they shouldn't. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Swapnil Patel <swapnil.patel@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20drm/amdgpu: Reset IH OVERFLOW_EN bit for IH 7.0Friedrich Vock
IH 7.0 support landed shortly after the original patch for resetting the bit on all other generations, but without that patch applied. Fixes: 12443fc53e7d ("drm/amdgpu: Add ih v7_0 ip block support") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20drm/amdgpu: fix mmhub client id out-of-bounds accessLang Yu
Properly handle cid 0x140. Fixes: aba2be41470a ("drm/amdgpu: add mmhub 3.3.0 support") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20drm/amdgpu: fix use-after-free bugVitaly Prosyak
The bug can be triggered by sending a single amdgpu_gem_userptr_ioctl to the AMDGPU DRM driver on any ASICs with an invalid address and size. The bug was reported by Joonkyo Jung <joonkyoj@yonsei.ac.kr>. For example the following code: static void Syzkaller1(int fd) { struct drm_amdgpu_gem_userptr arg; int ret; arg.addr = 0xffffffffffff0000; arg.size = 0x80000000; /*2 Gb*/ arg.flags = 0x7; ret = drmIoctl(fd, 0xc1186451/*amdgpu_gem_userptr_ioctl*/, &arg); } Due to the address and size are not valid there is a failure in amdgpu_hmm_register->mmu_interval_notifier_insert->__mmu_interval_notifier_insert-> check_shl_overflow, but we even the amdgpu_hmm_register failure we still call amdgpu_hmm_unregister into amdgpu_gem_object_free which causes access to a bad address. The following stack is below when the issue is reproduced when Kazan is enabled: [ +0.000014] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020 [ +0.000009] RIP: 0010:mmu_interval_notifier_remove+0x327/0x340 [ +0.000017] Code: ff ff 49 89 44 24 08 48 b8 00 01 00 00 00 00 ad de 4c 89 f7 49 89 47 40 48 83 c0 22 49 89 47 48 e8 ce d1 2d 01 e9 32 ff ff ff <0f> 0b e9 16 ff ff ff 4c 89 ef e8 fa 14 b3 ff e9 36 ff ff ff e8 80 [ +0.000014] RSP: 0018:ffffc90002657988 EFLAGS: 00010246 [ +0.000013] RAX: 0000000000000000 RBX: 1ffff920004caf35 RCX: ffffffff8160565b [ +0.000011] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff8881a9f78260 [ +0.000010] RBP: ffffc90002657a70 R08: 0000000000000001 R09: fffff520004caf25 [ +0.000010] R10: 0000000000000003 R11: ffffffff8161d1d6 R12: ffff88810e988c00 [ +0.000010] R13: ffff888126fb5a00 R14: ffff88810e988c0c R15: ffff8881a9f78260 [ +0.000011] FS: 00007ff9ec848540(0000) GS:ffff8883cc880000(0000) knlGS:0000000000000000 [ +0.000012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000010] CR2: 000055b3f7e14328 CR3: 00000001b5770000 CR4: 0000000000350ef0 [ +0.000010] Call Trace: [ +0.000006] <TASK> [ +0.000007] ? show_regs+0x6a/0x80 [ +0.000018] ? __warn+0xa5/0x1b0 [ +0.000019] ? mmu_interval_notifier_remove+0x327/0x340 [ +0.000018] ? report_bug+0x24a/0x290 [ +0.000022] ? handle_bug+0x46/0x90 [ +0.000015] ? exc_invalid_op+0x19/0x50 [ +0.000016] ? asm_exc_invalid_op+0x1b/0x20 [ +0.000017] ? kasan_save_stack+0x26/0x50 [ +0.000017] ? mmu_interval_notifier_remove+0x23b/0x340 [ +0.000019] ? mmu_interval_notifier_remove+0x327/0x340 [ +0.000019] ? mmu_interval_notifier_remove+0x23b/0x340 [ +0.000020] ? __pfx_mmu_interval_notifier_remove+0x10/0x10 [ +0.000017] ? kasan_save_alloc_info+0x1e/0x30 [ +0.000018] ? srso_return_thunk+0x5/0x5f [ +0.000014] ? __kasan_kmalloc+0xb1/0xc0 [ +0.000018] ? srso_return_thunk+0x5/0x5f [ +0.000013] ? __kasan_check_read+0x11/0x20 [ +0.000020] amdgpu_hmm_unregister+0x34/0x50 [amdgpu] [ +0.004695] amdgpu_gem_object_free+0x66/0xa0 [amdgpu] [ +0.004534] ? __pfx_amdgpu_gem_object_free+0x10/0x10 [amdgpu] [ +0.004291] ? do_syscall_64+0x5f/0xe0 [ +0.000023] ? srso_return_thunk+0x5/0x5f [ +0.000017] drm_gem_object_free+0x3b/0x50 [drm] [ +0.000489] amdgpu_gem_userptr_ioctl+0x306/0x500 [amdgpu] [ +0.004295] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu] [ +0.004270] ? srso_return_thunk+0x5/0x5f [ +0.000014] ? __this_cpu_preempt_check+0x13/0x20 [ +0.000015] ? srso_return_thunk+0x5/0x5f [ +0.000013] ? sysvec_apic_timer_interrupt+0x57/0xc0 [ +0.000020] ? srso_return_thunk+0x5/0x5f [ +0.000014] ? asm_sysvec_apic_timer_interrupt+0x1b/0x20 [ +0.000022] ? drm_ioctl_kernel+0x17b/0x1f0 [drm] [ +0.000496] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu] [ +0.004272] ? drm_ioctl_kernel+0x190/0x1f0 [drm] [ +0.000492] drm_ioctl_kernel+0x140/0x1f0 [drm] [ +0.000497] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu] [ +0.004297] ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm] [ +0.000489] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? __kasan_check_write+0x14/0x20 [ +0.000016] drm_ioctl+0x3da/0x730 [drm] [ +0.000475] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu] [ +0.004293] ? __pfx_drm_ioctl+0x10/0x10 [drm] [ +0.000506] ? __pfx_rpm_resume+0x10/0x10 [ +0.000016] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? __kasan_check_write+0x14/0x20 [ +0.000010] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? _raw_spin_lock_irqsave+0x99/0x100 [ +0.000015] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ +0.000014] ? srso_return_thunk+0x5/0x5f [ +0.000013] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? preempt_count_sub+0x18/0xc0 [ +0.000013] ? srso_return_thunk+0x5/0x5f [ +0.000010] ? _raw_spin_unlock_irqrestore+0x27/0x50 [ +0.000019] amdgpu_drm_ioctl+0x7e/0xe0 [amdgpu] [ +0.004272] __x64_sys_ioctl+0xcd/0x110 [ +0.000020] do_syscall_64+0x5f/0xe0 [ +0.000021] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ +0.000015] RIP: 0033:0x7ff9ed31a94f [ +0.000012] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00 [ +0.000013] RSP: 002b:00007fff25f66790 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ +0.000016] RAX: ffffffffffffffda RBX: 000055b3f7e133e0 RCX: 00007ff9ed31a94f [ +0.000012] RDX: 000055b3f7e133e0 RSI: 00000000c1186451 RDI: 0000000000000003 [ +0.000010] RBP: 00000000c1186451 R08: 0000000000000000 R09: 0000000000000000 [ +0.000009] R10: 0000000000000008 R11: 0000000000000246 R12: 00007fff25f66ca8 [ +0.000009] R13: 0000000000000003 R14: 000055b3f7021ba8 R15: 00007ff9ed7af040 [ +0.000024] </TASK> [ +0.000007] ---[ end trace 0000000000000000 ]--- v2: Consolidate any error handling into amdgpu_hmm_register which applied to kfd_bo also. (Christian) v3: Improve syntax and comment (Christian) Cc: Christian Koenig <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Joonkyo Jung <joonkyoj@yonsei.ac.kr> Cc: Dokyung Song <dokyungs@yonsei.ac.kr> Cc: <jisoo.jang@yonsei.ac.kr> Cc: <yw9865@yonsei.ac.kr> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20drm/amdgpu: Handle duplicate BOs during process restoreMukul Joshi
In certain situations, some apps can import a BO multiple times (through IPC for example). To restore such processes successfully, we need to tell drm to ignore duplicate BOs. While at it, also add additional logging to prevent silent failures when process restore fails. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20drm/amd/display: Use freesync when `DRM_EDID_FEATURE_CONTINUOUS_FREQ` foundMario Limonciello
The monitor shipped with the Framework 16 supports VRR [1], but it's not being advertised. This is because the detailed timing block doesn't contain `EDID_DETAIL_MONITOR_RANGE` which amdgpu looks for to find min and max frequencies. This check however is superfluous for this case because update_display_info() calls drm_get_monitor_range() to get these ranges already. So if the `DRM_EDID_FEATURE_CONTINUOUS_FREQ` EDID feature is found then turn on freesync without extra checks. v2: squash in fix from Harry Closes: https://www.reddit.com/r/framework/comments/1b4y2i5/no_variable_refresh_rate_on_the_framework_16_on/ Closes: https://www.reddit.com/r/framework/comments/1b6vzcy/framework_16_variable_refresh_rate/ Closes: https://community.frame.work/t/resolved-no-vrr-freesync-with-amd-version/42338 Link: https://gist.github.com/superm1/e8fbacfa4d0f53150231d3a3e0a13faf Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20Merge patch series "RISC-V: ACPI: Enable CPPC based cpufreq support"Palmer Dabbelt
Sunil V L <sunilvl@ventanamicro.com> says: This series enables the support for "Collaborative Processor Performance Control (CPPC) on ACPI based RISC-V platforms. It depends on the encoding of CPPC registers as defined in RISC-V FFH spec [2]. CPPC is described in the ACPI spec [1]. RISC-V FFH spec required to enable this, is available at [2]. [1] - https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#collaborative-processor-performance-control [2] - https://github.com/riscv-non-isa/riscv-acpi-ffh/releases/download/v1.0.0/riscv-ffh.pdf * b4-shazam-merge: RISC-V: defconfig: Enable CONFIG_ACPI_CPPC_CPUFREQ cpufreq: Move CPPC configs to common Kconfig and add RISC-V ACPI: RISC-V: Add CPPC driver Link: https://lore.kernel.org/r/20240208034414.22579-1-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20Merge patch series "RISC-V: ACPI: Add LPI support"Palmer Dabbelt
Sunil V L <sunilvl@ventanamicro.com> says: This series adds support for Low Power Idle (LPI) on ACPI based platforms. LPI is described in the ACPI spec [1]. RISC-V FFH spec required to enable this is available at [2]. [1] - https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#lpi-low-power-idle-states [2] - https://github.com/riscv-non-isa/riscv-acpi-ffh/releases/download/v/riscv-ffh.pdf * b4-shazam-merge: ACPI: Enable ACPI_PROCESSOR for RISC-V ACPI: RISC-V: Add LPI driver cpuidle: RISC-V: Move few functions to arch/riscv Link: https://lore.kernel.org/r/20240118062930.245937-1-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20drm/i915: Rename ICL_PORT_TX_DW6 bitsVille Syrjälä
Our definitions for bit 7 and bit 0 of ICL_PORT_TX_DW6 are swapped. Functionally it doesn't matter as we always set both bits, but let's rename the bits to match bspec 100%. And while at it, add the definition for bits 1-6 as well, just to have it all fully documented. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240308072400.28918-1-ville.syrjala@linux.intel.com Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-20octeontx2-af: Use separate handlers for interruptsSubbaraya Sundeep
For PF to AF interrupt vector and VF to AF vector same interrupt handler is registered which is causing race condition. When two interrupts are raised to two CPUs at same time then two cores serve same event corrupting the data. Fixes: 7304ac4567bc ("octeontx2-af: Add mailbox IRQ and msg handlers") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-20octeontx2-pf: Send UP messages to VF only when VF is up.Subbaraya Sundeep
When PF sending link status messages to VF, it is possible that by the time link_event_task work function is executed VF might have brought down. Hence before sending VF link status message check whether VF is up to receive it. Fixes: ad513ed938c9 ("octeontx2-vf: Link event notification support") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-20octeontx2-pf: Use default max_active works instead of oneSubbaraya Sundeep
Only one execution context for the workqueue used for PF and VFs mailbox communication is incorrect since multiple works are queued simultaneously by all the VFs and PF link UP messages. Hence use default number of execution contexts by passing zero as max_active to alloc_workqueue function. With this fix in place, modify UP messages also to wait until completion. Fixes: d424b6c02415 ("octeontx2-pf: Enable SRIOV and added VF mbox handling") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-20octeontx2-pf: Wait till detach_resources msg is completeSubbaraya Sundeep
During VF driver remove, a message is sent to detach VF resources to PF but VF is not waiting until message is complete. Also mailbox interrupts need to be turned off after the detach resource message is complete. This patch fixes that problem. Fixes: 05fcc9e08955 ("octeontx2-pf: Attach NIX and NPA block LFs") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-20octeontx2: Detect the mbox up or down message via registerSubbaraya Sundeep
A single line of interrupt is used to receive up notifications and down reply messages from AF to PF (similarly from PF to its VF). PF acts as bridge and forwards VF messages to AF and sends respsones back from AF to VF. When an async event like link event is received by up message when PF is in middle of forwarding VF message then mailbox errors occur because PF state machine is corrupted. Since VF is a separate driver or VF driver can be in a VM it is not possible to serialize from the start of communication at VF. Hence to differentiate between type of messages at PF this patch makes sender to set mbox data register with distinct values for up and down messages. Sender also checks whether previous interrupt is received before triggering current interrupt by waiting for mailbox data register to become zero. Fixes: 5a6d7c9daef3 ("octeontx2-pf: Mailbox communication with AF") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-20dma-buf: Fix NULL pointer dereference in sanitycheck()Pavel Sakharov
If due to a memory allocation failure mock_chain() returns NULL, it is passed to dma_fence_enable_sw_signaling() resulting in NULL pointer dereference there. Call dma_fence_enable_sw_signaling() only if mock_chain() succeeds. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: d62c43a953ce ("dma-buf: Enable signaling on fence for selftests") Signed-off-by: Pavel Sakharov <p.sakharov@ispras.ru> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240319231527.1821372-1-p.sakharov@ispras.ru
2024-03-20i2c: muxes: pca954x: Allow sharing reset GPIOChris Packham
Some hardware designs with multiple PCA954x devices use a reset GPIO connected to all the muxes. Support this configuration by making use of the reset controller framework which can deal with the shared reset GPIOs. Fall back to the old GPIO descriptor method if the reset controller framework is not enabled. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Acked-by: Peter Rosin <peda@axentia.se> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2024-03-20Merge tag 'i2c-host-6.9-part2' of ↵Wolfram Sang
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-mergewindow Théo adds support for the Mobileye EyeQ5-I2C in the bindings. This patch is followed by eight commits featuring improvements to the Nomadik controller, such as simplification of the IRQ logic, renaming of the private data structure, more efficient use of FIELD_PREP/GET, GENMASK, etc., better time measurement with ktime, and more.
2024-03-20drm/i915/scaler: Update Pipe src size check in skl_update_scalerAnkit Nautiyal
For Earlier platforms, the Pipe source size is 12-bits so max pipe source width and height is 4096. For newer platforms it is 13-bits so theoretically max width/height is 8192. For few of the earlier platforms the scaler did not use all bits of the PIPESRC, so max scaler source size was used to make that the pipe source size is programmed within limits, before using scaler. This creates a problem, for MTL where scaler source size is 4096, but max pipe source width can theroretically be 8192. Switch the check to use the max scaler destination size, which closely match the limits. Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240313143825.3461208-1-ankit.k.nautiyal@intel.com
2024-03-19net/bnx2x: Prevent access to a freed page in page_poolThinh Tran
Fix race condition leading to system crash during EEH error handling During EEH error recovery, the bnx2x driver's transmit timeout logic could cause a race condition when handling reset tasks. The bnx2x_tx_timeout() schedules reset tasks via bnx2x_sp_rtnl_task(), which ultimately leads to bnx2x_nic_unload(). In bnx2x_nic_unload() SGEs are freed using bnx2x_free_rx_sge_range(). However, this could overlap with the EEH driver's attempt to reset the device using bnx2x_io_slot_reset(), which also tries to free SGEs. This race condition can result in system crashes due to accessing freed memory locations in bnx2x_free_rx_sge() 799 static inline void bnx2x_free_rx_sge(struct bnx2x *bp, 800 struct bnx2x_fastpath *fp, u16 index) 801 { 802 struct sw_rx_page *sw_buf = &fp->rx_page_ring[index]; 803 struct page *page = sw_buf->page; .... where sw_buf was set to NULL after the call to dma_unmap_page() by the preceding thread. EEH: Beginning: 'slot_reset' PCI 0011:01:00.0#10000: EEH: Invoking bnx2x->slot_reset() bnx2x: [bnx2x_io_slot_reset:14228(eth1)]IO slot reset initializing... bnx2x 0011:01:00.0: enabling device (0140 -> 0142) bnx2x: [bnx2x_io_slot_reset:14244(eth1)]IO slot reset --> driver unload Kernel attempted to read user page (0) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc0080000025065fc Oops: Kernel access of bad area, sig: 11 [#1] ..... Call Trace: [c000000003c67a20] [c00800000250658c] bnx2x_io_slot_reset+0x204/0x610 [bnx2x] (unreliable) [c000000003c67af0] [c0000000000518a8] eeh_report_reset+0xb8/0xf0 [c000000003c67b60] [c000000000052130] eeh_pe_report+0x180/0x550 [c000000003c67c70] [c00000000005318c] eeh_handle_normal_event+0x84c/0xa60 [c000000003c67d50] [c000000000053a84] eeh_event_handler+0xf4/0x170 [c000000003c67da0] [c000000000194c58] kthread+0x1c8/0x1d0 [c000000003c67e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64 To solve this issue, we need to verify page pool allocations before freeing. Fixes: 4cace675d687 ("bnx2x: Alloc 4k fragment for each rx ring buffer element") Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240315205535.1321-1-thinhtr@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-19cpufreq: Move CPPC configs to common Kconfig and add RISC-VSunil V L
CPPC related config options are currently defined only in ARM specific file. However, they are required for RISC-V as well. Instead of creating a new Kconfig.riscv file and duplicating them, move them to the common Kconfig file and enable RISC-V too. Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Pierre Gondois <pierre.gondois@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20240208034414.22579-3-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-19ACPI: RISC-V: Add CPPC driverSunil V L
Add cpufreq driver based on ACPI CPPC for RISC-V. The driver uses either SBI CPPC interfaces or the CSRs to access the CPPC registers as defined by the RISC-V FFH spec. Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Reviewed-by: Pierre Gondois <pierre.gondois@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20240208034414.22579-2-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-19ACPI: Enable ACPI_PROCESSOR for RISC-VSunil V L
The ACPI processor driver is not currently enabled for RISC-V. This is required to enable CPU related functionalities like LPI and CPPC. Hence, enable ACPI_PROCESSOR for RISC-V. Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20240118062930.245937-4-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-19ACPI: RISC-V: Add LPI driverSunil V L
Enable Low Power Idle (LPI) based cpuidle driver for RISC-V platforms. It depends on SBI HSM calls for idle state transitions. Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20240118062930.245937-3-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-19cpuidle: RISC-V: Move few functions to arch/riscvSunil V L
To support ACPI Low Power Idle (LPI), few functions are required which are currently static functions in the DT based cpuidle driver. Hence, move them under arch/riscv so that ACPI driver also can use them. Since they are no longer static functions, append "riscv_" prefix to the function name. Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20240118062930.245937-2-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-19perf: starfive: fix 64-bit only COMPILE_TEST conditionConor Dooley
ARCH_STARFIVE is not restricted to 64-bit platforms, so while Will's addition of a 64-bit only condition satisfied the build robots doing COMPILE_TEST builds, Palmer ran into the same problems with writeq() being undefined during regular rv32 builds. Promote the dependency on 64-bit to its own `depends on` so that the driver can never be included in 32-bit builds. Reported-by: Palmer Dabbelt <palmer@rivosinc.com> Fixes: c2b24812f7bc ("perf: starfive: Add StarLink PMU support") Fixes: f0dbc6d0de38 ("perf: starfive: Only allow COMPILE_TEST for 64-bit architectures") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Acked-by: Ji Sheng Teoh <jisheng.teoh@starfivetech.com> Acked-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Link: https://lore.kernel.org/r/20240318-emphatic-rally-f177a4fe1bdc@spud Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-03-19Merge tag 'soc-late-6.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull more ARM SoC updates from Arnd Bergmann: "These are changes that for some reason ended up not making it into the first four branches but that should still make it into 6.9: - A rework of the omap clock support that touches both drivers and device tree files - The reset controller branch changes that had a dependency on late bugfixes. Merging them here avoids a backmerge of 6.8-rc5 into the drivers branch - The RISC-V/starfive, RISC-V/microchip and ARM/Broadcom devicetree changes that got delayed and needed some extra time in linux-next for wider testing" * tag 'soc-late-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (31 commits) soc: fsl: dpio: fix kcalloc() argument order bus: ts-nbus: Improve error reporting bus: ts-nbus: Convert to atomic pwm API riscv: dts: starfive: jh7110: Add camera subsystem nodes ARM: bcm: stop selecing CONFIG_TICK_ONESHOT ARM: dts: omap3: Update clksel clocks to use reg instead of ti,bit-shift ARM: dts: am3: Update clksel clocks to use reg instead of ti,bit-shift clk: ti: Improve clksel clock bit parsing for reg property clk: ti: Handle possible address in the node name dt-bindings: pwm: opencores: Add compatible for StarFive JH8100 dt-bindings: riscv: cpus: reg matches hart ID reset: Instantiate reset GPIO controller for shared reset-gpios reset: gpio: Add GPIO-based reset controller cpufreq: do not open-code of_phandle_args_equal() of: Add of_phandle_args_equal() helper reset: simple: add support for Sophgo SG2042 dt-bindings: reset: sophgo: support SG2042 riscv: dts: microchip: add specific compatible for mpfs pdma riscv: dts: microchip: add missing CAN bus clocks ARM: brcmstb: Add debug UART entry for 74165 ...
2024-03-19Merge tag 's390-6.9-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: - Various virtual vs physical address usage fixes - Add new bitwise types and helper functions and use them in s390 specific drivers and code to make it easier to find virtual vs physical address usage bugs. Right now virtual and physical addresses are identical for s390, except for module, vmalloc, and similar areas. This will be changed, hopefully with the next merge window, so that e.g. the kernel image and modules will be located close to each other, allowing for direct branches and also for some other simplifications. As a prerequisite this requires to fix all misuses of virtual and physical addresses. As it turned out people are so used to the concept that virtual and physical addresses are the same, that new bugs got added to code which was already fixed. In order to avoid that even more code gets merged which adds such bugs add and use new bitwise types, so that sparse can be used to find such usage bugs. Most likely the new types can go away again after some time - Provide a simple ARCH_HAS_DEBUG_VIRTUAL implementation - Fix kprobe branch handling: if an out-of-line single stepped relative branch instruction has a target address within a certain address area in the entry code, the program check handler may incorrectly execute cleanup code as if KVM code was executed, leading to crashes - Fix reference counting of zcrypt card objects - Various other small fixes and cleanups * tag 's390-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (41 commits) s390/entry: compare gmap asce to determine guest/host fault s390/entry: remove OUTSIDE macro s390/entry: add CIF_SIE flag and remove sie64a() address check s390/cio: use while (i--) pattern to clean up s390/raw3270: make class3270 constant s390/raw3270: improve raw3270_init() readability s390/tape: make tape_class constant s390/vmlogrdr: make vmlogrdr_class constant s390/vmur: make vmur_class constant s390/zcrypt: make zcrypt_class constant s390/mm: provide simple ARCH_HAS_DEBUG_VIRTUAL support s390/vfio_ccw_cp: use new address translation helpers s390/iucv: use new address translation helpers s390/ctcm: use new address translation helpers s390/lcs: use new address translation helpers s390/qeth: use new address translation helpers s390/zfcp: use new address translation helpers s390/tape: fix virtual vs physical address confusion s390/3270: use new address translation helpers s390/3215: use new address translation helpers ...
2024-03-19Merge tag 'pm-6.9-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These update the Energy Model to make it prevent errors due to power unit mismatches, fix a typo in power management documentation, convert one driver to using a platform remove callback returning void, address two cpufreq issues (one in the core and one in the DT driver), and enable boost support in the SCMI cpufreq driver. Specifics: - Modify the Energy Model code to bail out and complain if the unit of power is not uW to prevent errors due to unit mismatches (Lukasz Luba) - Make the intel_rapl platform driver use a remove callback returning void (Uwe Kleine-König) - Fix typo in the suspend and interrupts document (Saravana Kannan) - Make per-policy boost flags actually take effect on platforms using cpufreq_boost_set_sw() (Sibi Sankar) - Enable boost support in the SCMI cpufreq driver (Sibi Sankar) - Make the DT cpufreq driver use zalloc_cpumask_var() for allocating cpumasks to avoid using unitinialized memory (Marek Szyprowski)" * tag 'pm-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: scmi: Enable boost support firmware: arm_scmi: Add support for marking certain frequencies as turbo cpufreq: dt: always allocate zeroed cpumask cpufreq: Fix per-policy boost behavior on SoCs using cpufreq_boost_set_sw() Documentation: power: Fix typo in suspend and interrupts doc PM: EM: Force device drivers to provide power in uW powercap: intel_rapl: Convert to platform remove callback returning void
2024-03-19Merge tag 'acpi-6.9-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more ACPI updates from Rafael Wysocki: "These update ACPI documentation and kerneldoc comments. Specifics: - Add markup to generate links from footnotes in the ACPI enumeration document (Chris Packham) - Update the handle_eject_request() kerneldoc comment to document the arguments of the function and improve kerneldoc comments for ACPI suspend and hibernation functions (Yang Li)" * tag 'acpi-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: PM: Improve kerneldoc comments for suspend and hibernation functions ACPI: docs: enumeration: Make footnotes links ACPI: Document handle_eject_request() arguments
2024-03-19Merge tag 'thermal-6.9-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more thermal control updates from Rafael Wysocki: "These update thermal drivers for ARM platforms by adding new hardware support (r8a779h0, H616 THS), addressing issues (Mediatek LVTS, Mediatek MT7896, thermal-of) and cleaning up code. Specifics: - Fix memory leak in the error path at probe time in the Mediatek LVTS driver (Christophe Jaillet) - Fix control buffer enablement regression on Meditek MT7896 (Frank Wunderlich) - Drop spaces before TABs in different places: thermal-of, ST drivers and Makefile (Geert Uytterhoeven) - Adjust DT binding for NXP as fsl,tmu-range min/maxItems can vary among several SoC versions (Fabio Estevam) - Add support for the H616 THS controller on Sun8i platforms (Martin Botka) - Don't fail probe due to zone registration failure because there is no trip points defined in the DT (Mark Brown) - Support variable TMU array size for new platforms (Peng Fan) - Adjust the DT binding for thermal-of and make the polling time not required and assume it is zero when not found in the DT (Konrad Dybcio) - Add r8a779h0 support in both the DT and the rcar_gen3 driver (Geert Uytterhoeven)" * tag 'thermal-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal/drivers/rcar_gen3: Add support for R-Car V4M dt-bindings: thermal: rcar-gen3-thermal: Add r8a779h0 support thermal/of: Assume polling-delay(-passive) 0 when absent dt-bindings: thermal-zones: Don't require polling-delay(-passive) thermal/drivers/qoriq: Fix getting tmu range thermal/drivers/sun8i: Don't fail probe due to zone registration failure thermal/drivers/sun8i: Add support for H616 THS controller thermal/drivers/sun8i: Add SRAM register access code thermal/drivers/sun8i: Extend H6 calibration to support 4 sensors thermal/drivers/sun8i: Explain unknown H6 register value dt-bindings: thermal: sun8i: Add H616 THS controller soc: sunxi: sram: export register 0 for THS on H616 dt-bindings: thermal: qoriq-thermal: Adjust fsl,tmu-range min/maxItems thermal: Drop spaces before TABs thermal/drivers/mediatek: Fix control buffer enablement on MT7896 thermal/drivers/mediatek/lvts_thermal: Fix a memory leak in an error handling path
2024-03-19Merge tag 'ata-6.9-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fix from Niklas Cassel: "A single fix for ASMedia HBAs. These HBAs do not indicate that they support SATA Port Multipliers CAP.SPM (Supports Port Multiplier) is not set. Likewise, they do not allow you to probe the devices behind an attached PMP, as defined according to the SATA-IO PMP specification. Instead, they have decided to implement their own version of PMP, and because of this, plugging in a PMP actually works, even if the HBA claims that it does not support PMP. Revert a recent quirk for these HBAs, as that breaks ASMedia's own implementation of PMP. Unfortunately, this will once again give some users of these HBAs significantly increased boot time. However, a longer boot time for some, is the lesser evil compared to some other users not being able to detect their drives at all" * tag 'ata-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ahci: asm1064: asm1166: don't limit reported ports
2024-03-19Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio updates from Michael Tsirkin: - Per vq sizes in vdpa - Info query for block devices support in vdpa - DMA sync callbacks in vduse - Fixes, cleanups * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (35 commits) virtio_net: rename free_old_xmit_skbs to free_old_xmit virtio_net: unify the code for recycling the xmit ptr virtio-net: add cond_resched() to the command waiting loop virtio-net: convert rx mode setting to use workqueue virtio: packed: fix unmap leak for indirect desc table vDPA: report virtio-blk flush info to user space vDPA: report virtio-block read-only info to user space vDPA: report virtio-block write zeroes configuration to user space vDPA: report virtio-block discarding configuration to user space vDPA: report virtio-block topology info to user space vDPA: report virtio-block MQ info to user space vDPA: report virtio-block max segments in a request to user space vDPA: report virtio-block block-size to user space vDPA: report virtio-block max segment size to user space vDPA: report virtio-block capacity to user space virtio: make virtio_bus const vdpa: make vdpa_bus const vDPA/ifcvf: implement vdpa_config_ops.get_vq_num_min vDPA/ifcvf: get_max_vq_size to return max size virtio_vdpa: create vqs with the actual size ...
2024-03-19dm-integrity: fix a memory leak when rechecking the dataMikulas Patocka
Memory for the "checksums" pointer will leak if the data is rechecked after checksum failure (because the associated kfree won't happen due to 'goto skip_io'). Fix this by freeing the checksums memory before recheck, and just use the "checksum_onstack" memory for storing checksum during recheck. Fixes: c88f5e553fe3 ("dm-integrity: recheck the integrity tag after a failure") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-03-19Merge tag 'for-linus-6.9-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - Xen event channel handling fix for a regression with a rare kernel config and some added hardening - better support of running Xen dom0 in PVH mode - a cleanup for the xen grant-dma-iommu driver * tag 'for-linus-6.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/events: increment refcnt only if event channel is refcounted xen/evtchn: avoid WARN() when unbinding an event channel x86/xen: attempt to inflate the memory balloon on PVH xen/grant-dma-iommu: Convert to platform remove callback returning void
2024-03-19net: phy: fix phy_read_poll_timeout argument type in genphy_loopbackNikita Kiryushin
read_poll_timeout inside phy_read_poll_timeout can set val negative in some cases (for example, __mdiobus_read inside phy_read can return -EOPNOTSUPP). Supposedly, commit 4ec732951702 ("net: phylib: fix phy_read*_poll_timeout()") should fix problems with wrong-signed vals, but I do not see how as val is sent to phy_read as is and __val = phy_read (not val) is checked for sign. Change val type for signed to allow better error handling as done in other phy_read_poll_timeout callers. This will not fix any error handling by itself, but allows, for example, to modify cond with appropriate sign check or check resulting val separately. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 014068dcb5b1 ("net: phy: genphy_loopback: add link speed configuration") Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20240315175052.8049-1-kiryushin@ancud.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19can: kvaser_pciefd: Add additional Xilinx interruptsMartin Jocić
Since Xilinx-based adapters now support up to eight CAN channels, the TX interrupt mask array must have eight elements. Signed-off-by: Martin Jocic <martin.jocic@kvaser.com> Link: https://lore.kernel.org/all/2ab3c0585c3baba272ede0487182a423a420134b.camel@kvaser.com Fixes: 9b221ba452aa ("can: kvaser_pciefd: Add support for Kvaser PCIe 8xCAN") [mkl: replace Link by Fixes tag] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-03-19nouveau/gsp: don't check devinit disable on GSP.Dave Airlie
GSP should be handling this and I can see no evidence in opengpu driver that this register should be touched. Fixed acceleration on 2080 Ti GPUs. Fixes: 15740541e8f0 ("drm/nouveau/devinit/tu102-: prepare for GSP-RM") Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240314014521.2695233-1-airlied@gmail.com
2024-03-19Merge branches 'pm-em', 'pm-powercap' and 'pm-sleep'Rafael J. Wysocki
Merge additional updates related to the Energy Model, power capping and system-wide power management for 6.9-rc1: - Modify the Energy Model code to bail out and complain if the unit of power is not uW to prevent errors due to unit mismatches (Lukasz Luba). - Make the intel_rapl platform driver use a remove callback returning void (Uwe Kleine-König). - Fix typo in the suspend and interrupts document (Saravana Kannan). * pm-em: PM: EM: Force device drivers to provide power in uW * pm-powercap: powercap: intel_rapl: Convert to platform remove callback returning void * pm-sleep: Documentation: power: Fix typo in suspend and interrupts doc
2024-03-19fbmon: prevent division by zero in fb_videomode_from_videomode()Roman Smirnov
The expression htotal * vtotal can have a zero value on overflow. It is necessary to prevent division by zero like in fb_var_to_videomode(). Found by Linux Verification Center (linuxtesting.org) with Svace. Signed-off-by: Roman Smirnov <r.smirnov@omp.ru> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Helge Deller <deller@gmx.de>
2024-03-19usb: usb-acpi: Fix oops due to freeing uninitialized pld pointerMathias Nyman
If reading the ACPI _PLD port location object fails, or the port doesn't have a _PLD ACPI object then the *pld pointer will remain uninitialized and oops when freed. The patch that caused this is currently in next, on its way to v6.9. So no need to add this to stable or current 6.8 kernel. Reported-by: Klara Modin <klarasmodin@gmail.com> Closes: https://lore.kernel.org/linux-usb/7e92369a-3197-4883-9988-3c93452704f5@gmail.com/ Tested-by: Klara Modin <klarasmodin@gmail.com> Fixes: f3ac348e6e04 ("usb: usb-acpi: Set port connect type of not connectable ports correctly") Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20240308113425.1144689-1-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-19ahci: asm1064: asm1166: don't limit reported portsConrad Kostecki
Previously, patches have been added to limit the reported count of SATA ports for asm1064 and asm1166 SATA controllers, as those controllers do report more ports than physically having. While it is allowed to report more ports than physically having in CAP.NP, it is not allowed to report more ports than physically having in the PI (Ports Implemented) register, which is what these HBAs do. (This is a AHCI spec violation.) Unfortunately, it seems that the PMP implementation in these ASMedia HBAs is also violating the AHCI and SATA-IO PMP specification. What these HBAs do is that they do not report that they support PMP (CAP.SPM (Supports Port Multiplier) is not set). Instead, they have decided to add extra "virtual" ports in the PI register that is used if a port multiplier is connected to any of the physical ports of the HBA. Enumerating the devices behind the PMP as specified in the AHCI and SATA-IO specifications, by using PMP READ and PMP WRITE commands to the physical ports of the HBA is not possible, you have to use the "virtual" ports. This is of course bad, because this gives us no way to detect the device and vendor ID of the PMP actually connected to the HBA, which means that we can not apply the proper PMP quirks for the PMP that is connected to the HBA. Limiting the port map will thus stop these controllers from working with SATA Port Multipliers. This patch reverts both patches for asm1064 and asm1166, so old behavior is restored and SATA PMP will work again, but it will also reintroduce the (minutes long) extra boot time for the ASMedia controllers that do not have a PMP connected (either on the PCIe card itself, or an external PMP). However, a longer boot time for some, is the lesser evil compared to some other users not being able to detect their drives at all. Fixes: 0077a504e1a4 ("ahci: asm1166: correct count of reported ports") Fixes: 9815e3961754 ("ahci: asm1064: correct count of reported ports") Cc: stable@vger.kernel.org Reported-by: Matt <cryptearth@googlemail.com> Signed-off-by: Conrad Kostecki <conikost@gentoo.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> [cassel: rewrote commit message] Signed-off-by: Niklas Cassel <cassel@kernel.org>
2024-03-19wireguard: netlink: access device through ctx instead of peerJason A. Donenfeld
The previous commit fixed a bug that led to a NULL peer->device being dereferenced. It's actually easier and faster performance-wise to instead get the device from ctx->wg. This semantically makes more sense too, since ctx->wg->peer_allowedips.seq is compared with ctx->allowedips_seq, basing them both in ctx. This also acts as a defence in depth provision against freed peers. Cc: stable@vger.kernel.org Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19wireguard: netlink: check for dangling peer via is_dead instead of empty listJason A. Donenfeld
If all peers are removed via wg_peer_remove_all(), rather than setting peer_list to empty, the peer is added to a temporary list with a head on the stack of wg_peer_remove_all(). If a netlink dump is resumed and the cursored peer is one that has been removed via wg_peer_remove_all(), it will iterate from that peer and then attempt to dump freed peers. Fix this by instead checking peer->is_dead, which was explictly created for this purpose. Also move up the device_update_lock lockdep assertion, since reading is_dead relies on that. It can be reproduced by a small script like: echo "Setting config..." ip link add dev wg0 type wireguard wg setconf wg0 /big-config ( while true; do echo "Showing config..." wg showconf wg0 > /dev/null done ) & sleep 4 wg setconf wg0 <(printf "[Peer]\nPublicKey=$(wg genkey)\n") Resulting in: BUG: KASAN: slab-use-after-free in __lock_acquire+0x182a/0x1b20 Read of size 8 at addr ffff88811956ec70 by task wg/59 CPU: 2 PID: 59 Comm: wg Not tainted 6.8.0-rc2-debug+ #5 Call Trace: <TASK> dump_stack_lvl+0x47/0x70 print_address_description.constprop.0+0x2c/0x380 print_report+0xab/0x250 kasan_report+0xba/0xf0 __lock_acquire+0x182a/0x1b20 lock_acquire+0x191/0x4b0 down_read+0x80/0x440 get_peer+0x140/0xcb0 wg_get_device_dump+0x471/0x1130 Cc: stable@vger.kernel.org Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Reported-by: Lillian Berry <lillian@star-ark.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19wireguard: device: remove generic .ndo_get_stats64Breno Leitao
Commit 3e2f544dd8a33 ("net: get stats64 if device if driver is configured") moved the callback to dev_get_tstats64() to net core, so, unless the driver is doing some custom stats collection, it does not need to set .ndo_get_stats64. Since this driver is now relying in NETDEV_PCPU_STAT_TSTATS, then, it doesn't need to set the dev_get_tstats64() generic .ndo_get_stats64 function pointer. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19wireguard: device: leverage core stats allocatorBreno Leitao
With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and convert veth & vrf"), stats allocation could be done on net core instead of in this driver. With this new approach, the driver doesn't have to bother with error handling (allocation failure checking, making sure free happens in the right spot, etc). This is core responsibility now. Remove the allocation in this driver and leverage the network core allocation instead. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>