summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-07-17Revert "drm/prime: Use dma_buf from GEM object instance"Thomas Zimmermann
This reverts commit f83a9b8c7fd0557b0c50784bfdc1bbe9140c9bf8. The dma_buf field in struct drm_gem_object is not stable over the object instance's lifetime. The field becomes NULL when user space releases the final GEM handle on the buffer object. This resulted in a NULL-pointer deref. Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer: Acquire internal references on GEM handles") only solved the problem partially. They especially don't work for buffer objects without a DRM framebuffer associated. Hence, this revert to going back to using .import_attach->dmabuf. v3: - cc stable Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Zack Rusin <zack.rusin@broadcom.com> Cc: <stable@vger.kernel.org> # v6.15+ Link: https://lore.kernel.org/r/20250715155934.150656-5-tzimmermann@suse.de
2025-07-17Revert "drm/etnaviv: Use dma_buf from GEM object instance"Thomas Zimmermann
This reverts commit e91eb3ae415472b28211d7fed07fa283845b311e. The dma_buf field in struct drm_gem_object is not stable over the object instance's lifetime. The field becomes NULL when user space releases the final GEM handle on the buffer object. This resulted in a NULL-pointer deref. Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer: Acquire internal references on GEM handles") only solved the problem partially. They especially don't work for buffer objects without a DRM framebuffer associated. Hence, this revert to going back to using .import_attach->dmabuf. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://lore.kernel.org/r/20250715155934.150656-4-tzimmermann@suse.de
2025-07-17Revert "drm/vmwgfx: Use dma_buf from GEM object instance"Thomas Zimmermann
This reverts commit aec8a40228acb385d60feec59b54573d307e60f3. The dma_buf field in struct drm_gem_object is not stable over the object instance's lifetime. The field becomes NULL when user space releases the final GEM handle on the buffer object. This resulted in a NULL-pointer deref. Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer: Acquire internal references on GEM handles") only solved the problem partially. They especially don't work for buffer objects without a DRM framebuffer associated. Hence, this revert to going back to using .import_attach->dmabuf. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://lore.kernel.org/r/20250715155934.150656-3-tzimmermann@suse.de
2025-07-17Revert "drm/virtio: Use dma_buf from GEM object instance"Thomas Zimmermann
This reverts commit 415cb45895f43015515473fbc40563ca5eec9a7c. The dma_buf field in struct drm_gem_object is not stable over the object instance's lifetime. The field becomes NULL when user space releases the final GEM handle on the buffer object. This resulted in a NULL-pointer deref. Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer: Acquire internal references on GEM handles") only solved the problem partially. They especially don't work for buffer objects without a DRM framebuffer associated. Hence, this revert to going back to using .import_attach->dmabuf. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://lore.kernel.org/r/20250715155934.150656-3-tzimmermann@suse.de
2025-07-17Merge branch 'for-next' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux Tony Nguyen says: ==================== Add RDMA support for Intel IPU E2000 in idpf Tatyana Nikolova says: This idpf patch series is the second part of the staged submission for introducing RDMA RoCEv2 support for the IPU E2000 line of products, referred to as GEN3. To support RDMA GEN3 devices, the idpf driver uses common definitions of the IIDC interface and implements specific device functionality in iidc_rdma_idpf.h. The IPU model can host one or more logical network endpoints called vPorts per PCI function that are flexibly associated with a physical port or an internal communication port. Other features as it pertains to GEN3 devices include: * MMIO learning * RDMA capability negotiation * RDMA vectors discovery between idpf and control plane These patches are split from the submission "Add RDMA support for Intel IPU E2000 (GEN3)" [1]. The patches have been tested on a range of hosts and platforms with a variety of general RDMA applications which include standalone verbs (rping, perftest, etc.), storage and HPC applications. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> [1] https://lore.kernel.org/all/20240724233917.704-1-tatyana.e.nikolova@intel.com/ This idpf patch series is the second part of the staged submission for introducing RDMA RoCEv2 support for the IPU E2000 line of products, referred to as GEN3. To support RDMA GEN3 devices, the idpf driver uses common definitions of the IIDC interface and implements specific device functionality in iidc_rdma_idpf.h. The IPU model can host one or more logical network endpoints called vPorts per PCI function that are flexibly associated with a physical port or an internal communication port. Other features as it pertains to GEN3 devices include: * MMIO learning * RDMA capability negotiation * RDMA vectors discovery between idpf and control plane These patches are split from the submission "Add RDMA support for Intel IPU E2000 (GEN3)" [1]. The patches have been tested on a range of hosts and platforms with a variety of general RDMA applications which include standalone verbs (rping, perftest, etc.), storage and HPC applications. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> [1] https://lore.kernel.org/all/20240724233917.704-1-tatyana.e.nikolova@intel.com/ IWL reviews: v3: https://lore.kernel.org/all/20250708210554.1662-1-tatyana.e.nikolova@intel.com/ v2: https://lore.kernel.org/all/20250612220002.1120-1-tatyana.e.nikolova@intel.com/ v1 (split from previous series): https://lore.kernel.org/all/20250523170435.668-1-tatyana.e.nikolova@intel.com/ v3: https://lore.kernel.org/all/20250207194931.1569-1-tatyana.e.nikolova@intel.com/ RFC v2: https://lore.kernel.org/all/20240824031924.421-1-tatyana.e.nikolova@intel.com/ RFC: https://lore.kernel.org/all/20240724233917.704-1-tatyana.e.nikolova@intel.com/ * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux: idpf: implement get LAN MMIO memory regions idpf: implement IDC vport aux driver MTU change handler idpf: implement remaining IDC RDMA core callbacks and handlers idpf: implement RDMA vport auxiliary dev create, init, and destroy idpf: implement core RDMA auxiliary dev create, init, and destroy idpf: use reserved RDMA vectors from control plane ==================== Link: https://patch.msgid.link/20250714181002.2865694-1-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-17drm/sched: Remove optimization that causes hang when killing dependent jobsLin.Cao
When application A submits jobs and application B submits a job with a dependency on A's fence, the normal flow wakes up the scheduler after processing each job. However, the optimization in drm_sched_entity_add_dependency_cb() uses a callback that only clears dependencies without waking up the scheduler. When application A is killed before its jobs can run, the callback gets triggered but only clears the dependency without waking up the scheduler, causing the scheduler to enter sleep state and application B to hang. Remove the optimization by deleting drm_sched_entity_clear_dep() and its usage, ensuring the scheduler is always woken up when dependencies are cleared. Fixes: 777dbd458c89 ("drm/amdgpu: drop a dummy wakeup scheduler") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Lin.Cao <lincao12@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250717084453.921097-1-lincao12@amd.com
2025-07-17netfilter: nf_conntrack: fix crash due to removal of uninitialised entryFlorian Westphal
A crash in conntrack was reported while trying to unlink the conntrack entry from the hash bucket list: [exception RIP: __nf_ct_delete_from_lists+172] [..] #7 [ff539b5a2b043aa0] nf_ct_delete at ffffffffc124d421 [nf_conntrack] #8 [ff539b5a2b043ad0] nf_ct_gc_expired at ffffffffc124d999 [nf_conntrack] #9 [ff539b5a2b043ae0] __nf_conntrack_find_get at ffffffffc124efbc [nf_conntrack] [..] The nf_conn struct is marked as allocated from slab but appears to be in a partially initialised state: ct hlist pointer is garbage; looks like the ct hash value (hence crash). ct->status is equal to IPS_CONFIRMED|IPS_DYING, which is expected ct->timeout is 30000 (=30s), which is unexpected. Everything else looks like normal udp conntrack entry. If we ignore ct->status and pretend its 0, the entry matches those that are newly allocated but not yet inserted into the hash: - ct hlist pointers are overloaded and store/cache the raw tuple hash - ct->timeout matches the relative time expected for a new udp flow rather than the absolute 'jiffies' value. If it were not for the presence of IPS_CONFIRMED, __nf_conntrack_find_get() would have skipped the entry. Theory is that we did hit following race: cpu x cpu y cpu z found entry E found entry E E is expired <preemption> nf_ct_delete() return E to rcu slab init_conntrack E is re-inited, ct->status set to 0 reply tuplehash hnnode.pprev stores hash value. cpu y found E right before it was deleted on cpu x. E is now re-inited on cpu z. cpu y was preempted before checking for expiry and/or confirm bit. ->refcnt set to 1 E now owned by skb ->timeout set to 30000 If cpu y were to resume now, it would observe E as expired but would skip E due to missing CONFIRMED bit. nf_conntrack_confirm gets called sets: ct->status |= CONFIRMED This is wrong: E is not yet added to hashtable. cpu y resumes, it observes E as expired but CONFIRMED: <resumes> nf_ct_expired() -> yes (ct->timeout is 30s) confirmed bit set. cpu y will try to delete E from the hashtable: nf_ct_delete() -> set DYING bit __nf_ct_delete_from_lists Even this scenario doesn't guarantee a crash: cpu z still holds the table bucket lock(s) so y blocks: wait for spinlock held by z CONFIRMED is set but there is no guarantee ct will be added to hash: "chaintoolong" or "clash resolution" logic both skip the insert step. reply hnnode.pprev still stores the hash value. unlocks spinlock return NF_DROP <unblocks, then crashes on hlist_nulls_del_rcu pprev> In case CPU z does insert the entry into the hashtable, cpu y will unlink E again right away but no crash occurs. Without 'cpu y' race, 'garbage' hlist is of no consequence: ct refcnt remains at 1, eventually skb will be free'd and E gets destroyed via: nf_conntrack_put -> nf_conntrack_destroy -> nf_ct_destroy. To resolve this, move the IPS_CONFIRMED assignment after the table insertion but before the unlock. Pablo points out that the confirm-bit-store could be reordered to happen before hlist add resp. the timeout fixup, so switch to set_bit and before_atomic memory barrier to prevent this. It doesn't matter if other CPUs can observe a newly inserted entry right before the CONFIRMED bit was set: Such event cannot be distinguished from above "E is the old incarnation" case: the entry will be skipped. Also change nf_ct_should_gc() to first check the confirmed bit. The gc sequence is: 1. Check if entry has expired, if not skip to next entry 2. Obtain a reference to the expired entry. 3. Call nf_ct_should_gc() to double-check step 1. nf_ct_should_gc() is thus called only for entries that already failed an expiry check. After this patch, once the confirmed bit check passes ct->timeout has been altered to reflect the absolute 'best before' date instead of a relative time. Step 3 will therefore not remove the entry. Without this change to nf_ct_should_gc() we could still get this sequence: 1. Check if entry has expired. 2. Obtain a reference. 3. Call nf_ct_should_gc() to double-check step 1: 4 - entry is still observed as expired 5 - meanwhile, ct->timeout is corrected to absolute value on other CPU and confirm bit gets set 6 - confirm bit is seen 7 - valid entry is removed again First do check 6), then 4) so the gc expiry check always picks up either confirmed bit unset (entry gets skipped) or expiry re-check failure for re-inited conntrack objects. This change cannot be backported to releases before 5.19. Without commit 8a75a2c17410 ("netfilter: conntrack: remove unconfirmed list") |= IPS_CONFIRMED line cannot be moved without further changes. Cc: Razvan Cojocaru <rzvncj@gmail.com> Link: https://lore.kernel.org/netfilter-devel/20250627142758.25664-1-fw@strlen.de/ Link: https://lore.kernel.org/netfilter-devel/4239da15-83ff-4ca4-939d-faef283471bb@gmail.com/ Fixes: 1397af5bfd7d ("netfilter: conntrack: remove the percpu dying list") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-07-17net: fix segmentation after TCP/UDP fraglist GROFelix Fietkau
Since "net: gro: use cb instead of skb->network_header", the skb network header is no longer set in the GRO path. This breaks fraglist segmentation, which relies on ip_hdr()/tcp_hdr() to check for address/port changes. Fix this regression by selectively setting the network header for merged segment skbs. Fixes: 186b1ea73ad8 ("net: gro: use cb instead of skb->network_header") Signed-off-by: Felix Fietkau <nbd@nbd.name> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250705150622.10699-1-nbd@nbd.name Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-17gpiolib: devres: release GPIOs in devm_gpiod_put_array()André Draszik
devm_gpiod_put_array() is meant to undo the effects of devm_gpiod_get_array() - in particular, it should release the GPIOs contained in the array acquired with the latter. It is meant to be the resource-managed version of gpiod_put_array(), and it should behave similar to the non-array version devm_gpiod_put(). Since commit d1d52c6622a6 ("gpiolib: devres: Finish the conversion to use devm_add_action()") it doesn't do that anymore, it just removes the devres action and frees associated memory, but it doesn't actually release the GPIOs. Fix by switching from devm_remove_action() to devm_release_action(), which will in addition invoke the action to release the GPIOs. Fixes: d1d52c6622a6 ("gpiolib: devres: Finish the conversion to use devm_add_action()") Signed-off-by: André Draszik <andre.draszik@linaro.org> Reported-by: Wattson CI <wattson-external@google.com> Reported-by: Samuel Wu <wusamuel@google.com> Link: https://lore.kernel.org/r/20250715-gpiolib-devres-put-array-fix-v1-1-970d82a8c887@linaro.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-07-16ipv6: mcast: Delay put pmc->idev in mld_del_delrec()Yue Haibing
pmc->idev is still used in ip6_mc_clear_src(), so as mld_clear_delrec() does, the reference should be put after ip6_mc_clear_src() return. Fixes: 63ed8de4be81 ("mld: add mc_lock for protecting per-interface mld data") Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Link: https://patch.msgid.link/20250714141957.3301871-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16Merge branch 's390-bpf-fix-bpf_arch_text_poke-with-new_addr-null-again'Alexei Starovoitov
Ilya Leoshkevich says: ==================== s390/bpf: Fix bpf_arch_text_poke() with new_addr == NULL again This series fixes a regression causing perf on s390 to trigger a kernel panic. Patch 1 fixes the issue, patch 2 adds a test to make sure this doesn't happen again. ==================== Link: https://patch.msgid.link/20250716194524.48109-1-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-16selftests/bpf: Stress test attaching a BPF prog to another BPF progIlya Leoshkevich
Add a test that invokes a BPF prog in a loop, while concurrently attaching and detaching another BPF prog to and from it. This helps identifying race conditions in bpf_arch_text_poke(). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250716194524.48109-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-16s390/bpf: Fix bpf_arch_text_poke() with new_addr == NULL againIlya Leoshkevich
Commit 7ded842b356d ("s390/bpf: Fix bpf_plt pointer arithmetic") has accidentally removed the critical piece of commit c730fce7c70c ("s390/bpf: Fix bpf_arch_text_poke() with new_addr == NULL"), causing intermittent kernel panics in e.g. perf's on_switch() prog to reappear. Restore the fix and add a comment. Fixes: 7ded842b356d ("s390/bpf: Fix bpf_plt pointer arithmetic") Cc: stable@vger.kernel.org Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250716194524.48109-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-16sched/ext: Prevent update_locked_rq() calls with NULL rqBreno Leitao
Avoid invoking update_locked_rq() when the runqueue (rq) pointer is NULL in the SCX_CALL_OP and SCX_CALL_OP_RET macros. Previously, calling update_locked_rq(NULL) with preemption enabled could trigger the following warning: BUG: using __this_cpu_write() in preemptible [00000000] This happens because __this_cpu_write() is unsafe to use in preemptible context. rq is NULL when an ops invoked from an unlocked context. In such cases, we don't need to store any rq, since the value should already be NULL (unlocked). Ensure that update_locked_rq() is only called when rq is non-NULL, preventing calling __this_cpu_write() on preemptible context. Suggested-by: Peter Zijlstra <peterz@infradead.org> Fixes: 18853ba782bef ("sched_ext: Track currently locked rq") Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org # v6.15
2025-07-16Merge branch 'net-mlx5e-add-support-for-pcie-congestion-events'Jakub Kicinski
Tariq Toukan says: ==================== net/mlx5e: Add support for PCIe congestion events Dragos says: PCIe congestion events are events generated by the firmware when the device side has sustained PCIe inbound or outbound traffic above certain thresholds. The high and low threshold are hysteresis thresholds to prevent flapping: once the high threshold has been reached, a low threshold event will be triggered only after the bandwidth usage went below the low threshold. This series adds support for receiving and exposing such events as ethtool counters. 2 new pairs of counters are exposed: pci_bw_in/outbound_high/low. These should help the user understand if the device PCI is under pressure. Planned followup patches: - Allow configuration of thresholds through devlink. - Add ethtool counter for wakeups which did not result in any state change. ==================== Link: https://patch.msgid.link/1752589821-145787-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16net/mlx5e: Add device PCIe congestion ethtool statsDragos Tatulea
Implement the PCIe Congestion Event notifier which triggers a work item to query the PCIe Congestion Event object. The result of the congestion state is reflected in the new ethtool stats: * pci_bw_inbound_high: the device has crossed the high threshold for inbound PCIe traffic. * pci_bw_inbound_low: the device has crossed the low threshold for inbound PCIe traffic * pci_bw_outbound_high: the device has crossed the high threshold for outbound PCIe traffic. * pci_bw_outbound_low: the device has crossed the low threshold for outbound PCIe traffic The high and low thresholds are currently configured at 90% and 75%. These are hysteresis thresholds which help to check if the PCI bus on the device side is in a congested state. If low + 1 = high then the device is in a congested state. If low == high then the device is not in a congested state. The counters are also documented. A follow-up patch will make the thresholds configurable. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1752589821-145787-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16net/mlx5e: Create/destroy PCIe Congestion Event objectDragos Tatulea
Add initial infrastructure to create and destroy the PCIe Congestion Event object if the object is supported. The verb for the object creation function is "set" instead of "create" because the function will accommodate the modify operation as well in a subsequent patch. The next patches will hook it up to the event handler and will add actual functionality. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1752589821-145787-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16s390/net: Remove NETIUCV device driverNagamani PV
The netiucv driver creates TCP/IP interfaces over IUCV between Linux guests on z/VM and other z/VM entities. Rationale for removal: - NETIUCV connections are only supported for compatibility with earlier versions and not to be used for new network setups, since at least Linux kernel 4.0. - No known active users, use cases, or product dependencies - The driver is no longer relevant for z/VM networking; preferred methods include: * Device pass-through (e.g., OSA, RoCE) * z/VM Virtual Switch (VSWITCH) The IUCV mechanism itself remains supported and is actively used via AF_IUCV, hvc_iucv, and smsg_iucv. Signed-off-by: Nagamani PV <nagamani@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250715074210.3999296-1-wintera@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16Merge branch 'expose-refclk-for-rmii-and-enable-rmii'Jakub Kicinski
Ryan Wanner says: ==================== Expose REFCLK for RMII and enable RMII This set allows the REFCLK property to be exposed as a dt-property to properly reflect the correct RMII layout. RMII can take an external or internal provided REFCLK, since this is not SoC dependent but board dependent this must be exposed as a DT property for the macb driver. This set also enables RMII mode for the SAMA7 SoCs gigabit mac. v1: https://lore.kernel.org/cover.1750346271.git.Ryan.Wanner@microchip.com ==================== Link: https://patch.msgid.link/cover.1752510727.git.Ryan.Wanner@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16net: cadence: macb: sama7g5_emac: Remove USARIO CLKEN flagRyan Wanner
Remove USARIO_CLKEN flag since this is now a device tree argument and not fixed to the SoC. This will instead be selected by the "cdns,refclk-ext" device tree property. Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com> Link: https://patch.msgid.link/1e7a8c324526f631f279925aa8a6aa937d55c796.1752510727.git.Ryan.Wanner@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16net: cadence: macb: Enable RMII for SAMA7 gemRyan Wanner
This macro enables the RMII mode bit in the USRIO register when RMII mode is requested. Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com> Link: https://patch.msgid.link/6698836e4ee7df5f6bee181f0d2e38d4b8e4cec2.1752510727.git.Ryan.Wanner@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16net: cadence: macb: Expose REFCLK as a device tree propertyRyan Wanner
The RMII and RGMII can both support internal or external provided REFCLKs 50MHz and 125MHz respectively. Since this is dependent on the board that the SoC is on this needs to be set via the device tree. This property flag is checked in the MACB DT node so the REFCLK cap is configured the correct way for the RMII or RGMII is configured on the board. Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com> Link: https://patch.msgid.link/7f9b65896d6b7b48275bc527b72a16347f8ce10a.1752510727.git.Ryan.Wanner@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16dt-bindings: net: cdns,macb: Add external REFCLK propertyRyan Wanner
REFCLK can be provided by an external source so this should be exposed by a DT property. The REFCLK is used for RMII and in some SoCs that use this driver the RGMII 125MHz clk can also be provided by an external source. Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/d558467c4d5b27fb3135ffdead800b14cd9c6c0a.1752510727.git.Ryan.Wanner@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16Merge branch 'selftest-net-add-selftest-for-netpoll'Jakub Kicinski
Breno Leitao says: ==================== selftest: net: Add selftest for netpoll I am submitting a new selftest for the netpoll subsystem specifically targeting the case where the RX is polling in the TX path, which is a case that we don't have any test in the tree today. This is done when netpoll_poll_dev() called, and this test creates a scenario when that is probably. The test does the following: 1) Configuring a single RX/TX queue to increase contention on the interface. 2) Generating background traffic to saturate the network, mimicking real-world congestion. 3) Sending netconsole messages to trigger netpoll polling and monitor its behavior. 4) Using dynamic netconsole targets via configfs, with the ability to delete and recreate targets during the test. 5) Running bpftrace in parallel to verify that netpoll_poll_dev() is called when expected. If it is called, then the test passes, otherwise the test is marked as skipped. In order to achieve it, I stole Jakub's bpftrace helper from [1], and did some small changes that I found useful to use the helper. So, this patchset basically contains: 1) The code stolen from Jakub 2) Improvements on bpftrace() helper 3) The selftest itself Link: https://lore.kernel.org/all/20250421222827.283737-22-kuba@kernel.org/ [1] ==================== Link: https://patch.msgid.link/20250714-netpoll_test-v7-0-c0220cfaa63e@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16selftests: net: add netpoll basic functionality testBreno Leitao
Add a basic selftest for the netpoll polling mechanism, specifically targeting the netpoll poll() side. The test creates a scenario where network transmission is running at maximum speed, and netpoll needs to poll the NIC. This is achieved by: 1. Configuring a single RX/TX queue to create contention 2. Generating background traffic to saturate the interface 3. Sending netconsole messages to trigger netpoll polling 4. Using dynamic netconsole targets via configfs 5. Delete and create new netconsole targets after some messages 6. Start a bpftrace in parallel to make sure netpoll_poll_dev() is called 7. If bpftrace exists and netpoll_poll_dev() was called, stop. The test validates a critical netpoll code path by monitoring traffic flow and ensuring netpoll_poll_dev() is called when the normal TX path is blocked. This addresses a gap in netpoll test coverage for a path that is tricky for the network stack. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250714-netpoll_test-v7-3-c0220cfaa63e@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16selftests: drv-net: Strip '@' prefix from bpftrace map keysBreno Leitao
The '@' prefix in bpftrace map keys is specific to bpftrace and can be safely removed when processing results. This patch modifies the bpftrace utility to strip the '@' from map keys before storing them in the result dictionary, making the keys more consistent with Python conventions. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250714-netpoll_test-v7-2-c0220cfaa63e@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16selftests: drv-net: add helper/wrapper for bpftraceJakub Kicinski
bpftrace is very useful for low level driver testing. perf or trace-cmd would also do for collecting data from tracepoints, but they require much more post-processing. Add a wrapper for running bpftrace and sanitizing its output. bpftrace has JSON output, which is great, but it prints loose objects and in a slightly inconvenient format. We have to read the objects line by line, and while at it return them indexed by the map name. Reviewed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250714-netpoll_test-v7-1-c0220cfaa63e@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16doc: xdp: Clarify driver implementation for XDP Rx metadataSong Yoong Siang
Clarify that drivers must remove device-reserved metadata from the data_meta area before passing frames to XDP programs. Additionally, expand the explanation of how userspace and BPF programs should coordinate the use of METADATA_SIZE, and add a detailed diagram to illustrate pointer adjustments and metadata layout. Also describe the requirements and constraints enforced by bpf_xdp_adjust_meta(). Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://lore.kernel.org/r/20250716154846.3513575-1-yoong.siang.song@intel.com
2025-07-16Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-07-15 (ixgbe, fm10k, i40e, ice) Arnd Bergmann resolves compile issues with large NR_CPUS for ixgbe, fm10k, and i40e. For ice: Dave adds a NULL check for LAG netdev. Michal corrects a pointer check in debugfs initialization. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: check correct pointer in fwlog debugfs ice: add NULL check in eswitch lag check ethernet: intel: fix building with large NR_CPUS ==================== Link: https://patch.msgid.link/20250715202948.3841437-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16net: airoha: fix potential use-after-free in airoha_npu_get()Alok Tiwari
np->name was being used after calling of_node_put(np), which releases the node and can lead to a use-after-free bug. Previously, of_node_put(np) was called unconditionally after of_find_device_by_node(np), which could result in a use-after-free if pdev is NULL. This patch moves of_node_put(np) after the error check to ensure the node is only released after both the error and success cases are handled appropriately, preventing potential resource issues. Fixes: 23290c7bc190 ("net: airoha: Introduce Airoha NPU support") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250715143102.3458286-1-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16ipv6: mcast: Simplify mld_clear_{report|query}()Yue Haibing
Use __skb_queue_purge() instead of re-implementing it. Note that it uses kfree_skb_reason() instead of kfree_skb() internally, and pass SKB_DROP_REASON_QUEUE_PURGE drop reason to the kfree_skb tracepoint. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20250715120709.3941510-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16vsock/test: fix vsock_ioctl_int() check for unsupported ioctlStefano Garzarella
`vsock_do_ioctl` returns -ENOIOCTLCMD if an ioctl support is not implemented, like for SIOCINQ before commit f7c722659275 ("vsock: Add support for SIOCINQ ioctl"). In net/socket.c, -ENOIOCTLCMD is re-mapped to -ENOTTY for the user space. So, our test suite, without that commit applied, is failing in this way: 34 - SOCK_STREAM ioctl(SIOCINQ) functionality...ioctl(21531): Inappropriate ioctl for device Return false in vsock_ioctl_int() to skip the test in this case as well, instead of failing. Fixes: 53548d6bffac ("test/vsock: Add retry mechanism to ioctl wrapper") Cc: niuxuewei.nxw@antgroup.com Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Xuewei Niu <niuxuewei.nxw@antgroup.com> Link: https://patch.msgid.link/20250715093233.94108-1-sgarzare@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16tcp: fix UaF in tcp_prune_ofo_queue()Paolo Abeni
The CI reported a UaF in tcp_prune_ofo_queue(): BUG: KASAN: slab-use-after-free in tcp_prune_ofo_queue+0x55d/0x660 Read of size 4 at addr ffff8880134729d8 by task socat/20348 CPU: 0 UID: 0 PID: 20348 Comm: socat Not tainted 6.16.0-rc5-virtme #1 PREEMPT(full) Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl+0x82/0xd0 print_address_description.constprop.0+0x2c/0x400 print_report+0xb4/0x270 kasan_report+0xca/0x100 tcp_prune_ofo_queue+0x55d/0x660 tcp_try_rmem_schedule+0x855/0x12e0 tcp_data_queue+0x4dd/0x2260 tcp_rcv_established+0x5e8/0x2370 tcp_v4_do_rcv+0x4ba/0x8c0 __release_sock+0x27a/0x390 release_sock+0x53/0x1d0 tcp_sendmsg+0x37/0x50 sock_write_iter+0x3c1/0x520 vfs_write+0xc09/0x1210 ksys_write+0x183/0x1d0 do_syscall_64+0xc1/0x380 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fcf73ef2337 Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 RSP: 002b:00007ffd4f924708 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcf73ef2337 RDX: 0000000000002000 RSI: 0000555f11d1a000 RDI: 0000000000000008 RBP: 0000555f11d1a000 R08: 0000000000002000 R09: 0000000000000000 R10: 0000000000000040 R11: 0000000000000246 R12: 0000000000000008 R13: 0000000000002000 R14: 0000555ee1a44570 R15: 0000000000002000 </TASK> Allocated by task 20348: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 __kasan_slab_alloc+0x59/0x70 kmem_cache_alloc_node_noprof+0x110/0x340 __alloc_skb+0x213/0x2e0 tcp_collapse+0x43f/0xff0 tcp_try_rmem_schedule+0x6b9/0x12e0 tcp_data_queue+0x4dd/0x2260 tcp_rcv_established+0x5e8/0x2370 tcp_v4_do_rcv+0x4ba/0x8c0 __release_sock+0x27a/0x390 release_sock+0x53/0x1d0 tcp_sendmsg+0x37/0x50 sock_write_iter+0x3c1/0x520 vfs_write+0xc09/0x1210 ksys_write+0x183/0x1d0 do_syscall_64+0xc1/0x380 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 20348: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x38/0x50 kmem_cache_free+0x149/0x330 tcp_prune_ofo_queue+0x211/0x660 tcp_try_rmem_schedule+0x855/0x12e0 tcp_data_queue+0x4dd/0x2260 tcp_rcv_established+0x5e8/0x2370 tcp_v4_do_rcv+0x4ba/0x8c0 __release_sock+0x27a/0x390 release_sock+0x53/0x1d0 tcp_sendmsg+0x37/0x50 sock_write_iter+0x3c1/0x520 vfs_write+0xc09/0x1210 ksys_write+0x183/0x1d0 do_syscall_64+0xc1/0x380 entry_SYSCALL_64_after_hwframe+0x77/0x7f The buggy address belongs to the object at ffff888013472900 which belongs to the cache skbuff_head_cache of size 232 The buggy address is located 216 bytes inside of freed 232-byte region [ffff888013472900, ffff8880134729e8) The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x13472 head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x80000000000040(head|node=0|zone=1) page_type: f5(slab) raw: 0080000000000040 ffff88800198fb40 ffffea0000347b10 ffffea00004f5290 raw: 0000000000000000 0000000000120012 00000000f5000000 0000000000000000 head: 0080000000000040 ffff88800198fb40 ffffea0000347b10 ffffea00004f5290 head: 0000000000000000 0000000000120012 00000000f5000000 0000000000000000 head: 0080000000000001 ffffea00004d1c81 00000000ffffffff 00000000ffffffff head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888013472880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888013472900: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff888013472980: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc ^ ffff888013472a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888013472a80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb Indeed tcp_prune_ofo_queue() is reusing the skb dropped a few lines above. The caller wants to enqueue 'in_skb', lets check space vs the latter. Fixes: 1d2fbaad7cd8 ("tcp: stronger sk_rcvbuf checks") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: syzbot+865aca08c0533171bf6a@syzkaller.appspotmail.com Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/b78d2d9bdccca29021eed9a0e7097dd8dc00f485.1752567053.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16net/mlx5: Correctly set gso_size when LRO is usedChristoph Paasch
gso_size is expected by the networking stack to be the size of the payload (thus, not including ethernet/IP/TCP-headers). However, cqe_bcnt is the full sized frame (including the headers). Dividing cqe_bcnt by lro_num_seg will then give incorrect results. For example, running a bpftrace higher up in the TCP-stack (tcp_event_data_recv), we commonly have gso_size set to 1450 or 1451 even though in reality the payload was only 1448 bytes. This can have unintended consequences: - In tcp_measure_rcv_mss() len will be for example 1450, but. rcv_mss will be 1448 (because tp->advmss is 1448). Thus, we will always recompute scaling_ratio each time an LRO-packet is received. - In tcp_gro_receive(), it will interfere with the decision whether or not to flush and thus potentially result in less gro'ed packets. So, we need to discount the protocol headers from cqe_bcnt so we can actually divide the payload by lro_num_seg to get the real gso_size. v2: - Use "(unsigned char *)tcp + tcp->doff * 4 - skb->data)" to compute header-len (Tariq Toukan <tariqt@nvidia.com>) - Improve commit-message (Gal Pressman <gal@nvidia.com>) Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files") Signed-off-by: Christoph Paasch <cpaasch@openai.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250715-cpaasch-pf-925-investigate-incorrect-gso_size-on-cx-7-nic-v2-1-e06c3475f3ac@openai.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16selftests: packetdrill: correct the expected timing in tcp_rcv_big_endseqJakub Kicinski
Commit f5fda1a86884 ("selftests/net: packetdrill: add tcp_rcv_big_endseq.pkt") added this test recently, but it's failing with: # tcp_rcv_big_endseq.pkt:41: error handling packet: timing error: expected outbound packet at 1.230105 sec but happened at 1.190101 sec; tolerance 0.005046 sec # script packet: 1.230105 . 1:1(0) ack 54001 win 0 # actual packet: 1.190101 . 1:1(0) ack 54001 win 0 It's unclear why the test expects the ack to be delayed. Correct it. Link: https://patch.msgid.link/20250715142849.959444-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16ethtool: Don't check for RXFH fields conflict when no input_xfrm is requestedGal Pressman
The requirement of ->get_rxfh_fields() in ethtool_set_rxfh() is there to verify that we have no conflict of input_xfrm with the RSS fields options, there is no point in doing it if input_xfrm is not supported/requested. This is under the assumption that a driver that supports input_xfrm will also support ->get_rxfh_fields(), so add a WARN_ON() to ethtool_check_ops() to verify it, and remove the op NULL check. This fixes the following error in mlx4_en, which doesn't support getting/setting RXFH fields. $ ethtool --set-rxfh-indir eth2 hfunc xor Cannot set RX flow hash configuration: Operation not supported Fixes: 72792461c8e8 ("net: ethtool: don't mux RXFH via rxnfc callbacks") Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250715140754.489677-1-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16selftests: rtnetlink: fix addrlft test flakiness on power-saving systemsHangbin Liu
Jakub reported that the rtnetlink test for the preferred lifetime of an address has become quite flaky. The issue started appearing around the 6.16 merge window in May, and the test fails with: FAIL: preferred_lft addresses remaining The flakiness might be related to power-saving behavior, as address expiration is handled by a "power-efficient" workqueue. To address this, use slowwait to check more frequently whether the address still exists. This reduces the likelihood of the system entering a low-power state during the test, improving reliability. Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20250715043459.110523-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16bcachefs: Fix bch2_maybe_casefold() when CONFIG_UTF8=nKent Overstreet
maybe_casefold() shouldn't have been nooped, just bch2_casefold(). Fixes: 94426e4201fb ("bcachefs: opts.casefold_disabled") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-16bcachefs: Fix build when CONFIG_UNICODE=nKent Overstreet
94426e4201fb, which added the killswitch for casefolding, accidentally removed some of the ifdefs we need to avoid build errors. It appears we need better build testing for different configurations, it took two weeks for the robots to catch this one. Fixes: 94426e4201fb ("bcachefs: opts.casefold_disabled") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-16bcachefs: Fix reference to invalid bucket in copygcKent Overstreet
Use bch2_dev_bucket_tryget() instead of bch2_dev_tryget() before checking the bucket bitmap. Reported-by: syzbot+3168625f36f4a539237e@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-16bcachefs: Don't build aux search tree when still repairing nodeKent Overstreet
bch2_btree_node_drop_keys_outside_node() will (re)build aux search trees, because it's also called by topology repair. bch2_btree_node_read_done() was calling it before validating individual keys; invalid ones have to be dropped. If we call drop_keys_outside_node() first, then bch2_bset_build_aux_tree() doesn't run because the node already has an aux search tree - which was invalidated by the repair. Reported-by: syzbot+c5e7a66b3b23ae65d44f@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-16bcachefs: Tweak threshold for allocator triggering discardsKent Overstreet
The allocator path has a "if we're really low on free buckets, check if we should issue discards" - tweak this to also trigger discards if more than 1/128th of the device is in need_discard state. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-16bcachefs: Fix triggering of discard by the journal pathKent Overstreet
It becomes possible to do discards after a journal flush, which naturally the journal code is reponsible for. A prior refactoring seems to have broken this - which went unnoticed because the foreground allocator path can also trigger discards. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-16drm/amdgpu/gfx8: reset compute ring wptr on the GPU on resumeEeli Haapalainen
Commit 42cdf6f687da ("drm/amdgpu/gfx8: always restore kcq MQDs") made the ring pointer always to be reset on resume from suspend. This caused compute rings to fail since the reset was done without also resetting it for the firmware. Reset wptr on the GPU to avoid a disconnect between the driver and firmware wptr. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3911 Fixes: 42cdf6f687da ("drm/amdgpu/gfx8: always restore kcq MQDs") Signed-off-by: Eeli Haapalainen <eeli.haapalainen@protonmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2becafc319db3d96205320f31cc0de4ee5a93747) Cc: stable@vger.kernel.org
2025-07-16drm/amdgpu: Increase reset counter only on successLijo Lazar
Increment the reset counter only if soft recovery succeeded. This is consistent with a ring hard reset behaviour where counter gets incremented only if hard reset succeeded. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 25c314aa3ec3d30e4ee282540e2096b5c66a2437) Cc: stable@vger.kernel.org
2025-07-16drm/radeon: Do not hold console lock during resumeThomas Zimmermann
The function radeon_resume_kms() acquires the console lock. It is inconsistent, as it depends on the notify_client argument. That lock then covers a number of suspend operations that are unrelated to the console. Remove the calls to console_lock() and console_unlock() from the radeon function. The console lock is only required by DRM's fbdev emulation, which acquires it as necessary. Also fixes a possible circular dependency between the console lock and the client-list mutex, where the mutex is supposed to be taken first. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit fff8e0504499a929f26e2fb7cf7e2c9854e37b91)
2025-07-16drm/radeon: Do not hold console lock while suspending clientsThomas Zimmermann
The radeon driver holds the console lock while suspending in-kernel DRM clients. This creates a circular dependency with the client-list mutex, which is supposed to be acquired first. Reported when combining radeon with another DRM driver. Therefore, do not take the console lock in radeon, but let the fbdev DRM client acquire the lock when needed. This is what all other DRM drivers so. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reported-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Closes: https://lore.kernel.org/dri-devel/0a087cfd-bd4c-48f1-aa2f-4a3b12593935@oss.qualcomm.com/ Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 612ec7c69d04cb58beb1332c2806da9f2f47a3ae)
2025-07-16drm/amd/display: Disable CRTC degamma LUT for DCN401Melissa Wen
In DCN401 pre-blending degamma LUT isn't affecting cursor as in previous DCN version. As this is not the behavior close to what is expected for CRTC degamma LUT, disable CRTC degamma LUT property in this HW. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/4176 --- When enabling HDR on KDE, it takes the first CRTC 1D LUT available and apply a color transformation (Gamma 2.2 -> PQ). AMD driver usually advertises a CRTC degamma LUT as the first CRTC 1D LUT, but it's actually applied pre-blending. In previous HW version, it seems to work fine because the 1D LUT was applied to cursor too, but DCN401 presents a different behavior and the 1D LUT isn't affecting the hardware cursor. To address the wrong gamma on cursor with HDR (see the link), I came up with this patch that disables CRTC degamma LUT in this hw, since it presents a different behavior than others. With this KDE sees CRTC regamma LUT as the first post-blending 1D LUT available. This is actually more consistent with AMD color pipeline. It was tested by the reporter, since I don't have the HW available for local testing and debugging. Melissa --- Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 340231cdceec2c45995d773a358ca3c341f151aa) Cc: stable@vger.kernel.org
2025-07-16drm/amd/display: Free memory allocationClayton King
[WHY] Free memory to avoid memory leak Reviewed-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Clayton King <clayton.king@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit fa699acb8e9be2341ee318077fa119acc7d5f329) Cc: stable@vger.kernel.org
2025-07-16Merge tag 'probes-fixes-v6.16-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fix from Masami Hiramatsu: - fprobe-event: The @params variable was being used in an error path without being initialized. The fix to return an error code. * tag 'probes-fixes-v6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/probes: Avoid using params uninitialized in parse_btf_arg()