summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-05-22Bluetooth: btusb: Add new VID/PID 13d3/3584 for MT7922Liwei Sun
A new variant of MT7922 wireless device has been identified. The device introduces itself as MEDIATEK MT7922, so treat it as MediaTek device. With this patch, btusb driver works as expected: [ 3.151162] Bluetooth: Core ver 2.22 [ 3.151185] Bluetooth: HCI device and connection manager initialized [ 3.151189] Bluetooth: HCI socket layer initialized [ 3.151191] Bluetooth: L2CAP socket layer initialized [ 3.151194] Bluetooth: SCO socket layer initialized [ 3.295718] Bluetooth: hci0: HW/SW Version: 0x008a008a, Build Time: 20241106163512 [ 4.676634] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 [ 4.676637] Bluetooth: BNEP filters: protocol multicast [ 4.676640] Bluetooth: BNEP socket layer initialized [ 5.560453] Bluetooth: hci0: Device setup in 2320660 usecs [ 5.560457] Bluetooth: hci0: HCI Enhanced Setup Synchronous Connection command is advertised, but not supported. [ 5.619197] Bluetooth: hci0: AOSP extensions version v1.00 [ 5.619204] Bluetooth: hci0: AOSP quality report is supported [ 5.619301] Bluetooth: MGMT ver 1.23 [ 6.741247] Bluetooth: RFCOMM TTY layer initialized [ 6.741258] Bluetooth: RFCOMM socket layer initialized [ 6.741261] Bluetooth: RFCOMM ver 1.11 lspci output: 04:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express Wireless Network Adapter USB information: T: Bus=01 Lev=01 Prnt=01 Port=04 Cnt=02 Dev#= 3 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=13d3 ProdID=3584 Rev= 1.00 S: Manufacturer=MediaTek Inc. S: Product=Wireless_Device S: SerialNumber=000000000 C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=100mA A: FirstIf#= 0 IfCount= 3 Cls=e0(wlcon) Sub=01 Prot=01 I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=125us E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms I: If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 63 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 63 Ivl=1ms I: If#= 2 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 64 Ivl=125us I:* If#= 2 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=8a(I) Atr=03(Int.) MxPS= 512 Ivl=125us Signed-off-by: Liwei Sun <sunliweis@126.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-05-22Bluetooth: btusb: use skb_pull to avoid unsafe access in QCA dump handlingEn-Wei Wu
Use skb_pull() and skb_pull_data() to safely parse QCA dump packets. This avoids direct pointer math on skb->data, which could lead to invalid access if the packet is shorter than expected. Fixes: 20981ce2d5a5 ("Bluetooth: btusb: Add WCN6855 devcoredump support") Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-05-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.15-rc8). Conflicts: 80f2ab46c2ee ("irdma: free iwdev->rf after removing MSI-X") 4bcc063939a5 ("ice, irdma: fix an off by one in error handling code") c24a65b6a27c ("iidc/ice/irdma: Update IDC to support multiple consumers") https://lore.kernel.org/20250513130630.280ee6c5@canb.auug.org.au No extra adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-22vfio/type1: Fix error unwind in migration dirty bitmap allocationLi RongQing
When setting up dirty page tracking at the vfio IOMMU backend for device migration, if an error is encountered allocating a tracking bitmap, the unwind loop fails to free previously allocated tracking bitmaps. This occurs because the wrong loop index is used to generate the tracking object. This results in unintended memory usage for the life of the current DMA mappings where bitmaps were successfully allocated. Use the correct loop index to derive the tracking object for freeing during unwind. Fixes: d6a4c185660c ("vfio iommu: Implementation of ioctl for dirty pages tracking") Signed-off-by: Li RongQing <lirongqing@baidu.com> Link: https://lore.kernel.org/r/20250521034647.2877-1-lirongqing@baidu.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-05-22Merge tag 'icc-6.16-rc1' of ↵Greg Kroah-Hartman
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-next Georgi writes: interconnect changes for 6.16 This pull request contains the interconnect changes for the 6.16-rc1 merge window. The core and driver changes are listed below. Core changes: - Add support for dynamic id allocation, that allows creating multiple instances of the same provider Driver changes: - Add driver for the EPSS L3 instances on SA8775P SoC - Add QoS support for SM8650 SoC - Add some missing nodes for SM8650 - Misc dt-binding style and indentation fixes Signed-off-by: Georgi Djakov <djakov@kernel.org> * tag 'icc-6.16-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/djakov/icc: interconnect: qcom: sm8650: remove regmap config for mc_virt & clk_virt interconnect: qcom: sm8650: add the MASTER_APSS_NOC dt-bindings: interconnect: sm8650: document the MASTER_APSS_NOC interconnect: qcom: sm8650: enable QoS configuration dt-bindings: interconnect: Correct indentation and style in DTS example interconnect: qcom: sa8775p: Add dynamic icc node id support interconnect: qcom: icc-rpmh: Add dynamic icc node id support interconnect: qcom: Add multidev EPSS L3 support interconnect: core: Add dynamic id allocation support dt-bindings: interconnect: Add EPSS L3 compatible for SA8775P
2025-05-22Merge tag 'net-6.15-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "This is somewhat larger than what I hoped for, with a few PRs from subsystems and follow-ups for the recent netdev locking changes, anyhow there are no known pending regressions. Including fixes from bluetooth, ipsec and CAN. Current release - regressions: - eth: team: grab team lock during team_change_rx_flags - eth: bnxt_en: fix netdev locking in ULP IRQ functions Current release - new code bugs: - xfrm: ipcomp: fix truesize computation on receive - eth: airoha: fix page recycling in airoha_qdma_rx_process() Previous releases - regressions: - sched: hfsc: fix qlen accounting bug when using peek in hfsc_enqueue() - mr: consolidate the ipmr_can_free_table() checks. - bridge: netfilter: fix forwarding of fragmented packets - xsk: bring back busy polling support in XDP_COPY - can: - add missing rcu read protection for procfs content - kvaser_pciefd: force IRQ edge in case of nested IRQ Previous releases - always broken: - xfrm: espintcp: remove encap socket caching to avoid reference leak - bluetooth: use skb_pull to avoid unsafe access in QCA dump handling - eth: idpf: - fix null-ptr-deref in idpf_features_check - fix idpf_vport_splitq_napi_poll() - eth: hibmcge: fix wrong ndo.open() after reset fail issue" * tag 'net-6.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (40 commits) octeontx2-af: Fix APR entry mapping based on APR_LMT_CFG octeontx2-af: Set LMT_ENA bit for APR table entries net/tipc: fix slab-use-after-free Read in tipc_aead_encrypt_done octeontx2-pf: Avoid adding dcbnl_ops for LBK and SDP vf selftests/tc-testing: Add an HFSC qlen accounting test sch_hfsc: Fix qlen accounting bug when using peek in hfsc_enqueue() idpf: fix idpf_vport_splitq_napi_poll() net: hibmcge: fix wrong ndo.open() after reset fail issue. net: hibmcge: fix incorrect statistics update issue xsk: Bring back busy polling support in XDP_COPY can: slcan: allow reception of short error messages net: lan743x: Restore SGMII CTRL register on resume bnxt_en: Fix netdev locking in ULP IRQ functions MAINTAINERS: Drop myself to reviewer for ravb driver net: dwmac-sun8i: Use parsed internal PHY address instead of 1 net: ethernet: ti: am65-cpsw: Lower random mac address error print to info can: kvaser_pciefd: Continue parsing DMA buf after dropped RX can: kvaser_pciefd: Fix echo_skb race can: kvaser_pciefd: Force IRQ edge in case of nested IRQ idpf: fix null-ptr-deref in idpf_features_check ...
2025-05-22net/mlx5e: Convert mlx5 netdevs to instance lockingCosmin Ratiu
This patch convert mlx5 to use the new netdev instance lock in addition to the pre-existing state_lock (and the RTNL). mlx5e_priv.state_lock was already used throughout mlx5 to protect against concurrent state modifications on the same netdev, usually in addition to the RTNL. The new netdev instance lock will eventually replace it, but for now, it is acquired in addition to the existing locks in the order RTNL -> instance lock -> state_lock. All three netdev types handled by mlx5 are converted to the new style of locking, because they share a lot of code related to initializing channels and dealing with NAPI, so it's better to convert all three rather than introduce different assumptions deep in the call stack depending on the type of device. Because of the nature of the call graphs in mlx5, it wasn't possible to incrementally convert parts of the driver to use the new lock, since either all call paths into NAPI have to possess the new lock if the *_locked variants are used, or none of them can have the lock. One area which required extra care is the interaction between closing channels and devlink health reporter tasks. Previously, the recovery tasks were unconditionally acquiring the RTNL, which could lead to deadlocks in these scenarios: T1: mlx5e_close (== .ndo_stop(), has RTNL) -> mlx5e_close_locked -> mlx5e_close_channels -> mlx5e_ptp_close -> mlx5e_ptp_close_queues -> mlx5e_ptp_close_txqsqs -> mlx5e_ptp_close_txqsq -> cancel_work_sync(&ptpsq->report_unhealthy_work) waits for T2: mlx5e_ptpsq_unhealthy_work -> mlx5e_reporter_tx_ptpsq_unhealthy -> mlx5e_health_report -> devlink_health_report -> devlink_health_reporter_recover -> mlx5e_tx_reporter_ptpsq_unhealthy_recover which does: rtnl_lock(); => Deadlock. Another similar instance of this is: T1: mlx5e_close (== .ndo_stop(), has RTNL) -> mlx5e_close_locked -> mlx5e_close_channels -> mlx5e_ptp_close -> mlx5e_ptp_close_queues -> mlx5e_ptp_close_txqsqs -> mlx5e_ptp_close_txqsq -> cancel_work_sync(&sq->recover_work) waits for T2: mlx5e_tx_err_cqe_work -> mlx5e_reporter_tx_err_cqe -> mlx5e_health_report -> devlink_health_report -> devlink_health_reporter_recover -> mlx5e_tx_reporter_err_cqe_recover which does: rtnl_lock(); => Another deadlock. Fix that by using the same pattern previously done in mlx5e_tx_timeout_work, where the RTNL was repeatedly tried to be acquired until either: a) it is successfully acquired or b) there's no need for the work to be done any more (channel is being closed). Now, for all three recovery tasks, the instance lock is repeatedly tried to be acquired until successful or the channel/SQ is closed. As a side-effect, drop the !test_bit(MLX5E_STATE_OPENED, &priv->state) check from mlx5e_tx_timeout_work, it's weaker than !test_bit(MLX5E_STATE_CHANNELS_ACTIVE, &priv->state) and unnecessary. Future patches will introduce new call paths (from netdev queue management ops) which can close channels (and call cancel_work_sync on the recovery tasks) without the RTNL lock and only with the netdev instance lock. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1747829342-1018757-6-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-22net/mlx5e: Don't drop RTNL during firmware flashCosmin Ratiu
There's no explanation in the original commit of why that was done, but presumably flashing takes a long time and holding RTNL for so long blocks other interactions with the netdev layer. However, the stack is moving towards netdev instance locking and dropping and reacquiring RTNL in the context of flashing introduces locking ordering issues: RTNL must be acquired before the netdev instance lock and released after it. This patch therefore takes the simpler approach by no longer dropping and reacquiring the RTNL, as soon RTNL for ethtool will be removed, leaving only the instance lock to protect against races. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1747829342-1018757-5-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-22IB/IPoIB: Allow using netdevs that require the instance lockCosmin Ratiu
After the last patch removing vlan_rwsem, it is an incremental step to allow ipoib to work with netdevs that require the instance lock. In several places, netdev_lock() is changed to netdev_lock_ops_to_full() which takes care of not acquiring the lock again when the netdev is already locked. In ipoib_ib_tx_timeout_work() and __ipoib_ib_dev_flush() for HEAVY flushes, the netdev lock is acquired/released. This is needed because these functions end up calling .ndo_stop()/.ndo_open() on subinterfaces, and the device may expect the netdev instance lock to be held. ipoib_set_mode() now explicitly acquires ops lock while manipulating the features, mtu and tx queues. Finally, ipoib_napi_enable()/ipoib_napi_disable() now use the *_locked variants of the napi_enable()/napi_disable() calls and optionally acquire the netdev lock themselves depending on the dev they operate on. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1747829342-1018757-4-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-22IB/IPoIB: Replace vlan_rwsem with the netdev instance lockCosmin Ratiu
vlan_rwsem was added more than a decade ago to work around a deadlock involving the original mutex being acquired twice, once from the wq. Subsequent changes then tweaked it to partially protect access to ipoib_dev_priv->child_intfs together with the RTNL. Flushing the wq synchronously was also since then refactored to happen separately. This semaphore unfortunately prevents updating ipoib to work with devices that require the netdev lock, because of lock ordering issues between RTNL, vlan_rwsem and the netdev instance locks of parent and child devices. To uncomplicate things, this commit replaces vlan_rwsem with the netdev instance lock of the parent device. Both parent child_intfs list and the children's list membership in it require holding the parent netdev instance lock. All call paths were carefully reviewed and no-longer-needed ASSERT_RTNL calls were dropped. Some non-trivial changes: - ipoib_match_gid_pkey_addr() now only acquires the instance lock and iterates through child_intfs for the first level of recursion (the parent), as it's not possible to have multiple levels of nested subinterfaces. - ipoib_open() and ipoib_stop() schedule tasks on the global workqueue to open/stop child interfaces to avoid potentially acquiring nested netdev instance locks. To avoid the device going away between the task scheduling and execution, netdev_hold/netdev_put are used. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1747829342-1018757-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-22IB/IPoIB: Enqueue separate work_structs for each flushed interfaceCosmin Ratiu
Previously, flushing a netdevice involved first flushing all child devices from the flush task itself. That requires holding the lock that protects the list for the entire duration of the flush. This poses a problem when converting from vlan_rwsem to the netdev instance lock (next patch), because holding the parent lock while trying to acquire a child lock makes lockdep unhappy, rightfully. Fix this by splitting a big flush task into individual flush tasks (all are already created in their respective ipoib_dev_priv structs) and defining a helper function to enqueue all of them while holding the list lock. In ipoib_set_mac, the function is not used and the task is enqueued directly, because in the subsequent patches locking is changed and this function may be called with the netdev instance lock held. This is effectively a noop, the wq is single-threaded and ordered and will execute the same flush operations in the same order as before. Furthermore, there should be no new races because ipoib_parent_unregister_pre() calls flush_workqueue() after stopping new work generation to wait for pending work to complete. flush_workqueue() waits for all currently enqueued work to finish before returning. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1747829342-1018757-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-22Revert "drm/amd: Keep display off while going into S4"Mario Limonciello
commit 68bfdc8dc0a1a ("drm/amd: Keep display off while going into S4") attempted to keep displays off during the S4 sequence by not resuming display IP. This however leads to hangs because DRM clients such as the console can try to access registers and cause a hang. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4155 Fixes: 68bfdc8dc0a1a ("drm/amd: Keep display off while going into S4") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250522141328.115095-1-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e485502c37b097b0bd773baa7e2741bf7bd2909a) Cc: stable@vger.kernel.org
2025-05-22Merge tag 'pinctrl-v6.15-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "This deals with a crash in the Qualcomm pin controller GPIO parts when using hogs. The first patch to gpiolib makes gpiochip_line_is_valid() NULL-tolerant. The second patch fixes the actual problem" * tag 'pinctrl-v6.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: qcom: switch to devm_register_sys_off_handler() gpiolib: don't crash on enabling GPIO HOG pins
2025-05-22RDMA/rxe: Break endless pagefault loop for RO pagesLeon Romanovsky
RO pages has "perm" equal to 0, that caused to the situation where such pages were marked as needed to have fault and caused to infinite loop. Fixes: eedd5b1276e7 ("RDMA/umem: Store ODP access mask information in PFN") Reported-by: Daisuke Matsuda <dskmtsd@gmail.com> Closes: https://lore.kernel.org/all/3016329a-4edd-4550-862f-b298a1b79a39@gmail.com/ Link: https://patch.msgid.link/096fab178d48ed86942ee22eafe9be98e29092aa.1747913377.git.leonro@nvidia.com Tested-by: Daisuke Matsuda <dskmtsd@gmail.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2025-05-22Merge tag 'coresight-next-v6.16' of ↵Greg Kroah-Hartman
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/coresight/linux into char-misc-next Suzuki writes: coresight: updates for Linux v6.16 CoreSight self-hosted trace driver subsystem updates for Linux v6.16 includes: - Clear CLAIM tags on device probe if self-hosted tags are set. - Support for perf AUX pause/resume for CoreSight ETM PMU driver, with trace collection at pause. - Miscellaneous fixes for the subsystem Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> * tag 'coresight-next-v6.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/coresight/linux: (27 commits) coresight: prevent deactivate active config while enabling the config coresight: holding cscfg_csdev_lock while removing cscfg from csdev coresight/etm4: fix missing disable active config coresight: etm4x: Fix timestamp bit field handling coresight: tmc: fix failure to disable/enable ETF after reading Documentation: coresight: Document AUX pause and resume coresight: perf: Update buffer on AUX pause coresight: tmc: Re-enable sink after buffer update coresight: perf: Support AUX trace pause and resume coresight: etm4x: Hook pause and resume callbacks coresight: Introduce pause and resume APIs for source coresight: etm4x: Extract the trace unit controlling coresight: cti: Replace inclusion by struct fwnode_handle forward declaration coresight: Disable MMIO logging for coresight stm driver coresight: replicator: Fix panic for clearing claim tag coresight: Add a KUnit test for coresight_find_default_sink() coresight: Remove extern from function declarations coresight: Remove inlines from static function definitions coresight: Clear self hosted claim tag on probe coresight: etm3x: Convert raw base pointer to struct coresight access ...
2025-05-22Revert "drm/amd: Keep display off while going into S4"Mario Limonciello
commit 68bfdc8dc0a1a ("drm/amd: Keep display off while going into S4") attempted to keep displays off during the S4 sequence by not resuming display IP. This however leads to hangs because DRM clients such as the console can try to access registers and cause a hang. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4155 Fixes: 68bfdc8dc0a1a ("drm/amd: Keep display off while going into S4") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250522141328.115095-1-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22ublk: run auto buf unregisgering in same io_ring_ctx with registeringMing Lei
UBLK_F_AUTO_BUF_REG requires that the buffer registered automatically is unregistered in same `io_ring_ctx`, so check it explicitly. Document this requirement for UBLK_F_AUTO_BUF_REG. Drop WARN_ON_ONCE() which is triggered from userspace code path. Fixes: 99c1e4eb6a3f ("ublk: register buffer to local io_uring with provided buf index via UBLK_F_AUTO_BUF_REG") Reported-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250522152043.399824-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-22drm/amd/pm: Fetch partition metrics on SMUv13.0.12Lijo Lazar
Add support to fetch compute partition related metrics in SMUv13.0.12 SOCs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22Revert "drm/amd/display: [FW Promotion] Release 0.1.11.0"Aurabindo Pillai
This reverts commit 81fc9ca25f02c53c055b842a40f2a915bd0bd5e0 since it introduces incompatbility with older firmware Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: seq64 memory unmap uses uninterruptible lockPhilip Yang
To unmap and free seq64 memory when drm node close to free vm, if there is signal accepted, then taking vm lock failed and leaking seq64 va mapping, and then dmesg has error log "still active bo inside vm". Change to use uninterruptible lock fix the mapping leaking and no dmesg error log. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: update ras support checkMangesh Gadre
update ras support check for vcn 5.0.1 Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: Enable RAS for jpeg 5.0.1Mangesh Gadre
Enable jpeg ras posion processing and aca error logging Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: Add jpeg poison status regMangesh Gadre
added registers to enable jpeg ras Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: Enable RAS for vcn 5.0.1Mangesh Gadre
Enable vcn ras posion processing and aca error logging Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amd/display: Add a new dcdebugmask to allow skip detection LTWayne Lin
Under specific embedded scenarios, we might still use DP interface rather than eDP interface. Under such case, detection link training is unnecessary. Add a new dcdebugmask value that can be used to skip the detection LT Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Link: https://lore.kernel.org/amd-gfx/20250521063934.2111323-1-Wayne.Lin@amd.com/ Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amd/display: no 3D and blnd LUT as DPP color caps for DCN401Melissa Wen
Match what is declared as DPP color caps with hw caps. DCN401 has MPC shaper + 3D LUTs that are movable before and after blending (get from plane or stream), but no DPP blend LUTs. Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amd/display: only collect data if debug gamut_remap is availableMelissa Wen
Color gamut_remap state log may be not available for some hw versions, so prevent null pointer dereference by checking if there is a function to collect data for this hw version. Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: Remove duplicated "context still alive" checkTvrtko Ursulin
When amdgpu_ctx_mgr_fini() calls amdgpu_ctx_mgr_entity_fini() it contains the exact same "context still alive" check as it will do next. Remove the duplicated copy. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: Make amdgpu_ctx_mgr_entity_fini staticTvrtko Ursulin
Function amdgpu_ctx_mgr_entity_fini() only has a single local caller so lets make it local. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: Update runtime pm checksAlex Deucher
Don't enable BACO when in passthrough. PCI resets don't work correctly when in BACO. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: Add vcn poison status regMangesh Gadre
added register to enable vcn ras Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amd/pm: Use external link order for xgmi dataLijo Lazar
xgmi_port_num interface reports external link number for port number. To be consistent, use the external link number for reporting other XGMI link data also. v2: For invalid link number return -EINVAL (Kevin) Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/radeon: fixing typo in macro nameJihed Chaibi
"ENABLE" is currently misspelled in SYS_INFO_GPUCAPS__ENABEL_DFS_BYPASS Signed-off-by: Jihed Chaibi <jihed.chaibi.dev@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: fixing typo in macro nameJihed Chaibi
"ENABLE" is currently misspelled in SYS_INFO_GPUCAPS__ENABEL_DFS_BYPASS PS: checkpatch.pl is complaining about the presence of a space at the start of drivers/gpu/drm/amd/include/atomfirmware.h line: 1716 This is propably because this file uses (two) spaces and not tabs. Signed-off-by: Jihed Chaibi <jihed.chaibi.dev@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amd/display: fix typo in commentsDaniil Ryabov
Fix double 'u' in 'frequuency' Signed-off-by: Daniil Ryabov <daniilryabov4@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amd/display: Adjust set_value function with prefix to help in ftraceLeonardo Gomes
Adjust set_value function in hw_hpd.c file to have prefix to help in ftrace, the name change from 'set_value' to 'dal_hw_hpd_set_value' Signed-off-by: Leonardo da Silva Gomes <leonardodasigomes@gmail.com> Co-developed-by: Derick Frias <derick.william.moraes@gmail.com> Signed-off-by: Derick Frias <derick.william.moraes@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amd/display: Adjust get_value function with prefix to help in ftraceLeonardo Gomes
Adjust get_value function in hw_hpd.c file to have prefix to help in ftrace, the name change from 'get_value' to 'dal_hw_hpd_get_value' Signed-off-by: Leonardo da Silva Gomes <leonardodasigomes@gmail.com> Co-developed-by: Derick Frias <derick.william.moraes@gmail.com> Signed-off-by: Derick Frias <derick.william.moraes@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amd/pm: Fetch partition metrics on SMUv13.0.6Lijo Lazar
Add support to fetch compute partition related metrics in SMUv13.0.6 SOCs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: Add sysfs nodes for partitionLijo Lazar
Add sysfs nodes to provide compute paritition specific data. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amd/pm: Add support to query partition metricsLijo Lazar
Add interfaces to query compute partition related metrics data. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: Register aqua vanjaram jpeg poison irqStanley.Yang
Register aqua vanjaram jpeg poison irq, add jpeg poison handle. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: Register aqua vanjaram vcn poison irqStanley.Yang
Register aqua vanjaram vcn poison irq, add vcn poison handle. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amd/pm: Use macro to initialize metrics tableLijo Lazar
Helps to keep a build time check about usage of right datatype and avoids maintenance as new versions get added. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amd/pm: Fill pldm version for SMU v13.0.6 SOCsAsad Kamal
Fetch pldm version from static metrics table for SMU v13.0.6 SOCs Signed-off-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amd/pm: Update pmfw headers for smu_v_13_0_6Asad Kamal
Update pmfw headers for smu_v_13_0_6 to include pldm version as part of statics metrics table Signed-off-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: Fix eviction fence worker race during fd closeJesse.Zhang
The current cleanup order during file descriptor close can lead to a race condition where the eviction fence worker attempts to access a destroyed mutex from the user queue manager: [ 517.294055] DEBUG_LOCKS_WARN_ON(lock->magic != lock) [ 517.294060] WARNING: CPU: 8 PID: 2030 at kernel/locking/mutex.c:564 [ 517.294094] Workqueue: events amdgpu_eviction_fence_suspend_worker [amdgpu] The issue occurs because: 1. We destroy the user queue manager (including its mutex) first 2. Then try to destroy eviction fences which may have pending work 3. The eviction fence worker may try to access the already-destroyed mutex Fix this by reordering the cleanup to: 1. First mark the fd as closing and destroy eviction fences, which flushes any pending work 2. Then safely destroy the user queue manager after we're certain no more fence work will be executed The copy in amdgpu_driver_postclose_kms() needs to be removed (Christian) Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdgpu: lock the eviction fence for wq signals itPrike Liang
Lock and refer to the eviction fence before the eviction fence schedules work queue tries to signal it. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22drm/amdkfd: Change svm_range_get_info return typeAndrey Vatoropin
Static analysis shows that pointer "svms" cannot be NULL because it points to the object "struct svm_range_list". Remove the extra NULL check. It is meaningless and harms the readability of the code. In the function svm_range_get_info() there is no possibility of failure. Therefore, the caller of the function svm_range_get_info() does not need a return value. Change the function svm_range_get_info() return type from "int" to "void". Since the function svm_range_get_info() has a return type of "void". The caller of the function svm_range_get_info() does not need a return value. Delete extra code. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Andrey Vatoropin <a.vatoropin@crpt.ru> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22EDAC/bluefield: Don't use bluefield_edac_readl() result on errorDavid Thompson
The bluefield_edac_readl() routine returns an uninitialized result on error paths. In those cases the calling routine should not use the uninitialized result. The driver should simply log the error, and then return early. Fixes: e41967575474 ("EDAC/bluefield: Use Arm SMC for EMI access on BlueField-2") Signed-off-by: David Thompson <davthompson@nvidia.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shravan Kumar Ramani <shravankr@nvidia.com> Link: https://lore.kernel.org/20250318214747.12271-1-davthompson@nvidia.com
2025-05-22eth: bnxt: fix deadlock when xdp is attached or detachedTaehee Yoo
When xdp is attached or detached, dev->ndo_bpf() is called by do_setlink(), and it acquires netdev_lock() if needed. Unlike other drivers, the bnxt driver is protected by netdev_lock while xdp is attached/detached because it sets dev->request_ops_lock to true. So, the bnxt_xdp(), that is callback of ->ndo_bpf should not acquire netdev_lock(). But the xdp_features_{set | clear}_redirect_target() was changed to acquire netdev_lock() internally. It causes a deadlock. To fix this problem, bnxt driver should use xdp_features_{set | clear}_redirect_target_locked() instead. Splat looks like: ============================================ WARNING: possible recursive locking detected 6.15.0-rc6+ #1 Not tainted -------------------------------------------- bpftool/1745 is trying to acquire lock: ffff888131b85038 (&dev->lock){+.+.}-{4:4}, at: xdp_features_set_redirect_target+0x1f/0x80 but task is already holding lock: ffff888131b85038 (&dev->lock){+.+.}-{4:4}, at: do_setlink.constprop.0+0x24e/0x35d0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&dev->lock); lock(&dev->lock); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by bpftool/1745: #0: ffffffffa56131c8 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_setlink+0x1fe/0x570 #1: ffffffffaafa75a0 (&net->rtnl_mutex){+.+.}-{4:4}, at: rtnl_setlink+0x236/0x570 #2: ffff888131b85038 (&dev->lock){+.+.}-{4:4}, at: do_setlink.constprop.0+0x24e/0x35d0 stack backtrace: CPU: 1 UID: 0 PID: 1745 Comm: bpftool Not tainted 6.15.0-rc6+ #1 PREEMPT(undef) Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021 Call Trace: <TASK> dump_stack_lvl+0x7a/0xd0 print_deadlock_bug+0x294/0x3d0 __lock_acquire+0x153b/0x28f0 lock_acquire+0x184/0x340 ? xdp_features_set_redirect_target+0x1f/0x80 __mutex_lock+0x1ac/0x18a0 ? xdp_features_set_redirect_target+0x1f/0x80 ? xdp_features_set_redirect_target+0x1f/0x80 ? __pfx_bnxt_rx_page_skb+0x10/0x10 [bnxt_en ? __pfx___mutex_lock+0x10/0x10 ? __pfx_netdev_update_features+0x10/0x10 ? bnxt_set_rx_skb_mode+0x284/0x540 [bnxt_en ? __pfx_bnxt_set_rx_skb_mode+0x10/0x10 [bnxt_en ? xdp_features_set_redirect_target+0x1f/0x80 xdp_features_set_redirect_target+0x1f/0x80 bnxt_xdp+0x34e/0x730 [bnxt_en 11cbcce8fa11cff1dddd7ef358d6219e4ca9add3] dev_xdp_install+0x3f4/0x830 ? __pfx_bnxt_xdp+0x10/0x10 [bnxt_en 11cbcce8fa11cff1dddd7ef358d6219e4ca9add3] ? __pfx_dev_xdp_install+0x10/0x10 dev_xdp_attach+0x560/0xf70 dev_change_xdp_fd+0x22d/0x280 do_setlink.constprop.0+0x2989/0x35d0 ? __pfx_do_setlink.constprop.0+0x10/0x10 ? lock_acquire+0x184/0x340 ? find_held_lock+0x32/0x90 ? rtnl_setlink+0x236/0x570 ? rcu_is_watching+0x11/0xb0 ? trace_contention_end+0xdc/0x120 ? __mutex_lock+0x946/0x18a0 ? __pfx___mutex_lock+0x10/0x10 ? __lock_acquire+0xa95/0x28f0 ? rcu_is_watching+0x11/0xb0 ? rcu_is_watching+0x11/0xb0 ? cap_capable+0x172/0x350 rtnl_setlink+0x2cd/0x570 Fixes: 03df156dd3a6 ("xdp: double protect netdev->xdp_flags with netdev->lock") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250520071155.2462843-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>