summaryrefslogtreecommitdiff
path: root/drivers/net
AgeCommit message (Collapse)Author
2025-07-03wifi: mt76: mt7915: mcu: lower default timeoutDavid Bauer
The default timeout set in mt76_connac2_mcu_fill_message of 20 seconds leads to excessive stalling in case messages are lost. Testing showed that a smaller timeout of 5 seconds is sufficient in normal operation. Signed-off-by: David Bauer <mail@david-bauer.net> Link: https://patch.msgid.link/20250402004528.1036715-1-mail@david-bauer.net Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-03wifi: mt76: mt7915: mcu: increase eeprom command timeoutDavid Bauer
Increase the timeout for MCU_EXT_CMD_EFUSE_BUFFER_MODE command. Regular retries upon hardware-recovery have been observed. Increasing the timeout slightly remedies this problem. Signed-off-by: David Bauer <mail@david-bauer.net> Link: https://patch.msgid.link/20250402004528.1036715-2-mail@david-bauer.net Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-03igbvf: add tx_timeout_count to ethtool statisticsKohei Enju
Add `tx_timeout_count` to ethtool statistics to provide visibility into transmit timeout events, bringing igbvf in line with other Intel ethernet drivers. Currently `tx_timeout_count` is incremented in igbvf_watchdog_task() and igbvf_tx_timeout() but is not exposed to userspace nor used elsewhere in the driver. Before: # ethtool -S ens5 | grep tx tx_packets: 43 tx_bytes: 4408 tx_restart_queue: 0 After: # ethtool -S ens5 | grep tx tx_packets: 41 tx_bytes: 4241 tx_restart_queue: 0 tx_timeout_count: 0 Tested-by: Kohei Enju <enjuk@amazon.com> Signed-off-by: Kohei Enju <enjuk@amazon.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03igbvf: remove unused interrupt counter fields from struct igbvf_adapterKohei Enju
Remove `int_counter0` and `int_counter1` from struct igbvf_adapter since they are only incremented in interrupt handlers igbvf_intr_msix_rx() and igbvf_msix_other(), but never read or used anywhere in the driver. Note that igbvf_intr_msix_tx() does not have similar counter increments, suggesting that these were likely overlooked during development. Eliminate the fields and their unnecessary accesses in interrupt handlers. Tested-by: Kohei Enju <enjuk@amazon.com> Signed-off-by: Kohei Enju <enjuk@amazon.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03ixgbe: spelling correctionsSimon Horman
Correct spelling as flagged by codespell. Signed-off-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03ixgbe: turn off MDD while modifying SRRCTLRadoslaw Tyl
Modifying SRRCTL register can generate MDD event. Turn MDD off during SRRCTL register write to prevent generating MDD. Fix RCT in ixgbe_set_rx_drop_en(). Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03ixgbe: add Tx hang detection unhandled MDDSlawomir Mrozowicz
Add Tx Hang detection due to an unhandled MDD Event. Previously, a malicious VF could disable the entire port causing TX to hang on the E610 card. Those events that caused PF to freeze were not detected as an MDD event and usually required a Tx Hang watchdog timer to catch the suspension, and perform a physical function reset. Implement flows in the affected PF driver in such a way to check the cause of the hang, detect it as an MDD event and log an entry of the malicious VF that caused the Hang. The PF blocks the malicious VF, if it continues to be the source of several MDD events. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com> Co-developed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03ixgbe: check for MDD eventsDon Skidmore
When an event is detected it is logged and, for the time being, the queue is immediately re-enabled. This is due to the lack of an API to the hypervisor so it could deal with it as it chooses. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03ixgbe: add MDD supportPaul Greenwalt
Add malicious driver detection to ixgbe driver. The supported devices are E610 and X550. Handling MDD events is enabled while VFs are created and turned off when they are disabled. There is no runtime command to enable or disable MDD independently. MDD event is logged when malicious VF driver is detected. For example VF can try to send incorrect Tx descriptor (TSO on, but length field not correct). It can be reproduced by manipulating the driver, or using driver with incorrect descriptor values. Example log: "Malicious event on VF 0 tx:128 rx:128" Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03i40e: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()Vladimir Oltean
New timestamping API was introduced in commit 66f7223039c0 ("net: add NDOs for configuring hardware timestamping") from kernel v6.6. It is time to convert the Intel i40e driver to the new API, so that timestamping configuration can be removed from the ndo_eth_ioctl() path completely. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Milena Olech <milena.olech@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03ixgbe: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()Vladimir Oltean
New timestamping API was introduced in commit 66f7223039c0 ("net: add NDOs for configuring hardware timestamping") from kernel v6.6. It is time to convert the Intel ixgbe driver to the new API, so that timestamping configuration can be removed from the ndo_eth_ioctl() path completely. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Milena Olech <milena.olech@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03igb: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()Vladimir Oltean
New timestamping API was introduced in commit 66f7223039c0 ("net: add NDOs for configuring hardware timestamping") from kernel v6.6. It is time to convert the Intel igb driver to the new API, so that timestamping configuration can be removed from the ndo_eth_ioctl() path completely. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Milena Olech <milena.olech@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03igc: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()Vladimir Oltean
New timestamping API was introduced in commit 66f7223039c0 ("net: add NDOs for configuring hardware timestamping") from kernel v6.6. It is time to convert the Intel igc driver to the new API, so that timestamping configuration can be removed from the ndo_eth_ioctl() path completely. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Milena Olech <milena.olech@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03ice: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()Vladimir Oltean
New timestamping API was introduced in commit 66f7223039c0 ("net: add NDOs for configuring hardware timestamping") from kernel v6.6. It is time to convert the Intel ice driver to the new API, so that timestamping configuration can be removed from the ndo_eth_ioctl() path completely. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Milena Olech <milena.olech@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03bonding: don't force LACPDU tx to ~333 ms boundariesSeth Forshee (DigitalOcean)
The timer which ensures that no more than 3 LACPDUs are transmitted in a second rearms itself every 333ms regardless of whether an LACPDU is transmitted when the timer expires. This causes LACPDU tx to be delayed until the next expiration of the timer, which effectively aligns LACPDUs to ~333ms boundaries. This results in a variable amount of jitter in the timing of periodic LACPDUs. Change this to only rearm the timer when an LACPDU is actually sent, allowing tx at any point after the timer has expired. Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org> Reviewed-by: Carlos Bilbao <carlos.bilbao@kernel.org> Link: https://patch.msgid.link/20250625-fix-lacpdu-jitter-v1-1-4d0ee627e1ba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-03net: ngbe: specify IRQ vector when the number of VFs is 7Jiawen Wu
For NGBE devices, the queue number is limited to be 1 when SRIOV is enabled. In this case, IRQ vector[0] is used for MISC and vector[1] is used for queue, based on the previous patches. But for the hardware design, the IRQ vector[1] must be allocated for use by the VF[6] when the number of VFs is 7. So the IRQ vector[0] should be shared for PF MISC and QUEUE interrupts. +-----------+----------------------+ | Vector | Assigned To | +-----------+----------------------+ | Vector 0 | PF MISC and QUEUE | | Vector 1 | VF 6 | | Vector 2 | VF 5 | | Vector 3 | VF 4 | | Vector 4 | VF 3 | | Vector 5 | VF 2 | | Vector 6 | VF 1 | | Vector 7 | VF 0 | +-----------+----------------------+ Minimize code modifications, only adjust the IRQ vector number for this case. Fixes: 877253d2cbf2 ("net: ngbe: add sriov function support") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20250701063030.59340-4-jiawenwu@trustnetic.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-03net: wangxun: revert the adjustment of the IRQ vector sequenceJiawen Wu
Due to hardware limitations of NGBE, queue IRQs can only be requested on vector 0 to 7. When the number of queues is set to the maximum 8, the PCI IRQ vectors are allocated from 0 to 8. The vector 0 is used by MISC interrupt, and althrough the vector 8 is used by queue interrupt, it is unable to receive packets. This will cause some packets to be dropped when RSS is enabled and they are assigned to queue 8. So revert the adjustment of the MISC IRQ location, to make it be the last one in IRQ vectors. Fixes: 937d46ecc5f9 ("net: wangxun: add ethtool_ops for channel number") Cc: stable@vger.kernel.org Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20250701063030.59340-3-jiawenwu@trustnetic.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-03net: txgbe: request MISC IRQ in ndo_openJiawen Wu
Move the creating of irq_domain for MISC IRQ from .probe to .ndo_open, and free it in .ndo_stop, to maintain consistency with the queue IRQs. This it for subsequent adjustments to the IRQ vectors. Fixes: aefd013624a1 ("net: txgbe: use irq_domain for interrupt controller") Cc: stable@vger.kernel.org Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250701063030.59340-2-jiawenwu@trustnetic.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-03virtio_net: Enforce minimum TX ring size for reliabilityLaurent Vivier
The `tx_may_stop()` logic stops TX queues if free descriptors (`sq->vq->num_free`) fall below the threshold of (`MAX_SKB_FRAGS` + 2). If the total ring size (`ring_num`) is not strictly greater than this value, queues can become persistently stopped or stop after minimal use, severely degrading performance. A single sk_buff transmission typically requires descriptors for: - The virtio_net_hdr (1 descriptor) - The sk_buff's linear data (head) (1 descriptor) - Paged fragments (up to MAX_SKB_FRAGS descriptors) This patch enforces that the TX ring size ('ring_num') must be strictly greater than (MAX_SKB_FRAGS + 2). This ensures that the ring is always large enough to hold at least one maximally-fragmented packet plus at least one additional slot. Reported-by: Lei Yang <leiyang@redhat.com> Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20250521092236.661410-4-lvivier@redhat.com Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-03virtio_net: Cleanup '2+MAX_SKB_FRAGS'Laurent Vivier
Improve consistency by using everywhere it is needed 'MAX_SKB_FRAGS + 2' rather than '2+MAX_SKB_FRAGS' or '2 + MAX_SKB_FRAGS'. No functional change. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20250521092236.661410-3-lvivier@redhat.com Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-03virtio-net: xsk: rx: fix the frame's length checkBui Quang Minh
When calling buf_to_xdp, the len argument is the frame data's length without virtio header's length (vi->hdr_len). We check that len with xsk_pool_get_rx_frame_size() + vi->hdr_len to ensure the provided len does not larger than the allocated chunk size. The additional vi->hdr_len is because in virtnet_add_recvbuf_xsk, we use part of XDP_PACKET_HEADROOM for virtio header and ask the vhost to start placing data from hard_start + XDP_PACKET_HEADROOM - vi->hdr_len not hard_start + XDP_PACKET_HEADROOM But the first buffer has virtio_header, so the maximum frame's length in the first buffer can only be xsk_pool_get_rx_frame_size() not xsk_pool_get_rx_frame_size() + vi->hdr_len like in the current check. This commit adds an additional argument to buf_to_xdp differentiate between the first buffer and other ones to correctly calculate the maximum frame's length. Cc: stable@vger.kernel.org Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Fixes: a4e7ba702701 ("virtio_net: xsk: rx: support recv small mode") Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com> Link: https://patch.msgid.link/20250630151315.86722-2-minhquangbui99@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-03virtio-net: use the check_mergeable_len helperBui Quang Minh
Replace the current repeated code to check received length in mergeable mode with the new check_mergeable_len helper. Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20250630144212.48471-4-minhquangbui99@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-03virtio-net: remove redundant truesize check with PAGE_SIZEBui Quang Minh
The truesize is guaranteed not to exceed PAGE_SIZE in get_mergeable_buf_len(). It is saved in mergeable context, which is not changeable by the host side, so the check in receive path is quite redundant. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com> Link: https://patch.msgid.link/20250630144212.48471-3-minhquangbui99@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-03virtio-net: ensure the received length does not exceed allocated sizeBui Quang Minh
In xdp_linearize_page, when reading the following buffers from the ring, we forget to check the received length with the true allocate size. This can lead to an out-of-bound read. This commit adds that missing check. Cc: <stable@vger.kernel.org> Fixes: 4941d472bf95 ("virtio-net: do not reset during XDP set") Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20250630144212.48471-2-minhquangbui99@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-03wifi: iwlwifi: Fix error code in iwl_op_mode_dvm_start()Dan Carpenter
Preserve the error code if iwl_setup_deferred_work() fails. The current code returns ERR_PTR(0) (which is NULL) on this path. I believe the missing error code potentially leads to a use after free involving debugfs. Fixes: 90a0d9f33996 ("iwlwifi: Add missing check for alloc_ordered_workqueue") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/a7a1cd2c-ce01-461a-9afd-dbe535f8df01@sabinyo.mountain Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
2025-07-02net/mlx5: Manage TC arbiter nodes and implement full support for tc-bwCarolina Jubran
Introduce support for managing Traffic Class (TC) arbiter nodes and associated vports TC nodes within the E-Switch QoS hierarchy. This patch adds support for the new scheduling node type, `SCHED_NODE_TYPE_VPORTS_TC_TSAR`, and implements full support for setting tc-bw on both vports and nodes. Key changes include: - Introduced the new scheduling node type, `SCHED_NODE_TYPE_VPORTS_TC_TSAR`, for managing vports within the TC arbiter node. - New helper functions for creating and destroying vports TC nodes under the TC arbiter. - Updated the minimum rate normalization function to skip nodes of type `SCHED_NODE_TYPE_VPORTS_TC_TSAR`. Vports TC TSARs have bandwidth shares configured on them but not minimum rates, so their `min_rate` cannot be normalized. - Implementation of `esw_qos_tc_arbiter_scheduling_setup()` and `esw_qos_tc_arbiter_scheduling_teardown()` for initializing and cleaning up TC arbiter scheduling elements. These functions now fully support tc-bw configuration on TC arbiter nodes. - Introduced a new helper `esw_qos_calculate_tc_bw_divider()` to compute the total TC bandwidth share, which is used as a divider for normalizing each TC's share. - Added `esw_qos_tc_arbiter_get_bw_shares()` and `esw_qos_set_tc_arbiter_bw_shares()` to handle the settings of bandwidth shares for vports traffic class TSARs. - `esw_qos_set_tc_arbiter_bw_shares()` normalizes each TC share based on the total and the firmware's maximum allowed TSAR bandwidth share. - Refactored `mlx5_esw_devlink_rate_node_tc_bw_set()` and `mlx5_esw_devlink_rate_leaf_tc_bw_set()` to fully support configuring tc-bw on devlink rate nodes and vports, respectively. - Refactored `mlx5_esw_qos_node_update_parent()` to ensure that tc-bw configuration remains compatible with setting a parent on a rate node, preserving level hierarchy functionality. - Refactored `esw_qos_calc_bw_share()` to generalize its input so it can be used for both minimum rate and bandwidth share calculations. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250629142138.361537-8-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02net/mlx5: Add traffic class scheduling support for vport QoSCarolina Jubran
Introduce support for traffic class (TC) scheduling on vports by allowing the vport to own multiple TC scheduling nodes. This patch enables more granular control of QoS by defining three distinct QoS states for vports, each providing unique scheduling behavior: 1. Regular QoS: The `sched_node` represents the vport directly, handling QoS as a single scheduling entity. 2. TC QoS on the vport: The `sched_node` acts as a TC arbiter, enabling TC scheduling directly on the vport. 3. TC QoS on the parent node: The `sched_node` functions as a rate limiter, with TC arbitration enabled at the parent level, associating multiple scheduling nodes with each vport. Key changes include: - Added support for new scheduling elements, vport traffic class and rate limiter. - New helper functions for creating, destroying, and restoring vport TC scheduling nodes, handling transitions between regular QoS and TC arbitration states. - Updated `esw_qos_vport_enable()` and `esw_qos_vport_disable()` to support both regular QoS and TC arbitration states, ensuring consistent transitions between scheduling modes. - Introduced a `sched_nodes` array under `vport->qos` to store multiple TC scheduling nodes per vport, enabling finer control over per-TC QoS. - Enhanced `esw_qos_vport_update_parent()` to handle transitions between the three QoS states based on the current and new parent node types. This patch lays the groundwork for future support for configuring tc-bw on vports. Although the infrastructure is in place, full support for tc-bw is not yet implemented; attempts to set tc-bw on vports will return `-EOPNOTSUPP`. No functional changes are introduced at this stage. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250629142138.361537-7-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02net/mlx5: Add support for setting tc-bw on nodesCarolina Jubran
Introduce support for enabling and disabling Traffic Class (TC) arbitration for existing devlink rate nodes. This patch adds support for a new scheduling node type, `SCHED_NODE_TYPE_TC_ARBITER_TSAR`. Key changes include: - New helper functions for transitioning existing rate nodes to TC arbiter nodes and vice versa. These functions handle the allocation of TC arbiter nodes, copying of child nodes, and restoring vport QoS settings when TC arbitration is disabled. - Implementation of `mlx5_esw_devlink_rate_node_tc_bw_set()` to manage tc-bw configuration on nodes. - Introduced stubs for `esw_qos_tc_arbiter_scheduling_setup()` and `esw_qos_tc_arbiter_scheduling_teardown()`, which will be extended in future patches to provide full support for tc-bw on devlink rate objects. - Validation functions for tc-bw settings, allowing graceful handling of unsupported traffic class bandwidth configurations. - Updated `__esw_qos_alloc_node()` to insert the new node into the parent’s children list only if the parent is not NULL. For the root TSAR, the new node is inserted directly after the allocation call. - Don't allow `tc-bw` configuration for nodes containing non-leaf children. This patch lays the groundwork for future support for configuring tc-bw on devlink rate nodes. Although the infrastructure is in place, full support for tc-bw is not yet implemented; attempts to set tc-bw on nodes will return `-EOPNOTSUPP`. No functional changes are introduced at this stage. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250629142138.361537-6-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02net/mlx5: Add no-op implementation for setting tc-bw on rate objectsCarolina Jubran
Introduce `mlx5_esw_devlink_rate_node_tc_bw_set()` and `mlx5_esw_devlink_rate_leaf_tc_bw_set()` with no-op logic. Future patches will add support for setting traffic class bandwidth on rate objects. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250629142138.361537-5-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02selftest: netdevsim: Add devlink rate tc-bw testCarolina Jubran
Test verifies that netdevsim correctly implements devlink ops callbacks that set tc-bw on leaf or node rate object. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250629142138.361537-4-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02netlink: introduce type-checking attribute iteration for nlmsgCarolina Jubran
Add the nlmsg_for_each_attr_type() macro to simplify iteration over attributes of a specific type in a Netlink message. Convert existing users in vxlan and nfsd to use the new macro. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250629142138.361537-2-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02tun: remove unnecessary tun_xdp_hdr structureJason Wang
With f95f0f95cfb7("net, xdp: Introduce xdp_init_buff utility routine"), buffer length could be stored as frame size so there's no need to have a dedicated tun_xdp_hdr structure. We can simply store virtio net header instead. Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20250701010352.74515-1-jasowang@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02Merge branch '200GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-07-01 (idpf, igc) For idpf: Michal returns 0 for key size when RSS is not supported. Ahmed changes control queue to a spinlock due to sleeping calls. For igc: Vitaly disables L1.2 PCI-E link substate on I226 devices to resolve performance issues. * '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: igc: disable L1.2 PCI-E link substate to avoid performance issue idpf: convert control queue mutex to a spinlock idpf: return 0 size for RSS key if not supported ==================== Link: https://patch.msgid.link/20250701164317.2983952-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02amd-xgbe: add support for giant packet sizeRaju Rangoju
AMD XGBE hardware supports giant Ethernet frames up to 16K bytes. Add support for configuring and enabling giant packet handling in the driver. - Define new register fields and macros for giant packet support. - Update the jumbo frame configuration logic to enable giant packet mode when MTU exceeds the jumbo threshold. Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250701121929.319690-1-Raju.Rangoju@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02net: ifb: support BIG TCP packetsEric Dumazet
Set the driver limit to GSO_MAX_SIZE (512 KB). This allows the admin/user to set a GSO limit up to this value, to avoid segmenting too large GRO packets in the netem -> ifb path. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250701084540.459261-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02net: libwx: fix the incorrect display of the queue numberJiawen Wu
When setting "ethtool -L eth0 combined 1", the number of RX/TX queue is changed to be 1. RSS is disabled at this moment, and the indices of FDIR have not be changed in wx_set_rss_queues(). So the combined count still shows the previous value. This issue was introduced when supporting FDIR. Fix it for those devices that support FDIR. Fixes: 34744a7749b3 ("net: txgbe: add FDIR info to ethtool ops") Cc: stable@vger.kernel.org Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/A5C8FE56D6C04608+20250701070625.73680-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02amd-xgbe: do not double read link statusRaju Rangoju
The link status is latched low so that momentary link drops can be detected. Always double-reading the status defeats this design feature. Only double read if link was already down This prevents unnecessary duplicate readings of the link status. Fixes: 4f3b20bfbb75 ("amd-xgbe: add support for rx-adaptation") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250701065016.4140707-1-Raju.Rangoju@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02net: thunderbolt: Enable end-to-end flow control also in transmitzhangjianrong
According to USB4 specification, if E2E flow control is disabled for the Transmit Descriptor Ring, the Host Interface Adapter Layer shall not require any credits to be available before transmitting a Tunneled Packet from this Transmit Descriptor Ring, so e2e flow control should be enabled in both directions. Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/20250624153805.GC2824380@black.fi.intel.com Signed-off-by: zhangjianrong <zhangjianrong5@huawei.com> Link: https://patch.msgid.link/20250628093813.647005-1-zhangjianrong5@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02net: thunderbolt: Fix the parameter passing of ↵zhangjianrong
tb_xdomain_enable_paths()/tb_xdomain_disable_paths() According to the description of tb_xdomain_enable_paths(), the third parameter represents the transmit ring and the fifth parameter represents the receive ring. tb_xdomain_disable_paths() is the same case. [Jakub] Mika says: it works now because both rings ->hop is the same Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/20250625051149.GD2824380@black.fi.intel.com Signed-off-by: zhangjianrong <zhangjianrong5@huawei.com> Link: https://patch.msgid.link/20250628094920.656658-1-zhangjianrong5@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02net/mlx5: Check device memory pointer before usageStav Aviram
Add a NULL check before accessing device memory to prevent a crash if dev->dm allocation in mlx5_init_once() fails. Fixes: c9b9dcb430b3 ("net/mlx5: Move device memory management to mlx5_core") Signed-off-by: Stav Aviram <saviram@nvidia.com> Link: https://patch.msgid.link/c88711327f4d74d5cebc730dc629607e989ca187.1751370035.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-07-02net/mlx5: fs, fix RDMA TRANSPORT init cleanup flowPatrisious Haddad
Failing during the initialization of root_namespace didn't cleanup the priorities of the namespace on which the failure occurred. Properly cleanup said priorities on failure. Fixes: 52931f55159e ("net/mlx5: fs, add multiple prios to RDMA TRANSPORT steering domain") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Link: https://patch.msgid.link/78cf89b5d8452caf1e979350b30ada6904362f66.1751451780.git.leon@kernel.org Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-07-02wifi: ath12k: add extended NSS bandwidth support for 160 MHzPradeep Kumar Chitrapu
Currently rx and tx MCS map for 160 MHz under HE capabilities are not updating properly, when 160 MHz is configured with NSS lesser than max NSS support. Fix this by utilizing nss_ratio_enabled and nss_ratio_info fields sent by firmware in service ready event. However, if firmware advertises EXT NSS BW support in VHT caps as 1(1x2) and when nss_ratio_info indicates 1:1, reset the EXT NSS BW Support in VHT caps to 0 which indicates 1x1. This is to avoid incorrectly choosing 1:2 NSS ratio when using the default VHT caps advertised by firmware. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1 Signed-off-by: Pradeep Kumar Chitrapu <quic_pradeepc@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://patch.msgid.link/20250701010408.1257201-10-quic_pradeepc@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-07-02wifi: ath12k: add support for 160 MHz bandwidthPradeep Kumar Chitrapu
Add support to configure maximum NSS in 160 MHz bandwidth. Firmware advertises support for handling NSS ratio information as a part of service ready ext event using nss_ratio_enabled flag. Save this information in ath12k_pdev_cap to calculate NSS ratio. Additionally, reorder the code by moving ath12k_peer_assoc_h_phymode() before ath12k_peer_assoc_h_vht() to ensure that arg->peer_phymode correctly reflects the bandwidth in the max NSS calculation. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1 Co-developed-by: P Praneesh <quic_ppranees@quicinc.com> Signed-off-by: P Praneesh <quic_ppranees@quicinc.com> Signed-off-by: Pradeep Kumar Chitrapu <quic_pradeepc@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://patch.msgid.link/20250701010408.1257201-9-quic_pradeepc@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-07-02wifi: ath12k: clean up 80P80 supportPradeep Kumar Chitrapu
Clean up unused 80P80 references as hardware does not support it. This is applicable to both QCN9274 and WCN7850. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1 Signed-off-by: Pradeep Kumar Chitrapu <quic_pradeepc@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://patch.msgid.link/20250701010408.1257201-8-quic_pradeepc@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-07-02wifi: ath12k: add support for setting fixed HE rate/GI/LTFPradeep Kumar Chitrapu
Add support to set fixed HE rate/GI/LTF values using nl80211. Reuse parts of the existing code path already used for HT/VHT to implement the new helpers symmetrically, similar to how HT/VHT is handled. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1 Co-developed-by: Muna Sinada <quic_msinada@quicinc.com> Signed-off-by: Muna Sinada <quic_msinada@quicinc.com> Signed-off-by: Pradeep Kumar Chitrapu <quic_pradeepc@quicinc.com> Link: https://patch.msgid.link/20250701010408.1257201-7-quic_pradeepc@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-07-02wifi: ath12k: generate rx and tx mcs maps for supported HE mcsPradeep Kumar Chitrapu
Generate rx and tx mcs maps in ath12k_mac_set_hemcsmap() based on number of supported tx/rx chains and set them in supported mcs/nss for HE capabilities. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1 Co-developed-by: Muna Sinada <quic_msinada@quicinc.com> Signed-off-by: Muna Sinada <quic_msinada@quicinc.com> Signed-off-by: Pradeep Kumar Chitrapu <quic_pradeepc@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://patch.msgid.link/20250701010408.1257201-5-quic_pradeepc@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-07-02wifi: ath12k: move HE MCS mapper to a separate functionPradeep Kumar Chitrapu
Refactor the HE MCS mapper functionality in ath12k_mac_copy_he_cap() into a new function. This helps improve readability, extensibility and will be used when adding support for 160 MHz bandwidth in subsequent patches. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1 Co-developed-by: Muna Sinada <quic_msinada@quicinc.com> Signed-off-by: Muna Sinada <quic_msinada@quicinc.com> Signed-off-by: Pradeep Kumar Chitrapu <quic_pradeepc@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://patch.msgid.link/20250701010408.1257201-4-quic_pradeepc@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-07-02wifi: ath12k: push EHT MU-MIMO params to hardwarePradeep Kumar Chitrapu
Currently, only the EHT IE in management frames is updated with respect to MU-MIMO configurations, but this change is not reflected in the hardware. Add support to propagate MU-MIMO configurations to the hardware as well for AP mode. Similar support for STA mode will be added in future. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1 Co-developed-by: Muna Sinada <quic_msinada@quicinc.com> Signed-off-by: Muna Sinada <quic_msinada@quicinc.com> Signed-off-by: Pradeep Kumar Chitrapu <quic_pradeepc@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://patch.msgid.link/20250701010408.1257201-3-quic_pradeepc@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-07-02wifi: ath12k: push HE MU-MIMO params to hardwarePradeep Kumar Chitrapu
Currently, only the HE IE in management frames is updated with respect to MU-MIMO configurations, but this change is not reflected in the hardware. Add support to propagate MU-MIMO configurations to the hardware as well. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1 Co-developed-by: Muna Sinada <quic_msinada@quicinc.com> Signed-off-by: Muna Sinada <quic_msinada@quicinc.com> Signed-off-by: Pradeep Kumar Chitrapu <quic_pradeepc@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://patch.msgid.link/20250701010408.1257201-2-quic_pradeepc@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-07-02nui: Fix dma_mapping_error() checkThomas Fourier
dma_map_XXX() functions return values DMA_MAPPING_ERROR as error values which is often ~0. The error value should be tested with dma_mapping_error(). This patch creates a new function in niu_ops to test if the mapping failed. The test is fixed in niu_rbr_add_page(), added in niu_start_xmit() and the successfully mapped pages are unmaped upon error. Fixes: ec2deec1f352 ("niu: Fix to check for dma mapping errors.") Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>