git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-07-15	net: dsa: tag_sja1105: prefer precise source port info on SJA1110 too	Vladimir Oltean
	Now that dsa_8021q_rcv() handles better the case where we don't overwrite the precise source information if it comes from an external (non-tag_8021q) source, we can now unify the call sequence between sja1105_rcv() and sja1110_rcv(). This is a preparatory change for creating a higher-level wrapper for the entire sequence which will live in tag_8021q. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/20240713211620.1125910-6-paweldembicki@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-15	net: dsa: tag_sja1105: absorb entire sja1105_vlan_rcv() into dsa_8021q_rcv()	Vladimir Oltean
	tag_sja1105 has a wrapper over dsa_8021q_rcv(): sja1105_vlan_rcv(), which determines whether the packet came from a bridge with vlan_filtering=1 (the case resolved via dsa_find_designated_bridge_port_by_vid()), or if it contains a tag_8021q header. Looking at a new tagger implementation for vsc73xx, based also on tag_8021q, it is becoming clear that the logic is needed there as well. So instead of forcing each tagger to wrap around dsa_8021q_rcv(), let's merge the logic into the core. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Tested-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Link: https://patch.msgid.link/20240713211620.1125910-5-paweldembicki@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-15	net: dsa: tag_sja1105: absorb logic for not overwriting precise info into ↵	Vladimir Oltean
	dsa_8021q_rcv() In both sja1105_rcv() and sja1110_rcv(), we may have precise source port information coming from parallel hardware mechanisms, in addition to the tag_8021q header. Only sja1105_rcv() has extra logic to not overwrite that precise info with what's present in the VLAN tag. This is because sja1110_rcv() gets by, by having a reversed set of checks when assigning skb->dev. When the source port is imprecise (vbid >=1), source_port and switch_id will be set to zeroes by dsa_8021q_rcv(), which might be problematic. But by checking for vbid >= 1 first, sja1110_rcv() fends that off. We would like to make more code common between sja1105_rcv() and sja1110_rcv(), and for that, we need to make sure that sja1110_rcv() also goes through the precise source port preservation logic. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Tested-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/20240713211620.1125910-4-paweldembicki@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-15	net: dsa: vsc73xx: Add vlan filtering	Pawel Dembicki
	This patch implements VLAN filtering for the vsc73xx driver. After starting VLAN filtering, the switch is reconfigured from QinQ to a simple VLAN aware mode. This is required because VSC73XX chips do not support inner VLAN tag filtering. Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/20240713211620.1125910-3-paweldembicki@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-15	net: dsa: vsc73xx: add port_stp_state_set function	Pawel Dembicki
	This isn't a fully functional implementation of 802.1D, but port_stp_state_set is required for a future tag8021q operations. This implementation handles properly all states, but vsc73xx doesn't forward STP packets. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Link: https://patch.msgid.link/20240713211620.1125910-2-paweldembicki@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-15	net: ti: icssg-prueth: Split out common object into module	MD Danish Anwar
	icssg_prueth.c and icssg_prueth_sr1.c drivers use multiple common .c files. These common objects are getting added to multiple modules. As a result when both drivers are enabled in .config, below warning is seen. drivers/net/ethernet/ti/Makefile: icssg/icssg_common.o is added to multiple modules: icssg-prueth icssg-prueth-sr1 drivers/net/ethernet/ti/Makefile: icssg/icssg_classifier.o is added to multiple modules: icssg-prueth icssg-prueth-sr1 drivers/net/ethernet/ti/Makefile: icssg/icssg_config.o is added to multiple modules: icssg-prueth icssg-prueth-sr1 drivers/net/ethernet/ti/Makefile: icssg/icssg_mii_cfg.o is added to multiple modules: icssg-prueth icssg-prueth-sr1 drivers/net/ethernet/ti/Makefile: icssg/icssg_stats.o is added to multiple modules: icssg-prueth icssg-prueth-sr1 drivers/net/ethernet/ti/Makefile: icssg/icssg_ethtool.o is added to multiple modules: icssg-prueth icssg-prueth-sr1 Fix this by building a new module (icssg.o) for all the common objects. Both the driver can then depend on this common module. Some APIs being exported have emac_ as the prefix which may result into confusion with other existing APIs with emac_ prefix, to avoid confusion, rename the APIs being exported with emac_ to icssg_ prefix. This also fixes below error seen when both drivers are built. ERROR: modpost: "icssg_queue_pop" [drivers/net/ethernet/ti/icssg-prueth-sr1.ko] undefined! ERROR: modpost: "icssg_queue_push" [drivers/net/ethernet/ti/icssg-prueth-sr1.ko] undefined! Reported-and-tested-by: Thorsten Leemhuis <linux@leemhuis.info> Closes: https://lore.kernel.org/oe-kbuild-all/202405182038.ncf1mL7Z-lkp@intel.com/ Fixes: 487f7323f39a ("net: ti: icssg-prueth: Add helper functions to configure FDB") Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Reviewed-by: Sai Krishna <saikrishnag@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-14	virtio_net: Fix napi_skb_cache_put warning	Breno Leitao
	After the commit bdacf3e34945 ("net: Use nested-BH locking for napi_alloc_cache.") was merged, the following warning began to appear: WARNING: CPU: 5 PID: 1 at net/core/skbuff.c:1451 napi_skb_cache_put+0x82/0x4b0 __warn+0x12f/0x340 napi_skb_cache_put+0x82/0x4b0 napi_skb_cache_put+0x82/0x4b0 report_bug+0x165/0x370 handle_bug+0x3d/0x80 exc_invalid_op+0x1a/0x50 asm_exc_invalid_op+0x1a/0x20 __free_old_xmit+0x1c8/0x510 napi_skb_cache_put+0x82/0x4b0 __free_old_xmit+0x1c8/0x510 __free_old_xmit+0x1c8/0x510 __pfx___free_old_xmit+0x10/0x10 The issue arises because virtio is assuming it's running in NAPI context even when it's not, such as in the netpoll case. To resolve this, modify virtnet_poll_tx() to only set NAPI when budget is available. Same for virtnet_poll_cleantx(), which always assumed that it was in a NAPI context. Fixes: df133f3f9625 ("virtio_net: bulk free tx skbs") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Link: https://patch.msgid.link/20240712115325.54175-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	Merge branch 'net-phy-bcm5481x-add-support-for-broadr-reach-mode'	Jakub Kicinski
	Kamil Horák says: ==================== net: phy: bcm5481x: add support for BroadR-Reach mode PATCH 1 - Add the 10baseT1BRR_Full link mode PATCH 2 - Add the definitions of LRE registers, necessary to use BroadR-Reach modes on the BCM5481x PHY PATCH 3 - Add brr-mode flag to switch between IEEE802.3 and BroadR-Reach PATCH 4 - Implementation of the BroadR-Reach modes for the Broadcom PHYs ==================== Link: https://patch.msgid.link/20240712150709.3134474-1-kamilh@axis.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	net: phy: bcm-phy-lib: Implement BroadR-Reach link modes	Kamil Horák (2N)
	Implement single-pair BroadR-Reach modes on bcm5481x PHY by Broadcom. Create set of functions alternative to IEEE 802.3 to handle configuration of these modes on compatible Broadcom PHYs. There is only subset of capabilities supported because of limited collection of hardware available for the development. For BroadR-Reach capable PHYs, the LRE (Long Reach Ethernet) alternative register set is handled. Only bcm54811 PHY is verified, for bcm54810, there is some support possible but untested. There is no auto-negotiation of the link parameters (called LDS in the Broadcom terminology, Long-Distance Signaling) for bcm54811. It should be possible to enable LDS for bcm54810. Signed-off-by: Kamil Horák (2N) <kamilh@axis.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20240712150709.3134474-5-kamilh@axis.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	dt-bindings: ethernet-phy: add optional brr-mode flag	Kamil Horák (2N)
	There is a group of PHY chips supporting BroadR-Reach link modes in a manner allowing for more or less identical register usage as standard Clause 22 PHY. These chips support standard Ethernet link modes as well, however, the circuitry is mutually exclusive and cannot be auto-detected. The link modes in question are 100Base-T1 as defined in IEEE802.3bw, based on Broadcom's 1BR-100 link mode, and newly defined 10Base-T1BRR (1BR-10 in Broadcom documents). Add optional brr-mode flag to switch the PHY to BroadR-Reach mode. Signed-off-by: Kamil Horák (2N) <kamilh@axis.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20240712150709.3134474-4-kamilh@axis.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	net: phy: bcm54811: Add LRE registers definitions	Kamil Horák (2N)
	Add the definitions of LRE registers for Broadcom BCM5481x PHY Signed-off-by: Kamil Horák (2N) <kamilh@axis.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20240712150709.3134474-3-kamilh@axis.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	net: phy: bcm54811: New link mode for BroadR-Reach	Kamil Horák (2N)
	Introduce a new link mode necessary for 10 MBit single-pair connection in BroadR-Reach mode on bcm5481x PHY by Broadcom. This new link mode, 10baseT1BRR, is known as 1BR10 in the Broadcom terminology. Another link mode to be used is 1BR100 and it is already present as 100baseT1, because Broadcom's 1BR100 became 100baseT1 (IEEE 802.3bw). Signed-off-by: Kamil Horák (2N) <kamilh@axis.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20240712150709.3134474-2-kamilh@axis.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	Merge branch 'virtio-net-support-af_xdp-zero-copy'	Jakub Kicinski
	Xuan Zhuo says: ==================== virtio-net: support AF_XDP zero copy v5: http://lore.kernel.org/all/20240611114147.31320-1-xuanzhuo@linux.alibaba.com XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero copy feature of xsk (XDP socket) needs to be supported by the driver. The performance of zero copy is very good. mlx5 and intel ixgbe already support this feature, This patch set allows virtio-net to support xsk's zerocopy xmit feature. At present, we have completed some preparation: 1. vq-reset (virtio spec and kernel code) 2. virtio-core premapped dma 3. virtio-net xdp refactor So it is time for Virtio-Net to complete the support for the XDP Socket Zerocopy. Virtio-net can not increase the queue num at will, so xsk shares the queue with kernel. On the other hand, Virtio-Net does not support generate interrupt from driver manually, so when we wakeup tx xmit, we used some tips. If the CPU run by TX NAPI last time is other CPUs, use IPI to wake up NAPI on the remote CPU. If it is also the local CPU, then we wake up napi directly. This patch set includes some refactor to the virtio-net to let that to support AF_XDP. Because there are too many commits, the work of virtio net supporting af-xdp is split to rx part and tx part. This patch set is for rx part. So the flag NETDEV_XDP_ACT_XSK_ZEROCOPY is not added, if someone want to test for af-xdp rx, the flag needs to be adding locally. ENV: Qemu with vhost-user(polling mode). Host CPU: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz testpmd> show port stats all ######################## NIC statistics for port 0 ######################## RX-packets: 19531092064 RX-missed: 0 RX-bytes: 1093741155584 RX-errors: 0 RX-nombuf: 0 TX-packets: 5959955552 TX-errors: 0 TX-bytes: 371030645664 Throughput (since last show) Rx-pps: 8861574 Rx-bps: 3969985208 Tx-pps: 8861493 Tx-bps: 3969962736 ############################################################################ testpmd> show port stats all ######################## NIC statistics for port 0 ######################## RX-packets: 68152727 RX-missed: 0 RX-bytes: 3816552712 RX-errors: 0 RX-nombuf: 0 TX-packets: 68114967 TX-errors: 33216 TX-bytes: 3814438152 Throughput (since last show) Rx-pps: 6333196 Rx-bps: 2837272088 Tx-pps: 6333227 Tx-bps: 2837285936 ############################################################################ But AF_XDP consumes more CPU for tx and rx napi(100% and 86%). ==================== Link: https://patch.msgid.link/20240708112537.96291-1-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	virtio_net: xsk: rx: support recv merge mode	Xuan Zhuo
	Support AF-XDP for merge mode. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20240708112537.96291-11-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	virtio_net: xsk: rx: support recv small mode	Xuan Zhuo
	In the process: 1. We may need to copy data to create skb for XDP_PASS. 2. We may need to call xsk_buff_free() to release the buffer. 3. The handle for xdp_buff is difference from the buffer. If we pushed this logic into existing receive handle(merge and small), we would have to maintain code scattered inside merge and small (and big). So I think it is a good choice for us to put the xsk code into an independent function. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20240708112537.96291-10-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	virtio_net: xsk: rx: support fill with xsk buffer	Xuan Zhuo
	Implement the logic of filling rq with XSK buffers. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20240708112537.96291-9-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	virtio_net: xsk: support wakeup	Xuan Zhuo
	xsk wakeup is used to trigger the logic for xsk xmit by xsk framework or user. Virtio-net does not support to actively generate an interruption, so it tries to trigger tx NAPI on the local cpu. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20240708112537.96291-8-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	virtio_net: xsk: bind/unbind xsk for rx	Xuan Zhuo
	This patch implement the logic of bind/unbind xsk pool to rq. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20240708112537.96291-7-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	virtio_net: separate receive_mergeable	Xuan Zhuo
	This commit separates the function receive_mergeable(), put the logic of appending frag to the skb as an independent function. The subsequent commit will reuse it. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20240708112537.96291-6-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	virtio_net: separate receive_buf	Xuan Zhuo
	This commit separates the function receive_buf(), then we wrap the logic of handling the skb to an independent function virtnet_receive_done(). The subsequent commit will reuse it. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20240708112537.96291-5-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	virtio_net: separate virtnet_tx_resize()	Xuan Zhuo
	This patch separates two sub-functions from virtnet_tx_resize(): * virtnet_tx_pause * virtnet_tx_resume Then the subsequent virtnet_tx_reset() can share these two functions. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20240708112537.96291-4-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	virtio_net: separate virtnet_rx_resize()	Xuan Zhuo
	This patch separates two sub-functions from virtnet_rx_resize(): * virtnet_rx_pause * virtnet_rx_resume Then the subsequent reset rx for xsk can share these two functions. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20240708112537.96291-3-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	virtio_net: replace VIRTIO_XDP_HEADROOM by XDP_PACKET_HEADROOM	Xuan Zhuo
	virtio net has VIRTIO_XDP_HEADROOM that is equal to XDP_PACKET_HEADROOM to calculate the headroom for xdp. But here we should use the macro XDP_PACKET_HEADROOM from bpf.h to calculate the headroom for xdp. So here we remove the VIRTIO_XDP_HEADROOM, and use the XDP_PACKET_HEADROOM to replace it. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20240708112537.96291-2-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	Merge branch ↵	Jakub Kicinski
	'eliminate-config_nr_cpus-dependency-in-dpaa-eth-and-enable-compile_test-in-fsl_qbman' Vladimir Oltean says: ==================== Eliminate CONFIG_NR_CPUS dependency in dpaa-eth and enable COMPILE_TEST in fsl_qbman Breno's previous attempt at enabling COMPILE_TEST for the fsl_qbman driver (now included here as patch 5/5) triggered compilation warnings for large CONFIG_NR_CPUS values: https://lore.kernel.org/all/202406261920.l5pzM1rj-lkp@intel.com/ Patch 1/5 switches two NR_CPUS arrays in the dpaa-eth driver to dynamic allocation to avoid that warning. There is more NR_CPUS usage in the fsl-qbman driver, but that looks relatively harmless and I couldn't find a good reason to change it. I noticed, while testing, that the driver doesn't actually work properly with high CONFIG_NR_CPUS values, and patch 2/5 addresses that. During code analysis, I have identified two places which treat conditions that can never happen. Patches 3/5 and 4/5 simplify the probing code - dpaa_fq_setup() - just a little bit. Finally we have at 5/5 the patch that triggered all of this. There is an okay from Herbert to take it via netdev, despite it being on soc/qbman: https://lore.kernel.org/all/Zns%2FeVVBc7pdv0yM@gondor.apana.org.au/ Link to v1: https://lore.kernel.org/netdev/20240710230025.46487-1-vladimir.oltean@nxp.com/ ==================== Link: https://patch.msgid.link/20240713225336.1746343-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	soc: fsl: qbman: FSL_DPAA depends on COMPILE_TEST	Breno Leitao
	As most of the drivers that depend on ARCH_LAYERSCAPE, make FSL_DPAA depend on COMPILE_TEST for compilation and testing. # grep -r depends.\ARCH_LAYERSCAPE.\COMPILE_TEST \| wc -l 29 Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Link: https://patch.msgid.link/20240713225336.1746343-6-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	net: dpaa: no need to make sure all CPUs receive a corresponding Tx queue	Vladimir Oltean
	dpaa_fq_setup() iterates through the &priv->dpaa_fq_list elements allocated by dpaa_alloc_all_fqs(). This includes a call to: if (!dpaa_fq_alloc(dev, 0, dpaa_max_num_txqs(), list, FQ_TYPE_TX)) goto fq_alloc_failed; which gives us dpaa_max_num_txqs() elements of FQ_TYPE_TX type. The code block which we are deleting runs after an earlier iteration through &priv->dpaa_fq_list. So at the end of this iteration (for which there is no early break), egress_cnt will be unconditionally equal to dpaa_max_num_txqs(). In other words, dpaa_alloc_all_fqs() has already allocated TX queues for all possible CPUs and the maximal number of traffic classes, and we've already iterated once through them all. The while() condition is dead code, remove it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Link: https://patch.msgid.link/20240713225336.1746343-5-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	net: dpaa: stop ignoring TX queues past the number of CPUs	Vladimir Oltean
	dpaa_fq_setup() iterates through the queues allocated by dpaa_alloc_all_fqs() and saved in &priv->dpaa_fq_list. The allocation for FQ_TYPE_TX looks as follows: if (!dpaa_fq_alloc(dev, 0, dpaa_max_num_txqs(), list, FQ_TYPE_TX)) goto fq_alloc_failed; Thus, iterating again through FQ_TYPE_TX queues in dpaa_fq_setup() and counting them will never yield an egress_cnt larger than the allocated size, dpaa_max_num_txqs(). The comparison serves no purpose since it is always true; remove it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Link: https://patch.msgid.link/20240713225336.1746343-4-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	net: dpaa: eliminate NR_CPUS dependency in egress_fqs[] and conf_fqs[]	Vladimir Oltean
	The driver uses the DPAA_TC_TXQ_NUM and DPAA_ETH_TXQ_NUM macros for TX queue handling, and they depend on CONFIG_NR_CPUS. In generic .config files, these can go to very large (8096 CPUs) values for the systems that DPAA1 is integrated in (1-24 CPUs). We allocate a lot of resources that will never be used. Those are: - system memory - QMan FQIDs as managed by qman_alloc_fqid_range(). This is especially painful since currently, when booting with CONFIG_NR_CPUS=8096, a LS1046A-RDB system will only manage to probe 3 of its 6 interfaces. The rest will run out of FQD ("/reserved-memory/qman-fqd" in the device tree) and fail at the qman_create_fq() stage of the probing process. - netdev queues as alloc_etherdev_mq() argument. The high queue indices are simply hidden from the network stack after the call to netif_set_real_num_tx_queues(). With just a tiny bit more effort, we can replace the NR_CPUS compile-time constant with the num_possible_cpus() run-time constant, and dynamically allocate the egress_fqs[] and conf_fqs[] arrays. Even on a system with a high CONFIG_NR_CPUS, num_possible_cpus() will remain equal to the number of available cores on the SoC. The replacement is as follows: - DPAA_TC_TXQ_NUM -> dpaa_num_txqs_per_tc() - DPAA_ETH_TXQ_NUM -> dpaa_max_num_txqs() Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Link: https://patch.msgid.link/20240713225336.1746343-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	net: dpaa: avoid on-stack arrays of NR_CPUS elements	Vladimir Oltean
	The dpaa-eth driver is written for PowerPC and Arm SoCs which have 1-24 CPUs. It depends on CONFIG_NR_CPUS having a reasonably small value in Kconfig. Otherwise, there are 2 functions which allocate on-stack arrays of NR_CPUS elements, and these can quickly explode in size, leading to warnings such as: drivers/net/ethernet/freescale/dpaa/dpaa_eth.c:3280:12: warning: stack frame size (16664) exceeds limit (2048) in 'dpaa_eth_probe' [-Wframe-larger-than] The problem is twofold: - Reducing the array size to the boot-time num_possible_cpus() (rather than the compile-time NR_CPUS) creates a variable-length array, which should be avoided in the Linux kernel. - Using NR_CPUS as an array size makes the driver blow up in stack consumption with generic, as opposed to hand-crafted, .config files. A simple solution is to use dynamic allocation for num_possible_cpus() elements (aka a small number determined at runtime). Link: https://lore.kernel.org/all/202406261920.l5pzM1rj-lkp@intel.com/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Breno Leitao <leitao@debian.org> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Link: https://patch.msgid.link/20240713225336.1746343-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	Merge tag 'ipsec-next-2024-07-13' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2024-07-13 1) Support sending NAT keepalives in ESP in UDP states. Userspace IKE daemon had to do this before, but the kernel can better keep track of it. From Eyal Birger. 2) Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated ESP data paths. Currently, IPsec crypto offload is enabled for GRO code path only. This patchset support UDP encapsulation for the non GRO path. From Mike Yu. * tag 'ipsec-next-2024-07-13' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next: xfrm: Support crypto offload for outbound IPv4 UDP-encapsulated ESP packet xfrm: Support crypto offload for inbound IPv4 UDP-encapsulated ESP packet xfrm: Allow UDP encapsulation in crypto offload control path xfrm: Support crypto offload for inbound IPv6 ESP packets not in GRO path xfrm: support sending NAT keepalives in ESP in UDP states ==================== Link: https://patch.msgid.link/20240713102416.3272997-1-steffen.klassert@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	Merge branch 'introduce-en7581-ethernet-support'	Jakub Kicinski
	Lorenzo Bianconi says: ==================== Introduce EN7581 ethernet support Add airoha_eth driver in order to introduce ethernet support for Airoha EN7581 SoC available on EN7581 development board. EN7581 mac controller is mainly composed by Frame Engine (FE) and QoS-DMA (QDMA) modules. FE is used for traffic offloading (just basic functionalities are supported now) while QDMA is used for DMA operation and QOS functionalities between mac layer and the dsa switch (hw QoS is not available yet and it will be added in the future). Currently only hw lan features are available, hw wan will be added with subsequent patches. ==================== Link: https://patch.msgid.link/cover.1720818878.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	net: airoha: Introduce ethernet support for EN7581 SoC	Lorenzo Bianconi
	Add airoha_eth driver in order to introduce ethernet support for Airoha EN7581 SoC available on EN7581 development board (en7581-evb). EN7581 mac controller is mainly composed by the Frame Engine (PSE+PPE) and QoS-DMA (QDMA) modules. FE is used for traffic offloading (just basic functionalities are currently supported) while QDMA is used for DMA operations and QOS functionalities between the mac layer and the external modules conncted to the FE GDM ports (e.g MT7530 DSA switch or external phys). A general overview of airoha_eth architecture is reported below: ┌───────┐ ┌───────┐ │ QDMA2 │ │ QDMA1 │ └───┬───┘ └───┬───┘ │ │ ┌───────▼─────────────────────────────────────────────▼────────┐ │ │ │ P5 P0 │ │ │ │ │ │ │ ┌──────┐ │ P3 ├────► GDM3 │ │ │ └──────┘ │ │ │ │ ┌─────┐ │ │ │ PPE ◄────┤ P4 PSE │ └─────┘ │ │ │ │ │ │ │ │ ┌──────┐ │ P9 ├────► GDM4 │ │ │ └──────┘ │ │ │ │ │ │ │ P2 P1 │ └─────────┬───────────────────────────────────────────┬────────┘ │ │ ┌───▼──┐ ┌──▼───┐ │ GDM2 │ │ GDM1 │ └──────┘ └──┬───┘ │ ┌────▼─────┐ │ MT7530 │ └──────────┘ Currently only hw LAN features (QDMA1+GDM1) are available while hw WAN (QDMA2+GDM{2,3,4}) ones will be added with subsequent patches introducing traffic offloading support. Tested-by: Benjamin Larsson <benjamin.larsson@genexis.eu> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/274945d2391c195098ab180a46d0617b18b9e42c.1720818878.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	dt-bindings: net: airoha: Add EN7581 ethernet controller	Lorenzo Bianconi
	Introduce device-tree binding documentation for Airoha EN7581 ethernet mac controller. Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/7dfecf8aa4e6519562a94455b95c49e1b3c858a0.1720818878.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14	Merge branch '100GbE' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: Switch API optimizations Marcin Szycik says: Optimize the process of creating a recipe in the switch block by removing duplicate switch ID words and changing how result indexes are fitted into recipes. In many cases this can decrease the number of recipes required to add a certain set of rules, potentially allowing a more varied set of rules to be created. Total rule count will also increase, since less words will be left unused/wasted. There are only 64 rules available in total, so every one counts. After this modification, many fields and some structs became unused or were simplified, resulting in overall simpler implementation. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: Add tracepoint for adding and removing switch rules ice: Remove unused members from switch API ice: Optimize switch recipe creation ice: remove unused recipe bookkeeping data ice: Simplify bitmap setting in adding recipe ice: Remove reading all recipes before adding a new one ice: Remove unused struct ice_prot_lkup_ext members ==================== Link: https://patch.msgid.link/20240711181312.2019606-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	Merge branch '40GbE' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-07-11 (net/intel) This series contains updates to most Intel network drivers. Tony removes MODULE_AUTHOR from drivers containing the entry. Simon Horman corrects a kdoc entry for i40e. Pawel adds implementation for devlink param "local_forwarding" on ice. Michal removes unneeded call, and code, for eswitch rebuild for ice. Sasha removed a no longer used field from igc. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: igc: Remove the internal 'eee_advert' field ice: remove eswitch rebuild ice: Add support for devlink local_forwarding param i40e: correct i40e_addr_to_hkey() name in kdoc net: intel: Remove MODULE_AUTHORs ==================== Link: https://patch.msgid.link/20240711201932.2019925-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	sfc: falcon: Make I2C terminology more inclusive	Easwar Hariharan
	I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave" with more appropriate terms. Inspired by Wolfram's series to fix drivers/i2c/, fix the terminology for users of I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists in the specification. Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com> Link: https://patch.msgid.link/20240711052734.1273652-5-eahariha@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	net: phy: dp83td510: add cable testing support	Oleksij Rempel
	This patch implements the TDR test procedure as described in "Application Note DP83TD510E Cable Diagnostics Toolkit revC", section 3.2. The procedure was tested with "draka 08 signalkabel 2x0.8mm". The reported cable length was 5 meters more for each 20 meters of actual cable length. For instance, a 20-meter cable showed as 25 meters, and a 40-meter cable showed as 50 meters. Since other parts of the diagnostics provided by this PHY (e.g., Active Link Cable Diagnostics) require accurate cable characterization to provide proper results, this tuning can be implemented in a separate patch/interface. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> changes v2: - add comments - change post silence time to 1000ms Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20240712152848.2479912-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	net: dpaa: Fix compilation Warning	Breno Leitao
	Remove variables that are defined and incremented but never read. This issue appeared in network tests[1] as: drivers/net/ethernet/freescale/dpaa/dpaa_eth_sysfs.c:38:6: warning: variable 'i' set but not used [-Wunused-but-set-variable] 38 \| int i = 0; \| ^ Link: https://netdev.bots.linux.dev/static/nipa/870263/13729811/build_clang/stderr [1] Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20240712134817.913756-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	eth: mlx5: expose NETIF_F_NTUPLE when ARFS is compiled out	Jakub Kicinski
	ARFS depends on NTUPLE filters, but the inverse is not true. Drivers which don't support ARFS commonly still support NTUPLE filtering. mlx5 has a Kconfig option to disable ARFS (MLX5_EN_ARFS) and does not advertise NTUPLE filters as a feature at all when ARFS is compiled out. That's not correct, ntuple filters indeed still work just fine (as long as MLX5_EN_RXNFC is enabled). This is needed to make the RSS test not skip all RSS context related testing. Acked-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://patch.msgid.link/20240711223722.297676-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	selftests: mptcp: lib: fix shellcheck errors	Matthieu Baerts (NGI0)
	It looks like we missed these two errors recently: - SC2068: Double quote array expansions to avoid re-splitting elements. - SC2145: Argument mixes string and array. Use * or separate argument. Two simple fixes, it is not supposed to change the behaviour as the variable names should not have any spaces in their names. Still, better to fix them to easily spot new issues. Fixes: f265d3119a29 ("selftests: mptcp: lib: use setup/cleanup_ns helpers") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240712-upstream-net-next-20240712-selftests-mptcp-fix-shellcheck-v1-1-1cb7180db40a@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	Merge branch 'mlx5-misc-2023-07-08-sf-max-eq'	Jakub Kicinski
	Saeed Mahameed says: ==================== mlx5 misc 2023-07-08 (sf max eq) Link: https://patchwork.kernel.org/project/netdevbpf/patch/20240708080025.1593555-2-tariqt@nvidia.com/ ==================== Link: https://patch.msgid.link/20240712003310.355106-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	net/mlx5: Use set number of max EQs	Daniel Jurgens
	If a maximum number of EQs has been set for an SF, use that amount. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://patch.msgid.link/20240712003310.355106-5-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	net/mlx5: Set default max eqs for SFs	Daniel Jurgens
	If the user hasn't configured max_io_eqs set a low default. The SF driver shouldn't try to create more than this, but FW will enforce this limit. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://patch.msgid.link/20240712003310.355106-4-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	net/mlx5: Set sf_eq_usage for SF max EQs	Daniel Jurgens
	When setting max_io_eqs for an SF function also set the sf_eq_usage_cap. This is to indicate to the SF driver from the PF that the user has set the max io eqs via devlink. So the SF driver can later query the proper max eq value from the new cap. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://patch.msgid.link/20240712003310.355106-3-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	net/mlx5: IFC updates for SF max IO EQs	Daniel Jurgens
	Expose a new cap sf_eq_usage. The vhca_resource_manager can write this cap, indicating the SF driver should use max_num_eqs_24b to determine how many EQs to use. Will be used in the next patch, to indicate to the SF driver from the PF that the user has set the max io eqs via devlink. So the SF driver can later query the proper max eq value from the new cap. devlink port function set pci/0000:08:00.0/32768 max_io_eqs 32 Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://patch.msgid.link/20240712003310.355106-2-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	net: mvpp2: Improve data types and use min()	Thorsten Blum
	Change the data type of the variable freq in mvpp2_rx_time_coal_set() and mvpp2_tx_time_coal_set() to u32 because port->priv->tclk also has the data type u32. Change the data type of the function parameter clk_hz in mvpp2_usec_to_cycles() and mvpp2_cycles_to_usec() to u32 accordingly and remove the following Coccinelle/coccicheck warning reported by do_div.cocci: WARNING: do_div() does a 64-by-32 division, please consider using div64_ul instead Use min() to simplify the code and improve its readability. Compile-tested only. Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240711154741.174745-1-thorsten.blum@toblux.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	net: ethtool: Monotonically increase the message sequence number	Danielle Ratson
	Currently, during the module firmware flashing process, unicast notifications are sent from the kernel using the same sequence number, making it impossible for user space to track missed notifications. Monotonically increase the message sequence number, so the order of notifications could be tracked effectively. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20240711080934.2071869-1-danieller@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	Merge branch 'tcp-make-simultaneous-connect-rfc-compliant'	Jakub Kicinski
	Kuniyuki Iwashima says: ==================== tcp: Make simultaneous connect() RFC-compliant. Patch 1 fixes an issue that BPF TCP option parser is triggered for ACK instead of SYN+ACK in the case of simultaneous connect(). Patch 2 removes an wrong assumption in tcp_ao/self-connnect tests. v2: https://lore.kernel.org/netdev/20240708180852.92919-1-kuniyu@amazon.com/ v1: https://lore.kernel.org/netdev/20240704035703.95065-1-kuniyu@amazon.com/ ==================== Link: https://patch.msgid.link/20240710171246.87533-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	selftests: tcp: Remove broken SNMP assumptions for TCP AO self-connect tests.	Kuniyuki Iwashima
	tcp_ao/self-connect.c checked the following SNMP stats before/after connect() to confirm that the test exercises the simultaneous connect() path. * TCPChallengeACK * TCPSYNChallenge But the stats should not be counted for self-connect in the first place, and the assumption is no longer true. Let's remove the check. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Dmitry Safonov <dima@arista.com> Link: https://patch.msgid.link/20240710171246.87533-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13	tcp: Don't drop SYN+ACK for simultaneous connect().	Kuniyuki Iwashima
	RFC 9293 states that in the case of simultaneous connect(), the connection gets established when SYN+ACK is received. [0] TCP Peer A TCP Peer B 1. CLOSED CLOSED 2. SYN-SENT --> <SEQ=100><CTL=SYN> ... 3. SYN-RECEIVED <-- <SEQ=300><CTL=SYN> <-- SYN-SENT 4. ... <SEQ=100><CTL=SYN> --> SYN-RECEIVED 5. SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ... 6. ESTABLISHED <-- <SEQ=300><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED 7. ... <SEQ=100><ACK=301><CTL=SYN,ACK> --> ESTABLISHED However, since commit 0c24604b68fc ("tcp: implement RFC 5961 4.2"), such a SYN+ACK is dropped in tcp_validate_incoming() and responded with Challenge ACK. For example, the write() syscall in the following packetdrill script fails with -EAGAIN, and wrong SNMP stats get incremented. 0 socket(..., SOCK_STREAM\|SOCK_NONBLOCK, IPPROTO_TCP) = 3 +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress) +0 > S 0:0(0) <mss 1460,sackOK,TS val 1000 ecr 0,nop,wscale 8> +0 < S 0:0(0) win 1000 <mss 1000> +0 > S. 0:0(0) ack 1 <mss 1460,sackOK,TS val 3308134035 ecr 0,nop,wscale 8> +0 < S. 0:0(0) ack 1 win 1000 +0 write(3, ..., 100) = 100 +0 > P. 1:101(100) ack 1 -- # packetdrill cross-synack.pkt cross-synack.pkt:13: runtime error in write call: Expected result 100 but got -1 with errno 11 (Resource temporarily unavailable) # nstat ... TcpExtTCPChallengeACK 1 0.0 TcpExtTCPSYNChallenge 1 0.0 The problem is that bpf_skops_established() is triggered by the Challenge ACK instead of SYN+ACK. This causes the bpf prog to miss the chance to check if the peer supports a TCP option that is expected to be exchanged in SYN and SYN+ACK. Let's accept a bare SYN+ACK for active-open TCP_SYN_RECV sockets to avoid such a situation. Note that tcp_ack_snd_check() in tcp_rcv_state_process() is skipped not to send an unnecessary ACK, but this could be a bit risky for net.git, so this targets for net-next. Link: https://www.rfc-editor.org/rfc/rfc9293.html#section-3.5-7 [0] Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20240710171246.87533-2-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>