linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-12-07	tools: ynl-gen-c: don't require -o argument	Johannes Berg
	Without -o the tool currently crashes, but it's not marked as required. The only thing we can't do without it is to generate the correct #include for user source files, but we can put a placeholder instead. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://patch.msgid.link/20241206113100.89d35bf124d6.I9228fb704e6d5c9d8e046ef15025a47a48439c1e@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	tools: ynl-gen-c: annotate valid choices for --mode	Johannes Berg
	This makes argparse validate the input and helps users understand which modes are possible. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://patch.msgid.link/20241206113100.e2ab5cf6937c.Ie149a0ca5df713860964b44fe9d9ae547f2e1553@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	Merge branch 'net-convert-some-udp-tunnel-drivers-to-netdev_pcpu_stat_dstats'	Jakub Kicinski
	Guillaume Nault says: ==================== net: Convert some UDP tunnel drivers to NETDEV_PCPU_STAT_DSTATS. VXLAN, Geneve and Bareudp use various device counters for managing RX and TX statistics: * VXLAN uses the device core_stats for RX and TX drops, tstats for regular RX/TX counters and DEV_STATS_INC() for various types of RX/TX errors. * Geneve uses tstats for regular RX/TX counters and DEV_STATS_INC() for everything else, include RX/TX drops. * Bareudp, was recently converted to follow VXLAN behaviour, that is, device core_stats for RX and TX drops, tstats for regular RX/TX counters and DEV_STATS_INC() for other counter types. Let's consolidate statistics management around the dstats counters instead. This avoids using core_stats in VXLAN and Bareudp, as core_stats is supposed to be used by core networking code only (and not in drivers). This also allows Geneve to avoid using atomic increments when updating RX and TX drop counters, as dstats is per-cpu. Finally, this also simplifies the code as all three modules now handle stats in the same way and with only two different sets of counters (the per-cpu dstats and the atomic DEV_STATS_INC()). Patch 1 creates dstats helper functions that can be used outside of VRF (until then, dstats was VRF-specific). Then patches 2 to 4, convert VXLAN, Geneve and Bareudp, one by one. ==================== Link: https://patch.msgid.link/cover.1733313925.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	bareudp: Handle stats using NETDEV_PCPU_STAT_DSTATS.	Guillaume Nault
	Bareudp uses the TSTATS infrastructure (dev_sw_netstats_()) for RX packet counters. It was also recently converted to use the device core stats (dev_core_stats_()) for RX and TX drops (see commit 788d5d655bc9 ("bareudp: Use pcpu stats to update rx_dropped counter.")). Since core stats are to be avoided in drivers, and for consistency with VXLAN and Geneve, let's convert packet stats handling to DSTATS, which can handle RX/TX stats and packet drops. Statistics that don't fit DSTATS are still updated atomically with DEV_STATS_INC(). Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/0f4f8448db3ff449ac6e939872b28cf3f8982da7.1733313925.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	geneve: Handle stats using NETDEV_PCPU_STAT_DSTATS.	Guillaume Nault
	Geneve uses the TSTATS infrastructure (dev_sw_netstats_*()) for RX packet counters. All other counters are handled using atomic increments with DEV_STATS_INC(). Let's convert packet stats handling to DSTATS, which has a per-cpu counter for packet drops too, to avoid the cost of atomic increments in these cases. Statistics that don't fit DSTATS are still updated atomically with DEV_STATS_INC(). Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/7af5c09f3c26f0f231fbe383822ca5d1ce0278fa.1733313925.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	vxlan: Handle stats using NETDEV_PCPU_STAT_DSTATS.	Guillaume Nault
	VXLAN uses the TSTATS infrastructure (dev_sw_netstats_()) for RX and TX packet counters. It also uses the device core stats (dev_core_stats_()) for RX and TX drops. Let's consolidate that using the DSTATS infrastructure, which can handle both packet counters and packet drops. Statistics that don't fit DSTATS are still updated atomically with DEV_STATS_INC(). While there, convert the "len" variable of vxlan_encap_bypass() to unsigned int, to respect the types of skb->len and dev_dstats_[rt]x_add(). Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/145558b184b3cda77911ca5682b6eb83c3ffed8e.1733313925.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	vrf: Make pcpu_dstats update functions available to other modules.	Guillaume Nault
	Currently vrf is the only module that uses NETDEV_PCPU_STAT_DSTATS. In order to make this kind of statistics available to other modules, we need to define the update functions in netdevice.h. Therefore, let's define dev_dstats_*() functions for RX and TX packet updates (packets, bytes and drops). Use these new functions in vrf.c instead of vrf_rx_stats() and the other manual counter updates. While there, update the type of the "len" variables to "unsigned int", so that there're aligned with both skb->len and the new dstats update functions. Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/d7a552ee382c79f4854e7fcc224cf176cd21150d.1733313925.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	Merge branch 'lan78xx-preparations-for-phylink'	Jakub Kicinski
	Oleksij Rempel says: ==================== lan78xx: Preparations for PHYlink This patch set is part of the preparatory work for migrating the lan78xx USB Ethernet driver to the PHYlink framework. During extensive testing, I observed that resetting the USB adapter can lead to various read/write errors. While the errors themselves are acceptable, they generate excessive log messages, resulting in significant log spam. This set improves error handling to reduce logging noise by addressing errors directly and returning early when necessary. Key highlights of this series include: - Enhanced error handling to reduce log spam while preserving the original error values, avoiding unnecessary overwrites. - Improved error reporting using the `%pe` specifier for better clarity in log messages. - Removal of redundant and problematic PHY fixups for LAN8835 and KSZ9031, with detailed explanations in the respective patches. - Cleanup of code structure, including unified `goto` labels for better readability and maintainability, even in simple editors. ==================== Link: https://patch.msgid.link/20241204084142.1152696-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Improve error handling in dataport and multicast writes	Oleksij Rempel
	Update `lan78xx_dataport_write` and `lan78xx_deferred_multicast_write` to: - Handle errors during register read/write operations. - Exit immediately on errors and log them using `%pe` for clarity. - Avoid silent failures by propagating error codes properly. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-11-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Add error handling to lan78xx_irq_bus_sync_unlock	Oleksij Rempel
	Update `lan78xx_irq_bus_sync_unlock` to handle errors in register read/write operations. If an error occurs, log it and exit the function appropriately. This ensures proper handling of failures during IRQ synchronization. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-10-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Add error handling to set_rx_max_frame_length and set_mtu	Oleksij Rempel
	Improve error handling in `lan78xx_set_rx_max_frame_length` by: - Checking return values from register read/write operations and propagating errors. - Exiting immediately on failure to ensure proper error reporting. In `lan78xx_change_mtu`, log errors when changing MTU fails, using `%pe` for clear error representation. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-9-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Add error handling to lan78xx_init_ltm	Oleksij Rempel
	Convert `lan78xx_init_ltm` to return error codes and handle errors properly. Previously, errors during the LTM initialization process were not propagated, potentially leading to undetected issues. This patch ensures: - Errors in `lan78xx_read_reg` and `lan78xx_write_reg` are checked and handled. - Errors are logged with detailed messages using `%pe` for clarity. - The function exits immediately on error, returning the error code. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-8-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Improve error handling in EEPROM and OTP operations	Oleksij Rempel
	Refine error handling in EEPROM and OTP read/write functions by: - Return error values immediately upon detection. - Avoid overwriting correct error codes with `-EIO`. - Preserve initial error codes as they were appropriate for specific failures. - Use `-ETIMEDOUT` for timeout conditions instead of `-EIO`. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-7-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Fix error handling in MII read/write functions	Oleksij Rempel
	Ensure proper error handling in `lan78xx_mdiobus_read` and `lan78xx_mdiobus_write` by checking return values of register read/write operations and returning errors to the caller. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-6-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Improve error reporting with %pe specifier	Oleksij Rempel
	Replace integer error codes with the `%pe` format specifier in register read and write error messages. This change provides human-readable error strings, making logs more informative and debugging easier. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-5-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: move functions to avoid forward definitions	Oleksij Rempel
	Move following functions to avoid forward declarations in the code: - lan78xx_start_hw() - lan78xx_stop_hw() - lan78xx_flush_fifo() - lan78xx_start_tx_path() - lan78xx_stop_tx_path() - lan78xx_flush_tx_fifo() - lan78xx_start_rx_path() - lan78xx_stop_rx_path() - lan78xx_flush_rx_fifo() These functions will be used in an upcoming PHYlink migration patch. No modifications to the functionality of the code are made. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-4-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Remove KSZ9031 PHY fixup	Oleksij Rempel
	Remove the KSZ9031RNX PHY fixup from the lan78xx driver. The fixup applied specific RGMII pad skew configurations globally, but these settings violate the RGMII specification and cause more harm than benefit. Key issues with the fixup: 1. Non-Compliant Timing: The fixup's delay settings fall outside the RGMII specification requirements of 1.5 ns to 2.0 ns: - RX Path: Total delay of 2.16 ns (PHY internal delay of 1.2 ns + 0.96 ns skew). - TX Path: Total delay of 0.96 ns, significantly below the RGMII minimum of 1.5 ns. 2. Redundant or Incorrect Configurations: - The RGMII skew registers written by the fixup do not meaningfully alter the PHY's default behavior and fail to account for its internal delays. - The TX_DATA pad skew was not configured, relying on power-on defaults that are insufficient for RGMII compliance. 3. Micrel Driver Support: By setting `PHY_INTERFACE_MODE_RGMII_ID`, the Micrel driver can calculate and assign appropriate skew values for the KSZ9031 PHY. This ensures better timing configurations without relying on external fixups. 4. System Interference: The fixup applied globally, reconfiguring all KSZ9031 PHYs in the system, even those unrelated to the LAN78xx adapter. This could lead to unintended and harmful behavior on unrelated interfaces. While the fixup is removed, a better mechanism is still needed to dynamically determine the optimal combination of PHY and MAC delays to fully meet RGMII requirements without relying on Device Tree or global fixups. This would allow for robust operation across different hardware configurations. The Micrel driver is capable of using the interface mode value to calculate and apply better skew values, providing a configuration much closer to the RGMII specification than the fixup. Removing the fixup ensures better default behavior and prevents harm to other system interfaces. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-3-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Remove LAN8835 PHY fixup	Oleksij Rempel
	Remove the PHY fixup for the LAN8835 PHY in the lan78xx driver due to the following reasons: - There is no publicly available information about the LAN8835 PHY. However, it appears to be the integrated PHY used in the LAN7800 and LAN7850 USB Ethernet controllers. These PHYs use the GMII interface, not RGMII as configured by the fixup. - The correct driver for handling the LAN8835 PHY functionality is the Microchip PHY driver (`drivers/net/phy/microchip.c`), which properly supports these integrated PHYs. - The PHY ID `0x0007C130` is actually used by the LAN8742A PHY, which only supports RMII. This interface is incompatible with the LAN78xx MAC, as the LAN7801 (the only LAN78xx version without an integrated PHY) supports only RGMII. - The mask applied for this fixup is overly broad, inadvertently covering both Microchip LAN88xx PHYs and unrelated SMSC LAN8742A PHYs, leading to potential conflicts with other devices. - Testing has shown that removing this fixup for LAN7800 and LAN7850 does not result in any noticeable difference in functionality, as the Microchip PHY driver (`drivers/net/phy/microchip.c`) handles all necessary configurations for these integrated PHYs. - Registering this fixup globally (not limited to USB devices) risks conflicts by unintentionally modifying other interfaces whenever a LAN7801 adapter is connected to the system. Note that both LAN7800 and LAN7850 USB Ethernet controllers use an integrated PHY with the ID `0x0007C132`. Additionally, the LAN7515, a specialized part for Raspberry Pi, includes an integrated LAN7800 USB Ethernet controller and USB hub in a multifunctional chip design, and it also uses the same PHY ID (`0x0007C132`). Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-2-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	Merge branch 'net-phylib-eee-cleanups'	Jakub Kicinski
	Russell King says: ==================== net: phylib EEE cleanups Clean up phylib's EEE support. Patches previously posted as RFC as part of the phylink EEE series. Patch 1 changes the Marvell driver to use the state we store in struct phy_device, rather than manually calling phydev->eee_cfg.eee_enabled. Patch 2 avoids genphy_c45_ethtool_get_eee() setting ->eee_enabled, as we copy that from phydev->eee_cfg.eee_enabled later, and after patch 3 mo one uses this after calling genphy_c45_ethtool_get_eee(). In fact, the only caller of this function now is phy_ethtool_get_eee(). As all callers to genphy_c45_eee_is_active() now pass NULL as its is_enabled flag, this is no longer useful. Remove the argument in patch 3. Patch 4 updates the phylib documentation to make it absolutely clear that phy_ethtool_get_eee() now fills in all members of struct ethtool_keee, which is why we now have so many buggy network drivers. ==================== Link: https://patch.msgid.link/Z1GDZlFyF2fsFa3S@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: phy: update phy_ethtool_get_eee() documentation	Russell King (Oracle)
	Update the phy_ethtool_get_eee() documentation to make it clear that all members of struct ethtool_keee are written by this function. keee.supported, keee.advertised, keee.lp_advertised and keee.eee_active are all written by genphy_c45_ethtool_get_eee(). keee.tx_lpi_timer, keee.tx_lpi_enabled and keee.eee_enabled are all written by eeecfg_to_eee(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1tJ9JH-006LIz-SO@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: phy: remove genphy_c45_eee_is_active()'s is_enabled arg	Russell King (Oracle)
	All callers to genphy_c45_eee_is_active() now pass NULL as the is_enabled argument, which means we never use the value computed in this function. Remove the argument and clean up this function. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/E1tJ9JC-006LIt-Ne@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: phy: avoid genphy_c45_ethtool_get_eee() setting eee_enabled	Russell King (Oracle)
	genphy_c45_ethtool_get_eee() is only called from phy_ethtool_get_eee(), which then calls eeecfg_to_eee(). eeecfg_to_eee() will overwrite keee.eee_enabled, so there's no point setting keee.eee_enabled in genphy_c45_ethtool_get_eee(). Remove this assignment. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/E1tJ9J7-006LIn-Jr@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: phy: marvell: use phydev->eee_cfg.eee_enabled	Russell King (Oracle)
	Rather than calling genphy_c45_ethtool_get_eee() to retrieve whether EEE is enabled, use the value stored in the phy_device eee_cfg structure. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/E1tJ9J2-006LIh-Fl@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	selftests: net: cleanup busy_poller.c	Joe Damato
	Fix various integer type conversions by using strtoull and a temporary variable which is bounds checked before being casted into the appropriate cfg_* variable for use by the test program. While here: - free the strdup'd cfg string for overall hygenie. - initialize napi_id = 0 in setup_queue to avoid warnings on some compilers. Signed-off-by: Joe Damato <jdamato@fastly.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20241204163239.294123-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: tipc: remove one synchronize_net() from tipc_nametbl_stop()	Eric Dumazet
	tipc_exit_net() is very slow and is abused by syzbot. tipc_nametbl_stop() is called for each netns being dismantled. Calling synchronize_net() right before freeing tn->nametbl is a big hammer. Replace this with kfree_rcu(). Note that RCU is not properly used here, otherwise tn->nametbl should be cleared before the synchronize_net() or kfree_rcu(), or even before the cleanup loop. We might need to fix this at some point. Also note tipc uses other synchronize_rcu() calls, more work is needed to make tipc_exit_net() much faster. List of remaining calls to synchronize_rcu() tipc_detach_loopback() (dev_remove_pack()) tipc_bcast_stop() tipc_sk_rht_destroy() Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241204210234.319484-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: simplify resource acquisition + ioremap	Rosen Penev
	get resource + request_mem_region + ioremap can all be done by a single function. Replace them with devm_platform_get_and_ioremap_resource or\ devm_platform_ioremap_resource where res is not used. Signed-off-by: Rosen Penev <rosenp@gmail.com> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> # sja1000_platform.c Link: https://patch.msgid.link/20241203231337.182391-1-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	Merge branch 'ucc_geth-phylink-conversion'	David S. Miller
	Maxime Chevallier says: ==================== net: freescale: ucc_geth: Phylink conversion This is V3 of the phylink conversion for ucc_geth. The main changes in this V3 are related to error handling in the patches 1 and 10 to report an error when the deprecated "interface" property is found in DT. Doing so, I found and addressed some issues with the jump labels in the error paths, impacting patches 1 and 10. The rest of the changes are just a rebase on net-next. Some of the V2 changes haven't been reviewed, so I stress out that I'm still uncertain about the way WoL is handled is patches 4 and 10. Thanks, Maxime Link to V1: https://lore.kernel.org/netdev/20241107170255.1058124-1-maxime.chevallier@bootlin.com/ Link to V2: https://lore.kernel.org/netdev/20241114153603.307872-1-maxime.chevallier@bootlin.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06	net: freescale: ucc_geth: phylink conversion	Maxime Chevallier
	ucc_geth is quite capable in terms of supported interfaces, and even includes an externally controlled PCS (well, TBI). Port that driver to phylink. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06	net: freescale: ucc_geth: Introduce a helper to check Reduced modes	Maxime Chevallier
	A number of parallel MII interfaces also exist in a "Reduced" mode, usually with higher clock rates and fewer data lines, to ease the hardware design. This is what the 'R' stands for in RGMII, RMII, RTBI, RXAUI, etc. The UCC Geth controller has a special configuration bit that needs to be set when the MII mode is one of the supported reduced modes. Add a local helper for that. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06	net: freescale: ucc_geth: Move the serdes configuration around	Maxime Chevallier
	The uec_configure_serdes() function deals with serialized linkmodes settings. It's used during the link bringup sequence. It is planned to be used during the phylink conversion for mac configuration, but it needs to me moved around in the process. To make the phylink port clearer, this commit moves the function without any feature change. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06	net: freescale: ucc_geth: Hardcode the preamble length to 7 bytes	Maxime Chevallier
	The preamble length can be configured in ucc_geth, however it just ends-up always being configured to 7 bytes, as nothing ever changes the default value of 7. Make that value the default value when the MACCFG2 register gets initialized, and remove the code to configure that value altogether. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06	net: freescale: ucc_geth: Simplify frame length check	Maxime Chevallier
	The frame length check is configured when the phy interface is setup. However, it's configured according to an internal flag that is always false. So, just make so that we disable the relevant bit in the MACCFG2 register upon accessing it for other MAC configuration operations. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06	net: freescale: ucc_geth: Use the correct type to store WoL opts	Maxime Chevallier
	The WoL opts are represented through a bitmask stored in a u32. As this mask is copied as-is in the driver, make sure we use the exact same type to store them internally. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06	net: freescale: ucc_geth: Fix WOL configuration	Maxime Chevallier
	The get/set_wol ethtool ops rely on querying the PHY for its WoL capabilities, checking for the presence of a PHY and a PHY interrupts isn't enough. Address that by cleaning up the WoL configuration sequence. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06	net: freescale: ucc_geth: Use netdev->phydev to access the PHY	Maxime Chevallier
	As this driver pre-dates phylib, it uses a private pointer to get a reference to the attached phy_device. Drop that pointer and use the netdev's pointer instead. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06	net: freescale: ucc_geth: split adjust_link for phylink conversion	Maxime Chevallier
	Preparing the phylink conversion, split the adjust_link callbaclk, by clearly separating the mac configuration, link_up and link_down phases. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06	net: freescale: ucc_geth: Drop support for the "interface" DT property	Maxime Chevallier
	In april 2007, ucc_geth was converted to phylib with : commit 728de4c927a3 ("ucc_geth: migrate ucc_geth to phylib"). In that commit, the device-tree property "interface", that could be used to retrieve the PHY interface mode was deprecated. DTS files that still used that property were converted along the way, in the following commit, also dating from april 2007 : commit 0fd8c47cccb1 ("[POWERPC] Replace undocumented interface properties in dts files") 17 years later, there's no users of that property left and I hope it's safe to say we can remove support from that in the ucc_geth driver, making the probe() function a bit simpler. Should there be any users that have a DT that was generated when 2.6.21 was cutting-edge, print an error message with hints on how to convert the devicetree if the 'interface' property is found. With that property gone, we can greatly simplify the parsing of the phy-interface-mode from the devicetree by using of_get_phy_mode(), allowing the removal of the open-coded parsing in the driver. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-05	Merge branch 'xdp-a-fistful-of-generic-changes-pt-i'	Jakub Kicinski
	Alexander Lobakin says: ==================== xdp: a fistful of generic changes pt. I XDP for idpf is currently 6 chapters: * convert Rx to libeth; * convert Tx and stats to libeth; * generic XDP and XSk code changes (you are here); * generic XDP and XSk code additions; * actual XDP for idpf via new libeth_xdp; * XSk for idpf (via ^). Part III does the following: * improve &xdp_buff_xsk cacheline placement; * does some cleanups with marking read-only bpf_prog and xdp_buff arguments const for some generic functions; * allows attaching already registered XDP memory model to RxQ info; * makes system percpu page_pools valid XDP memory models; * starts using netmems in the XDP core code (1 function); * allows mixing pages from several page_pools within one XDP frame; * optimizes &xdp_frame layout and removes no-more-used field. Bullets 4-6 are the most important ones. All of them are prereqs to libeth_xdp. ==================== Link: https://patch.msgid.link/20241203173733.3181246-1-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05	page_pool: make page_pool_put_page_bulk() handle array of netmems	Alexander Lobakin
	Currently, page_pool_put_page_bulk() indeed takes an array of pointers to the data, not pages, despite the name. As one side effect, when you're freeing frags from &skb_shared_info, xdp_return_frame_bulk() converts page pointers to virtual addresses and then page_pool_put_page_bulk() converts them back. Moreover, data pointers assume every frag is placed in the host memory, making this function non-universal. Make page_pool_put_page_bulk() handle array of netmems. Pass frag netmems directly and use virt_to_netmem() when freeing xdpf->data, so that the PP core will then get the compound netmem and take care of the rest. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20241203173733.3181246-9-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05	netmem: add a couple of page helper wrappers	Alexander Lobakin
	Add the following netmem counterparts: * virt_to_netmem() -- simple page_to_netmem(virt_to_page()) wrapper; * netmem_is_pfmemalloc() -- page_is_pfmemalloc() for page-backed netmems, false otherwise; and the following "unsafe" versions: * __netmem_to_page() * __netmem_get_pp() * __netmem_address() They do the same as their non-underscored buddies, but assume the netmem is always page-backed. When working with header &page_pools, you don't need to check whether netmem belongs to the host memory and you can never get NULL instead of &page. Checks for the LSB, clearing the LSB, branches take cycles and increase object code size, sometimes significantly. When you're sure your PP is always host, you can avoid this by using the underscored counterparts. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20241203173733.3181246-8-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05	xdp: register system page pool as an XDP memory model	Toke Høiland-Jørgensen
	To make the system page pool usable as a source for allocating XDP frames, we need to register it with xdp_reg_mem_model(), so that page return works correctly. This is done in preparation for using the system page_pool to convert XDP_PASS XSk frames to skbs; for the same reason, make the per-cpu variable non-static so we can access it from other source files as well (but w/o exporting). Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20241203173733.3181246-7-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05	xsk: allow attaching XSk pool via xdp_rxq_info_reg_mem_model()	Alexander Lobakin
	When you register an XSk pool as XDP Rxq info memory model, you then need to manually attach it after the registration. Let the user combine both actions into one by just passing a pointer to the pool directly to xdp_rxq_info_reg_mem_model(), which will take care of calling xsk_pool_set_rxq_info(). This looks similar to how a &page_pool gets registered and reduce repeating driver code. Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20241203173733.3181246-6-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05	xdp: allow attaching already registered memory model to xdp_rxq_info	Alexander Lobakin
	One may need to register memory model separately from xdp_rxq_info. One simple example may be XDP test run code, but in general, it might be useful when memory model registering is managed by one layer and then XDP RxQ info by a different one. Allow such scenarios by adding a simple helper which "attaches" already registered memory model to the desired xdp_rxq_info. As this is mostly needed for Page Pool, add a special function to do that for a &page_pool pointer. Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20241203173733.3181246-5-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05	xdp, xsk: constify read-only arguments of some static inline helpers	Alexander Lobakin
	Lots of read-only helpers for &xdp_buff and &xdp_frame, such as getting the frame length, skb_shared_info etc., don't have their arguments marked with `const` for no reason. Add the missing annotations to leave less place for mistakes and more for optimization. Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20241203173733.3181246-4-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05	bpf, xdp: constify some bpf_prog * function arguments	Alexander Lobakin
	In lots of places, bpf_prog pointer is used only for tracing or other stuff that doesn't modify the structure itself. Same for net_device. Address at least some of them and add `const` attributes there. The object code didn't change, but that may prevent unwanted data modifications and also allow more helpers to have const arguments. Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05	xsk: align &xdp_buff_xsk harder	Alexander Lobakin
	After the series "XSk buff on a diet" by Maciej, the greatest pow-2 which &xdp_buff_xsk can be divided got reduced from 16 to 8 on x86_64. Also, sizeof(xdp_buff_xsk) now is 120 bytes, which, taking the previous sentence into account, leads to that it leaves 8 bytes at the end of cacheline, which means an array of buffs will have its elements messed between the cachelines chaotically. Use __aligned_largest for this struct. This alignment is usually 16 bytes, which makes it fill two full cachelines and align an array nicely. ___cacheline_aligned may be excessive here, especially on arches with 128-256 byte CLs, as well as 32-bit arches (76 -> 96 bytes on MIPS32R2), while not doing better than _largest. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20241203173733.3181246-2-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05	Merge branch 'net_sched-sch_sfq-reject-limit-of-1'	Jakub Kicinski
	Octavian Purdila says: ==================== net_sched: sch_sfq: reject limit of 1 The implementation does not properly support limits of 1. Add an in-kernel check, in addition to existing iproute2 check, since other tools may be used for configuration. This patch set also adds a selfcheck to test that a limit of 1 is rejected. An alternative (or in addition) we could fix the implementation by setting q->tail to NULL in sfq_drop if this is the last slot we marked empty, e.g.: --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -317,8 +317,11 @@ static unsigned int sfq_drop(struct Qdisc sch, struct sk_buff to_free) / It is difficult to believe, but ALL THE SLOTS HAVE LENGTH 1. / x = q->tail->next; slot = &q->slots[x]; - q->tail->next = slot->next; q->ht[slot->hash] = SFQ_EMPTY_SLOT; + if (x == slot->next) + q->tail = NULL; / no more active slots */ + else + q->tail->next = slot->next; goto drop; } ==================== Link: https://patch.msgid.link/20241204030520.2084663-1-tavip@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05	selftests/tc-testing: sfq: test that kernel rejects limit of 1	Octavian Purdila
	Add test to check that the kernel rejects a configuration with the limit set to 1. Signed-off-by: Octavian Purdila <tavip@google.com> Link: https://patch.msgid.link/20241204030520.2084663-3-tavip@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05	net_sched: sch_sfq: don't allow 1 packet limit	Octavian Purdila
	The current implementation does not work correctly with a limit of 1. iproute2 actually checks for this and this patch adds the check in kernel as well. This fixes the following syzkaller reported crash: UBSAN: array-index-out-of-bounds in net/sched/sch_sfq.c:210:6 index 65535 is out of range for type 'struct sfq_head[128]' CPU: 0 PID: 2569 Comm: syz-executor101 Not tainted 5.10.0-smp-DEV #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x125/0x19f lib/dump_stack.c:120 ubsan_epilogue lib/ubsan.c:148 [inline] __ubsan_handle_out_of_bounds+0xed/0x120 lib/ubsan.c:347 sfq_link net/sched/sch_sfq.c:210 [inline] sfq_dec+0x528/0x600 net/sched/sch_sfq.c:238 sfq_dequeue+0x39b/0x9d0 net/sched/sch_sfq.c:500 sfq_reset+0x13/0x50 net/sched/sch_sfq.c:525 qdisc_reset+0xfe/0x510 net/sched/sch_generic.c:1026 tbf_reset+0x3d/0x100 net/sched/sch_tbf.c:319 qdisc_reset+0xfe/0x510 net/sched/sch_generic.c:1026 dev_reset_queue+0x8c/0x140 net/sched/sch_generic.c:1296 netdev_for_each_tx_queue include/linux/netdevice.h:2350 [inline] dev_deactivate_many+0x6dc/0xc20 net/sched/sch_generic.c:1362 __dev_close_many+0x214/0x350 net/core/dev.c:1468 dev_close_many+0x207/0x510 net/core/dev.c:1506 unregister_netdevice_many+0x40f/0x16b0 net/core/dev.c:10738 unregister_netdevice_queue+0x2be/0x310 net/core/dev.c:10695 unregister_netdevice include/linux/netdevice.h:2893 [inline] __tun_detach+0x6b6/0x1600 drivers/net/tun.c:689 tun_detach drivers/net/tun.c:705 [inline] tun_chr_close+0x104/0x1b0 drivers/net/tun.c:3640 __fput+0x203/0x840 fs/file_table.c:280 task_work_run+0x129/0x1b0 kernel/task_work.c:185 exit_task_work include/linux/task_work.h:33 [inline] do_exit+0x5ce/0x2200 kernel/exit.c:931 do_group_exit+0x144/0x310 kernel/exit.c:1046 __do_sys_exit_group kernel/exit.c:1057 [inline] __se_sys_exit_group kernel/exit.c:1055 [inline] __x64_sys_exit_group+0x3b/0x40 kernel/exit.c:1055 do_syscall_64+0x6c/0xd0 entry_SYSCALL_64_after_hwframe+0x61/0xcb RIP: 0033:0x7fe5e7b52479 Code: Unable to access opcode bytes at RIP 0x7fe5e7b5244f. RSP: 002b:00007ffd3c800398 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe5e7b52479 RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000 RBP: 00007fe5e7bcd2d0 R08: ffffffffffffffb8 R09: 0000000000000014 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe5e7bcd2d0 R13: 0000000000000000 R14: 00007fe5e7bcdd20 R15: 00007fe5e7b24270 The crash can be also be reproduced with the following (with a tc recompiled to allow for sfq limits of 1): tc qdisc add dev dummy0 handle 1: root tbf rate 1Kbit burst 100b lat 1s ../iproute2-6.9.0/tc/tc qdisc add dev dummy0 handle 2: parent 1:10 sfq limit 1 ifconfig dummy0 up ping -I dummy0 -f -c2 -W0.1 8.8.8.8 sleep 1 Scenario that triggers the crash: * the first packet is sent and queued in TBF and SFQ; qdisc qlen is 1 * TBF dequeues: it peeks from SFQ which moves the packet to the gso_skb list and keeps qdisc qlen set to 1. TBF is out of tokens so it schedules itself for later. * the second packet is sent and TBF tries to queues it to SFQ. qdisc qlen is now 2 and because the SFQ limit is 1 the packet is dropped by SFQ. At this point qlen is 1, and all of the SFQ slots are empty, however q->tail is not NULL. At this point, assuming no more packets are queued, when sch_dequeue runs again it will decrement the qlen for the current empty slot causing an underflow and the subsequent out of bounds access. Reported-by: syzbot <syzkaller@googlegroups.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Octavian Purdila <tavip@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241204030520.2084663-2-tavip@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05	net_sched: sch_fq: add three drop_reason	Eric Dumazet
	Add three new drop_reason, more precise than generic QDISC_DROP: "tc -s qd" show aggregate counters, it might be more useful to use drop_reason infrastructure for bug hunting. 1) SKB_DROP_REASON_FQ_BAND_LIMIT Whenever a packet is added while its band limit is hit. Corresponding value in "tc -s qd" is bandX_drops XXXX 2) SKB_DROP_REASON_FQ_HORIZON_LIMIT Whenever a packet has a timestamp too far in the future. Corresponding value in "tc -s qd" is horizon_drops XXXX 3) SKB_DROP_REASON_FQ_FLOW_LIMIT Whenever a flow has reached its limit. Corresponding value in "tc -s qd" is flows_plimit XXXX Tested: tc qd replace dev eth1 root fq flow_limit 10 limit 100000 perf record -a -e skb:kfree_skb sleep 1; perf script udp_stream 12329 [004] 216.929492: skb:kfree_skb: skbaddr=0xffff888eabe17e00 rx_sk=(nil) protocol=34525 location=__dev_queue_xmit+0x9d9 reason: FQ_FLOW_LIMIT udp_stream 12385 [006] 216.929593: skb:kfree_skb: skbaddr=0xffff888ef8827f00 rx_sk=(nil) protocol=34525 location=__dev_queue_xmit+0x9d9 reason: FQ_FLOW_LIMIT udp_stream 12389 [005] 216.929871: skb:kfree_skb: skbaddr=0xffff888ecb9ba500 rx_sk=(nil) protocol=34525 location=__dev_queue_xmit+0x9d9 reason: FQ_FLOW_LIMIT udp_stream 12316 [009] 216.930398: skb:kfree_skb: skbaddr=0xffff888eca286b00 rx_sk=(nil) protocol=34525 location=__dev_queue_xmit+0x9d9 reason: FQ_FLOW_LIMIT udp_stream 12400 [008] 216.930490: skb:kfree_skb: skbaddr=0xffff888eabf93d00 rx_sk=(nil) protocol=34525 location=__dev_queue_xmit+0x9d9 reason: FQ_FLOW_LIMIT tc qd replace dev eth1 root fq flow_limit 100 limit 10000 perf record -a -e skb:kfree_skb sleep 1; perf script udp_stream 18074 [001] 1058.318040: skb:kfree_skb: skbaddr=0xffffa23c881fc000 rx_sk=(nil) protocol=34525 location=__dev_queue_xmit+0x9d9 reason: FQ_BAND_LIMIT udp_stream 18126 [005] 1058.320651: skb:kfree_skb: skbaddr=0xffffa23c6aad4000 rx_sk=(nil) protocol=34525 location=__dev_queue_xmit+0x9d9 reason: FQ_BAND_LIMIT udp_stream 18118 [006] 1058.321065: skb:kfree_skb: skbaddr=0xffffa23df0d48a00 rx_sk=(nil) protocol=34525 location=__dev_queue_xmit+0x9d9 reason: FQ_BAND_LIMIT udp_stream 18074 [001] 1058.321126: skb:kfree_skb: skbaddr=0xffffa23c881ffa00 rx_sk=(nil) protocol=34525 location=__dev_queue_xmit+0x9d9 reason: FQ_BAND_LIMIT udp_stream 15815 [003] 1058.321224: skb:kfree_skb: skbaddr=0xffffa23c9835db00 rx_sk=(nil) protocol=34525 location=__dev_queue_xmit+0x9d9 reason: FQ_BAND_LIMIT tc -s -d qd sh dev eth1 qdisc fq 8023: root refcnt 257 limit 10000p flow_limit 100p buckets 1024 orphan_mask 1023 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 weights 589824 196608 65536 quantum 18Kb initial_quantum 92120b low_rate_threshold 550Kbit refill_delay 40ms timer_slack 10us horizon 10s horizon_drop Sent 492439603330 bytes 336953991 pkt (dropped 61724094, overlimits 0 requeues 4463) backlog 14611228b 9995p requeues 4463 flows 2965 (inactive 1151 throttled 0) band0_pkts 0 band1_pkts 9993 band2_pkts 0 gc 6347 highprio 0 fastpath 30 throttled 5 latency 2.32us flows_plimit 7403693 band1_drops 54320401 Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20241204171950.89829-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>