summaryrefslogtreecommitdiff
path: root/drivers/net
AgeCommit message (Collapse)Author
2024-02-06net: ravb: Move delay mode set in the driver's ndo_open APIClaudiu Beznea
Delay parsing and setting were done in the driver's probe API. As some IP variants switch to reset mode (and thus the register contents is lost) when setting clocks (due to module standby functionality) to be able to implement runtime PM keep the delay parsing in the driver's probe function and move the delay applying function to the driver's ndo_open API. Along with it, ravb_parse_delay_mode() function was moved close to ravb_set_delay_mode() function to have the delay specific code in the same place. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-06net: ravb: Split GTI computation and set operationsClaudiu Beznea
ravb_set_gti() was computing the value of GTI based on the reference clock rate and then applied it to register. This was done on the driver's probe function. In order to implement runtime PM for all IP variants (as some IP variants switches to reset mode (and thus the registers content is lost) when module standby is configured through clock APIs) the GTI setup was split in 2 parts: one computing the value of the GTI register (done in the driver's probe function) and one applying the computed value to register (done in the driver's ndo_open API). Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-06net: ravb: Move getting/requesting IRQs in the probe() methodClaudiu Beznea
The runtime PM implementation will disable clocks at the end of ravb_probe(). As some IP variants switch to reset mode as a result of setting module standby through clock disable APIs, to implement runtime PM the resource parsing and requesting are moved in the probe function and IP settings are moved in the open function. This is done because at the end of the probe some IP variants will switch anyway to reset mode and the registers content is lost. Also keeping only register settings operations in the ravb_open()/ravb_close() functions will make them faster. Commit moves IRQ requests to ravb_probe() to have all the IRQs ready when the interface is open. As now getting/requesting IRQs is done in a single place there is no need to keep intermediary data (like ravb_rx_irqs[] and ravb_tx_irqs[] arrays or IRQs in struct ravb_private). In order to avoid accessing the IP registers while the IP is runtime suspended (e.g. in the timeframe b/w the probe requests shared IRQs and IP clocks are enabled) in the interrupt handlers were introduced pm_runtime_active() checks. The device runtime PM usage counter has been incremented to avoid disabling the device's clocks while the check is in progress (if any). This is a preparatory change to add runtime PM support for all IP variants. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-06net: ravb: Move reference clock enable/disable on runtime PM APIsClaudiu Beznea
Reference clock could be or not be part of the power domain. If it is part of the power domain, the power domain takes care of properly setting it. In case it is not part of the power domain and full runtime PM support is available in driver the clock will not be propertly disabled/enabled at runtime. For this, keep the prepare/unprepare operations in the driver's probe()/remove() functions and move the enable/disable in runtime PM functions. By doing this, the previous ravb_runtime_nop() function was renamed ravb_runtime_suspend() and the comment was removed. A proper runtime PM resume function was added (ravb_runtime_resume()). The current driver still don't need to make any register settings on runtime suspend/resume (as expressed in the removed comment) because, currently, pm_runtime_put_sync() is called on the driver remove function. This will be changed in the next commits (that extends the runtime PM support) such that proper register settings (along with runtime resume/suspend) will be done on ravb_open()/ravb_close(). Along with it, the other clock request operations were moved close to reference clock request and prepare to have all the clock requests specific code grouped together. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-06net: ravb: Assert/de-assert reset on suspend/resumeClaudiu Beznea
RZ/G3S can go to deep sleep states where power to most of the SoC parts is off. When resuming from such a state, the Ethernet controller needs to be reinitialized. De-asserting the reset signal for it should also be done. Thus, add reset assert/de-assert on suspend/resume functions. On the resume function, the de-assert was not reverted in case of failures to give the user a chance to restore the interface (e.g., bringing down/up the interface) in case suspend/resume failed. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-06net: ravb: Use tabs instead of spacesClaudiu Beznea
Use tabs instead of spaces in the ravb_set_rate_gbeth() function. This aligns with the coding style requirements. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-06net: ravb: Switch to SYSTEM_SLEEP_PM_OPS()/RUNTIME_PM_OPS() and pm_ptr()Claudiu Beznea
SET_SYSTEM_SLEEP_PM_OPS() and SET_RUNTIME_PM_OPS() are deprecated now and require __maybe_unused protection against unused function warnings. The usage of pm_ptr() and SYSTEM_SLEEP_PM_OPS()/RUNTIME_PM_OPS() allows the compiler to see the functions, thus suppressing the warning. Thus drop the __maybe_unused markings. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-06net: ravb: Make reset controller support mandatoryClaudiu Beznea
On the RZ/G3S SoC the reset controller is mandatory for the IP to work. The device tree binding documentation for the ravb driver specifies that the resets are mandatory. Based on this, make the resets mandatory also in driver for all ravb devices. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-06net: ravb: Rely on PM domain to enable gptp_clkClaudiu Beznea
ravb_rzv2m_hw_info::gptp_ref_clk is enabled only for RZ/V2M. RZ/V2M is an ARM64-based device which selects power domains by default and CONFIG_PM. The RZ/V2M Ethernet DT node has proper power-domain binding available in device tree from the commit that added the Ethernet node. (4872ca1f92b0 ("arm64: dts: renesas: r9a09g011: Add ethernet nodes")). Power domain support was available in the rzg2l-cpg.c driver when the Ethernet DT node has been enabled in RZ/V2M device tree. (ef3c613ccd68 ("clk: renesas: Add CPG core wrapper for RZ/G2L SoC")). Thus, remove the explicit clock enable for gptp_clk (and treat it as the other clocks are treated) as it is not needed and removing it doesn't break the ABI according to the above explanations. By removing the enable/disable operation from the driver we can add runtime PM support (which operates on clocks) w/o the need to handle the gptp_clk in the Ethernet driver functions like ravb_runtime_nop(). PM domain does all that is needed. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-06net: ravb: Let IP-specific receive function to interrogate descriptorsClaudiu Beznea
ravb_poll() initial code used to interrogate the first descriptor of the RX queue in case gPTP is false to determine if ravb_rx() should be called. This is done for non-gPTP IPs. For gPTP IPs the driver PTP-specific information was used to determine if receive function should be called. As every IP has its own receive function that interrogates the RX descriptors list in the same way the ravb_poll() was doing there is no need to double check this in ravb_poll(). Removing the code from ravb_poll() leads to a cleaner code. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-06net: encx24j600: convert to use maple tree register cacheBo Liu
The maple tree register cache is based on a much more modern data structure than the rbtree cache and makes optimisation choices which are probably more appropriate for modern systems than those made by the rbtree cache. Signed-off-by: Bo Liu <liubo03@inspur.com> Link: https://lore.kernel.org/r/20240202064336.39138-1-liubo03@inspur.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-05net: dsa: qca8k: consistently use "ret" rather than "err" for error codesVladimir Oltean
It was pointed out during the review [1] of commit 68e1010cda79 ("net: dsa: qca8k: put MDIO bus OF node on qca8k_mdio_register() failure") that the rest of the qca8k driver uses "int ret" rather than "int err". Make everything consistent in that regard, not only qca8k_mdio_register(), but also qca8k_setup_mdio_bus(). [1] https://lore.kernel.org/netdev/qyl2w3ownx5q7363kqxib52j5htar4y6pkn7gen27rj45xr4on@pvy5agi6o2te/ Suggested-by: Alvin Šipraga <alsi@bang-olufsen.dk> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-05net: dsa: qca8k: put MDIO controller OF node if unavailableVladimir Oltean
It was pointed out during the review [1] of commit e66bf63a7f67 ("net: dsa: qca8k: skip MDIO bus creation if its OF node has status = "disabled"") that we now leak a reference to the "mdio" OF node if it is disabled. This is only a concern when using dynamic OF as far as I can tell (like probing on an overlay), since OF nodes are never freed in the regular case. Additionally, I'm unaware of any actual device trees (in production or elsewhere) which have status = "disabled" for the MDIO OF node. So handling this as a simple enhancement. [1] https://lore.kernel.org/netdev/CAJq09z4--Ug+3FAmp=EimQ8HTQYOWOuVon-PUMGB5a1N=RPv4g@mail.gmail.com/ Suggested-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-05net: ocelot: update the MODULE_DESCRIPTION()Breno Leitao
commit 1c870c63d7d2 ("net: fill in MODULE_DESCRIPTION()s for ocelot") got a suggestion from Vladimir Oltean after it had landed in net-next. Rewrite the module description according to Vladimir's suggestion. Fixes: 1c870c63d7d2 ("net: fill in MODULE_DESCRIPTION()s for ocelot") Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-05tsnep: Add helper for RX XDP_RING_NEED_WAKEUP flagGerhard Engleder
Similar chunk of code is used in tsnep_rx_poll_zc() and tsnep_rx_reopen_xsk() to maintain the RX XDP_RING_NEED_WAKEUP flag. Consolidate the code to common helper function. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-04tun: Implement ethtool's get_channels() callbackYunjian Wang
Implement the tun .get_channels functionality. This feature is necessary for some tools, such as libxdp, which need to retrieve the queue count. Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-04r8169: add support for RTL8126AHeiner Kallweit
This adds support for the RTL8126A found on Asus z790 Maximus Formula. It was successfully tested w/o the firmware at 1000Mbps. Firmware file has been provided by Realtek and submitted to linux-firmware. 2.5G and 5G modes are untested. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-04net: micrel: Fix the frequency adjustmentsHoratiu Vultur
By default lan8841's 1588 clock frequency is 125MHz. But when adjusting the frequency, it is using the 1PPM format of the lan8814. Which is the wrong format as lan8814 has a 1588 clock frequency of 250MHz. So then for each 1PPM adjustment would adjust less than expected. Therefore fix this by using the correct 1PPM format for lan8841. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-03net: phy: qcom: qca808x: default to LED active High if not setChristian Marangi
qca808x PHY provide support for the led_polarity_set OP to configure and apply the active-low property but on PHY reset, the Active High bit is not set resulting in the LED driven as active-low. To fix this, check if active-low is not set in DT and enable Active High polarity by default to restore correct funcionality of the LED. Fixes: 7196062b64ee ("net: phy: at803x: add LED support for qca808x") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-03net: phy: qcom: qca808x: fix logic error in LED brightness setChristian Marangi
In switching to using phy_modify_mmd and a more short version of the LED ON/OFF condition in later revision, it was made a logic error where value ? QCA808X_LED_FORCE_ON : QCA808X_LED_FORCE_OFF is always true as value is always OR with QCA808X_LED_FORCE_EN due to missing () resulting in the testing condition being QCA808X_LED_FORCE_EN | value. Add the () to apply the correct condition and restore correct functionality of the brightness ON/OFF. Fixes: 7196062b64ee ("net: phy: at803x: add LED support for qca808x") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-02r8169: simplify EEE handlingHeiner Kallweit
We don't have to store the EEE modes to be advertised in the driver, phylib does this for us and stores it in phydev->advertising_eee. phylib also takes care of properly handling the EEE advertisement. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/27c336a8-ea47-483d-815b-02c45ae41da2@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-02net: phy: realtek: add support for RTL8126A-integrated 5Gbps PHYHeiner Kallweit
A user reported that first consumer mainboards show up with a RTL8126A 5Gbps MAC/PHY. This adds support for the integrated PHY, which is also available stand-alone. From a PHY driver perspective it's treated the same as the 2.5Gbps PHY's, we just have to support the new PHY ID. Reported-by: Joe Salmeri <jmscdba@gmail.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Joe Salmeri <jmscdba@gmail.com> Link: https://lore.kernel.org/r/0c8e67ea-6505-43d1-bd51-94e7ecd6e222@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-02octeontx2-af: Cleanup loopback device checksGeetha sowjanya
PCI device IDs of RVU device IDs are configurable and RVU PF0's (ie AF's) are currently assumed as VFs that identify loopback functionality ie LBKVFs. But in some cases these VFs can be setup for different functionality. Hence remove assumptions that AF's VFs are always LBK VFs by renaming 'is_afvf' as 'is_lbkvf' explicitly and also identify LBK VF using PCI dev ID. Similar change is done for other VF types. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-02octeontx2-af: Create BPIDs free poolGeetha sowjanya
In current driver 64 BPIDs are reserved for LBK interfaces. These bpids are 1-to-1 mapped to LBK interface channel numbers. In some usecases one LBK interface required more than one bpids and in some case they may not require at all. These usescase can't be address with the current implementation as it always reserves only one bpid per LBK channel. This patch addresses this issue by creating free bpid pool from these 64 bpids instead of 1-to-1 mapping to the lbk channel. Now based on usecase LBK interface can request a bpid using (bp_enable()). This patch also reduces the number of bpids for cgx interfaces to 8 and adds proper error code Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-02net: phy: dp83867: Add support for active-low LEDsAlexander Stein
Add the led_polarity_set callback for setting LED polarity. Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-02net: mdio: ipq4019: add support for clock-frequency propertyChristian Marangi
The IPQ4019 MDIO internally divide the clock feed by AHB based on the MDIO_MODE reg. On reset or power up, the default value for the divider is 0xff that reflect the divider set to /256. This makes the MDC run at a very low rate, that is, considering AHB is always fixed to 100Mhz, a value of 390KHz. This hasn't have been a problem as MDIO wasn't used for time sensitive operation, it is now that on IPQ807x is usually mounted with PHY that requires MDIO to load their firmware (example Aquantia PHY). To handle this problem and permit to set the correct designed MDC frequency for the SoC add support for the standard "clock-frequency" property for the MDIO node. The divider supports value from /1 to /256 and the common value are to set it to /16 to reflect 6.25Mhz or to /8 on newer platform to reflect 12.5Mhz. To scan if the requested rate is supported by the divider, loop with each supported divider and stop when the requested rate match the final rate with the current divider. An error is returned if the rate doesn't match any value. On MDIO reset, the divider is restored to the requested value to prevent any kind of downclocking caused by the divider reverting to a default value. To follow 802.3 spec of 2.5MHz of default value, if divider is set at /256 and "clock-frequency" is not set in DT, assume nobody set the divider and try to find the closest MDC rate to 2.5MHz. (in the case of AHB set to 100MHz, it's 1.5625MHz) While at is also document other bits of the MDIO_MODE reg to have a clear idea of what is actually applied there. Documentation of some BITs is skipped as they are marked as reserved and their usage is not clear (RES 11:9 GENPHY 16:13 RES1 19:17) Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-01net: ipa: kill ipa_power_modem_queue_wake()Alex Elder
All ipa_power_modem_queue_wake() does is call netif_wake_queue() on the modem netdev. There is no need to wrap that call in a trivial function (and certainly not one defined in "ipa_power.c"). So get rid of ipa_power_modem_queue_wake(), and replace its one caller with a direct call to netif_wake_queue(). Determine the netdev pointer to use from the private TX endpoint's netdev pointer. Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20240130192305.250915-8-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: ipa: kill ipa_power_modem_queue_active()Alex Elder
All ipa_power_modem_queue_active() does now is call netif_wake_queue(). Just call netif_wake_queue() in the two places it's needed, and get rid of ipa_power_modem_queue_active(). Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20240130192305.250915-7-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: ipa: kill ipa_power_modem_queue_stop()Alex Elder
All ipa_power_modem_queue_stop() does now is call netif_stop_queue(). Just call netif_stop_queue() in the one place it's needed, and get rid of ipa_power_modem_queue_stop(). Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20240130192305.250915-6-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: ipa: kill the IPA power STOPPED flagAlex Elder
Currently the STOPPED IPA power flag is used to indicate that the transmit queue has been stopped. Previously this was used to avoid setting the STARTED flag unless the queue had already been stopped. It meant transmit queuing would be enabled on resume if it was stopped by the transmit path--and if so, it ensured it only got enabled once. We only stop the transmit queue in the transmit path. The STARTED flag has been removed, and it causes no real harm to enable transmits when they're already enabled. So we can get rid of the STOPPED flag and call netif_wake_queue() unconditionally. This makes the IPA power spinlock unnecessary, so it can be removed as well. Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20240130192305.250915-5-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: ipa: kill the STARTED IPA power flagAlex Elder
A transmit on the modem netdev can only complete if the IPA hardware is powered. Currently, if a transmit request arrives when the hardware was not powered, further transmits are be stopped to allow power-up to complete. Once power-up completes, transmits are once again enabled. Runtime resume can complete at the same time a transmit request is being handled, and previously there was a race between stopping and restarting transmits. The STARTED flag was used to ensure the stop request in the transmit path was skipped if the start request in the runtime resume path had already occurred. Now, the queue is *always* stopped in the transmit path, *before* determining whether power is ACTIVE. If power is found to already be active (or if the socket buffer is gets dropped), transmit is re-enabled. Otherwise it will (always) be enabled after runtime resume completes. The race between transmit and runtime resume no longer exists, so there is no longer any need to maintain the STARTED flag. Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20240130192305.250915-4-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: ipa: begin simplifying TX queue stopAlex Elder
There are a number of flags used in the IPA driver to attempt to manage race conditions that can occur between runtime resume and netdev transmit. If we disable TX before requesting power, we can avoid these races entirely, simplifying things considerably. This patch implements the main change, disabling transmit always in the net_device->ndo_start_xmit() callback, then re-enabling it again whenever we find power is active (or when we drop the skb). The patches that follow will refactor the "old" code to the point that most of it can be eliminated. Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20240130192305.250915-3-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: ipa: stash modem TX and RX endpointsAlex Elder
Rather than repeatedly looking up the endpoints in the name map, save the modem TX and RX endpoint pointers in the netdev private area. Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20240130192305.250915-2-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01idpf: avoid compiler padding in virtchnl2_ptype structPavan Kumar Linga
In the arm random config file, kconfig option 'CONFIG_AEABI' is disabled which results in adding the compiler flag '-mabi=apcs-gnu'. This causes the compiler to add padding in virtchnl2_ptype structure to align it to 8 bytes, resulting in the following size check failure: include/linux/build_bug.h:78:41: error: static assertion failed: "(6) == sizeof(struct virtchnl2_ptype)" 78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg) | ^~~~~~~~~~~~~~ include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert' 77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr) | ^~~~~~~~~~~~~~~ drivers/net/ethernet/intel/idpf/virtchnl2.h:26:9: note: in expansion of macro 'static_assert' 26 | static_assert((n) == sizeof(struct X)) | ^~~~~~~~~~~~~ drivers/net/ethernet/intel/idpf/virtchnl2.h:982:1: note: in expansion of macro 'VIRTCHNL2_CHECK_STRUCT_LEN' 982 | VIRTCHNL2_CHECK_STRUCT_LEN(6, virtchnl2_ptype); | ^~~~~~~~~~~~~~~~~~~~~~~~~~ Avoid the compiler padding by using "__packed" structure attribute for the virtchnl2_ptype struct. Also align the structure by using "__aligned(2)" for better code optimization. Fixes: 0d7502a9b4a7 ("virtchnl: add virtchnl version 2 ops") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312220250.ufEm8doQ-lkp@intel.com Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240131222241.2087516-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01hv_netvsc: Fix race condition between netvsc_probe and netvsc_removeSouradeep Chakrabarti
In commit ac5047671758 ("hv_netvsc: Disable NAPI before closing the VMBus channel"), napi_disable was getting called for all channels, including all subchannels without confirming if they are enabled or not. This caused hv_netvsc getting hung at napi_disable, when netvsc_probe() has finished running but nvdev->subchan_work has not started yet. netvsc_subchan_work() -> rndis_set_subchannel() has not created the sub-channels and because of that netvsc_sc_open() is not running. netvsc_remove() calls cancel_work_sync(&nvdev->subchan_work), for which netvsc_subchan_work did not run. netif_napi_add() sets the bit NAPI_STATE_SCHED because it ensures NAPI cannot be scheduled. Then netvsc_sc_open() -> napi_enable will clear the NAPIF_STATE_SCHED bit, so it can be scheduled. napi_disable() does the opposite. Now during netvsc_device_remove(), when napi_disable is called for those subchannels, napi_disable gets stuck on infinite msleep. This fix addresses this problem by ensuring that napi_disable() is not getting called for non-enabled NAPI struct. But netif_napi_del() is still necessary for these non-enabled NAPI struct for cleanup purpose. Call trace: [ 654.559417] task:modprobe state:D stack: 0 pid: 2321 ppid: 1091 flags:0x00004002 [ 654.568030] Call Trace: [ 654.571221] <TASK> [ 654.573790] __schedule+0x2d6/0x960 [ 654.577733] schedule+0x69/0xf0 [ 654.581214] schedule_timeout+0x87/0x140 [ 654.585463] ? __bpf_trace_tick_stop+0x20/0x20 [ 654.590291] msleep+0x2d/0x40 [ 654.593625] napi_disable+0x2b/0x80 [ 654.597437] netvsc_device_remove+0x8a/0x1f0 [hv_netvsc] [ 654.603935] rndis_filter_device_remove+0x194/0x1c0 [hv_netvsc] [ 654.611101] ? do_wait_intr+0xb0/0xb0 [ 654.615753] netvsc_remove+0x7c/0x120 [hv_netvsc] [ 654.621675] vmbus_remove+0x27/0x40 [hv_vmbus] Cc: stable@vger.kernel.org Fixes: ac5047671758 ("hv_netvsc: Disable NAPI before closing the VMBus channel") Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/1706686551-28510-1-git-send-email-schakrabarti@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01xen-netback: properly sync TX responsesJan Beulich
Invoking the make_tx_response() / push_tx_responses() pair with no lock held would be acceptable only if all such invocations happened from the same context (NAPI instance or dealloc thread). Since this isn't the case, and since the interface "spec" also doesn't demand that multicast operations may only be performed with no in-flight transmits, MCAST_{ADD,DEL} processing also needs to acquire the response lock around the invocations. To prevent similar mistakes going forward, "downgrade" the present functions to private helpers of just the two remaining ones using them directly, with no forward declarations anymore. This involves renaming what so far was make_tx_response(), for the new function of that name to serve the new (wrapper) purpose. While there, - constify the txp parameters, - correct xenvif_idx_release()'s status parameter's type, - rename {,_}make_tx_response()'s status parameters for consistency with xenvif_idx_release()'s. Fixes: 210c34dcd8d9 ("xen-netback: add support for multicast control") Cc: stable@vger.kernel.org Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul@xen.org> Link: https://lore.kernel.org/r/980c6c3d-e10e-4459-8565-e8fbde122f00@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net/mlx5: DPLL, Implement lock status error valueJiri Pirko
Fill-up the lock status error value properly. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01dpll: extend lock_status_get() op by status error and expose to userJiri Pirko
Pass additional argunent status_error over lock_status_get() so drivers can fill it up. In case they do, expose the value over previously introduced attribute to user. Do it only in case the current lock_status is either "unlocked" or "holdover". Signed-off-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01octeontx2-pf: Remove xdp queues on program detachGeetha sowjanya
XDP queues are created/destroyed when a XDP program is attached/detached. In current driver xdp_queues are not getting destroyed on program exit due to incorrect xdp_queue and tot_tx_queue count values. This patch fixes the issue by setting tot_tx_queue and xdp_queue count to correct values. It also fixes xdp.data_hard_start address. Fixes: 06059a1a9a4a ("octeontx2-pf: Add XDP support to netdev PF") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Link: https://lore.kernel.org/r/20240130120610.16673-1-gakula@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Reduce lines with longer column width boundaryDavid Arinzon
This patch reduces some of the lines by removing newlines where more variables or print strings can be pushed back to the previous line while still adhering to the styling guidelines. Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: handle ena_calc_io_queue_size() possible errorsDavid Arinzon
Fail queue size calculation when the device returns maximum TX/RX queue sizes that are smaller than the allowed minimum. Signed-off-by: Osama Abboud <osamaabb@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Change default print level for netif_ printsDavid Arinzon
The netif_* functions are used by the driver to log events into the kernel ring (dmesg) similar to the netdev_* ones. Unlike the latter, the netif_* function family allow the user to choose what events get logged using ethtool: sudo ethtool -s [interface] msglvl [msg_type] on By default the events which get logged are slow-path related and aren't printed often (e.g. interface up related prints). This patch removes the NETIF_MSG_TX_DONE type (called every TX completion polling) from the defaults and adds NETIF_MSG_IFDOWN instead as it makes more sensible defaults. This patch also transforms ena_down() print from netif_info into netif_dbg (same as the analogue print in ena_up()) as it suits it better. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Relocate skb_tx_timestamp() to improve time stamping accuracyDavid Arinzon
Move skb_tx_timestamp() closer to the actual time the driver sends the packets to the device. Signed-off-by: Osama Abboud <osamaabb@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Add more information on TX timeoutsDavid Arinzon
The function responsible for polling TX completions might not receive the CPU resources it needs due to higher priority tasks running on the requested core. The driver might not be able to recognize such cases, but it can use its state to suspect that they happened. If both conditions are met: - napi hasn't been executed more than the TX completion timeout value - napi is scheduled (meaning that we've received an interrupt) Then it's more likely that the napi handler isn't scheduled because of an overloaded CPU. It was decided that for this case, the driver would wait twice as long as the regular timeout before scheduling a reset. The driver uses ENA_REGS_RESET_SUSPECTED_POLL_STARVATION reset reason to indicate this case to the device. This patch also adds more information to the ena_tx_timeout() callback. This function is called by the kernel when it detects that a specific TX queue has been closed for too long. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Change error print during ena_device_init()David Arinzon
The print was re-worded to a more informative one. Signed-off-by: Shahar Itzko <itzko@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Remove CQ tail pointer updateDavid Arinzon
The functionality was added to allow the drivers to create an SQ and CQ of different sizes. When the RX/TX SQ and CQ have the same size, such update isn't necessary as the device can safely assume it doesn't override unprocessed completions. However, if the SQ is larger than the CQ, the device might "have" more completions it wants to update about than there's room in the CQ. There's no support for different SQ and CQ sizes, therefore, removing the API and its usage. '____cacheline_aligned' compiler attribute was added to 'struct ena_com_io_cq' to ensure that the removal of the 'cq_head_db_reg' field doesn't change the cache-line layout of this struct. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Enable DIM by defaultDavid Arinzon
Dynamic Interrupt Moderation (DIM) is a technique designed to balance the need for timely data processing with the desire to minimize CPU overhead. Instead of generating an interrupt for every received packet, the system can dynamically adjust the rate at which interrupts are generated based on the incoming traffic patterns. Enabling DIM by default to improve the user experience. DIM can be turned on/off through ethtool: `ethtool -C <interface> adaptive-rx <on/off>` Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Osama Abboud <osamaabb@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Minor cosmetic changesDavid Arinzon
A few changes for better readability and style 1. Adding / Removing newlines 2. Removing an unnecessary and confusing comment 3. Using an existing variable rather than re-checking a field Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Remove an unused fieldDavid Arinzon
Remove io_sq->header_addr field because it is no longer in use. LLQ was updated to support a bounce buffer so there is no need in saving the header address of the sq. Signed-off-by: Nati Koler <nkoler@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>