Age | Commit message (Collapse) | Author |
|
Delay parsing and setting were done in the driver's probe API. As some IP
variants switch to reset mode (and thus the register contents is lost) when
setting clocks (due to module standby functionality) to be able to
implement runtime PM keep the delay parsing in the driver's probe function
and move the delay applying function to the driver's ndo_open API.
Along with it, ravb_parse_delay_mode() function was moved close to
ravb_set_delay_mode() function to have the delay specific code in the
same place.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
ravb_set_gti() was computing the value of GTI based on the reference clock
rate and then applied it to register. This was done on the driver's probe
function. In order to implement runtime PM for all IP variants (as some IP
variants switches to reset mode (and thus the registers content is lost)
when module standby is configured through clock APIs) the GTI setup was
split in 2 parts: one computing the value of the GTI register (done in the
driver's probe function) and one applying the computed value to register
(done in the driver's ndo_open API).
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The runtime PM implementation will disable clocks at the end of
ravb_probe(). As some IP variants switch to reset mode as a result of
setting module standby through clock disable APIs, to implement runtime PM
the resource parsing and requesting are moved in the probe function and IP
settings are moved in the open function. This is done because at the end of
the probe some IP variants will switch anyway to reset mode and the
registers content is lost. Also keeping only register settings operations
in the ravb_open()/ravb_close() functions will make them faster.
Commit moves IRQ requests to ravb_probe() to have all the IRQs ready when
the interface is open. As now getting/requesting IRQs is done in a single
place there is no need to keep intermediary data (like ravb_rx_irqs[] and
ravb_tx_irqs[] arrays or IRQs in struct ravb_private).
In order to avoid accessing the IP registers while the IP is runtime
suspended (e.g. in the timeframe b/w the probe requests shared IRQs and
IP clocks are enabled) in the interrupt handlers were introduced
pm_runtime_active() checks. The device runtime PM usage counter has been
incremented to avoid disabling the device's clocks while the check is in
progress (if any).
This is a preparatory change to add runtime PM support for all IP variants.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Reference clock could be or not be part of the power domain. If it is part
of the power domain, the power domain takes care of properly setting it. In
case it is not part of the power domain and full runtime PM support is
available in driver the clock will not be propertly disabled/enabled at
runtime. For this, keep the prepare/unprepare operations in the driver's
probe()/remove() functions and move the enable/disable in runtime PM
functions.
By doing this, the previous ravb_runtime_nop() function was renamed
ravb_runtime_suspend() and the comment was removed. A proper runtime PM
resume function was added (ravb_runtime_resume()). The current driver
still don't need to make any register settings on runtime suspend/resume
(as expressed in the removed comment) because, currently,
pm_runtime_put_sync() is called on the driver remove function. This will be
changed in the next commits (that extends the runtime PM support) such
that proper register settings (along with runtime resume/suspend) will be
done on ravb_open()/ravb_close().
Along with it, the other clock request operations were moved close to
reference clock request and prepare to have all the clock requests
specific code grouped together.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
RZ/G3S can go to deep sleep states where power to most of the SoC parts is
off. When resuming from such a state, the Ethernet controller needs to be
reinitialized. De-asserting the reset signal for it should also be done.
Thus, add reset assert/de-assert on suspend/resume functions.
On the resume function, the de-assert was not reverted in case of failures
to give the user a chance to restore the interface (e.g., bringing down/up
the interface) in case suspend/resume failed.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Use tabs instead of spaces in the ravb_set_rate_gbeth() function.
This aligns with the coding style requirements.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
SET_SYSTEM_SLEEP_PM_OPS() and SET_RUNTIME_PM_OPS() are deprecated now
and require __maybe_unused protection against unused function warnings.
The usage of pm_ptr() and SYSTEM_SLEEP_PM_OPS()/RUNTIME_PM_OPS() allows
the compiler to see the functions, thus suppressing the warning. Thus
drop the __maybe_unused markings.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
On the RZ/G3S SoC the reset controller is mandatory for the IP to work.
The device tree binding documentation for the ravb driver specifies that
the resets are mandatory. Based on this, make the resets mandatory also in
driver for all ravb devices.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
ravb_rzv2m_hw_info::gptp_ref_clk is enabled only for RZ/V2M. RZ/V2M
is an ARM64-based device which selects power domains by default and
CONFIG_PM. The RZ/V2M Ethernet DT node has proper power-domain binding
available in device tree from the commit that added the Ethernet node.
(4872ca1f92b0 ("arm64: dts: renesas: r9a09g011: Add ethernet nodes")).
Power domain support was available in the rzg2l-cpg.c driver when the
Ethernet DT node has been enabled in RZ/V2M device tree.
(ef3c613ccd68 ("clk: renesas: Add CPG core wrapper for RZ/G2L SoC")).
Thus, remove the explicit clock enable for gptp_clk (and treat it as the
other clocks are treated) as it is not needed and removing it doesn't
break the ABI according to the above explanations.
By removing the enable/disable operation from the driver we can add
runtime PM support (which operates on clocks) w/o the need to handle
the gptp_clk in the Ethernet driver functions like ravb_runtime_nop().
PM domain does all that is needed.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
ravb_poll() initial code used to interrogate the first descriptor of the
RX queue in case gPTP is false to determine if ravb_rx() should be called.
This is done for non-gPTP IPs. For gPTP IPs the driver PTP-specific
information was used to determine if receive function should be called. As
every IP has its own receive function that interrogates the RX descriptors
list in the same way the ravb_poll() was doing there is no need to double
check this in ravb_poll(). Removing the code from ravb_poll() leads to a
cleaner code.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The maple tree register cache is based on a much more modern data structure
than the rbtree cache and makes optimisation choices which are probably
more appropriate for modern systems than those made by the rbtree cache.
Signed-off-by: Bo Liu <liubo03@inspur.com>
Link: https://lore.kernel.org/r/20240202064336.39138-1-liubo03@inspur.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
It was pointed out during the review [1] of commit 68e1010cda79 ("net:
dsa: qca8k: put MDIO bus OF node on qca8k_mdio_register() failure") that
the rest of the qca8k driver uses "int ret" rather than "int err".
Make everything consistent in that regard, not only
qca8k_mdio_register(), but also qca8k_setup_mdio_bus().
[1] https://lore.kernel.org/netdev/qyl2w3ownx5q7363kqxib52j5htar4y6pkn7gen27rj45xr4on@pvy5agi6o2te/
Suggested-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It was pointed out during the review [1] of commit e66bf63a7f67 ("net:
dsa: qca8k: skip MDIO bus creation if its OF node has status =
"disabled"") that we now leak a reference to the "mdio" OF node if it is
disabled.
This is only a concern when using dynamic OF as far as I can tell (like
probing on an overlay), since OF nodes are never freed in the regular
case. Additionally, I'm unaware of any actual device trees (in
production or elsewhere) which have status = "disabled" for the MDIO OF
node. So handling this as a simple enhancement.
[1] https://lore.kernel.org/netdev/CAJq09z4--Ug+3FAmp=EimQ8HTQYOWOuVon-PUMGB5a1N=RPv4g@mail.gmail.com/
Suggested-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit 1c870c63d7d2 ("net: fill in MODULE_DESCRIPTION()s for ocelot")
got a suggestion from Vladimir Oltean after it had landed in net-next.
Rewrite the module description according to Vladimir's suggestion.
Fixes: 1c870c63d7d2 ("net: fill in MODULE_DESCRIPTION()s for ocelot")
Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Similar chunk of code is used in tsnep_rx_poll_zc() and
tsnep_rx_reopen_xsk() to maintain the RX XDP_RING_NEED_WAKEUP flag.
Consolidate the code to common helper function.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement the tun .get_channels functionality. This feature is necessary
for some tools, such as libxdp, which need to retrieve the queue count.
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This adds support for the RTL8126A found on Asus z790 Maximus Formula.
It was successfully tested w/o the firmware at 1000Mbps. Firmware file
has been provided by Realtek and submitted to linux-firmware.
2.5G and 5G modes are untested.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
By default lan8841's 1588 clock frequency is 125MHz. But when adjusting
the frequency, it is using the 1PPM format of the lan8814. Which is the
wrong format as lan8814 has a 1588 clock frequency of 250MHz. So then
for each 1PPM adjustment would adjust less than expected.
Therefore fix this by using the correct 1PPM format for lan8841.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
qca808x PHY provide support for the led_polarity_set OP to configure
and apply the active-low property but on PHY reset, the Active High bit
is not set resulting in the LED driven as active-low.
To fix this, check if active-low is not set in DT and enable Active High
polarity by default to restore correct funcionality of the LED.
Fixes: 7196062b64ee ("net: phy: at803x: add LED support for qca808x")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In switching to using phy_modify_mmd and a more short version of the
LED ON/OFF condition in later revision, it was made a logic error where
value ? QCA808X_LED_FORCE_ON : QCA808X_LED_FORCE_OFF is always true as
value is always OR with QCA808X_LED_FORCE_EN due to missing ()
resulting in the testing condition being QCA808X_LED_FORCE_EN | value.
Add the () to apply the correct condition and restore correct
functionality of the brightness ON/OFF.
Fixes: 7196062b64ee ("net: phy: at803x: add LED support for qca808x")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We don't have to store the EEE modes to be advertised in the driver,
phylib does this for us and stores it in phydev->advertising_eee.
phylib also takes care of properly handling the EEE advertisement.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/27c336a8-ea47-483d-815b-02c45ae41da2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A user reported that first consumer mainboards show up with a RTL8126A
5Gbps MAC/PHY. This adds support for the integrated PHY, which is also
available stand-alone. From a PHY driver perspective it's treated the
same as the 2.5Gbps PHY's, we just have to support the new PHY ID.
Reported-by: Joe Salmeri <jmscdba@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Joe Salmeri <jmscdba@gmail.com>
Link: https://lore.kernel.org/r/0c8e67ea-6505-43d1-bd51-94e7ecd6e222@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
PCI device IDs of RVU device IDs are configurable and
RVU PF0's (ie AF's) are currently assumed as VFs that
identify loopback functionality ie LBKVFs. But in some
cases these VFs can be setup for different functionality.
Hence remove assumptions that AF's VFs are always LBK VFs
by renaming 'is_afvf' as 'is_lbkvf' explicitly and also
identify LBK VF using PCI dev ID. Similar change is done
for other VF types.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In current driver 64 BPIDs are reserved for LBK interfaces.
These bpids are 1-to-1 mapped to LBK interface channel numbers.
In some usecases one LBK interface required more than one
bpids and in some case they may not require at all.
These usescase can't be address with the current implementation
as it always reserves only one bpid per LBK channel.
This patch addresses this issue by creating free bpid pool from these
64 bpids instead of 1-to-1 mapping to the lbk channel.
Now based on usecase LBK interface can request a bpid using (bp_enable()).
This patch also reduces the number of bpids for cgx interfaces to 8
and adds proper error code
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the led_polarity_set callback for setting LED polarity.
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The IPQ4019 MDIO internally divide the clock feed by AHB based on the
MDIO_MODE reg. On reset or power up, the default value for the
divider is 0xff that reflect the divider set to /256.
This makes the MDC run at a very low rate, that is, considering AHB is
always fixed to 100Mhz, a value of 390KHz.
This hasn't have been a problem as MDIO wasn't used for time sensitive
operation, it is now that on IPQ807x is usually mounted with PHY that
requires MDIO to load their firmware (example Aquantia PHY).
To handle this problem and permit to set the correct designed MDC
frequency for the SoC add support for the standard "clock-frequency"
property for the MDIO node.
The divider supports value from /1 to /256 and the common value are to
set it to /16 to reflect 6.25Mhz or to /8 on newer platform to reflect
12.5Mhz.
To scan if the requested rate is supported by the divider, loop with
each supported divider and stop when the requested rate match the final
rate with the current divider. An error is returned if the rate doesn't
match any value.
On MDIO reset, the divider is restored to the requested value to prevent
any kind of downclocking caused by the divider reverting to a default
value.
To follow 802.3 spec of 2.5MHz of default value, if divider is set at
/256 and "clock-frequency" is not set in DT, assume nobody set the
divider and try to find the closest MDC rate to 2.5MHz. (in the case of
AHB set to 100MHz, it's 1.5625MHz)
While at is also document other bits of the MDIO_MODE reg to have a
clear idea of what is actually applied there.
Documentation of some BITs is skipped as they are marked as reserved and
their usage is not clear (RES 11:9 GENPHY 16:13 RES1 19:17)
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
All ipa_power_modem_queue_wake() does is call netif_wake_queue()
on the modem netdev. There is no need to wrap that call in a
trivial function (and certainly not one defined in "ipa_power.c").
So get rid of ipa_power_modem_queue_wake(), and replace its one
caller with a direct call to netif_wake_queue(). Determine the
netdev pointer to use from the private TX endpoint's netdev pointer.
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240130192305.250915-8-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
All ipa_power_modem_queue_active() does now is call netif_wake_queue().
Just call netif_wake_queue() in the two places it's needed, and get
rid of ipa_power_modem_queue_active().
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240130192305.250915-7-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
All ipa_power_modem_queue_stop() does now is call netif_stop_queue().
Just call netif_stop_queue() in the one place it's needed, and get
rid of ipa_power_modem_queue_stop().
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240130192305.250915-6-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently the STOPPED IPA power flag is used to indicate that the
transmit queue has been stopped. Previously this was used to avoid
setting the STARTED flag unless the queue had already been stopped.
It meant transmit queuing would be enabled on resume if it was
stopped by the transmit path--and if so, it ensured it only got
enabled once.
We only stop the transmit queue in the transmit path. The STARTED
flag has been removed, and it causes no real harm to enable
transmits when they're already enabled. So we can get rid of
the STOPPED flag and call netif_wake_queue() unconditionally.
This makes the IPA power spinlock unnecessary, so it can be removed
as well.
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240130192305.250915-5-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A transmit on the modem netdev can only complete if the IPA hardware
is powered. Currently, if a transmit request arrives when the
hardware was not powered, further transmits are be stopped to allow
power-up to complete. Once power-up completes, transmits are once
again enabled.
Runtime resume can complete at the same time a transmit request is
being handled, and previously there was a race between stopping and
restarting transmits. The STARTED flag was used to ensure the
stop request in the transmit path was skipped if the start request
in the runtime resume path had already occurred.
Now, the queue is *always* stopped in the transmit path, *before*
determining whether power is ACTIVE. If power is found to already
be active (or if the socket buffer is gets dropped), transmit is
re-enabled. Otherwise it will (always) be enabled after runtime
resume completes.
The race between transmit and runtime resume no longer exists, so
there is no longer any need to maintain the STARTED flag.
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240130192305.250915-4-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are a number of flags used in the IPA driver to attempt to
manage race conditions that can occur between runtime resume and
netdev transmit. If we disable TX before requesting power, we can
avoid these races entirely, simplifying things considerably.
This patch implements the main change, disabling transmit always in
the net_device->ndo_start_xmit() callback, then re-enabling it again
whenever we find power is active (or when we drop the skb).
The patches that follow will refactor the "old" code to the point
that most of it can be eliminated.
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240130192305.250915-3-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Rather than repeatedly looking up the endpoints in the name map,
save the modem TX and RX endpoint pointers in the netdev private
area.
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240130192305.250915-2-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts or adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the arm random config file, kconfig option 'CONFIG_AEABI' is
disabled which results in adding the compiler flag '-mabi=apcs-gnu'.
This causes the compiler to add padding in virtchnl2_ptype
structure to align it to 8 bytes, resulting in the following
size check failure:
include/linux/build_bug.h:78:41: error: static assertion failed: "(6) == sizeof(struct virtchnl2_ptype)"
78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
| ^~~~~~~~~~~~~~
include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
| ^~~~~~~~~~~~~~~
drivers/net/ethernet/intel/idpf/virtchnl2.h:26:9: note: in expansion of macro 'static_assert'
26 | static_assert((n) == sizeof(struct X))
| ^~~~~~~~~~~~~
drivers/net/ethernet/intel/idpf/virtchnl2.h:982:1: note: in expansion of macro 'VIRTCHNL2_CHECK_STRUCT_LEN'
982 | VIRTCHNL2_CHECK_STRUCT_LEN(6, virtchnl2_ptype);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
Avoid the compiler padding by using "__packed" structure
attribute for the virtchnl2_ptype struct. Also align the
structure by using "__aligned(2)" for better code optimization.
Fixes: 0d7502a9b4a7 ("virtchnl: add virtchnl version 2 ops")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312220250.ufEm8doQ-lkp@intel.com
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20240131222241.2087516-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In commit ac5047671758 ("hv_netvsc: Disable NAPI before closing the
VMBus channel"), napi_disable was getting called for all channels,
including all subchannels without confirming if they are enabled or not.
This caused hv_netvsc getting hung at napi_disable, when netvsc_probe()
has finished running but nvdev->subchan_work has not started yet.
netvsc_subchan_work() -> rndis_set_subchannel() has not created the
sub-channels and because of that netvsc_sc_open() is not running.
netvsc_remove() calls cancel_work_sync(&nvdev->subchan_work), for which
netvsc_subchan_work did not run.
netif_napi_add() sets the bit NAPI_STATE_SCHED because it ensures NAPI
cannot be scheduled. Then netvsc_sc_open() -> napi_enable will clear the
NAPIF_STATE_SCHED bit, so it can be scheduled. napi_disable() does the
opposite.
Now during netvsc_device_remove(), when napi_disable is called for those
subchannels, napi_disable gets stuck on infinite msleep.
This fix addresses this problem by ensuring that napi_disable() is not
getting called for non-enabled NAPI struct.
But netif_napi_del() is still necessary for these non-enabled NAPI struct
for cleanup purpose.
Call trace:
[ 654.559417] task:modprobe state:D stack: 0 pid: 2321 ppid: 1091 flags:0x00004002
[ 654.568030] Call Trace:
[ 654.571221] <TASK>
[ 654.573790] __schedule+0x2d6/0x960
[ 654.577733] schedule+0x69/0xf0
[ 654.581214] schedule_timeout+0x87/0x140
[ 654.585463] ? __bpf_trace_tick_stop+0x20/0x20
[ 654.590291] msleep+0x2d/0x40
[ 654.593625] napi_disable+0x2b/0x80
[ 654.597437] netvsc_device_remove+0x8a/0x1f0 [hv_netvsc]
[ 654.603935] rndis_filter_device_remove+0x194/0x1c0 [hv_netvsc]
[ 654.611101] ? do_wait_intr+0xb0/0xb0
[ 654.615753] netvsc_remove+0x7c/0x120 [hv_netvsc]
[ 654.621675] vmbus_remove+0x27/0x40 [hv_vmbus]
Cc: stable@vger.kernel.org
Fixes: ac5047671758 ("hv_netvsc: Disable NAPI before closing the VMBus channel")
Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/1706686551-28510-1-git-send-email-schakrabarti@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Invoking the make_tx_response() / push_tx_responses() pair with no lock
held would be acceptable only if all such invocations happened from the
same context (NAPI instance or dealloc thread). Since this isn't the
case, and since the interface "spec" also doesn't demand that multicast
operations may only be performed with no in-flight transmits,
MCAST_{ADD,DEL} processing also needs to acquire the response lock
around the invocations.
To prevent similar mistakes going forward, "downgrade" the present
functions to private helpers of just the two remaining ones using them
directly, with no forward declarations anymore. This involves renaming
what so far was make_tx_response(), for the new function of that name
to serve the new (wrapper) purpose.
While there,
- constify the txp parameters,
- correct xenvif_idx_release()'s status parameter's type,
- rename {,_}make_tx_response()'s status parameters for consistency with
xenvif_idx_release()'s.
Fixes: 210c34dcd8d9 ("xen-netback: add support for multicast control")
Cc: stable@vger.kernel.org
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Link: https://lore.kernel.org/r/980c6c3d-e10e-4459-8565-e8fbde122f00@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fill-up the lock status error value properly.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Pass additional argunent status_error over lock_status_get()
so drivers can fill it up. In case they do, expose the value over
previously introduced attribute to user. Do it only in case the
current lock_status is either "unlocked" or "holdover".
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
XDP queues are created/destroyed when a XDP program
is attached/detached. In current driver xdp_queues are not
getting destroyed on program exit due to incorrect xdp_queue
and tot_tx_queue count values.
This patch fixes the issue by setting tot_tx_queue and xdp_queue
count to correct values. It also fixes xdp.data_hard_start address.
Fixes: 06059a1a9a4a ("octeontx2-pf: Add XDP support to netdev PF")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Link: https://lore.kernel.org/r/20240130120610.16673-1-gakula@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This patch reduces some of the lines by removing newlines
where more variables or print strings can be pushed back
to the previous line while still adhering to the styling
guidelines.
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Fail queue size calculation when the device returns maximum
TX/RX queue sizes that are smaller than the allowed minimum.
Signed-off-by: Osama Abboud <osamaabb@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The netif_* functions are used by the driver to log events into the
kernel ring (dmesg) similar to the netdev_* ones. Unlike the latter,
the netif_* function family allow the user to choose what events get
logged using ethtool:
sudo ethtool -s [interface] msglvl [msg_type] on
By default the events which get logged are slow-path related and aren't
printed often (e.g. interface up related prints). This patch removes the
NETIF_MSG_TX_DONE type (called every TX completion polling) from the
defaults and adds NETIF_MSG_IFDOWN instead as it makes more sensible
defaults.
This patch also transforms ena_down() print from netif_info into
netif_dbg (same as the analogue print in ena_up()) as it suits it
better.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Move skb_tx_timestamp() closer to the actual time the driver sends the
packets to the device.
Signed-off-by: Osama Abboud <osamaabb@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The function responsible for polling TX completions might not receive
the CPU resources it needs due to higher priority tasks running on the
requested core.
The driver might not be able to recognize such cases, but it can use its
state to suspect that they happened. If both conditions are met:
- napi hasn't been executed more than the TX completion timeout value
- napi is scheduled (meaning that we've received an interrupt)
Then it's more likely that the napi handler isn't scheduled because of
an overloaded CPU.
It was decided that for this case, the driver would wait twice as long
as the regular timeout before scheduling a reset.
The driver uses ENA_REGS_RESET_SUSPECTED_POLL_STARVATION reset reason to
indicate this case to the device.
This patch also adds more information to the ena_tx_timeout() callback.
This function is called by the kernel when it detects that a specific TX
queue has been closed for too long.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The print was re-worded to a more informative one.
Signed-off-by: Shahar Itzko <itzko@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The functionality was added to allow the drivers to create an
SQ and CQ of different sizes.
When the RX/TX SQ and CQ have the same size, such update isn't
necessary as the device can safely assume it doesn't override
unprocessed completions. However, if the SQ is larger than the CQ,
the device might "have" more completions it wants to update about
than there's room in the CQ.
There's no support for different SQ and CQ sizes, therefore,
removing the API and its usage.
'____cacheline_aligned' compiler attribute was added to
'struct ena_com_io_cq' to ensure that the removal of the
'cq_head_db_reg' field doesn't change the cache-line layout
of this struct.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Dynamic Interrupt Moderation (DIM) is a technique
designed to balance the need for timely data processing
with the desire to minimize CPU overhead.
Instead of generating an interrupt for every received
packet, the system can dynamically adjust the rate at
which interrupts are generated based on the incoming
traffic patterns.
Enabling DIM by default to improve the user experience.
DIM can be turned on/off through ethtool:
`ethtool -C <interface> adaptive-rx <on/off>`
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Osama Abboud <osamaabb@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
A few changes for better readability and style
1. Adding / Removing newlines
2. Removing an unnecessary and confusing comment
3. Using an existing variable rather than re-checking a field
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Remove io_sq->header_addr field because it is no longer
in use.
LLQ was updated to support a bounce buffer so there is
no need in saving the header address of the sq.
Signed-off-by: Nati Koler <nkoler@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|