Age | Commit message (Collapse) | Author |
|
Existing MANA design assigns IRQ to every CPU, including sibling
hyper-threads. This may cause multiple IRQs to be active simultaneously
in the same core and may reduce the network performance.
Improve the performance by assigning IRQ to non sibling CPUs in local
NUMA node. The performance improvement we are getting using ntttcp with
following patch is around 15 percent against existing design and
approximately 11 percent, when trying to assign one IRQ in each core
across NUMA nodes, if enough cores are present.
The change will improve the performance for the system
with high number of CPU, where number of CPUs in a node is more than
64 CPUs. Nodes with 64 CPUs or less than 64 CPUs will not be affected
by this change.
The performance study was done using ntttcp tool in Azure.
The node had 2 nodes with 32 cores each, total 128 vCPU and number of channels
were 32 for 32 RX rings.
The below table shows a comparison between existing design and new
design:
IRQ node-num core-num CPU performance(%)
1 0 | 0 0 | 0 0 | 0-1 0
2 0 | 0 0 | 1 1 | 2-3 3
3 0 | 0 1 | 2 2 | 4-5 10
4 0 | 0 1 | 3 3 | 6-7 15
5 0 | 0 2 | 4 4 | 8-9 15
...
...
25 0 | 0 12| 24 24| 48-49 12
...
32 0 | 0 15| 31 31| 62-63 12
33 0 | 0 16| 0 32| 0-1 10
...
64 0 | 0 31| 31 63| 62-63 0
Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Souradeep investigated that the driver performs faster if IRQs are
spread on CPUs with the following heuristics:
1. No more than one IRQ per CPU, if possible;
2. NUMA locality is the second priority;
3. Sibling dislocality is the last priority.
Let's consider this topology:
Node 0 1
Core 0 1 2 3
CPU 0 1 2 3 4 5 6 7
The most performant IRQ distribution based on the above topology
and heuristics may look like this:
IRQ Nodes Cores CPUs
0 1 0 0-1
1 1 1 2-3
2 1 0 0-1
3 1 1 2-3
4 2 2 4-5
5 2 3 6-7
6 2 2 4-5
7 2 3 6-7
The irq_setup() routine introduced in this patch leverages the
for_each_numa_hop_mask() iterator and assigns IRQs to sibling groups
as described above.
According to [1], for NUMA-aware but sibling-ignorant IRQ distribution
based on cpumask_local_spread() performance test results look like this:
./ntttcp -r -m 16
NTTTCP for Linux 1.4.0
---------------------------------------------------------
08:05:20 INFO: 17 threads created
08:05:28 INFO: Network activity progressing...
08:06:28 INFO: Test run completed.
08:06:28 INFO: Test cycle finished.
08:06:28 INFO: ##### Totals: #####
08:06:28 INFO: test duration :60.00 seconds
08:06:28 INFO: total bytes :630292053310
08:06:28 INFO: throughput :84.04Gbps
08:06:28 INFO: retrans segs :4
08:06:28 INFO: cpu cores :192
08:06:28 INFO: cpu speed :3799.725MHz
08:06:28 INFO: user :0.05%
08:06:28 INFO: system :1.60%
08:06:28 INFO: idle :96.41%
08:06:28 INFO: iowait :0.00%
08:06:28 INFO: softirq :1.94%
08:06:28 INFO: cycles/byte :2.50
08:06:28 INFO: cpu busy (all) :534.41%
For NUMA- and sibling-aware IRQ distribution, the same test works
15% faster:
./ntttcp -r -m 16
NTTTCP for Linux 1.4.0
---------------------------------------------------------
08:08:51 INFO: 17 threads created
08:08:56 INFO: Network activity progressing...
08:09:56 INFO: Test run completed.
08:09:56 INFO: Test cycle finished.
08:09:56 INFO: ##### Totals: #####
08:09:56 INFO: test duration :60.00 seconds
08:09:56 INFO: total bytes :741966608384
08:09:56 INFO: throughput :98.93Gbps
08:09:56 INFO: retrans segs :6
08:09:56 INFO: cpu cores :192
08:09:56 INFO: cpu speed :3799.791MHz
08:09:56 INFO: user :0.06%
08:09:56 INFO: system :1.81%
08:09:56 INFO: idle :96.18%
08:09:56 INFO: iowait :0.00%
08:09:56 INFO: softirq :1.95%
08:09:56 INFO: cycles/byte :2.25
08:09:56 INFO: cpu busy (all) :569.22%
[1] https://lore.kernel.org/all/20231211063726.GA4977@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Co-developed-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This commit introduces support for the KSZ8567, a robust 7-port
Ethernet switch. The KSZ8567 features two RGMII/MII/RMII interfaces,
each capable of gigabit speeds, complemented by five 10/100 Mbps
MAC/PHYs.
Signed-off-by: Philippe Schenker <philippe.schenker@impulsing.ch>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240130083419.135763-2-dev@pschenker.ch
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently the teardown/setup flow for driver probe/remove is quite
a bit different from the reset flows in pdsc_fw_down()/pdsc_fw_up().
One key piece that's missing are the calls to pci_alloc_irq_vectors()
and pci_free_irq_vectors(). The pcie reset case is calling
pci_free_irq_vectors() on reset_prepare, but not calling the
corresponding pci_alloc_irq_vectors() on reset_done. This is causing
unexpected/unwanted interrupt behavior due to the adminq interrupt
being accidentally put into legacy interrupt mode. Also, the
pci_alloc_irq_vectors()/pci_free_irq_vectors() functions are being
called directly in probe/remove respectively.
Fix this inconsistency by making the following changes:
1. Always call pdsc_dev_init() in pdsc_setup(), which calls
pci_alloc_irq_vectors() and get rid of the now unused
pds_dev_reinit().
2. Always free/clear the pdsc->intr_info in pdsc_teardown()
since this structure will get re-alloced in pdsc_setup().
3. Move the calls of pci_free_irq_vectors() to pdsc_teardown()
since pci_alloc_irq_vectors() will always be called in
pdsc_setup()->pdsc_dev_init() for both the probe/remove and
reset flows.
4. Make sure to only create the debugfs "identity" entry when it
doesn't already exist, which it will in the reset case because
it's already been created in the initial call to pdsc_dev_init().
Fixes: ffa55858330f ("pds_core: implement pci reset handlers")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240129234035.69802-7-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
During reset the BARs might be accessed when they are
unmapped. This can cause unexpected issues, so fix it by
clearing the cached BAR values so they are not accessed
until they are re-mapped.
Also, make sure any places that can access the BARs
when they are NULL are prevented.
Fixes: 49ce92fbee0b ("pds_core: add FW update feature to devlink")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20240129234035.69802-6-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are multiple paths that can result in using the pdsc's
adminq.
[1] pdsc_adminq_isr and the resulting work from queue_work(),
i.e. pdsc_work_thread()->pdsc_process_adminq()
[2] pdsc_adminq_post()
When the device goes through reset via PCIe reset and/or
a fw_down/fw_up cycle due to bad PCIe state or bad device
state the adminq is destroyed and recreated.
A NULL pointer dereference can happen if [1] or [2] happens
after the adminq is already destroyed.
In order to fix this, add some further state checks and
implement reference counting for adminq uses. Reference
counting was used because multiple threads can attempt to
access the adminq at the same time via [1] or [2]. Additionally,
multiple clients (i.e. pds-vfio-pci) can be using [2]
at the same time.
The adminq_refcnt is initialized to 1 when the adminq has been
allocated and is ready to use. Users/clients of the adminq
(i.e. [1] and [2]) will increment the refcnt when they are using
the adminq. When the driver goes into a fw_down cycle it will
set the PDSC_S_FW_DEAD bit and then wait for the adminq_refcnt
to hit 1. Setting the PDSC_S_FW_DEAD before waiting will prevent
any further adminq_refcnt increments. Waiting for the
adminq_refcnt to hit 1 allows for any current users of the adminq
to finish before the driver frees the adminq. Once the
adminq_refcnt hits 1 the driver clears the refcnt to signify that
the adminq is deleted and cannot be used. On the fw_up cycle the
driver will once again initialize the adminq_refcnt to 1 allowing
the adminq to be used again.
Fixes: 01ba61b55b20 ("pds_core: Add adminq processing and commands")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20240129234035.69802-5-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The initial design for the adminq interrupt was done based
on client drivers having their own adminq and adminq
interrupt. So, each client driver's adminq isr would use
their specific adminqcq for the private data struct. For the
time being the design has changed to only use a single
adminq for all clients. So, instead use the struct pdsc for
the private data to simplify things a bit.
This also has the benefit of not dereferencing the adminqcq
to access the pdsc struct when the PDSC_S_STOPPING_DRIVER bit
is set and the adminqcq has actually been cleared/freed.
Fixes: 01ba61b55b20 ("pds_core: Add adminq processing and commands")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20240129234035.69802-4-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is a small window where pdsc_work_thread()
calls pdsc_process_adminq() and pdsc_process_adminq()
passes the PDSC_S_STOPPING_DRIVER check and starts
to process adminq/notifyq work and then the driver
starts a fw_down cycle. This could cause some
undefined behavior if the notifyqcq/adminqcq are
free'd while pdsc_process_adminq() is running. Use
cancel_work_sync() on the adminqcq's work struct
to make sure any pending work items are cancelled
and any in progress work items are completed.
Also, make sure to not call cancel_work_sync() if
the work item has not be initialized. Without this,
traces will happen in cases where a reset fails and
teardown is called again or if reset fails and the
driver is removed.
Fixes: 01ba61b55b20 ("pds_core: Add adminq processing and commands")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20240129234035.69802-3-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The PCIe reset handlers can run at the same time as the
health thread. This can cause the health thread to
stomp on the PCIe reset. Fix this by preventing the
health thread from running while a PCIe reset is happening.
As part of this use timer_shutdown_sync() during reset and
remove to make sure the timer doesn't ever get rearmed.
Fixes: ffa55858330f ("pds_core: implement pci reset handlers")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20240129234035.69802-2-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Almost all the QCA8081 PHY driver OPs are specific and only some of them
use the generic at803x.
To make the at803x code slimmer, move all the specific qca808x regs and
functions to a dedicated PHY driver.
Probe function and priv struct is reworked to allocate and use only the
qca808x specific data. Unused data from at803x PHY driver are dropped
from at803x priv struct.
Also a new Kconfig is introduced QCA808X_PHY, to compile the newly
introduced PHY driver for QCA8081 PHY.
As the Kconfig name starts with Qualcomm the same order is kept.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240129141600.2592-6-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move additional functions to shared library in preparation for qca808x
PHY Family to be detached from at803x driver.
Only the shared defines are moved to the shared qcom.h header.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240129141600.2592-5-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Deatch qca83xx PHY driver from at803x.
The QCA83xx PHYs implement specific function and doesn't use generic
at803x so it can be detached from the driver and moved to a dedicated
one.
Probe function and priv struct is reimplemented to allocate and use
only the qca83xx specific data. Unused data from at803x PHY driver
are dropped from at803x priv struct.
This is to make slimmer PHY drivers instead of including lots of bloat
that would never be used in specific SoC.
A new Kconfig flag QCA83XX_PHY is introduced to compile the new
introduced PHY driver.
As the Kconfig name starts with Qualcomm the same order is kept.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240129141600.2592-4-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Create and move functions to shared library in preparation for qca83xx
PHY Family to be detached from at803x driver.
Only the shared defines are moved to the shared qcom.h header.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240129141600.2592-3-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In preparation for addition of other Qcom PHY and to tidy things up,
move the at803x PHY driver to dedicated directory.
The same order in the Kconfig selection is saved.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240129141600.2592-2-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
An interrupt handler was added to the driver as well as functions
to enable interrupts at the phy.
There are several interrupts maskable at the phy, but only link change
interrupts are handled by the driver yet.
Signed-off-by: Andre Werner <andre.werner@systec-electronic.com>
Link: https://lore.kernel.org/r/20240129135734.18975-3-andre.werner@systec-electronic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If phydev->irq is set unconditionally, check
for valid interrupt handler or fall back to polling mode to prevent
nullptr exceptions in interrupt service routine.
Signed-off-by: Andre Werner <andre.werner@systec-electronic.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240129135734.18975-2-andre.werner@systec-electronic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Not all mv88e6xxx device support C45 read/write operations. Those
which do not return -EOPNOTSUPP. However, when phylib scans the bus,
it considers this fatal, and the probe of the MDIO bus fails, which in
term causes the mv88e6xxx probe as a whole to fail.
When there is no device on the bus for a given address, the pull up
resistor on the data line results in the read returning 0xffff. The
phylib core code understands this when scanning for devices on the
bus. C45 allows multiple devices to be supported at one address, so
phylib will perform a few reads at each address, so although thought
not the most efficient solution, it is a way to avoid fatal
errors. Make use of this as a minimal fix for stable to fix the
probing problems.
Follow up patches will rework how C45 operates to make it similar to
C22 which considers -ENODEV as a none-fatal, and swap mv88e6xxx to
using this.
Cc: stable@vger.kernel.org
Fixes: 743a19e38d02 ("net: dsa: mv88e6xxx: Separate C22 and C45 transactions")
Reported-by: Tim Menninger <tmenninger@purestorage.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240129224948.1531452-1-andrew@lunn.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
lan743x_ptp_request_tx_timestamp uses spin_lock_bh, but it is
only called from lan743x_tx_xmit_frame where all IRQs are
already disabled.
This fixes the "IRQs not enabled as expected" warning.
Signed-off-by: Lucas Tanure <tanure@linux.com>
Link: https://lore.kernel.org/r/20240128101849.107298-1-tanure@linux.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The company I2SE has been acquired a long time ago. Switch to
my private mail address before the I2SE account is deactivated.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
According to MODULE_LICENSE the driver is under a dual license.
So replace the BSD license text with the proper SPDX tag.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
All known SPI registers of the QCA700x are 16 bit long. So adjust
the formater width accordingly.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Most of the users doesn't know the expected signature of the QCA700x.
So provide it within the error message. Btw use lowercase for hex as
in the rest of the driver.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are two points with the calculation of RX buffer size which are
not optimal:
1. dev->mtu is a mutual parameter and it's currently initialized with
QCAFRM_MAX_MTU. But for RX buffer size calculation we always need the
maximum possible MTU. So better use the define directly.
2. This magic number 4 represent the hardware generated frame length
which is specific to SPI. We better replace this with the suitable
define.
There is no functional change.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently qca_spi reserves enough space for 4 complete Ethernet over SPI
frames in the receive buffer. Unfortunately this is hidden under a magic
number. So replace it with a more self explaining define.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
All defines in qca_spi.h except of the two ring limit defines have
a QCASPI prefix. Since the name is quite generic add the QCASPI prefix
to avoid possible name conflicts.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This member is never used. So drop it.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
qcafrm_fsm_decode has the almost the same function description in
qca_7k_common.c. So drop the comment here.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The skb spare room needs to be expanded for SPI header, footer
and possible padding within the TX path. So announce the necessary
space in order to avoid expensive skb_copy_expand calls.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The functions qcaspi_netdev_open/close are responsible of request &
free of the SPI interrupt, which wasn't the best choice because
allocation problems are discovered not during probe. So let us split
IRQ allocation & enabling, so we can take advantage of a device
managed IRQ.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Directly storing the result of kthread_run within the private
driver data makes it harder to identify if the pointer has a
running thread or not. So better use a local variable for
the result check and we don't have to care about error pointer
in the rest of the code.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We better not rely on that spi_thread points to a running
thread. So add an check for this.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On lan966x, it is possible to use debugfs to print different information
about the VCAPs. Information like, if it is enabled, how the ports are
configured, print the actual rules. The issue is that when printing how
the ports are configured for IS1 lookups, it was parsing the wrong
register to get this information. The fix consists in reading the
correct register that contains this information.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
bitmaps
Change genphy_c45_ethtool_[get|set]_eee to use EEE linkmode bitmaps.
This is a prerequisite for adding support for EEE modes beyond bit 31.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is in preparation of using the existing names for linkmode
bitmaps.
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
side
In order to pass EEE link modes beyond bit 32 to userspace we have to
complement the 32 bit bitmaps in struct ethtool_eee with linkmode
bitmaps. Therefore, similar to ethtool_link_settings and
ethtool_link_ksettings, add a struct ethtool_keee. In a first step
it's an identical copy of ethtool_eee. This patch simply does a
s/ethtool_eee/ethtool_keee/g for all users.
No functional change intended.
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Report taprio offload status. This includes per txq and global
counters of window_drops and tx_overruns.
Window_drops count include count of frames dropped because of
queueMaxSDU setting and HLBF error. Transmission overrun counter
inform the user application whether any packets are currently being
transmitted on a particular queue during a gate-close event.DWMAC IPs
takes care Transmission overrun won't happen hence this is always 0.
Signed-off-by: Rohan G Thomas <rohan.g.thomas@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Keep per Tx-queue error count on Head-Of-Line Blocking due to frame
size(HLBF) error. The MAC raises HLBF error on one or more queues
when none of the time Intervals of open-gates in the GCL is greater
than or equal to the duration needed for frame transmission and by
default drops those packets that causes HLBF error. EST_FRM_SZ_ERR
register provides the One Hot encoded Queue numbers that have the
Frame Size related error.
Signed-off-by: Rohan G Thomas <rohan.g.thomas@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for configuring queueMaxSDU. As DWMAC IPs doesn't support
queueMaxSDU table handle this in the SW. The maximum 802.3 frame size
that is allowed to be transmitted by any queue is queueMaxSDU +
16 bytes (i.e. 6 bytes SA + 6 bytes DA + 4 bytes FCS).
Inspired from intel i225 driver.
Signed-off-by: Rohan G Thomas <rohan.g.thomas@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a missing quirk to enable support for the StarFive JH7100 SoC.
Additionally, for greater flexibility in operation, allow using the
rgmii-rxid and rgmii-txid phy modes.
Co-developed-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-01-29 (e1000e, ixgbe)
This series contains updates to e1000e and ixgbe drivers.
Jake corrects values used for maximum frequency adjustment for e1000e.
Christophe Jaillet adjusts error handling path so that semaphore is
released on ixgbe.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ixgbe: Fix an error handling path in ixgbe_read_iosf_sb_reg_x550()
e1000e: correct maximum frequency adjustment values
====================
Link: https://lore.kernel.org/r/20240129185240.787397-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Class-based I2C probing requires detect() and address_list to be
set in the I2C client driver, see checks in i2c_detect().
It's misleading to declare I2C_CLASS_HWMON support if this
precondition isn't met.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/77b5ab8e-20f2-4310-bd89-57db99e2f53b@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
mlxsw driver uses 'unsigned int' for reference counters in several
structures. Instead, use refcount_t type which allows us to catch overflow
and underflow issues. Change the type of the counters and use the
appropriate API.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
mlxsw_sp stores an array of LAGs. When a port joins a LAG, in case that
this LAG is already in use, we only have to increase the reference counter.
Otherwise, we have to search for an unused LAG ID and configure it in
hardware. When a port leaves a LAG, we have to destroy it only for the last
user. This code can be simplified, for such requirements we usually add
get() and put() functions which create and destroy the object.
Add mlxsw_sp_lag_{get,put}() and use them. These functions take care of
the reference counter and hardware configuration if needed. Change the
reference counter to refcount_t type which catches overflow and underflow
issues.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently, the function mlxsw_sp_lag_index_get() is called twice - first
as part of NETDEV_PRECHANGEUPPER event and later as part of
NETDEV_CHANGEUPPER. This function will be changed in the next patch. To
simplify the code, call it only once as part of NETDEV_CHANGEUPPER
event and set an error message using 'extack' in case of failure.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The maximum number of LAGs is queried from core several times. It is
used to allocate LAG array, and then to iterate over it. In addition, it
is used for PGT initialization. To simplify the code, instead of
querying it several times, store the value as part of 'mlxsw_sp' and use
it.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
A next patch will add mlxsw_sp_lag_{get,put}() functions to handle LAG
reference counting and create/destroy it only for first user/last user.
Remove mlxsw_sp_lag_get() function and access LAG array directly.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The structure mlxsw_sp_upper is used only as LAG. Rename it to
mlxsw_sp_lag and move it to spectrum.c file, as it is used only there.
Move the function mlxsw_sp_lag_get() with the structure.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
TSO and TBS cannot coexist. For now we set i.MX Ethernet QOS controller to
use the first TX queue with TSO and the rest for TBS.
TX queues with TBS can support etf qdisc hw offload.
Signed-off-by: Esben Haabendal <esben@geanix.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
With the dma conf being reallocated on each call to stmmac_open(), any
information in there is lost, unless we specifically handle it.
The STMMAC_TBS_EN bit is set when adding an etf qdisc, and the etf qdisc
therefore would stop working when link was set down and then back up.
Fixes: ba39b344e924 ("net: ethernet: stmicro: stmmac: generate stmmac dma conf before open")
Cc: stable@vger.kernel.org
Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When setting or getting PHC time, the higher bits of the second time (>32
bits) they were ignored. Meaning that setting some time in the future like
year 2150, it was failing to set this.
The issue can be reproduced like this:
# phc_ctl /dev/ptp1 set 10000000000
phc_ctl[12.290]: set clock time to 10000000000.000000000 or Sat Nov 20 17:46:40 2286
# phc_ctl /dev/ptp1 get
phc_ctl[15.309]: clock time is 1410065411.018055420 or Sun Sep 7 04:50:11 2014
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Divya Koppera <divya.koppera@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240126073042.1845153-1-horatiu.vultur@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|