Age | Commit message | Author |
|
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
|
|
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
iMX6 SoCs have an IP accelerator which can pad and strip two bytes in
the FIFO at the beginning of each packet. Since the FEC has alignment
restrictions on the buffer addresses, this feature is particularly
interesting for receive as it allows us to receive an appropriately
aligned packet for IP.
This allows us to reduce the CPU overhead by avoiding copying large
packets - and in the spirit of all network drivers which use this trick,
we provide a copybreak tunable which allows the size threshold below
which packets are copied to be adjusted.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
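A rough sketch of the two ideas in this message, not the driver's actual code. With a 14-byte Ethernet header, the IP header only lands on a 4-byte boundary if two bytes of padding precede the frame, and a copybreak threshold decides whether a packet is copied into a fresh buffer or passed up in place. The function names, the 4-byte alignment target and the "copy if not larger than the threshold" rule are illustrative assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

#define ETH_HEADER_LEN 14u  /* destination MAC + source MAC + ethertype */

/* True if the IP header lands on a 4-byte boundary when the frame is
 * stored at a 4-byte aligned buffer address preceded by `pad` bytes. */
static bool rx_ip_header_aligned(size_t pad)
{
    return (pad + ETH_HEADER_LEN) % 4 == 0;
}

/* Hypothetical copybreak policy: packets no larger than the threshold are
 * copied into a new buffer; larger ones are handed up without copying. */
static bool rx_should_copy(size_t pkt_len, size_t copybreak)
{
    return pkt_len <= copybreak;
}
```

With no padding the IP header starts at offset 14, which is misaligned; with the two-byte pad it starts at offset 16.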
|
|
Move the syncing for the DMA buffer back to the device immediately
after we've finished copying data from it. This means the logic in
the receive path handling the copying of data is now located together.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Move the final receive code sequence to a separate function.
Performance tests show that having lots of code integrated into one
big function can make the driver performance very dependent on
compiler behaviour, and more reliable performance can be obtained
by separating this out.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Implement this as two distinct code paths: this allows non-tagged
traffic to be copied in one go, whereas tagged traffic needs to be
copied in two separate chunks.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
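The two copy paths could look roughly like the sketch below. It assumes the 4-byte VLAN tag sits directly after the two MAC addresses and is dropped from the copied data; the function names and constants are illustrative and not taken from the driver.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_ADDRS_LEN 12u  /* destination + source MAC addresses */
#define VLAN_TAG_LEN   4u  /* TPID (2 bytes) + TCI (2 bytes) */

/* Copy an untagged frame in a single pass. Returns bytes written. */
static size_t copy_untagged(uint8_t *dst, const uint8_t *src, size_t len)
{
    memcpy(dst, src, len);
    return len;
}

/* Copy a tagged frame while dropping the 4-byte VLAN tag: first the MAC
 * addresses, then everything after the tag. Returns bytes written.
 * The caller must ensure len >= MAC_ADDRS_LEN + VLAN_TAG_LEN. */
static size_t copy_tagged(uint8_t *dst, const uint8_t *src, size_t len)
{
    memcpy(dst, src, MAC_ADDRS_LEN);
    memcpy(dst + MAC_ADDRS_LEN, src + MAC_ADDRS_LEN + VLAN_TAG_LEN,
           len - MAC_ADDRS_LEN - VLAN_TAG_LEN);
    return len - VLAN_TAG_LEN;
}
```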
|
|
Rather than having the vlan tag processing scattered in multiple
different places, we can now localize the reading of the tag with
storing the tag in the skb. Group this code together.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Move the skb allocation before we sync the buffer for CPU access. In
order to do this, we need to detect whether the packet contains a VLAN
header so we can adjust the packet size appropriately.
This allows us to tidy the code a little in the following patches.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Move the quirks bitmap into the fec_enet_private data structure so we don't
need to keep reading it via a chain of pointers.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
fec_enet_bd_init() already frees the transmit skbuffs, so there's no
need for fec_restart() to do this again.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
The driver has had a memory leak for quite some time: when the device is
probed, we allocate a page for the descriptor rings, but we never free
this.
Rather than trying to free it on the various probe failure paths, move
its allocation to device open time - we have to restart the FEC for this
to take effect. This also means we can free the descriptor rings when
we clean up in the device close function, which gives this a nice
symmetry.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Add scatter-gather support for SKB transmission. This allows the
driver to make use of GSO, which when enabled allows the iMX6Q to
increase TCP transmission throughput from about 320Mbps to 420Mbps,
measured with iperf 2.0.5.
We adjust the minimum transmit ring space according to whether SG
support is enabled or not. This allows non-SG configurations to avoid
the tx ring reservation necessary for SG, thereby making full use of
their available ring (since non-SG requires just one tx ring entry per
packet.)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
This reverts commit d5e42c86e07935f3cab4dcfca787a733146d0c70
("net: fec: Don't clear IPV6 header checksum field when IP accelerator enable").
This reverts commit 79f339125ea316e910220e5f5b4ad30370f4de85
("net: fec: Add software TSO support"), as 96c50caa5148 ("net: fec:
Enable IP header hardware checksum") causes a regression with IPv6 by
overwriting the location where the IPv4 header checksum would be:
- 0x0000: 6000 0000 0028 0640 fd8f 7570 feb6 0001 `....(.@..up....
+ 0x0000: 6000 0000 0028 0640 fd8f 0000 feb6 0001 `....(.@........
This reverts commit 96c50caa5148e0e0a077672574785700885c6764
("net: fec: Enable IP header hardware checksum"), as it causes a
regression with IPv6 by overwriting the location where the IPv4
header checksum would be:
- 0x0000: 6000 0000 0028 0640 fd8f 7570 feb6 0001 `....(.@..up....
+ 0x0000: 6000 0000 0028 0640 fd8f 0000 feb6 0001 `....(.@........
|
|
Add ethtool support to retrieve and set the transmit and receive ring
sizes. This allows runtime tuning of these parameters, which can
result in better throughput with gigabit parts.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Rather than keep subtracting four off the packet length throughout the
receive path, do it at the point we read the packet length from the
descriptor.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Avoid checking for the enhanced buffer descriptor flag and the receive
checksumming/vlan flags - we now only set these feature flags if we
already know that we have enhanced buffer descriptors.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Only set the vlan tag and ip checksumming options if we have enhanced
buffer descriptors - enhanced descriptors are required for these
features.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
The receive checksum flag is only changed upon ethtool configuration,
so move it into the flags member. In any case, this is checked
alongside the BUFDESC_EX flag.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Add a new flags field to contain the mostly static driver configuration,
the first of which is the bufdesc_ex flag.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Both fec_restart() and fec_stop() perform a reset of the FEC. This
reset can result in an in-progress MDIO transfer being terminated, which
can lead to the MDIO read/write functions timing out. Add some locking
to prevent this occurring.
We can't use a spinlock for this as the MDIO accessor functions use
wait_for_completion_timeout(), which sleeps.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
There is a commonality to fec_enet_mdio_read() and fec_enet_mdio_write()
which can be factored out. Factor that commonality out, since we need
to add some locking to prevent resets interfering with MDIO accesses.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
When SG is enabled, we must ensure that there are up to 1+MAX_SKB_FRAGS
free entries in the ring. When SG is disabled, this can be reduced to
one, and we can allow smaller rings. Adjust this setting according to
the state of SG.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
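A minimal sketch of the reservation rule described here. The value used for the fragment limit is an assumption for illustration only, since the kernel's MAX_SKB_FRAGS depends on configuration, and the function names are hypothetical.

```c
#include <stdbool.h>

/* Assumed value for illustration only; in the kernel MAX_SKB_FRAGS
 * depends on the configuration and page size. */
#define SKETCH_MAX_SKB_FRAGS 17

/* Minimum number of free tx ring entries needed before queueing a packet. */
static unsigned int tx_min_free(bool sg_enabled)
{
    return sg_enabled ? 1u + SKETCH_MAX_SKB_FRAGS : 1u;
}

/* A packet may be queued only if enough ring entries remain. */
static bool tx_can_queue(unsigned int free_entries, bool sg_enabled)
{
    return free_entries >= tx_min_free(sg_enabled);
}
```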
|
|
Extend the previous commit to the receive descriptor ring as well. This
gets rid of the two nextdesc/prevdesc functions, since we now just need
to get the descriptor for an index instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Maintaining the transmit ring position via pointers is inefficient,
especially when it involves complex pointer manipulation, and overlap
with the receive descriptor logic.
Re-implement this using indexes, and a single function which returns
the descriptor (appropriate to the descriptor type). As an additional
benefit, using a union allows cleaner access to the descriptor.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
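The index-based lookup might be sketched as follows. The structure layouts and field names are simplified stand-ins rather than the driver's definitions; the point is a single accessor that picks the stride for the active descriptor type and returns a union pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified descriptor layouts; field names are illustrative. */
struct bufdesc {
    uint16_t datlen;
    uint16_t status;
    uint32_t bufaddr;
};

struct bufdesc_ex {
    struct bufdesc desc;   /* legacy part comes first */
    uint32_t esc;
    uint32_t prot;
    uint32_t bdu;
    uint32_t ts;
    uint16_t res0[4];
};

/* One pointer type that can be viewed as either layout. */
union bufdesc_u {
    struct bufdesc bd;
    struct bufdesc_ex ebd;
};

struct ring {
    void *base;          /* start of the descriptor memory */
    unsigned int size;   /* number of descriptors */
    int extended;        /* non-zero if enhanced descriptors are in use */
};

/* Return the descriptor at `index`, using the stride of the active layout. */
static union bufdesc_u *ring_get(const struct ring *r, unsigned int index)
{
    size_t stride = r->extended ? sizeof(struct bufdesc_ex)
                                : sizeof(struct bufdesc);
    return (union bufdesc_u *)((uint8_t *)r->base + (size_t)index * stride);
}

/* Advance an index with wrap-around. */
static unsigned int ring_next(const struct ring *r, unsigned int index)
{
    return index + 1 == r->size ? 0 : index + 1;
}
```

Callers then keep only small integer indexes and call `ring_get()` when they need the descriptor itself.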
|
|
Using a union gives clearer C code than the existing solution, and
allows the removal of some odd code from the receive path whose
purpose was to merely store the enhanced buffer descriptor.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Just as we need a barrier in the transmit path, we also need a barrier
in the receive path to ensure that we don't modify a handed over
descriptor.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Ensure that the writes to the descriptor data are visible to the hardware
before the descriptor is handed over to the hardware.
Having discussed this with Will Deacon, we need a wmb() between writing
the descriptor data and handing the descriptor over to the hardware.
The corresponding rmb() is in the ethernet hardware.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
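As a userspace analogy only: the ordering requirement resembles a C11 release store, where the plain writes to the descriptor fields must become visible before the ownership bit is set. The driver itself uses wmb() against a DMA device rather than C11 atomics, and the bit value and structure below are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_TX_READY 0x8000u  /* hypothetical "owned by hardware" bit */

struct tx_desc {
    uint32_t bufaddr;
    uint16_t datlen;
    _Atomic uint16_t status;
};

/* Fill in the descriptor, then publish it. The release store orders the
 * earlier plain writes before the status update, which is the role wmb()
 * plays in the driver. */
static void tx_desc_hand_over(struct tx_desc *d, uint32_t addr, uint16_t len)
{
    d->bufaddr = addr;
    d->datlen = len;
    atomic_store_explicit(&d->status, SKETCH_TX_READY, memory_order_release);
}
```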
|
|
The FEC maintains two pointers into the transmit ring:
  next  - this is the insertion point in the ring, and points at a free
          entry which can be written to.
  dirty - this is the last dirty entry which we successfully cleaned, and
          is always incremented prior to cleaning an entry.
The calculation used for the number of free entries is slightly buggy:
    entries = dirty - next - 1;
    if (entries < 0)
        entries += size;
Let's take some examples:
At ring initialisation, dirty is set to size-1, and next is set to 0.
This gives:
entries = size-1 - 0 - 1 = size - 2 => size - 2
But hang on, we have no packets in the empty ring, so why "size - 2" ?
Let's also check if we back the pointers up by one position - so
dirty=size-2, next=size-1.
entries = size-2 - size-1 - 1 = -2 - -1 - 1 = -2 => size - 2
Okay, so that's the same. Now, what about the "ring full" criterion?
We can never completely fill the transmit ring, because a completely
full ring is indistinguishable from a completely empty ring. We
reserve one entry to permit us to keep a distinction.
Hence, our "full" case is when both pointers are equal, so dirty=size-1,
next=size-1.
entries = size-1 - size-1 - 1 = -1 => size - 1
This is where things break down - in this case, the function is not
returning the number of free entries in the ring, because it should
be zero!
Fix this by changing the calculation to something which reflects the
actual ring behaviour:
    entries = dirty - next;
    if (entries < 0)
        entries += size;
Plugging the above three cases into this gives:
entries = size-1 - 0 = size - 1 => size - 1
entries = size-2 - size-1 = -1 => size - 1
entries = size-1 - size-1 = 0 => 0
Here, we have more correct behaviour (remembering that we have reserved
one entry as described above).
The perverse thing is that every test at this function's call sites
almost took account of the off-by-one error. Let's fix this to have
saner semantics - returning the number of permissible free entries in
the ring which can then be compared using expected tests against our
required numbers of packets.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
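The three cases worked through in this message can be checked with a small standalone sketch. The two functions transcribe the old and new calculations as quoted; the function names are made up for illustration.

```c
/* Old calculation from the description above. */
static int tx_free_old(int dirty, int next, int size)
{
    int entries = dirty - next - 1;
    if (entries < 0)
        entries += size;
    return entries;
}

/* Corrected calculation. */
static int tx_free_new(int dirty, int next, int size)
{
    int entries = dirty - next;
    if (entries < 0)
        entries += size;
    return entries;
}
```

For a ring of 8 entries, the old version reports 7 free entries in the full case (dirty=7, next=7), while the new version reports 0 there and 7 for the empty ring.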
|
|
It is wasteful to keep checking whether we're stopped, and the number
of free packets to see whether we should restart the queue for each
entry in the ring which we process. Move this to the end of the
processing so we check once per ring clean.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
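A toy model of the change, assuming a queue that is restarted once enough entries are free. All names here are hypothetical, and the per-entry work is reduced to a counter increment.

```c
#include <stdbool.h>

struct txq {
    unsigned int free;        /* free ring entries */
    unsigned int min_free;    /* entries needed to accept another packet */
    bool stopped;             /* queue stopped by the transmit path */
    unsigned int wake_checks; /* how often the restart test ran */
};

/* Reclaim `completed` entries, then test for restart a single time. */
static void tx_clean(struct txq *q, unsigned int completed)
{
    for (unsigned int i = 0; i < completed; i++)
        q->free++;            /* per-entry work only */

    q->wake_checks++;
    if (q->stopped && q->free >= q->min_free)
        q->stopped = false;
}
```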
|
|
Clean up the various callsites to this function; most callsites
are using a temporary variable and immediately following it with
a test. There is no need for a temporary variable here.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Move the calculation of the transmit DMA ring address to fec_enet_init()
so the CPU and DMA ring address calculations are localised.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Add support for byte queue limits, which allows the network schedulers
to control packet latency and packet queues more accurately. Further
information on this feature can be found at
https://lwn.net/Articles/469652/
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
UDP performance is extremely poor. iperf reports UDP receive speeds in
the range of 50Mbps with a high packet loss. The interface typically
reports a few thousand overrun errors per iperf run. This is far from
satisfactory.
Adjust the receive FIFO thresholds to reduce the number of errors.
Firstly, we decrease RAFL (receive almost full threshold) to the minimum
value of 4, which gives us the maximum available space in the FIFO prior
to indicating a FIFO overrun condition.
Secondly, we adjust the RSEM value to send an XOFF pause frame early
enough that an in-progress transmission can complete without overflowing
the FIFO.
Document these registers and the changes being made in the driver.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
The FEC hardware in iMX6 is capable of separate control of each flow
control path: the receiver can be programmed via the receive control
register to detect flow control frames, and the transmitter can be
programmed via the receive FIFO thresholds to enable generation of
pause frames.
This means we can implement the full range of flow control: both
symmetric and asymmetric flow control. We support ethtool configuring
all options: forced manual mode, where each path can be controlled
individually, and autonegotiation mode.
In autonegotiation mode, the tx/rx enable bits can be used to influence
the outcome, though they don't precisely define which paths will be
enabled. One combination we don't support is "symmetric only" since we
can always configure each path independently, a case which Linux seems
to often get wrong.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
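For the autonegotiation case, the outcome is conventionally derived from the PAUSE and ASM_DIR bits advertised by each end. The sketch below follows the standard IEEE 802.3 pause resolution rule; it is an assumption that this is the rule applied here, not a copy of the driver's code, and the type and function names are illustrative.

```c
#include <stdbool.h>

struct pause_adv {
    bool pause;    /* symmetric PAUSE capability advertised */
    bool asym;     /* ASM_DIR (asymmetric direction) advertised */
};

struct pause_result {
    bool tx;       /* we may send pause frames */
    bool rx;       /* we honour received pause frames */
};

static struct pause_result pause_resolve(struct pause_adv local,
                                         struct pause_adv partner)
{
    struct pause_result r = { false, false };

    if (local.pause && partner.pause) {
        r.tx = r.rx = true;             /* symmetric flow control */
    } else if (local.asym && partner.asym) {
        if (local.pause)
            r.rx = true;                /* partner sends, we obey */
        else if (partner.pause)
            r.tx = true;                /* we send, partner obeys */
    }
    return r;
}
```

This shows why the advertised bits only influence the result: the enabled paths depend on what the link partner advertises as well.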
|
|
Verify that the checksum offset is inside the packet header before
we zero the entry. This ensures that we don't perform an out of
bounds write.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
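A bounds check of this kind might look like the sketch below. The function name and parameters are hypothetical, and a 16-bit checksum field is assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Zero the 16-bit checksum field at `csum_off` only if the whole field
 * lies within the first `hdr_len` bytes. Returns false if it does not. */
static bool clear_csum_field(uint8_t *pkt, size_t hdr_len, size_t csum_off)
{
    if (csum_off > hdr_len || hdr_len - csum_off < sizeof(uint16_t))
        return false;
    pkt[csum_off] = 0;
    pkt[csum_off + 1] = 0;
    return true;
}
```

The comparison is written as a subtraction on the known-valid side so that a very large offset cannot wrap around and pass the check.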
|
|
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
On transmit timeout or close, dirty transmit descriptors were not being
correctly cleaned: the only time that DMA mappings are cleaned is when
scanning the TX ring after a transmit interrupt. Fix this.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
The FEC receive error handling suffers from several problems:
1. When a FIFO overrun error occurs, the descriptor is closed and
reception stops. The remainder of the packet is discarded by the
MAC.
The documentation states that other status bits are invalid, and they
will be zero. However, practical experience on iMX6 shows this is
not the case - the CR (crc error) bit will also be set.
This leads to each FIFO overrun event incrementing both the fifo
error count and the crc error count, which makes the error statistics
less useful. Fix this by ignoring all other status bits if the FIFO
overrun bit is set, and add a comment to that effect.
2. A late collision invalidates all but the overrun condition; the
remaining error conditions must be ignored.
3. Despite accounting for errors, it continues to receive the errored
packets and pass them into the network stack as if they were
correctly received.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
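The priority order among the error bits can be sketched as below. The bit values, the statistics structure and the function name are all hypothetical; only the ordering (overrun first, then late collision, then the remaining errors, with errored packets dropped) comes from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bits; the real descriptor layout may differ. */
#define RXS_OVERRUN   0x01u
#define RXS_LATE_COL  0x02u
#define RXS_CRC       0x04u
#define RXS_LENGTH    0x08u

struct rx_stats {
    unsigned int fifo_errors;
    unsigned int late_collisions;
    unsigned int crc_errors;
    unsigned int length_errors;
};

/* Account for errors in `status`. Returns true if the packet is good and
 * may be passed to the network stack, false if it must be dropped. */
static bool rx_check_status(uint16_t status, struct rx_stats *st)
{
    if (status & RXS_OVERRUN) {
        /* Other bits are not trustworthy after an overrun. */
        st->fifo_errors++;
        return false;
    }
    if (status & RXS_LATE_COL) {
        /* A late collision invalidates the remaining error bits. */
        st->late_collisions++;
        return false;
    }
    if (status & (RXS_CRC | RXS_LENGTH)) {
        if (status & RXS_CRC)
            st->crc_errors++;
        if (status & RXS_LENGTH)
            st->length_errors++;
        return false;
    }
    return true;
}
```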
|
|
The current kernel hangs on i.MX6SX when the rootfs is mounted from MMC.
The root cause is that ptp uses a periodic timer to access the enet
registers even if the ipg clock is disabled.
The FEC ptp driver starts a periodic timer to read the 1588 counter
register in the ptp init function, which is called after the FEC driver
is probed.
To save power, after the FEC probe finishes, the FEC driver disables all
clocks, including the ipg clock that is needed for register access.
On i.MX5x and i.MX6q/dl/sl, FEC register accesses don't cause a system
hang when the ipg clock is disabled; they just return zero. On the
i.MX6sx SoC, however, they hang the system.
To avoid the issue, we need to check the ptp clock status before the ptp
timer reads the counter.
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This adds support for specifying the phy to be used with the fec in the
devicetree using the standard phy-handle property and also supports
fixed-link.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Get rid of the CONFIG_PM_SLEEP ifdef by annotating the suspend/resume functions
with '__maybe_unused' in order to keep the code simpler and shorter.
While at it, declare the suspend/resume functions in a single line.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Both transmit and receive use the same infrastructure for calculating
the packet timestamp. Rather than duplicating the code, provide a
function to do this common work. Model this function on the Intel
e1000e version, which avoids calling ns_to_ktime() within the spinlock;
the spinlock is critical for timecounter_cyc2time() but not
ns_to_ktime().
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove a useless status check in the transmit reap path - we have
already checked that the BD_ENET_TX_READY bit is clear, and as the
hardware only ever clears this bit, there is no way this test can ever
be true.
Acked-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When we time out on transmit, it would be useful to dump the transmit
ring, so we can see the ring state. This can be helpful to diagnose
the cause of transmit timeouts.
Acked-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This allows us to merge two separate preprocessor conditionals together.
Acked-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Clear any pending receive interrupt before we process a pending packet.
This helps to avoid any spurious interrupts being raised after we have
fully cleaned the receive ring, while still allowing an interrupt to be
raised if we receive another packet.
The position of this is critical: we must do this prior to reading the
next packet status to avoid potentially dropping an interrupt when a
packet is still pending.
Acked-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|