|
git://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux into for-linus-4.5
|
|
inode struct members that track cgroup writeback information
should be reinitialized when an inode gets allocated from the
kmem_cache. Otherwise, their old values remain and get used by the
new inode.
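As an illustration, the reinitialization amounts to something like the
following sketch, assuming the foreign-writeback tracking members
guarded by CONFIG_CGROUP_WRITEBACK (the exact set of fields may differ
from the applied patch):
#ifdef CONFIG_CGROUP_WRITEBACK
	/* clear per-inode cgroup writeback state so values from the
	 * previous user of the kmem_cache object cannot leak over */
	inode->i_wb_frn_winner = 0;
	inode->i_wb_frn_avg_time = 0;
	inode->i_wb_frn_history = 0;
#endif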
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Fixes: d10c80955265 ("writeback: implement foreign cgroup inode bdi_writeback switching")
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
There are some cases where rtt_us derives from deltas of jiffies,
instead of using usec timestamps.
Since we want to track the minimal RTT, it is better to assume that a
delta of 0 jiffies might in fact be very close to 1 jiffy.
It is kind of sad that jiffies_to_usecs(1) calls a function instead of
simply using a constant.
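A minimal sketch of the resulting clamp (variable names illustrative,
not the exact hunk applied):
	/* rtt_us derived from a jiffies delta: never feed 0 into the
	 * windowed min filter, assume the delta was really ~1 jiffy */
	delta_us = jiffies_to_usecs(max_t(u32, delta_jiffies, 1U));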
Fixes: f672258391b42 ("tcp: track min RTT using windowed min-filter")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
add new id (CONTEC C-NET(PC)C-100TX2)
Signed-off-by: Ken Kawasaki <ken_kawasaki@nifty.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The phy has not been initialized yet; disconnecting it in the error
path results in a NULL pointer dereference. Drop the phy_disconnect()
from the error path.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Marvell 88E6240 has been tested successfully without further
changes. Add an entry for it to the table of supported devices.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Refactor tipc_node_xmit() to fail fast and fail early. Fix several
potential memory leaks in unexpected error paths.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In commit 5266698661401a ("tipc: let broadcast packet reception
use new link receive function") we introduced a new per-node
broadcast reception link instance. This link is created at the
moment the node itself is created. Unfortunately, the allocation
is done after the node instance has already been added to the node
lookup hash table. This creates a potential race condition, where
arriving broadcast packets are able to find and access the node
before it has been fully initialized, and before the above mentioned
link has been created. The result is occasional crashes in the function
tipc_bcast_rcv(), which is trying to access the not-yet existing link.
We fix this by deferring the addition of the node instance until after
it has been fully initialized in the function tipc_node_create().
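A rough sketch of the new ordering in tipc_node_create() (helper and
field names from memory, calls abridged):
	n = kzalloc(sizeof(*n), GFP_ATOMIC);
	if (!n)
		goto exit;
	/* fully initialize the node, including its broadcast receive link */
	if (!tipc_link_bc_create(...))
		goto exit;
	/* only now publish the node, so arriving broadcast packets cannot
	 * look it up before the link exists */
	hlist_add_head_rcu(&n->hash, &tn->node_htable[tipc_hashfn(addr)]);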
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Michael Chan says:
====================
bnxt_en: Bug fixes.
Fixed autoneg logic and some related cleanups, fixed tx push operation,
and reduced default ring sizes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The current default tx ring size of 512 causes an extra page to be
allocated for the tx ring with only 1 entry in it. Reduce it to
511. The default rx ring size is also reduced to 511 to use less
memory by default.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tx push is supported for small packets to reduce DMA latency. The
following bugs are fixed in this patch:
1. Fix the definition of the push BD which is different from the DMA BD.
2. The push buffer has to be zero padded to the next 64-bit word boundary
or tx checksum won't be correct.
3. Increase the tx push packet threshold to 164 bytes (192 bytes with the BD)
so that small tunneled packets are within the threshold.
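For item 2, the padding amounts to roughly the following (buffer and
length names are illustrative):
	/* zero-fill the push buffer up to the next 64-bit boundary so the
	 * hardware checksum only ever covers deterministic data */
	pad = ALIGN(length, 8) - length;
	if (pad)
		memset(pdata + length, 0, pad);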
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
20G is not supported by production hardware and only the 40GbaseCR4 standard
is supported.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Clean up bnxt_probe_phy() to cleanly separate 2 code blocks for autoneg
on and off. Autoneg flow control is possible only if autoneg is enabled.
In bnxt_get_settings(), Pause and Asym_Pause are always supported.
Only the advertisement bits change depending on the ethtool -A setting
in auto mode.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
1. Determine autoneg on|off setting from link_info->autoneg. Using the
firmware returned setting can be misleading if autoneg is changed and
there hasn't been a phy update from the firmware.
2. If autoneg is disabled, link_info->autoneg should be set to 0 to
indicate both speed and flow control autoneg are disabled.
3. To enable autoneg flow control, speed autoneg must be enabled.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A recent change to the mdb code confused the compiler to the point
where it did not realize that the port-group returned from
br_mdb_add_group() is always valid when the function returns a zero
(success) return value, so we get a spurious warning:
net/bridge/br_mdb.c: In function 'br_mdb_add':
net/bridge/br_mdb.c:542:4: error: 'pg' may be used uninitialized in this function [-Werror=maybe-uninitialized]
__br_mdb_notify(dev, entry, RTM_NEWMDB, pg);
Slightly rearranging the code in br_mdb_add_group() makes the problem
go away, as gcc is clever enough to see that both functions check
for 'ret != 0'.
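The shape of the rearrangement is roughly the following sketch (not the
literal hunk): make it textually obvious that the output pointer is
written on every path that returns zero, so the caller's 'if (!ret)'
guard is provably sufficient:
	static int br_mdb_add_group(..., struct net_bridge_port_group **pg)
	{
		struct net_bridge_port_group *p;

		/* all error paths return before *pg is ever needed */
		p = br_multicast_new_port_group(...);
		if (unlikely(!p))
			return -ENOMEM;

		*pg = p;	/* always set when returning success */
		return 0;
	}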
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 9e8430f8d60d ("bridge: mdb: Passing the port-group pointer to br_mdb module")
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This change has been made with the goal that kernel functions should
return something more descriptive than -1 on failure.
A variable `err` has been introduced for storing error codes.
On failure of kzalloc(), the function should return -ENOMEM and not
-1. This was found using Coccinelle. A simplified version of
the semantic patch used is:
//<smpl>
@@
expression *e;
identifier l1;
@@
e = kzalloc(...);
if (e == NULL) {
...
goto l1;
}
l1:
...
return
- -1
+ -ENOMEM
;
//</smpl>
Furthermore, set `err` to -ENOMEM on failure of alloc_netdev(), and to
-ENODEV on failure of register_netdev() and probe_irq_off().
The single call site only checks that the return value is not 0,
hence no change is required at the call site.
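The reworked error handling then looks roughly like this (labels and
ordering are illustrative):
	int err;

	dev = alloc_netdev(...);
	if (!dev) {
		err = -ENOMEM;
		goto out;
	}

	dev->irq = probe_irq_off(mask);
	if (!dev->irq) {
		err = -ENODEV;
		goto out_free;
	}

	if (register_netdev(dev)) {
		err = -ENODEV;
		goto out_free;
	}
	return 0;

out_free:
	free_netdev(dev);
out:
	return err;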
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
1GbE Intel Wired LAN Driver Updates 2016-02-15
This series contains updates to igb only.
Shota Suzuki cleans up unnecessary flag setting for 82576 in
igb_set_flag_queue_pairs() since the default block already sets
IGB_FLAG_QUEUE_PAIRS to the correct value anyway, so the e1000_82576
code block is not necessary and we can simply fall through. He then
fixes an issue where IGB_FLAG_QUEUE_PAIRS can now be set by using the
"ethtool -L" option but is never cleared unless the driver is reloaded,
so clear the queue pairing if the pairing becomes unnecessary as a
result of "ethtool -L".
Mitch fixes igbvf so it does not give up if it fails to get the
hardware mailbox lock. This can happen when the PF-VF communication
channel is heavily loaded and causes complete communications failure
between the PF and VF drivers, so add a counter and a delay so that the
driver will now retry ten times before giving up on getting the mailbox
lock.
The remaining patches in the series are from Alex Duyck, starting with
cleaning up the code that sets the MAC address. He then refactors the
VFTA and VLVF configuration to simplify it and bring it in line with
the similar setup in the ixgbe driver. He fixed an issue where the VLAN
header size was being added to the value programmed into the RLPML
registers, even though these registers already take the VLAN header
size into account when determining the maximum packet length, so the
code that adds the size to the RLPML registers can be dropped. He
cleaned up the VF port-based VLAN configuration, and also fixed the igb
driver so that we can fully support SR-IOV or the recently added NTUPLE
filtering while allowing support for VLAN promiscuous mode. Finally, he
added the ability to use the bridge utility to add an FDB entry for the
PF to an igb port.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Alexander Kochetkov says:
====================
Fixes for rockchip EMAC
Here is a set of 3 patches that fix a kernel oops, a memory leak and a
rockchip EMAC hang. Tested on a radxarock lite board.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The EMAC could be disabled while some sk_buffs are still in use. Those
buffers are then lost to Linux.
In order to reproduce, run this on the device during active ethernet
traffic:
ifconfig eth0 down
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The EMAC resets its internal tx ring pointer to zero at startup, while
txbd_curr and txbd_dirty can be different from zero.
That causes the ethernet transfer to hang (no packets transmitted).
In order to reproduce, run on the device:
ifconfig eth0 down
ifconfig eth0 up
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a race between arc_emac_tx() and arc_emac_tx_clean():
an sk_buff can be freed by arc_emac_tx_clean() while arc_emac_tx() is
still submitting it.
In order to decide whether an sk_buff can be freed, arc_emac_tx_clean()
checks:
if ((info & FOR_EMAC) || !txbd->data)
break;
...
dev_kfree_skb_irq(skb);
If the condition is false, arc_emac_tx_clean() frees the sk_buff.
In order to submit a txbd, arc_emac_tx() does:
priv->tx_buff[*txbd_curr].skb = skb;
...
priv->txbd[*txbd_curr].data = cpu_to_le32(addr);
...
... <== arc_emac_tx_clean() check condition here
... <== (info & FOR_EMAC) is false
... <== !txbd->data is false
...
*info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
In order to reproduce the situation, run on the device:
# iperf -s
and run on the host:
# iperf -t 600 -c <device-ip-addr>
[ 28.396284] ------------[ cut here ]------------
[ 28.400912] kernel BUG at .../net/core/skbuff.c:1355!
[ 28.414019] Internal error: Oops - BUG: 0 [#1] SMP ARM
[ 28.419150] Modules linked in:
[ 28.422219] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B 4.4.0+ #120
[ 28.429516] Hardware name: Rockchip (Device Tree)
[ 28.434216] task: c0665070 ti: c0660000 task.ti: c0660000
[ 28.439622] PC is at skb_put+0x10/0x54
[ 28.443381] LR is at arc_emac_poll+0x260/0x474
[ 28.447821] pc : [<c03af580>] lr : [<c028fec4>] psr: a0070113
[ 28.447821] sp : c0661e58 ip : eea68502 fp : ef377000
[ 28.459280] r10: 0000012c r9 : f08b2000 r8 : eeb57100
[ 28.464498] r7 : 00000000 r6 : ef376594 r5 : 00000077 r4 : ef376000
[ 28.471015] r3 : 0030488b r2 : ef13e880 r1 : 000005ee r0 : eeb57100
[ 28.477534] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 28.484658] Control: 10c5387d Table: 8eaf004a DAC: 00000051
[ 28.490396] Process swapper/0 (pid: 0, stack limit = 0xc0660210)
[ 28.496393] Stack: (0xc0661e58 to 0xc0662000)
[ 28.500745] 1e40: 00000002 00000000
[ 28.508913] 1e60: 00000000 ef376520 00000028 f08b23b8 00000000 ef376520 ef7b6900 c028fc64
[ 28.517082] 1e80: 2f158000 c0661ea8 c0661eb0 0000012c c065e900 c03bdeac ffff95e9 c0662100
[ 28.525250] 1ea0: c0663924 00000028 c0661ea8 c0661ea8 c0661eb0 c0661eb0 0000001e c0660000
[ 28.533417] 1ec0: 40000003 00000008 c0695a00 0000000a c066208c 00000100 c0661ee0 c0027410
[ 28.541584] 1ee0: ef0fb700 2f158000 00200000 ffff95e8 00000004 c0662100 c0662080 00000003
[ 28.549751] 1f00: 00000000 00000000 00000000 c065b45c 0000001e ef005000 c0647a30 00000000
[ 28.557919] 1f20: 00000000 c0027798 00000000 c005cf40 f0802100 c0662ffc c0661f60 f0803100
[ 28.566088] 1f40: c0661fb8 c00093bc c000ffb4 60070013 ffffffff c0661f94 c0661fb8 c00137d4
[ 28.574267] 1f60: 00000001 00000000 00000000 c001ffa0 00000000 c0660000 00000000 c065a364
[ 28.582441] 1f80: c0661fb8 c0647a30 00000000 00000000 00000000 c0661fb0 c000ffb0 c000ffb4
[ 28.590608] 1fa0: 60070013 ffffffff 00000051 00000000 00000000 c005496c c0662400 c061bc40
[ 28.598776] 1fc0: ffffffff ffffffff 00000000 c061b680 00000000 c0647a30 00000000 c0695294
[ 28.606943] 1fe0: c0662488 c0647a2c c066619c 6000406a 413fc090 6000807c 00000000 00000000
[ 28.615127] [<c03af580>] (skb_put) from [<ef376520>] (0xef376520)
[ 28.621218] Code: e5902054 e590c090 e3520000 0a000000 (e7f001f2)
[ 28.627307] ---[ end trace 4824734e2243fdb6 ]---
[ 34.377068] Internal error: Oops: 17 [#1] SMP ARM
[ 34.382854] Modules linked in:
[ 34.385947] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.4.0+ #120
[ 34.392219] Hardware name: Rockchip (Device Tree)
[ 34.396937] task: ef02d040 ti: ef05c000 task.ti: ef05c000
[ 34.402376] PC is at __dev_kfree_skb_irq+0x4/0x80
[ 34.407121] LR is at arc_emac_poll+0x130/0x474
[ 34.411583] pc : [<c03bb640>] lr : [<c028fd94>] psr: 60030013
[ 34.411583] sp : ef05de68 ip : 0008e83c fp : ef377000
[ 34.423062] r10: c001bec4 r9 : 00000000 r8 : f08b24c8
[ 34.428296] r7 : f08b2400 r6 : 00000075 r5 : 00000019 r4 : ef376000
[ 34.434827] r3 : 00060000 r2 : 00000042 r1 : 00000001 r0 : 00000000
[ 34.441365] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 34.448507] Control: 10c5387d Table: 8f25c04a DAC: 00000051
[ 34.454262] Process ksoftirqd/0 (pid: 3, stack limit = 0xef05c210)
[ 34.460449] Stack: (0xef05de68 to 0xef05e000)
[ 34.464827] de60: ef376000 c028fd94 00000000 c0669480 c0669480 ef376520
[ 34.473022] de80: 00000028 00000001 00002ae4 ef376520 ef7b6900 c028fc64 2f158000 ef05dec0
[ 34.481215] dea0: ef05dec8 0000012c c065e900 c03bdeac ffff983f c0662100 c0663924 00000028
[ 34.489409] dec0: ef05dec0 ef05dec0 ef05dec8 ef05dec8 ef7b6000 ef05c000 40000003 00000008
[ 34.497600] dee0: c0695a00 0000000a c066208c 00000100 ef05def8 c0027410 ef7b6000 40000000
[ 34.505795] df00: 04208040 ffff983e 00000004 c0662100 c0662080 00000003 ef05c000 ef027340
[ 34.513985] df20: ef05c000 c0666c2c 00000000 00000001 00000002 00000000 00000000 c0027568
[ 34.522176] df40: ef027340 c003ef48 ef027300 00000000 ef027340 c003edd4 00000000 00000000
[ 34.530367] df60: 00000000 c003c37c ffffff7f 00000001 00000000 ef027340 00000000 00030003
[ 34.538559] df80: ef05df80 ef05df80 00000000 00000000 ef05df90 ef05df90 ef05dfac ef027300
[ 34.546750] dfa0: c003c2a4 00000000 00000000 c000f578 00000000 00000000 00000000 00000000
[ 34.554939] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 34.563129] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 ffffffff dfff7fff
[ 34.571360] [<c03bb640>] (__dev_kfree_skb_irq) from [<c028fd94>] (arc_emac_poll+0x130/0x474)
[ 34.579840] [<c028fd94>] (arc_emac_poll) from [<c03bdeac>] (net_rx_action+0xdc/0x28c)
[ 34.587712] [<c03bdeac>] (net_rx_action) from [<c0027410>] (__do_softirq+0xcc/0x1f8)
[ 34.595482] [<c0027410>] (__do_softirq) from [<c0027568>] (run_ksoftirqd+0x2c/0x50)
[ 34.603168] [<c0027568>] (run_ksoftirqd) from [<c003ef48>] (smpboot_thread_fn+0x174/0x18c)
[ 34.611466] [<c003ef48>] (smpboot_thread_fn) from [<c003c37c>] (kthread+0xd8/0xec)
[ 34.619075] [<c003c37c>] (kthread) from [<c000f578>] (ret_from_fork+0x14/0x3c)
[ 34.626317] Code: e8bd8010 e3a00000 e12fff1e e92d4010 (e59030a4)
[ 34.632572] ---[ end trace cca5a3d86a82249a ]---
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch corrects the unaligned accesses seen on GRE TEB tunnels when
generating hash keys. Specifically, it forces the use of skb_copy_bits()
when the GRE inner headers will be unaligned due to NET_IP_ALIGN being a
non-zero value.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Saeed Mahameed says:
====================
mlx5 driver fixes for 4.5-rc2
We add here a patch from Matan and Alaa addressing Linus' comments on
the mess w.r.t. reserved field names in the auto-generated
driver/firmware interface file.
Once the patch hits Linus' tree, we'll ask Doug to rebase his tree on
that rc so that both net-next and rdma-next development for 4.6 is done
on top of the fixed, more robust form.
We also provide two patches that address the dynamic ndo initialization
issue of the mlx5e netdevice.
Or and Saeed.
Changes from V1 (only the first patch was changed):
In this version we fixed the issues raised in Or's previous e-mail.
1. Offsets took into account two dimensional u8 arrays
2. Offsets took into account nesting unions and structs
3. Offsets for unions
4. Offsets for any reserved field
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently our netdevice ops is a single static global variable which
is referenced by all mlx5e netdevice instances. This can be
problematic when different driver instances do not share the same
HW capabilities (e.g. SRIOV PF and VFs probed to the host).
Now we have two constant global netdevice ops variables, one
for basic netdevice ops and the other with extended SRIOV ops;
on netdevice construction we choose the one suitable for the
current device's capabilities.
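Roughly (ndo lists abridged, capability check name illustrative):
	static const struct net_device_ops mlx5e_netdev_ops_basic = {
		.ndo_open	= mlx5e_open,
		.ndo_stop	= mlx5e_close,
		/* ... */
	};

	static const struct net_device_ops mlx5e_netdev_ops_sriov = {
		.ndo_open	= mlx5e_open,
		.ndo_stop	= mlx5e_close,
		/* ... plus the SRIOV ndos, e.g. .ndo_set_vf_mac ... */
	};

	/* at netdev construction time */
	netdev->netdev_ops = mlx5_core_is_pf(mdev) ?
			     &mlx5e_netdev_ops_sriov :
			     &mlx5e_netdev_ops_basic;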
Fixes: 66e49dedada6 ("net/mlx5e: Add support for SR-IOV ndos")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently mlx5e_select_queue is redundant since num_tc is always 1.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
mlx5_ifc.h is a header file representing the API and ABI between
the driver and the firmware and hardware. This file is used by
both the mlx5_ib and mlx5_core drivers.
Previously, this file used an incrementing counter to indicate
reserved fields, for example:
struct mlx5_ifc_odp_per_transport_service_cap_bits {
	u8 send[0x1];
	u8 receive[0x1];
	u8 write[0x1];
	u8 read[0x1];
	u8 reserved_0[0x1];
	u8 srq_receive[0x1];
	u8 reserved_1[0x1a];
};
If one developer implements feature A through net-next that uses
reserved_0, they replace it with featureA and rename reserved_1 to
reserved_0. In the same kernel cycle, a second developer could implement
feature B through the rdma tree that uses reserved_1, splitting it into
featureB and a smaller reserved_1 field. This will cause a conflict
when the two trees are merged.
The source of this conflict is that the 1st developer changed *all*
reserved fields.
As Linus suggested, we change the layout of structs to:
struct mlx5_ifc_odp_per_transport_service_cap_bits {
	u8 send[0x1];
	u8 receive[0x1];
	u8 write[0x1];
	u8 read[0x1];
	u8 reserved_at_4[0x1];
	u8 srq_receive[0x1];
	u8 reserved_at_6[0x1a];
};
This makes the conflicts much more rare and preserves the locality of
changes.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This gets us functional GPU reset again, like we had until a refactor
at merge time. Tested with a little patch to stuff in a broken binner
job every 100 frames.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
This may actually get us a feature that the closed driver didn't have:
turning off the GPU in between rendering jobs, while the V3D device is
still opened by the client.
There may be some tuning to be applied here to use autosuspend so that
we don't bounce the device's power so much, but in steady-state
GPU-bound rendering we keep the power on (since we keep multiple jobs
outstanding) and even if we power cycle on every job we can still
manage at least 680 fps.
More importantly, though, runtime PM will allow us to power off the
device to do a GPU reset.
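The usage described above boils down to the standard runtime-PM bracket
around job submission (a sketch; the exact vc4 call sites may differ):
	pm_runtime_get_sync(dev);	/* power the V3D up before queuing a job */
	/* ... submit and run the job ... */
	pm_runtime_put(dev);		/* allow power-off once nothing is outstanding */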
v2: Switch #ifdef to CONFIG_PM not CONFIG_PM_SLEEP (caught by kbuild
test robot)
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
We were tracking the "where are the head pointers pointing" globally,
so if another job reused the same BOs and execution was at the same
point as last time we checked, we'd stop and trigger a reset even
though the GPU had made progress.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
These ioctls end up getting exposed fairly directly to GL users,
and having normal user operations print DRM errors is obviously wrong.
The message was originally to give us some idea of what happened when
a hang occurred, but we have a DRM_INFO from reset for that.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
This caused the wait ioctls to claim that waiting had completed when
we actually got interrupted by a signal before it was done. Fixes
broken rendering throttling that produced serious lag in X window
dragging.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
Fixes igt vc4_create_bo/create-bo-0 by returning -EINVAL from the
ioctl instead of -ENOMEM.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
Apparently in hardware (as opposed to simulation), the clear colors
need to be uploaded before the render config, otherwise they won't
take effect. Fixes igt's vc4_wait_bo/used-bo-* subtests.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
This is ABI future-proofing if we ever want to extend the pad to mean
something.
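Presumably this is the usual pattern of rejecting a non-zero pad today
so it can be given a meaning later (sketch):
	if (args->pad != 0)
		return -EINVAL;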
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
Jacob Keller says:
====================
ethtool: correct {GS}CHANNELS and {GS}RXFH conflict
This patch series fixes up ethtool_set_channels operation which
allowed modifying the RXFH table indirectly by reducing the number of
queues below the current max queue used by the Rx flow table. Most
drivers incorrectly allowed this to destroy the Rx flow table and
would then start by reinitializing it to default settings. However,
drivers are not able to correctly handle the conflict since there was
no way to differentiate between the default settings and the user
requested explicit settings.
To fix this, implement a new netdev private flag which we use to
indicate whether the RXFH has been user configured. If someone has
a better alternative of how to store this information, let me know.
I am not sure that priv_flags is the best solution but I have not had
any better idea.
Secondly, we add a function which just calls the driver's get_rxfh
callback to determine the current indirection table. By looping through
this table we can determine the current highest queue that will be used
by RSS.
Now, modify ethtool_set_channels to add a check: if (a) RXFH has been
configured by the user, and (b) we can get the maximum RSS queue
currently in use, then ensure that the newly requested Rx count
(or combined count) is at least as high as this maximum RSS queue. The
reasoning here is that we can always safely increase the number of
queues. If we decrease the queues we must ensure that the decrease
does not go lower than the highest in-use queue for the Rx flow table.
Drivers may still need to be patched if they currently overwrite the
Rx flow table during channel configuration. If the driver currently
always resets the Rx flow table when increasing the number of queues,
it must be patched to only do this when netif_is_rxfh_configured returns
false.
The second patch simply adds a check to ensure that all provided
channel counts fit within driver defined maximums.
The third patch fixes fm10k to correctly reconfigure the RSS reta
table whenever it is still unconfigured. This means that the default
state will provide RSS to every queue. Once the user has configured
RXFH, then we should maintain it. In addition, since the case where we
must reconfigure the RSS table should now no longer occur, add a
dev_err message to notify the user when we do so.
I have also supplied an ethtool patch to enable setting the default Rx
flow indirection table. Without this, current ethtool does not support
sending an indir_size of 0, and thus does not correctly support
configuring back to the default.
Changes in v2:
* fixed compile error
* fixed incorrect comparison with max_rx_in_use
* adjusted looping over dev_size
* removed inline on function
* dropped patch about separating combined vs asymmetric channels
* verified behavior using fm10k driver
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Also print an error message in case we do have to reconfigure, as this
should no longer happen due to the ethtool changes. If it somehow does
occur, the user should be made aware of it.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a sanity check to ensure that all requested channel sizes are within
bounds, which should reduce errors in driver implementation.
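The check is essentially the following sketch, using the field names of
struct ethtool_channels:
	/* curr was filled in by the driver's get_channels() callback */
	if (channels.rx_count > curr.max_rx ||
	    channels.tx_count > curr.max_tx ||
	    channels.other_count > curr.max_other ||
	    channels.combined_count > curr.max_combined)
		return -EINVAL;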
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Ethernet drivers implementing both {GS}RXFH and {GS}CHANNELS ethtool ops
incorrectly allow SCHANNELS when it would conflict with the settings
from SRXFH. This occurs because it is not possible for drivers to
understand whether their Rx flow indirection table has been configured
or is in the default state. In addition, drivers currently behave in
various ways when increasing the number of Rx channels.
Some drivers will always destroy the Rx flow indirection table when this
occurs, whether it has been set by the user or not. Other drivers will
attempt to preserve the table even if the user has never modified it
from the default driver settings. Neither of these situations is
desirable because it leads to unexpected behavior or loss of user
configuration.
The correct behavior is to simply return -EINVAL when SCHANNELS would
conflict with the current Rx flow table settings. However, it should
only do so if the current settings were modified by the user. If we
required that the new settings never conflict with the current (default)
Rx flow settings, we would force users to first reduce their Rx flow
settings and then reduce the number of Rx channels.
This patch proposes a solution implemented in net/core/ethtool.c which
ensures that all drivers behave correctly. It checks whether the RXFH
table has been configured to non-default settings, and stores this
information in a private netdev flag. When the number of channels is
requested to change, it first ensures that the current Rx flow table is
not going to assign flows to now disabled channels.
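The core of the check added to ethtool_set_channels() looks roughly
like this (helper names taken from the description above; the exact
code may differ):
	u32 max_rx_in_use = 0;

	/* only enforce the limit if the user explicitly configured RXFH */
	if (netif_is_rxfh_configured(dev) &&
	    !ethtool_get_max_rxfh_channel(dev, &max_rx_in_use) &&
	    (channels.combined_count + channels.rx_count) <= max_rx_in_use)
		return -EINVAL;	/* new queue count would orphan RXFH entries */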
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We need this for custom hardware that requires the reverse reset
sequence.
Signed-off-by: Bernhard Walle <bernhard@bwalle.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is presently a race condition between the bonding periodic
link monitor and the updating of a slave's speed and duplex. The former
occurs on a periodic basis, and the latter in response to a driver's
calling of netif_carrier_on.
It is possible for the periodic monitor to run between the
driver call of netif_carrier_on and the receipt of the NETDEV_CHANGE
event that causes bonding to update the slave's speed and duplex. This
manifests most notably as a report that a slave is up and "0 Mbps full
duplex" after enslavement, but in principle could report an incorrect
speed and duplex after any link up event if the device comes up with a
different speed or duplex. This affects the 802.3ad aggregator
selection, as the speed and duplex are selection criteria.
This is fixed by updating the speed and duplex in the periodic
monitor, prior to using that information.
This was done historically in bonding, but the call to
bond_update_speed_duplex() was removed in commit 876254ae2758 ("bonding:
don't call update_speed_duplex() under spinlocks"), as it might sleep
under lock. Later, the locking was changed to only hold RTNL, so this
call is now safe again.
Tested-by: "Tantilov, Emil S" <emil.s.tantilov@intel.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: dingtianhong <dingtianhong@huawei.com>
Fixes: 876254ae2758 ("bonding: don't call update_speed_duplex() under spinlocks")
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Acked-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The am79c961a.c driver fails to build with clang because of an
unusual inline assembly construct:
drivers/net/ethernet/amd/am79c961a.c:53:7: error: invalid % escape in inline assembly string
"str%?h %1, [%2] @ NET_RAP\n\t"
The same change has been done a decade ago in arch/arm as of
6a39dd6222dd ("[ARM] 3759/2: Remove uses of %?"), but apparently
some drivers were missed.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The smc91x driver doesn't honor the probe deferral mechanism when the
interrupt source is not yet available, such as one provided by a GPIO
controller that has not been probed yet.
Fix this by propagating the platform_get_irq() error code as the probe
return value.
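The fix amounts to returning whatever platform_get_irq() reports
(sketch; the error label is illustrative):
	ndev->irq = platform_get_irq(pdev, 0);
	if (ndev->irq < 0) {
		ret = ndev->irq;	/* may be -EPROBE_DEFER */
		goto out_release_io;
	}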
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Florian Fainelli says:
====================
net: phy: bcm7xxx: Misc cleanups
These two patches are cleanups to the BCM7xxx internal PHY driver:
- fix a constant name missing a X (as in BCM7XXX)
- add a macro to reduce the amount of code duplication to add new entries
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Introduce a macro which helps add new 40nm EPHY entries and reduces the
amount of boilerplate code.
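The macro presumably expands to a full struct phy_driver initializer,
along these lines (field list abridged and hypothetical in its details):
	#define BCM7XXX_40NM_EPHY(_oui, _name)				\
	{								\
		.phy_id		= (_oui),				\
		.phy_id_mask	= 0xfffffff0,				\
		.name		= _name,				\
		.features	= PHY_BASIC_FEATURES,			\
		.flags		= PHY_IS_INTERNAL,			\
		.config_init	= bcm7xxx_config_init,			\
		.config_aneg	= genphy_config_aneg,			\
		.read_status	= genphy_read_status,			\
	}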
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver is BCM7xxx; we were missing an additional X in the constant
naming, so fix that to be consistent.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Florian Fainelli says:
====================
Subject: [PATCH net v2 0/4] net: phy: bcm7xxx 40nm PHY fixes
Here is a collection of fixes for the 40nm Ethernet PHY supported
by the 7xxx PHY driver, please also queue these fixes for stable.
Changes in v2:
- dropped the cleanup patch, not appropriate
- added another patch removing bogus wildcard entries
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove the two wildcard entries; they serve no purpose and will match
way too many devices, some of which are covered by the driver in
drivers/net/phy/broadcom.c. Also remove the now-unused
bcm7xxx_dummy_config_init() function, which would otherwise produce a
warning.
Fixes: b560a58c45c6 ("net: phy: add Broadcom BCM7xxx internal PHY driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since we were wrongly advertising gigabit features for these 10/100-only
Ethernet PHYs, bcm7xxx_config_init(), which is supposed to apply the
workaround, would not have run because the check would be true. Now that
the PHY features are fixed, remove that check, since there is no longer
any reason for it to be there.
Fixes: e18556ee3bd83 ("net: phy: bcm7xxx: do not use PHY_BRCM_100MBPS_WAR")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The PHY entries for BCM7425/29/35 declare the 40nm Ethernet PHY as being
10/100/1000 capable, while it is just a 10/100-capable PHY device; fix
that.
Fixes: d068b02cfdfc2 ("net: phy: add BCM7425 and BCM7429 PHYs")
Fixes: 9458ceab4917 ("net: phy: bcm7xxx: Add entry for BCM7435")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|