Age | Commit message (Collapse) | Author |
|
When a driver implements the ndo_xdp_xmit() function, there is
(currently) no generic way to determine whether it is safe to call.
It is e.g. unsafe to call the drivers ndo_xdp_xmit, if it have not
allocated the needed XDP TX queues yet. This is the case for
virtio_net, which first allocates the XDP TX queues once an XDP/bpf
prog is attached (in virtnet_xdp_set()).
Thus, a crash will occur for virtio_net when redirecting to another
virtio_net device's ndo_xdp_xmit, which have not attached a XDP prog.
The sample xdp_redirect_map tries to attach a dummy XDP prog to take
this into account, but it can also easily fail if the virtio_net (or
actually underlying vhost driver) have not allocated enough extra
queues for the device.
Allocating more queue this is currently a manual config.
Hint for libvirt XML add:
<driver name='vhost' queues='16'>
<host mrg_rxbuf='off'/>
<guest tso4='off' tso6='off' ecn='off' ufo='off'/>
</driver>
The solution in this patch is to check that the device have loaded an
XDP/bpf prog before proceeding. This is similar to the check
performed in driver ixgbe.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
XDP_REDIRECT calling xdp_do_redirect() can fail for multiple reasons
(which can be inspected by tracepoints). The current semantics is that
on failure the driver calling xdp_do_redirect() must handle freeing or
recycling the page associated with this frame. This can be seen as an
optimization, as drivers usually have an optimized XDP_DROP code path
for frame recycling in place already.
The virtio_net driver didn't handle when xdp_do_redirect() failed.
This caused a memory leak as the page refcnt wasn't decremented on
failures.
The function __virtnet_xdp_xmit() did handle one type of failure,
when the xmit queue virtqueue_add_outbuf() is full, which "hides"
releasing a refcnt on the page. Instead the function __virtnet_xdp_xmit()
must follow API of xdp_do_redirect(), which on errors leave it up to
the caller to free the page, of the failed send operation.
Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When configuring virtio_net to use the code path 'receive_small()',
in-order to get correct XDP_REDIRECT support, I discovered TCP packets
would get silently dropped when loading an XDP program action XDP_PASS.
The bug seems to be that receive_small() when XDP is loaded check that
hdr->hdr.flags is zero, which seems wrong as hdr.flags contains the
flags VIRTIO_NET_HDR_F_* :
#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 /* Use csum_start, csum_offset */
#define VIRTIO_NET_HDR_F_DATA_VALID 2 /* Csum is valid */
TCP got dropped as it had the VIRTIO_NET_HDR_F_DATA_VALID flag set.
The flags that are relevant here are the VIRTIO_NET_HDR_GSO_* flags
stored in hdr->hdr.gso_type. Thus, the fix is just check that none of
the gso_type flags have been set.
Fixes: bb91accf2733 ("virtio-net: XDP support for small buffers")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The virtio_net code have three different RX code-paths in receive_buf().
Two of these code paths can handle XDP, but one of them is broken for
at least XDP_REDIRECT.
Function(1): receive_big() does not support XDP.
Function(2): receive_small() support XDP fully and uses build_skb().
Function(3): receive_mergeable() broken XDP_REDIRECT uses napi_alloc_skb().
The simple explanation is that receive_mergeable() is broken because
it uses napi_alloc_skb(), which violates XDP given XDP assumes packet
header+data in single page and enough tail room for skb_shared_info.
The longer explaination is that receive_mergeable() tries to
work-around and satisfy these XDP requiresments e.g. by having a
function xdp_linearize_page() that allocates and memcpy RX buffers
around (in case packet is scattered across multiple rx buffers). This
does currently satisfy XDP_PASS, XDP_DROP and XDP_TX (but only because
we have not implemented bpf_xdp_adjust_tail yet).
The XDP_REDIRECT action combined with cpumap is broken, and cause hard
to debug crashes. The main issue is that the RX packet does not have
the needed tail-room (SKB_DATA_ALIGN(skb_shared_info)), causing
skb_shared_info to overlap the next packets head-room (in which cpumap
stores info).
Reproducing depend on the packet payload length and if RX-buffer size
happened to have tail-room for skb_shared_info or not. But to make
this even harder to troubleshoot, the RX-buffer size is runtime
dynamically change based on an Exponentially Weighted Moving Average
(EWMA) over the packet length, when refilling RX rings.
This patch only disable XDP_REDIRECT support in receive_mergeable()
case, because it can cause a real crash.
IMHO we should consider NOT supporting XDP in receive_mergeable() at
all, because the principles behind XDP are to gain speed by (1) code
simplicity, (2) sacrificing memory and (3) where possible moving
runtime checks to setup time. These principles are clearly being
violated in receive_mergeable(), that e.g. runtime track average
buffer size to save memory consumption.
In the longer run, we should consider introducing a separate receive
function when attaching an XDP program, and also change the memory
model to be compatible with XDP when attaching an XDP prog.
Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2018-02-20
The following pull request includes some fixes for the mlx5 core and
netdevice driver.
Please pull and let me know if there's any issue.
-stable 4.10.y:
('net/mlx5e: Fix loopback self test when GRO is off')
-stable 4.12.y:
('net/mlx5e: Specify numa node when allocating drop rq')
-stable 4.13.y:
('net/mlx5e: Verify inline header size do not exceed SKB linear size')
-stable 4.15.y:
('net/mlx5e: Fix TCP checksum in LRO buffers')
('net/mlx5: Fix error handling when adding flow rules')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If building match list or adding existing fg fails when
node is locked, function returned without unlocking it.
This happened if node version changed or adding existing fg
returned with EAGAIN after jumping to search_again_locked label.
Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
First use of drop counters happens in esw_apply_vport_conf function,
while they are allocated later in the flow. Fix that by moving
esw_vport_create_drop_counters function to be called before the first use.
Fixes: b8a0dbe3a90b ("net/mlx5e: E-switch, Add steering drop counters")
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
We can't allow only some of the rules sharing an FTE to ask for
header re-write, add it to the conflicting action checks.
Fixes: 0d235c3fabb7 ('net/mlx5: Add hash table to search FTEs in a flow-group')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The adapter uses the cache_line_128byte setting to set the bounds for
end padding. On systems where the cacheline size is greater than 128B
use 128B instead of the default of 64B. This results in fewer partial
cacheline writes. There's a 50% chance it will pad to the end of a 256B
cache line vs only 25% when using 64B.
Fixes: f32f5bd2eb7e ("net/mlx5: Configure cache line size for start and end padding")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
When allocating a drop rq, no numa node is explicitly set which means
allocations are done on node zero. This is not necessarily the nearest
numa node to the HCA, and even worse, might even be a memoryless numa
node.
Choose the numa_node given to us by the pci device in order to properly
allocate the coherent dma memory instead of assuming zero is valid.
Fixes: 556dd1b9c313 ("net/mlx5e: Set drop RQ's necessary parameters only")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
This isn't supported when we emulate eswitch vlan push action which
is the current state of things.
Fixes: 8b32580df1cb ('net/mlx5e: Add TC vlan action for SRIOV offloads')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Address these sparse warnings on drivers/net/ethernet/mellanox/mlx5
[..]/core/diag/fs_tracepoint.c:99:53: warning: non-constant initializer for static object
[..]/core/diag/fs_tracepoint.c:102:53: warning: non-constant initializer for static object
etc
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Fix these gcc warnings on drivers/net/ethernet/mellanox/mlx5:
[..]/core/lib/clock.c:454:6: warning: no previous prototype for 'mlx5_init_clock' [-Wmissing-prototypes]
[..]/core/lib/clock.c:510:6: warning: no previous prototype for 'mlx5_cleanup_clock' [-Wmissing-prototypes]
[..]/core/en_main.c:3141:5: warning: no previous prototype for 'mlx5e_setup_tc' [-Wmissing-prototypes]
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Driver tries to copy at least MLX5E_MIN_INLINE bytes into the control
segment of the WQE. It assumes that the linear part contains at least
MLX5E_MIN_INLINE bytes, which can be wrong.
Cited commit verified that driver will not copy more bytes into the
inline header part that the actual size of the packet. Re-factor this
check to make sure we do not exceed the linear part as well.
This fix is aligned with the current driver's assumption that the entire
L2 will be present in the linear part of the SKB.
Fixes: 6aace17e64f4 ("net/mlx5e: Fix inline header size for small packets")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
When GRO is off, the transport header pointer in sk_buff is
initialized to network's header.
To find the udp header, instead of using udp_hdr() which assumes
skb_network_header was set, manually calculate the udp header offset.
Fixes: 0952da791c97 ("net/mlx5e: Add support for loopback selftest")
Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
When receiving an LRO packet, the checksum field is set by the hardware
to the checksum of the first coalesced packet. Obviously, this checksum
is not valid for the merged LRO packet and should be fixed. We can use
the CQE checksum which covers the checksum of the entire merged packet
TCP payload to help us calculate the checksum incrementally.
Tested by sending IPv4/6 traffic with LRO enabled, RX checksum disabled
and watching nstat checksum error counters (in addition to the obvious
bandwidth drop caused by checksum errors).
This bug is usually "hidden" since LRO packets would go through the
CHECKSUM_UNNECESSARY flow which does not validate the packet checksum.
It's important to note that previous to this patch, LRO packets provided
with CHECKSUM_UNNECESSARY are indeed packets with a correct validated
checksum (even though the checksum inside the TCP header is incorrect),
since the hardware LRO aggregation is terminated upon receiving a packet
with bad checksum.
Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
After introduction of commit d0869c0071e4, there were some instances of
RX queue entries from a previous session (before the device was closed
and reopened) returned to the NAPI polling routine. Since the corresponding
socket buffers were freed, this resulted in a panic on reopen. Include
a check for a NULL skb here to avoid this.
Fixes: d0869c0071e4 ("ibmvnic: Clean RX pool buffers during device close")
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In ungraceful host shutdown or driver crash case BMC connectivity is
lost. APE firmware is missing the driver state in this
case to keep the BMC connectivity alive.
This patch has below change to address this issue.
Heartbeat mechanism with APE firmware. This heartbeat mechanism
is needed to notify the APE firmware about driver state.
This patch also has the change in wait time for APE event from
1ms to 20ms as there can be some delay in getting response.
v2: Drop inline keyword as per David suggestion.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Satish Baddipadige <satish.baddipadige@broadcom.com>
Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When mlxsw replaces (or deletes) a route it removes the offload
indication from the replaced route. This is problematic for IPv4 routes,
as the offload indication is stored in the fib_info which is usually
shared between multiple routes.
Instead of unconditionally clearing the offload indication, only clear
it if no other route is using the fib_info.
Fixes: 3984d1a89fe7 ("mlxsw: spectrum_router: Provide offload indication using nexthop flags")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
Tested-by: Alexander Petrovskiy <alexpe@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If a command packet with invalid mux id is received, the packet would
not have a valid endpoint. This invalid endpoint maybe dereferenced
leading to a crash. Identified by manual code inspection.
Fixes: 3352e6c45760 ("net: qualcomm: rmnet: Convert the muxed endpoint to hlist")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With CONFIG_DEBUG_PREEMPT enabled, a warning was seen on device
creation. This occurs due to the incorrect cpu API usage in
ndo_get_stats64 handler.
BUG: using smp_processor_id() in preemptible [00000000] code: rmnetcli/5743
caller is debug_smp_processor_id+0x1c/0x24
Call trace:
[<ffffff9d48c8967c>] dump_backtrace+0x0/0x2a8
[<ffffff9d48c89bbc>] show_stack+0x20/0x28
[<ffffff9d4901fff8>] dump_stack+0xa8/0xe0
[<ffffff9d490421e0>] check_preemption_disabled+0x104/0x108
[<ffffff9d49042200>] debug_smp_processor_id+0x1c/0x24
[<ffffff9d494a36b0>] rmnet_get_stats64+0x64/0x13c
[<ffffff9d49b014e0>] dev_get_stats+0x68/0xd8
[<ffffff9d49d58df8>] rtnl_fill_stats+0x54/0x140
[<ffffff9d49b1f0b8>] rtnl_fill_ifinfo+0x428/0x9cc
[<ffffff9d49b23834>] rtmsg_ifinfo_build_skb+0x80/0xf4
[<ffffff9d49b23930>] rtnetlink_event+0x88/0xb4
[<ffffff9d48cd21b4>] raw_notifier_call_chain+0x58/0x78
[<ffffff9d49b028a4>] call_netdevice_notifiers_info+0x48/0x78
[<ffffff9d49b08bf8>] __netdev_upper_dev_link+0x290/0x5e8
[<ffffff9d49b08fcc>] netdev_master_upper_dev_link+0x3c/0x48
[<ffffff9d494a2e74>] rmnet_newlink+0xf0/0x1c8
[<ffffff9d49b23360>] rtnl_newlink+0x57c/0x6c8
[<ffffff9d49b2355c>] rtnetlink_rcv_msg+0xb0/0x244
[<ffffff9d49b5230c>] netlink_rcv_skb+0xb4/0xdc
[<ffffff9d49b204f4>] rtnetlink_rcv+0x34/0x44
[<ffffff9d49b51af0>] netlink_unicast+0x1ec/0x294
[<ffffff9d49b51fdc>] netlink_sendmsg+0x320/0x390
[<ffffff9d49ae6858>] sock_sendmsg+0x54/0x60
[<ffffff9d49ae91bc>] SyS_sendto+0x1a0/0x1e4
[<ffffff9d48c83770>] el0_svc_naked+0x24/0x28
Fixes: 192c4b5d48f2 ("net: qualcomm: rmnet: Add support for 64 bit stats")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With CONFIG_DEBUG_PREEMPT enabled, a crash with the following call
stack was observed when removing a real dev which had rmnet devices
attached to it.
To fix this, remove the netdev_upper link APIs and instead use the
existing information in rmnet_port and rmnet_priv to get the
association between real and rmnet devs.
BUG: sleeping function called from invalid context
in_atomic(): 0, irqs_disabled(): 0, pid: 5762, name: ip
Preemption disabled at:
[<ffffff9d49043564>] debug_object_active_state+0xa4/0x16c
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
PC is at ___might_sleep+0x13c/0x180
LR is at ___might_sleep+0x17c/0x180
[<ffffff9d48ce0924>] ___might_sleep+0x13c/0x180
[<ffffff9d48ce09c0>] __might_sleep+0x58/0x8c
[<ffffff9d49d6253c>] mutex_lock+0x2c/0x48
[<ffffff9d48ed4840>] kernfs_remove_by_name_ns+0x48/0xa8
[<ffffff9d48ed6ec8>] sysfs_remove_link+0x30/0x58
[<ffffff9d49b05840>] __netdev_adjacent_dev_remove+0x14c/0x1e0
[<ffffff9d49b05914>] __netdev_adjacent_dev_unlink_lists+0x40/0x68
[<ffffff9d49b08820>] netdev_upper_dev_unlink+0xb4/0x1fc
[<ffffff9d494a29f0>] rmnet_dev_walk_unreg+0x6c/0xc8
[<ffffff9d49b00b40>] netdev_walk_all_lower_dev_rcu+0x58/0xb4
[<ffffff9d494a30fc>] rmnet_config_notify_cb+0xf4/0x134
[<ffffff9d48cd21b4>] raw_notifier_call_chain+0x58/0x78
[<ffffff9d49b028a4>] call_netdevice_notifiers_info+0x48/0x78
[<ffffff9d49b0b568>] rollback_registered_many+0x230/0x3c8
[<ffffff9d49b0b738>] unregister_netdevice_many+0x38/0x94
[<ffffff9d49b1e110>] rtnl_delete_link+0x58/0x88
[<ffffff9d49b201dc>] rtnl_dellink+0xbc/0x1cc
[<ffffff9d49b2355c>] rtnetlink_rcv_msg+0xb0/0x244
[<ffffff9d49b5230c>] netlink_rcv_skb+0xb4/0xdc
[<ffffff9d49b204f4>] rtnetlink_rcv+0x34/0x44
[<ffffff9d49b51af0>] netlink_unicast+0x1ec/0x294
[<ffffff9d49b51fdc>] netlink_sendmsg+0x320/0x390
[<ffffff9d49ae6858>] sock_sendmsg+0x54/0x60
[<ffffff9d49ae6f94>] ___sys_sendmsg+0x298/0x2b0
[<ffffff9d49ae98f8>] SyS_sendmsg+0xb4/0xf0
[<ffffff9d48c83770>] el0_svc_naked+0x24/0x28
Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Fixes: 60d58f971c10 ("net: qualcomm: rmnet: Implement bridge mode")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We're obviously not part of a memory reclaim path, so don't set the flag.
This also causes a warning in check_flush_dependency() since we end up
in a code path that flushes a non-reclaim workqueue, and we shouldn't do
that if we were really part of reclaim.
Reported-by: syzbot+41cdaf4232c50e658934@syzkaller.appspotmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
<Mark Rutland reported>
While fuzzing arm64 v4.16-rc1 with Syzkaller, I've been hitting a
misaligned atomic in __skb_clone:
atomic_inc(&(skb_shinfo(skb)->dataref));
where dataref doesn't have the required natural alignment, and the
atomic operation faults. e.g. i often see it aligned to a single
byte boundary rather than a four byte boundary.
AFAICT, the skb_shared_info is misaligned at the instant it's
allocated in __napi_alloc_skb() __napi_alloc_skb()
</end of report>
Problem is caused by tun_napi_alloc_frags() using
napi_alloc_frag() with user provided seg sizes,
leading to other users of this API getting unaligned
page fragments.
Since we would like to not necessarily add paddings or alignments to
the frags that tun_napi_alloc_frags() attaches to the skb, switch to
another page frag allocator.
As a bonus skb_page_frag_refill() can use GFP_KERNEL allocations,
meaning that we can not deplete memory reserves as easily.
Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We've run into a problem where our device is attached
to a Virtual Machine and the use of the new pci_set_vpd_size()
API doesn't help. The VM kernel has been informed that
the accesses are okay, but all of the actual VPD Capability
Accesses are trapped down into the KVM Hypervisor where it
goes ahead and imposes the silent denials.
The right idea is to follow the kernel.org
commit 1c7de2b4ff88 ("PCI: Enable access to non-standard VPD for
Chelsio devices (cxgb3)") which Alexey Kardashevskiy authored
to establish a PCI Quirk for our T3-based adapters. This commit
extends that PCI Quirk to cover Chelsio T4 devices and later.
The advantage of this approach is that the VPD Size gets set early
in the Base OS/Hypervisor Boot and doesn't require that the cxgb4
driver even be available in the Base OS/Hypervisor. Thus PF4 can
be exported to a Virtual Machine and everything should work.
Fixes: 67e658794ca1 ("cxgb4: Set VPD size so we can read both VPD structures")
Cc: <stable@vger.kernel.org> # v4.9+
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Set correct size of the CIM LA dump for T6.
Fixes: 27887bc7cb7f ("cxgb4: collect hardware LA dumps")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
free pf 0-3 resources, commit baf5086840ab ("cxgb4:
restructure VF mgmt code") erroneously removed the
code which frees the pf 0-3 resources, causing the
probe of pf 0-3 to fail in case of driver reload.
Fixes: baf5086840ab ("cxgb4: restructure VF mgmt code")
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In AP mode, when a new station associates, rs is initialized immediately
upon association completion, before the phy context is updated with the
association parameters, so the sta bandwidth might be wider than the phy
context allows.
To avoid this issue, always initialize rs with 20mhz bandwidth rate, and
after authorization, when the phy context is already up-to-date, re-init
rs with the correct bw.
Signed-off-by: Naftali Goldstein <naftali.goldstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Canceling the periodic timestamp work should be
done in the opposite flow to where it was started.
This also prevents from sending the MARKER command
during the mac_stop flow - causing a false queue hang
(FW is no longer there to send a response).
Fixes: 93b167c13a3a ("iwlwifi: runtime: sync FW and host clocks for logs")
Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Our Transmit Frame Descriptor (TFD) is a DMA descriptor that
includes several pointers to be able to transmit a packet
which is not physically contiguous.
Depending on the hardware being use, we can have 20 or 25
pointers in a single TFD. In both cases, it is more than
enough and it is quite hard to hit this limit.
It has been reported that when using specific applications
(Ktorrent), we can actually use all the pointers and then
a long standing bug showed up.
When we free the TFD, we check its number of valid pointers
and make sure it doesn't exceed the number of pointers the
hardware support.
This check had an off by one bug: it is perfectly valid to
free the 20 pointers if the TFD has 20 pointers.
Fix that.
https://bugzilla.kernel.org/show_bug.cgi?id=197981
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
In IBSS, the mac80211 sets the cab_queue to be invalid.
However, the multicast station uses it, so we need to override it.
A previous patch did it, but it was nested inside the if's and was
applied only for legacy FWs that don't support the new station type
API, instead of being applied for all paths.
In addition, add a missing NL80211_IFTYPE_ADHOC to the initialization
of the queues in iwl_mvm_mac_ctxt_init()
Fixes: ee48b72211f8 ("iwlwifi: mvm: support ibss in dqa mode")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
A previous patch allowed the same PN for packets originating from the
same AMSDU by copying PN only for the last packet in the series.
This however is bogus since we cannot assume the last frame will be
received on the same queue, and if it is received on a different ueue
we will end up not incrementing the PN and possibly let the next
packet to have the same PN and pass through.
Change the logic instead to driver explicitly indicate for the second
sub frame and on to be allowed to have the same PN as the first
subframe. Indicate it to mac80211 as well for the fallback queue.
Fixes: f1ae02b186d9 ("iwlwifi: mvm: allow same PN for de-aggregated AMSDU")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
During device close or reset, there were some cases of outstanding
RX socket buffers not being freed. Include a function similar to the
one that already exists to clean TX socket buffers in this case.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If a RX buffer is returned to the client driver with an error, free the
corresponding socket buffer before continuing.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This memory is allocated during initialization but never freed,
so do that now.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
During device bringup, the driver exchanges login buffers with
firmware. These buffers contain information such number of TX
and RX queues alloted to the device, RX buffer size, etc. These
buffers weren't being properly freed on device reset or close.
We can free the buffer we send to firmware as soon as we get
a response. There is information in the response buffer that
the driver needs for normal operation so retain it until the
next reset or removal.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pushes back setting the carrier on until the end of the reset
code. This resolves a bug where a watchdog timer was detecting
that a TX queue had stalled before the adapter reset was complete.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit aa136d0c82fcd6af14535853c30e219e02b2692d.
As I previously[1] pointed out this implementation of XDP_REDIRECT is
wrong. XDP_REDIRECT is a facility that must work between different
NIC drivers. Another NIC driver can call ndo_xdp_xmit/nicvf_xdp_xmit,
but your driver patch assumes payload data (at top of page) will
contain a queue index and a DMA addr, this is not true and worse will
likely contain garbage.
Given you have not fixed this in due time (just reached v4.16-rc1),
the only option I see is a revert.
[1] http://lkml.kernel.org/r/20171211130902.482513d3@redhat.com
Cc: Sunil Goutham <sgoutham@cavium.com>
Cc: Christina Jacob <cjacob@caviumnetworks.com>
Cc: Aleksey Makarov <aleksey.makarov@cavium.com>
Fixes: aa136d0c82fc ("net: thunderx: Add support for xdp redirect")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since mlxsw_sp_fib_create() and mlxsw_sp_mr_table_create()
use ERR_PTR macro to propagate int err through return of a pointer,
the return value is not NULL in case of failure. So if one
of the calls fails, one of vr->fib4, vr->fib6 or vr->mr4_table
is not NULL and mlxsw_sp_vr_is_used wrongly assumes
that vr is in use which leads to crash like following one:
[ 1293.949291] BUG: unable to handle kernel NULL pointer dereference at 00000000000006c9
[ 1293.952729] IP: mlxsw_sp_mr_table_flush+0x15/0x70 [mlxsw_spectrum]
Fix this by using local variables to hold the pointers and set vr->*
only in case everything went fine.
Fixes: 76610ebbde18 ("mlxsw: spectrum_router: Refactor virtual router handling")
Fixes: a3d9bc506d64 ("mlxsw: spectrum_router: Extend virtual routers with IPv6 support")
Fixes: d42b0965b1d4 ("mlxsw: spectrum_router: Add multicast routes notification handling functionality")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Prevent a kernel panic on reboot if ptp_clock is NULL by checking
the ptp pointer before using it.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Fixes: 8c56df372bc1 ("net: add support for Cavium PTP coprocessor")
Cc: Radoslaw Biernacki <rad@semihalf.com>
Cc: Aleksey Makarov <aleksey.makarov@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The control channel calls registered callbacks when control messages
such as XDomain protocol messages are received. The control channel
handling is done in a worker running on system workqueue which means the
networking driver can't run tear down flow which includes sending
disconnect request and waiting for a reply in the same worker. Otherwise
reply is never received (as the work is already running) and the
operation times out.
To fix this run disconnect ThunderboltIP flow asynchronously once
ThunderboltIP logout message is received.
Fixes: e69b6c02b4c3 ("net: Add support for networking over Thunderbolt cable")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When suspending to mem or disk the Thunderbolt controller typically goes
down as well tearing down the connection automatically. However, when
suspend to idle is used this does not happen so we need to make sure the
connection is properly disconnected before it can be re-established
during resume.
Fixes: e69b6c02b4c3 ("net: Add support for networking over Thunderbolt cable")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, if Wake-on-LAN is enabled, the SH-ETH device's module clock
is manually kept running during system suspend, to make sure the device
stays active.
Since commits 91c719f5ec6671f7 ("soc: renesas: rcar-sysc: Keep wakeup
sources active during system suspend") and 744dddcae84441b1 ("clk:
renesas: mstp: Keep wakeup sources active during system suspend"), this
workaround is no longer needed. Hence remove all explicit clock
handling to keep the device active.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, if Wake-on-LAN is enabled, the EtherAVB device's module clock
is manually kept running during system suspend, to make sure the device
stays active.
Since commit 91c719f5ec6671f7 ("soc: renesas: rcar-sysc: Keep wakeup
sources active during system suspend") , this workaround is no longer
needed. Hence remove all explicit clock handling to keep the device
active.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When forcing a specific link mode, the PHY driver must clear the
existing speed and duplex bits in BMCR while preserving some other
control bits. This logic was accidentally inverted with the introduction
of phy_modify().
Fixes: fea23fb591cc ("net: phy: convert read-modify-write to phy_modify()")
Signed-off-by: Ingo van Lil <inguin@gmx.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
IPv6 doesn't work on the MacchiatoBIN board. It is caused by broken
multicast address filter in the mvpp2 driver.
The driver loads doesn't load any multicast entries if "allmulti" is not
set. This condition should be reversed.
The condition !netdev_mc_empty(dev) is useless (because
netdev_for_each_mc_addr is nop if the list is empty).
This patch also fixes a possible overflow of the multicast list - if
mvpp2_prs_mac_da_accept fails, we set the allmulti flag and retry.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:
for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
done
with de-mangling cleanups yet to come.
NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do. But they keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.
The next patch from Al will sort out the final differences, and we
should be all done.
Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull networking fixes from David Miller:
1) Make allocations less aggressive in x_tables, from Minchal Hocko.
2) Fix netfilter flowtable Kconfig deps, from Pablo Neira Ayuso.
3) Fix connection loss problems in rtlwifi, from Larry Finger.
4) Correct DRAM dump length for some chips in ath10k driver, from Yu
Wang.
5) Fix ABORT handling in rxrpc, from David Howells.
6) Add SPDX tags to Sun networking drivers, from Shannon Nelson.
7) Some ipv6 onlink handling fixes, from David Ahern.
8) Netem packet scheduler interval calcualtion fix from Md. Islam.
9) Don't put crypto buffers on-stack in rxrpc, from David Howells.
10) Fix handling of error non-delivery status in netlink multicast
delivery over multiple namespaces, from Nicolas Dichtel.
11) Missing xdp flush in tuntap driver, from Jason Wang.
12) Synchonize RDS protocol netns/module teardown with rds object
management, from Sowini Varadhan.
13) Add nospec annotations to mpls, from Dan Williams.
14) Fix SKB truesize handling in TIPC, from Hoang Le.
15) Interrupt masking fixes in stammc from Niklas Cassel.
16) Don't allow ptr_ring objects to be sized outside of kmalloc's
limits, from Jason Wang.
17) Don't allow SCTP chunks to be built which will have a length
exceeding the chunk header's 16-bit length field, from Alexey
Kodanev.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (82 commits)
ibmvnic: Remove skb->protocol checks in ibmvnic_xmit
bpf: fix rlimit in reuseport net selftest
sctp: verify size of a new chunk in _sctp_make_chunk()
s390/qeth: fix SETIP command handling
s390/qeth: fix underestimated count of buffer elements
ptr_ring: try vmalloc() when kmalloc() fails
ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE
net: stmmac: remove redundant enable of PMT irq
net: stmmac: rename GMAC_INT_DEFAULT_MASK for dwmac4
net: stmmac: discard disabled flags in interrupt status register
ibmvnic: Reset long term map ID counter
tools/libbpf: handle issues with bpf ELF objects containing .eh_frames
selftests/bpf: add selftest that use test_libbpf_open
selftests/bpf: add test program for loading BPF ELF files
tools/libbpf: improve the pr_debug statements to contain section numbers
bpf: Sync kernel ABI header with tooling header for bpf_common.h
net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT
net: thunder: change q_len's type to handle max ring size
tipc: fix skb truesize/datasize ratio control
net/sched: cls_u32: fix cls_u32 on filter replace
...
|
|
Having these checks in ibmvnic_xmit causes problems with VLAN
tagging and balance-alb/tlb bonding modes. The restriction they
imposed can be removed.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For dwmac4, GMAC_INT_DEFAULT_ENABLE already includes
GMAC_INT_PMT_EN, so it is redundant to check if hw->pmt
is set, and if so, setting the bit again.
For dwmac1000, GMAC_INT_DEFAULT_MASK does not include
GMAC_INT_DISABLE_PMT, so it is redundant to check if
hw->pmt is set, and if so, clearing an already cleared bit.
Improve code readability by removing this redundant code.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|