summaryrefslogtreecommitdiff
path: root/drivers/net
AgeCommit message (Collapse)Author
2018-02-23net: fib_rules: Add new attribute to set protocolDonald Sharp
For ages iproute2 has used `struct rtmsg` as the ancillary header for FIB rules and in the process set the protocol value to RTPROT_BOOT. Until ca56209a66 ("net: Allow a rule to track originating protocol") the kernel rules code ignored the protocol value sent from userspace and always returned 0 in notifications. To avoid incompatibility with existing iproute2, send the protocol as a new attribute. Fixes: cac56209a66 ("net: Allow a rule to track originating protocol") Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-23net/mlx5: E-Switch, Reload IB interface when switching devlink modesMark Bloch
Up until this point it wasn't possible to activate IB representors when switching to switchdev mode, remove this limitation. We trigger reload of the PF IB interface in order to make sure that already allocated resources are invalid and new resources will be opened correctly with all the limitations of switchdev mode applied (only raw packet capabilities, without RoCE). We also move the remove/add to a place where the E-Switch mode is set/unset to better control when to trigger this action, this will allow the IB side to start in the correct mode. For better code reuse, create a function which reloads an interface and export it. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23net/mlx5: E-Switch, Optimize HW steering tables in switchdev modeMark Bloch
Under switchdev mode we insert an eswitch miss rule causing any unmatched traffic to be sent towards the PF vport. This miss rule can be optimized if we break it to two, one case is for multicast traffic and the other for unicast. Breaking the miss rule into two (unicast and multicast) allows the firmware to program the hardware in a more efficient way. Using ConncetX-5 Ex with IXIA and testpmd (which use IB representors): IXIA -> NIC -> PF -> IB representor -> NIC -> VF: - Without this optimization: 9.2 MPPS. - With this optimization: 18 MPPS. VF -> NIC -> IB representor-> PF -> NIC -> IXIA: - Without this optimization: 17 MPPS. - With this optimization: 23.4 MPPS. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23net/mlx5: E-Switch, Increase number of FTEs in FDB in switchdev modeMark Bloch
The max FTE number should be the max number of SQs that can be opened. Ethernet representors open one SQ each. Once we add IB representor this will increase (depends on the user). For now lets start with 31 per IB representor and if needed increase in the future. This increase only affects the number of FTEs in the slow path FDB, offloaded rules (done via TC on the fast path portion of the FDB) aren't affected. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23net/mlx5: E-Switch, Move representors definition to a global scopeMark Bloch
In preparation for IB representors, move representors structs to a global scope, also expose functions needed for registration, unregistration, eswitch mode and creating a flow rule to direct traffic from SQs to the right VF. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23net/mlx5: E-Switch, Add callback to get representor deviceMark Bloch
Add a callback interface to get a protocol device (per representor type). The Ethernet representors will expose their netdev via this interface. This functionality can be later used by IB representor in order to find the corresponding net device representor. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23r8169: simplify and improve check for dashHeiner Kallweit
r8168_check_dash() returns false anyway for all chip versions not supporting dash. So we can simplify the check conditions. In addition change the check functions to return bool instead of int, because they actually return a bool value. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-23r8169: disable WOL per defaultHeiner Kallweit
Currently, if BIOS enables WOL in the chip, settings are inconsistent because the device isn't marked as wakeup-enabled (if not done explicitly via userspace tools). This causes issues with suspend/ resume because mdio_bus_phy_may_suspend() checks whether device is wakeup-enabled. In detail MDIO bus access in phy_suspend() can fail because the MDIO bus is disabled. In the history of the driver we find two competing approaches: 8f9d5138035d "r8169: remember WOL preferences on driver load" prefers to preserve what the BIOS may have set, whilst bde135a672bf "r8169: only enable PCI wakeups when WOL is active" disabled PCI wakeup per default to work around a bug on one platform. Seems like nobody complained after the latter patch about non-working WOL, what makes me think that nobody uses WOL w/o configuring it explicitly. My opinion: Vast majority of users doesn't use WOL even if the BIOS enables it in the chip. And having WOL being active keeps the PHY(s) from powering down if being idle. If somebody needs WOL, he can enable it during boot, e.g. by configuring systemd.link/WakeOnLan. Therefore, to make WOL consistent again, disable it per default. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-23gianfar: simplify FCS handling and fix memory leakAndy Spencer
Previously, buffer descriptors containing only the frame check sequence (FCS) were skipped and not added to the skb. However, the page reference count was still incremented, leading to a memory leak. Fixing this inside gfar_add_rx_frag() is difficult due to reserved memory handling and page reuse. Instead, move the FCS handling to gfar_process_frame() and trim off the FCS before passing the skb up the networking stack. Signed-off-by: Andy Spencer <aspencer@spacex.com> Signed-off-by: Jim Gruen <jgruen@spacex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-23macvlan: fix use-after-free in macvlan_common_newlink()Alexey Kodanev
The following use-after-free was reported by KASan when running LTP macvtap01 test on 4.16-rc2: [10642.528443] BUG: KASAN: use-after-free in macvlan_common_newlink+0x12ef/0x14a0 [macvlan] [10642.626607] Read of size 8 at addr ffff880ba49f2100 by task ip/18450 ... [10642.963873] Call Trace: [10642.994352] dump_stack+0x5c/0x7c [10643.035325] print_address_description+0x75/0x290 [10643.092938] kasan_report+0x28d/0x390 [10643.137971] ? macvlan_common_newlink+0x12ef/0x14a0 [macvlan] [10643.207963] macvlan_common_newlink+0x12ef/0x14a0 [macvlan] [10643.275978] macvtap_newlink+0x171/0x260 [macvtap] [10643.334532] rtnl_newlink+0xd4f/0x1300 ... [10646.256176] Allocated by task 18450: [10646.299964] kasan_kmalloc+0xa6/0xd0 [10646.343746] kmem_cache_alloc_trace+0xf1/0x210 [10646.397826] macvlan_common_newlink+0x6de/0x14a0 [macvlan] [10646.464386] macvtap_newlink+0x171/0x260 [macvtap] [10646.522728] rtnl_newlink+0xd4f/0x1300 ... [10647.022028] Freed by task 18450: [10647.061549] __kasan_slab_free+0x138/0x180 [10647.111468] kfree+0x9e/0x1c0 [10647.147869] macvlan_port_destroy+0x3db/0x650 [macvlan] [10647.211411] rollback_registered_many+0x5b9/0xb10 [10647.268715] rollback_registered+0xd9/0x190 [10647.319675] register_netdevice+0x8eb/0xc70 [10647.370635] macvlan_common_newlink+0xe58/0x14a0 [macvlan] [10647.437195] macvtap_newlink+0x171/0x260 [macvtap] Commit d02fd6e7d293 ("macvlan: Fix one possible double free") handles the case when register_netdevice() invokes ndo_uninit() on error and as a result free the port. But 'macvlan_port_get_rtnl(dev))' check (returns dev->rx_handler_data), which was added by this commit in order to prevent double free, is not quite correct: * for macvlan it always returns NULL because 'lowerdev' is the one that was used to register rx handler (port) in macvlan_port_create() as well as to unregister it in macvlan_port_destroy(). * for macvtap it always returns a valid pointer because macvtap registers its own rx handler before macvlan_common_newlink(). Fixes: d02fd6e7d293 ("macvlan: Fix one possible double free") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22dsa: ptp: mark dummy helpers as 'inline'Arnd Bergmann
Declaring a static function in a header leads to a warning every time that header gets included without the function being used: In file included from drivers/net/dsa/mv88e6xxx/chip.c:42: drivers/net/dsa/mv88e6xxx/ptp.h:92:13: error: 'mv88e6xxx_hwtstamp_work' defined but not used [-Werror=unused-function] static long mv88e6xxx_hwtstamp_work(struct ptp_clock_info *ptp) In file included from drivers/net/dsa/mv88e6xxx/chip.c:38: drivers/net/dsa/mv88e6xxx/global2.h:355:12: error: 'mv88e6xxx_g2_wait' defined but not used [-Werror=unused-function] static int mv88e6xxx_g2_wait(struct mv88e6xxx_chip *chip, int reg, u16 mask) ^~~~~~~~~~~~~~~~~ drivers/net/dsa/mv88e6xxx/global2.h:350:12: error: 'mv88e6xxx_g2_update' defined but not used [-Werror=unused-function] static int mv88e6xxx_g2_update(struct mv88e6xxx_chip *chip, int reg, u16 update) ^~~~~~~~~~~~~~~~~~~ drivers/net/dsa/mv88e6xxx/global2.h:345:12: error: 'mv88e6xxx_g2_write' defined but not used [-Werror=unused-function] static int mv88e6xxx_g2_write(struct mv88e6xxx_chip *chip, int reg, u16 val) ^~~~~~~~~~~~~~~~~~ drivers/net/dsa/mv88e6xxx/global2.h:340:12: error: 'mv88e6xxx_g2_read' defined but not used [-Werror=unused-function] static int mv88e6xxx_g2_read(struct mv88e6xxx_chip *chip, int reg, u16 *val) This marks all such functions in dsa inline to make sure we don't warn about them. Fixes: c6fe0ad2c349 ("net: dsa: mv88e6xxx: add rx/tx timestamping support") Fixes: 0d632c3d6fe3 ("net: dsa: mv88e6xxx: add accessors for PTP/TAI registers") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22net: aquantia: Fix error handling in aq_pci_probe()Dan Carpenter
We should check "self->aq_hw" for allocation failure, and also we should free it on the error paths. Fixes: 23ee07ad3c2f ("net: aquantia: Cleanup pci functions module") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22nfp: advertise firmware for mixed 10G/25G modeDirk van der Merwe
The AMDA0099-0001 platform can support the 1x10G + 1x25G mixed mode operation. Recently, firmware has been added for this configuration mode. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22aquantia: add Makefiles to all directoriesJakub Kicinski
To be able to build separate objects we need to provide Kbuild with a Makefile in each directory. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22nfp: add Makefiles to all directoriesJakub Kicinski
To be able to build separate objects we need to provide Kbuild with a Makefile in each directory. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22Merge tag 'mac80211-next-for-davem-2018-02-22' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== Various updates across wireless. One thing to note: I've included a new ethertype that wireless uses (ETH_P_PREAUTH) in if_ether.h. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22Merge tag 'mac80211-for-davem-2018-02-22' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Various fixes across the tree, the shortlog basically says it all: cfg80211: fix cfg80211_beacon_dup -> old bug in this code cfg80211: clear wep keys after disconnection -> certain ways of disconnecting left the keys mac80211: round IEEE80211_TX_STATUS_HEADROOM up to multiple of 4 -> alignment issues with using 14 bytes mac80211: Do not disconnect on invalid operating class -> if the AP has a bogus operating class, let it be mac80211: Fix sending ADDBA response for an ongoing session -> don't send the same frame twice cfg80211: use only 1Mbps for basic rates in mesh -> interop issue with old versions of our code mac80211_hwsim: don't use WQ_MEM_RECLAIM -> it causes splats because it flushes work on a non-reclaim WQ regulatory: add NUL to request alpha2 -> nla_put_string() issue from Kees mac80211: mesh: fix wrong mesh TTL offset calculation -> protocol issue mac80211: fix a possible leak of station stats -> error path might leak memory mac80211: fix calling sleeping function in atomic context -> percpu allocations need to be made with gfp flags ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22ibmvnic: Split counters for scrq/pools/napiNathan Fontenot
The approach of one counter to rule them all when tracking the number of active sub-crqs, pools, and napi has problems handling some failover scenarios. This is due to the split in initializing the sub crqs, pools and napi in different places and the placement of updating the active counts. This patch simplifies this by having a counter for tx and rx sub-crqs, pools, and napi. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22net: dsa: mv88e6xxx: scratch registers and external MDIO pinsAndrew Lunn
MV88E6352 and later switches support GPIO control through the "Scratch & Misc" global2 register. Two of the pins controlled this way on the mv88e6390 family are the external MDIO pins. They can either by used as part of the MII interface for port 0, GPIOs, or MDIO. Add a function to configure them for MDIO, if possible, and call it when registering the external MDIO bus. Suggested-by: Russell King <rmk@armlinux.org.uk> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22ibmvnic: Fix TX descriptor trackingThomas Falcon
With the recent change, transmissions that only needed one descriptor were being missed. The result is that such packets were tracked as outstanding transmissions but never removed when its completion notification was received. Fixes: ffc385b95adb ("ibmvnic: Keep track of supplementary TX descriptors") Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22ibmvnic: Fix early release of login bufferThomas Falcon
The login buffer is released before the driver can perform sanity checks between resources the driver requested and what firmware will provide. Don't release the login buffer until the sanity check is performed. Fixes: 34f0f4e3f488 ("ibmvnic: Fix login buffer memory leaks") Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22net/smc9194: Remove bogus CONFIG_MAC referenceFinn Thain
AFAIK the only version of smc9194.c with Mac support is the one in the linux-mac68k CVS repo, which never made it to the mainline. Despite that, from v2.3.45, arch/m68k/config.in listed CONFIG_SMC9194 under CONFIG_MAC. This mistake got carried over into Kconfig in v2.5.55. (See pre-git era "[PATCH] add m68k dependencies to net driver config".) Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22Merge tag 'mlx5-updates-2018-02-21' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-updates-2018-02-21 This series includes shared code updates for mlx5 core driver for both netdev and rdma subsystems. By Saeed, First six patches of the series are meant to address a performance issue and should provide a performance boost for multi core IRQ interrupt hungry workloads. The issue is fixed in the first patch, all other patches are meant to refactor the code in light of this fix. The problem it comes to fix, is a shared spinlock accessed across all HCA IRQs which protects the CQ database. To solve this we simply move the CQ database and its spinlock to be per EQ (IRQ), thus per core. By Yonatan, Fragmented completion queue (CQ) for RDMA, core driver implementation to create fragmented CQ buffers rather than one large contiguous memory buffer, the implementation scheme already exist and used by the netdev CQs, the patch shares that code with the rdma CQ creation flow and makes use of the new API in mlx5_ib driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22smsc75xx: fix smsc75xx_set_features()Eric Dumazet
If an attempt is made to disable RX checksums, USB adapter is changed but netdev->features is not, because smsc75xx_set_features() returns a non zero value. This throws errors from netdev_rx_csum_fault() : <devname>: hw csum failure Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steve Glendinning <steve.glendinning@shawell.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-22net: faraday add nds32 support.Greentime Hu
This patch is used to support nds32 architecture to use these faraday mac IP. Signed-off-by: Greentime Hu <greentime@andestech.com> Acked-by: Arnd Bergmann <arnd@arndb.de>
2018-02-21ipvlan: selects master_l3 device instead of depending on itMatteo Croce
The L3 Master device is just a glue between the core networking code and device drivers, so it should be selected automatically rather than requiring to be enabled explicitly. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21ipvlan: drop ipv6 dependencyMatteo Croce
IPVlan has an hard dependency on IPv6, refactor the ipvlan code to allow compiling it with IPv6 disabled, move duplicate code into addr_equal() and refactor series of if-else into a switch. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21net: Allow a rule to track originating protocolDonald Sharp
Allow a rule that is being added/deleted/modified or dumped to contain the originating protocol's id. The protocol is handled just like a routes originating protocol is. This is especially useful because there is starting to be a plethora of different user space programs adding rules. Allow the vrf device to specify that the kernel is the originator of the rule created for this device. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21amd-xgbe: Restore PCI interrupt enablement setting on resumeTom Lendacky
After resuming from suspend, the PCI device support must re-enable the interrupt setting so that interrupts are actually delivered. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21ibmvnic: Correct goto target for tx irq initialization failureNathan Fontenot
When a failure occurs during initialization of the tx sub crq irqs, we should branch to the cleanup of the tx irqs. The current code branches to the rx irq cleanup and attempts to cleanup the rx irqs which have not been initialized. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21virtio_net: fix ndo_xdp_xmit crash towards dev not ready for XDPJesper Dangaard Brouer
When a driver implements the ndo_xdp_xmit() function, there is (currently) no generic way to determine whether it is safe to call. It is e.g. unsafe to call the drivers ndo_xdp_xmit, if it have not allocated the needed XDP TX queues yet. This is the case for virtio_net, which first allocates the XDP TX queues once an XDP/bpf prog is attached (in virtnet_xdp_set()). Thus, a crash will occur for virtio_net when redirecting to another virtio_net device's ndo_xdp_xmit, which have not attached a XDP prog. The sample xdp_redirect_map tries to attach a dummy XDP prog to take this into account, but it can also easily fail if the virtio_net (or actually underlying vhost driver) have not allocated enough extra queues for the device. Allocating more queue this is currently a manual config. Hint for libvirt XML add: <driver name='vhost' queues='16'> <host mrg_rxbuf='off'/> <guest tso4='off' tso6='off' ecn='off' ufo='off'/> </driver> The solution in this patch is to check that the device have loaded an XDP/bpf prog before proceeding. This is similar to the check performed in driver ixgbe. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21virtio_net: fix memory leak in XDP_REDIRECTJesper Dangaard Brouer
XDP_REDIRECT calling xdp_do_redirect() can fail for multiple reasons (which can be inspected by tracepoints). The current semantics is that on failure the driver calling xdp_do_redirect() must handle freeing or recycling the page associated with this frame. This can be seen as an optimization, as drivers usually have an optimized XDP_DROP code path for frame recycling in place already. The virtio_net driver didn't handle when xdp_do_redirect() failed. This caused a memory leak as the page refcnt wasn't decremented on failures. The function __virtnet_xdp_xmit() did handle one type of failure, when the xmit queue virtqueue_add_outbuf() is full, which "hides" releasing a refcnt on the page. Instead the function __virtnet_xdp_xmit() must follow API of xdp_do_redirect(), which on errors leave it up to the caller to free the page, of the failed send operation. Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21virtio_net: fix XDP code path in receive_small()Jesper Dangaard Brouer
When configuring virtio_net to use the code path 'receive_small()', in-order to get correct XDP_REDIRECT support, I discovered TCP packets would get silently dropped when loading an XDP program action XDP_PASS. The bug seems to be that receive_small() when XDP is loaded check that hdr->hdr.flags is zero, which seems wrong as hdr.flags contains the flags VIRTIO_NET_HDR_F_* : #define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 /* Use csum_start, csum_offset */ #define VIRTIO_NET_HDR_F_DATA_VALID 2 /* Csum is valid */ TCP got dropped as it had the VIRTIO_NET_HDR_F_DATA_VALID flag set. The flags that are relevant here are the VIRTIO_NET_HDR_GSO_* flags stored in hdr->hdr.gso_type. Thus, the fix is just check that none of the gso_type flags have been set. Fixes: bb91accf2733 ("virtio-net: XDP support for small buffers") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21virtio_net: disable XDP_REDIRECT in receive_mergeable() caseJesper Dangaard Brouer
The virtio_net code have three different RX code-paths in receive_buf(). Two of these code paths can handle XDP, but one of them is broken for at least XDP_REDIRECT. Function(1): receive_big() does not support XDP. Function(2): receive_small() support XDP fully and uses build_skb(). Function(3): receive_mergeable() broken XDP_REDIRECT uses napi_alloc_skb(). The simple explanation is that receive_mergeable() is broken because it uses napi_alloc_skb(), which violates XDP given XDP assumes packet header+data in single page and enough tail room for skb_shared_info. The longer explaination is that receive_mergeable() tries to work-around and satisfy these XDP requiresments e.g. by having a function xdp_linearize_page() that allocates and memcpy RX buffers around (in case packet is scattered across multiple rx buffers). This does currently satisfy XDP_PASS, XDP_DROP and XDP_TX (but only because we have not implemented bpf_xdp_adjust_tail yet). The XDP_REDIRECT action combined with cpumap is broken, and cause hard to debug crashes. The main issue is that the RX packet does not have the needed tail-room (SKB_DATA_ALIGN(skb_shared_info)), causing skb_shared_info to overlap the next packets head-room (in which cpumap stores info). Reproducing depend on the packet payload length and if RX-buffer size happened to have tail-room for skb_shared_info or not. But to make this even harder to troubleshoot, the RX-buffer size is runtime dynamically change based on an Exponentially Weighted Moving Average (EWMA) over the packet length, when refilling RX rings. This patch only disable XDP_REDIRECT support in receive_mergeable() case, because it can cause a real crash. IMHO we should consider NOT supporting XDP in receive_mergeable() at all, because the principles behind XDP are to gain speed by (1) code simplicity, (2) sacrificing memory and (3) where possible moving runtime checks to setup time. These principles are clearly being violated in receive_mergeable(), that e.g. runtime track average buffer size to save memory consumption. In the longer run, we should consider introducing a separate receive function when attaching an XDP program, and also change the memory model to be compatible with XDP when attaching an XDP prog. Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21Merge tag 'mlx5-fixes-2018-02-20' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2018-02-20 The following pull request includes some fixes for the mlx5 core and netdevice driver. Please pull and let me know if there's any issue. -stable 4.10.y: ('net/mlx5e: Fix loopback self test when GRO is off') -stable 4.12.y: ('net/mlx5e: Specify numa node when allocating drop rq') -stable 4.13.y: ('net/mlx5e: Verify inline header size do not exceed SKB linear size') -stable 4.15.y: ('net/mlx5e: Fix TCP checksum in LRO buffers') ('net/mlx5: Fix error handling when adding flow rules') ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21ibmvnic: Allocate max queues stats buffersNathan Fontenot
To avoid losing any stats when the number of sub-crqs change, allocate the max number of stats buffers so a stats buffer exists all possible sub-crqs. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21ibmvnic: Make napi usage dynamicNathan Fontenot
In order to handle the number of rx sub crqs changing during a driver reset, the ibmvnic driver also needs to update the number of napi. To do this the code to init and free napi's is moved to their own routines so they can be called during the reset process. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21ibmvnic: Free and re-allocate scrqs when tx/rx scrqs changeNathan Fontenot
When the driver resets it is possible that the number of tx/rx sub-crqs can change. This patch handles this so that the driver does not try to access non-existent sub-crqs. The count for releasing sub crqs depends on the adapter state. The active queue count is not set in probe, so if we are relasing in probe state we use the request queue count. Additionally, a parameter is added to release_sub_crqs() so that we know if the h_call to free the sub-crq needs to be made. In the reset path we have to do a reset of the main crq, which is a free followed by a register of the main crq. The free of main crq results in all of the sub crq's being free'ed. When updating sub-crq count in the reset path we do not want to h_free the sub-crqs, they are already free'ed. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21ibmvnic: Move active sub-crq count settingsNathan Fontenot
Inpreparation for using the active scrq count to track more active resources, move the setting of the active count to after initialization occurs in initial driver init and during driver reset. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21ibmvnic: Rename active queue count variablesNathan Fontenot
Rename the tx/rx active pool variables to be tx/rx active scrq counts. The tx/rx pools are per sub-crq so this is a more appropriate name. This also is a preparatory step for using thiese variables for handling updates to sub-crqs and napi based on the active count. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21net/mac8390: Fix log messagesFinn Thain
Use dev_foo() to log the slot number instead of the unexpanded "eth%d" format string. Disambiguate the two identical "Card type %s is unsupported" messages. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21net/mac8390: Convert to nubus_driverFinn Thain
This resolves an old bug that constrained this driver to no more than one card. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21net/8390: Fix msg_enable patch snafuFinn Thain
The lib8390 module parameter 'msg_enable' doesn't do anything useful: it causes an ancient version string to be logged. Remove redundant code that logs the same string. In ne.c and wd.c, the value of ei_local->msg_enable is used before being assigned. Use ne_msg_enable and wd_msg_enable, respectively. Most of the other 8390 drivers never assign ei_local->msg_enable. Use the 'msg_enable' module parameter from lib8390 as the default value. Eliminate the pointless static and local variables. Clean up an indentation mistake. All of these issues originated from the same patch. Cc: Russell King <linux@armlinux.org.uk> Fixes: c45f812f0280 ("8390 : Replace ei_debug with msg_enable/NETIF_MSG_* feature") Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21net/8390: Remove redundant make dependenciesFinn Thain
The hydra, zorro8390 and mcf8390 drivers all #include "lib8390.c" and have no need for 8390.o. modinfo confirms no dependency on 8390.ko. Drop the redundant dependency from the Makefile. objdump confirms that this patch has no effect on the module binaries. The superfluous additions of 8390.o were introduced in commit 644570b83026 ("8390: Move the 8390 related drivers"). Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21r8169: remove not needed PHY soft reset in rtl8168e_2_hw_phy_configHeiner Kallweit
rtl8169_init_phy() resets the PHY anyway after applying the chip-specific PHY configuration. So we don't need to soft-reset the PHY as part of the chip-specific configuration. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21ieee802154: Add NXP MCR20A IEEE 802.15.4 transceiver driverXue Liu
The MCR20AVHM transceiver (or MCR20A) is a low power, high-performance 2.4 GHz, IEEE 802.15.4 compliant transceiver. This driver implements a subset of ieee802154_ops. It has no support for CSMA due to lack of hardware support. It has currently no support for its proprietary Dual-PAN feature. https://www.nxp.com/docs/en/reference-manual/MCR20RM.pdf Signed-off-by: Xue Liu <liuxuenetmail@gmail.com> Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
2018-02-20r8169: remove some WOL-related dead codeHeiner Kallweit
Commit bde135a672bf "r8169: only enable PCI wakeups when WOL is active" removed the only user of flag RTL_FEATURE_WOL. So let's remove some now dead code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-20net/mlx5: Fix error handling when adding flow rulesVlad Buslov
If building match list or adding existing fg fails when node is locked, function returned without unlocking it. This happened if node version changed or adding existing fg returned with EAGAIN after jumping to search_again_locked label. Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-20net/mlx5: E-Switch, Fix drop counters use before creationEugenia Emantayev
First use of drop counters happens in esw_apply_vport_conf function, while they are allocated later in the flow. Fix that by moving esw_vport_create_drop_counters function to be called before the first use. Fixes: b8a0dbe3a90b ("net/mlx5e: E-switch, Add steering drop counters") Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-20net/mlx5: Add header re-write to the checks for conflicting actionsOr Gerlitz
We can't allow only some of the rules sharing an FTE to ask for header re-write, add it to the conflicting action checks. Fixes: 0d235c3fabb7 ('net/mlx5: Add hash table to search FTEs in a flow-group') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>