summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)Author
2018-04-19net-next: ax88796: Do not free IRQ in ax_remove() (already freed in ax_close()).Michael Karcher
This complements the fix in 82533ad9a1c ("net: ethernet: ax88796: don't call free_irq without request_irq first") that removed the free_irq call in the error path of probe, to also not call free_irq when remove is called to revert the effects of probe. Fixes: 82533ad9a1c (net: ethernet: ax88796: don't call free_irq without request_irq first) Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-19net-next: ax88796: Attach MII bus only when openMichael Karcher
Call ax_mii_init in ax_open(), and unregister/remove mdiobus resources in ax_close(). This is needed to be able to unload the module, as the module is busy while the MII bus is attached. Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-19net-next: ax88796: Fix MAC address readingMichael Karcher
To read the MAC address from the (virtual) SAprom, the remote DMA unit needs to be set up like for every other process access to card-local memory. Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-19net/ipv6: Rename fib6_info struct elementsDavid Ahern
Change the prefix for fib6_info struct elements from rt6i_ to fib6_. rt6i_pcpu and rt6i_exception_bucket are left as is given that they point to rt6_info entries. Rename only; not functional change intended. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-19net-next/hinic: add arm64 supportZhao Chen
This patch enables arm64 platform support for the HINIC driver. Signed-off-by: Zhao Chen <zhaochen6@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-19net: stmmac: Disable ACS Feature for GMAC >= 4Jose Abreu
ACS Feature is currently enabled for GMAC >= 4 but the llc_snap status is never checked in descriptor rx_status callback. This will cause stmmac to always strip packets even that ACS feature is already stripping them. Lets be safe and disable the ACS feature for GMAC >= 4 and always strip the packets for this GMAC version. Fixes: 477286b53f55 ("stmmac: add GMAC4 core support") Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Joao Pinto <jpinto@synopsys.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-19net: mvpp2: Fix DMA address mask sizeMaxime Chevallier
PPv2 TX/RX descriptors uses 40bits DMA addresses, but 41 bits masks were used (GENMASK_ULL(40, 0)). This commit fixes that by using the correct mask. Fixes: e7c5359f2eed ("net: mvpp2: introduce PPv2.2 HW descriptors and adapt accessors") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: qualcomm: rmnet: Fix warning seen with fill_infoSubash Abhinov Kasiviswanathan
When the last rmnet device attached to a real device is removed, the real device is unregistered from rmnet. As a result, the real device lookup fails resulting in a warning when the fill_info handler is called as part of the rmnet device unregistration. Fix this by returning the rmnet flags as 0 when no real device is present. WARNING: CPU: 0 PID: 1779 at net/core/rtnetlink.c:3254 rtmsg_ifinfo_build_skb+0xca/0x10d Modules linked in: CPU: 0 PID: 1779 Comm: ip Not tainted 4.16.0-11872-g7ce2367 #1 Stack: 7fe655f0 60371ea3 00000000 00000000 60282bc6 6006b116 7fe65600 60371ee8 7fe65660 6003a68c 00000000 900000000 Call Trace: [<6006b116>] ? printk+0x0/0x94 [<6001f375>] show_stack+0xfe/0x158 [<60371ea3>] ? dump_stack_print_info+0xe8/0xf1 [<60282bc6>] ? rtmsg_ifinfo_build_skb+0xca/0x10d [<6006b116>] ? printk+0x0/0x94 [<60371ee8>] dump_stack+0x2a/0x2c [<6003a68c>] __warn+0x10e/0x13e [<6003a82c>] warn_slowpath_null+0x48/0x4f [<60282bc6>] rtmsg_ifinfo_build_skb+0xca/0x10d [<60282c4d>] rtmsg_ifinfo_event.part.37+0x1e/0x43 [<60282c2f>] ? rtmsg_ifinfo_event.part.37+0x0/0x43 [<60282d03>] rtmsg_ifinfo+0x24/0x28 [<60264e86>] dev_close_many+0xba/0x119 [<60282cdf>] ? rtmsg_ifinfo+0x0/0x28 [<6027c225>] ? rtnl_is_locked+0x0/0x1c [<6026ca67>] rollback_registered_many+0x1ae/0x4ae [<600314be>] ? unblock_signals+0x0/0xae [<6026cdc0>] ? unregister_netdevice_queue+0x19/0xec [<6026ceec>] unregister_netdevice_many+0x21/0xa1 [<6027c765>] rtnl_delete_link+0x3e/0x4e [<60280ecb>] rtnl_dellink+0x262/0x29c [<6027c241>] ? rtnl_get_link+0x0/0x3e [<6027f867>] rtnetlink_rcv_msg+0x235/0x274 Fixes: be81a85f5f87 ("net: qualcomm: rmnet: Implement fill_info") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: remove jumbo_tx_csum from chip config structHeiner Kallweit
According to the chip configuration entries only RTL8169 (ver <= 06) supports tx checksumming for jumbo packets. By the way: constant JUMBO_1K is a little misleading because it refers to the standard packet size and not to a jumbo packet size. By implementing this rule we can get rid of configuring tx checksumming support per chip type. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: improve pci region handlingHeiner Kallweit
The region to be used is always the first of type IORESOURCE_MEM. We can implement this rule directly w/o having to specify which region is the first one per configuration entry. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: drop member txd_version from struct rtl8169_privateHeiner Kallweit
txd_version is used in rtl_init_one() only, so we can drop member txd_version from struct rtl8169_private. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: improve rtl8169_get_mac_versionHeiner Kallweit
Certain entries in array mac_info[] are redundant, so remove them: 0x7cf, 0x2c200000 (VER 33): matched by entry 0x7c8, 0x2c000000 0x7cf, 0x28300000 (VER 26): matched by entry 0x7c8, 0x28000000 0x7cf, 0x3cb00000 (VER 24): matched by entry 0x7c8, 0x3c800000 0x7cf, 0x3c400000 (VER 22): matched by entry 0x7c8, 0x3c000000 0x7cf, 0x38500000 (VER 17): matched by entry 0x7c8, 0x38000000 0x7cf, 0x44900000 (VER 39): matched by entry 0x7c8, 0x44800000 0x7cf, 0x40b00000 (VER 30): matched by entry 0x7c8, 0x40800000 0x7cf, 0x40a00000 (VER 30): matched by entry 0x7c8, 0x40800000 0x7cf, 0x34a00000 (VER 09): matched by entry 0x7c8, 0x34800000 0x7cf, 0x24a00000 (VER 09): matched by entry 0x7c8, 0x24800000 In addition don't mask out bits 30 and 29 when printing the XID. Most likely this is a relict from the times when the driver covered RTL8169 chip version only. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: don't display tp->mmio_addr addressHeiner Kallweit
For security reasons since commit ad67b74d2469 "printk: hash addresses printed with %p" %p doesn't display the full address any longer. We could switch to %px, but I think the pointer address doesn't provide a real benefit, so remove printing the hashed address. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: drop member opts1_mask from struct rtl8169_privateHeiner Kallweit
We can get rid of member opts1_mask and in addition save a few cpu cycles in the hot path of rtl_rx(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: change interrupt handler argument typeHeiner Kallweit
Code can be a little simplified by switching the interrupt handler argument type to struct rtl8169_private *. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: change argument type of counters handling functionsHeiner Kallweit
The counter handling functions don't deal with the net_device, so code can be simplified by changing the argument type to struct rtl8169_private *. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: change hw_start argument typeHeiner Kallweit
Code can be simplified by changing the argument type of hw_start callbacks from struct net_device * to struct rtl8169_private *. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: remove rtl8169_map_to_asicHeiner Kallweit
This function is very simple and used only once, so we can inline the two statements. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: replace rx_buf_sz with a constantHeiner Kallweit
rx_buf_sz is constant, so we don't have to pass it as parameter and in general can replace it with a constant. When working on this I noticed that also before in rtl_set_rx_max_size() a value of 0x4000 is set, what is not in line with the chip spec. According to the spec only bits 0..13 are used and we set an effective value of zero therefore. However, the driver still seems to work and due to potential side effects I'm reluctant to make a change. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: remove unneeded check in rtl8169_rx_fillHeiner Kallweit
rtl8169_rx_fill() is called only once and directly before the call array tp->Rx_databuff[] is filled with zero's. Therefore we don't need this check. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: improve rtl8169_init_ringHeiner Kallweit
This function doesn't use the net_device, therefore change the parameter to type struct rtl8169_private * to simplify the code. In addition we don't need the calculations in the memset statements, we can use the size of the arrays directly. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: simplify rtl8169_alloc_rx_dataHeiner Kallweit
dev->dev.parent has the same value as tp_to_dev(tp) (set by SET_NETDEV_DEV() in rtl_init_one()) and we know it can't be NULL. This allows us to simplify the code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: switch to napi_schedule_irqoffHeiner Kallweit
napi_schedule() is called from hard irq context, so we can switch to napi_schedule_irqoff() and avoid some overhead. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: use constant NAPI_POLL_WAITHeiner Kallweit
We can use generic constant NAPI_POLL_WAIT instead of defining an own constant for the same value. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: use skb_copy_to_linear_data in rtl8169_try_rx_copyHeiner Kallweit
Not a giant leap for mankind, but let's avoid the open-coded memcpy and use standard helper skb_copy_to_linear_data instead. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: remove member align from struct rtl_cfg_infoHeiner Kallweit
Since commit 6f0333b8fde4 "r8169: use 50% less ram for RX ring" member align isn't used any longer, so remove it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: remove unused member features from structHeiner Kallweit
Member features of struct rtl8169_private isn't used any longer since commit 6c6aa15fdea5 "r8169: improve interrupt handling", so remove it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: k2g: add promiscuous mode supportWingMan Kwok
This patch adds support for promiscuous mode in k2g's network driver. When upper layer instructs to transition from non-promiscuous mode to promiscuous mode or vice versa K2G network driver needs to configure ALE accordingly so that in case of non-promiscuous mode, ALE will not flood all unicast packets to host port, while in promiscuous mode, it will pass all received unicast packets to host port. Signed-off-by: WingMan Kwok <w-kwok2@ti.com> Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: add api to support set rx mode in netcp modulesWingMan Kwok
This patch adds an API to support setting rx mode in netcp modules. If a netcp module needs to be notified when upper layer transitions from one rx mode to another and react accordingly, such a module will implement the new API set_rx_mode added in this patch. Currently rx modes supported are PROMISCUOUS and NON_PROMISCUOUS modes. Signed-off-by: WingMan Kwok <w-kwok2@ti.com> Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: support probe deferralMurali Karicheri
The netcp driver shouldn't proceed until the knav qmss and dma devices are ready. So return -EPROBE_DEFER if these devices are not ready. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18Revert "net: netcp: remove dead code from the driver"Murali Karicheri
As the probe sequence is not guaranteed contrary to the assumption of the commit 2d8e276a9030, same has to be reverted. commit 2d8e276a9030 ("net: netcp: remove dead code from the driver") Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: use of_get_phy_mode() to support different RGMII modesMurali Karicheri
The phy used for K2G allows for internal delays to be added optionally to the clock circuitry based on board desing. To add this support, enhance the driver to use of_get_phy_mode() to read the phy-mode from the phy device and pass the same to phy through of_phy_connect(). Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: re-use stats handling code for 2u hardwareMurali Karicheri
The stats block in 2u cpsw hardware is similar to the one on nu and hence handle it in a similar way by using a macro that includes 2u hardware as well. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: map vlan priorities to zero flowMurali Karicheri
The driver currently support only vlan priority zero. So map the vlan priorities to zero flow in hardware. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: use rgmii link status for 2u cpsw hardwareMurali Karicheri
Introduce rgmii link status to handle link state events for 2u cpsw hardware on K2G. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: add support for handling rgmii link interfaceMurali Karicheri
2u cpsw hardware on K2G uses rgmii link to interface with Phy. So add support for this interface in the code so that driver can be re-used for 2u hardware. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: make sgmii configuration conditionalMurali Karicheri
As a preparatory patch to add support for 2u cpsw hardware found on K2G SoC, make sgmii configuration conditional. This is required since 2u uses RGMII interface instead of SGMII and to allow for driver re-use. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: use macro for checking ss_version consistentlyMurali Karicheri
Driver currently uses macro for NU and XBE hardwrae, while other places for older hardware such as that on K2H/K SoC (version 1.4 of the cpsw hardware, it explicitly check for the ss_version inline. Add a new macro for version 1.4 and use it to customize code in the driver. While at it also fix similar issue with checking XBE version by re-using existing macro IS_SS_ID_XGBE(). Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18bpf: make netronome nfp compatible w/ bpf_xdp_adjust_tailNikita V. Shirokov
w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as well (only "decrease" of pointer's location is going to be supported). changing of this pointer will change packet's size. for nfp driver we will just calculate packet's length unconditionally Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-18bpf: make cavium thunder compatible w/ bpf_xdp_adjust_tailNikita V. Shirokov
w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as well (only "decrease" of pointer's location is going to be supported). changing of this pointer will change packet's size. for cavium's thunder driver we will just calculate packet's length unconditionally Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-18bpf: make bnxt compatible w/ bpf_xdp_adjust_tailNikita V. Shirokov
w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as well (only "decrease" of pointer's location is going to be supported). changing of this pointer will change packet's size. for bnxt driver we will just calculate packet's length unconditionally Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-18bpf: make mlx4 compatible w/ bpf_xdp_adjust_tailNikita V. Shirokov
w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as well (only "decrease" of pointer's location is going to be supported). changing of this pointer will change packet's size. for mlx4 driver we will just calculate packet's length unconditionally (the same way as it's already being done in mlx5) Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-17net/ipv6: Flip FIB entries to fib6_infoDavid Ahern
Convert all code paths referencing a FIB entry from rt6_info to fib6_info. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17net/ipv6: Move nexthop data to fib6_nhDavid Ahern
Introduce fib6_nh structure and move nexthop related data from rt6_info and rt6_info.dst to fib6_nh. References to dev, gateway or lwtstate from a FIB lookup perspective are converted to use fib6_nh; datapath references to dst version are left as is. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17sfc: check RSS is active for filter insertBert Kenward
For some firmware variants - specifically 'capture packed stream' - RSS filters are not valid. We must check if RSS is actually active rather than merely enabled. Fixes: 42356d9a137b ("sfc: support RSS spreading of ethtool ntuple filters") Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17cxgb4vf: display pause settingsGanesh Goudar
Add support to display pause settings Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17xdp: transition into using xdp_frame for ndo_xdp_xmitJesper Dangaard Brouer
Changing API ndo_xdp_xmit to take a struct xdp_frame instead of struct xdp_buff. This brings xdp_return_frame and ndp_xdp_xmit in sync. This builds towards changing the API further to become a bulk API, because xdp_buff is not a queue-able object while xdp_frame is. V4: Adjust for commit 59655a5b6c83 ("tuntap: XDP_TX can use native XDP") V7: Adjust for commit d9314c474d4f ("i40e: add support for XDP_REDIRECT") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17xdp: transition into using xdp_frame for return APIJesper Dangaard Brouer
Changing API xdp_return_frame() to take struct xdp_frame as argument, seems like a natural choice. But there are some subtle performance details here that needs extra care, which is a deliberate choice. When de-referencing xdp_frame on a remote CPU during DMA-TX completion, result in the cache-line is change to "Shared" state. Later when the page is reused for RX, then this xdp_frame cache-line is written, which change the state to "Modified". This situation already happens (naturally) for, virtio_net, tun and cpumap as the xdp_frame pointer is the queued object. In tun and cpumap, the ptr_ring is used for efficiently transferring cache-lines (with pointers) between CPUs. Thus, the only option is to de-referencing xdp_frame. It is only the ixgbe driver that had an optimization, in which it can avoid doing the de-reference of xdp_frame. The driver already have TX-ring queue, which (in case of remote DMA-TX completion) have to be transferred between CPUs anyhow. In this data area, we stored a struct xdp_mem_info and a data pointer, which allowed us to avoid de-referencing xdp_frame. To compensate for this, a prefetchw is used for telling the cache coherency protocol about our access pattern. My benchmarks show that this prefetchw is enough to compensate the ixgbe driver. V7: Adjust for commit d9314c474d4f ("i40e: add support for XDP_REDIRECT") V8: Adjust for commit bd658dda4237 ("net/mlx5e: Separate dma base address and offset in dma_sync call") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17mlx5: use page_pool for xdp_return_frame callJesper Dangaard Brouer
This patch shows how it is possible to have both the driver local page cache, which uses elevated refcnt for "catching"/avoiding SKB put_page returns the page through the page allocator. And at the same time, have pages getting returned to the page_pool from ndp_xdp_xmit DMA completion. The performance improvement for XDP_REDIRECT in this patch is really good. Especially considering that (currently) the xdp_return_frame API and page_pool_put_page() does per frame operations of both rhashtable ID-lookup and locked return into (page_pool) ptr_ring. (It is the plan to remove these per frame operation in a followup patchset). The benchmark performed was RX on mlx5 and XDP_REDIRECT out ixgbe, with xdp_redirect_map (using devmap) . And the target/maximum capability of ixgbe is 13Mpps (on this HW setup). Before this patch for mlx5, XDP redirected frames were returned via the page allocator. The single flow performance was 6Mpps, and if I started two flows the collective performance drop to 4Mpps, because we hit the page allocator lock (further negative scaling occurs). Two test scenarios need to be covered, for xdp_return_frame API, which is DMA-TX completion running on same-CPU or cross-CPU free/return. Results were same-CPU=10Mpps, and cross-CPU=12Mpps. This is very close to our 13Mpps max target. The reason max target isn't reached in cross-CPU test, is likely due to RX-ring DMA unmap/map overhead (which doesn't occur in ixgbe to ixgbe testing). It is also planned to remove this unnecessary DMA unmap in a later patchset V2: Adjustments requested by Tariq - Changed page_pool_create return codes not return NULL, only ERR_PTR, as this simplifies err handling in drivers. - Save a branch in mlx5e_page_release - Correct page_pool size calc for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ V5: Updated patch desc V8: Adjust for b0cedc844c00 ("net/mlx5e: Remove rq_headroom field from params") V9: - Adjust for 121e89275471 ("net/mlx5e: Refactor RQ XDP_TX indication") - Adjust for 73281b78a37a ("net/mlx5e: Derive Striding RQ size from MTU") - Correct handling if page_pool_create fail for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ V10: Req from Tariq - Change pool_size calc for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17page_pool: refurbish version of page_pool codeJesper Dangaard Brouer
Need a fast page recycle mechanism for ndo_xdp_xmit API for returning pages on DMA-TX completion time, which have good cross CPU performance, given DMA-TX completion time can happen on a remote CPU. Refurbish my page_pool code, that was presented[1] at MM-summit 2016. Adapted page_pool code to not depend the page allocator and integration into struct page. The DMA mapping feature is kept, even-though it will not be activated/used in this patchset. [1] http://people.netfilter.org/hawk/presentations/MM-summit2016/generic_page_pool_mm_summit2016.pdf V2: Adjustments requested by Tariq - Changed page_pool_create return codes, don't return NULL, only ERR_PTR, as this simplifies err handling in drivers. V4: many small improvements and cleanups - Add DOC comment section, that can be used by kernel-doc - Improve fallback mode, to work better with refcnt based recycling e.g. remove a WARN as pointed out by Tariq e.g. quicker fallback if ptr_ring is empty. V5: Fixed SPDX license as pointed out by Alexei V6: Adjustments requested by Eric Dumazet - Adjust ____cacheline_aligned_in_smp usage/placement - Move rcu_head in struct page_pool - Free pages quicker on destroy, minimize resources delayed an RCU period - Remove code for forward/backward compat ABI interface V8: Issues found by kbuild test robot - Address sparse should be static warnings - Only compile+link when a driver use/select page_pool, mlx5 selects CONFIG_PAGE_POOL, although its first used in two patches Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>