summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-08-31net: ipv4: remove unused arg exact_dif in compute_scoreMiaohe Lin
The arg exact_dif is not used anymore, remove it. inet_exact_dif_match() is no longer needed after the above is removed, so remove it too. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: ipv6: remove unused arg exact_dif in compute_scoreMiaohe Lin
The arg exact_dif is not used anymore, remove it. inet6_exact_dif_match() is no longer needed after the above is removed, remove it too. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: ethernet: ti: am65-cpsw: fix rmii 100Mbit link modeGrygorii Strashko
In RMII link mode it's required to set bit 15 IFCTL_A in MAC_SL MAC_CONTROL register to enable support for 100Mbit link speed. Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31Merge branch 'net-phy-add-Lynx-PCS-MDIO-module'David S. Miller
Ioana Ciornei says: ==================== net: phy: add Lynx PCS MDIO module Add support for the Lynx PCS as a separate module in drivers/net/phy/. The advantage of this structure is that multiple ethernet or switch drivers used on NXP hardware (ENETC, Seville, Felix DSA switch etc) can share the same implementation of PCS configuration and runtime management. The module implements phylink_pcs_ops and exports a phylink_pcs (incorporated into a lynx_pcs) which can be directly passed to phylink through phylink_pcs_set. The first 3 patches add some missing pieces in phylink and the locked mdiobus write accessor. Next, the Lynx PCS MDIO module is added as a standalone module. The majority of the code is extracted from the Felix DSA driver. The last patch makes the necessary changes in the Felix and Seville drivers in order to use the new common PCS implementation. At the moment, USXGMII (only with in-band AN), SGMII, QSGMII (with and without in-band AN) and 2500Base-X (only w/o in-band AN) are supported by the Lynx PCS MDIO module since these were also supported by Felix and no functional change is intended at this time. Changes in v2: * got rid of the mdio_lynx_pcs structure and directly exported the functions without the need of an indirection * made the necessary adjustments for this in the Felix DSA driver * solved the broken allmodconfig build test by making the module tristate instead of bool * fixed a memory leakage in the Felix driver (the pcs structure was allocated twice) Changes in v3: * added support for PHYLINK PCS ops in DSA (patch 5/9) * cleanup in Felix PHYLINK operations and migrate to phylink_mac_link_up() being the callback of choice for applying MAC configuration (patches 6-8) Changes in v4: * use the newly introduced phylink PCS mechanism * install the phylink_pcs in the phylink_mac_config DSA ops * remove the direct implementations of the PCS ops * do no use the SGMII_ prefix when referring to the IF_MORE register * add a phylink helper to decode the USXGMII code word * remove cleanup patches for Felix (these have been already accepted) * Seville (recently introduced) now has PCS support through the same Lynx PCS module Changes in v5: - move the pcs-lynx driver to drivers/net/pcs - reword the commit message a bit in 4/5 - add error checking and error propagation in 4/5 - s/IF_MODE_DUPLEX/IF_MODE_HALF_DUPLEX in 4/5 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: dsa: ocelot: use the Lynx PCS helpers in Felix and SevilleIoana Ciornei
Use the helper functions introduced by the newly added Lynx PCS MDIO module in the Felix VSC9959 and Seville VSC9953. Instead of representing the PCS as a phy_device, a mdio_device structure will be passed to the Lynx module which is now actually implementing all the PCS configuration and status reporting. All code previously used for PCS monitoring and runtime configuration is removed and replaced will calls to the Lynx PCS operations. Tested on the following SERDES protocols of LS1028A: 0x7777 (2500Base-X), 0x85bb (QSGMII), 0x9999 (SGMII) and 0x13bb (USXGMII). Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: phy: add Lynx PCS moduleIoana Ciornei
Add a Lynx PCS module which exposes the necessary operations to drive the PCS using phylink. The majority of the code is extracted from the Felix DSA driver, which will be also modified in a later patch, and exposed as a separate module for code reusability purposes. As such, this aims at feature and bug parity with the existing Felix DSA driver, and thus USXGMII, SGMII, QSGMII and 2500Base-X (only w/o in-band AN) are supported by the Lynx PCS module since these were also supported by Felix. The module can only be enabled by the drivers in need and not user selectable. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: mdiobus: add clause 45 mdiobus write accessorIoana Ciornei
Add the locked variant of the clause 45 mdiobus write accessor - mdiobus_c45_write(). Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: phylink: consider QSGMII interface mode in phylink_mii_c22_pcs_get_stateIoana Ciornei
The same link partner advertisement word is used for both QSGMII and SGMII, thus treat both interface modes using the same phylink_decode_sgmii_word() function. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: phylink: add helper function to decode USXGMII wordIoana Ciornei
With the new addition of the USXGMII link partner ability constants we can now introduce a phylink helper that decodes the USXGMII word and populates the appropriate fields in the phylink_link_state structure based on them. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31Merge tag 'docs-5.9-3' of git://git.lwn.net/linuxLinus Torvalds
Pull documentation fixes from Jonathan Corbet: "A handful of documentation fixes for 5.9" * tag 'docs-5.9-3' of git://git.lwn.net/linux: Documentation: laptops: thinkpad-acpi: fix underline length build warning Documentation: fix typo for abituguru documentation docs: Fix function name trailing double-()s devices.txt: fix typo of "ubd" as "udb" Documentation: add riscv entry in list of existing profiles MAINTAINERS: mention documentation maintainer entry profile Fpga: Documentation: Replace deprecated :c:func: Usage IIO: Documentation: Replace deprecated :c:func: Usage Documentation/locking/locktypes: fix local_locks documentation
2020-08-31net/wan/fsl_ucc_hdlc: Add MODULE_DESCRIPTIONYueHaibing
Add missing MODULE_DESCRIPTION. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: hns: Remove unused macro AE_NAME_PORT_ID_IDXYueHaibing
There is no caller in tree. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: dl2k: Remove unused macro DRV_NAMEYueHaibing
There is no caller in tree any more. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: wan: slic_ds26522: Remove unused macro DRV_NAMEYueHaibing
There is no caller in tree any more. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31tipc: Remove unused macro TIPC_NACK_INTVYueHaibing
There is no caller in tree any more. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31tipc: Remove unused macro TIPC_FWD_MSGYueHaibing
There is no caller in tree any more. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31mptcp: Remove unused macro MPTCP_SAME_STATEYueHaibing
There is no caller in tree any more. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: clean up codestyleMiaohe Lin
This is a pure codestyle cleanup patch. No functional change intended. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: Use helper macro IP_MAX_MTU in __ip_append_data()Miaohe Lin
What 0xFFFF means here is actually the max mtu of a ip packet. Use help macro IP_MAX_MTU here. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: ethernet: ti: am65-cpts: fix i2083 genf (and estf) Reconfiguration IssueGrygorii Strashko
The new bit TX_GENF_CLR_EN has been added in AM65x SR2.0 to fix i2083 errata, which can be just set unconditionally for all SoCs. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31Merge branch 'sfc-clean-up-some-W-1-build-warnings'David S. Miller
Edward Cree says: ==================== sfc: clean up some W=1 build warnings A collection of minor fixes to issues flagged up by W=1. After this series, the only remaining warnings in the sfc driver are some 'member missing in kerneldoc' warnings from ptp.c. Tested by building on x86_64 and running 'ethtool -p' on an EF10 NIC; there was no error, but I couldn't observe the actual LED as I'm working remotely. [ Incidentally, ethtool_phys_id()'s behaviour on an error return looks strange — if I'm reading it right, it will break out of the inner loop but not the outer one, and eventually return the rc from the last run of the inner loop. Is this intended? ] ==================== Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31sfc: return errors from efx_mcdi_set_id_led, and de-indirectEdward Cree
W=1 warnings indicated that 'rc' was unused in efx_mcdi_set_id_led(); change the function to return int instead of void and plumb the rc through the caller efx_ethtool_phys_id(). Since (post-Falcon) all sfc NICs use MCDI for this, there's no point in indirecting through a nic_type method, so remove that and just call efx_mcdi_set_id_led() directly. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31sfc: fix kernel-doc on struct efx_loopback_stateEdward Cree
Missing 'struct' keyword caused "cannot understand function prototype" warnings. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31sfc: fix unused-but-set-variable warning in efx_farch_filter_remove_safeEdward Cree
Thanks to some past refactor, 'spec' is not actually used in this function; the code using it moved to the callee efx_farch_filter_remove. Remove the variable to fix a W=1 warning. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31sfc: fix W=1 warnings in efx_farch_handle_rx_not_okEdward Cree
Some of these RX-event flags aren't used at all, so remove them. Others are used only #ifdef DEBUG to log a message; suppress the unused-var warnings #ifndef DEBUG with a void cast. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31cxgb4: fix thermal zone device registrationPotnuri Bharat Teja
When multiple adapters are present in the system, pci hot-removing second adapter leads to the following warning as both the adapters registered thermal zone device with same thermal zone name/type. Therefore, use unique thermal zone name during thermal zone device initialization. Also mark thermal zone dev NULL once unregistered. [ 414.370143] ------------[ cut here ]------------ [ 414.370944] sysfs group 'power' not found for kobject 'hwmon0' [ 414.371747] WARNING: CPU: 9 PID: 2661 at fs/sysfs/group.c:281 sysfs_remove_group+0x76/0x80 [ 414.382550] CPU: 9 PID: 2661 Comm: bash Not tainted 5.8.0-rc6+ #33 [ 414.383593] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.0a 06/23/2016 [ 414.384669] RIP: 0010:sysfs_remove_group+0x76/0x80 [ 414.385738] Code: 48 89 df 5b 5d 41 5c e9 d8 b5 ff ff 48 89 df e8 60 b0 ff ff eb cb 49 8b 14 24 48 8b 75 00 48 c7 c7 90 ae 13 bb e8 6a 27 d0 ff <0f> 0b 5b 5d 41 5c c3 0f 1f 00 0f 1f 44 00 00 48 85 f6 74 31 41 54 [ 414.388404] RSP: 0018:ffffa22bc080fcb0 EFLAGS: 00010286 [ 414.389638] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 414.390829] RDX: 0000000000000001 RSI: ffff8ee2de3e9510 RDI: ffff8ee2de3e9510 [ 414.392064] RBP: ffffffffbaef2ee0 R08: 0000000000000000 R09: 0000000000000000 [ 414.393224] R10: 0000000000000000 R11: 000000002b30006c R12: ffff8ee260720008 [ 414.394388] R13: ffff8ee25e0a40e8 R14: ffffa22bc080ff08 R15: ffff8ee2c3be5020 [ 414.395661] FS: 00007fd2a7171740(0000) GS:ffff8ee2de200000(0000) knlGS:0000000000000000 [ 414.396825] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 414.398011] CR2: 00007f178ffe5020 CR3: 000000084c5cc003 CR4: 00000000003606e0 [ 414.399172] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 414.400352] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 414.401473] Call Trace: [ 414.402685] device_del+0x89/0x400 [ 414.403819] device_unregister+0x16/0x60 [ 414.405024] hwmon_device_unregister+0x44/0xa0 [ 414.406112] thermal_remove_hwmon_sysfs+0x196/0x200 [ 414.407256] thermal_zone_device_unregister+0x1b5/0x1f0 [ 414.408415] cxgb4_thermal_remove+0x3c/0x4f [cxgb4] [ 414.409668] remove_one+0x212/0x290 [cxgb4] [ 414.410875] pci_device_remove+0x36/0xb0 [ 414.412004] device_release_driver_internal+0xe2/0x1c0 [ 414.413276] pci_stop_bus_device+0x64/0x90 [ 414.414433] pci_stop_and_remove_bus_device_locked+0x16/0x30 [ 414.415609] remove_store+0x75/0x90 [ 414.416790] kernfs_fop_write+0x114/0x1b0 [ 414.417930] vfs_write+0xcf/0x210 [ 414.419059] ksys_write+0xa7/0xe0 [ 414.420120] do_syscall_64+0x4c/0xa0 [ 414.421278] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 414.422335] RIP: 0033:0x7fd2a686afd0 [ 414.423396] Code: Bad RIP value. [ 414.424549] RSP: 002b:00007fffc1446148 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 414.425638] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fd2a686afd0 [ 414.426830] RDX: 0000000000000002 RSI: 00007fd2a7196000 RDI: 0000000000000001 [ 414.427927] RBP: 00007fd2a7196000 R08: 000000000000000a R09: 00007fd2a7171740 [ 414.428923] R10: 00007fd2a7171740 R11: 0000000000000246 R12: 00007fd2a6b43400 [ 414.430082] R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000 [ 414.431027] irq event stamp: 76300 [ 414.435678] ---[ end trace 13865acb4d5ab00f ]--- Fixes: b18719157762 ("cxgb4: Add thermal zone support") Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31Merge branch 'Add-ip6_fragment-in-ipv6_stub'David S. Miller
wenxu says: ==================== Add ip6_fragment in ipv6_stub Add ip6_fragment in ipv6_stub and use it in openvswitch This version add default function eafnosupport_ipv6_fragment ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31openvswitch: using ip6_fragment in ipv6_stubwenxu
Using ipv6_stub->ipv6_fragment to avoid the netfilter dependency Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31ipv6: add ipv6_fragment hook in ipv6_stubwenxu
Add ipv6_fragment to ipv6_stub to avoid calling netfilter when access ip6_fragment. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31Merge branch 'gtp-minor-enhancements'David S. Miller
Nicolas Dichtel says: ==================== gtp: minor enhancements The first patch removes a useless rcu lock and the second relax alloc constraints when a PDP context is added. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31gtp: relax alloc constraint when adding a pdpNicolas Dichtel
When a PDP context is added, the rtnl lock is held, thus no need to force a GFP_ATOMIC. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31gtp: remove useless rcu_read_lock()Nicolas Dichtel
The rtnl lock is taken just the line above, no need to take the rcu also. Fixes: 1788b8569f5d ("gtp: fix use-after-free in gtp_encap_destroy()") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: phylink: avoid oops during initialisationRussell King
If we intend to use PCS operations, mac_pcs_get_state() will not be implemented, so will be NULL. If we also intend to register the PCS operations in mac_prepare() or mac_config(), then this leads to an attempt to call NULL function pointer during phylink_start(). Avoid this, but we must report the link is down. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31drivers/net/wan/hdlc_cisco: Add hard_header_lenXie He
This driver didn't set hard_header_len. This patch sets hard_header_len for it according to its header_ops->create function. This driver's header_ops->create function (cisco_hard_header) creates a header of (struct hdlc_header), so hard_header_len should be set to sizeof(struct hdlc_header). Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Xie He <xie.he.0141@gmail.com> Acked-by: Krzysztof Halasa <khc@pm.waw.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31Merge branch 'hinic-add-debugfs-support'David S. Miller
Luo bin says: ==================== hinic: add debugfs support add debugfs node for querying sq/rq info and function table ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31hinic: add support to query function tableLuo bin
add debugfs node for querying function table, for example: cat /sys/kernel/debug/hinic/0000:15:00.0/func_table/valid Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31hinic: add support to query rq infoLuo bin
add debugfs node for querying rq info, for example: cat /sys/kernel/debug/hinic/0000:15:00.0/RQs/0x0/rq_hw_pi Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31hinic: add support to query sq infoLuo bin
add debugfs node for querying sq info, for example: cat /sys/kernel/debug/hinic/0000:15:00.0/SQs/0x0/sq_pi Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31ionic: fix txrx work accountingShannon Nelson
Take the tx accounting out of the work_done calculation to prevent a possible duplicate napi_schedule call when under high Tx stress but low Rx traffic. Fixes: b14e4e95f9ec ("ionic: tx separate servicing") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31xsk: Documentation for XDP_SHARED_UMEM between queues and netdevsMagnus Karlsson
Add documentation for the XDP_SHARED_UMEM feature when a UMEM is shared between different queues and/or netdevs. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-16-git-send-email-magnus.karlsson@intel.com
2020-08-31samples/bpf: Add new sample xsk_fwd.cCristian Dumitrescu
This sample code illustrates the packet forwarding between multiple AF_XDP sockets in multi-threading environment. All the threads and sockets are sharing a common buffer pool, with each socket having its own private buffer cache. The sockets are created with the xsk_socket__create_shared() function, which allows multiple AF_XDP sockets to share the same UMEM object. Example 1: Single thread handling two sockets. Packets received from socket A (on top of interface IFA, queue QA) are forwarded to socket B (on top of interface IFB, queue QB) and vice-versa. The thread is affinitized to CPU core C: ./xsk_fwd -i IFA -q QA -i IFB -q QB -c C Example 2: Two threads, each handling two sockets. Packets from socket A are sent to socket B (by thread X), packets from socket B are sent to socket A (by thread X); packets from socket C are sent to socket D (by thread Y), packets from socket D are sent to socket C (by thread Y). The two threads are bound to CPU cores CX and CY: ./xdp_fwd -i IFA -q QA -i IFB -q QB -i IFC -q QC -i IFD -q QD -c CX -c CY Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-15-git-send-email-magnus.karlsson@intel.com
2020-08-31libbpf: Support shared umems between queues and devicesMagnus Karlsson
Add support for shared umems between hardware queues and devices to the AF_XDP part of libbpf. This so that zero-copy can be achieved in applications that want to send and receive packets between HW queues on one device or between different devices/netdevs. In order to create sockets that share a umem between hardware queues and devices, a new function has been added called xsk_socket__create_shared(). It takes the same arguments as xsk_socket_create() plus references to a fill ring and a completion ring. So for every socket that share a umem, you need to have one more set of fill and completion rings. This in order to maintain the single-producer single-consumer semantics of the rings. You can create all the sockets via the new xsk_socket__create_shared() call, or create the first one with xsk_socket__create() and the rest with xsk_socket__create_shared(). Both methods work. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-14-git-send-email-magnus.karlsson@intel.com
2020-08-31xsk: Add shared umem support between devicesMagnus Karlsson
Add support to share a umem between different devices. This mode can be invoked with the XDP_SHARED_UMEM bind flag. Previously, sharing was only supported within the same device. Note that when sharing a umem between devices, just as in the case of sharing a umem between queue ids, you need to create a fill ring and a completion ring and tie them to the socket (with two setsockopts, one for each ring) before you do the bind with the XDP_SHARED_UMEM flag. This so that the single-producer single-consumer semantics of the rings can be upheld. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-13-git-send-email-magnus.karlsson@intel.com
2020-08-31xsk: Add shared umem support between queue idsMagnus Karlsson
Add support to share a umem between queue ids on the same device. This mode can be invoked with the XDP_SHARED_UMEM bind flag. Previously, sharing was only supported within the same queue id and device, and you shared one set of fill and completion rings. However, note that when sharing a umem between queue ids, you need to create a fill ring and a completion ring and tie them to the socket before you do the bind with the XDP_SHARED_UMEM flag. This so that the single-producer single-consumer semantics can be upheld. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-12-git-send-email-magnus.karlsson@intel.com
2020-08-31xsk: i40e: ice: ixgbe: mlx5: Test for dma_need_sync earlier for better ↵Magnus Karlsson
performance Test for dma_need_sync earlier to increase performance. xsk_buff_dma_sync_for_cpu() takes an xdp_buff as parameter and from that the xsk_buff_pool reference is dug out. Perf shows that this dereference causes a lot of cache misses. But as the buffer pool is now sent down to the driver at zero-copy initialization time, we might as well use this pointer directly, instead of going via the xsk_buff and we can do so already in xsk_buff_dma_sync_for_cpu() instead of in xp_dma_sync_for_cpu. This gets rid of these cache misses. Throughput increases with 3% for the xdpsock l2fwd sample application on my machine. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-11-git-send-email-magnus.karlsson@intel.com
2020-08-31xsk: Rearrange internal structs for better performanceMagnus Karlsson
Rearrange the xdp_sock, xdp_umem and xsk_buff_pool structures so that they get smaller and align better to the cache lines. In the previous commits of this patch set, these structs have been reordered with the focus on functionality and simplicity, not performance. This patch improves throughput performance by around 3%. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-10-git-send-email-magnus.karlsson@intel.com
2020-08-31xsk: Enable sharing of dma mappingsMagnus Karlsson
Enable the sharing of dma mappings by moving them out from the buffer pool. Instead we put each dma mapped umem region in a list in the umem structure. If dma has already been mapped for this umem and device, it is not mapped again and the existing dma mappings are reused. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-9-git-send-email-magnus.karlsson@intel.com
2020-08-31xsk: Move addrs from buffer pool to umemMagnus Karlsson
Replicate the addrs pointer in the buffer pool to the umem. This mapping will be the same for all buffer pools sharing the same umem. In the buffer pool we leave the addrs pointer for performance reasons. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-8-git-send-email-magnus.karlsson@intel.com
2020-08-31xsk: Move xsk_tx_list and its lock to buffer poolMagnus Karlsson
Move the xsk_tx_list and the xsk_tx_list_lock from the umem to the buffer pool. This so that we in a later commit can share the umem between multiple HW queues. There is one xsk_tx_list per device and queue id, so it should be located in the buffer pool. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-7-git-send-email-magnus.karlsson@intel.com
2020-08-31xsk: Move queue_id, dev and need_wakeup to buffer poolMagnus Karlsson
Move queue_id, dev, and need_wakeup from the umem to the buffer pool. This so that we in a later commit can share the umem between multiple HW queues. There is one buffer pool per dev and queue id, so these variables should belong to the buffer pool, not the umem. Need_wakeup is also something that is set on a per napi level, so there is usually one per device and queue id. So move this to the buffer pool too. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-6-git-send-email-magnus.karlsson@intel.com