summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-03-30xfrm: Register xfrm_dev_notifier in appropriate placeKirill Tkhai
Currently, driver registers it from pernet_operations::init method, and this breaks modularity, because initialization of net namespace and netdevice notifiers are orthogonal actions. We don't have per-namespace netdevice notifiers; all of them are global for all devices in all namespaces. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30Merge branch 'Implement-of_get_nvmem_mac_address-helper'David S. Miller
Mike Looijmans says: ==================== of_net: Implement of_get_nvmem_mac_address helper Posted this as a small set now, with an (optional) second patch that shows how the changes work and what I've used to test the code on a Topic Miami board. I've taken the liberty to add appropriate "Acked" and "Review" tags. v4: Replaced "6" with ETH_ALEN v3: Add patch that implements mac in nvmem for the Cadence MACB controller Remove the integrated of_get_mac_address call v2: Use of_nvmem_cell_get to avoid needing the assiciated device Use void* instead of char* Add devicetree binding doc ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30net: macb: Try to retrieve MAC addess from nvmem providerMike Looijmans
Call of_get_nvmem_mac_address() to fetch the MAC address from an nvmem cell, if one is provided in the device tree. This allows the address to be stored in an I2C EEPROM device for example. Signed-off-by: Mike Looijmans <mike.looijmans@topic.nl> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30of_net: Implement of_get_nvmem_mac_address helperMike Looijmans
It's common practice to store MAC addresses for network interfaces into nvmem devices. However the code to actually do this in the kernel lacks, so this patch adds of_get_nvmem_mac_address() for drivers to obtain the address from an nvmem cell provider. This is particulary useful on devices where the ethernet interface cannot be configured by the bootloader, for example because it's in an FPGA. Signed-off-by: Mike Looijmans <mike.looijmans@topic.nl> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30Merge branch 'nfp-flower-handle-MTU-changes'David S. Miller
Jakub Kicinski says: ==================== nfp: flower: handle MTU changes This set improves MTU handling for flower offload. The max MTU is correctly capped and physical port MTU is communicated to the FW (and indirectly HW). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30nfp: flower: offload phys port MTU changeJohn Hurley
Trigger a port mod message to request an MTU change on the NIC when any physical port representor is assigned a new MTU value. The driver waits 10 msec for an ack that the FW has set the MTU. If no ack is received the request is rejected and an appropriate warning flagged. Rather than maintain an MTU queue per repr, one is maintained per app. Because the MTU ndo is protected by the rtnl lock, there can never be contention here. Portmod messages from the NIC are also protected by rtnl so we first check if the portmod is an ack and, if so, handle outside rtnl and the cmsg work queue. Acks are detected by the marking of a bit in a portmod response. They are then verfied by checking the port number and MTU value expected by the app. If the expected MTU is 0 then no acks are currently expected. Also, ensure that the packet headroom reserved by the flower firmware is considered when accepting an MTU change on any repr. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30nfp: modify app MTU setting callbacksJohn Hurley
Rename the 'change_mtu' app callback to 'check_mtu'. This is called whenever an MTU change is requested on a netdev. It can reject the change but is not responsible for implementing it. Introduce a new 'repr_change_mtu' app callback that is hit when the MTU of a repr is to be changed. This is responsible for performing the MTU change and verifying it. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30Merge branch 'phylink-API-changes'David S. Miller
Florian Fainelli says: ==================== phylink: API changes This patch series contains two API changes to PHYLINK which will later be used by DSA to migrate to PHYLINK. Because these are API changes that impact other outstanding work (e.g: MVPP2) I would rather get them included sooner to minimize conflicts. Thank you! Changes in v2: - added missing documentation to mac_link_{up,down} that the interface must be configured in mac_config() - added Russell's, Andrew's and my tags ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30sfp/phylink: move module EEPROM ethtool access into netdev core ethtoolRussell King
Provide a pointer to the SFP bus in struct net_device, so that the ethtool module EEPROM methods can access the SFP directly, rather than needing every user to provide a hook for it. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30net: phy: phylink: Provide PHY interface to mac_link_{up, down}Florian Fainelli
In preparation for having DSA transition entirely to PHYLINK, we need to pass a PHY interface type to the mac_link_{up,down} callbacks because we may have to make decisions on that (e.g: turn on/off RGMII interfaces etc.). We do not pass an entire phylink_link_state because not all parameters (pause, duplex etc.) are defined when the link is down, only link and interface are. Update mvneta accordingly since it currently implements phylink_mac_ops. Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30MAINTAINERS: update vmxnet3 driver maintainerRonak Doshi
Shrikrishna Khare would no longer maintain the vmxnet3 driver. Taking over the role of vmxnet3 maintainer. Signed-off-by: Ronak Doshi <doshir@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30Merge branch 'net-Broadcom-drivers-coalescing-fixes'David S. Miller
Florian Fainelli says: ==================== net: Broadcom drivers coalescing fixes Following Tal's review of the adaptive RX/TX coalescing feature added to the SYSTEMPORT and GENET driver a number of things showed up: - adaptive TX coalescing is not actually a good idea with the current way the estimator will program the ring, this results in a higher CPU load, NAPI on TX already does a reasonably good job at maintaining the interrupt count low - both SYSTEMPORT and GENET would suffer from the same issues while configuring coalescing parameters where the values would just not be applied correctly based on user settings, so we fix that too Tal, thanks again for your feedback, I would appreciate if you could review that the new behavior appears to be implemented correctly. Thanks! Changes in v2: - added Tal's reviewed-by to the first patch - split DIM initialization from coalescing parameters initialization - avoid duplicating the same code in bcmgenet_set_coalesce() when configuring RX rings - fixed the condition where default DIM parameters would be applied when adaptive RX coalescing would be enabled, do this only if it was disabled before ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30net: bcmgenet: Fix coalescing settings handlingFlorian Fainelli
There were a number of issues with setting the RX coalescing parameters: - we would not be preserving values that would have been configured across close/open calls, instead we would always reset to no timeout and 1 interrupt per packet, this would also prevent DIM from setting its default usec/pkts values - when adaptive RX would be turned on, we woud not be fetching the default parameters, we would stay with no timeout/1 packet per interrupt until the estimator kicks in and changes that - finally disabling adaptive RX coalescing while providing parameters would not be honored, and we would stay with whatever DIM had previously determined instead of the user requested parameters Fixes: 9f4ca05827a2 ("net: bcmgenet: Add support for adaptive RX coalescing") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Tal Gilboa <talgi@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30net: systemport: Fix coalescing settings handlingFlorian Fainelli
There were a number of issues with setting the RX coalescing parameters: - we would not be preserving values that would have been configured across close/open calls, instead we would always reset to no timeout and 1 interrupt per packet, this would also prevent DIM from setting its default usec/pkts values - when adaptive RX would be turned on, we woud not be fetching the default parameters, we would stay with no timeout/1 packet per interrupt until the estimator kicks in and changes that - finally disabling adaptive RX coalescing while providing parameters would not be honored, and we would stay with whatever DIM had previously determined instead of the user requested parameters Fixes: b6e0e875421e ("net: systemport: Implement adaptive interrupt coalescing") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Tal Gilboa <talgi@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30net: systemport: Remove adaptive TX coalescingFlorian Fainelli
Adaptive TX coalescing is not currently giving us any advantages and ends up making the CPU spin more frequently until TX completion. Deny and disable adaptive TX coalescing for now and rely on static configuration, we can always add it back later. Reviewed-by: Tal Gilboa <talgi@mellanox.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30net: Call add/kill vid ndo on vlan filter feature togglingGal Pressman
NETIF_F_HW_VLAN_[CS]TAG_FILTER features require more than just a bit flip in dev->features in order to keep the driver in a consistent state. These features notify the driver of each added/removed vlan, but toggling of vlan-filter does not notify the driver accordingly for each of the existing vlans. This patch implements a similar solution to NETIF_F_RX_UDP_TUNNEL_PORT behavior (which notifies the driver about UDP ports in the same manner that vids are reported). Each toggling of the features propagates to the 8021q module, which iterates over the vlans and call add/kill ndo accordingly. Signed-off-by: Gal Pressman <galp@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30cxgb4: fix error return code in adap_init0()Wei Yongjun
Fix to return a negative error code from the hash filter init error handling case instead of 0, as done elsewhere in this function. Fixes: 5c31254e35a8 ("cxgb4: initialize hash-filter configuration") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29Merge tag 'wireless-drivers-next-for-davem-2018-03-29' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for 4.17 Smaller new features to various drivers but nothing really out of ordinary. Major changes: ath10k * enable chip temperature measurement for QCA6174/QCA9377 * add firmware memory dump for QCA9984 * enable buffer STA on TDLS link for QCA6174 * support different beacon internals in multiple interface scenario for QCA988X/QCA99X0/QCA9984/QCA4019 iwlwifi * support for new PCI IDs for the 9000 family * support for a new firmware API version * support for advanced dwell and Optimized Connectivity Experience (OCE) in scanning btrsi * fix kconfig dependencies wil6210 * support multiple virtual interfaces ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29Merge tag 'mac80211-next-for-davem-2018-03-29' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== We have a fair number of patches, but many of them are from the first bullet here: * EAPoL-over-nl80211 from Denis - this will let us fix some long-standing issues with bridging, races with encryption and more * DFS offload support from the qtnfmac folks * regulatory database changes for the new ETSI adaptivity requirements * various other fixes and small enhancements ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29Merge branch 'dsa-Add-ATU-VTU-statistics'David S. Miller
Andrew Lunn says: ==================== Add ATU/VTU statistics Previous patches have added basic support for Address Translation Unit and VLAN translation Unit violation interrupts. Add statistics counters for when these occur, which can be accessed using ethtool. Downgrade one of the particularly spammy warnings from VTU violations to debug only, now that we have a counter for it. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29net: dsa: mv88e6xxx: Make VTU miss violations less spammyAndrew Lunn
VTU miss violations can happen under normal conditions. Don't spam the kernel log, downgrade the output to debug level only. The statistics counter will indicate it is happening, if anybody not debugging is interested. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reported-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29net: dsa: mv88e6xxx: Keep ATU/VTU violation statisticsAndrew Lunn
Count the numbers of various ATU and VTU violation statistics and return them as part of the ethtool -S statistics. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29sctp: fix unused lable warningArnd Bergmann
The proc file cleanup left a label possibly unused: net/sctp/protocol.c: In function 'sctp_defaults_init': net/sctp/protocol.c:1304:1: error: label 'err_init_proc' defined but not used [-Werror=unused-label] This adds an #ifdef around it to match the respective 'goto'. Fixes: d47d08c8ca05 ("sctp: use proc_remove_subtree()") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29net: cavium: use module_pci_driver to simplify the codeWei Yongjun
Use the module_pci_driver() macro to make the code simpler by eliminating module_init and module_exit calls. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29net: bcmgenet: return NULL instead of plain integerWei Yongjun
Fixes the following sparse warning: drivers/net/ethernet/broadcom/genet/bcmgenet.c:1351:16: warning: Using plain integer as NULL pointer Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29test_bpf: Fix NULL vs IS_ERR() check in test_skb_segment()Dan Carpenter
The skb_segment() function returns error pointers on error. It never returns NULL. Fixes: 76db8087c4c9 ("net: bpf: add a test for skb_segment in test_bpf module") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Yonghong Song <yhs@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29sfp: allow cotsworks modulesRussell King
Cotsworks modules fail the checksums - it appears that Cotsworks reprograms the EEPROM at the end of production with the final product information (serial, date code, and exact part number for module options) and fails to update the checksum. Work around this by detecting the Cotsworks name in the manufacturer field, and reducing the checksum failures to warnings rather than a hard error. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29Merge branch 'qed-flash-upgrade-support'David S. Miller
Sudarsana Reddy Kalluru says: ==================== qed*: Flash upgrade support. The patch series adds adapter flash upgrade support for qed/qede drivers. Please consider applying it to net-next branch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29qede: Ethtool flash update support.Sudarsana Reddy Kalluru
The patch adds ethtool callback implementation for flash update. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29qed: Adapter flash update support.Sudarsana Reddy Kalluru
This patch adds the required driver support for updating the flash or non volatile memory of the adapter. At highlevel, flash upgrade comprises of reading the flash images from the input file, validating the images and writing them to the respective paritions. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29qed: Add APIs for flash access.Sudarsana Reddy Kalluru
This patch adds APIs for flash access. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29qed: Fix PTT entry leak in the selftest error flow.Sudarsana Reddy Kalluru
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29qed: Populate nvm image attribute shadow.Sudarsana Reddy Kalluru
This patch adds support for populating the flash image attributes. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29qed*: Utilize FW 8.33.11.0Michal Kalderon
This FW contains several fixes and features RDMA Features - SRQ support - XRC support - Memory window support - RDMA low latency queue support - RDMA bonding support RDMA bug fixes - RDMA remote invalidate during retransmit fix - iWARP MPA connect interop issue with RTR fix - iWARP Legacy DPM support - Fix MPA reject flow - iWARP error handling - RQ WQE validation checks MISC - Fix some HSI types endianity - New Restriction: vlan insertion in core_tx_bd_data can't be set for LB packets ETH - HW QoS offload support - Fix vlan, dcb and sriov flow of VF sending a packet with inband VLAN tag instead of default VLAN - Allow GRE version 1 offloads in RX flow - Allow VXLAN steering iSCSI / FcoE - Fix bd availability checking flow - Support 256th sge proerly in iscsi/fcoe retransmit - Performance improvement - Fix handle iSCSI command arrival with AHS and with immediate - Fix ipv6 traffic class configuration DEBUG - Update debug utilities Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: Manish Rangankar <Manish.Rangankar@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29ipv6: export ip6 fragments sysctl to unprivileged usersEric Dumazet
IPv4 was changed in commit 52a773d645e9 ("net: Export ip fragment sysctl to unprivileged users") The only sysctl that is not per-netns is not used : ip6frag_secret_interval Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29liquidio: Prioritize control messagesIntiyaz Basha
During heavy tx traffic, control messages (sent by liquidio driver to NIC firmware) sometimes do not get processed in a timely manner. Reason is: the low-level metadata of control messages and that of egress network packets indicate that they have the same priority. Fix it by setting a higher priority for control messages through the new ctrl_qpg field in the oct_txpciq struct. It is the NIC firmware that does the actual setting of priority by writing to the new ctrl_qpg field; the host driver treats that value as opaque and just assigns it to pki_ih3->qpg Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29Merge branch 'net-Allow-FIB-notifiers-to-fail-add-and-replace'David S. Miller
David Ahern says: ==================== net: Allow FIB notifiers to fail add and replace I wanted to revisit how resource overload is handled for hardware offload of FIB entries and rules. At the moment, the in-kernel fib notifier can tell a driver about a route or rule add, replace, and delete, but the notifier can not affect the action. Specifically, in the case of mlxsw if a route or rule add is going to overflow the ASIC resources the only recourse is to abort hardware offload. Aborting offload is akin to taking down the switch as the path from data plane to the control plane simply can not support the traffic bandwidth of the front panel ports. Further, the current state of FIB notifiers is inconsistent with other resources where a driver can affect a user request - e.g., enslavement of a port into a bridge or a VRF. As a result of the work done over the past 3+ years, I believe we are at a point where we can bring consistency to the stack and offloads, and reliably allow the FIB notifiers to fail a request, pushing an error along with a suitable error message back to the user. Rather than aborting offload when the switch is out of resources, userspace is simply prevented from adding more routes and has a clear indication of why. This set does not resolve the corner case where rules or routes not supported by the device are installed prior to the driver getting loaded and registering for FIB notifications. In that case, hardware offload has not been established and it can refuse to offload anything, sending errors back to userspace via extack. Since conceptually the driver owns the netdevices associated with its asic, this corner case mainly applies to unsupported rules and any races during the bringup phase. Patch 1 fixes call_fib_notifiers to extract the errno from the encoded response from handlers. Patches 2-5 allow the call to call_fib_notifiers to fail the add or replace of a route or rule. Patch 6 adds a simple resource controller to netdevsim to illustrate how a FIB resource controller can limit the number of route entries. Changes since RFC - correct return code for call_fib_notifier - dropped patch 6 exporting devlink symbols - limited example resource controller to init_net only - updated Kconfig for netdevsim to use MAY_USE_DEVLINK - updated cover letter regarding startup case noted by Ido ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29netdevsim: Add simple FIB resource controller via devlinkDavid Ahern
Add devlink support to netdevsim and use it to implement a simple, profile based resource controller. Only one controller is needed per namespace, so the first netdevsim netdevice in a namespace registers with devlink. If that device is deleted, the resource settings are deleted. The resource controller allows a user to limit the number of IPv4 and IPv6 FIB entries and FIB rules. The resource paths are: /IPv4 /IPv4/fib /IPv4/fib-rules /IPv6 /IPv6/fib /IPv6/fib-rules The IPv4 and IPv6 top level resources are unlimited in size and can not be changed. From there, the number of FIB entries and FIB rule entries are unlimited by default. A user can specify a limit for the fib and fib-rules resources: $ devlink resource set netdevsim/netdevsim0 path /IPv4/fib size 96 $ devlink resource set netdevsim/netdevsim0 path /IPv4/fib-rules size 16 $ devlink resource set netdevsim/netdevsim0 path /IPv6/fib size 64 $ devlink resource set netdevsim/netdevsim0 path /IPv6/fib-rules size 16 $ devlink dev reload netdevsim/netdevsim0 such that the number of rules or routes is limited (96 ipv4 routes in the example above): $ for n in $(seq 1 32); do ip ro add 10.99.$n.0/24 dev eth1; done Error: netdevsim: Exceeded number of supported fib entries. $ devlink resource show netdevsim/netdevsim0 netdevsim/netdevsim0: name IPv4 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables non resources: name fib size 96 occ 96 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables ... With this template in place for resource management, it is fairly trivial to extend and shows one way to implement a simple counter based resource controller typical of network profiles. Currently, devlink only supports initial namespace. Code is in place to adapt netdevsim to a per namespace controller once the network namespace issues are resolved. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29net/ipv6: Move call_fib6_entry_notifiers up for route addsDavid Ahern
Move call to call_fib6_entry_notifiers for new IPv6 routes to right before the insertion into the FIB. At this point notifier handlers can decide the fate of the new route with a clean path to delete the potential new entry if the notifier returns non-0. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29net/ipv4: Allow notifier to fail route replaceDavid Ahern
Add checking to call to call_fib_entry_notifiers for IPv4 route replace. Allows a notifier handler to fail the replace. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29net/ipv4: Move call_fib_entry_notifiers up for new routesDavid Ahern
Move call to call_fib_entry_notifiers for new IPv4 routes to right before the call to fib_insert_alias. At this point the only remaining failure path is memory allocations in fib_insert_node. Handle that very unlikely failure with a call to call_fib_entry_notifiers to tell drivers about it. At this point notifier handlers can decide the fate of the new route with a clean path to delete the potential new entry if the notifier returns non-0. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29net: Move call_fib_rule_notifiers up in fib_nl_newruleDavid Ahern
Move call_fib_rule_notifiers up in fib_nl_newrule to the point right before the rule is inserted into the list. At this point there are no more failure paths within the core rule code, so if the notifier does not fail then the rule will be inserted into the list. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29net: Fix fib notifer to return errnoDavid Ahern
Notifier handlers use notifier_from_errno to convert any potential error to an encoded format. As a consequence the other side, call_fib_notifier{s} in this case, needs to use notifier_to_errno to return the error from the handler back to its caller. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29Merge tag 'mlx5-updates-2018-03-27' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2018-03-27 (Misc updates & SQ recovery) This series contains Misc updates and cleanups for mlx5e rx path and SQ recovery feature for tx path. From Tariq: (RX updates) - Disable Striding RQ when PCI devices, striding RQ limits the use of CQE compression feature, which is very critical for slow PCI devices performance, in this change we will prefer CQE compression over Striding RQ only on specific "slow" PCIe links. - RX path cleanups - Private flag to enable/disable striding RQ From Eran: (TX fast recovery) - TX timeout logic improvements, fast SQ recovery and TX error reporting if a HW error occurs while transmitting on a specific SQ, the driver will ignore such error and will wait for TX timeout to occur and reset all the rings. Instead, the current series improves the resiliency for such HW errors by detecting TX completions with errors, which will report them and perform a fast recover for the specific faulty SQ even before a TX timeout is detected. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29Merge branch 'Introduce-net_rwsem-to-protect-net_namespace_list'David S. Miller
Kirill Tkhai says: ==================== Introduce net_rwsem to protect net_namespace_list The series introduces fine grained rw_semaphore, which will be used instead of rtnl_lock() to protect net_namespace_list. This improves scalability and allows to do non-exclusive sleepable iteration for_each_net(), which is enough for most cases. scripts/get_maintainer.pl gives enormous list of people, and I add all to CC. Note, that this patch is independent of "Close race between {un, }register_netdevice_notifier and pernet_operations": https://patchwork.ozlabs.org/project/netdev/list/?series=36495 Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29net: Remove rtnl_lock() in nf_ct_iterate_destroy()Kirill Tkhai
rtnl_lock() doesn't protect net::ct::count, and it's not needed for__nf_ct_unconfirmed_destroy() and for nf_queue_nf_hook_drop(). Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29ovs: Remove rtnl_lock() from ovs_exit_net()Kirill Tkhai
Here we iterate for_each_net() and removes vport from alive net to the exiting net. ovs_net::dps are protected by ovs_mutex(), and the others, who change it (ovs_dp_cmd_new(), __dp_destroy()) also take it. The same with datapath::ports list. So, we remove rtnl_lock() here. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29security: Remove rtnl_lock() in selinux_xfrm_notify_policyload()Kirill Tkhai
rt_genid_bump_all() consists of ipv4 and ipv6 part. ipv4 part is incrementing of net::ipv4::rt_genid, and I see many places, where it's read without rtnl_lock(). ipv6 part calls __fib6_clean_all(), and it's also called without rtnl_lock() in other places. So, rtnl_lock() here was used to iterate net_namespace_list only, and we can remove it. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29net: Don't take rtnl_lock() in wireless_nlevent_flush()Kirill Tkhai
This function iterates over net_namespace_list and flushes the queue for every of them. What does this rtnl_lock() protects?! Since we may add skbs to net::wext_nlevents without rtnl_lock(), it does not protects us about queuers. It guarantees, two threads can't flush the queue in parallel, that can change the order, but since skb can be queued in any order, it doesn't matter, how many threads do this in parallel. In case of several threads, this will be even faster. So, we can remove rtnl_lock() here, as it was used for iteration over net_namespace_list only. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29net: Introduce net_rwsem to protect net_namespace_listKirill Tkhai
rtnl_lock() is used everywhere, and contention is very high. When someone wants to iterate over alive net namespaces, he/she has no a possibility to do that without exclusive lock. But the exclusive rtnl_lock() in such places is overkill, and it just increases the contention. Yes, there is already for_each_net_rcu() in kernel, but it requires rcu_read_lock(), and this can't be sleepable. Also, sometimes it may be need really prevent net_namespace_list growth, so for_each_net_rcu() is not fit there. This patch introduces new rw_semaphore, which will be used instead of rtnl_mutex to protect net_namespace_list. It is sleepable and allows not-exclusive iterations over net namespaces list. It allows to stop using rtnl_lock() in several places (what is made in next patches) and makes less the time, we keep rtnl_mutex. Here we just add new lock, while the explanation of we can remove rtnl_lock() there are in next patches. Fine grained locks generally are better, then one big lock, so let's do that with net_namespace_list, while the situation allows that. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>