summaryrefslogtreecommitdiff
path: root/drivers/net
AgeCommit message (Collapse)Author
2017-08-07ixgbe: push cls_u32 and mqprio setup_tc processing into separate functionsJiri Pirko
Let __ixgbe_setup_tc be a splitter for specific setup_tc types and push out cls_u32 and mqprio specific codes into separate functions. Also change the return values so they are the same as in the rest of the drivers. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07cxgb4: push cls_u32 setup_tc processing into a separate functionJiri Pirko
Let cxgb_setup_tc be a splitter for specific setup_tc types and push out cls_u32 specific code into a separate function. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: sched: make egress_dev flag part of flower offload structJiri Pirko
Since this is specific to flower now, make it part of the flower offload struct. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: sched: rename TC_SETUP_MATCHALL to TC_SETUP_CLSMATCHALLJiri Pirko
In order to be aligned with the rest of the types, rename TC_SETUP_MATCHALL to TC_SETUP_CLSMATCHALL. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: sched: make type an argument for ndo_setup_tcJiri Pirko
Since the type is always present, push it to be a separate argument to ndo_setup_tc. On the way, name the type enum and use it for arg type. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net/mlx5: Increase the maximum flow counters supportedRabie Loulou
Read new NIC capability field which represnts 16 MSBs of the max flow counters number supported (max_flow_counter_31_16). Backward compatibility with older firmware is preserved, the modified driver reads max_flow_counter_31_16 as 0 from the older firmware and uses up to 64K counters. Changed flow counter id from 16 bits to 32 bits. Backward compatibility with older firmware is preserved as we kept the 16 LSBs of the counter id in place and added 16 MSBs from reserved field. Changed the background bulk reading of flow counters to work in chunks of at most 32K counters, to make sure we don't attempt to allocate very large buffers. Signed-off-by: Rabie Loulou <rabiel@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-08-07net/mlx5: Delay events till ib registration endsErez Shitrit
When mlx5_ib registers itself to mlx5_core as an interface, it will call mlx5_add_device which will call mlx5_ib interface add callback, in case the latter successfully returns, only then mlx5_core will add it to the interface list and async events will be forwarded to mlx5_ib. Between mlx5_ib interface add callback and mlx5_core adding the mlx5_ib interface to its devices list, arriving mlx5_core events can be missed by the new mlx5_ib registering interface. In other words: thread 1: mlx5_ib: mlx5_register_interface(dev) thread 1: mlx5_core: mlx5_add_device(dev) thread 1: mlx5_core: ctx = dev->add => (mlx5_ib)->mlx5_ib_add thread 2: mlx5_core_event: **new event arrives, forward to dev_list thread 1: mlx5_core: add_ctx_to_dev_list(ctx) /* previous event was missed by the new interface.*/ It is ok to miss events before dev->add (mlx5_ib)->mlx5_ib_add_device but not after. We fix this race by accumulating the events that come between the ib_register_device (inside mlx5_add_device->(dev->add)) till the adding to the list completes and fire them to the new registering interface after that. Fixes: f1ee87fe55c8 ("net/mlx5: Organize device list API in one place") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-08-07net/mlx5: Add CONFIG_MLX5_ESWITCH KconfigSaeed Mahameed
Allow to selectively build the driver with or without sriov eswitch, VF representors and TC offloads. Also remove the need of two ndo ops structures (sriov & basic) and keep only one unified ndo ops, compile out VF SRIOV ndos when not needed (MLX5_ESWITCH=n), and for VF netdev calling those ndos will result in returning -EPERM. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Cc: Jes Sorensen <jsorensen@fb.com> Cc: kernel-team@fb.com
2017-08-07net/mlx5: Separate between E-Switch and MPFSSaeed Mahameed
Multi-Physical Function Switch (MPFs) is required for when multi-PF configuration is enabled to allow passing user configured unicast MAC addresses to the requesting PF. Before this patch eswitch.c used to manage the HW MPFS l2 table, E-Switch always (regardless of sriov) enabled vport(0) (NIC PF) vport's contexts update on unicast mac address list changes, to populate the PF's MPFS L2 table accordingly. In downstream patch we would like to allow compiling the driver without E-Switch functionalities, for that we move MPFS l2 table logic out of eswitch.c into its own file, and provide Kconfig flag (MLX5_MPFS) to allow compiling out MPFS for those who don't want Multi-PF support. NIC PF netdevice will now directly update MPFS l2 table via the new MPFS API. VF netdevice has no access to MPFS L2 table, so E-Switch will remain responsible of updating its MPFS l2 table on behalf of its VFs. Due to this change we also don't require enabling vport(0) (PF vport) unicast mac changes events anymore, for when SRIOV is not enabled. Which means E-Switch is now activated only on SRIOV activation, and not required otherwise. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Cc: Jes Sorensen <jsorensen@fb.com> Cc: kernel-team@fb.com
2017-08-07net/mlx5: Unify vport manager capability checkSaeed Mahameed
Expose MLX5_VPORT_MANAGER macro to check for strict vport manager E-switch and MPFS (Multi Physical Function Switch) abilities. VPORT manager must be a PF with an ethernet link and with FW advertised vport group manager capability Replace older checks with the new macro and use it where needed in eswitch.c and mlx5e netdev eswitch related flows. The same macro will be reused in MPFS separation downstream patch. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-08-07net/mlx5e: NIC netdev init flow cleanupSaeed Mahameed
Remove redundant call to unregister vport representor in mlx5e_add error flow. Hide the representor priv and eswitch internal structures from en_main.c as preparation step for downstream patches which would allow building the driver without support for representors and eswitch. Fixes: 6f08a22c5fb2 ("net/mlx5e: Register/unregister vport representors on interface attach/detach") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2017-08-07net/mlx5e: Rearrange netdevice ops structuresSaeed Mahameed
Since we are going to allow building the driver without eswitch support, it would be possible to compile out the sriov netdevice ops struct such that the basic ops instance will be used for non VF devices too. Add missing udp tunnel ndos into mlx5e_netdev_ops_basic. While here, rearrange some ndos in the sriov ops struct and put vf/eswitch related ndos towards the end of it. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2017-08-06net: stmmac: Add Adaptrum Anarion GMAC glue layerAlexandru Gagniuc
Before the GMAC on the Anarion chip can be used, the PHY interface selection must be configured with the DWMAC block in reset. This layer covers a block containing only two registers. Although it is possible to model this as a reset controller and use the "resets" property of stmmac, it's much more intuitive to include this in the glue layer instead. At this time only RGMII is supported, because it is the only mode which has been validated hardware-wise. Signed-off-by: Alexandru Gagniuc <alex.g@adaptrum.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06netvsc: fix rtnl deadlock on unregister of vfstephen hemminger
With new transparent VF support, it is possible to get a deadlock when some of the deferred work is running and the unregister_vf is trying to cancel the work element. The solution is to use trylock and reschedule (similar to bonding and team device). Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Fixes: 0c195567a8f6 ("netvsc: transparent VF management") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06netvsc: fix race on sub channel creationstephen hemminger
The existing sub channel code did not wait for all the sub-channels to completely initialize. This could lead to race causing crash in napi_netif_del() from bad list. The existing code would send an init message, then wait only for the initial response that the init message was received. It thought it was waiting for sub channels but really the init response did the wakeup. The new code keeps track of the number of open channels and waits until that many are open. Other issues here were: * host might return less sub-channels than was requested. * the new init status is not valid until after init was completed. Fixes: b3e6b82a0099 ("hv_netvsc: Wait for sub-channels to be processed during probe") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06net: systemport: Support 64bit statisticskiki good
When using Broadcom Systemport device in 32bit Platform, ifconfig can only report up to 4G tx,rx status, which will be wrapped to 0 when the number of incoming or outgoing packets exceeds 4G, only taking around 2 hours in busy network environment (such as streaming). Therefore, it makes hard for network diagnostic tool to get reliable statistical result, so the patch is used to add 64bit support for Broadcom Systemport device in 32bit Platform. This patch provides 64bit statistics capability on both ethtool and ifconfig. Signed-off-by: Jianming.qiao <kiki-good@hotmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06liquidio: moved console_bitmask module param to lio_main.cIntiyaz Basha
Moving PF module param console_bitmask to lio_main.c for consistency. Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06liquidio: add missing strings in oct_dev_state_str arrayIntiyaz Basha
There's supposed to be a one-to-one correspondence between the 18 macros that #define the OCT_DEV states (in octeon_device.h) and the strings in the oct_dev_state_str array, but there are only 14 strings in the array. Add the missing strings (so they become 18 in total), and also revise some incorrect/outdated text of existing strings. Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06sfp: add SFP module supportRussell King
Add support for SFP hotpluggable modules via sfp-bus and phylink. This supports both copper and optical SFP modules, which require different Serdes modes in order to properly negotiate the link. Optical SFP modules typically require the Serdes link to be talking 1000BaseX mode - this is the gigabit ethernet mode defined by the 802.3 standard. Copper SFP modules typically integrate a PHY in the module to convert from Serdes to copper, and the PHY will be configured by the vendor to either present a 1000BaseX Serdes link (for fixed 1000BaseT) or a SGMII Serdes link. However, this is vendor defined, so we instead detect the PHY, switch the link to SGMII mode, and use traditional PHY based negotiation. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06phylink: add in-band autonegotiation support for 10GBase-KR mode.Russell King
Add in-band autonegotation support for 10GBase-KR mode. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06phylink: add support for MII ioctl access to Clause 45 PHYsRussell King
Add support for reading and writing the clause 45 MII registers. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06phylink: add module EEPROM supportRussell King
Add support for reading module EEPROMs through phylink. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06sfp: add sfp-bus to bridge between network devices and sfp cagesRussell King
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06phylink: add phylink infrastructureRussell King
The link between the ethernet MAC and its PHY has become more complex as the interface evolves. This is especially true with serdes links, where the part of the PHY is effectively integrated into the MAC. Serdes links can be connected to a variety of devices, including SFF modules soldered down onto the board with the MAC, a SFP cage with a hotpluggable SFP module which may contain a PHY or directly modulate the serdes signals onto optical media with or without a PHY, or even a classical PHY connection. Moreover, the negotiation information on serdes links comes in two varieties - SGMII mode, where the PHY provides its speed/duplex/flow control information to the MAC, and 1000base-X mode where both ends exchange their abilities and each resolve the link capabilities. This means we need a more flexible means to support these arrangements, particularly with the hotpluggable nature of SFP, where the PHY can be attached or detached after the network device has been brought up. Ethtool information can come from multiple sources: - we may have a PHY operating in either SGMII or 1000base-X mode, in which case we take ethtool/mii data directly from the PHY. - we may have a optical SFP module without a PHY, with the MAC operating in 1000base-X mode - the ethtool/mii data needs to come from the MAC. - we may have a copper SFP module with a PHY whic can't be accessed, which means we need to take ethtool/mii data from the MAC. Phylink aims to solve this by providing an intermediary between the MAC and PHY, providing a safe way for PHYs to be hotplugged, and allowing a SFP driver to reconfigure the serdes connection. Phylink also takes over support of fixed link connections, where the speed/duplex/flow control are fixed, but link status may be controlled by a GPIO signal. By avoiding the fixed-phy implementation, phylink can provide a faster response to link events: fixed-phy has to wait for phylib to operate its state machine, which can take several seconds. In comparison, phylink takes milliseconds. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> - remove sync status - rework supported and advertisment handling - add 1000base-x speed for fixed links - use functionality exported from phy-core, reworking __phylink_ethtool_ksettings_set for it Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06net: phy: add I2C mdio busRussell King
Add an I2C MDIO bus bridge library, to allow phylib to access PHYs which are connected to an I2C bus instead of the more conventional MDIO bus. Such PHYs can be found in SFP adapters and SFF modules. Since PHYs appear at I2C bus address 0x40..0x5f, and 0x50/0x51 are reserved for SFP EEPROMs/diagnostics, we must not allow the MDIO bus to access these I2C addresses. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06net: phy: export phy_start_machine() for phylinkRussell King
phylink will need phy_start_machine exported, so lets export it as a GPL symbol. Documentation/networking/phy.txt indicates that this should be a PHY API function. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06net: phy: provide a hook for link up/link down eventsRussell King
Sometimes, we need to do additional work between the PHY coming up and marking the carrier present - for example, we may need to wait for the PHY to MAC link to finish negotiation. This changes phylib to provide a notification function pointer which avoids the built-in netif_carrier_on() and netif_carrier_off() functions. Standard ->adjust_link functionality is provided by hooking a helper into the new ->phy_link_change method. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06net: phy: add 1000Base-X to phy settings tableRussell King
Add the missing 1000Base-X entry to the phy settings table. This was not included because the original code could not cope with more than 32 bits of link mode mask. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06net: phy: move phy_lookup_setting() and guts of phy_supported_speeds() to ↵Russell King
phy-core phy_lookup_setting() provides useful functionality in ethtool code outside phylib. Move it to phy-core and allow it to be re-used (eg, in phylink) rather than duplicated elsewhere. Note that this supports the larger linkmode space. As we move the phy settings table, we also need to move the guts of phy_supported_speeds() as well. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06net: phy: split out PHY speed and duplex string generationRussell King
Other code would like to make use of this, so make the speed and duplex string generation visible, and place it in a separate file. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06net: phy: allow settings table to support more than 32 link modesRussell King
Allow the phy settings table to support more than 32 link modes by switching to the ethtool link mode bit number representation, rather than storing the mask. This will allow phylink and other ethtool code to share the settings table to look up settings. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-05iwlwifi: mvm: Fix a memory leak in an error handling path in ↵Christophe Jaillet
'iwl_mvm_sar_get_wgds_table()' We should free 'wgds.pointer' here as done a few lines above in another error handling path. It was allocated within 'acpi_evaluate_object()'. Fixes: c52030a01ccc ("iwlwifi: mvm: add GEO_TX_POWER_LIMIT cmd for geographic tx power table") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-08-04aquantia: Switch to use napi_gro_receivePavel Belous
Add support for GRO (generic receive offload) for aQuantia Atlantic driver. This results in a perfomance improvement when GRO is enabled. Signed-off-by: Pavel Belous <pavel.belous@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04xgene: Always get clk source, but ignore if it's missing for SGMII portsThomas Bogendoerfer
Even the driver doesn't do anything with the clk source for SGMII ports it needs to be enabled by doing a devm_clk_get(), if there is a clk source in DT. Fixes: 0db01097cabd ('xgene: Don't fail probe, if there is no clk resource for SGMII interfaces') Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Tested-by: Laura Abbott <labbott@redhat.com> Acked-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Will Deacon <will.deacon@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04net: sched: change names of action number helpers to be aligned with the restJiri Pirko
The rest of the helpers are named tcf_exts_*, so change the name of the action number helpers to be aligned. While at it, change to inline functions. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04mlxsw: spectrum_switchdev: Release multicast groups during finiIdo Schimmel
Each multicast group (MID) stores a bitmap of ports to which a packet should be forwarded to in case an MDB entry associated with the MID is hit. Since the initial introduction of IGMP snooping in commit 3a49b4fde2a1 ("mlxsw: Adding layer 2 multicast support") the driver didn't correctly free these multicast groups upon ungraceful situations such as the removal of the upper bridge device or module removal. The correct way to fix this is to associate each MID with the bridge ports member in it and then drop the reference in case the bridge port is destroyed, but this will result in a lot more code and will be fixed in net-next. For now, upon module removal, traverse the MID list and release each one. Fixes: 3a49b4fde2a1 ("mlxsw: Adding layer 2 multicast support") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04mlxsw: spectrum_switchdev: Don't warn about valid situationsIdo Schimmel
Some operations in the bridge driver such as MDB deletion are preformed in an atomic context and thus deferred to a process context by the switchdev infrastructure. Therefore, by the time the operation is performed by the underlying device driver it's possible the bridge port context is already gone. This is especially true for removal flows, but theoretically can also be invoked during addition. Remove the warnings in such situations and return normally. Fixes: c57529e1d5d8 ("mlxsw: spectrum: Replace vPorts with Port-VLAN") Fixes: 3922285d96e7 ("net: bridge: Add support for offloading port attributes") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04net: hns: Fix for __udivdi3 compiler errorLin Yun Sheng
This patch fixes the __udivdi3 undefined error reported by test robot. Fixes: b8c17f708831 ("net: hns: Add self-adaptive interrupt coalesce support in hns driver") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04net: phy: marvell: logical vs bitwise OR typoDan Carpenter
This was supposed to be a bitwise OR but there is a || vs | typo. Fixes: 864dc729d528 ("net: phy: marvell: Refactor m88e1121 RGMII delay configuration") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-03sock: enable MSG_ZEROCOPYWillem de Bruijn
Prepare the datapath for refcounted ubuf_info. Clone ubuf_info with skb_zerocopy_clone() wherever needed due to skb split, merge, resize or clone. Split skb_orphan_frags into two variants. The split, merge, .. paths support reference counted zerocopy buffers, so do not do a deep copy. Add skb_orphan_frags_rx for paths that may loop packets to receive sockets. That is not allowed, as it may cause unbounded latency. Deep copy all zerocopy copy buffers, ref-counted or not, in this path. The exact locations to modify were chosen by exhaustively searching through all code that might modify skb_frag references and/or the the SKBTX_DEV_ZEROCOPY tx_flags bit. The changes err on the safe side, in two ways. (1) legacy ubuf_info paths virtio and tap are not modified. They keep a 1:1 ubuf_info to sk_buff relationship. Calls to skb_orphan_frags still call skb_copy_ubufs and thus copy frags in this case. (2) not all copies deep in the stack are addressed yet. skb_shift, skb_split and skb_try_coalesce can be refined to avoid copying. These are not in the hot path and this patch is hairy enough as is, so that is left for future refinement. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-03mlxsw: spectrum_router: Don't ignore IPv6 notificationsIdo Schimmel
We now have all the necessary IPv6 infrastructure in place, so stop ignoring these notifications. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-03mlxsw: spectrum_router: Abort on source-specific routesIdo Schimmel
Without resorting to ACLs, the device performs route lookup solely based on the destination IP address. In case source-specific routing is needed, an error is returned and the abort mechanism is activated, thus allowing the kernel to take over forwarding decisions. Instead of aborting, we can trap specific destination prefixes where source-specific routes are present, but this will result in a lot more code that is unlikely to ever be used. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-03mlxsw: spectrum_router: Add support for route replaceIdo Schimmel
In case we got a replace event, then the replaced route must exist. If the route isn't capable of multipath, then replace first matching non-multipath capable route. If the route is capable of multipath and matching multipath capable route is found, then replace it. Otherwise, replace first matching non-multipath capable route. The new route is inserted before the replaced one. In case the replaced route is currently offloaded, then it's overwritten in the device's table by the new route and later deleted, thus not impacting routed traffic. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-03mlxsw: spectrum_router: Add support for IPv6 routes addition / deletionIdo Schimmel
Allow directly connected and remote unicast IPv6 routes to be programmed to the device's tables. As with IPv4, identical routes - sharing the same destination prefix - are ordered in a FIB node according to their table ID and then the metric. While the kernel doesn't share the same trie for the local and main table, this does happen in the device, so ordering according to table ID is needed. Since individual nexthops can be added and deleted in IPv6, each FIB entry stores a linked list of the rt6_info structs it represents. Upon the addition or deletion of a nexthop, a new nexthop group is allocated according to the new configuration and the old one is destroyed. Identical groups aren't currently consolidated, but will be in a follow-up patchset. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-03mlxsw: spectrum_router: Sanitize IPv6 FIB rulesIdo Schimmel
We only allow FIB offload in the presence of default rules or an l3mdev rule. In a similar fashion to IPv4 FIB rules, sanitize IPv6 rules. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-03mlxsw: spectrum_router: Demultiplex FIB event based on familyIdo Schimmel
The FIB notification block currently only handles IPv4 events, but we want to start handling IPv6 events soon, so lay the groundwork now. Do that by preparing the work item and process it according to the notified address family. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-03rocker: Ignore address families other than IPv4Ido Schimmel
As in previous patch, ignore IPv6 notifications since the driver doesn't support these. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-03mlxsw: spectrum_router: Ignore address families other than IPv4Ido Schimmel
We're about to add IPv6 notifications in the FIB notification chain, but the driver currently doesn't support these, so ignore them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-03net: core: Make the FIB notification chain genericIdo Schimmel
The FIB notification chain is currently soley used by IPv4 code. However, we're going to introduce IPv6 FIB offload support, which requires these notification as well. As explained in commit c3852ef7f2f8 ("ipv4: fib: Replay events when registering FIB notifier"), upon registration to the chain, the callee receives a full dump of the FIB tables and rules by traversing all the net namespaces. The integrity of the dump is ensured by a per-namespace sequence counter that is incremented whenever a change to the tables or rules occurs. In order to allow more address families to use the chain, each family is expected to register its fib_notifier_ops in its pernet init. These operations allow the common code to read the family's sequence counter as well as dump its tables and rules in the given net namespace. Additionally, a 'family' parameter is added to sent notifications, so that listeners could distinguish between the different families. Implement the common code that allows listeners to register to the chain and for address families to register their fib_notifier_ops. Subsequent patches will implement these operations in IPv6. In the future, ipmr and ip6mr will be extended to provide these notifications as well. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-03net: mvpp2: add support for TX interrupts and RX queue distribution modesThomas Petazzoni
This commit adds the support for two related features: - Support for TX interrupts, with one interrupt for each CPU - Support for different RX queue distribution modes MVPP2_QDIST_SINGLE_MODE where a single interrupt, shared by all CPUs, receives the RX events, and MVPP2_QDIST_MULTI_MODE, where the per-CPU interrupts used for TX events are also used for RX events. Since additional interrupts are needed, an update to the Device Tree binding is needed. However, backward compatibility is preserved with the old Device Tree binding, by gracefully degrading to the original behavior, with only one RX interrupt, and TX completion being handled by an hrtimer. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>