summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-03-29tipc: fix two bugs in secondary destination lookupJon Paul Maloy
A message sent to a node after a successful name table lookup may still find that the destination socket has disappeared, because distribution of name table updates is non-atomic. If so, the message will be rejected back to the sender with error code TIPC_ERR_NO_PORT. If the source socket of the message has disappeared in the meantime, the message should be dropped. However, in the currrent code, the message will instead be subject to an unwanted tertiary lookup, because the function tipc_msg_lookup_dest() doesn't check if there is an error code present in the message before performing the lookup. In the worst case, the message may now find the old destination again, and be redirected once more, instead of being dropped directly as it should be. A second bug in this function is that the "prev_node" field in the message is not updated after successful lookup, something that may have unpredictable consequences. The problems arising from those bugs occur very infrequently. The third change in this function; the test on msg_reroute_msg_cnt() is purely cosmetic, reflecting that the returned value never can be negative. This commit corrects the two bugs described above. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2015-03-27 This series contains updates to i40e and i40evf. Jesse adds new device IDs to handle the new 20G speed for KR2. Mitch provides a fix for an issue that shows up as a panic or memory corruption when the device is brought down while under heavy stress. This is resolved by delaying the releasing of resources until we receive acknowledgment from the PF driver that the rings have indeed been stopped. Also adds firmware version information to ethtool reporting to align with ixgbevf behavior. Akeem increases the polling loop limiter, sine we found that in certain circumstances the firmware can take longer to be ready after a reset. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29Merge branch 'stacked_vlan_tso'David S. Miller
Toshiaki Makita says: ==================== Stacked vlan TSO On the basis of Netdev 0.1 discussion[1], I made a patch set to enable TSO for packets with multiple vlans. Currently, packets with multiple vlans are always segmented by software, which is caused by that netif_skb_features() drops most feature flags for multiple tagged packets. To allow NICs to segment them, we need to get rid of that check from core. Fortunately, recently introduced ndo_features_check() can be used to move the check to each driver, and this patch set is based on the idea. For the initial patch set, I chose 3 drivers, bonding, team, and igb, as candidates to enable TSO. I tested them and confirmed they works fine with this change. Here are samples of performance test results. As I expected, %sys gets pretty lower than before. * TEST1: vlan (.1Q) on vlan (.1ad) on igb (I350) - before $ netperf -t TCP_STREAM -H 192.168.10.1 -l 60 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 60.02 933.72 Average: CPU %user %nice %system %iowait %steal %idle Average: all 0.13 0.00 11.28 0.01 0.00 88.58 - after $ netperf -t TCP_STREAM -H 192.168.10.1 -l 60 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 60.01 936.13 Average: CPU %user %nice %system %iowait %steal %idle Average: all 0.24 0.00 4.17 0.01 0.00 95.58 * TEST2: vlan (.1Q) on bridge (.1ad vlan filtering) on team on igb (I350) - before $ netperf -t TCP_STREAM -H 192.168.10.1 -l 60 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 60.01 936.28 Average: CPU %user %nice %system %iowait %steal %idle Average: all 0.41 0.00 11.57 0.01 0.00 88.01 - after $ netperf -t TCP_STREAM -H 192.168.10.1 -l 60 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 60.02 935.72 Average: CPU %user %nice %system %iowait %steal %idle Average: all 0.14 0.00 7.66 0.01 0.00 92.19 In addition to above, I tested these configurations: - vlan (.1Q) on vlan (1.ad) on bonding on igb (I350) - vlan (.1Q) on vlan (1.Q) on igb (I350) - vlan (.1Q) on vlan (1.Q) on team on igb (I350) And didn't find any problem. [1] https://netdev01.org/sessions/18 https://netdev01.org/docs/netdev01_bof_8021ad_makita_150212.pdf ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29igb: Enable TSO for stacked vlanToshiaki Makita
As datasheets for igb (I210, I350, 82576, etc.) say, maclen can be from 14 to 127, which is enough for reasonable number of vlan tags. My netperf test showed I350's TSO works pretty fine with multiple vlans. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29team: Don't segment multiple tagged packets on team deviceToshiaki Makita
Team devices don't need to segment multiple tagged packets since their slaves can segment them. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29bonding: Don't segment multiple tagged packets on bonding deviceToshiaki Makita
Bonding devices don't need to segment multiple tagged packets since their slaves can segment them. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: Introduce passthru_features_checkToshiaki Makita
As there are a number of (especially virtual) devices that don't need the multiple vlan check, introduce passthru_features_check() for convenience. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: Move check for multiple vlans to driversToshiaki Makita
To allow drivers to handle the features check for multiple tags, move the check to ndo_features_check(). As no drivers currently handle multiple tagged TSO, introduce dflt_features_check() and call it if the driver does not have ndo_features_check(). Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29vlan: Introduce helper functions to check if skb is taggedToshiaki Makita
Separate the two checks for single vlan and multiple vlans in netif_skb_features(). This allows us to move the check for multiple vlans to another function later. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29vlan: Add features for stacked vlan deviceToshiaki Makita
Stacked vlan devices curretly have few features (GRO, HIGHDMA, LLTX). Since we have software fallbacks in case the NIC can not handle some features for multiple vlans, we can add the same features as the lower vlan devices for stacked vlan devices. This allows stacked vlan devices to create large (GSO) packets and not to segment packets. Those packets will be segmented by software on the real device, or even can be segmented by the NIC once TSO for multiple vlans becomes enabled by the following patches. The exception is those related to FCoE, which does not have a software fallback. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29tc: bpf: generalize pedit actionAlexei Starovoitov
existing TC action 'pedit' can munge any bits of the packet. Generalize it for use in bpf programs attached as cls_bpf and act_bpf via bpf_skb_store_bytes() helper function. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29Merge branch 'dsa-hw-bridging'David S. Miller
Guenter Roeck says: ==================== net: dsa: HW bridging, EEE support Patch 1 to 7 of this series prepare the drivers using the mv88e6xxx code for HW bridging support, without adding the code itself. For the most part this factors out common port initialization code. There is no functional change except for patch 3, which disables the message port bit for the CPU port to prevent packet duplication if HW bridging is configured. Patch 8 adds the infrastructure for hardware bridging support to the mv88e6xxx code. Patch 9 wires the MV88E6352 driver to support hardware bridging. Patches 10 to 12 add support for ndo_fdb functions to the dsa subsystem, and wire up the MV88E6352 driver to support those functions. Patches 13 to 16 add EEE support and HW bridging support to the mv88e6171 driver. This set of patches is from Andrew, applied on top of the first set of patches. The series applies to net-next as of 3/24/2015. Thanks a lot to Andrew Lunn for testing and valuable feedback. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: mv88e6171: Add support for hardware bridgingAndrew Lunn
Wire up the common code for setting up hardware bridging and access to the forwarding database. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: mv88e6171: Add EEE support to the mv88e6172Andrew Lunn
The mv88e6172 has support for EEE. Check for the product ID and call the common code if applicable. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: mv88e6171: Add defines for switch product IDsAndrew Lunn
Make the code more readable by using defines for the switch IDs. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: Centralise getting switch idAndrew Lunn
Get the switch id and save it away in the private mv88x6xxx structure in a centralised piece of code, rather than each driver doing it itself. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: mv88e6352: Add support for ndo_fdb functionsGuenter Roeck
Add support for manipulating switch fdb entries by pointing to the ndo_fdb functions implemented for mv88e6xxxx. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: mv88e6xxx: Add support for fdb_add, fdb_del, and fdb_getnextGuenter Roeck
No vlan support at this time. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: Add basic framework to support ndo_fdb functionsGuenter Roeck
Provide callbacks for ndo_fdb_add, ndo_fdb_del, and ndo_fdb_dump. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: mv88e6352: Add support for hardware bridgingGuenter Roeck
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: mv88e6xxx: Add Hardware bridging supportGuenter Roeck
Bridge support is similar for all chips supported by the mv88e6xxx code, so add the code there. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: mv88e6171: Use common port configurationGuenter Roeck
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: mv88e6123_61_65: Use common port configurationGuenter Roeck
This will simplify adding offloaded bridge support later on. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: mv88e6352: Use common port initialization codeGuenter Roeck
This prepares the driver for hardware bridging. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: mv88e6xxx: Split mv88e6xxx_reg_read and mv88e6xxx_reg_writeGuenter Roeck
Split mv88e6xxx_reg_read and mv88e6xxx_reg_write into two functions each, one to acquire smi_mutex and one to get struct mii_bus *bus from struct dsa_switch *ds and to call the actual read/write function. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: mv88e6xxx: Disable Message Port bit for CPU portGuenter Roeck
Datasheet says that the Message Port bit should not be set for the CPU port. Having it set causes DSA tagged packets to be sent to the CPU port roughly every 30 seconds. Those packets are the same as real packets forwarded between switch ports if the switch is configured for switching between multiple ports. The packets are then bridged by the software bridge, resulting in duplicated packets on the network. Reported-by: Andrew Lunn <andrew@lunn.ch> Cc: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: mv88e6xxx: Provide function for common port initializationGuenter Roeck
Provide mv88e6xxx_setup_port_common() for common port initialization. Currently only write Port 1 Control and VLAN configuration since this will be needed for hardware bridging. More can be added later if desired/needed. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: dsa: mv88e6xxx: Factor out common initialization codeGuenter Roeck
Code used and needed in mv886xxx.c should be initialized there as well, so factor it out from the individual initialization files. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29hv_netvsc: remove vmbus_are_subchannels_present() in rndis_filter_device_add()Haiyang Zhang
The vmbus_are_subchannels_present() also involves opening the channels, which may be too early at this point. Checking for subchannels is not necessary here. So this patch removes it. Subchannels will be opened when offer messages arrive. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29net: smc91x: make use of 4th parameter to devm_gpiod_get_indexUwe Kleine-König
Since 39b2bbe3d715 (gpio: add flags argument to gpiod_get*() functions) which appeared in v3.17-rc1, the gpiod_get* functions take an additional parameter that allows to specify direction and initial value for output. Simplify accordingly. Moreover use devm_gpiod_get_index_optional for still simpler handling. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29hv_netvsc: Implement batching in send bufferHaiyang Zhang
With this patch, we can send out multiple RNDIS data packets in one send buffer slot and one VMBus message. It reduces the overhead associated with VMBus messages. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree. Basically, nf_tables updates to add the set extension infrastructure and finish the transaction for sets from Patrick McHardy. More specifically, they are: 1) Move netns to basechain and use recently added possible_net_t, from Patrick McHardy. 2) Use LOGLEVEL_<FOO> from nf_log infrastructure, from Joe Perches. 3) Restore nf_log_trace that was accidentally removed during conflict resolution. 4) nft_queue does not depend on NETFILTER_XTABLES, starting from here all patches from Patrick McHardy. 5) Use raw_smp_processor_id() in nft_meta. Then, several patches to prepare ground for the new set extension infrastructure: 6) Pass object length to the hash callback in rhashtable as needed by the new set extension infrastructure. 7) Cleanup patch to restore struct nft_hash as wrapper for struct rhashtable 8) Another small source code readability cleanup for nft_hash. 9) Convert nft_hash to rhashtable callbacks. And finally... 10) Add the new set extension infrastructure. 11) Convert the nft_hash and nft_rbtree sets to use it. 12) Batch set element release to avoid several RCU grace period in a row and add new function nft_set_elem_destroy() to consolidate set element release. 13) Return the set extension data area from nft_lookup. 14) Refactor existing transaction code to add some helper functions and document it. 15) Complete the set transaction support, using similar approach to what we already use, to activate/deactivate elements in an atomic fashion. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29Merge branch 'tipc-next'David S. Miller
Ying Xue says: ==================== tipc: fix two corner issues The patch set aims at resolving the following two critical issues: Patch #1: Resolve a deadlock which happens while all links are reset Patch #2: Correct a mistake usage of RCU lock which is used to protect node list ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29tipc: involve reference counter for node structureYing Xue
TIPC node hash node table is protected with rcu lock on read side. tipc_node_find() is used to look for a node object with node address through iterating the hash node table. As the entire process of what tipc_node_find() traverses the table is guarded with rcu read lock, it's safe for us. However, when callers use the node object returned by tipc_node_find(), there is no rcu read lock applied. Therefore, this is absolutely unsafe for callers of tipc_node_find(). Now we introduce a reference counter for node structure. Before tipc_node_find() returns node object to its caller, it first increases the reference counter. Accordingly, after its caller used it up, it decreases the counter again. This can prevent a node being used by one thread from being freed by another thread. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29tipc: fix potential deadlock when all links are resetYing Xue
[ 60.988363] ====================================================== [ 60.988754] [ INFO: possible circular locking dependency detected ] [ 60.989152] 3.19.0+ #194 Not tainted [ 60.989377] ------------------------------------------------------- [ 60.989781] swapper/3/0 is trying to acquire lock: [ 60.990079] (&(&n_ptr->lock)->rlock){+.-...}, at: [<ffffffffa0006dca>] tipc_link_retransmit+0x1aa/0x240 [tipc] [ 60.990743] [ 60.990743] but task is already holding lock: [ 60.991106] (&(&bclink->lock)->rlock){+.-...}, at: [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.991738] [ 60.991738] which lock already depends on the new lock. [ 60.991738] [ 60.992174] [ 60.992174] the existing dependency chain (in reverse order) is: [ 60.992174] -> #1 (&(&bclink->lock)->rlock){+.-...}: [ 60.992174] [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140 [ 60.992174] [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50 [ 60.992174] [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.992174] [<ffffffffa0000f57>] tipc_bclink_add_node+0x97/0xf0 [tipc] [ 60.992174] [<ffffffffa0011815>] tipc_node_link_up+0xf5/0x110 [tipc] [ 60.992174] [<ffffffffa0007783>] link_state_event+0x2b3/0x4f0 [tipc] [ 60.992174] [<ffffffffa00193c0>] tipc_link_proto_rcv+0x24c/0x418 [tipc] [ 60.992174] [<ffffffffa0008857>] tipc_rcv+0x827/0xac0 [tipc] [ 60.992174] [<ffffffffa0002ca3>] tipc_l2_rcv_msg+0x73/0xd0 [tipc] [ 60.992174] [<ffffffff81646e66>] __netif_receive_skb_core+0x746/0x980 [ 60.992174] [<ffffffff816470c1>] __netif_receive_skb+0x21/0x70 [ 60.992174] [<ffffffff81647295>] netif_receive_skb_internal+0x35/0x130 [ 60.992174] [<ffffffff81648218>] napi_gro_receive+0x158/0x1d0 [ 60.992174] [<ffffffff81559e05>] e1000_clean_rx_irq+0x155/0x490 [ 60.992174] [<ffffffff8155c1b7>] e1000_clean+0x267/0x990 [ 60.992174] [<ffffffff81647b60>] net_rx_action+0x150/0x360 [ 60.992174] [<ffffffff8105ec43>] __do_softirq+0x123/0x360 [ 60.992174] [<ffffffff8105f12e>] irq_exit+0x8e/0xb0 [ 60.992174] [<ffffffff8179f9f5>] do_IRQ+0x65/0x110 [ 60.992174] [<ffffffff8179da6f>] ret_from_intr+0x0/0x13 [ 60.992174] [<ffffffff8100de9f>] arch_cpu_idle+0xf/0x20 [ 60.992174] [<ffffffff8109dfa6>] cpu_startup_entry+0x2f6/0x3f0 [ 60.992174] [<ffffffff81033cda>] start_secondary+0x13a/0x150 [ 60.992174] -> #0 (&(&n_ptr->lock)->rlock){+.-...}: [ 60.992174] [<ffffffff810a8f7d>] __lock_acquire+0x163d/0x1ca0 [ 60.992174] [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140 [ 60.992174] [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50 [ 60.992174] [<ffffffffa0006dca>] tipc_link_retransmit+0x1aa/0x240 [tipc] [ 60.992174] [<ffffffffa0001e11>] tipc_bclink_rcv+0x611/0x640 [tipc] [ 60.992174] [<ffffffffa0008646>] tipc_rcv+0x616/0xac0 [tipc] [ 60.992174] [<ffffffffa0002ca3>] tipc_l2_rcv_msg+0x73/0xd0 [tipc] [ 60.992174] [<ffffffff81646e66>] __netif_receive_skb_core+0x746/0x980 [ 60.992174] [<ffffffff816470c1>] __netif_receive_skb+0x21/0x70 [ 60.992174] [<ffffffff81647295>] netif_receive_skb_internal+0x35/0x130 [ 60.992174] [<ffffffff81648218>] napi_gro_receive+0x158/0x1d0 [ 60.992174] [<ffffffff81559e05>] e1000_clean_rx_irq+0x155/0x490 [ 60.992174] [<ffffffff8155c1b7>] e1000_clean+0x267/0x990 [ 60.992174] [<ffffffff81647b60>] net_rx_action+0x150/0x360 [ 60.992174] [<ffffffff8105ec43>] __do_softirq+0x123/0x360 [ 60.992174] [<ffffffff8105f12e>] irq_exit+0x8e/0xb0 [ 60.992174] [<ffffffff8179f9f5>] do_IRQ+0x65/0x110 [ 60.992174] [<ffffffff8179da6f>] ret_from_intr+0x0/0x13 [ 60.992174] [<ffffffff8100de9f>] arch_cpu_idle+0xf/0x20 [ 60.992174] [<ffffffff8109dfa6>] cpu_startup_entry+0x2f6/0x3f0 [ 60.992174] [<ffffffff81033cda>] start_secondary+0x13a/0x150 [ 60.992174] [ 60.992174] other info that might help us debug this: [ 60.992174] [ 60.992174] Possible unsafe locking scenario: [ 60.992174] [ 60.992174] CPU0 CPU1 [ 60.992174] ---- ---- [ 60.992174] lock(&(&bclink->lock)->rlock); [ 60.992174] lock(&(&n_ptr->lock)->rlock); [ 60.992174] lock(&(&bclink->lock)->rlock); [ 60.992174] lock(&(&n_ptr->lock)->rlock); [ 60.992174] [ 60.992174] *** DEADLOCK *** [ 60.992174] [ 60.992174] 3 locks held by swapper/3/0: [ 60.992174] #0: (rcu_read_lock){......}, at: [<ffffffff81646791>] __netif_receive_skb_core+0x71/0x980 [ 60.992174] #1: (rcu_read_lock){......}, at: [<ffffffffa0002c35>] tipc_l2_rcv_msg+0x5/0xd0 [tipc] [ 60.992174] #2: (&(&bclink->lock)->rlock){+.-...}, at: [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.992174] The correct the sequence of grabbing n_ptr->lock and bclink->lock should be that the former is first held and the latter is then taken, which exactly happened on CPU1. But especially when the retransmission of broadcast link is failed, bclink->lock is first held in tipc_bclink_rcv(), and n_ptr->lock is taken in link_retransmit_failure() called by tipc_link_retransmit() subsequently, which is demonstrated on CPU0. As a result, deadlock occurs. If the order of holding the two locks happening on CPU0 is reversed, the deadlock risk will be relieved. Therefore, the node lock taken in link_retransmit_failure() originally is moved to tipc_bclink_rcv() so that it's obtained before bclink lock. But the precondition of the adjustment of node lock is that responding to bclink reset event must be moved from tipc_bclink_unlock() to tipc_node_unlock(). Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29virtio: simplify the using of received in virtnet_pollLi RongQing
received is 0, no need to minus it and use "+=" to reassign it Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29Merge branch 'be2net-next'David S. Miller
Sathya Perla says: ==================== be2net: patch set Hi David, this patch set includes 2 feature additions to the be2net driver: Patch 1 sets up cpu affinity hints for be2net irqs using the cpumask_set_cpu_local_first() API that first picks the near numa cores and when they are exhausted, selects the far numa cores. Patch 2 setups up xps queue mapping for be2net's TXQs to avoid, by default, TX lock contention. Patch 3 just bumps up the driver version. Pls consider applying this patch set to the net-next queue. Thanks! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29be2net: bump up the driver version to 10.6.0.1Sathya Perla
Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29be2net: setup xps queue mappingSathya Perla
This patch sets up xps queue mapping on load, so that TX traffic is steered to the queue whose irqs are being processed by the current cpu. This helps in avoiding TX lock contention. Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29be2net: assign CPU affinity hints to be2net IRQsPadmanabh Ratnakar
This patch provides hints to irqbalance to map be2net IRQs to specific CPU cores. cpumask_set_cpu_local_first() is used, which first maps IRQs to near NUMA cores; when those cores are exhausted, IRQs are mapped to far NUMA cores. Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29tcp: tcp_syn_flood_action() can be staticEric Dumazet
After commit 1fb6f159fd21 ("tcp: add tcp_conn_request"), tcp_syn_flood_action() is no longer used from IPv6. We can make it static, by moving it above tcp_conn_request() Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Octavian Purdila <octavian.purdila@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29cxgb4: fix boolreturn.cocci warningsWu Fengguang
drivers/net/ethernet/chelsio/cxgb4/cxgb4_fcoe.c:49:9-10: WARNING: return of 0/1 in function 'cxgb_fcoe_sof_eof_supported' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Generated by: scripts/coccinelle/misc/boolreturn.cocci CC: Varun Prakash <varun@chelsio.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29fib6: install fib6 ops in the last stepWANG Cong
We should not commit the new ops until we finish all the setup, otherwise we have to NULL it on failure. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-27Merge branch 'bcmgenet-next'David S. Miller
Petri Gynther says: ==================== net: bcmgenet: multiple Rx queues support Final patch set to add support for multiple Rx queues: 1. remove priv->int0_mask and priv->int1_mask 2. modify Tx ring int_enable and int_disable vectors 3. simplify bcmgenet_init_dma() 4. tweak init_umac() 5. rework Tx NAPI code 6. rework Rx NAPI code 7. add support for multiple Rx queues ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-27net: bcmgenet: add support for multiple Rx queuesPetri Gynther
Add support for multiple Rx queues: 1. Add NAPI context per Rx queue 2. Modify Rx interrupt and Rx NAPI code to handle multiple Rx queues Signed-off-by: Petri Gynther <pgynther@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-27net: bcmgenet: rework Rx NAPI codePetri Gynther
Introduce new bcmgenet functions to handle the NAPI calls to: netif_napi_add() napi_enable() napi_disable() netif_napi_del() Signed-off-by: Petri Gynther <pgynther@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-27net: bcmgenet: rework Tx NAPI codePetri Gynther
Introduce new bcmgenet functions to handle the NAPI calls to: netif_napi_add() napi_enable() napi_disable() netif_napi_del() Signed-off-by: Petri Gynther <pgynther@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-27net: bcmgenet: tweak init_umac()Petri Gynther
Use more meaningful variable names int0_enable and int1_enable when enabling bcmgenet interrupts. For Rx default queue interrupts, use: UMAC_IRQ_RXDMA_BDONE | UMAC_IRQ_RXDMA_PDONE For Tx default queue interrupts, use: UMAC_IRQ_TXDMA_BDONE | UMAC_IRQ_TXDMA_PDONE Signed-off-by: Petri Gynther <pgynther@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-27net: bcmgenet: simplify bcmgenet_init_dma()Petri Gynther
Do the two kcalloc() calls first, before proceeding into Rx/Tx DMA init. Makes the error case handling much simpler. Signed-off-by: Petri Gynther <pgynther@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Jaedon Shin <jaedon.shin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-27net: bcmgenet: modify Tx ring int_enable and int_disable vectorsPetri Gynther
Remove unnecessary function parameter priv. Use ring->priv instead. Signed-off-by: Petri Gynther <pgynther@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>