summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2015-07-11net: dsa: Fix off-by-one in switch address parsingFlorian Fainelli
cd->sw_addr is used as a MDIO bus address, which cannot exceed PHY_MAX_ADDR (32), our check was off-by-one. Fixes: 5e95329b701c ("dsa: add device tree bindings to register DSA switches") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-11net: dsa: Test array index before useFlorian Fainelli
port_index is used an index into an array, and this information comes from Device Tree, make sure that port_index is not equal to the array size before using it. Move the check against port_index earlier in the loop. Fixes: 5e95329b701c: ("dsa: add device tree bindings to register DSA switches") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-11net: switchdev: don't abort unsupported operationsVivien Didelot
There is no need to abort attribute setting or object addition, if the prepare phase returned operation not supported. Thus, abort these two transactions only if the error is not -EOPNOTSUPP. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-10net: inet_diag: always export IPV6_V6ONLY sockopt for listening socketsPhil Sutter
Reconsidering my commit 20462155 "net: inet_diag: export IPV6_V6ONLY sockopt", I am not happy with the limitations it causes for socket analysing code in userspace. Exporting the value only if it is set makes it hard for userspace to decide whether the option is not set or the kernel does not support exporting the option at all. >From an auditor's perspective, the interesting question for listening AF_INET6 sockets is: "Does it NOT have IPV6_V6ONLY set?" Because it is the unexpected case. This patch allows to answer this question reliably. Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-10bridge: mdb: allow the user to delete mdb entry if there's a querierSatish Ashok
Until now when a querier was present static entries couldn't be deleted. Fix this and allow the user to manipulate the mdb with or without a querier. Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-10net: call rcu_read_lock early in process_backlogJulian Anastasov
Incoming packet should be either in backlog queue or in RCU read-side section. Otherwise, the final sequence of flush_backlog() and synchronize_net() may miss packets that can run without device reference: CPU 1 CPU 2 skb->dev: no reference process_backlog:__skb_dequeue process_backlog:local_irq_enable on_each_cpu for flush_backlog => IPI(hardirq): flush_backlog - packet not found in backlog CPU delayed ... synchronize_net - no ongoing RCU read-side sections netdev_run_todo, rcu_barrier: no ongoing callbacks __netif_receive_skb_core:rcu_read_lock - too late free dev process packet for freed dev Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue") Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-10net: do not process device backlog during unregistrationJulian Anastasov
commit 381c759d9916 ("ipv4: Avoid crashing in ip_error") fixes a problem where processed packet comes from device with destroyed inetdev (dev->ip_ptr). This is not expected because inetdev_destroy is called in NETDEV_UNREGISTER phase and packets should not be processed after dev_close_many() and synchronize_net(). Above fix is still required because inetdev_destroy can be called for other reasons. But it shows the real problem: backlog can keep packets for long time and they do not hold reference to device. Such packets are then delivered to upper levels at the same time when device is unregistered. Calling flush_backlog after NETDEV_UNREGISTER_FINAL still accounts all packets from backlog but before that some packets continue to be delivered to upper levels long after the synchronize_net call which is supposed to wait the last ones. Also, as Eric pointed out, processed packets, mostly from other devices, can continue to add new packets to backlog. Fix the problem by moving flush_backlog early, after the device driver is stopped and before the synchronize_net() call. Then use netif_running check to make sure we do not add more packets to backlog. We have to do it in enqueue_to_backlog context when the local IRQ is disabled. As result, after the flush_backlog and synchronize_net sequence all packets should be accounted. Thanks to Eric W. Biederman for the test script and his valuable feedback! Reported-by: Vittorio Gambaletta <linuxbugs@vittgam.net> Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue") Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-09bridge: fix potential crash in __netdev_pick_tx()Eric Dumazet
Commit c29390c6dfee ("xps: must clear sender_cpu before forwarding") fixed an issue in normal forward path, caused by sender_cpu & napi_id skb fields being an union. Bridge is another point where skb can be forwarded, so we need the same cure. Bug triggers if packet was received on a NIC using skb_mark_napi_id() Fixes: 2bd82484bb4c ("xps: fix xps for stacked devices") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Bob Liu <bob.liu@oracle.com> Tested-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-09net: pktgen: kill the "Wait for kthread_stop" code in pktgen_thread_worker()Oleg Nesterov
pktgen_thread_worker() doesn't need to wait for kthread_stop(), it can simply exit. Just pktgen_create_thread() and pg_net_exit() should do get_task_struct()/put_task_struct(). kthread_stop(dead_thread) is fine. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-09net: pktgen: fix race between pktgen_thread_worker() and kthread_stop()Oleg Nesterov
pktgen_thread_worker() is obviously racy, kthread_stop() can come between the kthread_should_stop() check and set_current_state(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Jan Stancek <jstancek@redhat.com> Reported-by: Marcelo Leitner <mleitner@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree. This batch mostly comes with patches to address fallout from the previous merge window cycle, they are: 1) Use entry->state.hook_list from nf_queue() instead of the global nf_hooks which is not valid when used from NFPROTO_NETDEV, this should cause no problems though since we have no userspace queueing for that family, but let's fix this now for the sake of correctness. Patch from Eric W. Biederman. 2) Fix compilation breakage in bridge netfilter if CONFIG_NF_DEFRAG_IPV4 is not set, from Bernhard Thaler. 3) Use percpu jumpstack in arptables too, now that there's a single copy of the rule blob we can't store the return address there anymore. Patch from Florian Westphal. 4) Fix a skb leak in the xmit path of bridge netfilter, problem there since 2.6.37 although it should be not possible to hit invalid traffic there, also from Florian. 5) Eric Leblond reports that when loading a large ruleset with many missing modules after a fresh boot, nf_tables can take long time commit it. Fix this by processing the full batch until the end, even on missing modules, then abort only once and restart processing. 6) Add bridge netfilter files to the MAINTAINER files. 7) Fix a net_device refcount leak in the new IPV6 bridge netfilter code, from Julien Grall. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08ipv4: add support for linkdown sysctl to netconfAndy Gospodarek
This kernel patch exports the value of the new ignore_routes_with_linkdown via netconf. v2: changes to notify userspace via netlink when sysctl values change and proposed for 'net' since this could be considered a bugfix Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Suggested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08bridge: mdb: zero out the local br_ip variable before useNikolay Aleksandrov
Since commit b0e9a30dd669 ("bridge: Add vlan id to multicast groups") there's a check in br_ip_equal() for a matching vlan id, but the mdb functions were not modified to use (or at least zero it) so when an entry was added it would have a garbage vlan id (from the local br_ip variable in __br_mdb_add/del) and this would prevent it from being matched and also deleted. So zero out the whole local ip var to protect ourselves from future changes and also to fix the current bug, since there's no vlan id support in the mdb uapi - use always vlan id 0. Example before patch: root@debian:~# bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent root@debian:~# bridge mdb dev br0 port eth1 grp 239.0.0.1 permanent root@debian:~# bridge mdb del dev br0 port eth1 grp 239.0.0.1 permanent RTNETLINK answers: Invalid argument After patch: root@debian:~# bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent root@debian:~# bridge mdb dev br0 port eth1 grp 239.0.0.1 permanent root@debian:~# bridge mdb del dev br0 port eth1 grp 239.0.0.1 permanent root@debian:~# bridge mdb Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Fixes: b0e9a30dd669 ("bridge: Add vlan id to multicast groups") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08net/tipc: initialize security state for new connection socketStephen Smalley
Calling connect() with an AF_TIPC socket would trigger a series of error messages from SELinux along the lines of: SELinux: Invalid class 0 type=AVC msg=audit(1434126658.487:34500): avc: denied { <unprintable> } for pid=292 comm="kworker/u16:5" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=<unprintable> permissive=0 This was due to a failure to initialize the security state of the new connection sock by the tipc code, leaving it with junk in the security class field and an unlabeled secid. Add a call to security_sk_clone() to inherit the security state from the parent socket. Reported-by: Tim Shearer <tim.shearer@overturenetworks.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08ip_tunnel: fix ipv4 pmtu check to honor inner ip header dfTimo Teräs
Frag needed should be sent only if the inner header asked to not fragment. Currently fragmentation is broken if the tunnel has df set, but df was not asked in the original packet. The tunnel's df needs to be still checked to update internally the pmtu cache. Commit 23a3647bc4f93bac broke it, and this commit fixes the ipv4 df check back to the way it was. Fixes: 23a3647bc4f93bac ("ip_tunnels: Use skb-len to PMTU check.") Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Timo Teräs <timo.teras@iki.fi> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08rtnetlink: verify IFLA_VF_INFO attributes before passing them to driverDaniel Borkmann
Jason Gunthorpe reported that since commit c02db8c6290b ("rtnetlink: make SR-IOV VF interface symmetric"), we don't verify IFLA_VF_INFO attributes anymore with respect to their policy, that is, ifla_vfinfo_policy[]. Before, they were part of ifla_policy[], but they have been nested since placed under IFLA_VFINFO_LIST, that contains the attribute IFLA_VF_INFO, which is another nested attribute for the actual VF attributes such as IFLA_VF_MAC, IFLA_VF_VLAN, etc. Despite the policy being split out from ifla_policy[] in this commit, it's never applied anywhere. nla_for_each_nested() only does basic nla_ok() testing for struct nlattr, but it doesn't know about the data context and their requirements. Fix, on top of Jason's initial work, does 1) parsing of the attributes with the right policy, and 2) using the resulting parsed attribute table from 1) instead of the nla_for_each_nested() loop (just like we used to do when still part of ifla_policy[]). Reference: http://thread.gmane.org/gmane.linux.network/368913 Fixes: c02db8c6290b ("rtnetlink: make SR-IOV VF interface symmetric") Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Cc: Greg Rose <gregory.v.rose@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Rony Efraim <ronye@mellanox.com> Cc: Vlad Zolotarov <vladz@cloudius-systems.com> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08Revert "dev: set iflink to 0 for virtual interfaces"Nicolas Dichtel
This reverts commit e1622baf54df8cc958bf29d71de5ad545ea7d93c. The side effect of this commit is to add a '@NONE' after each virtual interface name with a 'ip link'. It may break existing scripts. Reported-by: Olivier Hartkopp <socketcan@hartkopp.net> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08net: graceful exit from netif_alloc_netdev_queues()Eric Dumazet
User space can crash kernel with ip link add ifb10 numtxqueues 100000 type ifb We must replace a BUG_ON() by proper test and return -EINVAL for crazy values. Fixes: 60877a32bce00 ("net: allow large number of tx queues") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08bridge: mdb: start delete timer for temp static entriesSatish Ashok
Start the delete timer when adding temp static entries so they can expire. Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: ccb1c31a7a87 ("bridge: add flags to distinguish permanent mdb entires") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08net_sched: gen_estimator: extend pps limitEric Dumazet
rate estimators are limited to 4 Mpps, which was fine years ago, but too small with current hardware generation. Lets use 2^5 scaling instead of 2^10 to get 128 Mpps new limit. On 64bit arch, use an "unsigned long" for temp storage and remove limit. (We do not expect 32bit arches to be able to reach this point) Tested: tc -s -d filter sh dev eth0 parent ffff: filter protocol ip pref 1 u32 filter protocol ip pref 1 u32 fh 800: ht divisor 1 filter protocol ip pref 1 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:15 match 07000000/ff000000 at 12 action order 1: gact action drop random type none pass val 0 index 1 ref 1 bind 1 installed 166 sec Action statistics: Sent 39734251496 bytes 863788076 pkt (dropped 863788117, overlimits 0 requeues 0) rate 4067Mbit 11053596pps backlog 0b 0p requeues 0 Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08netfilter: bridge: Use __in6_dev_get rather than in6_dev_get in br_validate_ipv6Julien Grall
The commit efb6de9b4ba0092b2c55f6a52d16294a8a698edd "netfilter: bridge: forward IPv6 fragmented packets" introduced a new function br_validate_ipv6 which take a reference on the inet6 device. Although, the reference is not released at the end. This will result to the impossibility to destroy any netdevice using ipv6 and bridge. It's possible to directly retrieve the inet6 device without taking a reference as all netfilter hooks are protected by rcu_read_lock via nf_hook_slow. Spotted while trying to destroy a Xen guest on the upstream Linux: "unregister_netdevice: waiting for vif1.0 to become free. Usage count = 1" Signed-off-by: Julien Grall <julien.grall@citrix.com> Cc: Bernhard Thaler <bernhard.thaler@wvnet.at> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: fw@strlen.de Cc: ian.campbell@citrix.com Cc: wei.liu2@citrix.com Cc: Bob Liu <bob.liu@oracle.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-03ipv6: Make MLD packets to only be processed locallyAngga
Before commit daad151263cf ("ipv6: Make ipv6_is_mld() inline and use it from ip6_mc_input().") MLD packets were only processed locally. After the change, a copy of MLD packet goes through ip6_mr_input, causing MRT6MSG_NOCACHE message to be generated to user space. Make MLD packet only processed locally. Fixes: daad151263cf ("ipv6: Make ipv6_is_mld() inline and use it from ip6_mc_input().") Signed-off-by: Hermin Anggawijaya <hermin.anggawijaya@alliedtelesis.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-03netlink: Delete an unnecessary check before the function call "module_put"Markus Elfring
The module_put() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-03net-RDS: Delete an unnecessary check before the function call "module_put"Markus Elfring
The module_put() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-03net-ipv6: Delete an unnecessary check before the function call "free_percpu"Markus Elfring
The free_percpu() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-02bridge: vlan: fix usage of vlan 0 and 4095 againNikolay Aleksandrov
Vlan ids 0 and 4095 were disallowed by commit: 8adff41c3d25 ("bridge: Don't use VID 0 and 4095 in vlan filtering") but then the check was removed when vlan ranges were introduced by: bdced7ef7838 ("bridge: support for multiple vlans and vlan ranges in setlink and dellink requests") So reintroduce the vlan range check. Before patch: [root@testvm ~]# bridge vlan add vid 0 dev eth0 master (succeeds) After Patch: [root@testvm ~]# bridge vlan add vid 0 dev eth0 master RTNETLINK answers: Invalid argument Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: bdced7ef7838 ("bridge: support for multiple vlans and vlan ranges in setlink and dellink requests") Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-02Merge branch 'for-upstream' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Johan Hedberg says: ==================== pull request: bluetooth 2015-07-02 A couple of regressions crept in because of a patch to use proper list APIs rather than manually reading & writing the next/prev pointers (commit 835a6a2f8603237a3e6cded5a6765090ecb06ea5). Turns out this was masking a few bugs: a missing INIT_LIST_HEAD() call and incorrectly using list_del() rather than list_del_init(). The two patches in this set fix these, and it'd be nice they could still make it to 4.2-rc1 to avoid new bug reports from users. Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-02Merge tag 'module_init-alternate_initcall-v4.1-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Pull module_init replacement part two from Paul Gortmaker: "Replace module_init with appropriate alternate initcall in non modules. This series converts non-modular code that is using the module_init() call to hook itself into the system to instead use one of our alternate priority initcalls. Unlike the previous series that used device_initcall and hence was a runtime no-op, these commits change to one of the alternate initcalls, because (a) we have them and (b) it seems like the right thing to do. For example, it would seem logical to use arch_initcall for arch specific setup code and fs_initcall for filesystem setup code. This does mean however, that changes in the init ordering will be taking place, and so there is a small risk that some kind of implicit init ordering issue may lie uncovered. But I think it is still better to give these ones sensible priorities than to just assign them all to device_initcall in order to exactly preserve the old ordering. Thad said, we have already made similar changes in core kernel code in commit c96d6660dc65 ("kernel: audit/fix non-modular users of module_init in core code") without any regressions reported, so this type of change isn't without precedent. It has also got the same local testing and linux-next coverage as all the other pull requests that I'm sending for this merge window have got. Once again, there is an unused module_exit function removal that shows up as an outlier upon casual inspection of the diffstat" * tag 'module_init-alternate_initcall-v4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: x86: perf_event_intel_pt.c: use arch_initcall to hook in enabling x86: perf_event_intel_bts.c: use arch_initcall to hook in enabling mm/page_owner.c: use late_initcall to hook in enabling lib/list_sort: use late_initcall to hook in self tests arm: use subsys_initcall in non-modular pl320 IPC code powerpc: don't use module_init for non-modular core hugetlb code powerpc: use subsys_initcall for Freescale Local Bus x86: don't use module_init for non-modular core bootflag code netfilter: don't use module_init/exit in core IPV4 code fs/notify: don't use module_init for non-modular inotify_user code mm: replace module_init usages with subsys_initcall in nommu.c
2015-07-02netfilter: nfnetlink: keep going batch handling on missing modulesPablo Neira Ayuso
After a fresh boot with no modules in place at all and a large rulesets, the existing nfnetlink_rcv_batch() funcion can take long time to commit the ruleset due to the many abort path. This is specifically a problem for the existing client of this code, ie. nf_tables, since it results in several synchronize_rcu() call in a row. This patch changes the policy to keep full batch processing on missing modules errors so we abort only once. Reported-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-02netfilter: bridge: don't leak skb in error pathsFlorian Westphal
br_nf_dev_queue_xmit must free skb in its error path. NF_DROP is misleading -- its an okfn, not a netfilter hook. Fixes: 462fb2af9788a ("bridge : Sanitize skb before it enters the IP stack") Fixes: efb6de9b4ba00 ("netfilter: bridge: forward IPv6 fragmented packets") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-02netfilter: arptables: use percpu jumpstackFlorian Westphal
commit 482cfc318559 ("netfilter: xtables: avoid percpu ruleset duplication") Unlike ip and ip6tables, arp tables were never converted to use the percpu jump stack. It still uses the rule blob to store return address, which isn't safe anymore since we now share this blob among all processors. Because there is no TEE support for arptables, we don't need to cope with reentrancy, so we can use loocal variable to hold stack offset. Fixes: 482cfc318559 ("netfilter: xtables: avoid percpu ruleset duplication") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-02netfilter: bridge: fix CONFIG_NF_DEFRAG_IPV4/6 related warnings/errorsBernhard Thaler
br_nf_ip_fragment() is not needed when neither CONFIG_NF_DEFRAG_IPV4 nor CONFIG_NF_DEFRAG_IPV6 is set. struct brnf_frag_data must be available if either CONFIG_NF_DEFRAG_IPV4 or CONFIG_NF_DEFRAG_IPV6 is set. Fixes: efb6de9b4ba0 ("netfilter: bridge: forward IPv6 fragmented packets") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-02netfilter: nf_queue: Don't recompute the hook_list headEric W. Biederman
If someone sends packets from one of the netdevice ingress hooks to the a userspace queue, and then userspace later accepts the packet, the netfilter code can enter an infinite loop as the list head will never be found. Pass in the saved list_head to avoid this. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) mlx4 driver bug fixes (TX queue wakeups, csum complete indications) from Ido Shamay, Eran Ben Elisha, and Or Gerlitz. 2) Missing unlock in error path of PTP support in renesas driver, from Dan Carpenter. 3) Add Vitesse 8641 phy IDs to vitesse PHY driver, from Shaohui Xie. 4) Bnx2x driver bug fixes (linearization of encap packets, scratchpad parity error notifications, flow-control and speed settings) from Yuval Mintz, Manish Chopra, Shahed Shaikh, and Ariel Elior. 5) ipv6 extension header parsing in the igb chip has a HW errata, disable it. Frm Todd Fujinaka. 6) Fix PCI link state locking issue in e1000e driver, from Yanir Lubetkin. 7) Cure panics during MTU change in i40e, from Mitch Williams. 8) Don't leak promisc refs in DSA slave driver, from Gilad Ben-Yossef. 9) Add missing HAS_DMA dep to VIA Rhine driver, from Geery Uytterhoeven. 10) Make sure DMA map/unmap calls are symmetric in bnx2x driver, from Michal Schmidt. 11) Workaround for MDIO access problems in bcm7xxx devices, from FLorian Fainelli. 12) Fix races in SCTP protocol between OTTB responses and route removals, from Alexander Sverdlin. 13) Fix jumbo frame checksum issue with some mvneta devices, from Simon Guinot. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (58 commits) sock_diag: don't broadcast kernel sockets net: mvneta: disable IP checksum with jumbo frames for Armada 370 ARM: mvebu: update Ethernet compatible string for Armada XP net: mvneta: introduce compatible string "marvell, armada-xp-neta" api: fix compatibility of linux/in.h with netinet/in.h net: icplus: fix typo in constant name sis900: Trivial: Fix typos in enums stmmac: Trivial: fix typo in constant name sctp: Fix race between OOTB responce and route removal net-Liquidio: Delete unnecessary checks before the function call "vfree" vmxnet3: Bump up driver version number amd-xgbe: Add the __GFP_NOWARN flag to Rx buffer allocation net: phy: mdio-bcm-unimac: workaround initial read failures for integrated PHYs net: bcmgenet: workaround initial read failures for integrated PHYs net: phy: bcm7xxx: workaround MDIO management controller initial read bnx2x: fix DMA API usage net: via: VIA_RHINE and VIA_VELOCITY should depend on HAS_DMA net/phy: tune get_phy_c45_ids to support more c45 phy bnx2x: fix lockdep splat net: fec: don't access RACC register when not available ...
2015-07-01Merge tag 'modules-next-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module updates from Rusty Russell: "Main excitement here is Peter Zijlstra's lockless rbtree optimization to speed module address lookup. He found some abusers of the module lock doing that too. A little bit of parameter work here too; including Dan Streetman's breaking up the big param mutex so writing a parameter can load another module (yeah, really). Unfortunately that broke the usual suspects, !CONFIG_MODULES and !CONFIG_SYSFS, so those fixes were appended too" * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (26 commits) modules: only use mod->param_lock if CONFIG_MODULES param: fix module param locks when !CONFIG_SYSFS. rcu: merge fix for Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE() module: add per-module param_lock module: make perm const params: suppress unused variable error, warn once just in case code changes. modules: clarify CONFIG_MODULE_COMPRESS help, suggest 'N'. kernel/module.c: avoid ifdefs for sig_enforce declaration kernel/workqueue.c: remove ifdefs over wq_power_efficient kernel/params.c: export param_ops_bool_enable_only kernel/params.c: generalize bool_enable_only kernel/module.c: use generic module param operaters for sig_enforce kernel/params: constify struct kernel_param_ops uses sysfs: tightened sysfs permission checks module: Rework module_addr_{min,max} module: Use __module_address() for module_address_lookup() module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING module: Optimize __module_address() using a latched RB-tree rbtree: Implement generic latch_tree seqlock: Introduce raw_read_seqcount_latch() ...
2015-06-30Bluetooth: Reinitialize the list after deletion for session user listTedd Ho-Jeong An
If the user->list is deleted with list_del(), it doesn't initialize the entry which can cause the issue with list_empty(). According to the comment from the list.h, list_empty() returns false even if the list is empty and put the entry in an undefined state. /** * list_del - deletes entry from list. * @entry: the element to delete from the list. * Note: list_empty() on entry does not return true after this, the entry is * in an undefined state. */ Because of this behavior, list_empty() returns false even if list is empty when the device is reconnected. So, user->list needs to be re-initialized after list_del(). list.h already have a macro list_del_init() which deletes the entry and initailze it again. Signed-off-by: Tedd Ho-Jeong An <tedd.an@intel.com> Tested-by: Jörg Otte <jrg.otte@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-06-30sock_diag: don't broadcast kernel socketsCraig Gallek
Kernel sockets do not hold a reference for the network namespace to which they point. Socket destruction broadcasting relies on the network namespace and will cause the splat below when a kernel socket is destroyed. This fix simply ignores kernel sockets when they are destroyed. Reported as: general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC CPU: 1 PID: 9130 Comm: kworker/1:1 Not tainted 4.1.0-gelk-debug+ #1 Workqueue: sock_diag_events sock_diag_broadcast_destroy_work Stack: ffff8800b9c586c0 ffff8800b9c586c0 ffff8800ac4692c0 ffff8800936d4a90 ffff8800352efd38 ffffffff8469a93e ffff8800352efd98 ffffffffc09b9b90 ffff8800352efd78 ffff8800ac4692c0 ffff8800b9c586c0 ffff8800831b6ab8 Call Trace: [<ffffffff8469a93e>] ? mutex_unlock+0xe/0x10 [<ffffffffc09b9b90>] ? inet_diag_handler_get_info+0x110/0x1fb [inet_diag] [<ffffffff845c868d>] netlink_broadcast+0x1d/0x20 [<ffffffff8469a93e>] ? mutex_unlock+0xe/0x10 [<ffffffff845b2bf5>] sock_diag_broadcast_destroy_work+0xd5/0x160 [<ffffffff8408ea97>] process_one_work+0x147/0x420 [<ffffffff8408f0f9>] worker_thread+0x69/0x470 [<ffffffff8409fda3>] ? preempt_count_sub+0xa3/0xf0 [<ffffffff8408f090>] ? rescuer_thread+0x320/0x320 [<ffffffff84093cd7>] kthread+0x107/0x120 [<ffffffff84093bd0>] ? kthread_create_on_node+0x1b0/0x1b0 [<ffffffff8469d31f>] ret_from_fork+0x3f/0x70 [<ffffffff84093bd0>] ? kthread_create_on_node+0x1b0/0x1b0 Tested: Using a debug kernel while 'ss -E' is running: ip netns add test-ns ip netns delete test-ns Fixes: eb4cb008529c sock_diag: define destruction multicast groups Fixes: 26abe14379f8 net: Modify sk_alloc to not reference count the netns of kernel sockets. Reported-by: Dave Jones <davej@codemonkey.org.uk> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-29sctp: Fix race between OOTB responce and route removalAlexander Sverdlin
There is NULL pointer dereference possible during statistics update if the route used for OOTB responce is removed at unfortunate time. If the route exists when we receive OOTB packet and we finally jump into sctp_packet_transmit() to send ABORT, but in the meantime route is removed under our feet, we take "no_route" path and try to update stats with IP_INC_STATS(sock_net(asoc->base.sk), ...). But sctp_ootb_pkt_new() used to prepare responce packet doesn't call sctp_transport_set_owner() and therefore there is no asoc associated with this packet. Probably temporary asoc just for OOTB responces is overkill, so just introduce a check like in all other places in sctp_packet_transmit(), where "asoc" is dereferenced. To reproduce this, one needs to 0. ensure that sctp module is loaded (otherwise ABORT is not generated) 1. remove default route on the machine 2. while true; do ip route del [interface-specific route] ip route add [interface-specific route] done 3. send enough OOTB packets (i.e. HB REQs) from another host to trigger ABORT responce On x86_64 the crash looks like this: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp] PGD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: ... CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 4.0.5-1-ARCH #1 Hardware name: ... task: ffffffff818124c0 ti: ffffffff81800000 task.ti: ffffffff81800000 RIP: 0010:[<ffffffffa05ec9ac>] [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp] RSP: 0018:ffff880127c037b8 EFLAGS: 00010296 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000015ff66b480 RDX: 00000015ff66b400 RSI: ffff880127c17200 RDI: ffff880123403700 RBP: ffff880127c03888 R08: 0000000000017200 R09: ffffffff814625af R10: ffffea00047e4680 R11: 00000000ffffff80 R12: ffff8800b0d38a28 R13: ffff8800b0d38a28 R14: ffff8800b3e88000 R15: ffffffffa05f24e0 FS: 0000000000000000(0000) GS:ffff880127c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000020 CR3: 00000000c855b000 CR4: 00000000000007f0 Stack: ffff880127c03910 ffff8800b0d38a28 ffffffff8189d240 ffff88011f91b400 ffff880127c03828 ffffffffa05c94c5 0000000000000000 ffff8800baa1c520 0000000000000000 0000000000000001 0000000000000000 0000000000000000 Call Trace: <IRQ> [<ffffffffa05c94c5>] ? sctp_sf_tabort_8_4_8.isra.20+0x85/0x140 [sctp] [<ffffffffa05d6b42>] ? sctp_transport_put+0x52/0x80 [sctp] [<ffffffffa05d0bfc>] sctp_do_sm+0xb8c/0x19a0 [sctp] [<ffffffff810b0e00>] ? trigger_load_balance+0x90/0x210 [<ffffffff810e0329>] ? update_process_times+0x59/0x60 [<ffffffff812c7a40>] ? timerqueue_add+0x60/0xb0 [<ffffffff810e0549>] ? enqueue_hrtimer+0x29/0xa0 [<ffffffff8101f599>] ? read_tsc+0x9/0x10 [<ffffffff8116d4b5>] ? put_page+0x55/0x60 [<ffffffff810ee1ad>] ? clockevents_program_event+0x6d/0x100 [<ffffffff81462b68>] ? skb_free_head+0x58/0x80 [<ffffffffa029a10b>] ? chksum_update+0x1b/0x27 [crc32c_generic] [<ffffffff81283f3e>] ? crypto_shash_update+0xce/0xf0 [<ffffffffa05d3993>] sctp_endpoint_bh_rcv+0x113/0x280 [sctp] [<ffffffffa05dd4e6>] sctp_inq_push+0x46/0x60 [sctp] [<ffffffffa05ed7a0>] sctp_rcv+0x880/0x910 [sctp] [<ffffffffa05ecb50>] ? sctp_packet_transmit_chunk+0xb0/0xb0 [sctp] [<ffffffffa05ecb70>] ? sctp_csum_update+0x20/0x20 [sctp] [<ffffffff814b05a5>] ? ip_route_input_noref+0x235/0xd30 [<ffffffff81051d6b>] ? ack_ioapic_level+0x7b/0x150 [<ffffffff814b27be>] ip_local_deliver_finish+0xae/0x210 [<ffffffff814b2e15>] ip_local_deliver+0x35/0x90 [<ffffffff814b2a15>] ip_rcv_finish+0xf5/0x370 [<ffffffff814b3128>] ip_rcv+0x2b8/0x3a0 [<ffffffff81474193>] __netif_receive_skb_core+0x763/0xa50 [<ffffffff81476c28>] __netif_receive_skb+0x18/0x60 [<ffffffff81476cb0>] netif_receive_skb_internal+0x40/0xd0 [<ffffffff814776c8>] napi_gro_receive+0xe8/0x120 [<ffffffffa03946aa>] rtl8169_poll+0x2da/0x660 [r8169] [<ffffffff8147896a>] net_rx_action+0x21a/0x360 [<ffffffff81078dc1>] __do_softirq+0xe1/0x2d0 [<ffffffff8107912d>] irq_exit+0xad/0xb0 [<ffffffff8157d158>] do_IRQ+0x58/0xf0 [<ffffffff8157b06d>] common_interrupt+0x6d/0x6d <EOI> [<ffffffff810e1218>] ? hrtimer_start+0x18/0x20 [<ffffffffa05d65f9>] ? sctp_transport_destroy_rcu+0x29/0x30 [sctp] [<ffffffff81020c50>] ? mwait_idle+0x60/0xa0 [<ffffffff810216ef>] arch_cpu_idle+0xf/0x20 [<ffffffff810b731c>] cpu_startup_entry+0x3ec/0x480 [<ffffffff8156b365>] rest_init+0x85/0x90 [<ffffffff818eb035>] start_kernel+0x48b/0x4ac [<ffffffff818ea120>] ? early_idt_handlers+0x120/0x120 [<ffffffff818ea339>] x86_64_start_reservations+0x2a/0x2c [<ffffffff818ea49c>] x86_64_start_kernel+0x161/0x184 Code: 90 48 8b 80 b8 00 00 00 48 89 85 70 ff ff ff 48 83 bd 70 ff ff ff 00 0f 85 cd fa ff ff 48 89 df 31 db e8 18 63 e7 e0 48 8b 45 80 <48> 8b 40 20 48 8b 40 30 48 8b 80 68 01 00 00 65 48 ff 40 78 e9 RIP [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp] RSP <ffff880127c037b8> CR2: 0000000000000020 ---[ end trace 5aec7fd2dc983574 ]--- Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff) drm_kms_helper: panic occurred, switching back to text console ---[ end Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-28dsa: fix promiscuity leak on slave dev open errorGilad Ben-Yossef
DSA master netdev promiscuity counter was not being properly decremented on slave device open error path. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> CC: Gilad Ben-Yossef <giladb@ezchip.com> CC: David S. Miller <davem@davemloft.net> CC: Florian Fainelli <f.fainelli@gmail.com> CC: Guenter Roeck <linux@roeck-us.net> CC: Andrew Lunn <andrew@lunn.ch> CC: Scott Feldman <sfeldma@gmail.com> Acked-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-28net: Kill sock->sk_protinfoDavid Miller
No more users, so it can now be removed. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-28ax25: Stop using sock->sk_protinfo.David Miller
Just make a ax25_sock structure that provides the ax25_cb pointer. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-28flow_dissector: Pre-initialize ip_proto in __skb_flow_dissect()Geert Uytterhoeven
net/core/flow_dissector.c: In function ‘__skb_flow_dissect’: net/core/flow_dissector.c:132: warning: ‘ip_proto’ may be used uninitialized in this function Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-28ipv4: fix RCU lockdep warning from linkdown changesAndy Gospodarek
The following lockdep splat was seen due to the wrong context for grabbing in_dev. =============================== [ INFO: suspicious RCU usage. ] 4.1.0-next-20150626-dbg-00020-g54a6d91-dirty #244 Not tainted ------------------------------- include/linux/inetdevice.h:205 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by ip/403: #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81453305>] rtnl_lock+0x17/0x19 #1: ((inetaddr_chain).rwsem){.+.+.+}, at: [<ffffffff8105c327>] __blocking_notifier_call_chain+0x35/0x6a stack backtrace: CPU: 2 PID: 403 Comm: ip Not tainted 4.1.0-next-20150626-dbg-00020-g54a6d91-dirty #244 0000000000000001 ffff8800b189b728 ffffffff8150a542 ffffffff8107a8b3 ffff880037bbea40 ffff8800b189b758 ffffffff8107cb74 ffff8800379dbd00 ffff8800bec85800 ffff8800bf9e13c0 00000000000000ff ffff8800b189b7d8 Call Trace: [<ffffffff8150a542>] dump_stack+0x4c/0x6e [<ffffffff8107a8b3>] ? up+0x39/0x3e [<ffffffff8107cb74>] lockdep_rcu_suspicious+0xf7/0x100 [<ffffffff814b63c3>] fib_dump_info+0x227/0x3e2 [<ffffffff814b6624>] rtmsg_fib+0xa6/0x116 [<ffffffff814b978f>] fib_table_insert+0x316/0x355 [<ffffffff814b362e>] fib_magic+0xb7/0xc7 [<ffffffff814b4803>] fib_add_ifaddr+0xb1/0x13b [<ffffffff814b4d09>] fib_inetaddr_event+0x36/0x90 [<ffffffff8105c086>] notifier_call_chain+0x4c/0x71 [<ffffffff8105c340>] __blocking_notifier_call_chain+0x4e/0x6a [<ffffffff8105c370>] blocking_notifier_call_chain+0x14/0x16 [<ffffffff814a7f50>] __inet_insert_ifa+0x1a5/0x1b3 [<ffffffff814a894d>] inet_rtm_newaddr+0x350/0x35f [<ffffffff81457866>] rtnetlink_rcv_msg+0x17b/0x18a [<ffffffff8107e7c3>] ? trace_hardirqs_on+0xd/0xf [<ffffffff8146965f>] ? netlink_deliver_tap+0x1cb/0x1f7 [<ffffffff814576eb>] ? rtnl_newlink+0x72a/0x72a ... This patch resolves that splat. Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-28tipc: purge backlog queue counters when broadcast link is resetJon Paul Maloy
In commit 1f66d161ab3d8b518903fa6c3f9c1f48d6919e74 ("tipc: introduce starvation free send algorithm") we introduced a counter per priority level for buffers in the link backlog queue. We also introduced a new function tipc_link_purge_backlog(), to reset these counters to zero when the link is reset. Unfortunately, we missed to call this function when the broadcast link is reset, with the result that the values of these counters might be permanently skewed when new nodes are attached. This may in the worst case lead to permananent, but spurious, broadcast link congestion, where no broadcast packets can be sent at all. We fix this bug with this commit. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-27Merge branch 'for-4.2' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "A relatively quiet cycle, with a mix of cleanup and smaller bugfixes" * 'for-4.2' of git://linux-nfs.org/~bfields/linux: (24 commits) sunrpc: use sg_init_one() in krb5_rc4_setup_enc/seq_key() nfsd: wrap too long lines in nfsd4_encode_read nfsd: fput rd_file from XDR encode context nfsd: take struct file setup fully into nfs4_preprocess_stateid_op nfsd: refactor nfs4_preprocess_stateid_op nfsd: clean up raparams handling nfsd: use swap() in sort_pacl_range() rpcrdma: Merge svcrdma and xprtrdma modules into one svcrdma: Add a separate "max data segs macro for svcrdma svcrdma: Replace GFP_KERNEL in a loop with GFP_NOFAIL svcrdma: Keep rpcrdma_msg fields in network byte-order svcrdma: Fix byte-swapping in svc_rdma_sendto.c nfsd: Update callback sequnce id only CB_SEQUENCE success nfsd: Reset cb_status in nfsd4_cb_prepare() at retrying svcrdma: Remove svc_rdma_xdr_decode_deferred_req() SUNRPC: Move EXPORT_SYMBOL for svc_process uapi/nfs: Add NFSv4.1 ACL definitions nfsd: Remove dead declarations nfsd: work around a gcc-5.1 warning nfsd: Checking for acl support does not require fetching any acls ...
2015-06-26Bluetooth: hidp: Initialize list header of hidp session userTedd Ho-Jeong An
When new hidp session is created, list header in l2cap_user is not initialized and this causes list_empty() to fail in l2cap_register_user() even if l2cap_user list is empty. Signed-off-by: Tedd Ho-Jeong An <tedd.an@intel.com> Tested-by: Jörg Otte <jrg.otte@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-06-25net: sched: flower fix typoJamal Hadi Salim
Fix typo in the validation rules for flower's attributes Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) Add TX fast path in mac80211, from Johannes Berg. 2) Add TSO/GRO support to ibmveth, from Thomas Falcon 3) Move away from cached routes in ipv6, just like ipv4, from Martin KaFai Lau. 4) Lots of new rhashtable tests, from Thomas Graf. 5) Run ingress qdisc lockless, from Alexei Starovoitov. 6) Allow servers to fetch TCP packet headers for SYN packets of new connections, for fingerprinting. From Eric Dumazet. 7) Add mode parameter to pktgen, for testing receive. From Alexei Starovoitov. 8) Cache access optimizations via simplifications of build_skb(), from Alexander Duyck. 9) Move page frag allocator under mm/, also from Alexander. 10) Add xmit_more support to hv_netvsc, from KY Srinivasan. 11) Add a counter guard in case we try to perform endless reclassify loops in the packet scheduler. 12) Extern flow dissector to be programmable and use it in new "Flower" classifier. From Jiri Pirko. 13) AF_PACKET fanout rollover fixes, performance improvements, and new statistics. From Willem de Bruijn. 14) Add netdev driver for GENEVE tunnels, from John W Linville. 15) Add ingress netfilter hooks and filtering, from Pablo Neira Ayuso. 16) Fix handling of epoll edge triggers in TCP, from Eric Dumazet. 17) Add an ECN retry fallback for the initial TCP handshake, from Daniel Borkmann. 18) Add tail call support to BPF, from Alexei Starovoitov. 19) Add several pktgen helper scripts, from Jesper Dangaard Brouer. 20) Add zerocopy support to AF_UNIX, from Hannes Frederic Sowa. 21) Favor even port numbers for allocation to connect() requests, and odd port numbers for bind(0), in an effort to help avoid ip_local_port_range exhaustion. From Eric Dumazet. 22) Add Cavium ThunderX driver, from Sunil Goutham. 23) Allow bpf programs to access skb_iif and dev->ifindex SKB metadata, from Alexei Starovoitov. 24) Add support for T6 chips in cxgb4vf driver, from Hariprasad Shenai. 25) Double TCP Small Queues default to 256K to accomodate situations like the XEN driver and wireless aggregation. From Wei Liu. 26) Add more entropy inputs to flow dissector, from Tom Herbert. 27) Add CDG congestion control algorithm to TCP, from Kenneth Klette Jonassen. 28) Convert ipset over to RCU locking, from Jozsef Kadlecsik. 29) Track and act upon link status of ipv4 route nexthops, from Andy Gospodarek. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1670 commits) bridge: vlan: flush the dynamically learned entries on port vlan delete bridge: multicast: add a comment to br_port_state_selection about blocking state net: inet_diag: export IPV6_V6ONLY sockopt stmmac: troubleshoot unexpected bits in des0 & des1 net: ipv4 sysctl option to ignore routes when nexthop link is down net: track link-status of ipv4 nexthops net: switchdev: ignore unsupported bridge flags net: Cavium: Fix MAC address setting in shutdown state drivers: net: xgene: fix for ACPI support without ACPI ip: report the original address of ICMP messages net/mlx5e: Prefetch skb data on RX net/mlx5e: Pop cq outside mlx5e_get_cqe net/mlx5e: Remove mlx5e_cq.sqrq back-pointer net/mlx5e: Remove extra spaces net/mlx5e: Avoid TX CQE generation if more xmit packets expected net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq() net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device ...
2015-06-24bridge: vlan: flush the dynamically learned entries on port vlan deleteNikolay Aleksandrov
Add a new argument to br_fdb_delete_by_port which allows to specify a vid to match when flushing entries and use it in nbp_vlan_delete() to flush the dynamically learned entries of the vlan/port pair when removing a vlan from a port. Before this patch only the local mac was being removed and the dynamically learned ones were left to expire. Note that the do_all argument is still respected and if specified, the vid will be ignored. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24bridge: multicast: add a comment to br_port_state_selection about blocking stateNikolay Aleksandrov
Add a comment to explain why we're not disabling port's multicast when it goes in blocking state. Since there's a check in the timer's function which bypasses the timer if the port's in blocking/disabled state, the timer will simply expire and stop without sending more queries. Suggested-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>