summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2015-07-27packet: missing dev_put() in packet_do_bind()Lars Westerhoff
When binding a PF_PACKET socket, the use count of the bound interface is always increased with dev_hold in dev_get_by_{index,name}. However, when rebound with the same protocol and device as in the previous bind the use count of the interface was not decreased. Ultimately, this caused the deletion of the interface to fail with the following message: unregister_netdevice: waiting for dummy0 to become free. Usage count = 1 This patch moves the dev_put out of the conditional part that was only executed when either the protocol or device changed on a bind. Fixes: 902fefb82ef7 ('packet: improve socket create/bind latency in some cases') Signed-off-by: Lars Westerhoff <lars.westerhoff@newtec.eu> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-27SUNRPC: Report TCP errors to the callerTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-07-27fib_trie: Drop unnecessary calls to leaf_pull_suffixAlexander Duyck
It was reported that update_suffix was taking a long time on systems where a large number of leaves were attached to a single node. As it turns out fib_table_flush was calling update_suffix for each leaf that didn't have all of the aliases stripped from it. As a result, on this large node removing one leaf would result in us calling update_suffix for every other leaf on the node. The fix is to just remove the calls to leaf_pull_suffix since they are redundant as we already have a call in resize that will go through and update the suffix length for the node before we exit out of fib_table_flush or fib_table_flush_external. Reported-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-27sunrpc: translate -EAGAIN to -ENOBUFS when socket is writable.NeilBrown
The networking layer does not reliably report the distinction between a non-block write failing because: 1/ the queue is too full already and 2/ a memory allocation attempt failed. The distinction is important because in the first case it is appropriate to retry as soon as the socket reports that it is writable, and in the second case a small delay is required as the socket will most likely report as writable but kmalloc could still fail. sk_stream_wait_memory() exhibits this distinction nicely, setting 'vm_wait' if a small wait is needed. However in the non-blocking case it always returns -EAGAIN no matter the cause of the failure. This -EAGAIN call get all the way to sunrpc. The sunrpc layer expects EAGAIN to indicate the first cause, and ENOBUFS to indicate the second. Various documentation suggests that this is not unreasonable, but does not guarantee the desired error codes. The result of getting -EAGAIN when -ENOBUFS is expected is that the send is tried again in a tight loop and soft lockups are reported. so: add tests after calls to xs_sendpages() to translate -EAGAIN into -ENOBUFS if the socket is writable. This cannot happen inside xs_sendpages() as the test for "is socket writable" is different between TCP and UDP. With this change, the tight loop retrying xs_sendpages() becomes a loop which only retries every 250ms, and so will not trigger a soft-lockup warning. It is possible that the write did fail because the queue was too full and by the time xs_sendpages() completed, the queue was writable again. In this case an extra 250ms delay is inserted that isn't really needed. This circumstance suggests a degree of congestion so a delay is not necessarily a bad thing, and it can only cause a single 250ms delay, not a series of them. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-07-27lwtunnel: use kfree_skb() instead of vanilla kfree()Dan Carpenter
kfree_skb() is correct here. Fixes: ffce41962ef6 ('lwtunnel: support dst output redirect function') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-27tcp: tso: allow deferring under reordering stateEric Dumazet
While doing experiments with reordering resilience, we found linux senders were not able to send at full speed under reordering, because every incoming SACK was releasing one MSS. This patch removes the limitation, as we did for CWR state in commit a0ea700e409 ("tcp: tso: allow CA_CWR state in tcp_tso_should_defer()") Neal Cardwell had a concern about limited transmit so Yuchung conducted experiments on GFE and found nothing worth adding an extra check on fast path : if (icsk->icsk_ca_state == TCP_CA_Disorder && tcp_sk(sk)->reordering == sysctl_tcp_reordering) goto send_now; Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-27ipv6: Avoid rt6_probe() taking writer lock in the fast pathMartin KaFai Lau
The patch checks neigh->nud_state before acquiring the writer lock. Note that rt6_probe() is only used in CONFIG_IPV6_ROUTER_PREF. 40 udpflood processes and a /64 gateway route are used. The gateway has NUD_PERMANENT. Each of them is run for 30s. At the end, the total number of finished sendto(): Before: 55M After: 95M Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> CC: Julian Anastasov <ja@ssi.bg> CC: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-27ipv6: Re-arrange code in rt6_probe()Martin KaFai Lau
It is a prep work for the next patch to remove write_lock from rt6_probe(). 1. Reduce the number of if(neigh) check. From 4 to 1. 2. Bring the write_(un)lock() closer to the operations that the lock is protecting. Hopefully, the above make rt6_probe() more readable. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Julian Anastasov <ja@ssi.bg> Cc: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-27tcp: fix recv with flags MSG_WAITALL | MSG_PEEKSabrina Dubroca
Currently, tcp_recvmsg enters a busy loop in sk_wait_data if called with flags = MSG_WAITALL | MSG_PEEK. sk_wait_data waits for sk_receive_queue not empty, but in this case, the receive queue is not empty, but does not contain any skb that we can use. Add a "last skb seen on receive queue" argument to sk_wait_data, so that it sleeps until the receive queue has new skbs. Link: https://bugzilla.kernel.org/show_bug.cgi?id=99461 Link: https://sourceware.org/bugzilla/show_bug.cgi?id=18493 Link: https://bugzilla.redhat.com/show_bug.cgi?id=1205258 Reported-by: Enrico Scholz <rh-bugzilla@ensc.de> Reported-by: Dan Searle <dan@censornet.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-27lwtunnel: change prototype of lwtunnel_state_get()Nicolas Dichtel
It saves some lines and simplify a bit the code when the state is returning by this function. It's also useful to handle a NULL entry. To avoid too long lines, I've also renamed lwtunnel_state_get() and lwtunnel_state_put() to lwtstate_get() and lwtstate_put(). CC: Thomas Graf <tgraf@suug.ch> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-27ipv6: copy lwtstate in ip6_rt_copy_init()Nicolas Dichtel
We need to copy this field (ip6_rt_cache_alloc() and ip6_rt_pcpu_alloc() use ip6_rt_copy_init() to build a dst). CC: Thomas Graf <tgraf@suug.ch> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Fixes: 19e42e451506 ("ipv6: support for fib route lwtunnel encap attributes") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-27ipv6: use lwtunnel_output6() only if flag redirect is setNicolas Dichtel
This function make sense only when LWTUNNEL_STATE_OUTPUT_REDIRECT is set. The check is already done in IPv4. CC: Thomas Graf <tgraf@suug.ch> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Fixes: 74a0f2fe8ed5 ("ipv6: rt6_info output redirect to tunnel output") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-27dev: Spelling fix in commentssubashab@codeaurora.org
Fix the following typo - unchainged -> unchanged Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26Merge branch 'for-upstream' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Johan Hedberg says: ==================== pull request: bluetooth 2015-07-23 Here's another one-patch pull request for 4.2 which targets a potential NULL pointer dereference in the LE Security Manager code that can be triggered by using older user space tools. The issue has been there since 4.0 so there's the appropriate "Cc: stable" in place. Let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26bridge: mdb: notify on router port add and delSatish Ashok
Send notifications on router port add and del/expire, re-use the already existing MDBA_ROUTER and send NEWMDB/DELMDB netlink notifications respectively. Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26openvswitch: Retrieve tunnel metadata when receiving from vport-netdevThomas Graf
Retrieve the tunnel metadata for packets received by a net_device and provide it to ovs_vport_receive() for flow key extraction. [This hunk was in the GRE patch in the initial series and missed the cut for the initial submission for merging.] Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device") Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26inet: frags: remove INET_FRAG_EVICTED and use list_evictor for the testNikolay Aleksandrov
We can simply remove the INET_FRAG_EVICTED flag to avoid all the flags race conditions with the evictor and use a participation test for the evictor list, when we're at that point (after inet_frag_kill) in the timer there're 2 possible cases: 1. The evictor added the entry to its evictor list while the timer was waiting for the chainlock or 2. The timer unchained the entry and the evictor won't see it In both cases we should be able to see list_evictor correctly due to the sync on the chainlock. Joint work with Florian Westphal. Tested-by: Frank Schreuder <fschreuder@transip.nl> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26inet: frag: don't wait for timer deletion when evictingFlorian Westphal
Frank reports 'NMI watchdog: BUG: soft lockup' errors when load is high. Instead of (potentially) unbounded restarts of the eviction process, just skip to the next entry. One caveat is that, when a netns is exiting, a timer may still be running by the time inet_evict_bucket returns. We use the frag memory accounting to wait for outstanding timers, so that when we free the percpu counter we can be sure no running timer will trip over it. Reported-and-tested-by: Frank Schreuder <fschreuder@transip.nl> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26inet: frag: change *_frag_mem_limit functions to take netns_frags as argumentFlorian Westphal
Followup patch will call it after inet_frag_queue was freed, so q->net doesn't work anymore (but netf = q->net; free(q); mem_limit(netf) would). Tested-by: Frank Schreuder <fschreuder@transip.nl> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26inet: frag: don't re-use chainlist for evictorFlorian Westphal
commit 65ba1f1ec0eff ("inet: frags: fix a race between inet_evict_bucket and inet_frag_kill") describes the bug, but the fix doesn't work reliably. Problem is that ->flags member can be set on other cpu without chainlock being held by that task, i.e. the RMW-Cycle can clear INET_FRAG_EVICTED bit after we put the element on the evictor private list. We can crash when walking the 'private' evictor list since an element can be deleted from list underneath the evictor. Join work with Nikolay Alexandrov. Fixes: b13d3cbfb8e8 ("inet: frag: move eviction of queues to work queue") Reported-by: Johan Schuijt <johan@transip.nl> Tested-by: Frank Schreuder <fschreuder@transip.nl> Signed-off-by: Nikolay Alexandrov <nikolay@cumulusnetworks.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26openvswitch: fix compilation when vxlan is a moduleNicolas Dichtel
With CONFIG_VXLAN=m and CONFIG_OPENVSWITCH=y, there was the following compilation error: LD init/built-in.o net/built-in.o: In function `vxlan_tnl_create': .../net/openvswitch/vport-netdev.c:322: undefined reference to `vxlan_dev_create' make: *** [vmlinux] Error 1 CC: Thomas Graf <tgraf@suug.ch> Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26ipv4: be more aggressive when probing alternative gatewaysJulian Anastasov
Currently, we do not notice if new alternative gateways are added. We can do it by checking for present neigh entry. Also, gateways that are currently probed (NUD_INCOMPLETE) can be skipped from round-robin probing. Suggested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26ipv6: fix crash over flow-based vxlan deviceWei-Chun Chao
Similar check was added in ip_rcv but not in ipv6_rcv. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81734e0a>] ipv6_rcv+0xfa/0x500 Call Trace: [<ffffffff816c9786>] ? ip_rcv+0x296/0x400 [<ffffffff817732d2>] ? packet_rcv+0x52/0x410 [<ffffffff8168e99f>] __netif_receive_skb_core+0x63f/0x9a0 [<ffffffffc02b34a0>] ? br_handle_frame_finish+0x580/0x580 [bridge] [<ffffffff8109912c>] ? update_rq_clock.part.81+0x1c/0x40 [<ffffffff8168ed18>] __netif_receive_skb+0x18/0x60 [<ffffffff8168fa1f>] process_backlog+0x9f/0x150 Fixes: ee122c79d422 (vxlan: Flow based tunneling) Signed-off-by: Wei-Chun Chao <weichunc@plumgrid.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26net: sctp: stop spamming klog with rfc6458, 5.3.2. deprecation warningsDaniel Borkmann
Back then when we added support for SCTP_SNDINFO/SCTP_RCVINFO from RFC6458 5.3.4/5.3.5, we decided to add a deprecation warning for the (as per RFC deprecated) SCTP_SNDRCV via commit bbbea41d5e53 ("net: sctp: deprecate rfc6458, 5.3.2. SCTP_SNDRCV support"), see [1]. Imho, it was not a good idea, and we should just revert that message for a couple of reasons: 1) It's uapi and therefore set in stone forever. 2) To be able to run on older and newer kernels, an SCTP application would need to probe for both, SCTP_SNDRCV, but also SCTP_SNDINFO/ SCTP_RCVINFO support, so that on older kernels, it can make use of SCTP_SNDRCV, and on newer kernels SCTP_SNDINFO/SCTP_RCVINFO. In my (limited) experience, a lot of SCTP appliances are migrating to newer kernels only ve(ee)ry slowly. 3) Some people don't have the chance to change their applications, f.e. due to proprietary legacy stuff. So, they'll hit this warning in fast path and are stuck with older kernels. But i.e. due to point 1) I really fail to see the benefit of a warning. So just revert that for now, the issue was reported up Jamal. [1] http://thread.gmane.org/gmane.linux.network/321960/ Reported-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Michael Tuexen <tuexen@fh-muenster.de> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26tipc: clean up socket layer message receptionJon Paul Maloy
When a message is received in a socket, one of the call chains tipc_sk_rcv()->tipc_sk_enqueue()->filter_rcv()(->tipc_sk_proto_rcv()) or tipc_sk_backlog_rcv()->filter_rcv()(->tipc_sk_proto_rcv()) are followed. At each of these levels we may encounter situations where the message may need to be rejected, or a new message produced for transfer back to the sender. Despite recent improvements, the current code for doing this is perceived as awkward and hard to follow. Leveraging the two previous commits in this series, we now introduce a more uniform handling of such situations. We let each of the functions in the chain itself produce/reverse the message to be returned to the sender, but also perform the actual forwarding. This simplifies the necessary logics within each function. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26tipc: introduce new tipc_sk_respond() functionJon Paul Maloy
Currently, we use the code sequence if (msg_reverse()) tipc_link_xmit_skb() at numerous locations in socket.c. The preparation of arguments for these calls, as well as the sequence itself, makes the code unecessarily complex. In this commit, we introduce a new function, tipc_sk_respond(), that performs this call combination. We also replace some, but not yet all, of these explicit call sequences with calls to the new function. Notably, we let the function tipc_sk_proto_rcv() use the new function to directly send out PROBE_REPLY messages, instead of deferring this to the calling tipc_sk_rcv() function, as we do now. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26tipc: let function tipc_msg_reverse() expand header when neededJon Paul Maloy
The shortest TIPC message header, for cluster local CONNECTED messages, is 24 bytes long. With this format, the fields "dest_node" and "orig_node" are optimized away, since they in reality are redundant in this particular case. However, the absence of these fields leads to code inconsistencies that are difficult to handle in some cases, especially when we need to reverse or reject messages at the socket layer. In this commit, we concentrate the handling of the absent fields to one place, by letting the function tipc_msg_reverse() reallocate the buffer and expand the header to 32 bytes when necessary. This means that the socket code now can assume that the two previously absent fields are present in the header when a message needs to be rejected. This opens up for some further simplifications of the socket code. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26bridge: netlink: fix slave_changelink/br_setport race conditionsNikolay Aleksandrov
Since slave_changelink support was added there have been a few race conditions when using br_setport() since some of the port functions it uses require the bridge lock. It is very easy to trigger a lockup due to some internal spin_lock() usage without bh disabled, also it's possible to get the bridge into an inconsistent state. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: 3ac636b8591c ("bridge: implement rtnl_link_ops->slave_changelink") Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains ten Netfilter/IPVS fixes, they are: 1) Address refcount leak when creating an expectation from the ctnetlink interface. 2) Fix bug splat in the IDLETIMER target related to sysfs, from Dmitry Torokhov. 3) Resolve panic for unreachable route in IPVS with locally generated traffic in the output path, from Alex Gartrell. 4) Fix wrong source address in rare cases for tunneled traffic in IPVS, from Julian Anastasov. 5) Fix crash if scheduler is changed via ipvsadm -E, again from Julian. 6) Make sure skb->sk is unset for forwarded traffic through IPVS, again from Alex Gartrell. 7) Fix crash with IPVS sync protocol v0 and FTP, from Julian. 8) Reset sender cpu for forwarded traffic in IPVS, also from Julian. 9) Allocate template conntracks through kmalloc() to resolve netns dependency problems with the conntrack kmem_cache. 10) Fix zones with expectations that clash using the same tuple, from Joe Stringer. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-25cgroup: net_cls: fix false-positive "suspicious RCU usage"Konstantin Khlebnikov
In dev_queue_xmit() net_cls protected with rcu-bh. [ 270.730026] =============================== [ 270.730029] [ INFO: suspicious RCU usage. ] [ 270.730033] 4.2.0-rc3+ #2 Not tainted [ 270.730036] ------------------------------- [ 270.730040] include/linux/cgroup.h:353 suspicious rcu_dereference_check() usage! [ 270.730041] other info that might help us debug this: [ 270.730043] rcu_scheduler_active = 1, debug_locks = 1 [ 270.730045] 2 locks held by dhclient/748: [ 270.730046] #0: (rcu_read_lock_bh){......}, at: [<ffffffff81682b70>] __dev_queue_xmit+0x50/0x960 [ 270.730085] #1: (&qdisc_tx_lock){+.....}, at: [<ffffffff81682d60>] __dev_queue_xmit+0x240/0x960 [ 270.730090] stack backtrace: [ 270.730096] CPU: 0 PID: 748 Comm: dhclient Not tainted 4.2.0-rc3+ #2 [ 270.730098] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011 [ 270.730100] 0000000000000001 ffff8800bafeba58 ffffffff817ad487 0000000000000007 [ 270.730103] ffff880232a0a780 ffff8800bafeba88 ffffffff810ca4f2 ffff88022fb23e00 [ 270.730105] ffff880232a0a780 ffff8800bafebb68 ffff8800bafebb68 ffff8800bafebaa8 [ 270.730108] Call Trace: [ 270.730121] [<ffffffff817ad487>] dump_stack+0x4c/0x65 [ 270.730148] [<ffffffff810ca4f2>] lockdep_rcu_suspicious+0xe2/0x120 [ 270.730153] [<ffffffff816a62d2>] task_cls_state+0x92/0xa0 [ 270.730158] [<ffffffffa00b534f>] cls_cgroup_classify+0x4f/0x120 [cls_cgroup] [ 270.730164] [<ffffffff816aac74>] tc_classify_compat+0x74/0xc0 [ 270.730166] [<ffffffff816ab573>] tc_classify+0x33/0x90 [ 270.730170] [<ffffffffa00bcb0a>] htb_enqueue+0xaa/0x4a0 [sch_htb] [ 270.730172] [<ffffffff81682e26>] __dev_queue_xmit+0x306/0x960 [ 270.730174] [<ffffffff81682b70>] ? __dev_queue_xmit+0x50/0x960 [ 270.730176] [<ffffffff816834a3>] dev_queue_xmit_sk+0x13/0x20 [ 270.730185] [<ffffffff81787770>] dev_queue_xmit+0x10/0x20 [ 270.730187] [<ffffffff8178b91c>] packet_snd.isra.62+0x54c/0x760 [ 270.730190] [<ffffffff8178be25>] packet_sendmsg+0x2f5/0x3f0 [ 270.730203] [<ffffffff81665245>] ? sock_def_readable+0x5/0x190 [ 270.730210] [<ffffffff817b64bb>] ? _raw_spin_unlock+0x2b/0x40 [ 270.730216] [<ffffffff8173bcbc>] ? unix_dgram_sendmsg+0x5cc/0x640 [ 270.730219] [<ffffffff8165f367>] sock_sendmsg+0x47/0x50 [ 270.730221] [<ffffffff8165f42f>] sock_write_iter+0x7f/0xd0 [ 270.730232] [<ffffffff811fd4c7>] __vfs_write+0xa7/0xf0 [ 270.730234] [<ffffffff811fe5b8>] vfs_write+0xb8/0x190 [ 270.730236] [<ffffffff811fe8c2>] SyS_write+0x52/0xb0 [ 270.730239] [<ffffffff817b6bae>] entry_SYSCALL_64_fastpath+0x12/0x76 Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-24sch_choke: drop all packets in queue during resetWANG Cong
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-24sch_plug: purge buffered packets during resetWANG Cong
Otherwise the skbuff related structures are not correctly refcount'ed. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-24bridge: Fix setting a flag in br_fill_ifvlaninfo_range().Rosen, Rami
This patch fixes setting of vinfo.flags in the br_fill_ifvlaninfo_range() method. The assignment of vinfo.flags &= ~BRIDGE_VLAN_INFO_RANGE_BEGIN has no effect and is unneeded, as vinfo.flags value is overriden by the immediately following vinfo.flags = flags | BRIDGE_VLAN_INFO_RANGE_END assignement. Signed-off-by: Rami Rosen <rami.rosen@intel.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-24ipv4: consider TOS in fib_select_defaultJulian Anastasov
fib_select_default considers alternative routes only when res->fi is for the first alias in res->fa_head. In the common case this can happen only when the initial lookup matches the first alias with highest TOS value. This prevents the alternative routes to require specific TOS. This patch solves the problem as follows: - routes that require specific TOS should be returned by fib_select_default only when TOS matches, as already done in fib_table_lookup. This rule implies that depending on the TOS we can have many different lists of alternative gateways and we have to keep the last used gateway (fa_default) in first alias for the TOS instead of using single tb_default value. - as the aliases are ordered by many keys (TOS desc, fib_priority asc), we restrict the possible results to routes with matching TOS and lowest metric (fib_priority) and routes that match any TOS, again with lowest metric. For example, packet with TOS 8 can not use gw3 (not lowest metric), gw4 (different TOS) and gw6 (not lowest metric), all other gateways can be used: tos 8 via gw1 metric 2 <--- res->fa_head and res->fi tos 8 via gw2 metric 2 tos 8 via gw3 metric 3 tos 4 via gw4 tos 0 via gw5 tos 0 via gw6 metric 1 Reported-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-24ipv4: fib_select_default should match the prefixJulian Anastasov
fib_trie starting from 4.1 can link fib aliases from different prefixes in same list. Make sure the alternative gateways are in same table and for same prefix (0) by checking tb_id and fa_slen. Fixes: 79e5ad2ceb00 ("fib_trie: Remove leaf_info") Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-23Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio/vhost fixes from Michael Tsirkin: "Bugfixes and documentation fixes. Igor's patch that allows users to tweak memory table size is borderline, but it does fix known crashes, so I merged it" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost: add max_mem_regions module parameter vhost: extend memory regions allocation to vmalloc 9p/trans_virtio: reset virtio device on remove virtio/s390: rename drivers/s390/kvm -> drivers/s390/virtio MAINTAINERS: separate section for s390 virtio drivers virtio: define virtio_pci_cfg_cap in header. virtio: Fix typecast of pointer in vring_init() virtio scsi: fix unused variable warning vhost: use binary search instead of linear in find_region() virtio_net: document VIRTIO_NET_CTRL_GUEST_OFFLOADS
2015-07-23Bluetooth: Move IRK checking logic in preparation to new connect methodJakub Pawlowski
Move IRK checking logic in preparation to new connect method. Also make sure that MGMT_STATUS_INVALID_PARAMS is returned when non identity address is passed to ADD_DEVICE. Right now MGMT_STATUS_FAILED is returned, which might be misleading. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-07-23Bluetooth: __l2cap_wait_ack() add defensive timeoutDean Jenkins
Add a timeout to prevent the do while loop running in an infinite loop. This ensures that the channel will be instructed to close within 10 seconds so prevents l2cap_sock_shutdown() getting stuck forever. Returns -ENOLINK when the timeout is reached. The channel will be subequently closed and not all data will be ACK'ed. Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-07-23Bluetooth: __l2cap_wait_ack() use msecs_to_jiffies()Dean Jenkins
Use msecs_to_jiffies() instead of using HZ so that it is easier to specify the time in milliseconds. Also add a #define L2CAP_WAIT_ACK_POLL_PERIOD to specify the 200ms polling period so that it is defined in a single place. Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-07-23Bluetooth: Add BT_DBG to l2cap_sock_shutdown()Dean Jenkins
Add helpful BT_DBG debug to l2cap_sock_shutdown() and __l2cap_wait_ack() so that the code flow can be analysed. Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-07-23Bluetooth: Make __l2cap_wait_ack more efficientDean Jenkins
Use chan->state instead of chan->conn because waiting for ACK's is only possible in the BT_CONNECTED state. Also avoids reference to the conn structure so makes locking easier. Only call __l2cap_wait_ack() when the needed condition of chan->unacked_frames > 0 && chan->state == BT_CONNECTED is true and convert the while loop to a do while loop. __l2cap_wait_ack() change the function prototype to pass in the chan variable as chan is already available in the calling function l2cap_sock_shutdown(). Avoids locking issues. Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-07-23Bluetooth: L2CAP ERTM shutdown protect sk and chanDean Jenkins
During execution of l2cap_sock_shutdown() which might sleep, the sk and chan structures can be in an unlocked condition which potentially allows the structures to be freed by other running threads. Therefore, there is a possibility of a malfunction or memory reuse after being freed. Keep the sk and chan structures alive during the execution of l2cap_sock_shutdown() by using their respective hold and put functions. This allows the structures to be freeable at the end of l2cap_sock_shutdown(). Signed-off-by: Kautuk Consul <Kautuk_Consul@mentor.com> Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-07-23mac802154: fix ieee802154_rx handlingVarka Bhadram
Instead of passing ieee802154_hw pointer to ieee802154_rx, we can directly pass the ieee802154_local pointer. Signed-off-by: Varka Bhadram <varkabhadram@gmail.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-07-23mac802154: do not export ieee802154_rx()Varka Bhadram
Right now there are no other users for ieee802154_rx() in kernel. So lets remove EXPORT_SYMBOL() for this. Also it moves the function prototype from global header file to local header file. Signed-off-by: Varka Bhadram <varkabhadram@gmail.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-07-23mac802154: cfg: add suspend and resume callbacksAlexander Aring
This patch introduces suspend and resume callbacks to mac802154. When doing suspend we calling the stop driver callback which should stop the receiving of frames. A transceiver should go into low-power mode then. Calling resume will call the start driver callback, which starts receiving again and allow to transmit frames. This was tested only with the fakelb driver and a qemu vm by doing the following commands: echo "devices" > /sys/power/pm_test echo "freeze" > /sys/power/state while doing some high traffic between two fakelb phys. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-07-23cfg802154: add PM hooksVarka Bhadram
This patch help to implement suspend/resume in mac802154, these hooks will be run before the device is suspended and after it resumes. Signed-off-by: Varka Bhadram <varkab@cdac.in> Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-07-23mac802154: util: add stop_device utility functionAlexander Aring
This patch adds ieee802154_stop_device for preparing a utility function to stop the ieee802154 device. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-07-23mac802154: remove unused macroVarka Bhadram
This patch removes the unused macro which was removed with the rework of linux-wpan kernel. Signed-off-by: Varka Bhadram <varkab@cdac.in> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-07-23mac802154: use WARN_ON() macroVarka Bhadram
This patch will generate the warning if the required driver ops were not defined. Also it removes unnecessary debug message. Signed-off-by: Varka Bhadram <varkab@cdac.in> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-07-236lowpan: add request for ipv6 moduleAlexander Aring
The iphc module depends on CONFIG_IPV6, because it's not very useful to build the module without IPv6 support. Recently an user reported about issues for setting an IPv6 address to a 6LoWPAN interface. The issues was solved by modprobe the ipv6 module before. To avoid such user issues we try to request the ipv6 module when the 6LoWPAN module is loaded. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>