summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2016-02-18net-sysfs: remove unused fmt_long_hexColin Ian King
Ever since commit 04ed3e741d0f133e02bed7fa5c98edba128f90e7 ("net: change netdev->features to u32") the format string fmt_long_hex has not been used, so we may as well remove it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17tcp: correctly crypto_alloc_hash return checkInsu Yun
crypto_alloc_hash never returns NULL Signed-off-by: Insu Yun <wuninsu@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17vlan: change return type of vlan_proc_rem_devZhang Shengju
Since function vlan_proc_rem_dev() will only return 0, it's better to return void instead of int. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17net: dsa: Unregister slave_dev in error pathFlorian Fainelli
With commit 0071f56e46da ("dsa: Register netdev before phy"), we are now trying to free a network device that has been previously registered, and in case of errors, this will make us hit the BUG_ON(dev->reg_state != NETREG_UNREGISTERED) condition. Fix this by adding a missing unregister_netdev() before free_netdev(). Fixes: 0071f56e46da ("dsa: Register netdev before phy") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18netfilter: ipvs: avoid unused variable warningsArnd Bergmann
The proc_create() and remove_proc_entry() functions do not reference their arguments when CONFIG_PROC_FS is disabled, so we get a couple of warnings about unused variables in IPVS: ipvs/ip_vs_app.c:608:14: warning: unused variable 'net' [-Wunused-variable] ipvs/ip_vs_ctl.c:3950:14: warning: unused variable 'net' [-Wunused-variable] ipvs/ip_vs_ctl.c:3994:14: warning: unused variable 'net' [-Wunused-variable] This removes the local variables and instead looks them up separately for each use, which obviously avoids the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 4c50a8ce2b63 ("netfilter: ipvs: avoid unused variable warning") Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2016-02-18netfilter: ipvs: Remove noisy debug print from ip_vs_del_serviceYannick Brosseau
This have been there for a long time, but does not seem to add value Signed-off-by: Yannick Brosseau <scientist@fb.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2016-02-17ipv4: Remove inet_lro libraryBen Hutchings
There are no longer any in-tree drivers that use it. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17af_llc: fix types on llc_ui_wait_for_connOne Thousand Gnomes
The timeout is a long, we return it truncated if it is huge. Basically harmless as the only caller does a boolean check, but tidy it up anyway. (64bit build tested this time. Thank you 0day) Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17sctp: remove the unused sctp_datamsg_free()Xin Long
Since commit 8b570dc9f7b6 ("sctp: only drop the reference on the datamsg after sending a msg") used sctp_datamsg_put in sctp_sendmsg, instead of sctp_datamsg_free, this function has no use in sctp. So we will remove it. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17sctp: remove rcu_read_lock in sctp_seq_dump_remote_addrs()Xin Long
sctp_seq_dump_remote_addrs is only called by sctp_assocs_seq_show() and it has been protected by rcu_read_lock that is from rhashtable_walk_start(). So we will remove this one. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17sctp: move rcu_read_lock from __sctp_lookup_association to ↵Xin Long
sctp_lookup_association __sctp_lookup_association() is only invoked by sctp_v4_err() and sctp_rcv(), both which run on the rx BH, and it has been protected by rcu_read_lock [see ip_local_deliver_finish() / ipv6_rcv()]. So we can move it to sctp_lookup_association, only let sctp_lookup_association use rcu_read_lock. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17l2tp: Fix error creating L2TP tunnelsMark Tomlinson
A previous commit (33f72e6) added notification via netlink for tunnels when created/modified/deleted. If the notification returned an error, this error was returned from the tunnel function. If there were no listeners, the error code ESRCH was returned, even though having no listeners is not an error. Other calls to this and other similar notification functions either ignore the error code, or filter ESRCH. This patch checks for ESRCH and does not flag this as an error. Reviewed-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17core: remove unneded headers for net cgroup controllers.Rosen, Rami
commit 3ed80a6 (cgroup: drop module support) made including module.h redundant in the net cgroup controllers, netclassid_cgroup.c and netprio_cgroup.c. This patch removes them. Signed-off-by: Rami Rosen <rami.rosen@intel.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17auth_gss: fix panic in gss_pipe_downcall() in fips modeScott Mayhew
On Mon, 15 Feb 2016, Trond Myklebust wrote: > Hi Scott, > > On Mon, Feb 15, 2016 at 2:28 PM, Scott Mayhew <smayhew@redhat.com> wrote: > > md5 is disabled in fips mode, and attempting to import a gss context > > using md5 while in fips mode will result in crypto_alg_mod_lookup() > > returning -ENOENT, which will make its way back up to > > gss_pipe_downcall(), where the BUG() is triggered. Handling the -ENOENT > > allows for a more graceful failure. > > > > Signed-off-by: Scott Mayhew <smayhew@redhat.com> > > --- > > net/sunrpc/auth_gss/auth_gss.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c > > index 799e65b..c30fc3b 100644 > > --- a/net/sunrpc/auth_gss/auth_gss.c > > +++ b/net/sunrpc/auth_gss/auth_gss.c > > @@ -737,6 +737,9 @@ gss_pipe_downcall(struct file *filp, const char __user *src, size_t mlen) > > case -ENOSYS: > > gss_msg->msg.errno = -EAGAIN; > > break; > > + case -ENOENT: > > + gss_msg->msg.errno = -EPROTONOSUPPORT; > > + break; > > default: > > printk(KERN_CRIT "%s: bad return from " > > "gss_fill_context: %zd\n", __func__, err); > > -- > > 2.4.3 > > > > Well debugged, but I unfortunately do have to ask if this patch is > sufficient? In addition to -ENOENT, and -ENOMEM, it looks to me as if > crypto_alg_mod_lookup() can also fail with -EINTR, -ETIMEDOUT, and > -EAGAIN. Don't we also want to handle those? You're right, I was focusing on the panic that I could easily reproduce. I'm still not sure how I could trigger those other conditions. > > In fact, peering into the rats nest that is > gss_import_sec_context_kerberos(), it looks as if that is just a tiny > subset of all the errors that we might run into. Perhaps the right > thing to do here is to get rid of the BUG() (but keep the above > printk) and just return a generic error? That sounds fine to me -- updated patch attached. -Scott >From d54c6b64a107a90a38cab97577de05f9a4625052 Mon Sep 17 00:00:00 2001 From: Scott Mayhew <smayhew@redhat.com> Date: Mon, 15 Feb 2016 15:12:19 -0500 Subject: [PATCH] auth_gss: remove the BUG() from gss_pipe_downcall() Instead return a generic error via gss_msg->msg.errno. None of the errors returned by gss_fill_context() should necessarily trigger a kernel panic. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-02-17xprtrdma: rpcrdma_bc_receive_call() should init rq_private_buf.lenChuck Lever
Some NFSv4.1 OPEN requests were hanging waiting for the NFS server to finish recalling delegations. Turns out that each NFSv4.1 CB request on RDMA gets a GARBAGE_ARGS reply from the Linux client. Commit 756b9b37cfb2e3dc added a line in bc_svc_process that overwrites the incoming rq_rcv_buf's length with the value in rq_private_buf.len. But rpcrdma_bc_receive_call() does not invoke xprt_complete_bc_request(), thus rq_private_buf.len is not initialized. svc_process_common() is invoked with a zero-length RPC message, and fails. Fixes: 756b9b37cfb2e3dc ('SUNRPC: Fix callback channel') Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-02-17net: add tc offload feature flagJohn Fastabend
Its useful to turn off the qdisc offload feature at a per device level. This gives us a big hammer to enable/disable offloading. More fine grained control (i.e. per rule) may be supported later. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17net: sched: add cls_u32 offload hooks for netdevsJohn Fastabend
This patch allows netdev drivers to consume cls_u32 offloads via the ndo_setup_tc ndo op. This works aligns with how network drivers have been doing qdisc offloads for mqprio. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17net: rework setup_tc ndo op to consume general tc operandJohn Fastabend
This patch updates setup_tc so we can pass additional parameters into the ndo op in a generic way. To do this we provide structured union and type flag. This lets each classifier and qdisc provide its own set of attributes without having to add new ndo ops or grow the signature of the callback. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17net: rework ndo tc op to consume additional qdisc handle parameterJohn Fastabend
The ndo_setup_tc() op was added to support drivers offloading tx qdiscs however only support for mqprio was ever added. So we only ever added support for passing the number of traffic classes to the driver. This patch generalizes the ndo_setup_tc op so that a handle can be provided to indicate if the offload is for ingress or egress or potentially even child qdiscs. CC: Murali Karicheri <m-karicheri2@ti.com> CC: Shradha Shah <sshah@solarflare.com> CC: Or Gerlitz <ogerlitz@mellanox.com> CC: Ariel Elior <ariel.elior@qlogic.com> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Bruce Allan <bruce.w.allan@intel.com> CC: Jesse Brandeburg <jesse.brandeburg@intel.com> CC: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16net: Export ip fragment sysctl to unprivileged usersNikolay Borisov
Now that all the ip fragmentation related sysctls are namespaceified there is no reason to hide them anymore from "root" users inside containers. Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16ipv4: namespacify ip fragment max dist sysctl knobNikolay Borisov
Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16ipv4: namespacify ip_early_demux sysctl knobNikolay Borisov
Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16ipv4: Namespacify ip_dynaddr sysctl knobNikolay Borisov
Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16igmp: net: Move igmp namespace init to correct fileNikolay Borisov
When igmp related sysctl were namespacified their initializatin was erroneously put into the tcp socket namespace constructor. This patch moves the relevant code into the igmp namespace constructor to keep things consistent. Also sprinkle some #ifdefs to silence warnings Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16ipv4: Namespaceify ip_default_ttl sysctl knobNikolay Borisov
Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16tcp: add tcpi_min_rtt and tcpi_notsent_bytes to tcp_infoEric Dumazet
tcpi_min_rtt reports the minimal rtt observed by TCP stack for the flow, in usec unit. Might be ~0U if not yet known. tcpi_notsent_bytes reports the amount of bytes in the write queue that were not yet sent. This is done in a single patch to not add a temporary 32bit padding hole in tcp_info. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16tcp: md5: release request socket instead of listenerEric Dumazet
If tcp_v4_inbound_md5_hash() returns an error, we must release the refcount on the request socket, not on the listener. The bug was added for IPv4 only. Fixes: 079096f103fac ("tcp/dccp: install syn_recv requests into ehash table") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16net/ipv4: add dst cache support for gre lwtunnelsPaolo Abeni
In case of UDP traffic with datagram length below MTU this gives about 4% performance increase Signed-off-by: Paolo Abeni <pabeni@redhat.com> Suggested-and-Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16net: add dst_cache to ovs vxlan lwtunnelPaolo Abeni
In case of UDP traffic with datagram length below MTU this give about 2% performance increase when tunneling over ipv4 and about 60% when tunneling over ipv6 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16ip_tunnel: replace dst_cache with generic implementationPaolo Abeni
The current ip_tunnel cache implementation is prone to a race that will cause the wrong dst to be cached on cuncurrent dst cache miss and ip tunnel update via netlink. Replacing with the generic implementation fix the issue. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16net: replace dst_cache ip6_tunnel implementation with the generic onePaolo Abeni
This also fix a potential race into the existing tunnel code, which could lead to the wrong dst to be permanenty cached: CPU1: CPU2: <xmit on ip6_tunnel> <cache lookup fails> dst = ip6_route_output(...) <tunnel params are changed via nl> dst_cache_reset() // no effect, // the cache is empty dst_cache_set() // the wrong dst // is permanenty stored // into the cache With the new dst implementation the above race is not possible since the first cache lookup after dst_cache_reset will fail due to the timestamp check Signed-off-by: Paolo Abeni <pabeni@redhat.com> Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16net: add dst_cache supportPaolo Abeni
This patch add a generic, lockless dst cache implementation. The need for lock is avoided updating the dst cache fields only in per cpu scope, and requiring that the cache manipulation functions are invoked with the local bh disabled. The refresh_ts and reset_ts fields are used to ensure the cache consistency in case of cuncurrent cache update (dst_cache_set*) and reset operation (dst_cache_reset). Consider the following scenario: CPU1: CPU2: <cache lookup with emtpy cache: it fails> <get dst via uncached route lookup> <related configuration changes> dst_cache_reset() dst_cache_set() The dst entry set passed to dst_cache_set() should not be used for later dst cache lookup, because it's obtained using old configuration values. Since the refresh_ts is updated only on dst_cache lookup, the cached value in the above scenario will be discarded on the next lookup. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16tcp: do not set rtt_min to 1Eric Dumazet
There are some cases where rtt_us derives from deltas of jiffies, instead of using usec timestamps. Since we want to track minimal rtt, better to assume a delta of 0 jiffie might be in fact be very close to 1 jiffie. It is kind of sad jiffies_to_usecs(1) calls a function instead of simply using a constant. Fixes: f672258391b42 ("tcp: track min RTT using windowed min-filter") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16net: dsa: remove phy_disconnect from error pathSascha Hauer
The phy has not been initialized, disconnecting it in the error path results in a NULL pointer exception. Drop the phy_disconnect from the error path. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16tipc: refactor node xmit and fix memory leaksRichard Alpe
Refactor tipc_node_xmit() to fail fast and fail early. Fix several potential memory leaks in unexpected error paths. Reported-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16tipc: fix premature addition of node to lookup tableJon Paul Maloy
In commit 5266698661401a ("tipc: let broadcast packet reception use new link receive function") we introduced a new per-node broadcast reception link instance. This link is created at the moment the node itself is created. Unfortunately, the allocation is done after the node instance has already been added to the node lookup hash table. This creates a potential race condition, where arriving broadcast packets are able to find and access the node before it has been fully initialized, and before the above mentioned link has been created. The result is occasional crashes in the function tipc_bcast_rcv(), which is trying to access the not-yet existing link. We fix this by deferring the addition of the node instance until after it has been fully initialized in the function tipc_node_create(). Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16bridge: mdb: avoid uninitialized variable warningArnd Bergmann
A recent change to the mdb code confused the compiler to the point where it did not realize that the port-group returned from br_mdb_add_group() is always valid when the function returns a nonzero return value, so we get a spurious warning: net/bridge/br_mdb.c: In function 'br_mdb_add': net/bridge/br_mdb.c:542:4: error: 'pg' may be used uninitialized in this function [-Werror=maybe-uninitialized] __br_mdb_notify(dev, entry, RTM_NEWMDB, pg); Slightly rearranging the code in br_mdb_add_group() makes the problem go away, as gcc is clever enough to see that both functions check for 'ret != 0'. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 9e8430f8d60d ("bridge: mdb: Passing the port-group pointer to br_mdb module") Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16net: Copy inner L3 and L4 headers as unaligned on GRE TEBAlexander Duyck
This patch corrects the unaligned accesses seen on GRE TEB tunnels when generating hash keys. Specifically what this patch does is make it so that we force the use of skb_copy_bits when the GRE inner headers will be unaligned due to NET_IP_ALIGNED being a non-zero value. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16ethtool: ensure channel counts are within bounds during SCHANNELSKeller, Jacob E
Add a sanity check to ensure that all requested channel sizes are within bounds, which should reduce errors in driver implementation. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16ethtool: correctly ensure {GS}CHANNELS doesn't conflict with GS{RXFH}Keller, Jacob E
Ethernet drivers implementing both {GS}RXFH and {GS}CHANNELS ethtool ops incorrectly allow SCHANNELS when it would conflict with the settings from SRXFH. This occurs because it is not possible for drivers to understand whether their Rx flow indirection table has been configured or is in the default state. In addition, drivers currently behave in various ways when increasing the number of Rx channels. Some drivers will always destroy the Rx flow indirection table when this occurs, whether it has been set by the user or not. Other drivers will attempt to preserve the table even if the user has never modified it from the default driver settings. Neither of these situation is desirable because it leads to unexpected behavior or loss of user configuration. The correct behavior is to simply return -EINVAL when SCHANNELS would conflict with the current Rx flow table settings. However, it should only do so if the current settings were modified by the user. If we required that the new settings never conflict with the current (default) Rx flow settings, we would force users to first reduce their Rx flow settings and then reduce the number of Rx channels. This patch proposes a solution implemented in net/core/ethtool.c which ensures that all drivers behave correctly. It checks whether the RXFH table has been configured to non-default settings, and stores this information in a private netdev flag. When the number of channels is requested to change, it first ensures that the current Rx flow table is not going to assign flows to now disabled channels. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contain a rather large batch for your net that includes accumulated bugfixes, they are: 1) Run conntrack cleanup from workqueue process context to avoid hitting soft lockup via watchdog for large tables. This is required by the IPv6 masquerading extension. From Florian Westphal. 2) Use original skbuff from nfnetlink batch when calling netlink_ack() on error since this needs to access the skb->sk pointer. 3) Incremental fix on top of recent Sasha Levin's lock fix for conntrack resizing. 4) Fix several problems in nfnetlink batch message header sanitization and error handling, from Phil Turnbull. 5) Select NF_DUP_IPV6 based on CONFIG_IPV6, from Arnd Bergmann. 6) Fix wrong signess in return values on nf_tables counter expression, from Anton Protopopov. Due to the NetDev 1.1 organization burden, I had no chance to pass up this to you any sooner in this release cycle, sorry about that. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16af_unix: Guard against other == sk in unix_dgram_sendmsgRainer Weikusat
The unix_dgram_sendmsg routine use the following test if (unlikely(unix_peer(other) != sk && unix_recvq_full(other))) { to determine if sk and other are in an n:1 association (either established via connect or by using sendto to send messages to an unrelated socket identified by address). This isn't correct as the specified address could have been bound to the sending socket itself or because this socket could have been connected to itself by the time of the unix_peer_get but disconnected before the unix_state_lock(other). In both cases, the if-block would be entered despite other == sk which might either block the sender unintentionally or lead to trying to unlock the same spin lock twice for a non-blocking send. Add a other != sk check to guard against this. Fixes: 7d267278a9ec ("unix: avoid use-after-free in ep_remove_wait_queue") Reported-By: Philipp Hahn <pmhahn@pmhahn.de> Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Tested-by: Philipp Hahn <pmhahn@pmhahn.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16af_unix: Don't set err in unix_stream_read_generic unless there was an errorRainer Weikusat
The present unix_stream_read_generic contains various code sequences of the form err = -EDISASTER; if (<test>) goto out; This has the unfortunate side effect of possibly causing the error code to bleed through to the final out: return copied ? : err; and then to be wrongly returned if no data was copied because the caller didn't supply a data buffer, as demonstrated by the program available at http://pad.lv/1540731 Change it such that err is only set if an error condition was detected. Fixes: 3822b5c2fc62 ("af_unix: Revert 'lock_interruptible' in stream receive code") Reported-by: Joseph Salisbury <joseph.salisbury@canonical.com> Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16batman-adv: Avoid endless loop in bat-on-bat netdevice checkAndrew Lunn
batman-adv checks in different situation if a new device is already on top of a different batman-adv device. This is done by getting the iflink of a device and all its parent. It assumes that this iflink is always a parent device in an acyclic graph. But this assumption is broken by devices like veth which are actually a pair of two devices linked to each other. The recursive check would therefore get veth0 when calling dev_get_iflink on veth1. And it gets veth0 when calling dev_get_iflink with veth1. Creating a veth pair and loading batman-adv freezes parts of the system ip link add veth0 type veth peer name veth1 modprobe batman-adv An RCU stall will be detected on the system which cannot be fixed. INFO: rcu_sched self-detected stall on CPU 1: (5264 ticks this GP) idle=3e9/140000000000001/0 softirq=144683/144686 fqs=5249 (t=5250 jiffies g=46 c=45 q=43) Task dump for CPU 1: insmod R running task 0 247 245 0x00000008 ffffffff8151f140 ffffffff8107888e ffff88000fd141c0 ffffffff8151f140 0000000000000000 ffffffff81552df0 ffffffff8107b420 0000000000000001 ffff88000e3fa700 ffffffff81540b00 ffffffff8107d667 0000000000000001 Call Trace: <IRQ> [<ffffffff8107888e>] ? rcu_dump_cpu_stacks+0x7e/0xd0 [<ffffffff8107b420>] ? rcu_check_callbacks+0x3f0/0x6b0 [<ffffffff8107d667>] ? hrtimer_run_queues+0x47/0x180 [<ffffffff8107cf9d>] ? update_process_times+0x2d/0x50 [<ffffffff810873fb>] ? tick_handle_periodic+0x1b/0x60 [<ffffffff810290ae>] ? smp_trace_apic_timer_interrupt+0x5e/0x90 [<ffffffff813bbae2>] ? apic_timer_interrupt+0x82/0x90 <EOI> [<ffffffff812c3fd7>] ? __dev_get_by_index+0x37/0x40 [<ffffffffa0031f3e>] ? batadv_hard_if_event+0xee/0x3a0 [batman_adv] [<ffffffff812c5801>] ? register_netdevice_notifier+0x81/0x1a0 [...] This can be avoided by checking if two devices are each others parent and stopping the check in this situation. Fixes: b7eddd0b3950 ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface") Signed-off-by: Andrew Lunn <andrew@lunn.ch> [sven@narfation.org: rewritten description, extracted fix] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <a@unstable.cc>
2016-02-16batman-adv: Only put orig_node_vlan list reference when removedSven Eckelmann
The batadv_orig_node_vlan reference counter in batadv_tt_global_size_mod can only be reduced when the list entry was actually removed. Otherwise the reference counter may reach zero when batadv_tt_global_size_mod is called from two different contexts for the same orig_node_vlan but only one context is actually removing the entry from the list. The release function for this orig_node_vlan is not called inside the vlan_list_lock spinlock protected region because the function batadv_tt_global_size_mod still holds a orig_node_vlan reference for the object pointer on the stack. Thus the actual release function (when required) will be called only at the end of the function. Fixes: 7ea7b4a14275 ("batman-adv: make the TT CRC logic VLAN specific") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <a@unstable.cc>
2016-02-16batman-adv: Only put gw_node list reference when removedSven Eckelmann
The batadv_gw_node reference counter in batadv_gw_node_update can only be reduced when the list entry was actually removed. Otherwise the reference counter may reach zero when batadv_gw_node_update is called from two different contexts for the same gw_node but only one context is actually removing the entry from the list. The release function for this gw_node is not called inside the list_lock spinlock protected region because the function batadv_gw_node_update still holds a gw_node reference for the object pointer on the stack. Thus the actual release function (when required) will be called only at the end of the function. Fixes: bd3524c14bd0 ("batman-adv: remove obsolete deleted attribute for gateway node") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <a@unstable.cc>
2016-02-16mm/gup: Switch all callers of get_user_pages() to not pass tsk/mmDave Hansen
We will soon modify the vanilla get_user_pages() so it can no longer be used on mm/tasks other than 'current/current->mm', which is by far the most common way it is called. For now, we allow the old-style calls, but warn when they are used. (implemented in previous patch) This patch switches all callers of: get_user_pages() get_user_pages_unlocked() get_user_pages_locked() to stop passing tsk/mm so they will no longer see the warnings. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: jack@suse.cz Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210156.113E9407@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-15treewide: Fix typo in printkMasanari Iida
This patch fix spelling typos found in printk and Kconfig. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-02-14Merge 4.5-rc4 into tty-nextGreg Kroah-Hartman
We want the fixes in here, and this resolves a merge error in tty_io.c Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-13vsock: Fix blocking ops call in prepare_to_waitLaura Abbott
We receoved a bug report from someone using vmware: WARNING: CPU: 3 PID: 660 at kernel/sched/core.c:7389 __might_sleep+0x7d/0x90() do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff810fa68d>] prepare_to_wait+0x2d/0x90 Modules linked in: vmw_vsock_vmci_transport vsock snd_seq_midi snd_seq_midi_event snd_ens1371 iosf_mbi gameport snd_rawmidi snd_ac97_codec ac97_bus snd_seq coretemp snd_seq_device snd_pcm snd_timer snd soundcore ppdev crct10dif_pclmul crc32_pclmul ghash_clmulni_intel vmw_vmci vmw_balloon i2c_piix4 shpchp parport_pc parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc btrfs xor raid6_pq 8021q garp stp llc mrp crc32c_intel serio_raw mptspi vmwgfx drm_kms_helper ttm drm scsi_transport_spi mptscsih e1000 ata_generic mptbase pata_acpi CPU: 3 PID: 660 Comm: vmtoolsd Not tainted 4.2.0-0.rc1.git3.1.fc23.x86_64 #1 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014 0000000000000000 0000000049e617f3 ffff88006ac37ac8 ffffffff818641f5 0000000000000000 ffff88006ac37b20 ffff88006ac37b08 ffffffff810ab446 ffff880068009f40 ffffffff81c63bc0 0000000000000061 0000000000000000 Call Trace: [<ffffffff818641f5>] dump_stack+0x4c/0x65 [<ffffffff810ab446>] warn_slowpath_common+0x86/0xc0 [<ffffffff810ab4d5>] warn_slowpath_fmt+0x55/0x70 [<ffffffff8112551d>] ? debug_lockdep_rcu_enabled+0x1d/0x20 [<ffffffff810fa68d>] ? prepare_to_wait+0x2d/0x90 [<ffffffff810fa68d>] ? prepare_to_wait+0x2d/0x90 [<ffffffff810da2bd>] __might_sleep+0x7d/0x90 [<ffffffff812163b3>] __might_fault+0x43/0xa0 [<ffffffff81430477>] copy_from_iter+0x87/0x2a0 [<ffffffffa039460a>] __qp_memcpy_to_queue+0x9a/0x1b0 [vmw_vmci] [<ffffffffa0394740>] ? qp_memcpy_to_queue+0x20/0x20 [vmw_vmci] [<ffffffffa0394757>] qp_memcpy_to_queue_iov+0x17/0x20 [vmw_vmci] [<ffffffffa0394d50>] qp_enqueue_locked+0xa0/0x140 [vmw_vmci] [<ffffffffa039593f>] vmci_qpair_enquev+0x4f/0xd0 [vmw_vmci] [<ffffffffa04847bb>] vmci_transport_stream_enqueue+0x1b/0x20 [vmw_vsock_vmci_transport] [<ffffffffa047ae05>] vsock_stream_sendmsg+0x2c5/0x320 [vsock] [<ffffffff810fabd0>] ? wake_atomic_t_function+0x70/0x70 [<ffffffff81702af8>] sock_sendmsg+0x38/0x50 [<ffffffff81702ff4>] SYSC_sendto+0x104/0x190 [<ffffffff8126e25a>] ? vfs_read+0x8a/0x140 [<ffffffff817042ee>] SyS_sendto+0xe/0x10 [<ffffffff8186d9ae>] entry_SYSCALL_64_fastpath+0x12/0x76 transport->stream_enqueue may call copy_to_user so it should not be called inside a prepare_to_wait. Narrow the scope of the prepare_to_wait to avoid the bad call. This also applies to vsock_stream_recvmsg as well. Reported-by: Vinson Lee <vlee@freedesktop.org> Tested-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Laura Abbott <labbott@fedoraproject.org> Signed-off-by: David S. Miller <davem@davemloft.net>