path: root/net
Age  Commit message  Author
2016-06-24  netfilter: nf_tables: add generation mask to chains  Pablo Neira Ayuso
Similar to ("netfilter: nf_tables: add generation mask to tables"). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-06-24  netfilter: nf_tables: add generation mask to tables  Pablo Neira Ayuso
This patch addresses two problems:
1) The netlink dump is inconsistent when interfering with an ongoing transaction update, for several reasons:
1.a) We don't honor the internal NFT_TABLE_INACTIVE flag, and we should be skipping these inactive objects in the dump.
1.b) We perform speculative deletion during the preparation phase, which may result in skipping active objects.
1.c) The listing order changes, which generates noise when tracking incremental ruleset updates via tools like git or our own testsuite.
2) We don't allow adding and updating the object in the same batch, e.g. add table x; add table x { flags dormant\; }.
In order to resolve these problems:
1) If the user requests a deletion, the object becomes inactive in the next generation. Then, ignore objects that are scheduled to be deleted from the lookup path, as they will be effectively removed in the next generation.
2) From the get/dump path, if the object is not currently active, we skip it.
3) Support the 'add X -> update X' sequence within a transaction.
After this update, we obtain a consistent list as long as we stay in the same generation. The userspace side can detect interference through the generation counter so it can restart the dump.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
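The generation-mask idea described in this commit can be illustrated with a small userspace sketch. This is not the kernel code: `struct ruleset`, `struct object` and every function name below are hypothetical, and the sketch assumes a two-generation cursor in which a set bit in `genmask` marks the object as inactive in that generation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration only. */
struct ruleset { unsigned int gencursor; };  /* 0 or 1: index of the current generation */
struct object  { uint8_t genmask; };         /* bit n set: inactive in generation n */

static uint8_t gen_cur(const struct ruleset *rs)
{
    return (uint8_t)(1u << rs->gencursor);
}

static uint8_t gen_next(const struct ruleset *rs)
{
    return (uint8_t)(1u << !rs->gencursor);
}

/* Lookup and dump paths only see objects active in the current generation. */
static bool is_active(const struct ruleset *rs, const struct object *o)
{
    return (o->genmask & gen_cur(rs)) == 0;
}

static bool is_active_next(const struct ruleset *rs, const struct object *o)
{
    return (o->genmask & gen_next(rs)) == 0;
}

/* A deletion request only hides the object from the next generation. */
static void schedule_delete(const struct ruleset *rs, struct object *o)
{
    o->genmask |= gen_next(rs);
}

/* A new object is hidden now and becomes visible after the commit. */
static void schedule_add(const struct ruleset *rs, struct object *o)
{
    o->genmask = gen_cur(rs);
}

/* Committing the transaction flips the cursor to the other generation. */
static void commit(struct ruleset *rs)
{
    rs->gencursor = !rs->gencursor;
}
```

With this layout, an object scheduled for deletion stays visible to dumps until `commit()` runs, and a dump that sees the cursor change can restart. A fuller version would also clear stale bits after each commit, which this sketch leaves out.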
2016-06-24  netfilter: nf_tables: add generic macros to check for generation mask  Pablo Neira Ayuso
These macros can be reused to check the genmask of any object type, not only rules. This is required now that tables, chains and sets will get a generation mask field too in follow-up patches. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-06-24  netfilter: xt_NFLOG: nflog-range does not truncate packets  Vishwanath Pai
li->u.ulog.copy_len is currently ignored by the kernel. We should truncate the packet to either li->u.ulog.copy_len (if set) or copy_range before sending it to userspace. 0 is a valid input for copy_len, so add a new flag to indicate whether this option was specified by the user or not. Add two flags to indicate whether nflog-size/copy_len was set or not: XT_NFLOG_F_COPY_LEN is for XT_NFLOG and NFLOG_F_COPY_LEN is for nfnetlink_log. On the userspace side, this was initially represented by the option nflog-range; this will be replaced by --nflog-size now. --nflog-range will still exist but does not do anything. Reported-by: Joe Dollard <jdollard@akamai.com> Reviewed-by: Josh Hunt <johunt@akamai.com> Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
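The reason for a separate "was it set" flag is that 0 is itself a valid copy length. A minimal sketch of the selection logic follows. The function and parameter names are hypothetical rather than the kernel's, and the sketch assumes `copy_range` is a plain upper bound.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: decide how many payload bytes to copy to userspace.
 * copy_len_set plays the role of a flag such as XT_NFLOG_F_COPY_LEN:
 * without it, copy_len == 0 could not be told apart from "not specified".
 */
static size_t nflog_bytes_to_copy(size_t pkt_len, bool copy_len_set,
                                  size_t copy_len, size_t copy_range)
{
    size_t limit = copy_len_set ? copy_len : copy_range;

    return pkt_len < limit ? pkt_len : limit;
}
```

For a 1500-byte packet, an explicit length of 0 copies nothing, while an unset length falls back to the configured range.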
2016-06-24  netfilter: nf_reject_ipv4: don't send tcp RST if the packet is non-TCP  Liping Zhang
In iptables, if the user adds a rule to send a TCP RST and specifies a non-TCP protocol, such as UDP, the kernel will reject the request. But in nftables, this validity check only occurs in the nft tool, i.e. only in userspace. This means that a user can add a rule such as the following via nfnetlink: "nft add rule filter forward ip protocol udp reject with tcp reset" This will generate some confusing TCP RST packets. So we should send a TCP RST only when the packet is TCP. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-06-23  vfs: Pass data, ns, and ns->userns to mount_ns  Eric W. Biederman
Today what is normally called data (the mount options) is not passed to fill_super through mount_ns. Pass the mount options and the namespace separately to mount_ns so that filesystems such as proc that have mount options can use mount_ns. Pass the user namespace to mount_ns so that the standard permission check that verifies the mounter has permissions over the namespace can be performed in mount_ns instead of in each filesystem's .mount method, thus removing the duplication between mqueuefs and proc in terms of permission checks. The extra permission check does not currently affect the rpc_pipefs filesystem and the nfsd filesystem as those filesystems do not currently allow unprivileged mounts. Without unprivileged mounts it is guaranteed that the caller has already passed capable(CAP_SYS_ADMIN), which guarantees the extra permission check will pass. Update rpc_pipefs and the nfsd filesystem to ensure that the network namespace reference is always taken in fill_super and always put in kill_sb so that the logic is simpler and so that errors originating inside of fill_super do not cause a network namespace leak. Acked-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-06-23  netem: fix a use after free  Eric Dumazet
If the packet was dropped by lower qdisc, then we must not access it later. Save qdisc_pkt_len(skb) in a temp variable. Fixes: 2ccccf5fb43f ("net_sched: update hierarchical backlog too") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: WANG Cong <xiyou.wangcong@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
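The pattern behind this fix (read what you need before handing ownership to code that may free the object) can be sketched in userspace C. Everything below is hypothetical illustration: a one-slot "child queue" stands in for the lower qdisc, and none of these names come from the kernel.

```c
#include <stdbool.h>
#include <stdlib.h>

struct pkt    { unsigned int len; };
struct child  { struct pkt *slot; };            /* one-slot queue, stands in for the lower qdisc */
struct qstats { unsigned int dropped_bytes; };

/* Takes ownership of p. When the queue is full the packet is dropped and freed. */
static bool child_enqueue(struct child *c, struct pkt *p)
{
    if (c->slot) {
        free(p);
        return false;
    }
    c->slot = p;
    return true;
}

static bool parent_enqueue(struct child *c, struct qstats *st, struct pkt *p)
{
    unsigned int len = p->len;   /* saved before ownership is handed over */

    if (!child_enqueue(c, p)) {
        /* p may already be freed here, so only the saved copy is used. */
        st->dropped_bytes += len;
        return false;
    }
    return true;
}
```

Reading `p->len` inside the failure branch instead of the saved `len` would be the use after free this commit describes.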
2016-06-23  act_ife: acquire ife_mod_lock before reading ifeoplist  WANG Cong
Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-23  act_ife: only acquire tcf_lock for existing actions  WANG Cong
Alexey reported that we have GFP_KERNEL allocation when holding the spinlock tcf_lock. Actually we don't have to take that spinlock for all the cases, especially for the new one we just create. To modify the existing actions, we still need this spinlock to make sure the whole update is atomic. For net-next, we can get rid of this spinlock because we already hold the RTNL lock on slow path, and on fast path we can use RCU to protect the metalist. Joint work with Jamal. Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-23  esp: Fix ESN generation under UDP encapsulation  Herbert Xu
Blair Steven noticed that ESN in conjunction with UDP encapsulation is broken because we set the temporary ESP header to the wrong spot. This patch fixes this by first of all using the right spot, i.e., 4 bytes off the real ESP header, and then saving this information so that after encryption we can restore it properly. Fixes: 7021b2e1cddd ("esp4: Switch to new AEAD interface") Reported-by: Blair Steven <Blair.Steven@alliedtelesis.co.nz> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-23  netfilter: nft_meta: set skb->nf_trace appropriately  Liping Zhang
When the user adds an nft rule to set nftrace to zero, for example:
# nft add rule ip filter input nftrace set 0
we should also set nf_trace to zero. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-06-23  netfilter: nf_tables: fix memory leak if expr init fails  Liping Zhang
If expr init fails then we need to free it. So when the user adds an nft rule as follows:
# nft add rule filter input tcp dport 22 flow table ssh \
    { ip saddr limit rate 0/second }
a memory leak will happen. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-06-23  netfilter: Allow xt_owner in any user namespace  Eric W. Biederman
Making this work is a little tricky as it really isn't kosher to change the xt_owner_match_info in a check function. Without changing xt_owner_match_info we need to know the user namespace the uids and gids are specified in. In the common case net->user_ns == current_user_ns(). Verify net->user_ns == current_user_ns() in owner_check so we can later assume it in owner_mt. In owner_check also verify that all of the uids and gids specified are in net->user_ns and that the expected min/max relationship exists between the uids and gids in xt_owner_match_info. In owner_mt get the network namespace from the outgoing socket, as this must be the same network namespace as the netfilter rules, and use that network namespace to find the user namespace the uids and gids in xt_owner_match_info are encoded in. Then convert from their encoded form into the kernel internal format for uids and gids and perform the owner match. Similar to ping_group_range, this code does not try to detect noncontiguous UID/GID ranges. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Kevin Cernekee <cernekee@chromium.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-06-23  netfilter: move zone info into struct nf_conn  Florian Westphal
Currently we store zone information as a conntrack extension. This has one drawback: for every lookup we need to fetch the zone data from the extension area. This change places the zone data directly into the main conntrack object structure and then removes the zone conntrack extension. The zone data is just 4 bytes; it fits into a padding hole before the tuplehash info, so we do not even increase the nf_conn structure size. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-06-23  netfilter: nf_log: Remove NULL check  Shivani Bhardwaj
If 'logger' were NULL, there would already have been a direct jump to the label 'out'. Since it has already been checked for NULL, remove this unnecessary check. Signed-off-by: Shivani Bhardwaj <shivanib134@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-06-23  netfilter: conntrack: align nf_conn on cacheline boundary  Florian Westphal
Increases struct size by 32 bytes (288 -> 320), but it is the right thing to do; otherwise any attempt to (re-)arrange nf_conn members by cacheline won't work. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
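The 288 -> 320 figure is what rounding up to a multiple of the cache line size gives, assuming 64-byte cache lines (the commit does not state the line size, so that value is an assumption here). A small sketch of the arithmetic, with a hypothetical helper name:

```c
#include <stddef.h>

#define ASSUMED_CACHELINE 64u   /* assumption: 64-byte cache lines */

/* Round n up to the next multiple of a (a must be non-zero). */
static size_t align_up(size_t n, size_t a)
{
    return (n + a - 1) / a * a;
}
```

Here `align_up(288, ASSUMED_CACHELINE)` is 320: 288 bytes span four and a half lines, so the size rounds up to five full lines, 32 bytes more than before.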
2016-06-23  netfilter: xt_TRACE: add explicitly nf_logger_find_get call  Liping Zhang
Consider the following situation: if the nf_log_ipv4 kernel module is not installed and the user adds the following iptables rule:
# iptables -t raw -I PREROUTING -j TRACE
there will be no trace log generated until the user installs the nf_log_ipv4 module manually. So we should request the related nf_log module appropriately here. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-06-23  netfilter: nf_log: handle NFPROTO_INET properly in nf_logger_[find_get|put]  Liping Zhang
When we request NFPROTO_INET, it means both NFPROTO_IPV4 and NFPROTO_IPV6. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-06-23  netfilter: x_tables: fix possible ZERO_SIZE_PTR pointer dereferencing error.  Xiubo Li
We cannot be sure that 'hook_mask' will always be non-zero here. If it equals zero, num_hooks will be zero too, and then kmalloc() will return ZERO_SIZE_PTR, which is (void *)16. Then the following error check will fail:
    ops = kmalloc(sizeof(*ops) * num_hooks, GFP_KERNEL);
    if (ops == NULL)
        return ERR_PTR(-ENOMEM);
So this patch fixes this by doing the zero check before kmalloc() is called. Maybe the case above will never happen here, but it can in theory. Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
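A userspace analogue of the fix is sketched below: reject a zero element count before allocating, because a zero-size allocation may return a non-NULL pointer that must not be dereferenced. The names `struct hook_ops`, `count_bits` and `alloc_hook_ops` are hypothetical, `calloc()` stands in for `kmalloc()`, and the `-EINVAL` return for an empty mask is an assumption of this sketch.

```c
#include <errno.h>
#include <stdlib.h>

struct hook_ops { int hooknum; };   /* hypothetical stand-in for the real ops struct */

/* Count the set bits in mask (one hook per bit). */
static unsigned int count_bits(unsigned int mask)
{
    unsigned int n = 0;

    for (; mask; mask &= mask - 1)
        n++;
    return n;
}

/* Returns 0 and stores the array in *out, or a negative errno value. */
static int alloc_hook_ops(unsigned int hook_mask, struct hook_ops **out)
{
    unsigned int num_hooks = count_bits(hook_mask);
    struct hook_ops *ops;

    if (num_hooks == 0)          /* checked before allocating anything */
        return -EINVAL;

    ops = calloc(num_hooks, sizeof(*ops));
    if (ops == NULL)
        return -ENOMEM;

    *out = ops;
    return 0;
}
```

With the early check, a NULL test after the allocation is once again sufficient, since the zero-size case can no longer reach it.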
2016-06-23  can: only call can_stat_update with procfs  Arnd Bergmann
The change to leave out procfs support in CAN when CONFIG_PROC_FS is not set was incomplete and leads to a build error:
net/built-in.o: In function `can_init':
:(.init.text+0x9858): undefined reference to `can_stat_update'
ERROR: "can_stat_update" [net/can/can.ko] undefined!
This tries a better approach, encapsulating all of the calls within IS_ENABLED(), so we also leave out the timer function from the object file. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: a20fadf85312 ("can: build proc support only if CONFIG_PROC_FS is activated") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-06-22  openvswitch: Add packet len info to upcall.  William Tu
The commit f2a4d086ed4c ("openvswitch: Add packet truncation support.") introduces packet truncation before sending to the userspace upcall receiver. This patch passes up skb->len before truncation so that the upcall receiver knows the original packet size. Potentially this will be used by sFlow, where OVS translates the sFlow config header=N to a sample action, truncating the packet to N bytes in the kernel datapath. Thus, only N bytes instead of the full packet are copied from kernel to userspace, saving kernel-to-userspace bandwidth. Signed-off-by: William Tu <u9012063@gmail.com> Cc: Pravin Shelar <pshelar@nicira.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-22  tipc: unclone unbundled buffers before forwarding  Jon Paul Maloy
When extracting an individual message from a received "bundle" buffer, we just create a clone of the base buffer, and adjust it to point into the right position of the linearized data area of the latter. This works well for regular message reception, but during periods of extremely high load it may happen that an extracted buffer, e.g., a connection probe, is reversed and forwarded through an external interface while the preceding extracted message is still unhandled. When this happens, the header or data area of the preceding message will be partially overwritten by a MAC header, leading to unpredictable consequences, such as a link reset. We now fix this by ensuring that the msg_reverse() function never returns a cloned buffer, and that the returned buffer always contains sufficient valid head and tail room to be forwarded. Reported-by: Erik Hugne <erik.hugne@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-22  kcm: fix /proc memory leak  Jiri Slaby
Every open of /proc/net/kcm leaks 16 bytes of memory as is reported by kmemleak:
unreferenced object 0xffff88059c0e3458 (size 192):
  comm "cat", pid 1401, jiffies 4294935742 (age 310.720s)
  hex dump (first 32 bytes):
    28 45 71 96 05 88 ff ff 00 10 00 00 00 00 00 00  (Eq.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8156a2de>] kmem_cache_alloc_trace+0x16e/0x230
    [<ffffffff8162a479>] seq_open+0x79/0x1d0
    [<ffffffffa0578510>] kcm_seq_open+0x0/0x30 [kcm]
    [<ffffffff8162a479>] seq_open+0x79/0x1d0
    [<ffffffff8162a8cf>] __seq_open_private+0x2f/0xa0
    [<ffffffff81712548>] seq_open_net+0x38/0xa0
    ...
It is caused by a missing free in the ->release path. So fix it by providing seq_release_net as the ->release method. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Fixes: cd6e111bf5 (kcm: Add statistics and proc interfaces) Cc: "David S. Miller" <davem@davemloft.net> Cc: Tom Herbert <tom@herbertland.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-22  rxrpc: Kill off the rxrpc_transport struct  David Howells
The rxrpc_transport struct is now redundant, given that the rxrpc_peer struct is now per peer port rather than per peer host, so get rid of it. Service connection lists are transferred to the rxrpc_peer struct, as is the conn_lock. Previous patches moved the client connection handling out of the rxrpc_transport struct and discarded the connection bundling code. Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-22  rxrpc: Kill the client connection bundle concept  David Howells
Kill off the concept of maintaining a bundle of connections to a particular target service to increase the number of call slots available for any beyond four for that service (there are four call slots per connection). This will make cleaning up the connection handling code easier and facilitate removal of the rxrpc_transport struct. Bundling can be reintroduced later if necessary. Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-22  rxrpc: Provide more refcount helper functions  David Howells
Provide refcount helper functions for connections so that the code doesn't touch local or connection usage counts directly. Also make it such that local and peer put functions can take a NULL pointer. Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-22  rxrpc: Make rxrpc_send_packet() take a connection not a transport  David Howells
Make rxrpc_send_packet() take a connection not a transport as part of the phasing out of the rxrpc_transport struct. Whilst we're at it, rename the function to rxrpc_send_data_packet() to differentiate it from the other packet sending functions. Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-22  rxrpc: Calls displayed in /proc may in future lack a connection  David Howells
Allocated rxrpc calls displayed in /proc/net/rxrpc_calls may in future be on the proc list before they're connected or after they've been disconnected - in which case they may not have a pointer to a connection struct that can be used to get data from there. Deal with this by using stuff from the call struct in preference where possible and printing "no_connection" rather than a peer address if no connection is assigned. This change also has the added bonus that the service ID is now taken from the call rather than the connection, which will allow per-call service upgrades to be shown - something required for AuriStor server compatibility. Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-22  rxrpc: Validate the net address given to rxrpc_kernel_begin_call()  David Howells
Validate the net address given to rxrpc_kernel_begin_call() before using it. Whilst this should be mostly unnecessary for in-kernel users, it does clear the tail of the address struct in case we want to hash or compare the whole thing. Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-22  rxrpc: Use IDR to allocate client conn IDs on a machine-wide basis  David Howells
Use the IDR facility to allocate client connection IDs on a machine-wide basis so that each client connection has a unique identifier. When the connection ID space wraps, we advance the epoch by 1, thereby effectively having a 62-bit ID space. The IDR facility is then used to look up client connections during incoming packet routing instead of using an rbtree rooted on the transport. This change allows for the removal of the transport in the future and also means that client connections can be looked up directly in the data-ready handler by connection ID. The ID management code is placed in a new file, conn-client.c, to which all the client connection-specific code will eventually move. Note that the IDR tree gets very expensive on memory if the connection IDs are widely scattered throughout the number space, so we shall need to retire connections that have, say, an ID more than four times the maximum number of client conns away from the current allocation point to try and keep the IDs concentrated. We will also need to retire connections from an old epoch. Also note that, for the moment, a pointer to the transport has to be passed through into the ID allocation function so that we can take a BH lock to prevent a locking issue against in-BH lookup of client connections. This will go away later when RCU is used for server connections also. Signed-off-by: David Howells <dhowells@redhat.com>
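The "62-bit ID space" can be pictured as a 32-bit epoch combined with a connection ID that wraps. The sketch below is only an illustration of that arithmetic. It uses a plain counter instead of the IDR facility, it assumes a 30-bit ID whose two low bits in the 32-bit connection ID are left for the call slot (consistent with four call slots per connection), and all type and function names are hypothetical.

```c
#include <stdint.h>

#define ID_BITS 30                                /* assumption: 32-bit cid minus 2 channel bits */
#define ID_MAX  ((UINT32_C(1) << ID_BITS) - 1)

struct id_space { uint32_t epoch; uint32_t next_id; };   /* next_id starts at 1 */
struct conn_key { uint32_t epoch; uint32_t cid; };

/* Hand out the next connection ID, advancing the epoch when the ID space wraps. */
static struct conn_key alloc_conn_id(struct id_space *s)
{
    struct conn_key k;

    if (s->next_id > ID_MAX) {    /* ID space exhausted: start a new epoch */
        s->epoch++;
        s->next_id = 1;
    }
    k.epoch = s->epoch;
    k.cid = s->next_id++ << 2;    /* low 2 bits are left for the channel number */
    return k;
}
```

Because the pair (epoch, ID) is unique across wraps, 32 epoch bits plus 30 ID bits give the effective 62 bits mentioned above. A real allocator also has to retire old connections, as the commit message notes, which this sketch does not attempt.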
2016-06-22  rxrpc: rxrpc_connection_lock shouldn't be a BH lock, but conn_lock is  David Howells
rxrpc_connection_lock shouldn't be accessed as a BH-excluding lock. It's only accessed in a few places and none of those are in BH-context. rxrpc_transport::conn_lock, however, *is* a BH-excluding lock and should be accessed so consistently. Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-22  rxrpc: Pass sk_buff * rather than rxrpc_host_header * to functions  David Howells
Pass a pointer to struct sk_buff rather than struct rxrpc_host_header to functions so that they can in the future get at transport protocol parameters rather than just RxRPC parameters. Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-22  rxrpc: Fix exclusive connection handling  David Howells
"Exclusive connections" are meant to be used for a single client call and then scrapped. The idea is to limit the use of the negotiated security context. The current code, however, isn't doing this: it is instead restricting the socket to a single virtual connection and doing all the calls over that. This is changed such that the socket no longer maintains a special virtual connection over which it will do all the calls, but rather gets a new one each time a new exclusive call is made. Further, using a socket option for this is a poor choice. It should be done on sendmsg with a control message marker instead so that calls can be marked exclusive individually. To that end, add RXRPC_EXCLUSIVE_CALL which, if passed to sendmsg() as a control message element, will cause the call to be done on a single-use connection. The socket option (RXRPC_EXCLUSIVE_CONNECTION) still exists and, if set, will override any lack of RXRPC_EXCLUSIVE_CALL being specified so that programs using the setsockopt() will appear to work the same. Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-22  rxrpc: Replace conn->trans->{local,peer} with conn->params.{local,peer}  David Howells
Replace accesses of conn->trans->{local,peer} with conn->params.{local,peer} thus making it easier for a future commit to remove the rxrpc_transport struct. This also reduces the number of memory accesses involved. Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-22  rxrpc: Use structs to hold connection params and protocol info  David Howells
Define and use a structure to hold connection parameters. This makes it easier to pass multiple connection parameters around. Define and use a structure to hold protocol information used to hash a connection for lookup on incoming packet. Most of these fields will be disposed of eventually, including the duplicate local pointer. Whilst we're at it rename "proto" to "family" when referring to a protocol family. Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-22  rxrpc: fix uninitialized variable use  Arnd Bergmann
Hashing the peer key was introduced for AF_INET, but gcc warns about the rxrpc_peer_hash_key function returning uninitialized data for any other value of srx->transport.family:
net/rxrpc/peer_object.c: In function 'rxrpc_peer_hash_key':
net/rxrpc/peer_object.c:57:15: error: 'p' may be used uninitialized in this function [-Werror=maybe-uninitialized]
Assuming that nothing else can be set here, this changes the function to just return zero in case of an unknown address family. Fixes: be6e6707f6ee ("rxrpc: Rework peer object handling to use hash table and RCU") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-22  rxrpc: checking for IS_ERR() instead of NULL  Dan Carpenter
rxrpc_lookup_peer_rcu() and rxrpc_lookup_peer() return NULL on error, never error pointers, so IS_ERR() can't be used. Fix three callers of those functions. Fixes: be6e6707f6ee ('rxrpc: Rework peer object handling to use hash table and RCU') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com>
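Why an IS_ERR() test cannot catch a NULL return is easy to see from how error pointers are encoded. The following is a userspace re-implementation written for illustration, with lowercase names to make clear these are not the kernel macros; it assumes the usual convention that the top 4095 addresses encode negative errno values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer in the top MAX_ERRNO addresses. */
static inline void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True only for pointers in the error range; NULL (address 0) is not in it. */
static inline bool is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-MAX_ERRNO);
}
```

A function that signals failure with NULL must therefore be checked with a plain NULL test. `is_err(NULL)` is false, so an error-pointer check on such a return value lets the failure through.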
2016-06-18  ipv6: RFC 4884 partial support for SIT/GRE tunnels  Eric Dumazet
When receiving an ICMPv4 message containing extensions as defined in RFC 4884, and translating it to ICMPv6 at a SIT or GRE tunnel, we need some extra manipulation in order to properly forward the extensions. This patch only takes care of Time Exceeded messages as they are the ones that typically carry information from various routers in a fabric during a traceroute session. It also avoids complex skb logic if the data_len is not a multiple of 8. The RFC states: The "original datagram" field MUST contain at least 128 octets. If the original datagram did not contain 128 octets, the "original datagram" field MUST be zero padded to 128 octets. In practice routers use 128 bytes of original datagram, not more. Initial translation was added in commit ca15a078bd90 ("sit: generate icmpv6 error when receiving icmpv4 error") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Oussama Ghorbel <ghorbel@pivasoftware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
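The padding rule quoted from the RFC can be sketched as a small buffer helper. This is a userspace illustration with a hypothetical function name. It covers only the zero padding to 128 octets, not the skb handling or the extension structure itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RFC4884_MIN_ORIG 128   /* minimum size of the "original datagram" field */

/*
 * Copy the original datagram into out and zero pad it to at least 128 octets.
 * Returns the resulting field length, or 0 if out (capacity cap) is too small.
 */
static size_t pad_original_datagram(const uint8_t *orig, size_t len,
                                    uint8_t *out, size_t cap)
{
    size_t padded = len < RFC4884_MIN_ORIG ? RFC4884_MIN_ORIG : len;

    if (padded > cap)
        return 0;

    memcpy(out, orig, len);
    memset(out + len, 0, padded - len);
    return padded;
}
```

A 100-byte original datagram therefore becomes a 128-byte field whose last 28 bytes are zero, and any extension data would follow after that point.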
2016-06-18  gre: better support for ICMP messages for gre+ipv6  Eric Dumazet
ipgre_err() can call ip6_err_gen_icmpv6_unreach() for proper support of ipv4+gre+icmp+ipv6+... frames, used for example by traceroute/mtr. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-18  ipv6: translate ICMP_TIME_EXCEEDED to ICMPV6_TIME_EXCEED  Eric Dumazet
For better traceroute/mtr support for SIT and GRE tunnels, we translate the IPv4 ICMP type ICMP_TIME_EXCEEDED to ICMPV6_TIME_EXCEED. We also have to translate the IPv4 source address of the ICMP message to an IPv6 v4mapped address. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
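For readers unfamiliar with v4mapped addresses: the IPv4 address is placed in the last four bytes of an IPv6 address, preceded by ten zero bytes and two 0xff bytes (::ffff:a.b.c.d). A userspace sketch using the standard socket types follows; the helper name is hypothetical and this is not the kernel's helper.

```c
#include <netinet/in.h>
#include <string.h>

/* Build the IPv4-mapped IPv6 address ::ffff:a.b.c.d for the given IPv4 address. */
static struct in6_addr v4_to_v4mapped(const struct in_addr *v4)
{
    struct in6_addr a6;

    memset(&a6, 0, sizeof(a6));
    a6.s6_addr[10] = 0xff;
    a6.s6_addr[11] = 0xff;
    /* s_addr is already in network byte order, so the bytes copy over as-is. */
    memcpy(&a6.s6_addr[12], &v4->s_addr, sizeof(v4->s_addr));
    return a6;
}
```

The result satisfies the standard `IN6_IS_ADDR_V4MAPPED()` test, so the receiver of the translated ICMPv6 error can still recover the IPv4 address of the router that reported the Time Exceeded condition.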
2016-06-18  ip6: move ipip6_err_gen_icmpv6_unreach()  Eric Dumazet
We want to use this helper from GRE as well, so this is the time to move it to net/ipv6/icmp.c. Also add a @nhs parameter, since SIT and GRE have different values for the header(s) to skip. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-18  ipv6: icmp: add a force_saddr param to icmp6_send()  Eric Dumazet
SIT or GRE tunnels might want to translate an IPV4 address into a v4mapped one when translating ICMP to ICMPv6. This patch adds the parameter to icmp6_send() but does not change icmpv6_send() signature. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-18  net: rds: fix coding style issues  Joshua Houghton
Fix coding style issues in the following files:
ib_cm.c: add space
loop.c: convert spaces to tabs
sysctl.c: add space
tcp.h: convert spaces to tabs
tcp_connect.c: remove extra indentation in switch statement
tcp_recv.c: convert spaces to tabs
tcp_send.c: convert spaces to tabs
transport.c: move brace up one line on for statement
Signed-off-by: Joshua Houghton <josh@awful.name> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-18  AX.25: Close socket connection on session completion  Basil Gunn
A socket connection made in ax.25 is not closed when the session is completed. The heartbeat timer is stopped prematurely and this is where the socket gets closed. Allow the heartbeat timer to run to close the socket. Symptom occurs in kernels >= 4.2.0 Originally sent 6/15/2016. Resend with distribution list matching scripts/maintainer.pl output. Signed-off-by: Basil Gunn <basil@pacabunga.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-17  RDS: TCP: rds_tcp_accept_one() should transition socket from RESETTING to UP  Sowmini Varadhan
The state of the rds_connection after rds_tcp_reset_callbacks() would be RDS_CONN_RESETTING and this is the value that should be passed by rds_tcp_accept_one() to rds_connect_path_complete() to transition the socket to RDS_CONN_UP. Fixes: b5c21c0947c1 ("RDS: TCP: fix race windows in send-path quiescence by rds_tcp_accept_one()") Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-17  RDS: TCP: Fix non static symbol warnings  Wei Yongjun
Fixes the following sparse warnings:
net/rds/tcp.c:59:5: warning: symbol 'rds_tcp_min_sndbuf' was not declared. Should it be static?
net/rds/tcp.c:60:5: warning: symbol 'rds_tcp_min_rcvbuf' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-17  Merge tag 'linux-can-next-for-4.8-20160617' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next  David S. Miller
Marc Kleine-Budde says:
====================
pull-request: can-next 2016-06-17
this is a pull request of 14 patches for net-next/master. Geert Uytterhoeven contributes a patch that adds file patterns for CAN device tree bindings to MAINTAINERS. A patch by Alexander Aring fixes warnings when building without proc support. A patch by me improves the sample point calculation. Marek Vasut's patch converts the slcan driver to use CAN_MTU. A patch by William Breathitt Gray converts the tscan1 driver to use module_isa_driver. Two patches by Maximilian Schneider for the gs_usb driver fix coding style and add support for set_phys_id callback. 5 patches by Oliver Hartkopp add support for CANFD to the bcm. And finally two patches by Ramesh Shanmugasundaram, which add support for the rcar_canfd driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-17  tipc: fix socket timer deadlock  Jon Paul Maloy
We sometimes observe a 'deadly embrace' type deadlock occurring between mutually connected sockets on the same node. This happens when the one-hour peer supervision timers happen to expire simultaneously in both sockets. The scenario is as follows:
CPU 1:                      CPU 2:
--------                    --------
tipc_sk_timeout(sk1)        tipc_sk_timeout(sk2)
lock(sk1.slock)             lock(sk2.slock)
msg_create(probe)           msg_create(probe)
unlock(sk1.slock)           unlock(sk2.slock)
tipc_node_xmit_skb()        tipc_node_xmit_skb()
tipc_node_xmit()            tipc_node_xmit()
tipc_sk_rcv(sk2)            tipc_sk_rcv(sk1)
lock(sk2.slock)             lock(sk1.slock)
filter_rcv()                filter_rcv()
tipc_sk_proto_rcv()         tipc_sk_proto_rcv()
msg_create(probe_rsp)       msg_create(probe_rsp)
tipc_sk_respond()           tipc_sk_respond()
tipc_node_xmit_skb()        tipc_node_xmit_skb()
tipc_node_xmit()            tipc_node_xmit()
tipc_sk_rcv(sk1)            tipc_sk_rcv(sk2)
lock(sk1.slock)             lock(sk2.slock)
===> DEADLOCK               ===> DEADLOCK
Further analysis reveals that there are three different locations in the socket code where tipc_sk_respond() is called within the context of the socket lock, with ensuing risk of similar deadlocks. We now solve this by passing a buffer queue along with all upcalls where sk_lock.slock may potentially be held. Response or rejected message buffers are accumulated into this queue instead of being sent out directly, and only sent once we know we are safely outside the slock context. Reported-by: GUNA <gbalasun@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-17  tipc: potential shift wrapping bug in map_set()  Dan Carpenter
"up_map" is a u64 type but we're not using the high 32 bits. Fixes: 35c55c9877f8 ('tipc: add neighbor monitoring framework') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-17  net: ipv6: Address selection needs to consider L3 domains  David Ahern
IPv6 version of 3f2fb9a834cb ("net: l3mdev: address selection should only consider devices in L3 domain") and the follow up commit, a17b693cdd876 ("net: l3mdev: prefer VRF master for source address selection"). That is, if outbound device is given then the address preference order is an address from that device, an address from the master device if it is enslaved, and then an address from a device in the same L3 domain. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>