summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2017-04-21rds: make use of iov_iter_revert()Al Viro
2017-04-21net: ipv6: RTF_PCPU should not be settable from userspaceDavid Ahern
Andrey reported a fault in the IPv6 route code: kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Modules linked in: CPU: 1 PID: 4035 Comm: a.out Not tainted 4.11.0-rc7+ #250 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff880069809600 task.stack: ffff880062dc8000 RIP: 0010:ip6_rt_cache_alloc+0xa6/0x560 net/ipv6/route.c:975 RSP: 0018:ffff880062dced30 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: ffff8800670561c0 RCX: 0000000000000006 RDX: 0000000000000003 RSI: ffff880062dcfb28 RDI: 0000000000000018 RBP: ffff880062dced68 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff880062dcfb28 R14: dffffc0000000000 R15: 0000000000000000 FS: 00007feebe37e7c0(0000) GS:ffff88006cb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000205a0fe4 CR3: 000000006b5c9000 CR4: 00000000000006e0 Call Trace: ip6_pol_route+0x1512/0x1f20 net/ipv6/route.c:1128 ip6_pol_route_output+0x4c/0x60 net/ipv6/route.c:1212 ... Andrey's syzkaller program passes rtmsg.rtmsg_flags with the RTF_PCPU bit set. Flags passed to the kernel are blindly copied to the allocated rt6_info by ip6_route_info_create making a newly inserted route appear as though it is a per-cpu route. ip6_rt_cache_alloc sees the flag set and expects rt->dst.from to be set - which it is not since it is not really a per-cpu copy. The subsequent call to __ip6_dst_alloc then generates the fault. Fix by checking for the flag and failing with EINVAL. Fixes: d52d3997f843f ("ipv6: Create percpu rt6_info") Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21bpf: add napi_id read access to __sk_buffDaniel Borkmann
Add napi_id access to __sk_buff for socket filter program types, tc program types and other bpf_convert_ctx_access() users. Having access to skb->napi_id is useful for per RX queue listener siloing, f.e. in combination with SO_ATTACH_REUSEPORT_EBPF and when busy polling is used, meaning SO_REUSEPORT enabled listeners can then select the corresponding socket at SYN time already [1]. The skb is marked via skb_mark_napi_id() early in the receive path (e.g., napi_gro_receive()). Currently, sockets can only use SO_INCOMING_NAPI_ID from 6d4339028b35 ("net: Introduce SO_INCOMING_NAPI_ID") as a socket option to look up the NAPI ID associated with the queue for steering, which requires a prior sk_mark_napi_id() after the socket was looked up. Semantics for the __sk_buff napi_id access are similar, meaning if skb->napi_id is < MIN_NAPI_ID (e.g. outgoing packets using sender_cpu), then an invalid napi_id of 0 is returned to the program, otherwise a valid non-zero napi_id. [1] http://netdevconf.org/2.1/slides/apr6/dumazet-BUSY-POLLING-Netdev-2.1.pdf Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21gso: Validate assumption of frag_list segementationIlan Tayari
Commit 07b26c9454a2 ("gso: Support partial splitting at the frag_list pointer") assumes that all SKBs in a frag_list (except maybe the last one) contain the same amount of GSO payload. This assumption is not always correct, resulting in the following warning message in the log: skb_segment: too many frags For example, mlx5 driver in Striding RQ mode creates some RX SKBs with one frag, and some with 2 frags. After GRO, the frag_list SKBs end up having different amounts of payload. If this frag_list SKB is then forwarded, the aforementioned assumption is violated. Validate the assumption, and fall back to software GSO if it not true. Change-Id: Ia03983f4a47b6534dd987d7a2aad96d54d46d212 Fixes: 07b26c9454a2 ("gso: Support partial splitting at the frag_list pointer") Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuningMatthew Whitehead
Constants used for tuning are generally a bad idea, especially as hardware changes over time. Replace the constant 2 jiffies with sysctl variable netdev_budget_usecs to enable sysadmins to tune the softirq processing. Also document the variable. For example, a very fast machine might tune this to 1000 microseconds, while my regression testing 486DX-25 needs it to be 4000 microseconds on a nearly idle network to prevent time_squeeze from being incremented. Version 2: changed jiffies to microseconds for predictable units. Signed-off-by: Matthew Whitehead <tedheadster@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21ip_tunnel: Allow policy-based routing through tunnelsCraig Gallek
This feature allows the administrator to set an fwmark for packets traversing a tunnel. This allows the use of independent routing tables for tunneled packets without the use of iptables. There is no concept of per-packet routing decisions through IPv4 tunnels, so this implementation does not need to work with per-packet route lookups as the v6 implementation may (with IP6_TNL_F_USE_ORIG_FWMARK). Further, since the v4 tunnel ioctls share datastructures (which can not be trivially modified) with the kernel's internal tunnel configuration structures, the mark attribute must be stored in the tunnel structure itself and passed as a parameter when creating or changing tunnel attributes. Signed-off-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21ip6_tunnel: Allow policy-based routing through tunnelsCraig Gallek
This feature allows the administrator to set an fwmark for packets traversing a tunnel. This allows the use of independent routing tables for tunneled packets without the use of iptables. Signed-off-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21ipv6: sr: fix double free of skb after handling invalid SRHDavid Lebrun
The icmpv6_param_prob() function already does a kfree_skb(), this patch removes the duplicate one. Fixes: 1ababeba4a21f3dba3da3523c670b207fb2feb62 ("ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21net: dsa: Remove redundant NULL dst checkFlorian Fainelli
tag_lan9303.c does check for a NULL dst but that's already checked by dsa_switch_rcv() one layer above. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Juergen Borleis <jbe@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20net sched actions: allocate act cookie earlyWolfgang Bumiller
Policing filters do not use the TCA_ACT_* enum and the tb[] nlattr array in tcf_action_init_1() doesn't get filled for them so we should not try to look for a TCA_ACT_COOKIE attribute in the then uninitialized array. The error handling in cookie allocation then calls tcf_hash_release() leading to invalid memory access later on. Additionally, if cookie allocation fails after an already existing non-policing filter has successfully been changed, tcf_action_release() should not be called, also we would have to roll back the changes in the error handling, so instead we now allocate the cookie early and assign it on success at the end. CVE-2017-7979 Fixes: 1045ba77a596 ("net sched actions: Add support for user cookies") Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2017-04-19 Two fixes for af_key: 1) Add a lock to key dump to prevent a NULL pointer dereference. From Yuejie Shi. 2) Fix slab-out-of-bounds in parse_ipsecrequests. From Herbert Xu. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20tcp_cubic: fix typo in module param descriptionChema Gonzalez
Signed-off-by: Chema Gonzalez <chemag@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20net: ipv6: Fix UDP early demux lookup with udp_l3mdev_accept=0subashab@codeaurora.org
David Ahern reported that 5425077d73e0c ("net: ipv6: Add early demux handler for UDP unicast") breaks udp_l3mdev_accept=0 since early demux for IPv6 UDP was doing a generic socket lookup which does not require an exact match. Fix this by making UDPv6 early demux match connected sockets only. v1->v2: Take reference to socket after match as suggested by Eric v2->v3: Add comment before break Fixes: 5425077d73e0c ("net: ipv6: Add early demux handler for UDP unicast") Reported-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Eric Dumazet <edumazet@google.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Tested-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20tcp: remove poll() flakes with FastOpenEric Dumazet
When using TCP FastOpen for an active session, we send one wakeup event from tcp_finish_connect(), right before the data eventually contained in the received SYNACK is queued to sk->sk_receive_queue. This means that depending on machine load or luck, poll() users might receive POLLOUT events instead of POLLIN|POLLOUT To fix this, we need to move the call to sk->sk_state_change() after the (optional) call to tcp_rcv_fastopen_synack() Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20tcp: remove poll() flakes when receiving RSTEric Dumazet
When a RST packet is processed, we send two wakeup events to interested polling users. First one by a sk->sk_error_report(sk) from tcp_reset(), followed by a sk->sk_state_change(sk) from tcp_done(). Depending on machine load and luck, poll() can either return POLLERR, or POLLIN|POLLOUT|POLLERR|POLLHUP (this happens on 99 % of the cases) This is probably fine, but we can avoid the confusion by reordering things so that we have more TCP fields updated before the first wakeup. This might even allow us to remove some barriers we added in the past. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20ipv6: sr: fix out-of-bounds access in SRH validationDavid Lebrun
This patch fixes an out-of-bounds access in seg6_validate_srh() when the trailing data is less than sizeof(struct sr6_tlv). Reported-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20mac80211: reject ToDS broadcast data framesJohannes Berg
AP/AP_VLAN modes don't accept any real 802.11 multicast data frames, but since they do need to accept broadcast management frames the same is currently permitted for data frames. This opens a security problem because such frames would be decrypted with the GTK, and could even contain unicast L3 frames. Since the spec says that ToDS frames must always have the BSSID as the RA (addr1), reject any other data frames. The problem was originally reported in "Predicting, Decrypting, and Abusing WPA2/802.11 Group Keys" at usenix https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/vanhoef and brought to my attention by Jouni. Cc: stable@vger.kernel.org Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com> -- Dave, I didn't want to send you a new pull request for a single commit yet again - can you apply this one patch as is? Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20bpf: remove reference to sock_filter_ext from kerneldoc commentTobias Klauser
struct sock_filter_ext didn't make it into the tree and is now called struct bpf_insn. Reword the kerneldoc comment for bpf_convert_filter() accordingly. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20Merge tag 'mac80211-next-for-davem-2017-04-18' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== My last pull request has been a while, we now have: * connection quality monitoring with multiple thresholds * support for FILS shared key authentication offload * pre-CAC regulatory compliance - only ETSI allows this * sanity check for some rate confusion that hit ChromeOS (but nobody else uses it, evidently) * some documentation updates * lots of cleanups ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20net: dsa: add support for the SMSC-LAN9303 tagging formatJuergen Beisert
To define the outgoing port and to discover the incoming port a regular VLAN tag is used by the LAN9303. But its VID meaning is 'special'. This tag handler/filter depends on some hardware features which must be enabled in the device to provide and make use of this special VLAN tag to control the destination and the source of an ethernet packet. Signed-off-by: Juergen Borleis <jbe@pengutronix.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20sunrpc: don't check for failure from mempool_alloc()NeilBrown
When mempool_alloc() is allowed to sleep (GFP_NOIO allows sleeping) it cannot fail. So rpc_alloc_task() cannot fail, so rpc_new_task doesn't need to test for failure. Consequently rpc_new_task() cannot fail, so the callers don't need to test. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20Merge tag 'mac80211-for-davem-2017-04-18' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A single fix, for the MU-MIMO monitor mode, that fixes bad SKB accesses if the SKB was paged, which is the case for the only driver supporting this - iwlwifi. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
A function in kernel/bpf/syscall.c which got a bug fix in 'net' was moved to kernel/bpf/verifier.c in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-19netfilter: tcp: Use TCP_MAX_WSCALE instead of literal 14Gao Feng
The window scale may be enlarged from 14 to 15 according to the itef draft https://tools.ietf.org/html/draft-nishida-tcpm-maxwin-03. Use the macro TCP_MAX_WSCALE to support it easily with TCP stack in the future. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-19netfilter: ipvs: fix incorrect conflict resolutionFlorian Westphal
The commit ab8bc7ed864b9c4f1fcb00a22bbe4e0f66ce8003 ("netfilter: remove nf_ct_is_untracked") changed the line if (ct && !nf_ct_is_untracked(ct) && nfct_nat(ct)) { to if (ct && nfct_nat(ct)) { meanwhile, the commit 41390895e50bc4f28abe384c6b35ac27464a20ec ("netfilter: ipvs: don't check for presence of nat extension") from ipvs-next had changed the same line to if (ct && !nf_ct_is_untracked(ct) && (ct->status & IPS_NAT_MASK)) { When ipvs-next got merged into nf-next, the merge resolution took the first version, dropping the conversion of nfct_nat(). While this doesn't cause a problem at the moment, it will once we stop adding the nat extension by default. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-19nefilter: eache: reduce struct size from 32 to 24 byteFlorian Westphal
Only "cache" needs to use ulong (its used with set_bit()), missed can use u16. Also add build-time assertion to ensure event bits fit. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-19netfilter: allow early drop of assured conntracksFlorian Westphal
If insertion of a new conntrack fails because the table is full, the kernel searches the next buckets of the hash slot where the new connection was supposed to be inserted at for an entry that hasn't seen traffic in reply direction (non-assured), if it finds one, that entry is is dropped and the new connection entry is allocated. Allow the conntrack gc worker to also remove *assured* conntracks if resources are low. Do this by querying the l4 tracker, e.g. tcp connections are now dropped if they are no longer established (e.g. in finwait). This could be refined further, e.g. by adding 'soft' established timeout (i.e., a timeout that is only used once we get close to resource exhaustion). Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-19netfilter: conntrack: use u8 for extension sizes againFlorian Westphal
commit 223b02d923ecd7c84cf9780bb3686f455d279279 ("netfilter: nf_conntrack: reserve two bytes for nf_ct_ext->len") had to increase size of the extension offsets because total size of the extensions had increased to a point where u8 did overflow. 3 years later we've managed to diet extensions a bit and we no longer need u16. Furthermore we can now add a compile-time assertion for this problem. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-19netfilter: remove last traces of variable-sized extensionsFlorian Westphal
get rid of the (now unused) nf_ct_ext_add_length define and also rename the function to plain nf_ct_ext_add(). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-19netfilter: helpers: remove data_len usage for inkernel helpersFlorian Westphal
No need to track this for inkernel helpers anymore as NF_CT_HELPER_BUILD_BUG_ON checks do this now. All inkernel helpers know what kind of structure they stored in helper->data. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-19netfilter: nfnetlink_cthelper: reject too large userspace allocation requestsFlorian Westphal
Userspace should not abuse the kernel to store large amounts of data, reject requests larger than the private area can accommodate. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-19netfilter: helper: add build-time asserts for helper data sizeFlorian Westphal
add a 32 byte scratch area in the helper struct instead of relying on variable sized helpers plus compile-time asserts to let us know if 32 bytes aren't enough anymore. Not having variable sized helpers will later allow to add BUILD_BUG_ON for the total size of conntrack extensions -- the helper extension is the only one that doesn't have a fixed size. The (useless!) NF_CT_HELPER_BUILD_BUG_ON(0); are added so that in case someone adds a new helper and copy-pastes from one that doesn't store private data at least some indication that this macro should be used somehow is there... Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-19netfilter: nft_ct: allow to set ctnetlink event types of a connectionFlorian Westphal
By default the kernel emits all ctnetlink events for a connection. This allows to select the types of events to generate. This can be used to e.g. only send DESTROY events but no NEW/UPDATE ones and will work even if sysctl net.netfilter.nf_conntrack_events is set to 0. This was already possible via iptables' CT target, but the nft version has the advantage that it can also be used with already-established conntracks. The added nf_ct_is_template() check isn't a bug fix as we only support mark and labels (and unlike ecache the conntrack core doesn't copy those). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-19esp4/6: Fix GSO path for non-GSO SW-crypto packetsIlan Tayari
If esp*_offload module is loaded, outbound packets take the GSO code path, being encapsulated at layer 3, but encrypted in layer 2. validate_xmit_xfrm calls esp*_xmit for that. esp*_xmit was wrongfully detecting these packets as going through hardware crypto offload, while in fact they should be encrypted in software, causing plaintext leakage to the network, and also dropping at the receiver side. Perform the encryption in esp*_xmit, if the SA doesn't have a hardware offload_handle. Also, align esp6 code to esp4 logic. Fixes: fca11ebde3f0 ("esp4: Reorganize esp_output") Fixes: 383d0350f2cc ("esp6: Reorganize esp_output") Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-19esp6: fix incorrect null pointer check on xoColin Ian King
The check for xo being null is incorrect, currently it is checking for non-null, it should be checking for null. Detected with CoverityScan, CID#1429349 ("Dereference after null check") Fixes: 7862b4058b9f ("esp: Add gso handlers for esp4 and esp6") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-18mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCUPaul E. McKenney
A group of Linux kernel hackers reported chasing a bug that resulted from their assumption that SLAB_DESTROY_BY_RCU provided an existence guarantee, that is, that no block from such a slab would be reallocated during an RCU read-side critical section. Of course, that is not the case. Instead, SLAB_DESTROY_BY_RCU only prevents freeing of an entire slab of blocks. However, there is a phrase for this, namely "type safety". This commit therefore renames SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU in order to avoid future instances of this sort of confusion. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <linux-mm@kvack.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> [ paulmck: Add comments mentioning the old name, as requested by Eric Dumazet, in order to help people familiar with the old name find the new one. ] Acked-by: David Rientjes <rientjes@google.com>
2017-04-18sctp: process duplicated strreset asoc request correctlyXin Long
This patch is to fix the replay attack issue for strreset asoc requests. When a duplicated strreset asoc request is received, reply it with bad seqno if it's seqno < asoc->strreset_inseq - 2, and reply it with the result saved in asoc if it's seqno >= asoc->strreset_inseq - 2. But note that if the result saved in asoc is performed, the sender's next tsn and receiver's next tsn for the response chunk should be set. It's safe to get them from asoc. Because if it's changed, which means the peer has received the response already, the new response with wrong tsn won't be accepted by peer. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-18sctp: process duplicated strreset in and addstrm in requests correctlyXin Long
This patch is to fix the replay attack issue for strreset and addstrm in requests. When a duplicated strreset in or addstrm in request is received, reply it with bad seqno if it's seqno < asoc->strreset_inseq - 2, and reply it with the result saved in asoc if it's seqno >= asoc->strreset_inseq - 2. For strreset in or addstrm in request, if the receiver side processes it successfully, a strreset out or addstrm out request(as a response for that request) will be sent back to peer. reconf_time will retransmit the out request even if it's lost. So when receiving a duplicated strreset in or addstrm in request and it's result was performed, it shouldn't reply this request, but drop it instead. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-18sctp: process duplicated strreset out and addstrm out requests correctlyXin Long
Now sctp stream reconf will process a request again even if it's seqno is less than asoc->strreset_inseq. If one request has been done successfully and some data chunks have been accepted and then a duplicated strreset out request comes, the streamin's ssn will be cleared. It will cause that stream will never receive chunks any more because of unsynchronized ssn. It allows a replay attack. A similar issue also exists when processing addstrm out requests. It will cause more extra streams being added. This patch is to fix it by saving the last 2 results into asoc. When a duplicated strreset out or addstrm out request is received, reply it with bad seqno if it's seqno < asoc->strreset_inseq - 2, and reply it with the result saved in asoc if it's seqno >= asoc->strreset_inseq - 2. Note that it saves last 2 results instead of only last 1 result, because two requests can be sent together in one chunk. And note that when receiving a duplicated request, the receiver side will still reply it even if the peer has received the response. It's safe, As the response will be dropped by the peer. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-18nl80211: Fix enum type of variable in nl80211_put_sta_rate()Matthias Kaehlcke
rate_flg is of type 'enum nl80211_attrs', however it is assigned with 'enum nl80211_rate_info' values. Change the type of rate_flg accordingly. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-18mac80211: ibss: Fix channel type enum in ieee80211_sta_join_ibss()Matthias Kaehlcke
cfg80211_chandef_create() expects an 'enum nl80211_channel_type' as channel type however in ieee80211_sta_join_ibss() NL80211_CHAN_WIDTH_20_NOHT is passed in two occasions, which is of the enum type 'nl80211_chan_width'. Change the value to NL80211_CHAN_NO_HT (20 MHz, non-HT channel) of the channel type enum. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-18cfg80211: Fix array-bounds warning in fragment copyMatthias Kaehlcke
__ieee80211_amsdu_copy_frag intentionally initializes a pointer to array[-1] to increment it later to valid values. clang rightfully generates an array-bounds warning on the initialization statement. Initialize the pointer to array[0] and change the algorithm from increment before to increment after consume. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-18mac80211: keep a separate list of monitor interfaces that are upJohannes Berg
In addition to keeping monitor interfaces on the regular list of interfaces, keep those that are up and not in cooked mode on a separate list. This saves having to iterate all interfaces when delivering to monitor interfaces. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-18nl80211: add request id in scheduled scan event messagesArend Van Spriel
For multi-scheduled scan support in subsequent patch a request id will be added. This patch add this request id to the scheduled scan event messages. For now the request id will always be zero. With multi-scheduled scan its value will inform user-space to which scan the event relates. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-18af_key: Fix sadb_x_ipsecrequest parsingHerbert Xu
The parsing of sadb_x_ipsecrequest is broken in a number of ways. First of all we're not verifying sadb_x_ipsecrequest_len. This is needed when the structure carries addresses at the end. Worse we don't even look at the length when we parse those optional addresses. The migration code had similar parsing code that's better but it also has some deficiencies. The length is overcounted first of all as it includes the header itself. It also fails to check the length before dereferencing the sa_family field. This patch fixes those problems in parse_sockaddr_pair and then uses it in parse_ipsecrequest. Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-17net: rtnetlink: plumb extended ack to doit functionDavid Ahern
Add netlink_ext_ack arg to rtnl_doit_func. Pass extack arg to nlmsg_parse for doit functions that call it directly. This is the first step to using extended error reporting in rtnetlink. >From here individual subsystems can be updated to set netlink_ext_ack as needed. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17ipv6: sr: fix BUG due to headroom too small after SRH pushDavid Lebrun
When a locally generated packet receives an SRH with two or more segments, the remaining headroom is too small to push an ethernet header. This patch ensures that the headroom is large enough after SRH push. The BUG generated the following trace. [ 192.950285] skbuff: skb_under_panic: text:ffffffff81809675 len:198 put:14 head:ffff88006f306400 data:ffff88006f3063fa tail:0xc0 end:0x2c0 dev:A-1 [ 192.952456] ------------[ cut here ]------------ [ 192.953218] kernel BUG at net/core/skbuff.c:105! [ 192.953411] invalid opcode: 0000 [#1] PREEMPT SMP [ 192.953411] Modules linked in: [ 192.953411] CPU: 5 PID: 3433 Comm: ping6 Not tainted 4.11.0-rc3+ #237 [ 192.953411] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014 [ 192.953411] task: ffff88007c2d42c0 task.stack: ffffc90000ef4000 [ 192.953411] RIP: 0010:skb_panic+0x61/0x70 [ 192.953411] RSP: 0018:ffffc90000ef7900 EFLAGS: 00010286 [ 192.953411] RAX: 0000000000000085 RBX: 00000000000086dd RCX: 0000000000000201 [ 192.953411] RDX: 0000000080000201 RSI: ffffffff81d104c5 RDI: 00000000ffffffff [ 192.953411] RBP: ffffc90000ef7920 R08: 0000000000000001 R09: 0000000000000000 [ 192.953411] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 192.953411] R13: ffff88007c5a4000 R14: ffff88007b363d80 R15: 00000000000000b8 [ 192.953411] FS: 00007f94b558b700(0000) GS:ffff88007fd40000(0000) knlGS:0000000000000000 [ 192.953411] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 192.953411] CR2: 00007fff5ecd5080 CR3: 0000000074141000 CR4: 00000000001406e0 [ 192.953411] Call Trace: [ 192.953411] skb_push+0x3b/0x40 [ 192.953411] eth_header+0x25/0xc0 [ 192.953411] neigh_resolve_output+0x168/0x230 [ 192.953411] ? ip6_finish_output2+0x242/0x8f0 [ 192.953411] ip6_finish_output2+0x242/0x8f0 [ 192.953411] ? ip6_finish_output2+0x76/0x8f0 [ 192.953411] ip6_finish_output+0xa8/0x1d0 [ 192.953411] ip6_output+0x64/0x2d0 [ 192.953411] ? ip6_output+0x73/0x2d0 [ 192.953411] ? ip6_dst_check+0xb5/0xc0 [ 192.953411] ? dst_cache_per_cpu_get.isra.2+0x40/0x80 [ 192.953411] seg6_output+0xb0/0x220 [ 192.953411] lwtunnel_output+0xcf/0x210 [ 192.953411] ? lwtunnel_output+0x59/0x210 [ 192.953411] ip6_local_out+0x38/0x70 [ 192.953411] ip6_send_skb+0x2a/0xb0 [ 192.953411] ip6_push_pending_frames+0x48/0x50 [ 192.953411] rawv6_sendmsg+0xa39/0xf10 [ 192.953411] ? __lock_acquire+0x489/0x890 [ 192.953411] ? __mutex_lock+0x1fc/0x970 [ 192.953411] ? __lock_acquire+0x489/0x890 [ 192.953411] ? __mutex_lock+0x1fc/0x970 [ 192.953411] ? tty_ioctl+0x283/0xec0 [ 192.953411] inet_sendmsg+0x45/0x1d0 [ 192.953411] ? _copy_from_user+0x54/0x80 [ 192.953411] sock_sendmsg+0x33/0x40 [ 192.953411] SYSC_sendto+0xef/0x170 [ 192.953411] ? entry_SYSCALL_64_fastpath+0x5/0xc2 [ 192.953411] ? trace_hardirqs_on_caller+0x12b/0x1b0 [ 192.953411] ? trace_hardirqs_on_thunk+0x1a/0x1c [ 192.953411] SyS_sendto+0x9/0x10 [ 192.953411] entry_SYSCALL_64_fastpath+0x1f/0xc2 [ 192.953411] RIP: 0033:0x7f94b453db33 [ 192.953411] RSP: 002b:00007fff5ecd0578 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 192.953411] RAX: ffffffffffffffda RBX: 00007fff5ecd16e0 RCX: 00007f94b453db33 [ 192.953411] RDX: 0000000000000040 RSI: 000055a78352e9c0 RDI: 0000000000000003 [ 192.953411] RBP: 00007fff5ecd1690 R08: 000055a78352c940 R09: 000000000000001c [ 192.953411] R10: 0000000000000000 R11: 0000000000000246 R12: 000055a783321e10 [ 192.953411] R13: 000055a7839890c0 R14: 0000000000000004 R15: 0000000000000000 [ 192.953411] Code: 00 00 48 89 44 24 10 8b 87 c4 00 00 00 48 89 44 24 08 48 8b 87 d8 00 00 00 48 c7 c7 90 58 d2 81 48 89 04 24 31 c0 e8 4f 70 9a ff <0f> 0b 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 48 8b 97 d8 00 00 [ 192.953411] RIP: skb_panic+0x61/0x70 RSP: ffffc90000ef7900 [ 193.000186] ---[ end trace bd0b89fabdf2f92c ]--- [ 193.000951] Kernel panic - not syncing: Fatal exception in interrupt [ 193.001137] Kernel Offset: disabled [ 193.001169] ---[ end Kernel panic - not syncing: Fatal exception in interrupt Fixes: 19d5a26f5ef8de5dcb78799feaf404d717b1aac3 ("ipv6: sr: expand skb head only if necessary") Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17gso: Validate assumption of frag_list segementationIlan Tayari
Commit 07b26c9454a2 ("gso: Support partial splitting at the frag_list pointer") assumes that all SKBs in a frag_list (except maybe the last one) contain the same amount of GSO payload. This assumption is not always correct, resulting in the following warning message in the log: skb_segment: too many frags For example, mlx5 driver in Striding RQ mode creates some RX SKBs with one frag, and some with 2 frags. After GRO, the frag_list SKBs end up having different amounts of payload. If this frag_list SKB is then forwarded, the aforementioned assumption is violated. Validate the assumption, and fall back to software GSO if it not true. Fixes: 07b26c9454a2 ("gso: Support partial splitting at the frag_list pointer") Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17sctp: get list_of_streams of strreset outreq earlierXin Long
Now when processing strreset out responses, it gets outreq->list_of_streams only when result is performed. But if result is not performed, str_p will be NULL. It will cause panic in sctp_ulpevent_make_stream_reset_event if nums is not 0. This patch is to fix it by getting outreq->list_of_streams earlier, and also to improve some codes for the strreset inreq process. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17Add uid and cookie bpf helper to cg_skb_func_protoChenbo Feng
BPF helper functions get_socket_cookie and get_socket_uid can be used for network traffic classifications, among others. Expose them also to programs of type BPF_PROG_TYPE_CGROUP_SKB. As of commit 8f917bba0042 ("bpf: pass sk to helper functions") the required skb->sk function is available at both cgroup bpf ingress and egress hooks. With these two new helper, cg_skb_func_proto is effectively the same as sk_filter_func_proto. Change since V1: Instead of add the helper to cg_skb_func_proto, redirect the cg_skb_func_proto to sk_filter_func_proto since all helper function in sk_filter_func_proto are applicable to cg_skb_func_proto now. Signed-off-by: Chenbo Feng <fengc@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>