summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2017-04-28ipvs: remove unused function ip_vs_set_state_timeoutAaron Conole
There are no in-tree callers of this function and it isn't exported. Signed-off-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-04-28mac80211: add ieee80211_tx_status_extFelix Fietkau
This allows the driver to pass in struct ieee80211_tx_status directly. Make ieee80211_tx_status_noskb a wrapper around it. As with ieee80211_tx_status_noskb, there is no _ni variant of this call, because it probably won't be needed. Even if the driver won't provide any extra status info other than what's in struct ieee80211_tx_info already, it can optimize status reporting this way by passing in the station pointer. Signed-off-by: Felix Fietkau <nbd@nbd.name> [use C99 initializers] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-28mac80211: move ieee80211_tx_status_noskb below ieee80211_tx_statusFelix Fietkau
Makes further cleanups more readable Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-28mac80211: make rate control tx status API more extensibleFelix Fietkau
Rename .tx_status_noskb to .tx_status_ext and pass a new on-stack struct ieee80211_tx_status instead of struct ieee80211_tx_info. This struct can be used to pass extra information, e.g. for dynamic tx power control Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-28mac80211: use bitfield macros for encoded rateJohannes Berg
Instead of hand-coding the bit manipulations, use the bitfield macros to generate the code for the encoded bitrate. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-28mac80211: rename ieee80211_rx_status::vht_nss to just nssJohannes Berg
This field will need to be used again for HE, so rename it now. Again, mostly done with this spatch: @@ expression status; @@ -status->vht_nss +status->nss @@ expression status; @@ -status.vht_nss +status.nss Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-28mac80211: separate encoding/bandwidth from flagsJohannes Berg
We currently use a lot of flags that are mutually incompatible, separate this out into actual encoding and bandwidth enum values. Much of this again done with spatch, with manual post-editing, mostly to add the switch statements and get rid of the conversions. @@ expression status; @@ -status->enc_flags |= RX_ENC_FLAG_80MHZ +status->bw = RATE_INFO_BW_80 @@ expression status; @@ -status->enc_flags |= RX_ENC_FLAG_40MHZ +status->bw = RATE_INFO_BW_40 @@ expression status; @@ -status->enc_flags |= RX_ENC_FLAG_20MHZ +status->bw = RATE_INFO_BW_20 @@ expression status; @@ -status->enc_flags |= RX_ENC_FLAG_160MHZ +status->bw = RATE_INFO_BW_160 @@ expression status; @@ -status->enc_flags |= RX_ENC_FLAG_5MHZ +status->bw = RATE_INFO_BW_5 @@ expression status; @@ -status->enc_flags |= RX_ENC_FLAG_10MHZ +status->bw = RATE_INFO_BW_10 @@ expression status; @@ -status->enc_flags |= RX_ENC_FLAG_VHT +status->encoding = RX_ENC_VHT @@ expression status; @@ -status->enc_flags |= RX_ENC_FLAG_HT +status->encoding = RX_ENC_HT @@ expression status; @@ -status.enc_flags |= RX_ENC_FLAG_VHT +status.encoding = RX_ENC_VHT @@ expression status; @@ -status.enc_flags |= RX_ENC_FLAG_HT +status.encoding = RX_ENC_HT @@ expression status; @@ -(status->enc_flags & RX_ENC_FLAG_HT) +(status->encoding == RX_ENC_HT) @@ expression status; @@ -(status->enc_flags & RX_ENC_FLAG_VHT) +(status->encoding == RX_ENC_VHT) @@ expression status; @@ -(status->enc_flags & RX_ENC_FLAG_5MHZ) +(status->bw == RATE_INFO_BW_5) @@ expression status; @@ -(status->enc_flags & RX_ENC_FLAG_10MHZ) +(status->bw == RATE_INFO_BW_10) @@ expression status; @@ -(status->enc_flags & RX_ENC_FLAG_40MHZ) +(status->bw == RATE_INFO_BW_40) @@ expression status; @@ -(status->enc_flags & RX_ENC_FLAG_80MHZ) +(status->bw == RATE_INFO_BW_80) @@ expression status; @@ -(status->enc_flags & RX_ENC_FLAG_160MHZ) +(status->bw == RATE_INFO_BW_160) Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-28mac80211: clean up rate encoding bits in RX statusJohannes Berg
In preparation for adding support for HE rates, clean up the driver report encoding for rate/bandwidth reporting on RX frames. Much of this patch was done with the following spatch: @@ expression status; @@ -status->flag & (RX_FLAG_HT | RX_FLAG_VHT) +status->enc_flags & (RX_ENC_FLAG_HT | RX_ENC_FLAG_VHT) @@ assignment operator op; expression status; @@ -status->flag op RX_FLAG_SHORTPRE +status->enc_flags op RX_ENC_FLAG_SHORTPRE @@ expression status; @@ -status->flag & RX_FLAG_SHORTPRE +status->enc_flags & RX_ENC_FLAG_SHORTPRE @@ assignment operator op; expression status; @@ -status->flag op RX_FLAG_HT +status->enc_flags op RX_ENC_FLAG_HT @@ expression status; @@ -status->flag & RX_FLAG_HT +status->enc_flags & RX_ENC_FLAG_HT @@ assignment operator op; expression status; @@ -status->flag op RX_FLAG_40MHZ +status->enc_flags op RX_ENC_FLAG_40MHZ @@ expression status; @@ -status->flag & RX_FLAG_40MHZ +status->enc_flags & RX_ENC_FLAG_40MHZ @@ assignment operator op; expression status; @@ -status->flag op RX_FLAG_SHORT_GI +status->enc_flags op RX_ENC_FLAG_SHORT_GI @@ expression status; @@ -status->flag & RX_FLAG_SHORT_GI +status->enc_flags & RX_ENC_FLAG_SHORT_GI @@ assignment operator op; expression status; @@ -status->flag op RX_FLAG_HT_GF +status->enc_flags op RX_ENC_FLAG_HT_GF @@ expression status; @@ -status->flag & RX_FLAG_HT_GF +status->enc_flags & RX_ENC_FLAG_HT_GF @@ assignment operator op; expression status; @@ -status->flag op RX_FLAG_VHT +status->enc_flags op RX_ENC_FLAG_VHT @@ expression status; @@ -status->flag & RX_FLAG_VHT +status->enc_flags & RX_ENC_FLAG_VHT @@ assignment operator op; expression status; @@ -status->flag op RX_FLAG_STBC_MASK +status->enc_flags op RX_ENC_FLAG_STBC_MASK @@ expression status; @@ -status->flag & RX_FLAG_STBC_MASK +status->enc_flags & RX_ENC_FLAG_STBC_MASK @@ assignment operator op; expression status; @@ -status->flag op RX_FLAG_LDPC +status->enc_flags op RX_ENC_FLAG_LDPC @@ expression status; @@ -status->flag & RX_FLAG_LDPC +status->enc_flags & RX_ENC_FLAG_LDPC @@ assignment operator op; expression status; @@ -status->flag op RX_FLAG_10MHZ +status->enc_flags op RX_ENC_FLAG_10MHZ @@ expression status; @@ -status->flag & RX_FLAG_10MHZ +status->enc_flags & RX_ENC_FLAG_10MHZ @@ assignment operator op; expression status; @@ -status->flag op RX_FLAG_5MHZ +status->enc_flags op RX_ENC_FLAG_5MHZ @@ expression status; @@ -status->flag & RX_FLAG_5MHZ +status->enc_flags & RX_ENC_FLAG_5MHZ @@ assignment operator op; expression status; @@ -status->vht_flag op RX_VHT_FLAG_80MHZ +status->enc_flags op RX_ENC_FLAG_80MHZ @@ expression status; @@ -status->vht_flag & RX_VHT_FLAG_80MHZ +status->enc_flags & RX_ENC_FLAG_80MHZ @@ assignment operator op; expression status; @@ -status->vht_flag op RX_VHT_FLAG_160MHZ +status->enc_flags op RX_ENC_FLAG_160MHZ @@ expression status; @@ -status->vht_flag & RX_VHT_FLAG_160MHZ +status->enc_flags & RX_ENC_FLAG_160MHZ @@ assignment operator op; expression status; @@ -status->vht_flag op RX_VHT_FLAG_BF +status->enc_flags op RX_ENC_FLAG_BF @@ expression status; @@ -status->vht_flag & RX_VHT_FLAG_BF +status->enc_flags & RX_ENC_FLAG_BF @@ assignment operator op; expression status, STBC; @@ -status->flag op STBC << RX_FLAG_STBC_SHIFT +status->enc_flags op STBC << RX_ENC_FLAG_STBC_SHIFT @@ assignment operator op; expression status; @@ -status.flag op RX_FLAG_SHORTPRE +status.enc_flags op RX_ENC_FLAG_SHORTPRE @@ expression status; @@ -status.flag & RX_FLAG_SHORTPRE +status.enc_flags & RX_ENC_FLAG_SHORTPRE @@ assignment operator op; expression status; @@ -status.flag op RX_FLAG_HT +status.enc_flags op RX_ENC_FLAG_HT @@ expression status; @@ -status.flag & RX_FLAG_HT +status.enc_flags & RX_ENC_FLAG_HT @@ assignment operator op; expression status; @@ -status.flag op RX_FLAG_40MHZ +status.enc_flags op RX_ENC_FLAG_40MHZ @@ expression status; @@ -status.flag & RX_FLAG_40MHZ +status.enc_flags & RX_ENC_FLAG_40MHZ @@ assignment operator op; expression status; @@ -status.flag op RX_FLAG_SHORT_GI +status.enc_flags op RX_ENC_FLAG_SHORT_GI @@ expression status; @@ -status.flag & RX_FLAG_SHORT_GI +status.enc_flags & RX_ENC_FLAG_SHORT_GI @@ assignment operator op; expression status; @@ -status.flag op RX_FLAG_HT_GF +status.enc_flags op RX_ENC_FLAG_HT_GF @@ expression status; @@ -status.flag & RX_FLAG_HT_GF +status.enc_flags & RX_ENC_FLAG_HT_GF @@ assignment operator op; expression status; @@ -status.flag op RX_FLAG_VHT +status.enc_flags op RX_ENC_FLAG_VHT @@ expression status; @@ -status.flag & RX_FLAG_VHT +status.enc_flags & RX_ENC_FLAG_VHT @@ assignment operator op; expression status; @@ -status.flag op RX_FLAG_STBC_MASK +status.enc_flags op RX_ENC_FLAG_STBC_MASK @@ expression status; @@ -status.flag & RX_FLAG_STBC_MASK +status.enc_flags & RX_ENC_FLAG_STBC_MASK @@ assignment operator op; expression status; @@ -status.flag op RX_FLAG_LDPC +status.enc_flags op RX_ENC_FLAG_LDPC @@ expression status; @@ -status.flag & RX_FLAG_LDPC +status.enc_flags & RX_ENC_FLAG_LDPC @@ assignment operator op; expression status; @@ -status.flag op RX_FLAG_10MHZ +status.enc_flags op RX_ENC_FLAG_10MHZ @@ expression status; @@ -status.flag & RX_FLAG_10MHZ +status.enc_flags & RX_ENC_FLAG_10MHZ @@ assignment operator op; expression status; @@ -status.flag op RX_FLAG_5MHZ +status.enc_flags op RX_ENC_FLAG_5MHZ @@ expression status; @@ -status.flag & RX_FLAG_5MHZ +status.enc_flags & RX_ENC_FLAG_5MHZ @@ assignment operator op; expression status; @@ -status.vht_flag op RX_VHT_FLAG_80MHZ +status.enc_flags op RX_ENC_FLAG_80MHZ @@ expression status; @@ -status.vht_flag & RX_VHT_FLAG_80MHZ +status.enc_flags & RX_ENC_FLAG_80MHZ @@ assignment operator op; expression status; @@ -status.vht_flag op RX_VHT_FLAG_160MHZ +status.enc_flags op RX_ENC_FLAG_160MHZ @@ expression status; @@ -status.vht_flag & RX_VHT_FLAG_160MHZ +status.enc_flags & RX_ENC_FLAG_160MHZ @@ assignment operator op; expression status; @@ -status.vht_flag op RX_VHT_FLAG_BF +status.enc_flags op RX_ENC_FLAG_BF @@ expression status; @@ -status.vht_flag & RX_VHT_FLAG_BF +status.enc_flags & RX_ENC_FLAG_BF @@ assignment operator op; expression status, STBC; @@ -status.flag op STBC << RX_FLAG_STBC_SHIFT +status.enc_flags op STBC << RX_ENC_FLAG_STBC_SHIFT @@ @@ -RX_FLAG_STBC_SHIFT +RX_ENC_FLAG_STBC_SHIFT Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-27NFSv4: Fix callback server shutdownTrond Myklebust
We want to use kthread_stop() in order to ensure the threads are shut down before we tear down the nfs_callback_info in nfs_callback_down. Tested-and-reviewed-by: Kinglong Mee <kinglongmee@gmail.com> Reported-by: Kinglong Mee <kinglongmee@gmail.com> Fixes: bb6aeba736ba9 ("NFSv4.x: Switch to using svc_set_num_threads()...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-04-27SUNRPC: Refactor svc_set_num_threads()Trond Myklebust
Refactor to separate out the functions of starting and stopping threads so that they can be used in other helpers. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-and-reviewed-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-04-27fib_rules: fix error return codeWei Yongjun
Fix to return error code -EINVAL from the error handling case instead of 0, as done elsewhere in this function. Fixes: 622ec2c9d524 ("net: core: add UID to flows, rules, and routes") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27bridge: add per-port broadcast flood flagMike Manning
Support for l2 multicast flood control was added in commit b6cb5ac8331b ("net: bridge: add per-port multicast flood flag"). It allows broadcast as it was introduced specifically for unknown multicast flood control. But as broadcast is a special case of multicast, this may also need to be disabled. For this purpose, introduce a flag to disable the flooding of received l2 broadcasts. This approach is backwards compatible and provides flexibility in filtering for the desired packet types. Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Mike Manning <mmanning@brocade.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27net: fib: Decrease one unnecessary rt cache flush in fib_disable_ipGao Feng
The func fib_flush already flushes the rt cache if necessary, so it is not necessary to invoke rt_cache_flush again in fib_disable_ip. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27l2tp: remove useless device duplication test in l2tp_eth_create()Guillaume Nault
There's no need to verify that cfg->ifname is unique at this point. register_netdev() will return -EEXIST if asked to create a device with a name that's alrealy in use. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27net: remove unnecessary carrier status checkZhang Shengju
Since netif_carrier_on() will do nothing if device's carrier is already on, so it's unnecessary to do carrier status check. It's the same for netif_carrier_off(). Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: - fix orangefs handling of faults on write() - I'd missed that one back when orangefs was going through review. - readdir counterpart of "9p: cope with bogus responses from server in p9_client_{read,write}" - server might be lying or broken, and we'd better not overrun the kmalloc'ed buffer we are copying the results into. - NFS O_DIRECT read/write can leave iov_iter advanced by too much; that's what had been causing iov_iter_pipe() warnings davej had been seeing. - statx_timestamp.tv_nsec type fix (s32 -> u32). That one really should go in before 4.11. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: uapi: change the type of struct statx_timestamp.tv_nsec to unsigned fix nfs O_DIRECT advancing iov_iter too much p9_client_readdir() fix orangefs_bufmap_copy_from_iovec(): fix EFAULT handling
2017-04-27tcp: tcp_rack_reo_timeout() must update tp->tcp_mstampEric Dumazet
I wrongly assumed tp->tcp_mstamp was up to date at the time tcp_rack_reo_timeout() was called. It is not true, since we only update tcp->tcp_mstamp when receiving a packet (as initially done in commit 69e996c58a35 ("tcp: add tp->tcp_mstamp field") tcp_rack_reo_timeout() being called by a timer and not an incoming packet, we need to refresh tp->tcp_mstamp Fixes: 7c1c7308592f ("tcp: do not pass timestamp to tcp_rack_detect_loss()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27xfrm: fix GRO for !CONFIG_NETFILTERSabrina Dubroca
In xfrm_input() when called from GRO, async == 0, and we end up skipping the processing in xfrm4_transport_finish(). GRO path will always skip the NF_HOOK, so we don't need the special-case for !NETFILTER during GRO processing. Fixes: 7785bba299a8 ("esp: Add a software GRO codepath") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-27can: fix CAN BCM build with CONFIG_PROC_FS disabledOliver Hartkopp
The introduced namespace support moved the BCM variables for procfs into a per-net data structure. This leads to a build failure with disabled procfs: on x86_64: when CONFIG_PROC_FS is not enabled: ../net/can/bcm.c:1541:14: error: 'struct netns_can' has no member named 'bcmproc_dir' ../net/can/bcm.c:1601:14: error: 'struct netns_can' has no member named 'bcmproc_dir' ../net/can/bcm.c:1696:11: error: 'struct netns_can' has no member named 'bcmproc_dir' ../net/can/bcm.c:1707:15: error: 'struct netns_can' has no member named 'bcmproc_dir' http://marc.info/?l=linux-can&m=149321842526524&w=2 Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-04-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26mac80211: make multicast variable a bool in ieee80211_accept_frame()Luca Coelho
The multicast variable in the ieee80211_accept_frame() function is treated as a boolean, but defined as int. Fix that. Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-26mac80211: disentangle iflist_mtx and chanctx_mtxJohannes Berg
At least on iwlwifi, sometimes lockdep complains that we can lock chanctx_mtx -> mvm.mutex -> iflist_mtx (due to iterate_interfaces) and iflist_mtx -> chanctx_mtx Remove the latter dependency in mac80211 by using the RTNL that we already hold in one case, and can relatively easily achieve in the other case. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-26mac80211: don't parse encrypted management frames in ieee80211_frame_ackedEmmanuel Grumbach
ieee80211_frame_acked is called when a frame is acked by the peer. In case this is a management frame, we check if this an SMPS frame, in which case we can update our antenna configuration. When we parse the management frame we look at the category in case it is an action frame. That byte sits after the IV in case the frame was encrypted. This means that if the frame was encrypted, we basically look at the IV instead of looking at the category. It is then theorically possible that we think that an SMPS action frame was acked where really we had another frame that was encrypted. Since the only management frame whose ack needs to be tracked is the SMPS action frame, and that frame is not a robust management frame, it will never be encrypted. The easiest way to fix this problem is then to not look at frames that were encrypted. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-26cfg80211: add request id parameter to .sched_scan_stop() signatureArend Van Spriel
For multiple scheduled scan support the driver needs to know which scheduled scan request is being stopped. Pass the request id in the .sched_scan_stop() callback. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-26nl80211: add support for BSSIDs in scheduled scan matchsetsArend Van Spriel
This patch allows for the scheduled scan request to specify matchsets for specific BSSIDs. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> [docs, netlink policy fix] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-26nl80211: allow multiple active scheduled scan requestsArend Van Spriel
This patch implements the idea to have multiple scheduled scan requests running concurrently. It mainly illustrates how to deal with the incoming request from user-space in terms of backward compatibility. In order to use multiple scheduled scans user-space needs to provide a flag attribute NL80211_ATTR_SCHED_SCAN_MULTI to indicate support. If not the request is treated as a legacy scan. Drivers currently supporting scheduled scan are now indicating they support a single scheduled scan request. This obsoletes WIPHY_FLAG_SUPPORTS_SCHED_SCAN. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> [clean up netlink destroy path to avoid allocations, code cleanups] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-26cfg80211: simplify netlink socket owner interface deletionJohannes Berg
There's no need to allocate a portid structure and then, for each of those, walk the interfaces - we can just add a flag to each interface and walk those directly. Due to padding in the struct, we can even do it without any memory cost, and it even simplifies the code. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-26ipv6: check raw payload size correctly in ioctlJamie Bainbridge
In situations where an skb is paged, the transport header pointer and tail pointer can be the same because the skb contents are in frags. This results in ioctl(SIOCINQ/FIONREAD) incorrectly returning a length of 0 when the length to receive is actually greater than zero. skb->len is already correctly set in ip6_input_finish() with pskb_pull(), so use skb->len as it always returns the correct result for both linear and paged data. Signed-off-by: Jamie Bainbridge <jbainbri@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26tcp: memset ca_priv data to 0 properlyWei Wang
Always zero out ca_priv data in tcp_assign_congestion_control() so that ca_priv data is cleared out during socket creation. Also always zero out ca_priv data in tcp_reinit_congestion_control() so that when cc algorithm is changed, ca_priv data is cleared out as well. We should still zero out ca_priv data even in TCP_CLOSE state because user could call connect() on AF_UNSPEC to disconnect the socket and leave it in TCP_CLOSE state and later call setsockopt() to switch cc algorithm on this socket. Fixes: 2b0a8c9ee ("tcp: add CDG congestion control") Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26ipv6: check skb->protocol before lookup for nexthopWANG Cong
Andrey reported a out-of-bound access in ip6_tnl_xmit(), this is because we use an ipv4 dst in ip6_tnl_xmit() and cast an IPv4 neigh key as an IPv6 address: neigh = dst_neigh_lookup(skb_dst(skb), &ipv6_hdr(skb)->daddr); if (!neigh) goto tx_err_link_failure; addr6 = (struct in6_addr *)&neigh->primary_key; // <=== HERE addr_type = ipv6_addr_type(addr6); if (addr_type == IPV6_ADDR_ANY) addr6 = &ipv6_hdr(skb)->daddr; memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr)); Also the network header of the skb at this point should be still IPv4 for 4in6 tunnels, we shold not just use it as IPv6 header. This patch fixes it by checking if skb->protocol is ETH_P_IPV6: if it is, we are safe to do the nexthop lookup using skb_dst() and ipv6_hdr(skb)->daddr; if not (aka IPv4), we have no clue about which dest address we can pick here, we have to rely on callers to fill it from tunnel config, so just fall to ip6_route_output() to make the decision. Fixes: ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel endpoints.") Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26net: core: Prevent from dereferencing null pointer when releasing SKBMyungho Jung
Added NULL check to make __dev_kfree_skb_irq consistent with kfree family of functions. Link: https://bugzilla.kernel.org/show_bug.cgi?id=195289 Signed-off-by: Myungho Jung <mhjungk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26tcp: switch rcv_rtt_est and rcvq_space to high resolution timestampsEric Dumazet
Some devices or distributions use HZ=100 or HZ=250 TCP receive buffer autotuning has poor behavior caused by this choice. Since autotuning happens after 4 ms or 10 ms, short distance flows get their receive buffer tuned to a very high value, but after an initial period where it was frozen to (too small) initial value. With tp->tcp_mstamp introduction, we can switch to high resolution timestamps almost for free (at the expense of 8 additional bytes per TCP structure) Note that some TCP stacks use usec TCP timestamps where this patch makes even more sense : Many TCP flows have < 500 usec RTT. Hopefully this finer TS option can be standardized soon. Tested: HZ=100 kernel ./netperf -H lpaa24 -t TCP_RR -l 1000 -- -r 10000,10000 & Peer without patch : lpaa24:~# ss -tmi dst lpaa23 ... skmem:(r0,rb8388608,...) rcv_rtt:10 rcv_space:3210000 minrtt:0.017 Peer with the patch : lpaa23:~# ss -tmi dst lpaa24 ... skmem:(r0,rb428800,...) rcv_rtt:0.069 rcv_space:30000 minrtt:0.017 We can see saner RCVBUF, and more precise rcv_rtt information. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26tcp: remove ack_time from struct tcp_sacktag_stateEric Dumazet
It is no longer needed, everything uses tp->tcp_mstamp instead. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26tcp: use tp->tcp_mstamp in tcp_clean_rtx_queue()Eric Dumazet
Following patch will remove ack_time from struct tcp_sacktag_state Same info is now found in tp->tcp_mstamp Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26tcp: do not pass timestamp to tcp_rack_advance()Eric Dumazet
No longer needed, since tp->tcp_mstamp holds the information. This is needed to remove sack_state.ack_time in a following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26tcp: do not pass timestamp to tcp_rate_gen()Eric Dumazet
No longer needed, since tp->tcp_mstamp holds the information. This is needed to remove sack_state.ack_time in a following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26tcp: do not pass timestamp to tcp_fastretrans_alert()Eric Dumazet
Not used anymore now tp->tcp_mstamp holds the information. This is needed to remove sack_state.ack_time in a following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26tcp: do not pass timestamp to tcp_rack_identify_loss()Eric Dumazet
Not used anymore now tp->tcp_mstamp holds the information. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26tcp: do not pass timestamp to tcp_rack_mark_lost()Eric Dumazet
This is no longer used, since tcp_rack_detect_loss() takes the timestamp from tp->tcp_mstamp Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26tcp: do not pass timestamp to tcp_rack_detect_loss()Eric Dumazet
We can use tp->tcp_mstamp as it contains a recent timestamp. This removes a call to skb_mstamp_get() from tcp_rack_reo_timeout() Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26tcp: add tp->tcp_mstamp fieldEric Dumazet
We want to use precise timestamps in TCP stack, but we do not want to call possibly expensive kernel time services too often. tp->tcp_mstamp is guaranteed to be updated once per incoming packet. We will use it in the following patches, removing specific skb_mstamp_get() calls, and removing ack_time from struct tcp_sacktag_state. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26Merge branches 'uaccess.alpha', 'uaccess.arc', 'uaccess.arm', ↵Al Viro
'uaccess.arm64', 'uaccess.avr32', 'uaccess.bfin', 'uaccess.c6x', 'uaccess.cris', 'uaccess.frv', 'uaccess.h8300', 'uaccess.hexagon', 'uaccess.ia64', 'uaccess.m32r', 'uaccess.m68k', 'uaccess.metag', 'uaccess.microblaze', 'uaccess.mips', 'uaccess.mn10300', 'uaccess.nios2', 'uaccess.openrisc', 'uaccess.parisc', 'uaccess.powerpc', 'uaccess.s390', 'uaccess.score', 'uaccess.sh', 'uaccess.sparc', 'uaccess.tile', 'uaccess.um', 'uaccess.unicore32', 'uaccess.x86' and 'uaccess.xtensa' into work.uaccess
2017-04-26xfrm: do the garbage collection after flushing policyXin Long
Now xfrm garbage collection can be triggered by 'ip xfrm policy del'. These is no reason not to do it after flushing policies, especially considering that 'garbage collection deferred' is only triggered when it reaches gc_thresh. It's no good that the policy is gone but the xdst still hold there. The worse thing is that xdst->route/orig_dst is also hold and can not be released even if the orig_dst is already expired. This patch is to do the garbage collection if there is any policy removed in xfrm_policy_flush. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-26netfilter: don't attach a nat extension by defaultFlorian Westphal
nowadays the NAT extension only stores the interface index (used to purge connections that got masqueraded when interface goes down) and pptp nat information. Previous patches moved nf_ct_nat_ext_add to those places that need it. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-26netfilter: pptp: attach nat extension when neededFlorian Westphal
make sure nat extension gets added if the master conntrack is subject to NAT. This will be required once the nat core stops adding it by default. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-26netfilter: masquerade: attach nat extension if not presentFlorian Westphal
Currently the nat extension is always attached as soon as nat module is loaded. However, most NAT uses do not need the nat extension anymore. Prepare to remove the add-nat-by-default by making those places that need it attach it if its not present yet. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-26netfilter: conntrack: handle initial extension alloc via kreallocFlorian Westphal
krealloc(NULL, ..) is same as kmalloc(), so we can avoid special-casing the initial allocation after the prealloc removal (we had to use ->alloc_len as the initial allocation size). This also means we do not zero the preallocated memory anymore; only offsets[]. Existing code makes sure the new (used) extension space gets zeroed out. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-26netfilter: conntrack: mark extension structs as constFlorian Westphal
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-26netfilter: conntrack: remove prealloc supportFlorian Westphal
It was used by the nat extension, but since commit 7c9664351980 ("netfilter: move nat hlist_head to nf_conn") its only needed for connections that use MASQUERADE target or a nat helper. Also it seems a lot easier to preallocate a fixed size instead. With default settings, conntrack first adds ecache extension (sysctl defaults to 1), so we get 40(ct extension header) + 24 (ecache) == 64 byte on x86_64 for initial allocation. Followup patches can constify the extension structs and avoid the initial zeroing of the entire extension area. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-26netfilter: SYNPROXY: Return NF_STOLEN instead of NF_DROP during handshakingGao Feng
Current SYNPROXY codes return NF_DROP during normal TCP handshaking, it is not friendly to caller. Because the nf_hook_slow would treat the NF_DROP as an error, and return -EPERM. As a result, it may cause the top caller think it meets one error. For example, the following codes are from cfv_rx_poll() err = netif_receive_skb(skb); if (unlikely(err)) { ++cfv->ndev->stats.rx_dropped; } else { ++cfv->ndev->stats.rx_packets; cfv->ndev->stats.rx_bytes += skb_len; } When SYNPROXY returns NF_DROP, then netif_receive_skb returns -EPERM. As a result, the cfv driver would treat it as an error, and increase the rx_dropped counter. So use NF_STOLEN instead of NF_DROP now because there is no error happened indeed, and free the skb directly. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>