summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2013-02-10arp: fix possible crash in arp_rcv()Eric Dumazet
We should call skb_share_check() before pskb_may_pull(), or we can crash in pskb_expand_head() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-10net/8021q: Implement Multiple VLAN Registration Protocol (MVRP)David Ward
Initial implementation of the Multiple VLAN Registration Protocol (MVRP) from IEEE 802.1Q-2011, based on the existing implementation of the GARP VLAN Registration Protocol (GVRP). Signed-off-by: David Ward <david.ward@ll.mit.edu> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-10net/802: Implement Multiple Registration Protocol (MRP)David Ward
Initial implementation of the Multiple Registration Protocol (MRP) from IEEE 802.1Q-2011, based on the existing implementation of the Generic Attribute Registration Protocol (GARP). Signed-off-by: David Ward <david.ward@ll.mit.edu> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-10VSOCK: Introduce VM SocketsAndy King
VM Sockets allows communication between virtual machines and the hypervisor. User level applications both in a virtual machine and on the host can use the VM Sockets API, which facilitates fast and efficient communication between guest virtual machines and their host. A socket address family, designed to be compatible with UDP and TCP at the interface level, is provided. Today, VM Sockets is used by various VMware Tools components inside the guest for zero-config, network-less access to VMware host services. In addition to this, VMware's users are using VM Sockets for various applications, where network access of the virtual machine is restricted or non-existent. Examples of this are VMs communicating with device proxies for proprietary hardware running as host applications and automated testing of applications running within virtual machines. The VMware VM Sockets are similar to other socket types, like Berkeley UNIX socket interface. The VM Sockets module supports both connection-oriented stream sockets like TCP, and connectionless datagram sockets like UDP. The VM Sockets protocol family is defined as "AF_VSOCK" and the socket operations split for SOCK_DGRAM and SOCK_STREAM. For additional information about the use of VM Sockets, please refer to the VM Sockets Programming Guide available at: https://www.vmware.com/support/developer/vmci-sdk/ Signed-off-by: George Zhang <georgezhang@vmware.com> Signed-off-by: Dmitry Torokhov <dtor@vmware.com> Signed-off-by: Andy king <acking@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Synchronize with 'net' in order to sort out some l2tp, wireless, and ipv6 GRE fixes that will be built on top of in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08net: sctp: sctp_auth_make_key_vector: use sctp_auth_create_keyDaniel Borkmann
In sctp_auth_make_key_vector, we allocate a temporary sctp_auth_bytes structure with kmalloc instead of the sctp_auth_create_key allocator. Change this to sctp_auth_create_key as it is the case everywhere else, so that we also can properly free it via sctp_auth_key_put. This makes it easier for future code changes in the structure and allocator itself, since a single API is consistently used for this purpose. Also, by using sctp_auth_create_key we're doing sanity checks over the arguments. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08ipv6: fix a RCU warning in net/ipv6/ip6_flowlabel.cAmerigo Wang
This patch fixes the following RCU warning: [ 51.680236] =============================== [ 51.681914] [ INFO: suspicious RCU usage. ] [ 51.683610] 3.8.0-rc6-next-20130206-sasha-00028-g83214f7-dirty #276 Tainted: G W [ 51.686703] ------------------------------- [ 51.688281] net/ipv6/ip6_flowlabel.c:671 suspicious rcu_dereference_check() usage! we should use rcu_dereference_bh() when we hold rcu_read_lock_bh(). Reported-by: Sasha Levin <sasha.levin@oracle.com> Cc: David S. Miller <davem@davemloft.net> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08skbuff: Move definition of NETDEV_FRAG_PAGE_MAX_SIZEAlexander Duyck
In order to address the fact that some devices cannot support the full 32K frag size we need to have the value accessible somewhere so that we can use it to do comparisons against what the device can support. As such I am moving the values out of skbuff.c and into skbuff.h. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Revert iwlwifi reclaimed packet tracking, it causes problems for a bunch of folks. From Emmanuel Grumbach. 2) Work limiting code in brcmsmac wifi driver can clear tx status without processing the event. From Arend van Spriel. 3) rtlwifi USB driver processes wrong SKB, fix from Larry Finger. 4) l2tp tunnel delete can race with close, fix from Tom Parkin. 5) pktgen_add_device() failures are not checked at all, fix from Cong Wang. 6) Fix unintentional removal of carrier off from tun_detach(), otherwise we confuse userspace, from Michael S. Tsirkin. 7) Don't leak socket reference counts and ubufs in vhost-net driver, from Jason Wang. 8) vmxnet3 driver gets it's initial carrier state wrong, fix from Neil Horman. 9) Protect against USB networking devices which spam the host with 0 length frames, from Bjørn Mork. 10) Prevent neighbour overflows in ipv6 for locally destined routes, from Marcelo Ricardo. This is the best short-term fix for this, a longer term fix has been implemented in net-next. 11) L2TP uses ipv4 datagram routines in it's ipv6 code, whoops. This mistake is largely because the ipv6 functions don't even have some kind of prefix in their names to suggest they are ipv6 specific. From Tom Parkin. 12) Check SYN packet drops properly in tcp_rcv_fastopen_synack(), from Yuchung Cheng. 13) Fix races and TX skb freeing bugs in via-rhine's NAPI support, from Francois Romieu and your's truly. 14) Fix infinite loops and divides by zero in TCP congestion window handling, from Eric Dumazet, Neal Cardwell, and Ilpo Järvinen. 15) AF_PACKET tx ring handling can leak kernel memory to userspace, fix from Phil Sutter. 16) Fix error handling in ipv6 GRE tunnel transmit, from Tommi Rantala. 17) Protect XEN netback driver against hostile frontend putting garbage into the rings, don't leak pages in TX GOP checking, and add proper resource releasing in error path of xen_netbk_get_requests(). From Ian Campbell. 18) SCTP authentication keys should be cleared out and released with kzfree(), from Daniel Borkmann. 19) L2TP is a bit too clever trying to maintain skb->truesize, and ends up corrupting socket memory accounting to the point where packet sending is halted indefinitely. Just remove the adjustments entirely, they aren't really needed. From Eric Dumazet. 20) ATM Iphase driver uses a data type with the same name as the S390 headers, rename to fix the build. From Heiko Carstens. 21) Fix a typo in copying the inner network header offset from one SKB to another, from Pravin B Shelar. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits) net: sctp: sctp_endpoint_free: zero out secret key data net: sctp: sctp_setsockopt_auth_key: use kzfree instead of kfree atm/iphase: rename fregt_t -> ffreg_t net: usb: fix regression from FLAG_NOARP code l2tp: dont play with skb->truesize net: sctp: sctp_auth_key_put: use kzfree instead of kfree netback: correct netbk_tx_err to handle wrap around. xen/netback: free already allocated memory on failure in xen_netbk_get_requests xen/netback: don't leak pages on failure in xen_netbk_tx_check_gop. xen/netback: shutdown the ring if it contains garbage. net: qmi_wwan: add more Huawei devices, including E320 net: cdc_ncm: add another Huawei vendor specific device ipv6/ip6_gre: fix error case handling in ip6gre_tunnel_xmit() tcp: fix for zero packets_in_flight was too broad brcmsmac: rework of mac80211 .flush() callback operation ssb: unregister gpios before unloading ssb bcma: unregister gpios before unloading bcma rtlwifi: Fix scheduling while atomic bug net: usbnet: fix tx_dropped statistics tcp: ipv6: Update MIB counters for drops ...
2013-02-08sunrpc: trim off trailing checksum before returning decrypted or integrity ↵Jeff Layton
authenticated buffer When GSSAPI integrity signatures are in use, or when we're using GSSAPI privacy with the v2 token format, there is a trailing checksum on the xdr_buf that is returned. It's checked during the authentication stage, and afterward nothing cares about it. Ordinarily, it's not a problem since the XDR code generally ignores it, but it will be when we try to compute a checksum over the buffer to help prevent XID collisions in the duplicate reply cache. Fix the code to trim off the checksums after verifying them. Note that in unwrap_integ_data, we must avoid trying to reverify the checksum if the request was deferred since it will no longer be present when it's revisited. Signed-off-by: Jeff Layton <jlayton@redhat.com>
2013-02-08net: sctp: sctp_endpoint_free: zero out secret key dataDaniel Borkmann
On sctp_endpoint_destroy, previously used sensitive keying material should be zeroed out before the memory is returned, as we already do with e.g. auth keys when released. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08net: sctp: sctp_setsockopt_auth_key: use kzfree instead of kfreeDaniel Borkmann
In sctp_setsockopt_auth_key, we create a temporary copy of the user passed shared auth key for the endpoint or association and after internal setup, we free it right away. Since it's sensitive data, we should zero out the key before returning the memory back to the allocator. Thus, use kzfree instead of kfree, just as we do in sctp_auth_key_put(). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08NFC: llcp: integer underflow in nfc_llcp_set_remote_gb()Dan Carpenter
If gb_len is less than 3 it would cause an integer underflow and possibly memory corruption in nfc_llcp_parse_gb_tlv(). I removed the old test for gb_len == 0. I also removed the test for ->remote_gb == NULL. It's not possible for ->remote_gb to be NULL and we have already dereferenced ->remote_gb_len so it's too late to test. The old test return -ENODEV but my test returns -EINVAL. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-02-08Merge branch 'for-john' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Fixed-up drivers/net/wireless/iwlwifi/mvm/mac80211.c to change change IEEE80211_HW_NEED_DTIM_PERIOD to IEEE80211_HW_NEED_DTIM_BEFORE_ASSOC as requested by Johannes Berg. -- JWL Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-02-08Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
2013-02-08l2tp: dont play with skb->truesizeEric Dumazet
Andrew Savchenko reported a DNS failure and we diagnosed that some UDP sockets were unable to send more packets because their sk_wmem_alloc was corrupted after a while (tx_queue column in following trace) $ cat /proc/net/udp sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode ref pointer drops ... 459: 00000000:0270 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4507 2 ffff88003d612380 0 466: 00000000:0277 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4802 2 ffff88003d613180 0 470: 076A070A:007B 00000000:0000 07 FFFF4600:00000000 00:00000000 00000000 123 0 5552 2 ffff880039974380 0 470: 010213AC:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4986 2 ffff88003dbd3180 0 470: 010013AC:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4985 2 ffff88003dbd2e00 0 470: 00FCA8C0:007B 00000000:0000 07 FFFFFB00:00000000 00:00000000 00000000 0 0 4984 2 ffff88003dbd2a80 0 ... Playing with skb->truesize is tricky, especially when skb is attached to a socket, as we can fool memory charging. Just remove this code, its not worth trying to be ultra precise in xmit path. Reported-by: Andrew Savchenko <bircoph@gmail.com> Tested-by: Andrew Savchenko <bircoph@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07net: sctp: sctp_auth_make_key_vector: remove duplicate ntohs callsDaniel Borkmann
Instead of calling 3 times ntohs(random->param_hdr.length), 2 times ntohs(hmacs->param_hdr.length), and 3 times ntohs(chunks->param_hdr.length) within the same function, we only call each once and store it in a variable. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07net: sctp: sctp_auth_key_put: use kzfree instead of kfreeDaniel Borkmann
For sensitive data like keying material, it is common practice to zero out keys before returning the memory back to the allocator. Thus, use kzfree instead of kfree. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07Merge branch 'fixes' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch into openvswitch Jesse Gross says: ==================== One bug fix for net/3.8 for a long standing problem that was reported a few times recently. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07mac80211: fix managed mode channel context useJohannes Berg
My commit f2d9d270c15ae0139b54a7e7466d738327e97e03 ("mac80211: support VHT association") introduced a very stupid bug: the loop to downgrade the channel width never attempted to actually use it again so it would downgrade all the way to 20_NOHT. Fix it. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-02-07netfilter: ip6t_NPT: Ensure to check lower part of prefixes are zeroYOSHIFUJI Hideaki / 吉藤英明
RFC 6296 points that address bits that are not part of the prefix has to be zeroed. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-02-07netfilter: ip6t_NPT: Fix prefix manglingYOSHIFUJI Hideaki / 吉藤英明
Make sure only the bits that are part of the prefix are mangled. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-02-07netfilter: ip6t_NPT: Fix adjustment calculationYOSHIFUJI Hideaki / 吉藤英明
Cast __wsum from/to __sum16 is wrong. Instead, apply appropriate conversion function: csum_unfold() or csum_fold(). [ The original patch has been modified to undo the final ~ that csum_fold returns. We only need to fold the 32-bit word that results from the checksum calculation into a 16-bit to ensure that the original subnet is restored appropriately. Spotted by Ulrich Weber. ] Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-02-06ipv6/ip6_gre: fix error case handling in ip6gre_tunnel_xmit()Tommi Rantala
ip6gre_tunnel_xmit() is leaking the skb when we hit this error branch, and the -1 return value from this function is bogus. Use the error handling we already have in place in ip6gre_tunnel_xmit() for this error case to fix this. Signed-off-by: Tommi Rantala <tt.rantala@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06net: reset mac header in dev_start_xmit()Eric Dumazet
On 64 bit arches : There is a off-by-one error in qdisc_pkt_len_init() because mac_header is not set in xmit path. skb_mac_header() returns an out of bound value that was harmless because hdr_len is an 'unsigned int' On 32bit arches, the error is abysmal. This patch is also a prereq for "macvlan: add multicast filter" Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06net: adjust skb_gso_segment() for calling in rx pathCong Wang
skb_gso_segment() is almost always called in tx path, except for openvswitch. It calls this function when it receives the packet and tries to queue it to user-space. In this special case, the ->ip_summed check inside skb_gso_segment() is no longer true, as ->ip_summed value has different meanings on rx path. This patch adjusts skb_gso_segment() so that we can at least avoid such warnings on checksum. Cc: Jesse Gross <jesse@nicira.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06wpan: use stack buffer instead of heapAlexander Aring
head buffer is only temporary available in mac802154_header_create. So it's not necessary to put it on the heap. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-066lowpan: use stack buffer instead of heapAlexander Aring
head buffer is only temporary available in lowpan_header_create. So it's not necessary to put it on the heap. Also fixed a comment codestyle issue. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-066lowpan: Remove __init tag from lowpan_netlink_fini().David S. Miller
It's called from both __init and __exit code, so neither tag is appropriate. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06tcp: fix for zero packets_in_flight was too broadIlpo Järvinen
There are transients during normal FRTO procedure during which the packets_in_flight can go to zero between write_queue state updates and firing the resulting segments out. As FRTO processing occurs during that window the check must be more precise to not match "spuriously" :-). More specificly, e.g., when packets_in_flight is zero but FLAG_DATA_ACKED is true the problematic branch that set cwnd into zero would not be taken and new segments might be sent out later. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06netpoll: protect napi_poll and poll_controller during dev_[open|close]Neil Horman
Ivan Vercera was recently backporting commit 9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that, while this patch protects the tg3 driver from having its ndo_poll_controller routine called during device initalization, it does nothing for the driver during shutdown. I.e. it would be entirely possible to have the ndo_poll_controller method (or subsequently the ndo_poll) routine called for a driver in the netpoll path on CPU A while in parallel on CPU B, the ndo_close or ndo_open routine could be called. Given that the two latter routines tend to initizlize and free many data structures that the former two rely on, the result can easily be data corruption or various other crashes. Furthermore, it seems that this is potentially a problem with all net drivers that support netpoll, and so this should ideally be fixed in a common path. As Ben H Pointed out to me, we can't preform dev_open/dev_close in atomic context, so I've come up with this solution. We can use a mutex to sleep in open/close paths and just do a mutex_trylock in the napi poll path and abandon the poll attempt if we're locked, as we'll just retry the poll on the next send anyway. I've tested this here by flooding netconsole with messages on a system whos nic driver I modfied to periodically return NETDEV_TX_BUSY, so that the netpoll tx workqueue would be forced to send frames and poll the device. While this was going on I rapidly ifdown/up'ed the interface and watched for any problems. I've not found any. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Ivan Vecera <ivecera@redhat.com> CC: "David S. Miller" <davem@davemloft.net> CC: Ben Hutchings <bhutchings@solarflare.com> CC: Francois Romieu <romieu@fr.zoreil.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06wpan: whitespace fixAlexander Aring
Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06driver-core: constify data for class_find_device()Michał Mirosław
All in-kernel users of class_find_device() don't really need mutable data for match callback. In two places (kernel/power/suspend_test.c, drivers/scsi/osd/osd_uld.c) this patch changes match callbacks to use const search data. The const is propagated to rtc_class_open() and power_supply_get_by_name() parameters. Note that there's a dev reference leak in suspend_test.c that's not touched in this patch. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-06ipv6: Don't send packet to big messages to selfSteffen Klassert
Calling icmpv6_send() on a local message size error leads to an incorrect update of the path mtu in the case when IPsec is used. So use ipv6_local_error() instead to notify the socket about the error. Reported-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06net: core: Remove unnecessary alloc/OOM messagesJoe Perches
alloc failures already get standardized OOM messages and a dump_stack. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2013-02-06mac80211: fix error in sizeof() usageCong Ding
Using 'sizeof' on array given as function argument returns size of a pointer rather than the size of array. Cc: stable@vger.kernel.org Signed-off-by: Cong Ding <dinggnu@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-02-06xfrm: make gc_thresh configurable in all namespacesMichal Kubecek
The xfrm gc threshold can be configured via xfrm{4,6}_gc_thresh sysctl but currently only in init_net, other namespaces always use the default value. This can substantially limit the number of IPsec tunnels that can be effectively used. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-02-06xfrm: remove unused xfrm4_policy_fini()Michal Kubecek
Function xfrm4_policy_fini() is unused since xfrm4_fini() was removed in 2.6.11. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-02-06xfrm: Add a state resolution packet queueSteffen Klassert
As the default, we blackhole packets until the key manager resolves the states. This patch implements a packet queue where IPsec packets are queued until the states are resolved. We generate a dummy xfrm bundle, the output routine of the returned route enqueues the packet to a per policy queue and arms a timer that checks for state resolution when dst_output() is called. Once the states are resolved, the packets are sent out of the queue. If the states are not resolved after some time, the queue is flushed. This patch keeps the defaut behaviour to blackhole packets as long as we have no states. To enable the packet queue the sysctl xfrm_larval_drop must be switched off. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-02-06ipvs: sctp: fix checksumming on snat and dnat handlersDaniel Borkmann
In our test lab, we have a simple SCTP client connecting to a SCTP server via an IPVS load balancer. On some machines, load balancing works, but on others the initial handshake just fails, thus no SCTP connection whatsoever can be established! We observed that the SCTP INIT-ACK handshake reply from the IPVS machine to the client had a correct IP checksum, but corrupt SCTP checksum when forwarded, thus on the client-side the packet was dropped and an intial handshake retriggered until all attempts run into the void. To fix this issue, this patch i) adds a missing CHECKSUM_UNNECESSARY after the full checksum (re-)calculation (as done in IPVS TCP and UDP code as well), ii) calculates the checksum in little-endian format (as fixed with the SCTP code in commit 4458f04c: sctp: Clean up sctp checksumming code) and iii) refactors duplicate checksum code into a common function. Tested by myself. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2013-02-05tcp: remove Appropriate Byte Count supportStephen Hemminger
TCP Appropriate Byte Count was added by me, but later disabled. There is no point in maintaining it since it is a potential source of bugs and Linux already implements other better window protection heuristics. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05ipv4: Disallow non-namespace aware protocols to register.David S. Miller
All in-tree ipv4 protocol implementations are now namespace aware. Therefore all the run-time checks are superfluous. Reject registry of any non-namespace aware ipv4 protocol. Eventually we'll remove prot->netns_ok and this registry time check as well. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05l2tp: Make ipv4 protocol handler namespace aware.David S. Miller
The infrastructure is already pretty much entirely there to allow this conversion. The tunnel and session lookups have per-namespace tables, and the ipv4 bind lookup includes the namespace in the lookup key. Set netns_ok in l2tp_ip_protocol. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05l2tp: create tunnel sockets in the right namespaceTom Parkin
When creating unmanaged tunnel sockets we should honour the network namespace passed to l2tp_tunnel_create. Furthermore, unmanaged tunnel sockets should not hold a reference to the network namespace lest they accidentally keep alive a namespace which should otherwise have been released. Unmanaged tunnel sockets now drop their namespace reference via sk_change_net, and are released in a new pernet exit callback, l2tp_exit_net. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05l2tp: prevent tunnel creation on netns mismatchTom Parkin
l2tp_tunnel_create is passed a pointer to the network namespace for the tunnel, along with an optional file descriptor for the tunnel which may be passed in from userspace via. netlink. In the case where the file descriptor is defined, ensure that the namespace associated with that socket matches the namespace explicitly passed to l2tp_tunnel_create. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05l2tp: set netnsok flag for netlink messagesTom Parkin
The L2TP netlink code can run in namespaces. Set the netnsok flag in genl_family to true to reflect that fact. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05l2tp: put tunnel socket release on a workqueueTom Parkin
To allow l2tp_tunnel_delete to be called from an atomic context, place the tunnel socket release calls on a workqueue for asynchronous execution. Tunnel memory is eventually freed in the tunnel socket destructor. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/intel/e1000e/ethtool.c drivers/net/vmxnet3/vmxnet3_drv.c drivers/net/wireless/iwlwifi/dvm/tx.c net/ipv6/route.c The ipv6 route.c conflict is simple, just ignore the 'net' side change as we fixed the same problem in 'net-next' by eliminating cached neighbours from ipv6 routes. The e1000e conflict is an addition of a new statistic in the ethtool code, trivial. The vmxnet3 conflict is about one change in 'net' removing a guarding conditional, whilst in 'net-next' we had a netdev_info() conversion. The iwlwifi conflict is dealing with a WARN_ON() conversion in 'net-next' vs. a revert happening in 'net'. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05sunrpc: move address copy/cmp/convert routines and prototypes from clnt.h to ↵Jeff Layton
addr.h These routines are used by server and client code, so having them in a separate header would be best. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>