summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2010-10-01nfsd: provide callbacks on svc_xprt deletionJ. Bruce Fields
NFSv4.1 needs warning when a client tcp connection goes down, if that connection is being used as a backchannel, so that it can warn the client that it has lost the backchannel connection. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01nfsd4: remove spkm3J. Bruce Fields
Unfortunately, spkm3 never got very far; while interoperability with one other implementation was demonstrated at some point, problems were found with the spec that were deemed not worth fixing. The kernel code is useless on its own without nfs-utils patches which were never merged into nfs-utils, and were only ever available from citi.umich.edu. They appear not to have been updated since 2005. Therefore it seems safe to assume that this code has no users, and never will. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01sunrpc: fix race in new cache_wait code.NeilBrown
If we set up to wait for a cache item to be filled in, and then find that it is no longer pending, it could be that some other thread is in 'cache_revisit_request' and has moved our request to its 'pending' list. So when our setup_deferral calls cache_revisit_request it will find nothing to put on the pending list, and do nothing. We then return from cache_wait_req, thus leaving the 'sleeper' on-stack structure open to being corrupted by subsequent stack usage. However that 'sleeper' could still be on the 'pending' list that the other thread is looking at and so any corruption could cause it to behave badly. To avoid this race we simply take the same path as if the 'wait_for_completion_interruptible_timeout' was interrupted and if the sleeper is no longer on the list (which it won't be) we wait on the completion - which will ensure that any other cache_revisit_request will have let go of the sleeper. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01sunrpc: Create sockets in net namespacesPavel Emelyanov
The context is already known in all the sock_create callers. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01net: Export __sock_createPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01sunrpc: Tag rpc_xprt with netPavel Emelyanov
The net is known from the xprt_create and this tagging will also give un the context in the conntection workers where real sockets are created. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01sunrpc: Add net to xprt_createPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01sunrpc: Add net to rpc_create_argsPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01sunrpc: Pull net argument downto svc_create_socketPavel Emelyanov
After this the socket creation in it knows the context. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01sunrpc: Add net argument to svc_create_xprtPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01sunrpc: Factor out rpc_xprt freeingPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01sunrpc: Factor out rpc_xprt allocationPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
2010-09-30ipv4: rcu conversion in ip_route_output_slowEric Dumazet
ip_route_output_slow() is enclosed in an rcu_read_lock() protected section, so that no references are taken/released on device, thanks to __ip_dev_find() & dev_get_by_index_rcu() Tested with ip route cache disabled, and a stress test : Before patch: elapsed time : real 1m38.347s user 0m11.909s sys 23m51.501s Profile: 13788.00 22.7% ip_route_output_slow [kernel] 7875.00 13.0% dst_destroy [kernel] 3925.00 6.5% fib_semantic_match [kernel] 3144.00 5.2% fib_rules_lookup [kernel] 3061.00 5.0% dst_alloc [kernel] 2276.00 3.7% rt_set_nexthop [kernel] 1762.00 2.9% fib_table_lookup [kernel] 1538.00 2.5% _raw_read_lock [kernel] 1358.00 2.2% ip_output [kernel] After patch: real 1m28.808s user 0m13.245s sys 20m37.293s 10950.00 17.2% ip_route_output_slow [kernel] 10726.00 16.9% dst_destroy [kernel] 5170.00 8.1% fib_semantic_match [kernel] 3937.00 6.2% dst_alloc [kernel] 3635.00 5.7% rt_set_nexthop [kernel] 2900.00 4.6% fib_rules_lookup [kernel] 2240.00 3.5% fib_table_lookup [kernel] 1427.00 2.2% _raw_read_lock [kernel] 1157.00 1.8% kmem_cache_alloc [kernel] Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30ipv4: introduce __ip_dev_find()Eric Dumazet
ip_dev_find(net, addr) finds a device given an IPv4 source address and takes a reference on it. Introduce __ip_dev_find(), taking a third argument, to optionally take the device reference. Callers not asking the reference to be taken should be in an rcu_read_lock() protected section. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30vlan: dont drop packets from unknown vlans in promiscuous modeEric Dumazet
Roger Luethi noticed packets for unknown VLANs getting silently dropped even in promiscuous mode. Check for promiscuous mode in __vlan_hwaccel_rx() and vlan_gro_common() before drops. As suggested by Patrick, mark such packets to have skb->pkt_type set to PACKET_OTHERHOST to make sure they are dropped by IP stack. Reported-by: Roger Luethi <rl@hellgate.ch> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30ipv4: __mkroute_output() speedupEric Dumazet
While doing stress tests with a disabled IP route cache, I found __mkroute_output() was touching three times in_device atomic refcount. Use RCU to touch it once to reduce cache line ping pongs. Before patch time to perform the test real 1m42.009s user 0m12.545s sys 25m0.726s Profile : 16109.00 26.4% ip_route_output_slow vmlinux 7434.00 12.2% dst_destroy vmlinux 3280.00 5.4% fib_rules_lookup vmlinux 3252.00 5.3% fib_semantic_match vmlinux 2622.00 4.3% fib_table_lookup vmlinux 2535.00 4.1% dst_alloc vmlinux 1750.00 2.9% _raw_read_lock vmlinux 1532.00 2.5% rt_set_nexthop vmlinux After patch real 1m36.503s user 0m12.977s sys 23m25.608s 14234.00 22.4% ip_route_output_slow vmlinux 8717.00 13.7% dst_destroy vmlinux 4052.00 6.4% fib_rules_lookup vmlinux 3951.00 6.2% fib_semantic_match vmlinux 3191.00 5.0% dst_alloc vmlinux 1764.00 2.8% fib_table_lookup vmlinux 1692.00 2.7% _raw_read_lock vmlinux 1605.00 2.5% rt_set_nexthop vmlinux Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30Phonet: restore flow control credits when sending failsRémi Denis-Courmont
This patch restores the below flow control patch submitted by Rémi Denis-Courmont, which accidentaly got lost due to Pipe controller patch on Phonet. commit 1a98214feef2221cd7c24b17cd688a5a9d85b2ea Author: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Date: Mon Aug 30 12:57:03 2010 +0000 Phonet: restore flow control credits when sending fails Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com> Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2010-09-30Revert "Bluetooth: Don't accept ConfigReq if we aren't in the BT_CONFIG state"Gustavo F. Padovan
This reverts commit 8cb8e6f1684be13b51f8429b15f39c140326b327. That commit introduced a regression with the Bluetooth Profile Tuning Suite(PTS), Reverting this make sure that L2CAP is in a qualificable state. Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-09-30Bluetooth: Fix inconsistent lock state with RFCOMMGustavo F. Padovan
When receiving a rfcomm connection with the old dund deamon a inconsistent lock state happens. That's because interrupts were already disabled by l2cap_conn_start() when rfcomm_sk_state_change() try to lock the spin_lock. As result we may have a inconsistent lock state for l2cap_conn_start() after rfcomm_sk_state_change() calls bh_lock_sock() and disable interrupts as well. [ 2833.151999] [ 2833.151999] ================================= [ 2833.151999] [ INFO: inconsistent lock state ] [ 2833.151999] 2.6.36-rc3 #2 [ 2833.151999] --------------------------------- [ 2833.151999] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. [ 2833.151999] krfcommd/2306 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 2833.151999] (slock-AF_BLUETOOTH){+.?...}, at: [<ffffffffa00bcb56>] rfcomm_sk_state_change+0x46/0x170 [rfcomm] [ 2833.151999] {IN-SOFTIRQ-W} state was registered at: [ 2833.151999] [<ffffffff81094346>] __lock_acquire+0x5b6/0x1560 [ 2833.151999] [<ffffffff8109534a>] lock_acquire+0x5a/0x70 [ 2833.151999] [<ffffffff81392b6c>] _raw_spin_lock+0x2c/0x40 [ 2833.151999] [<ffffffffa00a5092>] l2cap_conn_start+0x92/0x640 [l2cap] [ 2833.151999] [<ffffffffa00a6a3f>] l2cap_sig_channel+0x6bf/0x1320 [l2cap] [ 2833.151999] [<ffffffffa00a9173>] l2cap_recv_frame+0x133/0x770 [l2cap] [ 2833.151999] [<ffffffffa00a997b>] l2cap_recv_acldata+0x1cb/0x390 [l2cap] [ 2833.151999] [<ffffffffa000db4b>] hci_rx_task+0x2ab/0x450 [bluetooth] [ 2833.151999] [<ffffffff8106b22b>] tasklet_action+0xcb/0xe0 [ 2833.151999] [<ffffffff8106b91e>] __do_softirq+0xae/0x150 [ 2833.151999] [<ffffffff8102bc0c>] call_softirq+0x1c/0x30 [ 2833.151999] [<ffffffff8102ddb5>] do_softirq+0x75/0xb0 [ 2833.151999] [<ffffffff8106b56d>] irq_exit+0x8d/0xa0 [ 2833.151999] [<ffffffff8104484b>] smp_apic_timer_interrupt+0x6b/0xa0 [ 2833.151999] [<ffffffff8102b6d3>] apic_timer_interrupt+0x13/0x20 [ 2833.151999] [<ffffffff81029dfa>] cpu_idle+0x5a/0xb0 [ 2833.151999] [<ffffffff81381ded>] rest_init+0xad/0xc0 [ 2833.151999] [<ffffffff817ebc4d>] start_kernel+0x2dd/0x2e8 [ 2833.151999] [<ffffffff817eb2e6>] x86_64_start_reservations+0xf6/0xfa [ 2833.151999] [<ffffffff817eb3ce>] x86_64_start_kernel+0xe4/0xeb [ 2833.151999] irq event stamp: 731 [ 2833.151999] hardirqs last enabled at (731): [<ffffffff8106b762>] local_bh_enable_ip+0x82/0xe0 [ 2833.151999] hardirqs last disabled at (729): [<ffffffff8106b93e>] __do_softirq+0xce/0x150 [ 2833.151999] softirqs last enabled at (730): [<ffffffff8106b96e>] __do_softirq+0xfe/0x150 [ 2833.151999] softirqs last disabled at (711): [<ffffffff8102bc0c>] call_softirq+0x1c/0x30 [ 2833.151999] [ 2833.151999] other info that might help us debug this: [ 2833.151999] 2 locks held by krfcommd/2306: [ 2833.151999] #0: (rfcomm_mutex){+.+.+.}, at: [<ffffffffa00bb744>] rfcomm_run+0x174/0xb20 [rfcomm] [ 2833.151999] #1: (&(&d->lock)->rlock){+.+...}, at: [<ffffffffa00b9223>] rfcomm_dlc_accept+0x53/0x100 [rfcomm] [ 2833.151999] [ 2833.151999] stack backtrace: [ 2833.151999] Pid: 2306, comm: krfcommd Tainted: G W 2.6.36-rc3 #2 [ 2833.151999] Call Trace: [ 2833.151999] [<ffffffff810928e1>] print_usage_bug+0x171/0x180 [ 2833.151999] [<ffffffff810936c3>] mark_lock+0x333/0x400 [ 2833.151999] [<ffffffff810943ca>] __lock_acquire+0x63a/0x1560 [ 2833.151999] [<ffffffff810948b5>] ? __lock_acquire+0xb25/0x1560 [ 2833.151999] [<ffffffff8109534a>] lock_acquire+0x5a/0x70 [ 2833.151999] [<ffffffffa00bcb56>] ? rfcomm_sk_state_change+0x46/0x170 [rfcomm] [ 2833.151999] [<ffffffff81392b6c>] _raw_spin_lock+0x2c/0x40 [ 2833.151999] [<ffffffffa00bcb56>] ? rfcomm_sk_state_change+0x46/0x170 [rfcomm] [ 2833.151999] [<ffffffffa00bcb56>] rfcomm_sk_state_change+0x46/0x170 [rfcomm] [ 2833.151999] [<ffffffffa00b9239>] rfcomm_dlc_accept+0x69/0x100 [rfcomm] [ 2833.151999] [<ffffffffa00b9a49>] rfcomm_check_accept+0x59/0xd0 [rfcomm] [ 2833.151999] [<ffffffffa00bacab>] rfcomm_recv_frame+0x9fb/0x1320 [rfcomm] [ 2833.151999] [<ffffffff813932bb>] ? _raw_spin_unlock_irqrestore+0x3b/0x60 [ 2833.151999] [<ffffffff81093acd>] ? trace_hardirqs_on_caller+0x13d/0x180 [ 2833.151999] [<ffffffff81093b1d>] ? trace_hardirqs_on+0xd/0x10 [ 2833.151999] [<ffffffffa00bb7f1>] rfcomm_run+0x221/0xb20 [rfcomm] [ 2833.151999] [<ffffffff813905e7>] ? schedule+0x287/0x780 [ 2833.151999] [<ffffffffa00bb5d0>] ? rfcomm_run+0x0/0xb20 [rfcomm] [ 2833.151999] [<ffffffff81081026>] kthread+0x96/0xa0 [ 2833.151999] [<ffffffff8102bb14>] kernel_thread_helper+0x4/0x10 [ 2833.151999] [<ffffffff813936bc>] ? restore_args+0x0/0x30 [ 2833.151999] [<ffffffff81080f90>] ? kthread+0x0/0xa0 [ 2833.151999] [<ffffffff8102bb10>] ? kernel_thread_helper+0x0/0x10 Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-09-30Bluetooth: Simplify L2CAP Streaming mode sendingGustavo F. Padovan
As we don't have any error control on the Streaming mode, i.e., we don't need to keep a copy of the skb for later resending we don't need to call skb_clone() on it. Then we can go one further here, and dequeue the skb before sending it, that also means we don't need to look to sk->sk_send_head anymore. The patch saves memory and time when sending Streaming mode data, so it is good to mainline. Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-09-30Bluetooth: fix MTU L2CAP configuration parameterAndrei Emeltchenko
When receiving L2CAP negative configuration response with respect to MTU parameter we modify wrong field. MTU here means proposed value of MTU that the remote device intends to transmit. So for local L2CAP socket it is pi->imtu. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com> Acked-by: Ville Tervo <ville.tervo@nokia.com> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-09-30Bluetooth: Only enable L2CAP FCS for ERTM or streamingMat Martineau
This fixes a bug which caused the FCS setting to show L2CAP_FCS_CRC16 with L2CAP modes other than ERTM or streaming. At present, this only affects the FCS value shown with getsockopt() for basic mode. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-09-29ip_gre: comments changeEric Dumazet
HARD_TX_LOCK no longer protects tunnels from dead loops, but xmit_recursion percpu counter. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29tcp: tcp_enter_quickack_mode can be staticstephen hemminger
Function only used in tcp_input.c Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29arp: remove unnecessary export of arp_broken_opsstephen hemminger
arp_broken_ops is only used in arp.c Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29Phonet: Correct header retrieval after pskb_may_pullKumar Sanghvi
Retrieve the header after doing pskb_may_pull since, pskb_may_pull could change the buffer structure. This is based on the comment given by Eric Dumazet on Phonet Pipe controller patch for a similar problem. Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com> Acked-by: Linus Walleij <linus.walleij@stericsson.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29net: rename netdev rx_queue to ingress_queueEric Dumazet
There is some confusion with rx_queue name after RPS, and net drivers private rx_queue fields. I suggest to rename "struct net_device"->rx_queue to ingress_queue. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29ip6tnl: percpu stats accountingEric Dumazet
Maintain per_cpu tx_bytes, tx_packets, rx_bytes, rx_packets. Other seldom used fields are kept in netdev->stats structure, possibly unsafe. This is a preliminary work to support lockless transmit path, and correct RX stats, that are already unsafe. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29ipip: enable lockless xmitsEric Dumazet
IPIP tunnels can benefit from lockless xmits, using NETIF_F_LLTX Bench on a 16 cpus machine (dual E5540 cpus), 16 threads sending 10000000 UDP frames via one ipip tunnel (size:200 bytes per frame) Before patch : real 2m53.321s user 0m10.277s sys 46m0.597s After patch: real 0m32.063s user 0m9.237s sys 8m16.255s Last problem to solve is the contention on dst : 16118.00 28.3% __ip_route_output_key vmlinux 6135.00 10.8% dst_release vmlinux 3220.00 5.6% ip_finish_output vmlinux 2149.00 3.8% ip_route_output_flow vmlinux 1575.00 2.8% ip_append_data vmlinux 1481.00 2.6% ip_push_pending_frames vmlinux 1349.00 2.4% __xfrm_lookup vmlinux 1216.00 2.1% csum_partial_copy_generic vmlinux 1208.00 2.1% udp_sendmsg vmlinux Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29ip_gre: lockless xmitEric Dumazet
GRE tunnels can benefit from lockless xmits, using NETIF_F_LLTX Note: If tunnels are created with the "oseq" option, LLTX is not enabled : Even using an atomic_t o_seq, we would increase chance for packets being out of order at receiver. Bench on a 16 cpus machine (dual E5540 cpus), 16 threads sending 10000000 UDP frames via one gre tunnel (size:200 bytes per frame) Before patch : real 3m0.094s user 0m9.365s sys 47m50.103s After patch: real 0m29.756s user 0m11.097s sys 7m33.012s Last problem to solve is the contention on dst : 38660.00 21.4% __ip_route_output_key vmlinux 20786.00 11.5% dst_release vmlinux 14191.00 7.8% __xfrm_lookup vmlinux 12410.00 6.9% ip_finish_output vmlinux 4540.00 2.5% ip_push_pending_frames vmlinux 4427.00 2.4% ip_append_data vmlinux 4265.00 2.4% __alloc_skb vmlinux 4140.00 2.3% __ip_local_out vmlinux 3991.00 2.2% dev_queue_xmit vmlinux Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29sit: enable lockless xmitsEric Dumazet
SIT tunnels can benefit from lockless xmits, using NETIF_F_LLTX Bench on a 16 cpus machine (dual E5540 cpus), 16 threads sending 10000000 UDP frames via one sit tunnel (size:220 bytes per frame) Before patch : real 3m15.399s user 0m9.185s sys 51m55.403s 75029.00 87.5% _raw_spin_lock vmlinux 1090.00 1.3% dst_release vmlinux 902.00 1.1% dev_queue_xmit vmlinux 627.00 0.7% sock_wfree vmlinux 613.00 0.7% ip6_push_pending_frames ipv6.ko 505.00 0.6% __ip_route_output_key vmlinux After patch: real 1m1.387s user 0m12.489s sys 15m58.868s 28239.00 23.3% dst_release vmlinux 13570.00 11.2% ip6_push_pending_frames ipv6.ko 13118.00 10.8% ip6_append_data ipv6.ko 7995.00 6.6% __ip_route_output_key vmlinux 7924.00 6.5% sk_dst_check vmlinux 5015.00 4.1% udpv6_sendmsg ipv6.ko 3594.00 3.0% sock_alloc_send_pskb vmlinux 3135.00 2.6% sock_wfree vmlinux 3055.00 2.5% ip6_sk_dst_lookup ipv6.ko 2473.00 2.0% ip_finish_output vmlinux Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29sit: fix percpu stats accountingEric Dumazet
commit 15fc1f7056ebd (sit: percpu stats accounting) forgot the fallback tunnel case (sit0), and can crash pretty fast. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29ipip: fix percpu stats accountingEric Dumazet
commit 3c97af99a5aa1 (ipip: percpu stats accounting) forgot the fallback tunnel case (tunl0), and can crash pretty fast. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29net: add a recursion limit in xmit pathEric Dumazet
As tunnel devices are going to be lockless, we need to make sure a misconfigured machine wont enter an infinite loop. Add a percpu variable, and limit to three the number of stacked xmits. Reported-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29sunrpc: fix up rpcauth_remove_module section mismatchStephen Rothwell
On Wed, 29 Sep 2010 14:02:38 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote: > > After merging the final tree, today's linux-next build (powerpc > ppc44x_defconfig) produced tis warning: > > WARNING: net/sunrpc/sunrpc.o(.init.text+0x110): Section mismatch in reference from the function init_sunrpc() to the function .exit.text:rpcauth_remove_module() > The function __init init_sunrpc() references > a function __exit rpcauth_remove_module(). > This is often seen when error handling in the init function > uses functionality in the exit path. > The fix is often to remove the __exit annotation of > rpcauth_remove_module() so it may be used outside an exit section. > > Probably caused by commit 2f72c9b73730c335381b13e2bd221abe1acea394 > ("sunrpc: The per-net skeleton"). This actually causes a build failure on a sparc32 defconfig build: `rpcauth_remove_module' referenced in section `.init.text' of net/built-in.o: defined in discarded section `.exit.text' of net/built-in.o I applied the following patch for today: Fixes: `rpcauth_remove_module' referenced in section `.init.text' of net/built-in.o: defined in discarded section `.exit.text' of net/built-in.o Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-28ipv6: Implement Any-IP support for IPv6.Maciej Żenczykowski
AnyIP is the capability to receive packets and establish incoming connections on IPs we have not explicitly configured on the machine. An example use case is to configure a machine to accept all incoming traffic on eth0, and leave the policy of whether traffic for a given IP should be delivered to the machine up to the load balancer. Can be setup as follows: ip -6 rule from all iif eth0 lookup 200 ip -6 route add local default dev lo table 200 (in this case for all IPv6 addresses) Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-28ipv4: Allow configuring subnets as local addressesTom Herbert
This patch allows a host to be configured to respond to any address in a specified range as if it were local, without actually needing to configure the address on an interface. This is done through routing table configuration. For instance, to configure a host to respond to any address in 10.1/16 received on eth0 as a local address we can do: ip rule add from all iif eth0 lookup 200 ip route add local 10.1/16 dev lo proto kernel scope host src 127.0.0.1 table 200 This host is now reachable by any 10.1/16 address (route lookup on input for packets received on eth0 can find the route). On output, the rule will not be matched so that this host can still send packets to 10.1/16 (not sent on loopback). Presumably, external routing can be configured to make sense out of this. To make this work, we needed to modify the logic in finding the interface which is assigned a given source address for output (dev_ip_find). We perform a normal fib_lookup instead of just a lookup on the local table, and in the lookup we ignore the input interface for matching. This patch is useful to implement IP-anycast for subnets of virtual addresses. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-28ip_gre: Fix dependencies wrt. ipv6.David S. Miller
The GRE tunnel driver needs to invoke icmpv6 helpers in the ipv6 stack when ipv6 support is enabled. Therefore if IPV6 is enabled, we have to enforce that GRE's enabling (modular or static) matches that of ipv6. Reported-by: Patrick McHardy <kaber@trash.net> Reported-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-28net-2.6: SYN retransmits: Add new parameter to retransmits_timed_out()Damian Lukowski
Fixes kernel Bugzilla Bug 18952 This patch adds a syn_set parameter to the retransmits_timed_out() routine and updates its callers. If not set, TCP_RTO_MIN is taken as the calculation basis as before. If set, TCP_TIMEOUT_INIT is used instead, so that sysctl_syn_retries represents the actual amount of SYN retransmissions in case no SYNACKs are received when establishing a new connection. Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-28mac80211: Fix WMM driver queue configurationJuuso Oikarinen
The WMM parameter configuration function (ieee80211_sta_wmm_params) only configures the WMM parameters to the driver is the wmm_last_param_set counter value is changed by the AP. The wmm_last_param_set is initialized to -1 on association in order to ensure the configuration is made to the driver at least once on association, but currently this initialization is done *after* the WMM parameter configuration function was called. This leads to unreliability in the driver getting properly configured on first association (depending on what counter value the AP happens to use.) When disassociating (the wmm default parameters are configured to the driver) and then reassociating, due to the above the WMM configuration is not set to the driver at all. On drivers without beacon filtering the problem is corrected by later beacons, but on drivers with beacon filtering the WMM will remain permanently incorrectly configured. Fix this by moving the initialization of wmm_last_param_set to -1 before ieee80211_sta_wmm_params is called on association. Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-09-28cfg80211: always set IBSS basic ratesJohannes Berg
IBSS started from wireless extensions is currently missing basic rate configuration, fix this by moving the code to generate the default to the common code that gets invoked for both nl80211 and wext. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-09-28mac80211: fix offchannel assumption upon associationLuis R. Rodriguez
Association is dealt with as an atomic offchannel operation, we do this because we don't know we are associated until we get the associatin response from the AP. When we do get the associatin response though we were never clearing the offchannel state. This has a few implications, we told drivers we were still offchannel, and the first configured TX power for the channel does not take into account any power constraints. For ath9k this meant ANI calibration would not start upon association, and we'd have to wait until the first bgscan to be triggered. There may be other issues this resolves but I'm too lazy to comb the code to check. Cc: stable@kernel.org Cc: Amod Bodas <amod.bodas@atheros.com> Cc: Vasanth Thiagarajan <vasanth.thiagarajan@atheros.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-09-28netfilter: ctnetlink: add support for user-space expectation helpersPablo Neira Ayuso
This patch adds the basic infrastructure to support user-space expectation helpers via ctnetlink and the netfilter queuing infrastructure NFQUEUE. Basically, this patch: * adds NF_CT_EXPECT_USERSPACE flag to identify user-space created expectations. I have also added a sanity check in __nf_ct_expect_check() to avoid that kernel-space helpers may create an expectation if the master conntrack has no helper assigned. * adds some branches to check if the master conntrack helper exists, otherwise we skip the code that refers to kernel-space helper such as the local expectation list and the expectation policy. * allows to set the timeout for user-space expectations with no helper assigned. * a list of expectations created from user-space that depends on ctnetlink (if this module is removed, they are deleted). * includes USERSPACE in the /proc output for expectations that have been created by a user-space helper. This patch also modifies ctnetlink to skip including the helper name in the Netlink messages if no kernel-space helper is set (since no user-space expectation has not kernel-space kernel assigned). You can access an example user-space FTP conntrack helper at: http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-userspace-POC.tar.bz Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-09-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits) tcp: Fix >4GB writes on 64-bit. net/9p: Mount only matching virtio channels de2104x: fix ethtool tproxy: check for transparent flag in ip_route_newports ipv6: add IPv6 to neighbour table overflow warning tcp: fix TSO FACK loss marking in tcp_mark_head_lost 3c59x: fix regression from patch "Add ethtool WOL support" ipv6: add a missing unregister_pernet_subsys call s390: use free_netdev(netdev) instead of kfree() sgiseeq: use free_netdev(netdev) instead of kfree() rionet: use free_netdev(netdev) instead of kfree() ibm_newemac: use free_netdev(netdev) instead of kfree() smsc911x: Add MODULE_ALIAS() net: reset skb queue mapping when rx'ing over tunnel br2684: fix scheduling while atomic de2104x: fix TP link detection de2104x: fix power management de2104x: disable autonegotiation on broken hardware net: fix a lockdep splat e1000e: 82579 do not gate auto config of PHY by hardware during nominal use ...
2010-09-278021q: Use netif_copy_real_num_queues() to set queue countsBen Hutchings
This covers RX if necessary, as well as TX. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27net: Allow changing number of RX queues after device allocationBen Hutchings
For RPS, we create a kobject for each RX queue based on the number of queues passed to alloc_netdev_mq(). However, drivers generally do not determine the numbers of hardware queues to use until much later, so this usually represents the maximum number the driver may use and not the actual number in use. For TX queues, drivers can update the actual number using netif_set_real_num_tx_queues(). Add a corresponding function for RX queues, netif_set_real_num_rx_queues(). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27net: sk_{detach|attach}_filter() rcu fixesEric Dumazet
sk_attach_filter() and sk_detach_filter() are run with socket locked. Use the appropriate rcu_dereference_protected() instead of blocking BH, and rcu_dereference_bh(). There is no point adding BH prevention and memory barrier. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27fib: use atomic_inc_not_zero() in fib_rules_lookupEric Dumazet
It seems we dont use appropriate refcount increment in an rcu_read_lock() protected section. fib_rule_get() might increment a null refcount and bad things could happen. While fib_nl_delrule() respects an rcu grace period before calling fib_rule_put(), fib_rules_cleanup_ops() calls fib_rule_put() without a grace period. Note : after this patch, we might avoid the synchronize_rcu() call done in fib_nl_delrule() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>