summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2010-06-03net: use __packed annotationEric Dumazet
cleanup patch. Use new __packed annotation in net/ and include/ (except netfilter) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-03ipv4: RCU conversion of ip_route_input_slow/ip_route_input_mcEric Dumazet
Avoid two atomic ops on struct in_device refcount per incoming packet, if slow path taken, (or route cache disabled) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-03ipv4: add LINUX_MIB_IPRPFILTER snmp counterEric Dumazet
Christoph Lameter mentioned that packets could be dropped in input path because of rp_filter settings, without any SNMP counter being incremented. System administrator can have a hard time to track the problem. This patch introduces a new counter, LINUX_MIB_IPRPFILTER, incremented each time we drop a packet because Reverse Path Filter triggers. (We receive an IPv4 datagram on a given interface, and find the route to send an answer would use another interface) netstat -s | grep IPReversePathFilter IPReversePathFilter: 21714 Reported-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02cfg80211: make action channel type optionalJohannes Berg
When sending action frames, we want to verify that we do that on the correct channel. However, checking the channel type in addition can get in the way, since the channel type could change on the fly during an association, and it's not useful to have the channel type anyway since it has no effect on the transmission. Therefore, make it optional to specify so that if wanted, it can still be checked, but is not required. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-02wireless: fix several minor description typosWalter Goldens
Signed-off-by: Walter Goldens <goldenstranger@yahoo.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-02cfg80211: don't refuse HT20 channels on devices that don't support HT40Helmut Schaa
Don't refuse HT20 channels on devices that don't support HT40. Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-02mac80211: add the minstrel_ht rate control algorithmFelix Fietkau
Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-02cls_u32: use skb_header_pointer() to dereference data safelyChangli Gao
use skb_header_pointer() to dereference data safely the original skb->data dereference isn't safe, as there isn't any skb->len or skb_is_nonlinear() check. skb_header_pointer() is used instead in this patch. And when the skb isn't long enough, we terminate the function u32_classify() immediately with -1. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02TCP: tcp_hybla: Fix integer overflow in slow start incrementDaniele Lacamera
For large values of rtt, 2^rho operation may overflow u32. Clamp down the increment to 2^16. Signed-off-by: Daniele Lacamera <root@danielinux.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02net: replace hooks in __netif_receive_skb V5Jiri Pirko
What this patch does is it removes two receive frame hooks (for bridge and for macvlan) from __netif_receive_skb. These are replaced them with a single hook for both. It only supports one hook per device because it makes no sense to do bridging and macvlan on the same device. Then a network driver (of virtual netdev like macvlan or bridge) can register an rx_handler for needed net device. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02ipv6: Refactor update of IPv6 flowi destination address for srcrt (RH) optionArnaud Ebalard
There are more than a dozen occurrences of following code in the IPv6 stack: if (opt && opt->srcrt) { struct rt0_hdr *rt0 = (struct rt0_hdr *) opt->srcrt; ipv6_addr_copy(&final, &fl.fl6_dst); ipv6_addr_copy(&fl.fl6_dst, rt0->addr); final_p = &final; } Replace those with a helper. Note that the helper overrides final_p in all cases. This is ok as final_p was previously initialized to NULL when declared. Signed-off-by: Arnaud Ebalard <arno@natisbad.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02ipconfig: send host-name in DHCP requestsWu Fengguang
Normally dhclient can be configured to send the "host-name" option in DHCP requests to update the client's DNS record. However for an NFSROOT system, dhclient shall never be called (which may change the IP addr and therefore lose your root NFS mount connection). So enable updating the DNS record with kernel parameter ip=::::$HOST_NAME::dhcp Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02act_nat: fix the wrong checksum when addr isn't in old_addr/maskChangli Gao
fix the wrong checksum when addr isn't in old_addr/mask For TCP and UDP packets, when addr isn't in old_addr/mask we don't do SNAT or DNAT, and we should not update layer 4 checksum. Signed-off-by: Changli Gao <xiaosuo@gmail.com> ---- net/sched/act_nat.c | 4 ++++ 1 file changed, 4 insertions(+) Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02packet_mmap: expose hw packet timestamps to network packet capture utilitiesScott McMillan
This patch adds a setting, PACKET_TIMESTAMP, to specify the packet timestamp source that is exported to capture utilities like tcpdump by packet_mmap. PACKET_TIMESTAMP accepts the same integer bit field as SO_TIMESTAMPING. However, only the SOF_TIMESTAMPING_SYS_HARDWARE and SOF_TIMESTAMPING_RAW_HARDWARE values are currently recognized by PACKET_TIMESTAMP. SOF_TIMESTAMPING_SYS_HARDWARE takes precedence over SOF_TIMESTAMPING_RAW_HARDWARE if both bits are set. If PACKET_TIMESTAMP is not set, a software timestamp generated inside the networking stack is used (the behavior before this setting was added). Signed-off-by: Scott McMillan <scott.a.mcmillan@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02net: CONFIG_NET_NS reductionEric Dumazet
Use read_pnet() and write_pnet() to reduce number of ifdef CONFIG_NET_NS Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02net: add additional lock to qdisc to increase throughputEric Dumazet
When many cpus compete for sending frames on a given qdisc, the qdisc spinlock suffers from very high contention. The cpu owning __QDISC_STATE_RUNNING bit has same priority to acquire the lock, and cannot dequeue packets fast enough, since it must wait for this lock for each dequeued packet. One solution to this problem is to force all cpus spinning on a second lock before trying to get the main lock, when/if they see __QDISC_STATE_RUNNING already set. The owning cpu then compete with at most one other cpu for the main lock, allowing for higher dequeueing rate. Based on a previous patch from Alexander Duyck. I added the heuristic to avoid the atomic in fast path, and put the new lock far away from the cache line used by the dequeue worker. Also try to release the busylock lock as late as possible. Tests with following script gave a boost from ~50.000 pps to ~600.000 pps on a dual quad core machine (E5450 @3.00GHz), tg3 driver. (A single netperf flow can reach ~800.000 pps on this platform) for j in `seq 0 3`; do for i in `seq 0 7`; do netperf -H 192.168.0.1 -t UDP_STREAM -l 60 -N -T $i -- -m 6 & done done Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02net: fix conflict between null_or_orig and null_or_bondJohn Fastabend
If a skb is received on an inactive bond that does not meet the special cases checked for by skb_bond_should_drop it should only be delivered to exact matches as the comment in netif_receive_skb() says. However because null_or_bond could also be null this is not always true. This patch renames null_or_bond to orig_or_bond and initializes it to orig_dev. This keeps the intent of null_or_bond to pass frames received on VLAN interfaces stacked on bonding interfaces without invalidating the statement for null_or_orig. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02net: init_vlan should not copy slave or master flagsJohn Fastabend
The vlan device should not copy the slave or master flags from the real device. It is not in the bond until added nor is it a master. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02net: Define accessors to manipulate QDISC_STATE_RUNNINGEric Dumazet
Define three helpers to manipulate QDISC_STATE_RUNNIG flag, that a second patch will move on another location. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02xfrm: force a dst reference in __xfrm_route_forward()Eric Dumazet
Packets going through __xfrm_route_forward() have a not refcounted dst entry, since we enabled a noref forwarding path. xfrm_lookup() might incorrectly release this dst entry. It's a bit late to make invasive changes in xfrm_lookup(), so lets force a refcount in this path. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-01mac80211: fix dialog token allocatorJohannes Berg
The dialog token allocator has apparently been broken since b83f4e15 ("mac80211: fix deadlock in sta->lock") because it got moved out under the spinlock. Fix it. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-01mac80211: fix blockack-req processingJohannes Berg
Daniel reported that the paged RX changes had broken blockack request frame processing due to using data that wasn't really part of the skb data. Fix this using skb_copy_bits() for the needed data. As a side effect, this adds a check on processing too short frames, which previously this code could do. Reported-by: Daniel Halperin <dhalperi@cs.washington.edu> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Daniel Halperin <dhalperi@cs.washington.edu> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-01netfilter: xt_statistic: remove nth_lock spinlockEric Dumazet
Use atomic_cmpxchg() to avoid dirtying a shared location. xt_statistic_priv smp aligned to avoid sharing same cache line with other stuff. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-01netfilter: br_netfilter: use skb_set_noref()Eric Dumazet
Avoid dirtying bridge_parent_rtable refcount, using new dst noref infrastructure. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-01ipv6: get rid of ipip6_prl_lockEric Dumazet
As noticed by Julia Lawall, ipip6_tunnel_add_prl() incorrectly calls kzallloc(..., GFP_KERNEL) while a spinlock is held. She provided a patch to use GFP_ATOMIC instead. One possibility would be to convert this spinlock to a mutex, or preallocate the thing before taking the lock. After RCU conversion, it appears we dont need this lock, since caller already holds RTNL Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-01net/ipv6/mcast.c: Remove unnecessary kmalloc castsJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-01net/ipv4/igmp.c: Remove unnecessary kmalloc castsJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-31net/ipv4/tcp_input.c: fix compilation breakage when FASTRETRANS_DEBUG > 1Joe Perches
Commit: c720c7e8383aff1cb219bddf474ed89d850336e3 missed these. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-31Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
2010-05-31net: sock_queue_err_skb() dont mess with sk_forward_allocEric Dumazet
Correct sk_forward_alloc handling for error_queue would need to use a backlog of frames that softirq handler could not deliver because socket is owned by user thread. Or extend backlog processing to be able to process normal and error packets. Another possibility is to not use mem charge for error queue, this is what I implemented in this patch. Note: this reverts commit 29030374 (net: fix sk_forward_alloc corruptions), since we dont need to lock socket anymore. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-31netfilter: xtables: stackptr should be percpuEric Dumazet
commit f3c5c1bfd4 (netfilter: xtables: make ip_tables reentrant) introduced a performance regression, because stackptr array is shared by all cpus, adding cache line ping pongs. (16 cpus share a 64 bytes cache line) Fix this using alloc_percpu() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-By: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-05-31netfilter: don't xt_jumpstack_alloc twice in xt_register_tableXiaotian Feng
In xt_register_table, xt_jumpstack_alloc is called first, later xt_replace_table is used. But in xt_replace_table, xt_jumpstack_alloc will be used again. Then the memory allocated by previous xt_jumpstack_alloc will be leaked. We can simply remove the previous xt_jumpstack_alloc because there aren't any users of newinfo between xt_jumpstack_alloc and xt_replace_table. Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Cc: Patrick McHardy <kaber@trash.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jan Engelhardt <jengelh@medozas.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Alexey Dobriyan <adobriyan@gmail.com> Acked-By: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-05-31Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller
2010-05-31arp_notify: allow drivers to explicitly request a notification event.Ian Campbell
Currently such notifications are only generated when the device comes up or the address changes. However one use case for these notifications is to enable faster network recovery after a virtual machine migration (by causing switches to relearn their MAC tables). A migration appears to the network stack as a temporary loss of carrier and therefore does not trigger either of the current conditions. Rather than adding carrier up as a trigger (which can cause issues when interfaces a flapping) simply add an interface which the driver can use to explicitly trigger the notification. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Stephen Hemminger <shemminger@linux-foundation.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Cc: stable@kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-31caif: cleanup: remove duplicate checksDan Carpenter
"phyinfo" can never be null here because we assigned it an address, so I removed both the assert and the second check inside the if statement. I removed the "phyinfo->phy_layer != NULL" check as well because that was asserted earlier. Walter Harms suggested I move the "phyinfo->phy_ref_count++;" outside the if condition for readability, so I have done that. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-31caif: remove unneeded null check in caif_connect()Dan Carpenter
We already dereferenced uaddr towards the start of the function when we checked that "uaddr->sa_family != AF_CAIF". Both the check here and the earlier check were added in bece7b2398d0: "caif: Rewritten socket implementation". Before that patch, we assumed that we recieved a valid pointer for uaddr, and based on that, I have removed this check. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-31net/dccp: Use memdup_userJulia Lawall
Use memdup_user when user data is immediately copied into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to,size,flag; position p; identifier l1,l2; @@ - to = \(kmalloc@p\|kzalloc@p\)(size,flag); + to = memdup_user(from,size); if ( - to==NULL + IS_ERR(to) || ...) { <+... when != goto l1; - -ENOMEM + PTR_ERR(to) ...+> } - if (copy_from_user(to, from, size) != 0) { - <+... when != goto l2; - -EFAULT - ...+> - } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-31net/can: Use memdup_userJulia Lawall
Use memdup_user when user data is immediately copied into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to,size,flag; position p; identifier l1,l2; @@ - to = \(kmalloc@p\|kzalloc@p\)(size,flag); + to = memdup_user(from,size); if ( - to==NULL + IS_ERR(to) || ...) { <+... when != goto l1; - -ENOMEM + PTR_ERR(to) ...+> } - if (copy_from_user(to, from, size) != 0) { - <+... when != goto l2; - -EFAULT - ...+> - } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-31tcp: tcp_md5_hash_skb_data() frag_list handlingEric Dumazet
tcp_md5_hash_skb_data() should handle skb->frag_list, and eventually recurse. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-31net: remove zap_completion_queueEric Dumazet
netpoll does an interesting work in zap_completion_queue(), but this was before we did skb orphaning before delivering packets to device. It now makes sense to add a test in dev_kfree_skb_irq() to not queue a skb if already orphaned, and to remove netpoll zap_completion_queue() as a bonus. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-31net: Use __this_cpu_inc() in fast pathEric Dumazet
This patch saves 224 bytes of text on my machine. __this_cpu_inc() generates a single instruction, using no scratch registers : 65 ff 04 25 a8 30 01 00 incl %gs:0x130a8 instead of : 48 c7 c2 80 30 01 00 mov $0x13080,%rdx 65 48 8b 04 25 88 ea 00 00 mov %gs:0xea88,%rax 83 44 10 28 01 addl $0x1,0x28(%rax,%rdx,1) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-31Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2010-05-29net: fix sk_forward_alloc corruptionsEric Dumazet
As David found out, sock_queue_err_skb() should be called with socket lock hold, or we risk sk_forward_alloc corruption, since we use non atomic operations to update this field. This patch adds bh_lock_sock()/bh_unlock_sock() pair to three spots. (BH already disabled) 1) skb_tstamp_tx() 2) Before calling ip_icmp_error(), in __udp4_lib_err() 3) Before calling ipv6_icmp_error(), in __udp6_lib_err() Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-29Phonet: listening socket lock protects the connected socket listRémi Denis-Courmont
The accept()'d socket need to be unhashed while the (listen()'ing) socket lock is held. This fixes a race condition that could lead to an OOPS. Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-29caif: unlock on error path in cfserl_receive()Dan Carpenter
There was an spin_unlock missing on the error path. The spin_lock was tucked in with the declarations so it was hard to spot. I added a new line. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Sjur Brændeland <sjurbren@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-29net/rds: Add missing mutex_unlockJulia Lawall
Add a mutex_unlock missing on the error path. In each case, whenever the label out is reached from elsewhere in the function, mutex is not locked. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression E1; @@ * mutex_lock(E1); <+... when != E1 if (...) { ... when != E1 * return ...; } ...+> * mutex_unlock(E1); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Reviewed-by: Zach Brown <zach.brown@oracle.com> Acked-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-29skb: make skb_recycle_check() return a bool valueChangli Gao
Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-28IPv6: fix Mobile IPv6 regressionBrian Haley
Commit f4f914b5 (net: ipv6 bind to device issue) caused a regression with Mobile IPv6 when it changed the meaning of fl->oif to become a strict requirement of the route lookup. Instead, only force strict mode when sk->sk_bound_dev_if is set on the calling socket, getting the intended behavior and fixing the regression. Tested-by: Arnaud Ebalard <arno@natisbad.org> Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-28Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2010-05-28mac80211: make a function staticJohannes Berg
sparse correctly complains that __ieee80211_get_channel_mode is not static. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>