summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2010-06-14mac80211: defer TX agg session teardown to workJohannes Berg
Since we want the code to be able to sleep in the future, it must not be called from the timer directly. To achieve that, simply call the function drivers would call, and also use RCU in the timer to get the struct so we don't need to rely on the spinlock in the future. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: change RX aggregation lockingJohannes Berg
To prepare for allowing drivers to sleep in ampdu_action, change the locking in the RX aggregation code to use a mutex, so that it would already allow drivers to sleep. But explicitly disable BHs around the callback for now since the TX part cannot yet sleep, and drivers' locking might require it. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: fix RX aggregation timerJohannes Berg
I noticed that when there was _no_ traffic at all on a given aggregation session, it would never time out. This won't happen unless you forced creating a session, but fix it anyway. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: defer RX agg session teardown to workJohannes Berg
Since we want the code to be able to sleep in the future, it must not be called from the timer directly. To prepare, move it out into the aggregation work. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: move BA session workJohannes Berg
Move the block-ack session works into common code, since it will be needed for RX agg too in the next patches. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: make TX aggregation start/stop request asyncJohannes Berg
When the driver or rate control requests starting or stopping an aggregation session, that currently causes a direct callback into the driver, which could potentially cause locking problems. Also, the functions need to be callable from contexts that cannot sleep, and thus will interfere with making the ampdu_action callback sleeping. To address these issues, add a new work item for each station that will process any start or stop requests out of line. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: refcount aggregation queue stopJohannes Berg
mac80211 currently maintains the ampdu_lock to avoid starting a queue due to one aggregation session while another aggregation session needs the queue stopped. We can do better, however, and instead refcount the queue stops for this particular purpose, thus removing the need for the lock. This will help making ampdu_action able to sleep. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: remove non-irqsafe aggregation callbacksJohannes Berg
The non-irqsafe aggregation start/stop done callbacks are currently only used by ath9k_htc, and can cause callbacks into the driver again. This might lead to locking issues, which will only get worse as we modify locking. To avoid trouble, remove the non-irqsafe versions and change ath9k_htc to use those instead. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: use RCU for TX aggregationJohannes Berg
Currently we allocate some memory for each TX aggregation session and additionally keep a state bitmap indicating the state it is in. By using RCU to protect the pointer, moving the state into the structure and some locking trickery we can avoid locking when the TX agg session is fully operational. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: use RCU for RX aggregationJohannes Berg
Currently we allocate some memory for each RX aggregation session and additionally keep a flag indicating whether or not it is valid. By using RCU to protect the pointer and making sure that the memory is fully set up before it becomes visible to the RX path, we can remove the need for the bool that indicates validity, as well as for locking on the RX path since it is always synchronised against itself, and we can guarantee that all other modifications are done when the structure is not visible to the RX path. The net result is that since we remove locking requirements from the RX path, we can in the future use any kind of lock for the setup and teardown code paths. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: move aggregation callback processingJohannes Berg
This moves the aggregation callback processing to the per-sdata skb queue and a work function rather than the tasklet. Unfortunately, this means that it extends the pkt_type hack to that skb queue. However, it will enable making ampdu_action API changes gradually, my current plan is to get rid of this again by forcing drivers to only return from ampdu_action() when everything is done, thus removing the callbacks completely. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: move blockack stop due to fragmentationJohannes Berg
There's a corner case where we receive a fragmented frame during a blockack session, in which case we will terminate that session. To simplify future work in this area that will culminate in allowing the driver callbacks for aggregation to sleep, move the processing of this case out of the RX path into the interface work. This will simplify future work because the new place for this code doesn't require that the function will always be atomic, which the RX path needs. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: always process blockack action from workqueueJohannes Berg
To prepare for making the ampdu_action callback sleep, make mac80211 always process blockack action frames from the skb queue. This gets rid of the current special case for managed mode interfaces as well. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: pull mgmt frame rx into rx handlerJohannes Berg
Some code is duplicated between ibss, mesh and managed mode regarding the queueing of management frames. Since all modes now use a common skb queue and a common work function, we can pull the queueing code into the rx handler directly and remove the duplicated length checks etc. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: common work skb freeingJohannes Berg
All the management processing functions free the skb after they are done, so this can be done in the new common code instead. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: use common work functionJohannes Berg
Even with the previous patch, IBSS, managed and mesh modes all attach their own work function to the shared work struct, which means some duplicated code. Change that to only have a frame processing function and a further work function for each of them and share some common code. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: use common work structJohannes Berg
IBSS, managed and mesh modes all have their own work struct, and in the future we want to also use it in other modes to process frames from the now common skb queue. This also makes the skb queue and work safe to use from other interface types. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: use common skb queueJohannes Berg
IBSS, managed and mesh modes all have an skb queue, and in the future we want to also use it in other modes, so make them all use a common skb queue already. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14mac80211: simplify station/aggregation codeJohannes Berg
A number of places use RCU locking for accessing the station list, even though they do not need to. Use mutex locking instead to prepare for the locking changes I want to make. The mlme code is also using a WLAN_STA_DISASSOC flag that has the same meaning as WLAN_STA_BLOCK_BA, so use that. While doing so, combine places where we loop over stations twice, and optimise away some of the loops by checking if the hardware supports aggregation at all first. Also fix a more theoretical race condition: right now we could resume, set up an aggregation session, and right after tear it down again due to the code that is needed for hardware reconfiguration here. Also mark add a comment to that code marking it as a workaround. Finally, remove a pointless aggregation disabling loop when an interface is stopped, directly after that we remove all stations from it which will also disable all aggregation sessions that may still be active, and does so in a race-free way unlike the current loop that doesn't block new sessions. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14cfg80211/mac80211: allow action frame TX/RX in IBSSJohannes Berg
When in IBSS mode, currently action frame TX and RX cannot be used. Allow using it to talk to any peer, or for public action frames. Also, while at it, restructure the code in mac80211 to make it easier to add this for other interface types in the future. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14netfilter: defrag: kill unused work parameter of frag_kfree_skb()Shan Wei
The parameter (work) is unused, remove it. Reported from Eric Dumazet. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-14netfilter: defrag: remove one redundant atomic opsShan Wei
Instead of doing one atomic operation per frag, we can factorize them. Reported from Eric Dumazet. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-14netfilter: kill redundant check code in which setting ip_summed valueShan Wei
If the returned csum value is 0, We has set ip_summed with CHECKSUM_UNNECESSARY flag in __skb_checksum_complete_head(). So this patch kills the check and changes to return to upper caller directly. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-14netfilter: nfnetlink_log: RCU conversion, part 2Eric Dumazet
- must use atomic_inc_not_zero() in instance_lookup_get() - must use hlist_add_head_rcu() instead of hlist_add_head() - must use hlist_del_rcu() instead of hlist_del() - Introduce NFULNL_COPY_DISABLED to stop lockless reader from using an instance, before we do final instance_put() on it. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-13net: rxhash already set in __copy_skb_headerEric Dumazet
No need to copy rxhash again in __skb_clone() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-13net: fix deliver_no_wcard regression on loopback deviceJohn Fastabend
deliver_no_wcard is not being set in skb_copy_header. In the skb_cloned case it is not being cleared and may cause the skb to be dropped when the loopback device pushes it back up the stack. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-12irttp: Print device parameters and statistics as unsignedBen Hutchings
Device statistics have type unsigned long and several of the device-specific parameters printed here have type __u32. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-12net: Enable 64-bit net device statistics on 32-bit architecturesBen Hutchings
Use struct rtnl_link_stats64 as the statistics structure. On 32-bit architectures, insert 32 bits of padding after/before each field of struct net_device_stats to make its layout compatible with struct rtnl_link_stats64. Add an anonymous union in net_device; move stats into the union and add struct rtnl_link_stats64 stats64. Add net_device_ops::ndo_get_stats64, implementations of which will return a pointer to struct rtnl_link_stats64. Drivers that implement this operation must not update the structure asynchronously. Change dev_get_stats() to call ndo_get_stats64 if available, and to return a pointer to struct rtnl_link_stats64. Change callers of dev_get_stats() accordingly. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-11pktgen: increasing transmission granularityDaniel Turull
This patch increases the granularity of the rate generated by pktgen. The previous version of pktgen uses micro seconds (udelay) resolution when it was delayed causing gaps in the rates. It is changed to nanosecond (ndelay). Now any rate is possible. Also it allows to set, the desired rate in Mb/s or packets per second. The documentation has been updated. Signed-off-by: Daniel Turull <daniel.turull@gmail.com> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-11econet: fix lockingEric Dumazet
econet lacks proper locking. It holds econet_lock only when inserting or deleting an entry in econet_sklist, not during lookups. - convert econet_lock from rwlock to spinlock - use econet_lock in ec_listening_socket() lookup - use appropriate sock_hold() / sock_put() to avoid corruptions. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-11pkt_sched: gen_kill_estimator() rcu fixesEric Dumazet
gen_kill_estimator() API is incomplete or not well documented, since caller should make sure an RCU grace period is respected before freeing stats_lock. This was partially addressed in commit 5d944c640b4 (gen_estimator: deadlock fix), but same problem exist for all gen_kill_estimator() users, if lock they use is not already RCU protected. A code review shows xt_RATEEST.c, act_api.c, act_police.c have this problem. Other are ok because they use qdisc lock, already RCU protected. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-11Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2010-06-11Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 Conflicts: drivers/net/wireless/wl12xx/wl1271.h drivers/net/wireless/wl12xx/wl1271_cmd.h
2010-06-10net-next: remove useless union keywordChangli Gao
remove useless union keyword in rtable, rt6_info and dn_route. Since there is only one member in a union, the union keyword isn't useful. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-10pktgen: Fix accuracy of inter-packet delay.Daniel Turull
This patch correct a bug in the delay of pktgen. It makes sure the inter-packet interval is accurate. Signed-off-by: Daniel Turull <daniel.turull@gmail.com> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-10pkt_sched: gen_estimator: add a new lockEric Dumazet
gen_kill_estimator() / gen_new_estimator() is not always called with RTNL held. net/netfilter/xt_RATEEST.c is one user of these API that do not hold RTNL, so random corruptions can occur between "tc" and "iptables". Add a new fine grained lock instead of trying to use RTNL in netfilter. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-10ip: ip_ra_control() rcu fixEric Dumazet
commit 66018506e15b (ip: Router Alert RCU conversion) introduced RCU lookups to ip_call_ra_chain(). It missed proper deinit phase : When ip_ra_control() deletes an ip_ra_chain, it should make sure ip_call_ra_chain() users can not start to use socket during the rcu grace period. It should also delay the sock_put() after the grace period, or we risk a premature socket freeing and corruptions, as raw sockets are not rcu protected yet. This delay avoids using expensive atomic_inc_not_zero() in ip_call_ra_chain(). Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-10net: deliver skbs on inactive slaves to exact matchesJohn Fastabend
Currently, the accelerated receive path for VLAN's will drop packets if the real device is an inactive slave and is not one of the special pkts tested for in skb_bond_should_drop(). This behavior is different then the non-accelerated path and for pkts over a bonded vlan. For example, vlanx -> bond0 -> ethx will be dropped in the vlan path and not delivered to any packet handlers at all. However, bond0 -> vlanx -> ethx and bond0 -> ethx will be delivered to handlers that match the exact dev, because the VLAN path checks the real_dev which is not a slave and netif_recv_skb() doesn't drop frames but only delivers them to exact matches. This patch adds a sk_buff flag which is used for tagging skbs that would previously been dropped and allows the skb to continue to skb_netif_recv(). Here we add logic to check for the deliver_no_wcard flag and if it is set only deliver to handlers that match exactly. This makes both paths above consistent and gives pkt handlers a way to identify skbs that come from inactive slaves. Without this patch in some configurations skbs will be delivered to handlers with exact matches and in others be dropped out right in the vlan path. I have tested the following 4 configurations in failover modes and load balancing modes. # bond0 -> ethx # vlanx -> bond0 -> ethx # bond0 -> vlanx -> ethx # bond0 -> ethx | vlanx -> -- Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-09ipv6: fix ICMP6_MIB_OUTERRORSEric Dumazet
In commit 1f8438a85366 (icmp: Account for ICMP out errors), I did a typo on IPV6 side, using ICMP6_MIB_OUTMSGS instead of ICMP6_MIB_OUTERRORS Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-09ipv6: mcast: RCU conversionsEric Dumazet
- ipv6_sock_mc_join() : doesnt touch dev refcount - ipv6_sock_mc_drop() : doesnt touch dev/idev refcounts - ip6_mc_find_dev() becomes ip6_mc_find_dev_rcu() (called from rcu), and doesnt touch dev/idev refcounts - ipv6_sock_mc_close() : doesnt touch dev/idev refcounts - ip6_mc_source() uses ip6_mc_find_dev_rcu() - ip6_mc_msfilter() uses ip6_mc_find_dev_rcu() - ip6_mc_msfget() uses ip6_mc_find_dev_rcu() - ipv6_dev_mc_dec(), ipv6_chk_mcast_addr(), igmp6_event_query(), igmp6_event_report(), mld_sendpack(), igmp6_send() dont touch idev refcount Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-09icmp: RCU conversion in icmp_address_reply()Eric Dumazet
- rcu_read_lock() already held by caller - use __in_dev_get_rcu() instead of in_dev_get() / in_dev_put() - remove goto out; Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-09Merge branch 'num_rx_queues' of git://kernel.ubuntu.com/rtg/net-2.6David S. Miller
2010-06-09caif: fix a couple range checksDan Carpenter
The extra ! character means that these conditions are always false. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-09phonet: use call_rcu for phonet device freeJiri Pirko
Use call_rcu rather than synchronize_rcu. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-09net: Print num_rx_queues imbalance warning only when there are allocated queuesTim Gardner
BugLink: http://bugs.launchpad.net/bugs/591416 There are a number of network drivers (bridge, bonding, etc) that are not yet receive multi-queue enabled and use alloc_netdev(), so don't print a num_rx_queues imbalance warning in that case. Also, only print the warning once for those drivers that _are_ multi-queue enabled. Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
2010-06-09Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2010-06-09netfilter: nfnetlink_log: RCU conversionEric Dumazet
- instances_lock becomes a spinlock - lockless lookups While nfnetlink_log probably not performance critical, using less rwlocks in our code is always welcomed... Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-09netfilter: nfnetlink_queue: some optimizationsEric Dumazet
- Use an atomic_t for id_sequence to avoid a spin_lock/spin_unlock pair - Group highly modified struct nfqnl_instance fields together Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-09netfilter: ip6_queue: rwlock to spinlock conversionEric Dumazet
Converts queue_lock rwlock to a spinlock. (readlocked part can be changed by reads of integer values) One atomic operation instead of four per ipq_enqueue_packet() call. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-09ipvs: Add missing locking during connection table hashing and unhashingSven Wegener
The code that hashes and unhashes connections from the connection table is missing locking of the connection being modified, which opens up a race condition and results in memory corruption when this race condition is hit. Here is what happens in pretty verbose form: CPU 0 CPU 1 ------------ ------------ An active connection is terminated and we schedule ip_vs_conn_expire() on this CPU to expire this connection. IRQ assignment is changed to this CPU, but the expire timer stays scheduled on the other CPU. New connection from same ip:port comes in right before the timer expires, we find the inactive connection in our connection table and get a reference to it. We proper lock the connection in tcp_state_transition() and read the connection flags in set_tcp_state(). ip_vs_conn_expire() gets called, we unhash the connection from our connection table and remove the hashed flag in ip_vs_conn_unhash(), without proper locking! While still holding proper locks we write the connection flags in set_tcp_state() and this sets the hashed flag again. ip_vs_conn_expire() fails to expire the connection, because the other CPU has incremented the reference count. We try to re-insert the connection into our connection table, but this fails in ip_vs_conn_hash(), because the hashed flag has been set by the other CPU. We re-schedule execution of ip_vs_conn_expire(). Now this connection has the hashed flag set, but isn't actually hashed in our connection table and has a dangling list_head. We drop the reference we held on the connection and schedule the expire timer for timeouting the connection on this CPU. Further packets won't be able to find this connection in our connection table. ip_vs_conn_expire() gets called again, we think it's already hashed, but the list_head is dangling and while removing the connection from our connection table we write to the memory location where this list_head points to. The result will probably be a kernel oops at some other point in time. This race condition is pretty subtle, but it can be triggered remotely. It needs the IRQ assignment change or another circumstance where packets coming from the same ip:port for the same service are being processed on different CPUs. And it involves hitting the exact time at which ip_vs_conn_expire() gets called. It can be avoided by making sure that all packets from one connection are always processed on the same CPU and can be made harder to exploit by changing the connection timeouts to some custom values. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Cc: stable@kernel.org Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>