linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2010-06-14	mac80211: defer TX agg session teardown to work	Johannes Berg
	Since we want the code to be able to sleep in the future, it must not be called from the timer directly. To achieve that, simply call the function drivers would call, and also use RCU in the timer to get the struct so we don't need to rely on the spinlock in the future. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: change RX aggregation locking	Johannes Berg
	To prepare for allowing drivers to sleep in ampdu_action, change the locking in the RX aggregation code to use a mutex, so that it would already allow drivers to sleep. But explicitly disable BHs around the callback for now since the TX part cannot yet sleep, and drivers' locking might require it. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: fix RX aggregation timer	Johannes Berg
	I noticed that when there was _no_ traffic at all on a given aggregation session, it would never time out. This won't happen unless you forced creating a session, but fix it anyway. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: defer RX agg session teardown to work	Johannes Berg
	Since we want the code to be able to sleep in the future, it must not be called from the timer directly. To prepare, move it out into the aggregation work. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: move BA session work	Johannes Berg
	Move the block-ack session works into common code, since it will be needed for RX agg too in the next patches. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: make TX aggregation start/stop request async	Johannes Berg
	When the driver or rate control requests starting or stopping an aggregation session, that currently causes a direct callback into the driver, which could potentially cause locking problems. Also, the functions need to be callable from contexts that cannot sleep, and thus will interfere with making the ampdu_action callback sleeping. To address these issues, add a new work item for each station that will process any start or stop requests out of line. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: refcount aggregation queue stop	Johannes Berg
	mac80211 currently maintains the ampdu_lock to avoid starting a queue due to one aggregation session while another aggregation session needs the queue stopped. We can do better, however, and instead refcount the queue stops for this particular purpose, thus removing the need for the lock. This will help making ampdu_action able to sleep. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: remove non-irqsafe aggregation callbacks	Johannes Berg
	The non-irqsafe aggregation start/stop done callbacks are currently only used by ath9k_htc, and can cause callbacks into the driver again. This might lead to locking issues, which will only get worse as we modify locking. To avoid trouble, remove the non-irqsafe versions and change ath9k_htc to use those instead. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: use RCU for TX aggregation	Johannes Berg
	Currently we allocate some memory for each TX aggregation session and additionally keep a state bitmap indicating the state it is in. By using RCU to protect the pointer, moving the state into the structure and some locking trickery we can avoid locking when the TX agg session is fully operational. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: use RCU for RX aggregation	Johannes Berg
	Currently we allocate some memory for each RX aggregation session and additionally keep a flag indicating whether or not it is valid. By using RCU to protect the pointer and making sure that the memory is fully set up before it becomes visible to the RX path, we can remove the need for the bool that indicates validity, as well as for locking on the RX path since it is always synchronised against itself, and we can guarantee that all other modifications are done when the structure is not visible to the RX path. The net result is that since we remove locking requirements from the RX path, we can in the future use any kind of lock for the setup and teardown code paths. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: move aggregation callback processing	Johannes Berg
	This moves the aggregation callback processing to the per-sdata skb queue and a work function rather than the tasklet. Unfortunately, this means that it extends the pkt_type hack to that skb queue. However, it will enable making ampdu_action API changes gradually, my current plan is to get rid of this again by forcing drivers to only return from ampdu_action() when everything is done, thus removing the callbacks completely. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: move blockack stop due to fragmentation	Johannes Berg
	There's a corner case where we receive a fragmented frame during a blockack session, in which case we will terminate that session. To simplify future work in this area that will culminate in allowing the driver callbacks for aggregation to sleep, move the processing of this case out of the RX path into the interface work. This will simplify future work because the new place for this code doesn't require that the function will always be atomic, which the RX path needs. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: always process blockack action from workqueue	Johannes Berg
	To prepare for making the ampdu_action callback sleep, make mac80211 always process blockack action frames from the skb queue. This gets rid of the current special case for managed mode interfaces as well. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: pull mgmt frame rx into rx handler	Johannes Berg
	Some code is duplicated between ibss, mesh and managed mode regarding the queueing of management frames. Since all modes now use a common skb queue and a common work function, we can pull the queueing code into the rx handler directly and remove the duplicated length checks etc. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: common work skb freeing	Johannes Berg
	All the management processing functions free the skb after they are done, so this can be done in the new common code instead. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: use common work function	Johannes Berg
	Even with the previous patch, IBSS, managed and mesh modes all attach their own work function to the shared work struct, which means some duplicated code. Change that to only have a frame processing function and a further work function for each of them and share some common code. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: use common work struct	Johannes Berg
	IBSS, managed and mesh modes all have their own work struct, and in the future we want to also use it in other modes to process frames from the now common skb queue. This also makes the skb queue and work safe to use from other interface types. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: use common skb queue	Johannes Berg
	IBSS, managed and mesh modes all have an skb queue, and in the future we want to also use it in other modes, so make them all use a common skb queue already. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	mac80211: simplify station/aggregation code	Johannes Berg
	A number of places use RCU locking for accessing the station list, even though they do not need to. Use mutex locking instead to prepare for the locking changes I want to make. The mlme code is also using a WLAN_STA_DISASSOC flag that has the same meaning as WLAN_STA_BLOCK_BA, so use that. While doing so, combine places where we loop over stations twice, and optimise away some of the loops by checking if the hardware supports aggregation at all first. Also fix a more theoretical race condition: right now we could resume, set up an aggregation session, and right after tear it down again due to the code that is needed for hardware reconfiguration here. Also mark add a comment to that code marking it as a workaround. Finally, remove a pointless aggregation disabling loop when an interface is stopped, directly after that we remove all stations from it which will also disable all aggregation sessions that may still be active, and does so in a race-free way unlike the current loop that doesn't block new sessions. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	cfg80211/mac80211: allow action frame TX/RX in IBSS	Johannes Berg
	When in IBSS mode, currently action frame TX and RX cannot be used. Allow using it to talk to any peer, or for public action frames. Also, while at it, restructure the code in mac80211 to make it easier to add this for other interface types in the future. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-14	netfilter: defrag: kill unused work parameter of frag_kfree_skb()	Shan Wei
	The parameter (work) is unused, remove it. Reported from Eric Dumazet. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-14	netfilter: defrag: remove one redundant atomic ops	Shan Wei
	Instead of doing one atomic operation per frag, we can factorize them. Reported from Eric Dumazet. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-14	netfilter: kill redundant check code in which setting ip_summed value	Shan Wei
	If the returned csum value is 0, We has set ip_summed with CHECKSUM_UNNECESSARY flag in __skb_checksum_complete_head(). So this patch kills the check and changes to return to upper caller directly. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-14	netfilter: nfnetlink_log: RCU conversion, part 2	Eric Dumazet
	- must use atomic_inc_not_zero() in instance_lookup_get() - must use hlist_add_head_rcu() instead of hlist_add_head() - must use hlist_del_rcu() instead of hlist_del() - Introduce NFULNL_COPY_DISABLED to stop lockless reader from using an instance, before we do final instance_put() on it. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-13	net: rxhash already set in __copy_skb_header	Eric Dumazet
	No need to copy rxhash again in __skb_clone() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-13	net: fix deliver_no_wcard regression on loopback device	John Fastabend
	deliver_no_wcard is not being set in skb_copy_header. In the skb_cloned case it is not being cleared and may cause the skb to be dropped when the loopback device pushes it back up the stack. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-12	irttp: Print device parameters and statistics as unsigned	Ben Hutchings
	Device statistics have type unsigned long and several of the device-specific parameters printed here have type __u32. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-12	net: Enable 64-bit net device statistics on 32-bit architectures	Ben Hutchings
	Use struct rtnl_link_stats64 as the statistics structure. On 32-bit architectures, insert 32 bits of padding after/before each field of struct net_device_stats to make its layout compatible with struct rtnl_link_stats64. Add an anonymous union in net_device; move stats into the union and add struct rtnl_link_stats64 stats64. Add net_device_ops::ndo_get_stats64, implementations of which will return a pointer to struct rtnl_link_stats64. Drivers that implement this operation must not update the structure asynchronously. Change dev_get_stats() to call ndo_get_stats64 if available, and to return a pointer to struct rtnl_link_stats64. Change callers of dev_get_stats() accordingly. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-11	pktgen: increasing transmission granularity	Daniel Turull
	This patch increases the granularity of the rate generated by pktgen. The previous version of pktgen uses micro seconds (udelay) resolution when it was delayed causing gaps in the rates. It is changed to nanosecond (ndelay). Now any rate is possible. Also it allows to set, the desired rate in Mb/s or packets per second. The documentation has been updated. Signed-off-by: Daniel Turull <daniel.turull@gmail.com> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-11	econet: fix locking	Eric Dumazet
	econet lacks proper locking. It holds econet_lock only when inserting or deleting an entry in econet_sklist, not during lookups. - convert econet_lock from rwlock to spinlock - use econet_lock in ec_listening_socket() lookup - use appropriate sock_hold() / sock_put() to avoid corruptions. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-11	pkt_sched: gen_kill_estimator() rcu fixes	Eric Dumazet
	gen_kill_estimator() API is incomplete or not well documented, since caller should make sure an RCU grace period is respected before freeing stats_lock. This was partially addressed in commit 5d944c640b4 (gen_estimator: deadlock fix), but same problem exist for all gen_kill_estimator() users, if lock they use is not already RCU protected. A code review shows xt_RATEEST.c, act_api.c, act_police.c have this problem. Other are ok because they use qdisc lock, already RCU protected. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-11	Merge branch 'master' of ↵	David S. Miller
	master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2010-06-11	Merge branch 'master' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 Conflicts: drivers/net/wireless/wl12xx/wl1271.h drivers/net/wireless/wl12xx/wl1271_cmd.h
2010-06-10	net-next: remove useless union keyword	Changli Gao
	remove useless union keyword in rtable, rt6_info and dn_route. Since there is only one member in a union, the union keyword isn't useful. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-10	pktgen: Fix accuracy of inter-packet delay.	Daniel Turull
	This patch correct a bug in the delay of pktgen. It makes sure the inter-packet interval is accurate. Signed-off-by: Daniel Turull <daniel.turull@gmail.com> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-10	pkt_sched: gen_estimator: add a new lock	Eric Dumazet
	gen_kill_estimator() / gen_new_estimator() is not always called with RTNL held. net/netfilter/xt_RATEEST.c is one user of these API that do not hold RTNL, so random corruptions can occur between "tc" and "iptables". Add a new fine grained lock instead of trying to use RTNL in netfilter. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-10	ip: ip_ra_control() rcu fix	Eric Dumazet
	commit 66018506e15b (ip: Router Alert RCU conversion) introduced RCU lookups to ip_call_ra_chain(). It missed proper deinit phase : When ip_ra_control() deletes an ip_ra_chain, it should make sure ip_call_ra_chain() users can not start to use socket during the rcu grace period. It should also delay the sock_put() after the grace period, or we risk a premature socket freeing and corruptions, as raw sockets are not rcu protected yet. This delay avoids using expensive atomic_inc_not_zero() in ip_call_ra_chain(). Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-10	net: deliver skbs on inactive slaves to exact matches	John Fastabend
	Currently, the accelerated receive path for VLAN's will drop packets if the real device is an inactive slave and is not one of the special pkts tested for in skb_bond_should_drop(). This behavior is different then the non-accelerated path and for pkts over a bonded vlan. For example, vlanx -> bond0 -> ethx will be dropped in the vlan path and not delivered to any packet handlers at all. However, bond0 -> vlanx -> ethx and bond0 -> ethx will be delivered to handlers that match the exact dev, because the VLAN path checks the real_dev which is not a slave and netif_recv_skb() doesn't drop frames but only delivers them to exact matches. This patch adds a sk_buff flag which is used for tagging skbs that would previously been dropped and allows the skb to continue to skb_netif_recv(). Here we add logic to check for the deliver_no_wcard flag and if it is set only deliver to handlers that match exactly. This makes both paths above consistent and gives pkt handlers a way to identify skbs that come from inactive slaves. Without this patch in some configurations skbs will be delivered to handlers with exact matches and in others be dropped out right in the vlan path. I have tested the following 4 configurations in failover modes and load balancing modes. # bond0 -> ethx # vlanx -> bond0 -> ethx # bond0 -> vlanx -> ethx # bond0 -> ethx \| vlanx -> -- Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-09	ipv6: fix ICMP6_MIB_OUTERRORS	Eric Dumazet
	In commit 1f8438a85366 (icmp: Account for ICMP out errors), I did a typo on IPV6 side, using ICMP6_MIB_OUTMSGS instead of ICMP6_MIB_OUTERRORS Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-09	ipv6: mcast: RCU conversions	Eric Dumazet
	- ipv6_sock_mc_join() : doesnt touch dev refcount - ipv6_sock_mc_drop() : doesnt touch dev/idev refcounts - ip6_mc_find_dev() becomes ip6_mc_find_dev_rcu() (called from rcu), and doesnt touch dev/idev refcounts - ipv6_sock_mc_close() : doesnt touch dev/idev refcounts - ip6_mc_source() uses ip6_mc_find_dev_rcu() - ip6_mc_msfilter() uses ip6_mc_find_dev_rcu() - ip6_mc_msfget() uses ip6_mc_find_dev_rcu() - ipv6_dev_mc_dec(), ipv6_chk_mcast_addr(), igmp6_event_query(), igmp6_event_report(), mld_sendpack(), igmp6_send() dont touch idev refcount Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-09	icmp: RCU conversion in icmp_address_reply()	Eric Dumazet
	- rcu_read_lock() already held by caller - use __in_dev_get_rcu() instead of in_dev_get() / in_dev_put() - remove goto out; Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-09	Merge branch 'num_rx_queues' of git://kernel.ubuntu.com/rtg/net-2.6	David S. Miller

2010-06-09	caif: fix a couple range checks	Dan Carpenter
	The extra ! character means that these conditions are always false. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-09	phonet: use call_rcu for phonet device free	Jiri Pirko
	Use call_rcu rather than synchronize_rcu. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-09	net: Print num_rx_queues imbalance warning only when there are allocated queues	Tim Gardner
	BugLink: http://bugs.launchpad.net/bugs/591416 There are a number of network drivers (bridge, bonding, etc) that are not yet receive multi-queue enabled and use alloc_netdev(), so don't print a num_rx_queues imbalance warning in that case. Also, only print the warning once for those drivers that _are_ multi-queue enabled. Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
2010-06-09	Merge branch 'master' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2010-06-09	netfilter: nfnetlink_log: RCU conversion	Eric Dumazet
	- instances_lock becomes a spinlock - lockless lookups While nfnetlink_log probably not performance critical, using less rwlocks in our code is always welcomed... Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-09	netfilter: nfnetlink_queue: some optimizations	Eric Dumazet
	- Use an atomic_t for id_sequence to avoid a spin_lock/spin_unlock pair - Group highly modified struct nfqnl_instance fields together Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-09	netfilter: ip6_queue: rwlock to spinlock conversion	Eric Dumazet
	Converts queue_lock rwlock to a spinlock. (readlocked part can be changed by reads of integer values) One atomic operation instead of four per ipq_enqueue_packet() call. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-09	ipvs: Add missing locking during connection table hashing and unhashing	Sven Wegener
	The code that hashes and unhashes connections from the connection table is missing locking of the connection being modified, which opens up a race condition and results in memory corruption when this race condition is hit. Here is what happens in pretty verbose form: CPU 0 CPU 1 ------------ ------------ An active connection is terminated and we schedule ip_vs_conn_expire() on this CPU to expire this connection. IRQ assignment is changed to this CPU, but the expire timer stays scheduled on the other CPU. New connection from same ip:port comes in right before the timer expires, we find the inactive connection in our connection table and get a reference to it. We proper lock the connection in tcp_state_transition() and read the connection flags in set_tcp_state(). ip_vs_conn_expire() gets called, we unhash the connection from our connection table and remove the hashed flag in ip_vs_conn_unhash(), without proper locking! While still holding proper locks we write the connection flags in set_tcp_state() and this sets the hashed flag again. ip_vs_conn_expire() fails to expire the connection, because the other CPU has incremented the reference count. We try to re-insert the connection into our connection table, but this fails in ip_vs_conn_hash(), because the hashed flag has been set by the other CPU. We re-schedule execution of ip_vs_conn_expire(). Now this connection has the hashed flag set, but isn't actually hashed in our connection table and has a dangling list_head. We drop the reference we held on the connection and schedule the expire timer for timeouting the connection on this CPU. Further packets won't be able to find this connection in our connection table. ip_vs_conn_expire() gets called again, we think it's already hashed, but the list_head is dangling and while removing the connection from our connection table we write to the memory location where this list_head points to. The result will probably be a kernel oops at some other point in time. This race condition is pretty subtle, but it can be triggered remotely. It needs the IRQ assignment change or another circumstance where packets coming from the same ip:port for the same service are being processed on different CPUs. And it involves hitting the exact time at which ip_vs_conn_expire() gets called. It can be avoided by making sure that all packets from one connection are always processed on the same CPU and can be made harder to exploit by changing the connection timeouts to some custom values. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Cc: stable@kernel.org Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>