Age | Commit message | Author
2009-06-10 | mac80211: disable PS while probing AP | Johannes Berg
When associated, but probing the AP because we detected beacon loss, we need to disable powersave to be able to receive the probe response. Change the code to do that by checking whether we're trying to probe when determining the possibility of going into PS, and recalculate the PS ability at the necessary spots. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | mac80211: disable moving between PS modes during scan | Luis R. Rodriguez
We don't want to trigger moving between PS modes during scan, because then we will sometimes end up sending nullfunc frames during scan. We're supposed to only send one prior to scan and after scan. This fixes an oops which occurred due to an assert in ath9k: http://marc.info/?l=linux-wireless&m=124277331319024 The assert was happening because the rate control algorithm figures it should find at least one valid dual stream or single stream rate. Since we allow mac80211 to send nullfunc frames during scan and dynamic PS was enabled at times, we ended up trying to send nullfunc frames for the target sta on the wrong band, for which we have no valid rate to communicate with it. This breaks the assumptions in rate control. We determined that we also need to disable moving between PS modes when not associated, so let's just add that now as well, and we should not have a ps_sdata when that interface cannot actually go into PS because it's not associated. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | ath5k: minor rfkill cleanup | Bob Copeland
Always enable rfkill since the ifdefs in the code are not really worth the Kconfig option. Also fix a few code style things, and remove the usage of the ah_gpio[] array so we can remove it later. Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | mac80211: clean up return value of __ieee80211_parse_tx_radiotap | Johannes Berg
The return type has more than two values, but it can validly only ever return TX_DROP and TX_CONTINUE, so use a bool instead of ieee80211_tx_result. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | mac80211: don't use master netdev name | Johannes Berg
Always use the wiphy name instead. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | ath9k: Fix tx stuck when connected to aggr disabled HT AP | Vasanthakumar Thiagarajan
This patch, along with my previous patch in mac80211 "Fix the way ADDBA count..", fixes a hang in tx when connected to an HT AP which rejects or times out on an addba req. AGGR_ADDBA_PROGRESS should be cleared in aggr state when addba negotiation is terminated because either the addba response timed out or the addba was denied by the AP. Without clearing this bit, all frames are queued onto the s/w queue for getting tx'd as aggr and will never be scheduled onto the hw queue. Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | mac80211: Fix the way ADDBA request count being modified | Vasanthakumar Thiagarajan
addba_req_num[tid] is supposed to have the count of consecutive addba request attempts on 'tid' which failed. This count is checked against a retry threshold (3 times) before starting the addba negotiation. This patch fixes the way this addba count is incremented/reset and thereby avoids indefinite addba attempts. Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
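The retry accounting described in this message can be sketched in plain C. This is a minimal illustration under stated assumptions, not the mac80211 code: `ADDBA_MAX_RETRIES`, `NUM_TIDS`, `struct addba_state` and the three helpers are hypothetical names. Only the threshold of 3 and the idea of a per-TID count of consecutive failed attempts come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

#define ADDBA_MAX_RETRIES 3   /* retry threshold named in the commit message */
#define NUM_TIDS 16           /* assumed number of TIDs, for illustration only */

/* Hypothetical per-station state: consecutive failed ADDBA requests per TID. */
struct addba_state {
    unsigned char req_num[NUM_TIDS];
};

/* A new negotiation may start only while the failure count is below the threshold. */
bool addba_may_start(const struct addba_state *st, unsigned int tid)
{
    assert(tid < NUM_TIDS);
    return st->req_num[tid] < ADDBA_MAX_RETRIES;
}

/* Count a failure (timeout or rejection); saturate so the counter cannot wrap. */
void addba_failed(struct addba_state *st, unsigned int tid)
{
    assert(tid < NUM_TIDS);
    if (st->req_num[tid] < ADDBA_MAX_RETRIES)
        st->req_num[tid]++;
}

/* A successful negotiation ends the run of consecutive failures. */
void addba_succeeded(struct addba_state *st, unsigned int tid)
{
    assert(tid < NUM_TIDS);
    st->req_num[tid] = 0;
}
```

With this shape, three failures in a row on one TID stop further attempts on that TID only, and a later success makes it eligible again.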
2009-06-10 | cfg80211: fix for duplicate response for driver reg request | Luis R. Rodriguez
As Pavel puts it, userspace can be stupid and should not cause kernel crashes. In this case Pavel was able to find a crash here but was unable to reproduce it. Either way, let's deal with this. This should fix:
------------[ cut here ]------------
kernel BUG at /home/proski/src/linux-2.6/net/wireless/reg.c:2132!
Oops: Exception in kernel mode, sig: 5 [#1]
PowerMac
Modules linked in: ath5k ath [last unloaded: scsi_wait_scan]
NIP: c02f3eac LR: c02f3d08 CTR: 00000000
REGS: ef107aa0 TRAP: 0700 Not tainted (2.6.30-rc8-wl)
MSR: 00029032 <EE,ME,CE,IR,DR> CR: 88002442 XER: 20000000
TASK = ef84acb0[834] 'crda' THREAD: ef106000
GPR00: ef953840 ef107b50 ef84acb0 ef1380bc 00000006 c035a5c8 ef107b90 c035a5c8
GPR08: 00080005 efb68980 c0445628 ef130004 28002422 10019ce0 10012d3c 00000001
GPR16: 1070b2ac 00000005 48023558 1070b380 4802304c 00000000 ef107ddc c035a5c8
GPR24: ef107b78 c0443350 ef8bcb00 00000005 ef138080 c04a6a70 c04a0000 ef8bcb00
NIP [c02f3eac] set_regdom+0x4c4/0x4ec
LR [c02f3d08] set_regdom+0x320/0x4ec
Call Trace:
[ef107b50] [c02f3d08] set_regdom+0x320/0x4ec (unreliable)
[ef107b70] [c02f9d10] nl80211_set_reg+0x140/0x2d0
[ef107bc0] [c02aa2b8] genl_rcv_msg+0x204/0x228
[ef107c10] [c02a97cc] netlink_rcv_skb+0xe8/0x10c
[ef107c30] [c02aa094] genl_rcv+0x3c/0x5c
[ef107c40] [c02a9050] netlink_unicast+0x308/0x36c
[ef107c80] [c02a92bc] netlink_sendmsg+0x208/0x2f0
[ef107cd0] [c0282048] sock_sendmsg+0xac/0xe4
[ef107db0] [c02822b4] sys_sendmsg+0x234/0x2d8
[ef107f00] [c0283a88] sys_socketcall+0x108/0x258
[ef107f40] [c0012790] ret_from_syscall+0x0/0x38
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | b43: Add fw capabilities | Michael Buesch
Add automagic feature flags, so the firmware can tell the driver about supported features and the driver can switch features on/off as needed. Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> Tested-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | rfkill: don't impose global states on resume (just restore the previous states) | Alan Jenkins
Once rfkill-input is disabled, the "global" states will only be used as default initial states. Since the states will always be the same after resume, we shouldn't generate events on resume. Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | sony-laptop: no need to unblock rfkill on load | Alan Jenkins
The re-written rfkill core ensures rfkill devices are initialized to the system default state. The core calls set_block after registration so the driver shouldn't need to. Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | rfkill: remove set_global_sw_state | Alan Jenkins
rfkill_set_global_sw_state() (previously rfkill_set_default()) will no longer be exported by the rewritten rfkill core. Instead, platform drivers which can provide persistent soft-rfkill state across power-down/reboot should indicate their initial state by calling rfkill_set_sw_state() before registration. Otherwise, they will be initialized to a default value during registration by a set_block call. We remove existing calls to rfkill_set_sw_state() which happen before registration, since these had no effect in the old model. If these drivers do have persistent state, the calls can be put back (subject to testing :-). This affects hp-wmi and acer-wmi. Drivers with persistent state will affect the global state only if rfkill-input is enabled. This is required, otherwise booting with wireless soft-blocked and pressing the wireless-toggle key once would have no apparent effect. This special case will be removed in future along with rfkill-input, in favour of a more flexible userspace daemon (see Documentation/feature-removal-schedule.txt). Now rfkill_global_states[n].def is only used to preserve global states over EPO, it is renamed to ".sav". Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | mac80211: do not pass PS frames out of mac80211 again | Johannes Berg
In order to handle powersave frames properly we needed to pass these out to the device queues again, and introduce the skb->requeue bit. This, however, adds unnecessary overhead because already tried frames need to be 'cleaned up', and this clean-up code is also buggy when software encryption is used. Instead of sending the frames via the master netdev queue again, simply put them into the pending queue. This also fixes a problem where frames for that particular station could be reordered when some were still on the software queues and older ones are re-injected into the software queue after them. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | sony: fix rfkill code | Johannes Berg
During the rfkill conversion I added code to call sony_nc_rfkill_set with the wrong argument, causing a segfault Reinette reported. The compiler could not catch that because the argument is, and needs to be, void *. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Reported-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | rfkill: remove input Kconfig | Johannes Berg
Now that we added the ioctl, there's no need to ask the user to configure this. We will keep it enabled for now, and eventually swap the default to n. Also let embedded users select it only if they need it. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | b43/legacy: port to cfg80211 rfkill | Johannes Berg
This ports the b43/legacy rfkill code to the new API offered by cfg80211 and thus removes a lot of useless stuff. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | ath5k: added cfg80211 based rfkill support | Tobias Doerffel
This patch introduces initial rfkill support for the ath5k driver based on rfkill support in the cfg80211 framework. All rfkill related code is separated into newly created rfkill.c. Changes to existing code are minimal:
* added a new data structure ath5k_rfkill to the ath5k_softc structure
* inserted calls to HW rfkill init/deinit routines
* ath5k_intr() has been extended to handle AR5K_INT_GPIO interrupts
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | rfkill: print events when input handler is disabled/enabled | Johannes Berg
It is useful for debugging when we know if something disabled the in-kernel rfkill input handler. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | ar9170: xmit code revamp | Christian Lamparter
This patch is a back-port from aggregation testing code. In the past, we didn't limit the amount of active tx urbs. However, ar9170 only has a limited buffer reserved for pending data frames. This wasn't much of a problem with the slower 802.11b/g. We simply stopped the full queue and moved on to something different in the meantime. But - as you guessed it - this simple approach stands in the way of a decent aggregation implementation. Signed-off-by: Christian Lamparter <chunkeey@web.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | ar9170: interpret firmware debug commands | Johannes Berg
This adds new commands that the original firmware will not send but we can use them to debug firmware. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | mac80211 : fix unaligned rx skb | matthieu castet
mac80211 checks whether the skb is aligned on a 32 bit boundary. But it checks against the ethernet header, whereas Linux expects the IP header to be aligned. The ethernet header size is 6*2+2=14, so aligning the ethernet header makes the IP header unaligned. Signed-off-by: Matthieu CASTET <castet.matthieu@free.fr> Signed-off-by: John W. Linville <linville@tuxdriver.com>
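The arithmetic behind this fix can be checked with a few lines of C. This is only a sketch of the address math, not the mac80211 code: `ETH_HDR_LEN`, `is_aligned32()` and `ip_header_addr()` are names made up for the illustration.

```c
#include <assert.h>   /* for the checks mentioned in the note below */
#include <stdint.h>

#define ETH_HDR_LEN 14u   /* 6-byte dst + 6-byte src + 2-byte ethertype */

/* True if addr sits on a 4-byte (32 bit) boundary. */
int is_aligned32(uintptr_t addr)
{
    return (addr & 3u) == 0;
}

/* Address of the IP header for a frame whose ethernet header starts at eth. */
uintptr_t ip_header_addr(uintptr_t eth)
{
    return eth + ETH_HDR_LEN;
}
```

For an ethernet header at the aligned address 0x1000 the IP header lands at 0x100e, which is not 32 bit aligned, so `assert(!is_aligned32(ip_header_addr(0x1000)))` holds. Starting the ethernet header 2 bytes past a boundary, at 0x1002, puts the IP header at 0x1010, which is aligned.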
2009-06-10 | b43: Fix possible unaligned u32 access | Matthieu CASTET
Fix possible unaligned u32 access in b43_generate_plcp_hdr(). Unaligned data is read/written with a u32 pointer instead of using the packed structure. Some versions of gcc ignore the "packed" attribute, if the structure element is accessed through a local pointer. Signed-off-by: Matthieu CASTET <castet.matthieu@free.fr> Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | mac80211: fix minstrel single-rate memory corruption | Bob Copeland
The minstrel rate controller periodically looks up rate indexes in a sampling table. When accessing a specific row and column, minstrel correctly does a bounds check which, on the surface, appears to handle the case where mi->n_rates < 2. However, mi->sample_idx is actually defined as an unsigned, so the right hand side is taken to be a huge positive number when negative, and the check will always fail. Consequently, the RC will overrun the array and cause random memory corruption when communicating with a peer that has only a single rate. The max value of mi->sample_idx is around 25 so casting to int should have no ill effects. Without the change, uptime is a few minutes under load with an AP that has a single hard-coded rate, and both the AP and STA could potentially crash. With the change, both lasted 12 hours with a steady load. Thanks to Ognjen Maric for providing the single-rate clue so I could reproduce this. This fixes http://bugzilla.kernel.org/show_bug.cgi?id=12490 on the regression list (also http://bugzilla.kernel.org/show_bug.cgi?id=13000). Cc: stable@kernel.org Reported-by: Sergey S. Kostyliov <rathamahata@gmail.com> Reported-by: Ognjen Maric <ognjen.maric@gmail.com> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
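The signed/unsigned pitfall described in this message can be reproduced outside the kernel. The sketch below is an assumption-laden illustration, not the minstrel source: the function names are hypothetical, and the comparison is reduced to the form "index greater than number of rates minus two" that the message implies.

```c
#include <assert.h>   /* for the checks mentioned in the note below */

/*
 * Comparison as it behaves when the index is unsigned. The cast spells out
 * what the usual arithmetic conversions do implicitly: the int right-hand
 * side becomes unsigned int, so -1 turns into UINT_MAX.
 */
int needs_wrap_unsigned(unsigned int sample_idx, int n_rates)
{
    return sample_idx > (unsigned int)(n_rates - 2);
}

/* Same check with the index cast to int, as the commit message suggests. */
int needs_wrap_signed(unsigned int sample_idx, int n_rates)
{
    return (int)sample_idx > n_rates - 2;
}
```

With a single rate, `n_rates - 2` is -1. The unsigned form never reports that the index must wrap, so `needs_wrap_unsigned(1, 1)` is 0, while `needs_wrap_signed(1, 1)` is 1. For larger rate counts both forms agree.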
2009-06-10 | net/libertas: remove GPIO-CS handling in SPI interface code | Sebastian Andrzej Siewior
This removes the dependency on GPIO framework and lets the SPI host driver handle the chip select. The SPI host driver is required to keep the CS active for the entire message unless cs_change says otherwise. This patch collects the two/three single SPI transfers into a message. Also the delay in read path in case use_dummy_writes are not used is moved into the SPI host driver. Tested-by: Mike Rapoport <mike@compulab.co.il> Tested-by: Andrey Yurovsky <andrey@cozybit.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Dan Williams <dcbw@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | rndis_wlan: cleanup: rename all rndis_wext* objects to rndis_wlan* | Jussi Kivilinna
Driver used to be named rndis_wext before inclusion to upstream. Since rndis_wlan is being converted to cfg80211, use of rndis_wext* names can be confusing. So rename all rndis_wext to rndis_wlan (as should have been when driver was renamed). Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | rndis_wlan: cleanup: capitalize enum labels | Jussi Kivilinna
Capitalize enum labels as told in Documentation/CodingStyle. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | iwlwifi: port to cfg80211 rfkill | Johannes Berg
This ports the iwlwifi rfkill code to the new API offered by cfg80211 and thus removes a lot of useless stuff. The soft-rfkill is completely removed since that is now handled by setting the interfaces down. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-06-10 | ftrace/documentation: fix typo in function grapher name | Mike Frysinger
The function graph tracer is called just "function_graph" (no trailing "_tracer" needed). Signed-off-by: Mike Frysinger <vapier@gentoo.org> LKML-Reference: <1244623722-6325-1-git-send-email-vapier@gentoo.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-10 | cifs: add addr= mount option alias for ip= | Jeff Layton
When you look in /proc/mounts, the address of the server gets displayed as "addr=". That's really a better option to use anyway since it's more generic. What if we eventually want to support non-IP transports? It also makes CIFS option consistent with the NFS option of the same name. Begin the migration to that option name by adding an alias for ip= called addr=. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-06-10 | Fix btrfs when ACLs are configured out | Al Viro
... otherwise generic_permission() will allow *anything* for all files you don't own and that have some group permissions. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-10 | Btrfs: fdatasync should skip metadata writeout | Hisashi Hifumi
In btrfs, fdatasync and fsync are identical, but fdatasync should skip committing the transaction when inode->i_state has only I_DIRTY_SYNC set, which indicates only atime and/or mtime updates. The following patch improves fdatasync throughput.
--file-block-size=4K --file-total-size=16G --file-test-mode=rndwr --file-fsync-mode=fdatasync run
Results:
-2.6.30-rc8
Test execution summary:
    total time: 1980.6540s
    total number of events: 10001
    total time taken by event execution: 1192.9804
    per-request statistics:
        min: 0.0000s
        avg: 0.1193s
        max: 15.3720s
        approx. 95 percentile: 0.7257s
Threads fairness:
    events (avg/stddev): 625.0625/151.32
    execution time (avg/stddev): 74.5613/9.46
-2.6.30-rc8-patched
Test execution summary:
    total time: 1695.9118s
    total number of events: 10000
    total time taken by event execution: 871.3214
    per-request statistics:
        min: 0.0000s
        avg: 0.0871s
        max: 10.4644s
        approx. 95 percentile: 0.4787s
Threads fairness:
    events (avg/stddev): 625.0000/131.86
    execution time (avg/stddev): 54.4576/8.98
Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-10 | Btrfs: remove crc32c.h and use libcrc32c directly. | David Woodhouse
There's no need to preserve this abstraction; it used to let us use hardware crc32c support directly, but libcrc32c is already doing that for us through the crypto API -- so we're already using the Intel crc32c acceleration where appropriate. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-10 | Btrfs: implement FS_IOC_GETFLAGS/SETFLAGS/GETVERSION | Christoph Hellwig
Add support for the standard attributes set via chattr and read via lsattr. Currently we store the attributes in the flags value in the btrfs inode, but I wonder whether we should split it into two so that we don't have to keep converting between the two formats. Remove the btrfs_clear_flag/btrfs_set_flag/btrfs_test_flag macros as they were confusing the existing code and got in the way of the new additions. Also add the FS_IOC_GETVERSION ioctl for getting i_generation as it's trivial. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-10 | Btrfs: autodetect SSD devices | Chris Mason
During mount, btrfs will check the queue nonrot flag for all the devices found in the FS. If they are all non-rotating, SSD mode is enabled by default. If the FS was mounted with -o nossd, the non-rotating flag is ignored. Signed-off-by: Chris Mason <chris.mason@oracle.com>
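The decision rule in this message is small enough to state as code. The following is a sketch of the rule only, with hypothetical names (`autodetect_ssd`, the `nonrot` array and the `nossd` flag as plain booleans); how btrfs actually reads the queue flag and stores mount options is not shown in the source and is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * nonrot[i] is true when device i reports itself as non-rotating.
 * Returns true when SSD mode would be enabled by default.
 */
bool autodetect_ssd(const bool *nonrot, size_t ndev, bool nossd)
{
    assert(nonrot != NULL || ndev == 0);

    if (nossd)
        return false;       /* -o nossd: the non-rotating flag is ignored */
    if (ndev == 0)
        return false;       /* assumption: no devices means nothing to detect */

    for (size_t i = 0; i < ndev; i++) {
        if (!nonrot[i])
            return false;   /* a single rotating device disables the default */
    }
    return true;
}
```

A filesystem spanning one SSD and one rotating disk therefore stays in the regular mode unless SSD mode is requested explicitly.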
2009-06-10 | Btrfs: add mount -o ssd_spread to spread allocations out | Chris Mason
Some SSDs perform best when reusing block numbers often, while others perform much better when clustering strictly allocates big chunks of unused space. The default mount -o ssd will find rough groupings of blocks where there are a bunch of free blocks that might have some allocated blocks mixed in. mount -o ssd_spread will make sure there are no allocated blocks mixed in. It should perform better on lower end SSDs. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-10 | Btrfs: avoid allocation clusters that are too spread out | Chris Mason
In SSD mode for data, and all the time for metadata the allocator will try to find a cluster of nearby blocks for allocations. This commit adds extra checks to make sure that each free block in the cluster is close to the last one. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-10 | Btrfs: Add mount -o nossd | Chris Mason
This allows you to turn off the ssd mode via remount. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-10 | Btrfs: avoid IO stalls behind congested devices in a multi-device FS | Chris Mason
The btrfs IO submission threads try to service a bunch of devices with a small number of threads. They do a congestion check to try and avoid waiting on requests for a busy device. The checks make sure we've sent a few requests down to a given device just so that we aren't bouncing between busy devices without actually sending down any IO. The counter used to decide if we can switch to the next device is somewhat overloaded. It is also being used to decide if we've done a good batch of requests between the WRITE_SYNC or regular priority lists. It may get reset to zero often, leaving us hammering on a busy device instead of moving on to another disk. This commit adds a new counter for the number of bios sent while servicing a device. It doesn't get reset or fiddled with. On multi-device filesystems, this fixes IO stalls in streaming write workloads. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-10 | Btrfs: don't allow WRITE_SYNC bios to starve out regular writes | Chris Mason
Btrfs uses dedicated threads to submit bios when checksumming is on, which allows us to make sure the threads dedicated to checksumming don't get stuck waiting for requests. For each btrfs device, there are two lists of bios. One list is for WRITE_SYNC bios and the other is for regular priority bios. The IO submission threads used to process all of the WRITE_SYNC bios first and then switch to the regular bios. This commit makes sure we don't completely starve the regular bios by rotating between the two lists. WRITE_SYNC bios are still favored 2:1 over the regular bios, and this tries to run in batches to avoid seeking. Benchmarking shows this eliminates stalls during streaming buffered writes on both multi-device and single device filesystems. If the regular bios starve, the system can end up with a large amount of ram pinned down in writeback pages. If we are a little more fair between the two classes, we're able to keep throughput up and make progress on the bulk of our dirty ram. Signed-off-by: Chris Mason <chris.mason@oracle.com>
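One way to picture the 2:1 rotation between the two lists is a picker that serves at most two sync items before it yields to a regular one. The sketch below is purely illustrative: `struct submit_state`, `pick_next()` and `SYNC_WEIGHT` are hypothetical, the lists are reduced to counters, and the real submission threads work in batches rather than one bio at a time.

```c
#include <assert.h>
#include <stddef.h>

#define SYNC_WEIGHT 2u   /* sync bios served per regular bio (the 2:1 ratio) */

enum bio_class { BIO_NONE, BIO_SYNC, BIO_REGULAR };

/* Hypothetical per-device state; queue contents are reduced to counts. */
struct submit_state {
    unsigned int sync_pending;
    unsigned int regular_pending;
    unsigned int sync_streak;   /* sync bios served since the last regular one */
};

/* Choose which list to service next without starving either one. */
enum bio_class pick_next(struct submit_state *s)
{
    assert(s != NULL);

    int regular_is_due = s->sync_streak >= SYNC_WEIGHT;

    /* Serve a regular bio when it is due, or when no sync bio is waiting. */
    if (s->regular_pending && (regular_is_due || !s->sync_pending)) {
        s->regular_pending--;
        s->sync_streak = 0;
        return BIO_REGULAR;
    }
    if (s->sync_pending) {
        s->sync_pending--;
        s->sync_streak++;
        return BIO_SYNC;
    }
    return BIO_NONE;
}
```

Starting from four sync and two regular items, repeated calls yield sync, sync, regular, sync, sync, regular, so the regular list keeps making progress even while sync work is queued.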
2009-06-10 | Btrfs: fix metadata dirty throttling limits | Chris Mason
Once a metadata block has been written, it must be recowed, so the btrfs dirty balancing call has a check to make sure a fair amount of metadata was actually dirty before it started writing it back to disk. A previous commit had changed the dirty tracking for metadata without updating the btrfs dirty balancing checks. This commit switches it to use the correct counter. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-10 | Btrfs: reduce mount -o ssd CPU usage | Chris Mason
The block allocator in SSD mode will try to find groups of free blocks that are close together. This commit makes it loop less on a given group size before bumping it. The end result is that we are less likely to fill small holes in the available free space, but we don't waste as much CPU building the large cluster used by ssd mode. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-10 | Btrfs: balance btree more often | Chris Mason
With the new back reference code, the cost of a balance has gone down in terms of the number of back reference updates done. This commit makes us more aggressively balance leaves and nodes as they become less full. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-10 | Btrfs: stop avoiding balancing at the end of the transaction. | Chris Mason
When the delayed reference code was added, some checks were added to avoid extra balancing while the delayed references were being flushed. This made for less efficient btrees, but it reduced the chances of loops where no forward progress was made because the balances made more delayed ref updates. With the new dead root removal code and the mixed back references, the extent allocation tree is no longer using precise back refs, and the delayed reference updates don't carry the risk of looping forever anymore. So, the balance avoidance is no longer required. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-10 | Btrfs: Mixed back reference (FORWARD ROLLING FORMAT CHANGE) | Yan Zheng
This commit introduces a new kind of back reference for btrfs metadata. Once a filesystem has been mounted with this commit, IT WILL NO LONGER BE MOUNTABLE BY OLDER KERNELS. When a tree block in subvolume tree is cow'd, the reference counts of all extents it points to are increased by one. At transaction commit time, the old root of the subvolume is recorded in a "dead root" data structure, and the btree it points to is later walked, dropping reference counts and freeing any blocks where the reference count goes to 0. The increments done during cow and decrements done after commit cancel out, and the walk is a very expensive way to go about freeing the blocks that are no longer referenced by the new btree root. This commit reduces the transaction overhead by avoiding the need for dead root records. When a non-shared tree block is cow'd, we free the old block at once, and the new block inherits old block's references. When a tree block with reference count > 1 is cow'd, we increase the reference counts of all extents the new block points to by one, and decrease the old block's reference count by one. This dead tree avoidance code removes the need to modify the reference counts of lower level extents when a non-shared tree block is cow'd. But we still need to update back ref for all pointers in the block. This is because the location of the block is recorded in the back ref item. We can solve this by introducing a new type of back ref. The new back ref provides information about pointer's key, level and in which tree the pointer lives. This information allows us to find the pointer by searching the tree. The shortcoming of the new back ref is that it only works for pointers in tree blocks referenced by their owner trees. This is mostly a problem for snapshots, where resolving one of these fuzzy back references would be O(number_of_snapshots) and quite slow.
The solution used here is to use the fuzzy back references in the common case where a given tree block is only referenced by one root, and use the full back references when multiple roots have a reference on a given block. This commit adds a per subvolume red-black tree to keep track of cached inodes. The red-black tree helps the balancing code to find cached inodes whose inode numbers are within a given range. This commit improves the balancing code by introducing several data structures to keep the state of balancing. The most important one is the back ref cache. It caches how the upper level tree blocks are referenced. This greatly reduces the overhead of checking back ref. The improved balancing code scales significantly better with a large number of snapshots. This is a very large commit and was written in a number of pieces. But, they depend heavily on the disk format change and were squashed together to make sure git bisect didn't end up in a bad state wrt space balancing or the format change. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-10 | btrfs: Fix set/clear_extent_bit for 'end == (u64)-1' | Yan Zheng
There is some 'start = state->end + 1;'-like code in set_extent_bit and clear_extent_bit. It overflows when end == (u64)-1. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
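The wrap-around is easy to demonstrate with standard fixed-width types. The helper below is a hypothetical guard written for this illustration (`next_start()` does not exist in btrfs, and the source does not show how the patch itself avoids the overflow); it only shows why `end + 1` cannot be used blindly when `end` is the largest 64-bit value.

```c
#include <assert.h>   /* for the checks mentioned in the note below */
#include <stdbool.h>
#include <stdint.h>

/*
 * Compute the first offset after a range ending at `end`.
 * Returns false, leaving *start untouched, when end + 1 would wrap to 0.
 */
bool next_start(uint64_t end, uint64_t *start)
{
    if (end == UINT64_MAX)   /* same value as (u64)-1 in kernel terms */
        return false;
    *start = end + 1;
    return true;
}
```

Without such a guard, a loop that advances with `start = end + 1` would restart at offset 0 after a range reaching the end of the address space, since unsigned arithmetic wraps: for `uint64_t e = UINT64_MAX`, `e + 1 == 0` holds.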
2009-06-10 | xfs: use generic Posix ACL code | Christoph Hellwig
This patch rips out the XFS ACL handling code and uses the generic fs/posix_acl.c code instead. The ondisk format is of course left unchanged. This also introduces the same ACL caching all other Linux filesystems do by adding pointers to the acl and default acl in struct xfs_inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-10 | [libata] ata_piix: Enable parallel scan | Arjan van de Ven
This patch turns on parallel scanning for the ata_piix driver. This driver is used on most netbooks (no AHCI for cheap storage it seems). The scan is the dominating time factor in the kernel boot for these devices; with this flag it gets cut in half for the device I used for testing (eeepc). Alan took a look at the driver source and concluded that it ought to be safe to do for this driver. Alan has also checked with the hardware team. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-06-10 | sata_nv: use hardreset only for post-boot probing | Tejun Heo
When I thought it was finally defeated, it came back with a vengeance. The failure cases are ever more convoluted. Now there is a single combination which fails boot probing - MCP5x + Intel SSD and there are two hotplug failure reports on different flavors where softreset fails to bring up the device. Through the many bug reports after the switch to hardreset, the following patterns emerged.
- Softreset during boot always works.
- Hardreset during boot sometimes fails to bring up the link on certain combinations and device signature acquisition is unreliable.
- Hardreset is often necessary after hotplug.
It looks like the old behavior of preferring softreset was somehow pretty close to the working reset protocol although it could have lost a device during phy error handling by issuing hardreset. This patch implements nv_hardreset() which kicks in only for post-boot (!LOADING) device probing resets. This should be able to work around all known problem cases. This isn't perfect but given the various hardreset quirks on these controllers, I think this is as good as it can get. Tested on mcp5x (swncq), nf3 and ck804 for all of boot, warm and hot probing cases. Kudos to all the bug reporters and their painful hours with these damn controllers. ;-) Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Robert Hancock <hancockr@shaw.ca> Reported-by: David Lang <david@lang.hm> Reported-by: Samo Vodopivec <lament.email.si@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-06-10 | [libata] ahci: Restore SB600 SATA controller 64 bit DMA | Shane Huang
Community reported one SB600 SATA issue (BZ #9412), which led to 64 bit DMA disablement for all SB600 revisions by driver maintainers with commits c7a42156d99bcea7f8173ba7a6034bbaa2ecb77c and 4cde32fc4b32e96a99063af3183acdfd54c563f0. But the root cause is an ASUS M2A-VM system BIOS bug in old revisions like 0901, while forcing into 32bit DMA happens to work as a workaround. Now it's time to withdraw 4cde32fc4b32e96a99063af3183acdfd54c563f0 so as to restore the SB600 SATA 64bit DMA capability. This patch also adds the workaround for M2A-VM old BIOS revisions, but users are advised to upgrade their system BIOS to the latest one if they meet this issue. Signed-off-by: Shane Huang <shane.huang@amd.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-06-10 | perf_counter tools: Propagate signals properly | Peter Zijlstra
Currently report and stat catch SIGINT (and others) without altering their exit state. This means that things like:
  while :; do perf stat ./foo ; done
Loops become hard-to-interrupt, because bash never sees perf terminate due to interruption. Fix this. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
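A common POSIX idiom for making a parent shell see "terminated by signal" is to restore the default disposition and re-raise the signal once cleanup is done. The sketch below shows that idiom in isolation and should not be read as the content of this patch: all function names are hypothetical, and the fork-based helper exists only to observe the resulting wait status.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t pending_signal;

/* Handler that only records the signal, so the process keeps running. */
static void note_signal(int signo)
{
    pending_signal = signo;
}

/* Terminate the way an unhandled signal would, so the parent sees WIFSIGNALED. */
void exit_like_signalled(int signo)
{
    signal(signo, SIG_DFL);   /* drop our handler */
    raise(signo);             /* default action for SIGINT ends the process */
    _exit(128 + signo);       /* fallback, not expected to be reached */
}

/*
 * Fork a child that catches SIGINT and then exits either normally
 * (propagate == 0) or by re-raising the signal (propagate != 0).
 * Returns the raw wait status, or -1 on error.
 */
int child_wait_status(int propagate)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGINT, note_signal);
        raise(SIGINT);        /* handler runs, child survives */
        if (propagate && pending_signal)
            exit_like_signalled(pending_signal);
        _exit(0);             /* the interruption is hidden from the parent */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

A shell loop can only stop on Ctrl-C when the wait status of its child reports death by SIGINT, which corresponds to the `propagate` case here. A child that catches the signal and exits normally looks like an ordinary completed command.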