summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2010-03-18Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds
* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: don't warn about page discards on shutdown xfs: use scalable vmap API xfs: remove old vmap cache
2010-03-18ocfs2: Always try for maximum bits with new local alloc windowsMark Fasheh
What we were doing before was to ask for the current window size as the maximum allocation. This had the effect of limiting the amount of allocation we could get for the local alloc during times when the window size was shrunk due to fragmentation. In some cases, that could actually *increase* fragmentation by artificially limiting the number of bits we can accept. So while we still want to ask for a minimum number of bits equal to window size, there is no reason why we should limit the number of bits the local alloc should accept. Hence always allow the maximum number of local alloc bits. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-18Btrfs: fix the inode ref searches done by btrfs_search_path_in_treeChris Mason
This is used by the inode lookup ioctl to follow all the backrefs up to the subvol root. But the search being done would sometimes land one past the last item in the leaf instead of finding the backref. This changes the search to look for the highest possible backref and hop back one item. It also fixes a leaked path on failure to find the root. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-18Btrfs: allow treeid==0 in the inode lookup ioctlChris Mason
When a root id of 0 is sent to the inode lookup ioctl, it will use the root of the file we're ioctling and pass the root id back to userland along with the results. This allows userland to do searches based on that root later on. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-18Btrfs: return keys for large items to the search ioctlChris Mason
The search ioctl was skipping large items entirely (ones that are too big for the results buffer). This changes things to at least copy the item header so that we can send information about the item back to userland. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-18Btrfs: fix key checks and advance in the search ioctlChris Mason
The search ioctl was working well for finding tree roots, but using it for generic searches requires a few changes to how the keys are advanced. This treats the search control min fields for objectid, type and offset more like a key, where we drop the offset to zero once we bump the type, etc. The downside of this is that we are changing the min_type and min_offset fields during the search, and so the ioctl caller needs extra checks to make sure the keys in the result are the ones it wanted. This also changes key_in_sk to use btrfs_comp_cpu_keys, just to make things more readable. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-18ocfs2: enable discontig block group support.Tao Ma
Signed-off-by: Tao Ma <tao.ma@oracle.com>
2010-04-27ocfs2: Set ac_last_group properly with discontig group.Tao Ma
ac_last_group is used to record the last block group we used during allocation. But the initialization process only calls ocfs2_which_suballoc_group and fails to use suballoc_loc properly. So let us do it. Another function ocfs2_test_suballoc_bit also needs fix. I have searched all the callers of ocfs2_which_suballoc_group, and all the callers notices suballoc_loc now. Signed-off-by: Tao Ma <tao.ma@oracle.com>
2010-03-22ocfs2: Free block to the right block group.Tao Ma
In case the block we are going to free is allocated from a discontiguous block group, we have to use suballoc_loc to be the right group. Signed-off-by: Tao Ma <tao.ma@oracle.com>
2010-05-17ocfs2: Add ocfs2_gd_is_discontig.Tao Ma
Add ocfs2_gd_is_discontig so that we can test whether a group descriptor is discontiguous or not. Signed-off-by: Tao Ma <tao.ma@oracle.com>
2010-04-13ocfs2: ocfs2_group_bitmap_size has to handle old volume.Tao Ma
ocfs2_group_bitmap_size has to handle the case when the volume don't have discontiguous block group support. So pass the feature_incompat in and check it. Signed-off-by: Tao Ma <tao.ma@oracle.com>
2010-04-22ocfs2: Some tiny bug fixes for discontiguous block allocation.Tao Ma
The fixes include: 1. some endian problems. 2. we should use bit/bpc in ocfs2_block_group_grow_discontig to allocate clusters. 3. set num_clusters properly in __ocfs2_claim_clusters. 4. change name from ocfs2_supports_discontig_bh to ocfs2_supports_discontig_bg. Signed-off-by: Tao Ma <tao.ma@oracle.com>
2010-03-26ocfs2: Don't relink cluster groups when allocating discontig block groupsJoel Becker
We don't have enough credits, and the filesystem is in a full state anyway. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-26ocfs2: Grow discontig block groups in one transaction.Joel Becker
Rather than extending the transaction every time we add an extent to a discontiguous block group, we grab enough credits to fill the extent list up front. This means we can free the bits in the same transaction if we end up not getting enough space. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-26ocfs2: Set suballoc_loc on allocated metadata.Joel Becker
Get the suballoc_loc from ocfs2_claim_new_inode() or ocfs2_claim_metadata(). Store it on the appropriate field of the block we just allocated. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-26ocfs2: Return allocated metadata blknos on the ocfs2_suballoc_result.Joel Becker
Rather than calculating the resulting block number, return it on the ocfs2_suballoc_result structure. This way we can calculate block numbers for discontiguous block groups. Cluster groups keep doing it the old way. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-06ocfs2: ocfs2_claim_*() don't need an ocfs2_super argument.Joel Becker
They all take an ocfs2_alloc_context, which has the allocation inode. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Tao Ma <tao.ma@oracle.com>
2010-03-26ocfs2: Trim suballocations if they cross discontiguous regionsJoel Becker
A discontiguous block group can find a range of free bits that straddle more than one region of its space. Callers can't handle that, so we trim the returned bits until they fit within one region. Only cluster allocations ask for min_bits>1. Discontiguous block groups are only for block allocations. So min_bits doesn't matter here. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-26ocfs2: ocfs2_claim_suballoc_bits() doesn't need an osb argument.Joel Becker
It's contained on ac->ac_inode->i_sb anyway. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-26ocfs2: Add suballoc_loc to metadata blocks.Joel Becker
We need a suballoc_loc field on any suballocated block. Define them. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-04-13ocfs2: Pass suballocation results back via a structure.Joel Becker
We're going to be adding more info to a suballocator allocation. Rather than growing every function in the chain, let's pass a result structure around. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Tao Ma <tao.ma@oracle.com>
2010-04-13ocfs2: Allocate discontiguous block groups.Joel Becker
If we cannot get a contiguous region for a block group, allocate a discontiguous one when the filesystem supports it. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Tao Ma <tao.ma@oracle.com>
2010-04-13ocfs2: Define data structures for discontiguous block groups.Joel Becker
Defines the OCFS2_FEATURE_INCOMPAT_DISCONTIG_BG feature bit and modifies struct ocfs2_group_desc for the feature. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Tao Ma <tao.ma@oracle.com>
2010-03-17ntfs: use bitmap_weightAkinobu Mita
Use bitmap_weight() instead of doing hweight32() for each u32 element in the page. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-17jffs2: fix up rb_root initializations to use RB_ROOTVenkatesh Pallipadi
jffs2 uses rb_node = NULL; to zero rb_root. The problem with this is that 17d9ddc72fb8bba0d4f678 ("rbtree: Add support for augmented rbtrees") in the linux-next tree adds a new field to that struct which needs to be NULL as well. This patch uses RB_ROOT as the intializer so all of the relevant fields will be NULL'd. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Eric Paris <eparis@redhat.com> Acked-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-17ocfs2: set i_mode on disk during acl operationsMark Fasheh
ocfs2_set_acl() and ocfs2_init_acl() were setting i_mode on the in-memory inode, but never setting it on the disk copy. Thus, acls were some times not getting propagated between nodes. This patch fixes the issue by adding a helper function ocfs2_acl_set_mode() which does this the right way. ocfs2_set_acl() and ocfs2_init_acl() are then updated to call ocfs2_acl_set_mode(). Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-17ocfs2: Update i_blocks in reflink operations.Tao Ma
In reflink, we need to upate i_blocks for the target inode. Reported-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-17ocfs2: Change bg_chain check for ocfs2_validate_gd_parent.Tao Ma
In ocfs2_validate_gd_parent, we check bg_chain against the cl_next_free_rec of the dinode. Actually in resize, we have the chance of bg_chain == cl_next_free_rec. So add some additional condition check for it. I also rename paramter "clean_error" to "resize", since the old one is not clearly enough to indicate that we should only meet with this case in resize. btw, the correpsonding bug is http://oss.oracle.com/bugzilla/show_bug.cgi?id=1230. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-17[PATCH] Skip check for mandatory locks when unlockingSachin Prabhu
ocfs2_lock() will skip locks on file which has mode set to 02666. This is a problem in cases where the mode of the file is changed after a process has obtained a lock on the file. ocfs2_lock() should skip the check for mandatory locks when unlocking a file. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-16nfsd: factor out hash functions for export caches.NeilBrown
Both the _lookup and the _update functions for these two caches independently calculate the hash of the key. So factor out that code for improved reuse. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-03-16xfs: don't warn about page discards on shutdownDave Chinner
If we are doing a forced shutdown, we can get lots of noise about delalloc pages being discarded. This is happens by design during a forced shutdown, so don't spam the logs with these messages. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-03-16xfs: use scalable vmap APIAlex Elder
Re-apply a commit that had been reverted due to regressions that have since been fixed. From 95f8e302c04c0b0c6de35ab399a5551605eeb006 Mon Sep 17 00:00:00 2001 From: Nick Piggin <npiggin@suse.de> Date: Tue, 6 Jan 2009 14:43:09 +1100 Implement XFS's large buffer support with the new vmap APIs. See the vmap rewrite (db64fe02) for some numbers. The biggest improvement that comes from using the new APIs is avoiding the global KVA allocation lock on every call. Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Only modifications here were a minor reformat, plus making the patch apply given the new use of xfs_buf_is_vmapped(). Modified-by: Alex Elder <aelder@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-03-16xfs: remove old vmap cacheAlex Elder
Re-apply a commit that had been reverted due to regressions that have since been fixed. Original commit: d2859751cd0bf586941ffa7308635a293f943c17 Author: Nick Piggin <npiggin@suse.de> Date: Tue, 6 Jan 2009 14:40:44 +1100 XFS's vmap batching simply defers a number (up to 64) of vunmaps, and keeps track of them in a list. To purge the batch, it just goes through the list and calls vunamp on each one. This is pretty poor: a global TLB flush is generally still performed on each vunmap, with the most expensive parts of the operation being the broadcast IPIs and locking involved in the SMP callouts, and the locking involved in the vmap management -- none of these are avoided by just batching up the calls. I'm actually surprised it ever made much difference. (Now that the lazy vmap allocator is upstream, this description is not quite right, but the vunmap batching still doesn't seem to do much). Rip all this logic out of XFS completely. I will improve vmap performance and scalability directly in subsequent patch. Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> The only change I made was to use the "new" xfs_buf_is_vmapped() function in a place it had been open-coded in the original. Modified-by: Alex Elder <aelder@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-03-16Btrfs: buffer results in the space_info ioctlChris Mason
The space_info ioctl was using copy_to_user inside rcu_read_lock. This commit changes things to copy into a buffer first and then dump the result down to userland. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-16Btrfs: use __u64 types in ioctl.hSage Weil
Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-16Btrfs: fix search_ioctl key advanceSage Weil
key->type is u8, not u64. fs/btrfs/ioctl.c: In function 'copy_to_sk': fs/btrfs/ioctl.c:1024: warning: comparison is always true due to limited range of data type Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-16Fix typos in commentsThomas Weber
[Ss]ytem => [Ss]ystem udpate => update paramters => parameters orginal => original Signed-off-by: Thomas Weber <swirl@gmx.li> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-03-16fat: Cleanup nls_unload() usageOGAWA Hirofumi
Other users doesn't check NULL explicitly. So, these doesn't also check to remove inconsistency. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
2010-03-16fat: use pack_hex_byte() instead of custom oneAndy Shevchenko
Signed-off-by: Andy Shevchenko <ext-andriy.shevchenko@nokia.com> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
2010-03-15NFS: ensure bdi_unregister is called on mount failure.NeilBrown
bdi_unregister is called by nfs_put_super which is only called by generic_shutdown_super if ->s_root is not NULL. So if we error out in a circumstance where we called nfs_bdi_register (i.e. server != NULL) but have not set s_root, then we need to call bdi_unregister explicitly in nfs_get_sb and various other *_get_sb() functions. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-15cifs: trivial white spaceDan Carpenter
I fixed the indent level. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-03-15Btrfs: fix gfp flags masking in the compression codeNick Piggin
GFP_FS must be masked out, NOFS can't be or'd in. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-15Btrfs: don't look at bio flags after submit_bioChris Mason
After callling submit_bio, the bio can be freed at any time. The btrfs submission thread helper was checking the bio flags too late, which might not give the correct answer. When CONFIG_DEBUG_PAGE_ALLOC is turned on, it can lead to oopsen. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-15btrfs: using btrfs_stack_device_id() get devidXiao Guangrong
We can use btrfs_stack_device_id() to get dev_item->devid Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-15btrfs: use memparseAkinobu Mita
Use memparse() instead of its own private implementation. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: linux-btrfs@vger.kernel.org Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-15Btrfs: add a "df" ioctl for btrfsJosef Bacik
df is a very loaded question in btrfs. This gives us a way to get the per-space usage information so we can tell exactly what is in use where. This will help us figure out ENOSPC problems, and help users better understand where their disk space is going. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-15Btrfs: cache the extent state everywhere we possibly can V2Josef Bacik
This patch just goes through and fixes everybody that does lock_extent() blah unlock_extent() to use lock_extent_bits() blah unlock_extent_cached() and pass around a extent_state so we only have to do the searches once per function. This gives me about a 3 mb/s boots on my random write test. I have not converted some things, like the relocation and ioctl's, since they aren't heavily used and the relocation stuff is in the middle of being re-written. I also changed the clear_extent_bit() to only unset the cached state if we are clearing EXTENT_LOCKED and related stuff, so we can do things like this lock_extent_bits() clear delalloc bits unlock_extent_cached() without losing our cached state. I tested this thoroughly and turned on LEAK_DEBUG to make sure we weren't leaking extent states, everything worked out fine. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-15Btrfs: cache ordered extent when completing ioJosef Bacik
When finishing io we run btrfs_dec_test_ordered_pending, and then immediately run btrfs_lookup_ordered_extent, but btrfs_dec_test_ordered_pending does that already, so we're searching twice when we don't have to. This patch lets us pass a btrfs_ordered_extent in to btrfs_dec_test_ordered_pending so if we do complete io on that ordered extent we can just use the one we found then instead of having to do another btrfs_lookup_ordered_extent. This made my fio job with the other patch go from 24 mb/s to 29 mb/s. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-15Btrfs: cache extent state in find_delalloc_rangeJosef Bacik
This patch makes us cache the extent state we find in find_delalloc_range since we'll have to lock the extent later on in the function. This will keep us from re-searching for the rang when we try to lock the extent. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-03-15Btrfs: change the ordered tree to use a spinlock instead of a mutexJosef Bacik
The ordered tree used to need a mutex, but currently all we use it for is to protect the rb_tree, and a spin_lock is just fine for that. Using a spin_lock instead makes dbench run a little faster, 58 mb/s instead of 51 mb/s, and have less latency, 3445.138 ms instead of 3820.633 ms. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>