summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2009-11-11Btrfs: fix some metadata enospc issuesJosef Bacik
We weren't reserving metadata space for rename, rmdir and unlink, which could cause problems. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-11-11Btrfs: fix how we set max_size for free space clustersJosef Bacik
This patch fixes a problem where max_size can be set to 0 even though we filled the cluster properly. We set max_size to 0 if we restart the cluster window, but if the new start entry is big enough to be our new cluster then we could return with a max_size set to 0, which will mean the next time we try to allocate from this cluster it will fail. So set max_extent to the entry's size. Tested this on my box and now we actually allocate from the cluster after we fill it. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-11-11Btrfs: cleanup transaction starting and fix journal_info usageJosef Bacik
We use journal_info to tell if we're in a nested transaction to make sure we don't commit the transaction within a nested transaction. We use another method to see if there are any outstanding ioctl trans handles, so if we're starting one do not set current->journal_info, since it will screw with other filesystems. This patch also cleans up the starting stuff so there aren't any magic numbers. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-11-11Btrfs: fix data allocation hint startJosef Bacik
Sometimes our start allocation hint when we cow a file can be either EXTENT_HOLE or some other such place holder, which is not optimal. So if we find that our em->block_start is one of these special values, check to see where the first block of the inode is stored, and use that as a hint. If that block is also a special value, just fallback on a hint of 0 and let the allocator figure out a good place to put the data. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-11-11JBD/JBD2: free j_wbuf if journal init fails.Tao Ma
If journal init fails, we need to free j_wbuf. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Jan Kara <jack@suse.cz>
2009-11-11ext3: Wait for proper transaction commit on fsyncJan Kara
We cannot rely on buffer dirty bits during fsync because pdflush can come before fsync is called and clear dirty bits without forcing a transaction commit. What we do is that we track which transaction has last changed the inode and which transaction last changed allocation and force it to disk on fsync. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2009-11-11ext3: retry failed direct IO allocationsEric Sandeen
On a 256M 4k block filesystem, doing this in a loop: dd if=/dev/zero of=test oflag=direct bs=1M count=64 rm -f test eventually leads to spurious ENOSPC: dd: writing `test': No space left on device As with other block allocation callers, it looks like we need to potentially retry the allocations on the initial ENOSPC. A similar patch went into ext4 (commit fbbf69456619de5d251cb9f1df609069178c62d5) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
2009-11-11sysctl: Don't look at ctl_name and strategy in the generic codeEric W. Biederman
The ctl_name and strategy fields are unused, now that sys_sysctl is a compatibility wrapper around /proc/sys. No longer looking at them in the generic code is effectively what we are doing now and provides the guarantee that during further cleanups we can just remove references to those fields and everything will work ok. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-11-11NFSv4: Fix a cache validation bug which causes getcwd() to return ENOENTTrond Myklebust
Changeset a65318bf3afc93ce49227e849d213799b072c5fd (NFSv4: Simplify some cache consistency post-op GETATTRs) incorrectly changed the getattr bitmap for readdir(). This causes the readdir() function to fail to return a fileid/inode number, which again exposed a bug in the NFS readdir code that causes spurious ENOENT errors to appear in applications (see http://bugzilla.kernel.org/show_bug.cgi?id=14541). The immediate band aid is to revert the incorrect bitmap change, but more long term, we should change the NFS readdir code to cope with the fact that NFSv4 servers are not required to support fileids/inode numbers. Reported-by: Daniel J Blueman <daniel.blueman@gmail.com> Cc: stable@kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-11-10block: Expose discard granularityMartin K. Petersen
While SSDs track block usage on a per-sector basis, RAID arrays often have allocation blocks that are bigger. Allow the discard granularity and alignment to be set and teach the topology stacking logic how to handle them. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-09Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: nilfs2: fix missing cleanup of gc cache on error cases nilfs2: fix kernel oops in error case of nilfs_ioctl_move_blocks
2009-11-09qnx4fs: add missing KERN_xxx to printk() callsAnders Larsen
fixed printk calls to consistently specify a KERN_xxx level. Signed-off-by: Anders Larsen <al@alarsen.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-11-08ext4: partial revert to fix double brelse WARNING()Theodore Ts'o
This is a partial revert of commit 6487a9d (only the changes made to fs/ext4/namei.c), since it is causing the following brelse() double-free warning when running fsstress on a file system with 1k blocksize and we run into a block allocation failure while converting a single-block directory to a multi-block hash-tree indexed directory. WARNING: at fs/buffer.c:1197 __brelse+0x2e/0x33() Hardware name: VFS: brelse: Trying to free free buffer Modules linked in: Pid: 2226, comm: jbd2/sdd-8 Not tainted 2.6.32-rc6-00577-g0003f55 #101 Call Trace: [<c01587fb>] warn_slowpath_common+0x65/0x95 [<c0158869>] warn_slowpath_fmt+0x29/0x2c [<c021168e>] __brelse+0x2e/0x33 [<c0288a9f>] jbd2_journal_refile_buffer+0x67/0x6c [<c028a9ed>] jbd2_journal_commit_transaction+0x319/0x14d8 [<c0164d73>] ? try_to_del_timer_sync+0x58/0x60 [<c0175bcc>] ? sched_clock_cpu+0x12a/0x13e [<c017f6b4>] ? trace_hardirqs_off+0xb/0xd [<c0175c1f>] ? cpu_clock+0x3f/0x5b [<c017f6ec>] ? lock_release_holdtime+0x36/0x137 [<c0664ad0>] ? _spin_unlock_irqrestore+0x44/0x51 [<c0180af3>] ? trace_hardirqs_on_caller+0x103/0x124 [<c0180b1f>] ? trace_hardirqs_on+0xb/0xd [<c0164d73>] ? try_to_del_timer_sync+0x58/0x60 [<c0290d1c>] kjournald2+0x11a/0x310 [<c017118e>] ? autoremove_wake_function+0x0/0x38 [<c0290c02>] ? kjournald2+0x0/0x310 [<c0170ee6>] kthread+0x66/0x6b [<c0170e80>] ? kthread+0x0/0x6b [<c01251b3>] kernel_thread_helper+0x7/0x10 ---[ end trace 5579351b86af61e3 ]--- Commit 6487a9d was an attempt some buffer head leaks in an ENOSPC error path, but in some cases it actually results in an excess ENOSPC, as shown above. Fixing this means cleaning up who is responsible for releasing the buffer heads from the callee to the caller of add_dirent_to_buf(). Since that's a relatively complex change, and we're late in the rcX development cycle, I'm reverting this now, and holding back a more complete fix until after 2.6.32 ships. We've lived with this buffer_head leak on ENOSPC in ext3 and ext4 for a very long time; a few more months won't kill us. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Curt Wohlgemuth <curtw@google.com>
2009-11-08nilfs2: fix missing cleanup of gc cache on error casesRyusuke Konishi
This fixes an -rc1 regression brought by the commit: 1cf58fa840472ec7df6bf2312885949ebb308853 ("nilfs2: shorten freeze period due to GC in write operation v3"). Although the patch moved out a function call of nilfs_ioctl_move_blocks() to nilfs_ioctl_clean_segments() from nilfs_ioctl_prepare_clean_segments(), it didn't move corresponding cleanup job needed for the error case. This will move the missing cleanup job to the destination function. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: Jiro SEKIBA <jir@unicus.jp>
2009-11-08nilfs2: fix kernel oops in error case of nilfs_ioctl_move_blocksRyusuke Konishi
This fixes a kernel oops reported by Markus Trippelsdorf in the email titled "[NILFS users] kernel Oops while running nilfs_cleanerd". The oops was caused by a bug of error path in nilfs_ioctl_move_blocks() function, which was inlined in nilfs_ioctl_clean_segments(). nilfs_ioctl_move_blocks checks duplication of blocks which will be moved in garbage collection. But, the check should have be done within nilfs_ioctl_move_inode_block() to prevent list corruption among buffers storing the target blocks. To fix the kernel oops, this moves forward the duplication check before the list insertion. I also tested this for stable trees [2.6.30, 2.6.31]. Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: stable <stable@kernel.org>
2009-11-06compat: move sockios handling to net/socket.cArnd Bergmann
This removes the original socket compat_ioctl code from fs/compat_ioctl.c and converts the code from the copy in net/socket.c into a single function. We add a few cycles of runtime to compat_sock_ioctl() with the long switch() statement, but gain some cycles in return by simplifying the call chain to get there. Due to better inlining, save 1.5kb of object size in the process, and enable further savings: before: text data bss dec hex filename 13540 18008 2080 33628 835c obj/fs/compat_ioctl.o 14565 636 40 15241 3b89 obj/net/socket.o after: text data bss dec hex filename 8916 15176 2080 26172 663c obj/fs/compat_ioctl.o 20725 636 40 21401 5399 obj/net/socket.o Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06appletalk: handle SIOCATALKDIFADDR compat ioctlArnd Bergmann
We must not have a compat ioctl handler for SIOCATALKDIFADDR in common code, because the same number is used in other protocols with different data structures. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06net, compat_ioctl: handle socket ioctl abuses in tty driversArnd Bergmann
Slip and a few other drivers use the same ioctl numbers on tty devices that are normally meant for sockets. This causes problems with our compat_ioctl handling that tries to convert the data structures in a different format. Fortunately, these five drivers all use 32 bit compatible data structures in the ioctl numbers, so we can just add a trivial compat_ioctl conversion function to each of them. SIOCSIFENCAP and SIOCGIFENCAP do not need to live in fs/compat_ioctl.c after this any more, and they are not used on any sockets. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06net/tun: handle compat_ioctl directlyArnd Bergmann
The tun driver is the only code in the kernel that operates on a character device with struct ifreq. Change the driver to handle the conversion itself so we can contain the remaining ifreq handling in the socket layer. This also fixes a bug in the handling of invalid ioctl numbers on an unbound tun device. The driver treats this as a TUNSETIFF in native mode, but there is no way for the generic compat_ioctl() function to emulate this behaviour. Possibly the driver was only doing this accidentally anyway, but if any code relies on this misfeature, it now also works in compat mode. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06cifs: don't use CIFSGetSrvInodeNumber in is_path_accessibleJeff Layton
Because it's lighter weight, CIFS tries to use CIFSGetSrvInodeNumber to verify the accessibility of the root inode and then falls back to doing a full QPathInfo if that fails with -EOPNOTSUPP. I have at least a report of a server that returns NT_STATUS_INTERNAL_ERROR rather than something that translates to EOPNOTSUPP. Rather than trying to be clever with that call, just have is_path_accessible do a normal QPathInfo. That call is widely supported and it shouldn't increase the overhead significantly. Cc: Stable <stable@kernel.org> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-11-06cifs: clean up handling when server doesn't consistently support inode numbersJeff Layton
It's possible that a server will return a valid FileID when we query the FILE_INTERNAL_INFO for the root inode, but then zeroed out inode numbers when we do a FindFile with an infolevel of SMB_FIND_FILE_ID_FULL_DIR_INFO. In this situation turn off querying for server inode numbers, generate a warning for the user and just generate an inode number using iunique. Once we generate any inode number with iunique we can no longer use any server inode numbers or we risk collisions, so ensure that we don't do that in cifs_get_inode_info either. Cc: Stable <stable@kernel.org> Reported-by: Timothy Normand Miller <theosib@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-11-06ext4: Fix return value of ext4_split_unwritten_extents() to fix direct I/OMingming
To prepare for a direct I/O write, we need to split the unwritten extents before submitting the I/O. When no extents needed to be split, ext4_split_unwritten_extents() was incorrectly returning 0 instead of the size of uninitialized extents. This bug caused the wrong return value sent back to VFS code when it gets called from async IO path, leading to an unnecessary fall back to buffered IO. This bug also hid the fact that the check to see whether or not a split would be necessary was incorrect; we can only skip splitting the extent if the write completely covers the uninitialized extent. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-11-05Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: invalidate target of rename fuse: fix kunmap in fuse_ioctl_copy_user fuse: prevent fuse_put_request on invalid pointer
2009-11-05Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/kvm: Remove problematic BUILD_BUG_ON statement powerpc/pci: Fix regression in powerpc MSI-X powerpc: Avoid giving out RTC dates below EPOCH powerpc/mm: Remove debug context clamping from nohash code powerpc: Cleanup Kconfig selection of hugetlbfs support
2009-11-05Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: sysfs: Don't leak secdata when a sysfs_dirent is freed.
2009-11-05Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, fs: Fix x86 procfs stack information for threads on 64-bit x86: Add reboot quirk for 3 series Mac mini x86: Fix printk message typo in mtrr cleanup code dma-debug: Fix compile warning with PAE enabled x86/amd-iommu: Un__init function required on shutdown x86/amd-iommu: Workaround for erratum 63
2009-11-05nfsd: use STATEID_FMT and STATEID_VAL for printing stateidsBenny Halevy
Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-11-05sysfs: Don't leak secdata when a sysfs_dirent is freed.Eric W. Biederman
While refreshing my sysfs patches I noticed a leak in the secdata implementation. We don't free the secdata when we free the sysfs dirent. This is a bug in 2.6.32-rc5 that we really should close. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
2009-11-04nfsd: register NFS_ACL with rpcbindPeter Staubach
Modify the NFS server to register the NFS_ACL services with the rpcbind daemon. This allows the client to ping for the existence of the NFS_ACL support via commands such as "rpcinfo -t <server> nfs_acl". This patch also modifies the NFS_ACL support so that responses to version 2 NULLPROC requests can be made. The changelog for the patch which turned off this functionality mentioned something about not registering the NFS_ACL as being part of some tradition. I can't find this tradition and the only other implementation which supports NFS_ACL does register them with the rpcbind daemon. Signed-off-by: Peter Staubach <staubach@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-11-04x86, fs: Fix x86 procfs stack information for threads on 64-bitStefani Seibold
This patch fixes two issues in the procfs stack information on x86-64 linux. The 32 bit loader compat_do_execve did not store stack start. (this was figured out by Alexey Dobriyan). The stack information on a x64_64 kernel always shows 0 kbyte stack usage, because of a missing implementation of the KSTK_ESP macro which always returned -1. The new implementation now returns the right value. Signed-off-by: Stefani Seibold <stefani@seibold.net> Cc: Americo Wang <xiyou.wangcong@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1257240160.4889.24.camel@wall-e> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-04fuse: invalidate target of renameMiklos Szeredi
Invalidate the target's attributes, which may have changed (such as nlink, change time) so that they are refreshed on the next getattr(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2009-11-04fuse: fix kunmap in fuse_ioctl_copy_userJens Axboe
Looks like another victim of the confusing kmap() vs kmap_atomic() API differences. Reported-by: Todor Gyumyushev <yodor1@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: stable@kernel.org
2009-11-04fuse: prevent fuse_put_request on invalid pointerAnand V. Avati
fuse_direct_io() has a loop where requests are allocated in each iteration. if allocation fails, the loop is broken out and follows into an unconditional fuse_put_request() on that invalid pointer. Signed-off-by: Anand V. Avati <avati@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@kernel.org
2009-11-04sendfile(): check f_op.splice_write() rather than f_op.sendpage()Changli Gao
sendfile(2) was reworked with the splice infrastructure, but it still checks f_op.sendpage() instead of f_op.splice_write() wrongly. Although if f_op.sendpage() exists, f_op.splice_write() always exists at the same time currently, the assumption will be broken in future silently. This patch also brings a side effect: sendfile(2) can work with any output file. Some security checks related to f_op are added too. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-03Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: cfq-iosched: limit coop preemption cfq-iosched: fix bad return value cfq_should_preempt() backing-dev: bdi sb prune should be in the unregister path, not destroy Fix bio_alloc() and bio_kmalloc() documentation bio_put(): add bio_clone() to the list of functions in the comment
2009-11-03Merge branch 'for-linus' into for-2.6.33Jens Axboe
Conflicts: block/cfq-iosched.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-03ext4: code clean up for dio fallocate handlingMingming
The ext4_debug() call in ext4_end_io_dio() should be moved after the check to make sure that io_end is non-NULL. The comment above ext4_get_block_dio_write() ("Maximum number of blocks...") is a duplicate; the original and correct comment is above the #define DIO_MAX_BLOCKS up above. Based on review comments from Curt Wohlgemuth. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-11-10ext4: skip conversion of uninit extents after direct IO if there isn't anyMingming
At the end of direct I/O operation, ext4_ext_direct_IO() always called ext4_convert_unwritten_extents(), regardless of whether there were any unwritten extents involved in the I/O or not. This commit adds a state flag so that ext4_ext_direct_IO() only calls ext4_convert_unwritten_extents() when necessary. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-11-10ext4: fix ext4_ext_direct_IO()'s return value after converting uninit extentsMingming
After a direct I/O request covering an uninitalized extent (i.e., created using the fallocate system call) or a hole in a file, ext4 will convert the uninitialized extent so it is marked as initialized by calling ext4_convert_unwritten_extents(). This function returns zero on success. This return value was getting returned by ext4_direct_IO(); however the file system's direct_IO function is supposed to return the number of bytes read or written on a success. By returning zero, it confused the direct I/O code into falling back to buffered I/O unnecessarily. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-11-03nilfs2: add zero-fill for new btree node buffersRyusuke Konishi
Adds missing initialization of newly allocated b-tree node buffers. This avoids garbage data to be mixed in b-tree node blocks. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-11-03nilfs2: fix irregular checkpoint creation due to data flushRyusuke Konishi
When nilfs flushes out dirty data to reduce memory pressure, creation of checkpoints is wrongly postponed. This bug causes irregular checkpoint creation especially in small footprint systems. To correct this issue, a timer for the checkpoint creation has to be continued if a log writer does not create a checkpoint. This will do the correction. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-11-03nilfs2: fix dirty page accounting leak causing hang at writeRyusuke Konishi
Bruno Prémont and Dunphy, Bill noticed me that NILFS will certainly hang on ARM-based targets. I found this was caused by an underflow of dirty pages counter. A b-tree cache routine was marking page dirty without adjusting page account information. This fixes the dirty page accounting leak and resolves the hang on arm-based targets. Reported-by: Bruno Prémont <bonbons@linux-vserver.org> Reported-by: Dunphy, Bill <WDunphy@tandbergdata.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: Bruno Prémont <bonbons@linux-vserver.org> Cc: stable <stable@kernel.org>
2009-11-02ext4: discard preallocation when restarting a transaction during truncateAneesh Kumar K.V
When restart a transaction during a truncate operation, we drop and reacquire i_data_sem. After reacquiring i_data_sem, we need to discard any inode-based preallocation that might have been grabbed while we released i_data_sem (for example, if pdflush is allocating blocks and racing against the truncate). Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-11-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: fix readdir corner cases 9p: fix readlink 9p: fix a small bug in readdir for long directories
2009-11-02Revert "ext4: Remove journal_checksum mount option and enable it by default"Linus Torvalds
This reverts commit d0646f7b636d067d715fab52a2ba9c6f0f46b0d7, as requested by Eric Sandeen. It can basically cause an ext4 filesystem to miss recovery (and thus get mounted with errors) if the journal checksum does not match. Quoth Eric: "My hand-wavy hunch about what is happening is that we're finding a bad checksum on the last partially-written transaction, which is not surprising, but if we have a wrapped log and we're doing the initial scan for head/tail, and we abort scanning on that bad checksum, then we are essentially running an unrecovered filesystem. But that's hand-wavy and I need to go look at the code. We lived without journal checksums on by default until now, and at this point they're doing more harm than good, so we should revert the default-changing commit until we can fix it and do some good power-fail testing with the fixes in place." See http://bugzilla.kernel.org/show_bug.cgi?id=14354 for all the gory details. Requested-by: Eric Sandeen <sandeen@redhat.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Alexey Fisher <bug-track@fisher-privat.net> Cc: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Mathias Burén <mathias.buren@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-11-029p: fix readdir corner casesEric Van Hensbergen
The patch below also addresses a couple of other corner cases in readdir seen with a large (e.g. 64k) msize. I'm not sure what people think of my co-opting of fid->aux here. I'd be happy to rework if there's a better way. When the size of the user supplied buffer passed to readdir is smaller than the data returned in one go by the 9P read request, v9fs_dir_readdir() currently discards extra data so that, on the next call, a 9P read request will be issued with offset < previous offset + bytes returned, which voilates the constraint described in paragraph 3 of read(5) description. This patch preseves the leftover data in fid->aux for use in the next call. Signed-off-by: Jim Garlick <garlick@llnl.gov> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2009-11-029p: fix readlinkMartin Stava
I do not know if you've looked on the patch, but unfortunately it is incorrect. A suggested better version is in this email (the old version didn't work in case the user provided buffer was not long enough - it incorrectly appended null byte on a position of last char, and thus broke the contract of the readlink method). However, I'm still not sure this is 100% correct thing to do, I think readlink is supposed to return buffer without last null byte in all cases, but we do return last null byte (even the old version).. on the other hand it is likely unspecified what is in the remaining part of the buffer, so null character may be fine there ;): Signed-off-by: Martin Stava <martin.stava@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2009-11-029p: fix a small bug in readdir for long directoriesMartin Stava
Here is a proposed patch for bug in readdir. Listing of dirs with many files fails without this patch. Signed-off-by: Martin Stava <martin.stava@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2009-11-02Fix bio_alloc() and bio_kmalloc() documentationAlberto Bertogli
Commit 451a9ebf accidentally broke bio_alloc() and bio_kmalloc() comments by (almost) swapping them. This patch fixes that, by placing the comments in the right place. Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-02bio_put(): add bio_clone() to the list of functions in the commentAlberto Bertogli
In bio_put()'s comment, add bio_clone() to the list of functions that can give you a bio reference. Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>