summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2014-04-16fs: Don't return 0 from get_anon_bdevThomas Bächler
Commit 9e30cc9595303b27b48 removed an internal mount. This has the side-effect that rootfs now has FSID 0. Many userspace utilities assume that st_dev in struct stat is never 0, so this change breaks a number of tools in early userspace. Since we don't know how many userspace programs are affected, make sure that FSID is at least 1. References: http://article.gmane.org/gmane.linux.kernel/1666905 References: http://permalink.gmane.org/gmane.linux.utilities.util-linux-ng/8557 Cc: 3.14 <stable@vger.kernel.org> Signed-off-by: Thomas Bächler <thomas@archlinux.org> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-16fs: cifs: remove unused variable.Cyril Roelandt
In SMB2_set_compression(), the "res_key" variable is only initialized to NULL and later kfreed. It is therefore useless and should be removed. Found with the following semantic patch: <smpl> @@ identifier foo; identifier f; type T; @@ * f(...) { ... * T *foo = NULL; ... when forall when != foo * kfree(foo); ... } </smpl> Signed-off-by: Cyril Roelandt <tipecaml@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2014-04-16Return correct error on query of xattr on file with empty xattrsSteve French
xfstest 020 detected a problem with cifs xattr handling. When a file had an empty xattr list, we returned success (with an empty xattr value) on query of particular xattrs rather than returning ENODATA. This patch fixes it so that query of an xattr returns ENODATA when the xattr list is empty for the file. Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com>
2014-04-16cifs: Wait for writebacks to complete before attempting write.Sachin Prabhu
Problem reported in Red Hat bz 1040329 for strict writes where we cache only when we hold oplock and write direct to the server when we don't. When we receive an oplock break, we first change the oplock value for the inode in cifsInodeInfo->oplock to indicate that we no longer hold the oplock before we enqueue a task to flush changes to the backing device. Once we have completed flushing the changes, we return the oplock to the server. There are 2 ways here where we can have data corruption 1) While we flush changes to the backing device as part of the oplock break, we can have processes write to the file. These writes check for the oplock, find none and attempt to write directly to the server. These direct writes made while we are flushing from cache could be overwritten by data being flushed from the cache causing data corruption. 2) While a thread runs in cifs_strict_writev, the machine could receive and process an oplock break after the thread has checked the oplock and found that it allows us to cache and before we have made changes to the cache. In that case, we end up with a dirty page in cache when we shouldn't have any. This will be flushed later and will overwrite all subsequent writes to the part of the file represented by this page. Before making any writes to the server, we need to confirm that we are not in the process of flushing data to the server and if we are, we should wait until the process is complete before we attempt the write. We should also wait for existing writes to complete before we process an oplock break request which changes oplock values. We add a version specific downgrade_oplock() operation to allow for differences in the oplock values set for the different smb versions. Cc: stable@vger.kernel.org Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <smfrench@gmail.com>
2014-04-16aio: block io_destroy() until all context requests are completedAnatol Pomozov
deletes aio context and all resources related to. It makes sense that no IO operations connected to the context should be running after the context is destroyed. As we removed io_context we have no chance to get requests status or call io_getevents(). man page for io_destroy says that this function may block until all context's requests are completed. Before kernel 3.11 io_destroy() blocked indeed, but since aio refactoring in 3.11 it is not true anymore. Here is a pseudo-code that shows a testcase for a race condition discovered in 3.11: initialize io_context io_submit(read to buffer) io_destroy() // context is destroyed so we can free the resources free(buffers); // if the buffer is allocated by some other user he'll be surprised // to learn that the buffer still filled by an outstanding operation // from the destroyed io_context The fix is straight-forward - add a completion struct and wait on it in io_destroy, complete() should be called when number of in-fligh requests reaches zero. If two or more io_destroy() called for the same context simultaneously then only the first one waits for IO completion, other calls behaviour is undefined. Tested: ran http://pastebin.com/LrPsQ4RL testcase for several hours and do not see the race condition anymore. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
2014-04-15NFS: Don't ignore suid/sgid bit changes after a successful writeTrond Myklebust
If we suspect that the server may have cleared the suid/sgid bit, then mark the inode for revalidation. Reported-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-04-15NFS: Don't declare inode uptodate unless all attributes were checkedTrond Myklebust
Fix a bug, whereby nfs_update_inode() was declaring the inode to be up to date despite not having checked all the attributes. The bug occurs because the temporary variable in which we cache the validity information is 'sanitised' before reapplying to nfsi->cache_validity. Reported-by: Kinglong Mee <kinglongmee@gmail.com> Cc: stable@vger.kernel.org # 3.5+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-04-15NFS: Fix memroy leak for double mountsKinglong Mee
When double mounting same nfs filesystem, the devname saved in d_fsdata will be lost.The second mount should not change the devname that be saved in d_fsdata. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-04-15locks: allow __break_lease to sleep even when break_time is 0Jeff Layton
A fl->fl_break_time of 0 has a special meaning to the lease break code that basically means "never break the lease". knfsd uses this to ensure that leases don't disappear out from under it. Unfortunately, the code in __break_lease can end up passing this value to wait_event_interruptible as a timeout, which prevents it from going to sleep at all. This makes __break_lease to spin in a tight loop and causes soft lockups. Fix this by ensuring that we pass a minimum value of 1 as a timeout instead. Cc: <stable@vger.kernel.org> Cc: J. Bruce Fields <bfields@fieldses.org> Reported-by: Terry Barnaby <terry1@beam.ltd.uk> Signed-off-by: Jeff Layton <jlayton@redhat.com>
2014-04-14ext4: fix ext4_count_free_clusters() with EXT4FS_DEBUG and bigalloc enabledAzat Khuzhin
With bigalloc enabled we must use EXT4_CLUSTERS_PER_GROUP() instead of EXT4_BLOCKS_PER_GROUP() otherwise we will go beyond the allocated buffer. $ mount -t ext4 /dev/vde /vde [ 70.573993] EXT4-fs DEBUG (fs/ext4/mballoc.c, 2346): ext4_mb_alloc_groupinfo: [ 70.575174] allocated s_groupinfo array for 1 meta_bg's [ 70.576172] EXT4-fs DEBUG (fs/ext4/super.c, 2092): ext4_check_descriptors: [ 70.576972] Checking group descriptorsBUG: unable to handle kernel paging request at ffff88006ab56000 [ 72.463686] IP: [<ffffffff81394eb9>] __bitmap_weight+0x2a/0x7f [ 72.464168] PGD 295e067 PUD 2961067 PMD 7fa8e067 PTE 800000006ab56060 [ 72.464738] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 72.465139] Modules linked in: [ 72.465402] CPU: 1 PID: 3560 Comm: mount Tainted: G W 3.14.0-rc2-00069-ge57bce1 #60 [ 72.466079] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 72.466505] task: ffff88007ce6c8a0 ti: ffff88006b7f0000 task.ti: ffff88006b7f0000 [ 72.466505] RIP: 0010:[<ffffffff81394eb9>] [<ffffffff81394eb9>] __bitmap_weight+0x2a/0x7f [ 72.466505] RSP: 0018:ffff88006b7f1c00 EFLAGS: 00010206 [ 72.466505] RAX: 0000000000000000 RBX: 000000000000050a RCX: 0000000000000040 [ 72.466505] RDX: 0000000000000000 RSI: 0000000000080000 RDI: 0000000000000000 [ 72.466505] RBP: ffff88006b7f1c28 R08: 0000000000000002 R09: 0000000000000000 [ 72.466505] R10: 000000000000babe R11: 0000000000000400 R12: 0000000000080000 [ 72.466505] R13: 0000000000000200 R14: 0000000000002000 R15: ffff88006ab55000 [ 72.466505] FS: 00007f43ba1fa840(0000) GS:ffff88007f800000(0000) knlGS:0000000000000000 [ 72.466505] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 72.466505] CR2: ffff88006ab56000 CR3: 000000006b7e6000 CR4: 00000000000006e0 [ 72.466505] Stack: [ 72.466505] ffff88006ab65000 0000000000000000 0000000000000000 0000000000010000 [ 72.466505] ffff88006ab6f400 ffff88006b7f1c58 ffffffff81396bb8 0000000000010000 [ 72.466505] 0000000000000000 ffff88007b869a90 ffff88006a48a000 ffff88006b7f1c70 [ 72.466505] Call Trace: [ 72.466505] [<ffffffff81396bb8>] memweight+0x5f/0x8a [ 72.466505] [<ffffffff811c3b19>] ext4_count_free+0x13/0x21 [ 72.466505] [<ffffffff811c396c>] ext4_count_free_clusters+0xdb/0x171 [ 72.466505] [<ffffffff811e3bdd>] ext4_fill_super+0x117c/0x28ef [ 72.466505] [<ffffffff81391569>] ? vsnprintf+0x1c7/0x3f7 [ 72.466505] [<ffffffff8114d8dc>] mount_bdev+0x145/0x19c [ 72.466505] [<ffffffff811e2a61>] ? ext4_calculate_overhead+0x2a1/0x2a1 [ 72.466505] [<ffffffff811dab1d>] ext4_mount+0x15/0x17 [ 72.466505] [<ffffffff8114e3aa>] mount_fs+0x67/0x150 [ 72.466505] [<ffffffff811637ea>] vfs_kern_mount+0x64/0xde [ 72.466505] [<ffffffff81165d19>] do_mount+0x6fe/0x7f5 [ 72.466505] [<ffffffff81126cc8>] ? strndup_user+0x3a/0xd9 [ 72.466505] [<ffffffff8116604b>] SyS_mount+0x85/0xbe [ 72.466505] [<ffffffff81619e90>] tracesys+0xdd/0xe2 [ 72.466505] Code: c3 89 f0 b9 40 00 00 00 55 99 48 89 e5 41 57 f7 f9 41 56 49 89 ff 41 55 45 31 ed 41 54 41 89 f4 53 31 db 41 89 c6 45 39 ee 7e 10 <4b> 8b 3c ef 49 ff c5 e8 bf ff ff ff 01 c3 eb eb 31 c0 45 85 f6 [ 72.466505] RIP [<ffffffff81394eb9>] __bitmap_weight+0x2a/0x7f [ 72.466505] RSP <ffff88006b7f1c00> [ 72.466505] CR2: ffff88006ab56000 [ 72.466505] ---[ end trace 7d051a08ae138573 ]--- Killed Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-04-14btrfs: fix use-after-free in mount_subvol()Christoph Jaeger
Pointer 'newargs' is used after the memory that it points to has already been freed. Picked up by Coverity - CID 1201425. Fixes: 0723a0473f ("btrfs: allow mounting btrfs subvolumes with different ro/rw options") Signed-off-by: Christoph Jaeger <christophjaeger@linux.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-14fs/jfs/jfs_inode.c: atomically set inode->i_flagsFabian Frederick
According to commit 5f16f3225b0624 ext4: atomically set inode->i_flags in ext4_set_inode_flags() Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Cc: "Theodore Ts'o" <tytso@mit.edu>
2014-04-14xfs: don't try to use the filestream allocator for metadata allocationsChristoph Hellwig
xfs_bmap_btalloc_nullfb has two entirely different control flows when using the filestream allocator vs the regular one, but it get the conditionals wrong and ends up mixing the two for metadata allocations. Fix this by adding a missing userdata check and slight refactoring. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused calculation in xfs_dir2_sf_addname()Eric Sandeen
The "add_entsize" calculated here is never used. "incr_isize" accounts for the inode expansion of the old entries + parent + new entry all by itself. Once we've removed add_entsize there, it's just a pointless intermediate variable elsewhere, so remove it. For that matter, old_isize is gratuitous too, so nuke that. And add a few comments so the magic "+1's" and "+2's" make a bit more sense. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove pointless pointer increment in xfs_dir2_block_compact()Eric Sandeen
xfs_dir2_block_compact() is passed a pointer to *blp, and advances it locally - but nobody uses the pointer (locally) after that. This behavior came about as part of prior refactoring, 20f7e9f xfs: factor dir2 block read operations and looking at the code as it was before, it seems quite clear that this change introduced a bug; the pre-refactoring code expects blp to be modified after compaction. And indeed it did; see this commit which fixed it: 37f1356 xfs: recalculate leaf entry pointer after compacting a dir2 block So the bug was introduced & resolved in the 3.8 cycle. Whoops. Well, it's fixed now, and mystery solved; just remove the now-pointless local increment of the blp pointer. (I guess we should have run clang earlier!) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused trans pointer arg from xlog_recover_unmount_trans()Eric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused ail pointer arg from xfs_trans_ail_cursor_done()Eric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused xfs_mount arg from xfs_symlink_hdr_ok()Eric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused bp arg from xfs_iflush_fork()Eric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused pag ptr arg from iterator execute functionsEric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused length arg from alloc_block opsEric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused mp arg from xfs_calc_dquots_per_chunk()Eric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused mp arg from xfs_dir2 dataptr/byte functionsEric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused tp arg from xfs_da_reada_buf & callersEric Sandeen
This one hits a few functions as we unravel the unused arg up through the callers. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused bip arg from xfs_buf_item_log_segment()Eric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused flags arg from _xfs_buf_get_pages()Eric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused args from xfs_alloc_buftarg()Eric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused blocksize arg from xfs_setsize_buftarg()Eric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused level arg from xfs_btree_read_buf_block()Eric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused mp arg from xfs_bmap_forkoff_reset()Eric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused mp arg from xfs_bmdr_maxrecs()Eric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused mp arg from xfs_attr3_rmt_hdr_ok()Eric Sandeen
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: remove unused tp arg from xfs_bmap_last_offset() and callersEric Sandeen
remove unused transaction pointer from various callchains leading to xfs_bmap_last_offset(). Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: zeroing space needs to punch delalloc blocksDave Chinner
When we are zeroing space andit is covered by a delalloc range, we need to punch the delalloc range out before we truncate the page cache. Failing to do so leaves and inconsistency between the page cache and the extent tree, which we later trip over when doing direct IO over the same range. Signed-off-by: Dave Chinner <dchinner@redhat.com> Tested-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: xfs_vm_write_end truncates too much on failureDave Chinner
Similar to the write_begin problem, xfs-vm_write_end will truncate back to the old EOF, potentially removing page cache from over the top of delalloc blocks with valid data in them. Fix this by truncating back to just the start of the failed write. Signed-off-by: Dave Chinner <dchinner@redhat.com> Tested-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: write failure beyond EOF truncates too much dataDave Chinner
If we fail a write beyond EOF and have to handle it in xfs_vm_write_begin(), we truncate the inode back to the current inode size. This doesn't take into account the fact that we may have already made successful writes to the same page (in the case of block size < page size) and hence we can truncate the page cache away from blocks with valid data in them. If these blocks are delayed allocation blocks, we now have a mismatch between the page cache and the extent tree, and this will trigger - at minimum - a delayed block count mismatch assert when the inode is evicted from the cache. We can also trip over it when block mapping for direct IO - this is the most common symptom seen from fsx and fsstress when run from xfstests. Fix it by only truncating away the exact range we are updating state for in this write_begin call. Signed-off-by: Dave Chinner <dchinner@redhat.com> Tested-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-14xfs: kill buffers over failed write ranges properlyDave Chinner
When a write fails, if we don't clear the delalloc flags from the buffers over the failed range, they can persist beyond EOF and cause problems. writeback will see the pages in the page cache, see they are dirty and continually retry the write, assuming that the page beyond EOF is just racing with a truncate. The page will eventually be released due to some other operation (e.g. direct IO), and it will not pass through invalidation because it is dirty. Hence it will be released with buffer_delay set on it, and trigger warnings in xfs_vm_releasepage() and assert fail in xfs_file_aio_write_direct because invalidation failed and we didn't write the corect amount. This causes failures on block size < page size filesystems in fsx and fsstress workloads run by xfstests. Fix it by completely trashing any state on the buffer that could be used to imply that it contains valid data when the delalloc range over the buffer is punched out during the failed write handling. Signed-off-by: Dave Chinner <dchinner@redhat.com> Tested-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-13cifs: Use min_t() when comparing "size_t" and "unsigned long"Geert Uytterhoeven
On 32 bit, size_t is "unsigned int", not "unsigned long", causing the following warning when comparing with PAGE_SIZE, which is always "unsigned long": fs/cifs/file.c: In function ‘cifs_readdata_to_iov’: fs/cifs/file.c:2757: warning: comparison of distinct pointer types lacks a cast Introduced by commit 7f25bba819a3 ("cifs_iovec_read: keep iov_iter between the calls of cifs_readdata_to_iov()"), which changed the signedness of "remaining" and the code from min_t() to min(). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-13ext4: always check ext4_ext_find_extent resultDmitry Monakhov
Where are some places where logic guaranties us that extent we are searching exits, but this may not be true due to on-disk data corruption. If such corruption happens we must prevent possible null pointer dereferences. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-04-13ext4: fix error handling in ext4_ext_shift_extentsDmitry Monakhov
Fix error handling by adding some. :-) Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-04-12ext4: silence sparse check warning for function ext4_trim_extentjon ernst
This fixes the following sparse warning: CHECK fs/ext4/mballoc.c fs/ext4/mballoc.c:5019:9: warning: context imbalance in 'ext4_trim_extent' - unexpected unlock Signed-off-by: "Jon Ernst" <jonernst07@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-04-12ext4: COLLAPSE_RANGE only works on extent-based filesTheodore Ts'o
Unfortunately, we weren't checking to make sure of this the inode was extent-based before attempt operate on it. Hilarity ensues. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Namjae Jeon <namjae.jeon@samsung.com>
2014-04-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull yet more networking updates from David Miller: 1) Various fixes to the new Redpine Signals wireless driver, from Fariya Fatima. 2) L2TP PPP connect code takes PMTU from the wrong socket, fix from Dmitry Petukhov. 3) UFO and TSO packets differ in whether they include the protocol header in gso_size, account for that in skb_gso_transport_seglen(). From Florian Westphal. 4) If VLAN untagging fails, we double free the SKB in the bridging output path. From Toshiaki Makita. 5) Several call sites of sk->sk_data_ready() were referencing an SKB just added to the socket receive queue in order to calculate the second argument via skb->len. This is dangerous because the moment the skb is added to the receive queue it can be consumed in another context and freed up. It turns out also that none of the sk->sk_data_ready() implementations even care about this second argument. So just kill it off and thus fix all these use-after-free bugs as a side effect. 6) Fix inverted test in tcp_v6_send_response(), from Lorenzo Colitti. 7) pktgen needs to do locking properly for LLTX devices, from Daniel Borkmann. 8) xen-netfront driver initializes TX array entries in RX loop :-) From Vincenzo Maffione. 9) After refactoring, some tunnel drivers allow a tunnel to be configured on top itself. Fix from Nicolas Dichtel. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits) vti: don't allow to add the same tunnel twice gre: don't allow to add the same tunnel twice drivers: net: xen-netfront: fix array initialization bug pktgen: be friendly to LLTX devices r8152: check RTL8152_UNPLUG net: sun4i-emac: add promiscuous support net/apne: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO net: ipv6: Fix oif in TCP SYN+ACK route lookup. drivers: net: cpsw: enable interrupts after napi enable and clearing previous interrupts drivers: net: cpsw: discard all packets received when interface is down net: Fix use after free by removing length arg from sk_data_ready callbacks. Drivers: net: hyperv: Address UDP checksum issues Drivers: net: hyperv: Negotiate suitable ndis version for offload support Drivers: net: hyperv: Allocate memory for all possible per-pecket information bridge: Fix double free and memory leak around br_allowed_ingress bonding: Remove debug_fs files when module init fails i40evf: program RSS LUT correctly i40evf: remove open-coded skb_cow_head ixgb: remove open-coded skb_cow_head igbvf: remove open-coded skb_cow_head ...
2014-04-12ceph: fix pr_fmt() redefinitionLinus Torvalds
The vfs merge caused a latent bug to show up: In file included from fs/ceph/super.h:4:0, from fs/ceph/ioctl.c:3: include/linux/ceph/ceph_debug.h:4:0: warning: "pr_fmt" redefined [enabled by default] #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt ^ In file included from include/linux/kernel.h:13:0, from include/linux/uio.h:12, from include/linux/socket.h:7, from include/uapi/linux/in.h:22, from include/linux/in.h:23, from fs/ceph/ioctl.c:1: include/linux/printk.h:214:0: note: this is the location of the previous definition #define pr_fmt(fmt) fmt ^ where the reason is that <linux/ceph_debug.h> is included much too late for the "pr_fmt()" define. The include of <linux/ceph_debug.h> needs to be the first include in the file, but fs/ceph/ioctl.c had for some reason missed that, and it wasn't noticeable until some unrelated header file changes brought in an indirect earlier include of <linux/kernel.h>. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "The first vfs pile, with deep apologies for being very late in this window. Assorted cleanups and fixes, plus a large preparatory part of iov_iter work. There's a lot more of that, but it'll probably go into the next merge window - it *does* shape up nicely, removes a lot of boilerplate, gets rid of locking inconsistencie between aio_write and splice_write and I hope to get Kent's direct-io rewrite merged into the same queue, but some of the stuff after this point is having (mostly trivial) conflicts with the things already merged into mainline and with some I want more testing. This one passes LTP and xfstests without regressions, in addition to usual beating. BTW, readahead02 in ltp syscalls testsuite has started giving failures since "mm/readahead.c: fix readahead failure for memoryless NUMA nodes and limit readahead pages" - might be a false positive, might be a real regression..." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) missing bits of "splice: fix racy pipe->buffers uses" cifs: fix the race in cifs_writev() ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure kill generic_file_buffered_write() ocfs2_file_aio_write(): switch to generic_perform_write() ceph_aio_write(): switch to generic_perform_write() xfs_file_buffered_aio_write(): switch to generic_perform_write() export generic_perform_write(), start getting rid of generic_file_buffer_write() generic_file_direct_write(): get rid of ppos argument btrfs_file_aio_write(): get rid of ppos kill the 5th argument of generic_file_buffered_write() kill the 4th argument of __generic_file_aio_write() lustre: don't open-code kernel_recvmsg() ocfs2: don't open-code kernel_recvmsg() drbd: don't open-code kernel_recvmsg() constify blk_rq_map_user_iov() and friends lustre: switch to kernel_sendmsg() ocfs2: don't open-code kernel_sendmsg() take iov_iter stuff to mm/iov_iter.c process_vm_access: tidy up a bit ...
2014-04-12Merge git://git.infradead.org/users/eparis/auditLinus Torvalds
Pull audit updates from Eric Paris. * git://git.infradead.org/users/eparis/audit: (28 commits) AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERIC audit: renumber AUDIT_FEATURE_CHANGE into the 1300 range audit: do not cast audit_rule_data pointers pointlesly AUDIT: Allow login in non-init namespaces audit: define audit_is_compat in kernel internal header kernel: Use RCU_INIT_POINTER(x, NULL) in audit.c sched: declare pid_alive as inline audit: use uapi/linux/audit.h for AUDIT_ARCH declarations syscall_get_arch: remove useless function arguments audit: remove stray newline from audit_log_execve_info() audit_panic() call audit: remove stray newlines from audit_log_lost messages audit: include subject in login records audit: remove superfluous new- prefix in AUDIT_LOGIN messages audit: allow user processes to log from another PID namespace audit: anchor all pid references in the initial pid namespace audit: convert PPIDs to the inital PID namespace. pid: get pid_t ppid of task in init_pid_ns audit: rename the misleading audit_get_context() to audit_take_context() audit: Add generic compat syscall support audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL ...
2014-04-12ext4: fix byte order problems introduced by the COLLAPSE_RANGE patchesZheng Liu
This commit tries to fix some byte order issues that is found by sparse check. $ make M=fs/ext4 C=2 CF=-D__CHECK_ENDIAN__ ... CHECK fs/ext4/extents.c fs/ext4/extents.c:5232:41: warning: restricted __le32 degrades to integer fs/ext4/extents.c:5236:52: warning: bad assignment (-=) to restricted __le32 fs/ext4/extents.c:5258:45: warning: bad assignment (-=) to restricted __le32 fs/ext4/extents.c:5303:28: warning: restricted __le32 degrades to integer fs/ext4/extents.c:5318:18: warning: incorrect type in assignment (different base types) fs/ext4/extents.c:5318:18: expected unsigned int [unsigned] [usertype] ex_start fs/ext4/extents.c:5318:18: got restricted __le32 [usertype] ee_block fs/ext4/extents.c:5319:24: warning: restricted __le32 degrades to integer fs/ext4/extents.c:5334:31: warning: incorrect type in assignment (different base types) ... Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-04-12ext4: use i_size_read in ext4_unaligned_aio()Theodore Ts'o
We haven't taken i_mutex yet, so we need to use i_size_read(). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2014-04-12fs: disallow all fallocate operation on active swapfileLukas Czerner
Currently some file system have IS_SWAPFILE check in their fallocate implementations and some do not. However we should really prevent any fallocate operation on swapfile so move the check to vfs and remove the redundant checks from the file systems fallocate implementations. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-04-12fs: move falloc collapse range check into the filesystem methodsLukas Czerner
Currently in do_fallocate in collapse range case we're checking whether offset + len is not bigger than i_size. However there is nothing which would prevent i_size from changing so the check is pointless. It should be done in the file system itself and the file system needs to make sure that i_size is not going to change. The i_size check for the other fallocate modes are also done in the filesystems. As it is now we can easily crash the kernel by having two processes doing truncate and fallocate collapse range at the same time. This can be reproduced on ext4 and it is theoretically possible on xfs even though I was not able to trigger it with this simple test. This commit removes the check from do_fallocate and adds it to the file system. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de>