summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2015-10-09f2fs: fix incorrect return statement in the function ↵Nicholas Krause
f2fs_ioc_release_volatile_write This fixes the incorrect return statement at the end of the function f2fs_ioc_release_volatile_write's body for returning zero as this is incorrect due to the function call before this return statement to the function punch_hole being able to fail and we should return this function's return fail directly in order to signal to callers of the function f2fs_ioc_release_volatile if a failure arises with this call to punch_hole fails. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-10-09f2fs: trace in batches extent info updateChao Yu
Rename trace_f2fs_update_extent_tree to trace_f2fs_update_extent_tree_range, then expand and enable it to trace in batches extent info updates. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-10-09nfsd/blocklayout: accept any minlengthChristoph Hellwig
Recent Linux clients have started to send GETLAYOUT requests with minlength less than blocksize. Servers aren't really allowed to impose this kind of restriction on layouts; see RFC 5661 section 18.43.3 for details. This has been observed to cause indefinite hangs on fsx runs on some clients. Cc: stable@vger.kernel.org Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-10-09Merge tag 'v4.3-rc4' into for-4.4/coreJens Axboe
Linux 4.3-rc4 Pulling in v4.3-rc4 to avoid conflicts with NVMe fixes that have gone in since for-4.4/core was based.
2015-10-08NFSv4: Unify synchronous and asynchronous error handlingTrond Myklebust
They now only differ in the way we handle waiting, so let's unify. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-08NFSv4: Don't use synchronous delegation recall in exception handlingTrond Myklebust
The code needs to be able to work from inside an asynchronous context. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-08NFSv4: nfs4_async_handle_error should take a non-const nfs_serverTrond Myklebust
For symmetry with the synchronous handler, and so that we can potentially handle errors such as NFS4ERR_BADNAME. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-08NFSv4: Update the delay statistics counter for synchronous delaysTrond Myklebust
Currently, we only do so for asynchronous delays. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-08NFSv4: Refactor NFSv4 error handlingTrond Myklebust
Prepare for unification of the synchronous and asynchronous error handling. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-08btrfs: switch more printks to our helpersDavid Sterba
Convert the simple cases, not all functions provide a way to reach the fs_info. Also skipped debugging messages (print-tree, integrity checker and pr_debug) and messages that are printed from possibly unfinished mount. Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-08btrfs: switch message printers to ratelimited variantsDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-08btrfs: introduce ratelimited variants of message printing functionsDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-08btrfs: switch message printers to ratelimited _in_rcu variantsDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-08btrfs: introduce ratelimited _in_rcu variants of message printing functionsDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-08btrfs: switch message printers to _in_rcu variantsDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-08btrfs: introduce _in_rcu variants of message printing functionsDavid Sterba
Due to the missing variants there are messages that lack the information printed by btrfs_info etc helpers. Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-07Merge tag 'nfs-for-4.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Bugfixes: - Fix a use-after-free bug in the RPC/RDMA client - Fix a write performance regression - Fix up page writeback accounting - Don't try to reclaim unused state owners - Fix a NFSv4 nograce recovery hang - reset states to use open_stateid when returning delegation voluntarily - Fix a tracepoint NULL-pointer dereference" * tag 'nfs-for-4.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: Fix a tracepoint NULL-pointer dereference nfs4: reset states to use open_stateid when returning delegation voluntarily NFSv4: Fix a nograce recovery hang NFSv4.1: nfs4_opendata_check_deleg needs to handle NFS4_OPEN_CLAIM_DELEG_CUR_FH NFSv4: Don't try to reclaim unused state owners NFS: Fix a write performance regression NFS: Fix up page writeback accounting xprtrdma: disconnect and flush cqs before freeing buffers
2015-10-06NFS: Fix a tracepoint NULL-pointer dereferenceAnna Schumaker
Running xfstest generic/013 with the tracepoint nfs:nfs4_open_file enabled produces a NULL-pointer dereference when calculating fileid and filehandle of the opened file. Fix this by checking if state is NULL before trying to use the inode pointer. Reported-by: Olga Kornievskaia <aglo@umich.edu> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-06BTRFS: support NFSv2 exportNeilBrown
The "fh_len" passed to ->fh_to_* is not guaranteed to be that same as that returned by encode_fh - it may be larger. With NFSv2, the filehandle is fixed length, so it may appear longer than expected and be zero-padded. So we must test that fh_len is at least some value, not exactly equal to it. Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: David Sterba <dsterba@suse.cz>
2015-10-06Btrfs: open_ctree: Fix possible memory leakchandan
After reading one of chunk or tree root tree's root node from disk, if the root node does not have EXTENT_BUFFER_UPTODATE flag set, we fail to release the memory used by the root node. Fix this. Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
2015-10-06Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull CIFS fixes from Steve French: "Two fixes for problems pointed out by automated tools. Thanks PaX/grsecurity team and Dan Carpenter (and the Smatch tool)" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: [CIFS] Update cifs version number [SMB3] Do not fall back to SMBWriteX in set_file_size error cases [SMB3] Missing null tcon check
2015-10-05Btrfs: fix deadlock when finalizing block group creationFilipe Manana
Josef ran into a deadlock while a transaction handle was finalizing the creation of its block groups, which produced the following trace: [260445.593112] fio D ffff88022a9df468 0 8924 4518 0x00000084 [260445.593119] ffff88022a9df468 ffffffff81c134c0 ffff880429693c00 ffff88022a9df488 [260445.593126] ffff88022a9e0000 ffff8803490d7b00 ffff8803490d7b18 ffff88022a9df4b0 [260445.593132] ffff8803490d7af8 ffff88022a9df488 ffffffff8175a437 ffff8803490d7b00 [260445.593137] Call Trace: [260445.593145] [<ffffffff8175a437>] schedule+0x37/0x80 [260445.593189] [<ffffffffa0850f37>] btrfs_tree_lock+0xa7/0x1f0 [btrfs] [260445.593197] [<ffffffff810db7c0>] ? prepare_to_wait_event+0xf0/0xf0 [260445.593225] [<ffffffffa07eac44>] btrfs_lock_root_node+0x34/0x50 [btrfs] [260445.593253] [<ffffffffa07eff6b>] btrfs_search_slot+0x88b/0xa00 [btrfs] [260445.593295] [<ffffffffa08389df>] ? free_extent_buffer+0x4f/0x90 [btrfs] [260445.593324] [<ffffffffa07f1a06>] btrfs_insert_empty_items+0x66/0xc0 [btrfs] [260445.593351] [<ffffffffa07ea94a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs] [260445.593394] [<ffffffffa08403b9>] btrfs_finish_chunk_alloc+0x1c9/0x570 [btrfs] [260445.593427] [<ffffffffa08002ab>] btrfs_create_pending_block_groups+0x11b/0x200 [btrfs] [260445.593459] [<ffffffffa0800964>] do_chunk_alloc+0x2a4/0x2e0 [btrfs] [260445.593491] [<ffffffffa0803815>] find_free_extent+0xa55/0xd90 [btrfs] [260445.593524] [<ffffffffa0803c22>] btrfs_reserve_extent+0xd2/0x220 [btrfs] [260445.593532] [<ffffffff8119fe5d>] ? account_page_dirtied+0xdd/0x170 [260445.593564] [<ffffffffa0803e78>] btrfs_alloc_tree_block+0x108/0x4a0 [btrfs] [260445.593597] [<ffffffffa080c9de>] ? btree_set_page_dirty+0xe/0x10 [btrfs] [260445.593626] [<ffffffffa07eb5cd>] __btrfs_cow_block+0x12d/0x5b0 [btrfs] [260445.593654] [<ffffffffa07ebbff>] btrfs_cow_block+0x11f/0x1c0 [btrfs] [260445.593682] [<ffffffffa07ef8c7>] btrfs_search_slot+0x1e7/0xa00 [btrfs] [260445.593724] [<ffffffffa08389df>] ? free_extent_buffer+0x4f/0x90 [btrfs] [260445.593752] [<ffffffffa07f1a06>] btrfs_insert_empty_items+0x66/0xc0 [btrfs] [260445.593830] [<ffffffffa07ea94a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs] [260445.593905] [<ffffffffa08403b9>] btrfs_finish_chunk_alloc+0x1c9/0x570 [btrfs] [260445.593946] [<ffffffffa08002ab>] btrfs_create_pending_block_groups+0x11b/0x200 [btrfs] [260445.593990] [<ffffffffa0815798>] btrfs_commit_transaction+0xa8/0xb40 [btrfs] [260445.594042] [<ffffffffa085abcd>] ? btrfs_log_dentry_safe+0x6d/0x80 [btrfs] [260445.594089] [<ffffffffa082bc84>] btrfs_sync_file+0x294/0x350 [btrfs] [260445.594115] [<ffffffff8123e29b>] vfs_fsync_range+0x3b/0xa0 [260445.594133] [<ffffffff81023891>] ? syscall_trace_enter_phase1+0x131/0x180 [260445.594149] [<ffffffff8123e35d>] do_fsync+0x3d/0x70 [260445.594169] [<ffffffff81023bb8>] ? syscall_trace_leave+0xb8/0x110 [260445.594187] [<ffffffff8123e600>] SyS_fsync+0x10/0x20 [260445.594204] [<ffffffff8175de6e>] entry_SYSCALL_64_fastpath+0x12/0x71 This happened because the same transaction handle created a large number of block groups and while finalizing their creation (inserting new items and updating existing items in the chunk and device trees) a new metadata extent had to be allocated and no free space was found in the current metadata block groups, which made find_free_extent() attempt to allocate a new block group via do_chunk_alloc(). However at do_chunk_alloc() we ended up allocating a new system chunk too and exceeded the threshold of 2Mb of reserved chunk bytes, which makes do_chunk_alloc() enter the final part of block group creation again (at btrfs_create_pending_block_groups()) and attempt to lock again the root of the chunk tree when it's already write locked by the same task. Similarly we can deadlock on extent tree nodes/leafs if while we are running delayed references we end up creating a new metadata block group in order to allocate a new node/leaf for the extent tree (as part of a CoW operation or growing the tree), as btrfs_create_pending_block_groups inserts items into the extent tree as well. In this case we get the following trace: [14242.773581] fio D ffff880428ca3418 0 3615 3100 0x00000084 [14242.773588] ffff880428ca3418 ffff88042d66b000 ffff88042a03c800 ffff880428ca3438 [14242.773594] ffff880428ca4000 ffff8803e4b20190 ffff8803e4b201a8 ffff880428ca3460 [14242.773600] ffff8803e4b20188 ffff880428ca3438 ffffffff8175a437 ffff8803e4b20190 [14242.773606] Call Trace: [14242.773613] [<ffffffff8175a437>] schedule+0x37/0x80 [14242.773656] [<ffffffffa057ff07>] btrfs_tree_lock+0xa7/0x1f0 [btrfs] [14242.773664] [<ffffffff810db7c0>] ? prepare_to_wait_event+0xf0/0xf0 [14242.773692] [<ffffffffa0519c44>] btrfs_lock_root_node+0x34/0x50 [btrfs] [14242.773720] [<ffffffffa051ef6b>] btrfs_search_slot+0x88b/0xa00 [btrfs] [14242.773750] [<ffffffffa0520a06>] btrfs_insert_empty_items+0x66/0xc0 [btrfs] [14242.773758] [<ffffffff811ef4a2>] ? kmem_cache_alloc+0x1d2/0x200 [14242.773786] [<ffffffffa0520ad1>] btrfs_insert_item+0x71/0xf0 [btrfs] [14242.773818] [<ffffffffa052f292>] btrfs_create_pending_block_groups+0x102/0x200 [btrfs] [14242.773850] [<ffffffffa052f96e>] do_chunk_alloc+0x2ae/0x2f0 [btrfs] [14242.773934] [<ffffffffa0532825>] find_free_extent+0xa55/0xd90 [btrfs] [14242.773998] [<ffffffffa0532c22>] btrfs_reserve_extent+0xc2/0x1d0 [btrfs] [14242.774041] [<ffffffffa0532e38>] btrfs_alloc_tree_block+0x108/0x4a0 [btrfs] [14242.774078] [<ffffffffa051a5cd>] __btrfs_cow_block+0x12d/0x5b0 [btrfs] [14242.774118] [<ffffffffa051abff>] btrfs_cow_block+0x11f/0x1c0 [btrfs] [14242.774155] [<ffffffffa051e8c7>] btrfs_search_slot+0x1e7/0xa00 [btrfs] [14242.774194] [<ffffffffa0528021>] ? __btrfs_free_extent.isra.70+0x2e1/0xcb0 [btrfs] [14242.774235] [<ffffffffa0520a06>] btrfs_insert_empty_items+0x66/0xc0 [btrfs] [14242.774274] [<ffffffffa051994a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs] [14242.774318] [<ffffffffa052c433>] __btrfs_run_delayed_refs+0xbb3/0x1020 [btrfs] [14242.774358] [<ffffffffa052f404>] btrfs_run_delayed_refs.part.78+0x74/0x280 [btrfs] [14242.774391] [<ffffffffa052f627>] btrfs_run_delayed_refs+0x17/0x20 [btrfs] [14242.774432] [<ffffffffa05be236>] commit_cowonly_roots+0x8d/0x2bd [btrfs] [14242.774474] [<ffffffffa059d07f>] ? __btrfs_run_delayed_items+0x1cf/0x210 [btrfs] [14242.774516] [<ffffffffa05adac3>] ? btrfs_qgroup_account_extents+0x83/0x130 [btrfs] [14242.774558] [<ffffffffa0544c40>] btrfs_commit_transaction+0x590/0xb40 [btrfs] [14242.774599] [<ffffffffa0589b9d>] ? btrfs_log_dentry_safe+0x6d/0x80 [btrfs] [14242.774642] [<ffffffffa055ac54>] btrfs_sync_file+0x294/0x350 [btrfs] [14242.774650] [<ffffffff8123e29b>] vfs_fsync_range+0x3b/0xa0 [14242.774657] [<ffffffff81023891>] ? syscall_trace_enter_phase1+0x131/0x180 [14242.774663] [<ffffffff8123e35d>] do_fsync+0x3d/0x70 [14242.774669] [<ffffffff81023bb8>] ? syscall_trace_leave+0xb8/0x110 [14242.774675] [<ffffffff8123e600>] SyS_fsync+0x10/0x20 [14242.774681] [<ffffffff8175de6e>] entry_SYSCALL_64_fastpath+0x12/0x71 Fix this by never recursing into the finalization phase of block group creation and making sure we never trigger the finalization of block group creation while running delayed references. Reported-by: Josef Bacik <jbacik@fb.com> Fixes: 00d80e342c0f ("Btrfs: fix quick exhaustion of the system array in the superblock") Signed-off-by: Filipe Manana <fdmanana@suse.com>
2015-10-05Btrfs: update fix for read corruption of compressed and shared extentsFilipe Manana
My previous fix in commit 005efedf2c7d ("Btrfs: fix read corruption of compressed and shared extents") was effective only if the compressed extents cover a file range with a length that is not a multiple of 16 pages. That's because the detection of when we reached a different range of the file that shares the same compressed extent as the previously processed range was done at extent_io.c:__do_contiguous_readpages(), which covers subranges with a length up to 16 pages, because extent_readpages() groups the pages in clusters no larger than 16 pages. So fix this by tracking the start of the previously processed file range's extent map at extent_readpages(). The following test case for fstests reproduces the issue: seq=`basename $0` seqres=$RESULT_DIR/$seq echo "QA output created by $seq" tmp=/tmp/$$ status=1 # failure is the default! trap "_cleanup; exit \$status" 0 1 2 3 15 _cleanup() { rm -f $tmp.* } # get standard environment, filters and checks . ./common/rc . ./common/filter # real QA test starts here _need_to_be_root _supported_fs btrfs _supported_os Linux _require_scratch _require_cloner rm -f $seqres.full test_clone_and_read_compressed_extent() { local mount_opts=$1 _scratch_mkfs >>$seqres.full 2>&1 _scratch_mount $mount_opts # Create our test file with a single extent of 64Kb that is going to # be compressed no matter which compression algo is used (zlib/lzo). $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 64K" \ $SCRATCH_MNT/foo | _filter_xfs_io # Now clone the compressed extent into an adjacent file offset. $CLONER_PROG -s 0 -d $((64 * 1024)) -l $((64 * 1024)) \ $SCRATCH_MNT/foo $SCRATCH_MNT/foo echo "File digest before unmount:" md5sum $SCRATCH_MNT/foo | _filter_scratch # Remount the fs or clear the page cache to trigger the bug in # btrfs. Because the extent has an uncompressed length that is a # multiple of 16 pages, all the pages belonging to the second range # of the file (64K to 128K), which points to the same extent as the # first range (0K to 64K), had their contents full of zeroes instead # of the byte 0xaa. This was a bug exclusively in the read path of # compressed extents, the correct data was stored on disk, btrfs # just failed to fill in the pages correctly. _scratch_remount echo "File digest after remount:" # Must match the digest we got before. md5sum $SCRATCH_MNT/foo | _filter_scratch } echo -e "\nTesting with zlib compression..." test_clone_and_read_compressed_extent "-o compress=zlib" _scratch_unmount echo -e "\nTesting with lzo compression..." test_clone_and_read_compressed_extent "-o compress=lzo" status=0 exit Cc: stable@vger.kernel.org Signed-off-by: Filipe Manana <fdmanana@suse.com> Tested-by: Timofey Titovets <nefelim4ag@gmail.com>
2015-10-05Btrfs: send, fix corner case for reference overwrite detectionFilipe Manana
When the inode given to did_overwrite_ref() matches the current progress and has a reference that collides with the reference of other inode that has the same number as the current progress, we were always telling our caller that the inode's reference was overwritten, which is incorrect because the other inode might be a new inode (different generation number) in which case we must return false from did_overwrite_ref() so that its callers don't use an orphanized path for the inode (as it will never be orphanized, instead it will be unlinked and the new inode created later). The following test case for fstests reproduces the issue: seq=`basename $0` seqres=$RESULT_DIR/$seq echo "QA output created by $seq" tmp=/tmp/$$ status=1 # failure is the default! trap "_cleanup; exit \$status" 0 1 2 3 15 _cleanup() { rm -fr $send_files_dir rm -f $tmp.* } # get standard environment, filters and checks . ./common/rc . ./common/filter # real QA test starts here _supported_fs btrfs _supported_os Linux _require_scratch _need_to_be_root send_files_dir=$TEST_DIR/btrfs-test-$seq rm -f $seqres.full rm -fr $send_files_dir mkdir $send_files_dir _scratch_mkfs >>$seqres.full 2>&1 _scratch_mount # Create our test file with a single extent of 64K. mkdir -p $SCRATCH_MNT/foo $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 64K" $SCRATCH_MNT/foo/bar \ | _filter_xfs_io _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \ $SCRATCH_MNT/mysnap1 _run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT \ $SCRATCH_MNT/mysnap2 echo "File digest before being replaced:" md5sum $SCRATCH_MNT/mysnap1/foo/bar | _filter_scratch # Remove the file and then create a new one in the same location with # the same name but with different content. This new file ends up # getting the same inode number as the previous one, because that inode # number was the highest inode number used by the snapshot's root and # therefore when attempting to find the a new inode number for the new # file, we end up reusing the same inode number. This happens because # currently btrfs uses the highest inode number summed by 1 for the # first inode created once a snapshot's root is loaded (done at # fs/btrfs/inode-map.c:btrfs_find_free_objectid in the linux kernel # tree). # Having these two different files in the snapshots with the same inode # number (but different generation numbers) caused the btrfs send code # to emit an incorrect path for the file when issuing an unlink # operation because it failed to realize they were different files. rm -f $SCRATCH_MNT/mysnap2/foo/bar $XFS_IO_PROG -f -c "pwrite -S 0xbb 0 96K" \ $SCRATCH_MNT/mysnap2/foo/bar | _filter_xfs_io _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT/mysnap2 \ $SCRATCH_MNT/mysnap2_ro _run_btrfs_util_prog send $SCRATCH_MNT/mysnap1 -f $send_files_dir/1.snap _run_btrfs_util_prog send -p $SCRATCH_MNT/mysnap1 \ $SCRATCH_MNT/mysnap2_ro -f $send_files_dir/2.snap echo "File digest in the original filesystem after being replaced:" md5sum $SCRATCH_MNT/mysnap2_ro/foo/bar | _filter_scratch # Now recreate the filesystem by receiving both send streams and verify # we get the same file contents that the original filesystem had. _scratch_unmount _scratch_mkfs >>$seqres.full 2>&1 _scratch_mount _run_btrfs_util_prog receive -vv $SCRATCH_MNT -f $send_files_dir/1.snap _run_btrfs_util_prog receive -vv $SCRATCH_MNT -f $send_files_dir/2.snap echo "File digest in the new filesystem:" # Must match the digest from the new file. md5sum $SCRATCH_MNT/mysnap2_ro/foo/bar | _filter_scratch status=0 exit Reported-by: Martin Raiber <martin@urbackup.org> Fixes: 8b191a684968 ("Btrfs: incremental send, check if orphanized dir inode needs delayed rename") Signed-off-by: Filipe Manana <fdmanana@suse.com>
2015-10-04jffs2: fix a memleak in read_direntry()Wei Fang
Need to free the memory allocated for 'fd' if failed to read all of the remainder name. Signed-off-by: Wei Fang <fangwei1@huawei.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2015-10-04sysfs: correctly handle short reads on PREALLOC attrs.NeilBrown
attributes declared with __ATTR_PREALLOC use sysfs_kf_read() which ignores the 'count' arg. So a 1-byte read request can return more bytes than that. This is seen with the 'dash' shell when 'read' is used on some 'md' sysfs attributes. So only return the 'min' of count and the attribute length. Signed-off-by: NeilBrown <neilb@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-04debugfs: document that debugfs_remove*() accepts NULL and error valuesUlf Magnusson
According to commit a59d6293e537 ("debugfs: change parameter check in debugfs_remove() functions"), this is meant to make cleanup easier for callers. In that case it ought to be documented. Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-04debugfs: Pass bool pointer to debugfs_create_bool()Viresh Kumar
Its a bit odd that debugfs_create_bool() takes 'u32 *' as an argument, when all it needs is a boolean pointer. It would be better to update this API to make it accept 'bool *' instead, as that will make it more consistent and often more convenient. Over that bool takes just a byte. That required updates to all user sites as well, in the same commit updating the API. regmap core was also using debugfs_{read|write}_file_bool(), directly and variable types were updated for that to be bool as well. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-03[CIFS] Update cifs version numberSteve French
Update modinfo cifs.ko version number to 2.08 Signed-off-by: Steve French <steve.french@primarydata.com>
2015-10-03UBIFS: print verbose message when rescanning a corrupted nodeshengyong
This is a trivial fix of showing verbose message when leb-recovery detects a corrupted node, which is not the last one in the LEB. Rescan expects to show more detail of the corrupted node. Reviewed-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> Signed-off-by: Sheng Yong <shengyong1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2015-10-03UBIFS: call dbg_is_power_cut() instead of reading c->dbg->pc_happenedshengyong
Call dbg_is_power_cut() to emulate power cut instead of reading c->dbg->pc_happened. Otherwise, the function becomes dead code. Signed-off-by: Sheng Yong <shengyong1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2015-10-03UBIFS: fix a typo in comment of ubifs_budget_reqDongsheng Yang
s/now/how Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> Reviewed-by: Richard Weinberger <richard@nod.at> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-10-03UBIFS: use kmemdup rather than duplicating its implementationAndrzej Hajda
The patch was generated using fixed coccinelle semantic patch scripts/coccinelle/api/memdup.cocci [1]. [1]: http://permalink.gmane.org/gmane.linux.kernel/2014320 Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Reviewed-by: Richard Weinberger <richard@nod.at> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-10-03ext4 crypto: fix bugs in ext4_encrypted_zeroout()Theodore Ts'o
Fix multiple bugs in ext4_encrypted_zeroout(), including one that could cause us to write an encrypted zero page to the wrong location on disk, potentially causing data and file system corruption. Fortunately, this tends to only show up in stress tests, but even with these fixes, we are seeing some test failures with generic/127 --- but these are now caused by data failures instead of metadata corruption. Since ext4_encrypted_zeroout() is only used for some optimizations to keep the extent tree from being too fragmented, and ext4_encrypted_zeroout() itself isn't all that optimized from a time or IOPS perspective, disable the extent tree optimization for encrypted inodes for now. This prevents the data corruption issues reported by generic/127 until we can figure out what's going wrong. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2015-10-03ext4 crypto: replace some BUG_ON()'s with error checksTheodore Ts'o
Buggy (or hostile) userspace should not be able to cause the kernel to crash. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2015-10-03ext4 crypto: ext4_page_crypto() doesn't need a encryption contextTheodore Ts'o
Since ext4_page_crypto() doesn't need an encryption context (at least not any more), this allows us to simplify a number function signature and also allows us to avoid needing to allocate a context in ext4_block_write_begin(). It also means we no longer need a separate ext4_decrypt_one() function. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-10-03ext4: optimize ext4_writepage() for attempted 4k delalloc writesTheodore Ts'o
In cases where the file system block size is the same as the page size, and ext4_writepage() is asked to write out a page which is either has the unwritten bit set in the extent tree, or which does not yet have a block assigned due to delayed allocation, we can bail out early and, unlocking the page earlier and avoiding a round trip through ext4_bio_write_page() with the attendant calls to set_page_writeback() and redirty_page_for_writeback(). Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-10-02ext4 crypto: fix memory leak in ext4_bio_write_page()Theodore Ts'o
There are times when ext4_bio_write_page() is called even though we don't actually need to do any I/O. This happens when ext4_writepage() gets called by the jbd2 commit path when an inode needs to force its pages written out in order to provide data=ordered guarantees --- and a page is backed by an unwritten (e.g., uninitialized) block on disk, or if delayed allocation means the page's backing store hasn't been allocated yet. In that case, we need to skip the call to ext4_encrypt_page(), since in addition to wasting CPU, it leads to a bounce page and an ext4 crypto context getting leaked. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2015-10-02nfs4: reset states to use open_stateid when returning delegation voluntarilyJeff Layton
When the client goes to return a delegation, it should always update any nfs4_state currently set up to use that delegation stateid to instead use the open stateid. It already does do this in some cases, particularly in the state recovery code, but not currently when the delegation is voluntarily returned (e.g. in advance of a RENAME). This causes the client to try to continue using the delegation stateid after the DELEGRETURN, e.g. in LAYOUTGET. Set the nfs4_state back to using the open stateid in nfs4_open_delegation_recall, just before clearing the NFS_DELEGATED_STATE bit. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-02NFSv4: Fix a nograce recovery hangBenjamin Coddington
Since commit 5cae02f42793130e1387f4ec09c4d07056ce9fa5 an OPEN_CONFIRM should have a privileged sequence in the recovery case to allow nograce recovery to proceed for NFSv4.0. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-02NFSv4.1: nfs4_opendata_check_deleg needs to handle NFS4_OPEN_CLAIM_DELEG_CUR_FHTrond Myklebust
We need to warn against broken NFSv4.1 servers that try to hand out delegations in response to NFS4_OPEN_CLAIM_DELEG_CUR_FH. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-02NFSv4: Don't try to reclaim unused state ownersTrond Myklebust
Currently, we don't test if the state owner is in use before we try to recover it. The problem is that if the refcount is zero, then the state owner will be waiting on the lru list for garbage collection. The expectation in that case is that if you bump the refcount, then you must also remove the state owner from the lru list. Otherwise the call to nfs4_put_state_owner will corrupt that list by trying to add our state owner a second time. Avoid the whole problem by just skipping state owners that hold no state. Reported-by: Andrew W Elble <aweits@rit.edu> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-02NFS: Fix a write performance regressionTrond Myklebust
If all other conditions in nfs_can_extend_write() are met, and there are no locks, then we should be able to assume close-to-open semantics and the ability to extend our write to cover the whole page. With this patch, the xfstests generic/074 test completes in 242s instead of >1400s on my test rig. Fixes: bd61e0a9c852 ("locks: convert posix locks to file_lock_context") Cc: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-02NFS: Fix up page writeback accountingTrond Myklebust
Currently, we are crediting all the calls to nfs_writepages_callback() (i.e. the nfs_writepages() callback) to nfs_writepage(). Aside from being inconsistent with the behaviour of the equivalent readpage/readpages accounting, this also means that we cannot distinguish between bulk writes and single page writebacks (which confuses the 'nfsiostat -p' tool). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-01[SMB3] Do not fall back to SMBWriteX in set_file_size error casesSteve French
The error paths in set_file_size for cifs and smb3 are incorrect. In the unlikely event that a server did not support set file info of the file size, the code incorrectly falls back to trying SMBWriteX (note that only the original core SMB Write, used for example by DOS, can set the file size this way - this actually does not work for the more recent SMBWriteX). The idea was since the old DOS SMB Write could set the file size if you write zero bytes at that offset then use that if server rejects the normal set file info call. Fortunately the SMBWriteX will never be sent on the wire (except when file size is zero) since the length and offset fields were reversed in the two places in this function that call SMBWriteX causing the fall back path to return an error. It is also important to never call an SMB request from an SMB2/sMB3 session (which theoretically would be possible, and can cause a brief session drop, although the client recovers) so this should be fixed. In practice this path does not happen with modern servers but the error fall back to SMBWriteX is clearly wrong. Removing the calls to SMBWriteX in the error paths in cifs_set_file_size Pointed out by PaX/grsecurity team Signed-off-by: Steve French <steve.french@primarydata.com> Reported-by: PaX Team <pageexec@freemail.hu> CC: Emese Revfy <re.emese@gmail.com> CC: Brad Spengler <spender@grsecurity.net> CC: Stable <stable@vger.kernel.org>
2015-10-01dax: fix NULL pointer in __dax_pmd_fault()Ross Zwisler
Commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for DAX") moved some code in __dax_pmd_fault() that was responsible for zeroing newly allocated PMD pages. The new location didn't properly set up 'kaddr', so when run this code resulted in a NULL pointer BUG. Fix this by getting the correct 'kaddr' via bdev_direct_access(). Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reported-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-01gfs2: Add missing else in trans_add_meta/dataBob Peterson
This patch fixes a timing window that causes a segfault. The problem is that bd can remain NULL throughout the function and then reference that NULL pointer if the bh->b_private starts out NULL, then someone sets it to non-NULL inside the locking. In that case, bd still needs to be set. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2015-10-01Btrfs: move kobj stuff out of dev_replace lock rangeLiu Bo
To avoid deadlock described in commit 084b6e7c7607 ("btrfs: Fix a lockdep warning when running xfstest."), we should move kobj stuff out of dev_replace lock range. "It is because the btrfs_kobj_{add/rm}_device() will call memory allocation with GFP_KERNEL, which may flush fs page cache to free space, waiting for it self to do the commit, causing the deadlock. To solve the problem, move btrfs_kobj_{add/rm}_device() out of the dev_replace lock range, also involing split the btrfs_rm_dev_replace_srcdev() function into remove and free parts. Now only btrfs_rm_dev_replace_remove_srcdev() is called in dev_replace lock range, and kobj_{add/rm} and btrfs_rm_dev_replace_free_srcdev() are called out of the lock range." Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> [added lockup description] Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-01Btrfs: add helper for closing one deviceAnand Jain
Signed-off-by: Anand Jain <anand.jain@oracle.com> [reworded subject and changelog] Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-01Btrfs: don't log error from btrfs_get_bdev_and_sbAnand Jain
Originally the message was not in a helper but ended up there. We should print error messages from callers instead. Signed-off-by: Anand Jain <anand.jain@oracle.com> [reworded subject and changelog] Signed-off-by: David Sterba <dsterba@suse.com>