summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2018-01-14pnfs/blocklayout: Add module alias for LAYOUT4_SCSIBenjamin Coddington
The blocklayout module contains the client support for both block and SCSI layouts. Add a module alias for the SCSI layout type so that the module will be loaded for SCSI layouts. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-14NFS: remove unused offset arg in nfs_pgio_rpcsetupBenjamin Coddington
nfs_pgio_rpcsetup() is always called with an offset of 0, so we should be able to drop the arguement altogether. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-14NFSv4: always set NFS_LOCK_LOST when a lock is lost.NeilBrown
There are 2 comments in the NFSv4 code which suggest that SIGLOST should possibly be sent to a process. In these cases a lock has been lost. The current practice is to set NFS_LOCK_LOST so that read/write returns EIO when a lock is lost. So change these comments to code when sets NFS_LOCK_LOST. One case is when lock recovery after apparent server restart fails with NFS4ERR_DENIED, NFS4ERR_RECLAIM_BAD, or NFS4ERRO_RECLAIM_CONFLICT. The other case is when a lock attempt as part of lease recovery fails with NFS4ERR_DENIED. In an ideal world, these should not happen. However I have a packet trace showing an NFSv4.1 session getting NFS4ERR_BADSESSION after an extended network parition. The NFSv4.1 client treats this like server reboot until/unless it get NFS4ERR_NO_GRACE, in which case it switches over to "nograce" recovery mode. In this network trace, the client attempts to recover a lock and the server (incorrectly) reports NFS4ERR_DENIED rather than NFS4ERR_NO_GRACE. This leads to the ineffective comment and the client then continues to write using the OPEN stateid. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-14nfs: remove dead code from nfs_encode_fh()NeilBrown
This code can never be used as the IS_AUTOMOUNT(inode) case has already been handled. So remove it to avoid confusion. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-14Support statx() mask and query flags parametersTrond Myklebust
Support the query flags AT_STATX_FORCE_SYNC by forcing an attribute revalidation, and AT_STATX_DONT_SYNC by returning cached attributes only. Use the mask to optimise away server revalidation for attributes that are not being requested by the user. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-14NFS: Fix nfsstat breakage due to LOOKUPPTrond Myklebust
The LOOKUPP operation was inserted into the nfs4_procedures array rather than being appended, which put /proc/net/rpc/nfs out of whack, and broke the nfsstat utility. Fix by moving the LOOKUPP operation to the end of the array, and by ensuring that it keeps the same length whether or not NFSV4.1 and NFSv4.2 are compiled in. Fixes: 5b5faaf6df734 ("nfs4: add NFSv4 LOOKUPP handlers") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-14NFSv4: Convert LOCKU to use nfs4_async_handle_exception()Trond Myklebust
Convert CLOSE so that it specifies the correct stateid and inode for the error handling. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-14NFSv4: Convert DELEGRETURN to use nfs4_handle_exception()Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-14NFSv4: Convert CLOSE to use nfs4_async_handle_exception()Trond Myklebust
Convert CLOSE so that it specifies the correct stateid, state and inode for the error handling. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-14NFS: Add a cond_resched() to nfs_commit_release_pages()Trond Myklebust
The commit list can get very large, and so we need a cond_resched() in nfs_commit_release_pages() in order to ensure we don't hog the CPU for excessive periods of time. Reported-by: Mike Galbraith <efault@gmx.de> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-12error-injection: Add injectable error typesMasami Hiramatsu
Add injectable error types for each error-injectable function. One motivation of error injection test is to find software flaws, mistakes or mis-handlings of expectable errors. If we find such flaws by the test, that is a program bug, so we need to fix it. But if the tester miss input the error (e.g. just return success code without processing anything), it causes unexpected behavior even if the caller is correctly programmed to handle any errors. That is not what we want to test by error injection. To clarify what type of errors the caller must expect for each injectable function, this introduces injectable error types: - EI_ETYPE_NULL : means the function will return NULL if it fails. No ERR_PTR, just a NULL. - EI_ETYPE_ERRNO : means the function will return -ERRNO if it fails. - EI_ETYPE_ERRNO_NULL : means the function will return -ERRNO (ERR_PTR) or NULL. ALLOW_ERROR_INJECTION() macro is expanded to get one of NULL, ERRNO, ERRNO_NULL to record the error type for each function. e.g. ALLOW_ERROR_INJECTION(open_ctree, ERRNO) This error types are shown in debugfs as below. ==== / # cat /sys/kernel/debug/error_injection/list open_ctree [btrfs] ERRNO io_ctl_init [btrfs] ERRNO ==== Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-12error-injection: Separate error-injection from kprobeMasami Hiramatsu
Since error-injection framework is not limited to be used by kprobes, nor bpf. Other kernel subsystems can use it freely for checking safeness of error-injection, e.g. livepatch, ftrace etc. So this separate error-injection framework from kprobes. Some differences has been made: - "kprobe" word is removed from any APIs/structures. - BPF_ALLOW_ERROR_INJECTION() is renamed to ALLOW_ERROR_INJECTION() since it is not limited for BPF too. - CONFIG_FUNCTION_ERROR_INJECTION is the config item of this feature. It is automatically enabled if the arch supports error injection feature for kprobe or ftrace etc. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-12xfs: account finobt blocks properly in perag reservationBrian Foster
XFS started using the perag metadata reservation pool for free inode btree blocks in commit 76d771b4cbe33 ("xfs: use per-AG reservations for the finobt"). To handle backwards compatibility, finobt blocks are accounted against the pool so long as the full reservation is available at mount time. Otherwise the ->m_inotbt_nores flag is set and the filesystem falls back to the traditional per-transaction finobt reservation. This commit has two problems: - finobt blocks are always accounted against the metadata reservation on allocation, regardless of ->m_inotbt_nores state - finobt blocks are never returned to the reservation pool on free The first problem affects reflink+finobt filesystems where the full finobt reservation is not available at mount time. finobt blocks are essentially stolen from the reflink reservation, putting refcountbt management at risk of allocation failure. The second problem is an unconditional leak of metadata reservation whenever finobt is enabled. Update the finobt block allocation callouts to consider ->m_inotbt_nores and account blocks appropriately. Blocks should be consistently accounted against the metadata pool when ->m_inotbt_nores is false and otherwise tagged as RESV_NONE. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-01-12xfs: fix check on struct_version for versions 4 or greaterColin Ian King
It appears that the check for versions 4 or more is incorrect and is off-by-one. Fix this. Detected by CoverityScan, CID#1463775 ("Logically dead code") Fixes: ac503a4cc9e8 ("xfs: refactor the geometry structure filling function") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-01-12xfs: destroy mutex pag_ici_reclaim_lock before freeXiongwei Song
The mutex pag_ici_reclaim_lock of xfs_perag_t structure is initialized in xfs_initialize_perag. If happen errors in xfs_initialize_perag, or free resources in xfs_free_perag, wo need to destroy the mutex before free perag. Signed-off-by: Xiongwei Song <sxwjean@me.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-01-12xfs: use %px for data pointers when debuggingDarrick J. Wong
Starting with commit 57e734423ad ("vsprintf: refactor %pK code out of pointer"), the behavior of the raw '%p' printk format specifier was changed to print a 32-bit hash of the pointer value to avoid leaking kernel pointers into dmesg. For most situations that's good. This is /undesirable/ behavior when we're trying to debug XFS, however, so define a PTR_FMT that prints the actual pointer when we're in debug mode. Note that %p for tracepoints still prints the raw pointer, so in the long run we could consider rewriting some of these messages as tracepoints. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-12xfs: use %pS printk format for direct instruction addressesDarrick J. Wong
Use the %pS instead of the %pF printk format specifier for printing symbols from direct addresses. This is needed for the ia64, ppc64 and parisc64 architectures. While we're at it, be consistent with the capitalization of the 'S'. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-12xfs: change 0x%p -> %p in print messagesDarrick J. Wong
Since %p prepends "0x" to the outputted string, we can drop the prefix. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-12signal: Ensure generic siginfos the kernel sends have all bits initializedEric W. Biederman
Call clear_siginfo to ensure stack allocated siginfos are fully initialized before being passed to the signal sending functions. This ensures that if there is the kind of confusion documented by TRAP_FIXME, FPE_FIXME, or BUS_FIXME the kernel won't send unitialized data to userspace when the kernel generates a signal with SI_USER but the copy to userspace assumes it is a different kind of signal, and different fields are initialized. This also prepares the way for turning copy_siginfo_to_user into a copy_to_user, by removing the need in many cases to perform a field by field copy simply to skip the uninitialized fields. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-01-11fscrypt: remove 'ci' parameter from fscrypt_put_encryption_info()Eric Biggers
fscrypt_put_encryption_info() is only called when evicting an inode, so the 'struct fscrypt_info *ci' parameter is always NULL, and there cannot be races with other threads. This was cruft left over from the broken key revocation code. Remove the unused parameter and the cmpxchg(). Also remove the #ifdefs around the fscrypt_put_encryption_info() calls, since fscrypt_notsupp.h defines a no-op stub for it. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11fscrypt: fix up fscrypt_fname_encrypted_size() for internal useEric Biggers
Filesystems don't need fscrypt_fname_encrypted_size() anymore, so unexport it and move it to fscrypt_private.h. We also never calculate the encrypted size of a filename without having the fscrypt_info present since it is needed to know the amount of NUL-padding which is determined by the encryption policy, and also we will always truncate the NUL-padding to the maximum filename length. Therefore, also make fscrypt_fname_encrypted_size() assume that the fscrypt_info is present, and make it truncate the returned length to the specified max_len. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11fscrypt: define fscrypt_fname_alloc_buffer() to be for presented namesEric Biggers
Previously fscrypt_fname_alloc_buffer() was used to allocate buffers for both presented (decrypted or encoded) and encrypted filenames. That was confusing, because it had to allocate the worst-case size for either, e.g. including NUL-padding even when it was meaningless. But now that fscrypt_setup_filename() no longer calls it, it is only used in the ->get_link() and ->readdir() paths, which specifically want a buffer for presented filenames. Therefore, switch the behavior over to allocating the buffer for presented filenames only. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11fscrypt: calculate NUL-padding length in one place onlyEric Biggers
Currently, when encrypting a filename (either a real filename or a symlink target) we calculate the amount of NUL-padding twice: once before encryption and once during encryption in fname_encrypt(). It is needed before encryption to allocate the needed buffer size as well as calculate the size the symlink target will take up on-disk before creating the symlink inode. Calculating the size during encryption as well is redundant. Remove this redundancy by always calculating the exact size beforehand, and making fname_encrypt() just add as much NUL padding as is needed to fill the output buffer. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11fscrypt: move fscrypt_symlink_data to fscrypt_private.hEric Biggers
Now that all filesystems have been converted to use the symlink helper functions, they no longer need the declaration of 'struct fscrypt_symlink_data'. Move it from fscrypt.h to fscrypt_private.h. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11fscrypt: remove fscrypt_fname_usr_to_disk()Eric Biggers
fscrypt_fname_usr_to_disk() sounded very generic but was actually only used to encrypt symlinks. Remove it now that all filesystems have been switched over to fscrypt_encrypt_symlink(). Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11ubifs: switch to fscrypt_get_symlink()Eric Biggers
Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11ubifs: switch to fscrypt ->symlink() helper functionsEric Biggers
Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11ubifs: free the encrypted symlink targetEric Biggers
ubifs_symlink() forgot to free the kmalloc()'ed buffer holding the encrypted symlink target, creating a memory leak. Fix it. (UBIFS could actually encrypt directly into ui->data, removing the temporary buffer, but that is left for the patch that switches to use the symlink helper functions.) Fixes: ca7f85be8d6c ("ubifs: Add support for encrypted symlinks") Cc: <stable@vger.kernel.org> # v4.10+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11f2fs: switch to fscrypt_get_symlink()Eric Biggers
Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11f2fs: switch to fscrypt ->symlink() helper functionsEric Biggers
Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11ext4: switch to fscrypt_get_symlink()Eric Biggers
Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11ext4: switch to fscrypt ->symlink() helper functionsEric Biggers
Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11fscrypt: new helper function - fscrypt_get_symlink()Eric Biggers
Filesystems also have duplicate code to support ->get_link() on encrypted symlinks. Factor it out into a new function fscrypt_get_symlink(). It takes in the contents of the encrypted symlink on-disk and provides the target (decrypted or encoded) that should be returned from ->get_link(). Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11fscrypt: new helper functions for ->symlink()Eric Biggers
Currently, filesystems supporting fscrypt need to implement some tricky logic when creating encrypted symlinks, including handling a peculiar on-disk format (struct fscrypt_symlink_data) and correctly calculating the size of the encrypted symlink. Introduce helper functions to make things a bit easier: - fscrypt_prepare_symlink() computes and validates the size the symlink target will require on-disk. - fscrypt_encrypt_symlink() creates the encrypted target if needed. The new helpers actually fix some subtle bugs. First, when checking whether the symlink target was too long, filesystems didn't account for the fact that the NUL padding is meant to be truncated if it would cause the maximum length to be exceeded, as is done for filenames in directories. Consequently users would receive ENAMETOOLONG when creating symlinks close to what is supposed to be the maximum length. For example, with EXT4 with a 4K block size, the maximum symlink target length in an encrypted directory is supposed to be 4093 bytes (in comparison to 4095 in an unencrypted directory), but in FS_POLICY_FLAGS_PAD_32-mode only up to 4064 bytes were accepted. Second, symlink targets of "." and ".." were not being encrypted, even though they should be, as these names are special in *directory entries* but not in symlink targets. Fortunately, we can fix this simply by starting to encrypt them, as old kernels already accept them in encrypted form. Third, the output string length the filesystems were providing when doing the actual encryption was incorrect, as it was forgotten to exclude 'sizeof(struct fscrypt_symlink_data)'. Fortunately though, this bug didn't make a difference. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11fscrypt: trim down fscrypt.h includesEric Biggers
fscrypt.h included way too many other headers, given that it is included by filesystems both with and without encryption support. Trim down the includes list by moving the needed includes into more appropriate places, and removing the unneeded ones. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11fscrypt: move fscrypt_is_dot_dotdot() to fs/crypto/fname.cEric Biggers
Only fs/crypto/fname.c cares about treating the "." and ".." filenames specially with regards to encryption, so move fscrypt_is_dot_dotdot() from fscrypt.h to there. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11fscrypt: move fscrypt_valid_enc_modes() to fscrypt_private.hEric Biggers
The encryption modes are validated by fs/crypto/, not by individual filesystems. Therefore, move fscrypt_valid_enc_modes() from fscrypt.h to fscrypt_private.h. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11fscrypt: move fscrypt_info_cachep declaration to fscrypt_private.hEric Biggers
The fscrypt_info kmem_cache is internal to fscrypt; filesystems don't need to access it. So move its declaration into fscrypt_private.h. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11ext4: create ext4_kset dynamicallyRiccardo Schirone
ksets contain a kobject and they should always be allocated dynamically, because it is unknown to whoever creates them when ksets can be released. Signed-off-by: Riccardo Schirone <sirmy15@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11ext4: create ext4_feat kobject dynamicallyRiccardo Schirone
kobjects should always be allocated dynamically, because it is unknown to whoever creates them when kobjects can be released. Signed-off-by: Riccardo Schirone <sirmy15@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11ext4: release kobject/kset even when init/register failRiccardo Schirone
Even when kobject_init_and_add/kset_register fail, the kobject has been already initialized and the refcount set to 1. Thus it is necessary to release the kobject/kset, to avoid the memory associated with it hanging around forever. Signed-off-by: Riccardo Schirone <sirmy15@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-11ext4: fix incorrect indentation of if statementColin Ian King
The indentation is incorrect and spaces need replacing with a tab on the if statement. Cleans up smatch warning: fs/ext4/namei.c:3220 ext4_link() warn: inconsistent indenting Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2018-01-11ext4: use 'sbi' instead of 'EXT4_SB(sb)'Jun Piao
We could use 'sbi' instead of 'EXT4_SB(sb)' to make code more elegant. Signed-off-by: Jun Piao <piaojun@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2018-01-10ext4: save error to disk in __ext4_grp_locked_error()Zhouyi Zhou
In the function __ext4_grp_locked_error(), __save_error_info() is called to save error info in super block block, but does not sync that information to disk to info the subsequence fsck after reboot. This patch writes the error information to disk. After this patch, I think there is no obvious EXT4 error handle branches which leads to "Remounting filesystem read-only" will leave the disk partition miss the subsequence fsck. Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-01-10jbd2: fix sphinx kernel-doc build warningsTobin C. Harding
Sphinx emits various (26) warnings when building make target 'htmldocs'. Currently struct definitions contain duplicate documentation, some as kernel-docs and some as standard c89 comments. We can reduce duplication while cleaning up the kernel docs. Move all kernel-docs to right above each struct member. Use the set of all existing comments (kernel-doc and c89). Add documentation for missing struct members and function arguments. Signed-off-by: Tobin C. Harding <me@tobin.cc> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-01-10ext4: fix a race in the ext4 shutdown pathHarshad Shirwadkar
This patch fixes a race between the shutdown path and bio completion handling. In the ext4 direct io path with async io, after submitting a bio to the block layer, if journal starting fails, ext4_direct_IO_write() would bail out pretending that the IO failed. The caller would have had no way of knowing whether or not the IO was successfully submitted. So instead, we return -EIOCBQUEUED in this case. Now, the caller knows that the IO was submitted. The bio completion handler takes care of the error. Tested: Ran the shutdown xfstest test 461 in loop for over 2 hours across 4 machines resulting in over 400 runs. Verified that the race didn't occur. Usually the race was seen in about 20-30 iterations. Signed-off-by: Harshad Shirwadkar <harshads@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-01-09mbcache: make sure c_entry_count is not decremented past zeroJiang Biao
Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn> Signed-off-by: Theodore Ts'o <tytso@mit.edu> CC: Eric Biggers <ebiggers@google.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Jan Kara <jack@suse.cz>
2018-01-09ext4: no need flush workqueue before destroying itpiaojun
destroy_workqueue() will do flushing work for us. Signed-off-by: Jun Piao <piaojun@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2018-01-09xfs: clarify units in the failed metadata io messageDarrick J. Wong
If a metadata IO error happens, we report the location of the failed IO request in units of daddrs. However, the printk message misleads people into thinking that the units are fs blocks, so fix the reported units. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-09xfs: harden directory integrity checks some moreDarrick J. Wong
If a malicious filesystem image contains a block+ format directory wherein the directory inode's core.mode is set such that S_ISDIR(core.mode) == 0, and if there are subdirectories of the corrupted directory, an attempt to traverse up the directory tree will crash the kernel in __xfs_dir3_data_check. Running the online scrub's parent checks will tend to do this. The crash occurs because the directory inode's d_ops get set to xfs_dir[23]_nondir_ops (it's not a directory) but the parent pointer scrubber's indiscriminate call to xfs_readdir proceeds past the ASSERT if we have non fatal asserts configured. Fix the null pointer dereference crash in __xfs_dir3_data_check by looking for S_ISDIR or wrong d_ops; and teach the parent scrubber to bail out if it is fed a non-directory "parent". Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>