summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2020-07-21exfat: fix name_hash computation on big endian systemsIlya Ponetayev
On-disk format for name_hash field is LE, so it must be explicitly transformed on BE system for proper result. Fixes: 370e812b3ec1 ("exfat: add nls operations") Cc: stable@vger.kernel.org # v5.7 Signed-off-by: Chen Minqiang <ptpt52@gmail.com> Signed-off-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-07-21exfat: fix wrong size update of stream entry by typoHyeongseok Kim
The stream.size field is updated to the value of create timestamp of the file entry. Fix this to use correct stream entry pointer. Fixes: 29bbb14bfc80 ("exfat: fix incorrect update of stream entry in __exfat_truncate()") Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-07-21exfat: fix wrong hint_stat initialization in exfat_find_dir_entry()Namjae Jeon
We found the wrong hint_stat initialization in exfat_find_dir_entry(). It should be initialized when cluster is EXFAT_EOF_CLUSTER. Fixes: ca06197382bd ("exfat: add directory operations") Cc: stable@vger.kernel.org # v5.7 Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-07-21exfat: fix overflow issue in exfat_cluster_to_sector()Namjae Jeon
An overflow issue can occur while calculating sector in exfat_cluster_to_sector(). It needs to cast clus's type to sector_t before left shifting. Fixes: 1acf1a564b60 ("exfat: add in-memory and on-disk structures and headers") Cc: stable@vger.kernel.org # v5.7 Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-07-20fscrypt: rename FS_KEY_DERIVATION_NONCE_SIZEEric Biggers
The name "FS_KEY_DERIVATION_NONCE_SIZE" is a bit outdated since due to the addition of FSCRYPT_POLICY_FLAG_DIRECT_KEY, the file nonce may now be used as a tweak instead of for key derivation. Also, we're now prefixing the fscrypt constants with "FSCRYPT_" instead of "FS_". Therefore, rename this constant to FSCRYPT_FILE_NONCE_SIZE. Link: https://lore.kernel.org/r/20200708215722.147154-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-07-20fscrypt: add comments that describe the HKDF info stringsEric Biggers
Each HKDF context byte is associated with a specific format of the remaining part of the application-specific info string. Add comments so that it's easier to keep track of what these all are. Link: https://lore.kernel.org/r/20200708215529.146890-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-07-20f2fs: segment.h: delete a duplicated wordRandy Dunlap
Drop the repeated word "the" in a comment. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Chao Yu <chao@kernel.org> Cc: linux-f2fs-devel@lists.sourceforge.net Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-07-20f2fs: compress: fix to avoid memory leak on cc->cpagesChao Yu
Memory allocated for storing compressed pages' poitner should be released after f2fs_write_compressed_pages(), otherwise it will cause memory leak issue. Signed-off-by: Chao Yu <yuchao0@huawei.com> Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-07-20f2fs: use generic names for generic ioctlsEric Biggers
Don't define F2FS_IOC_* aliases to ioctls that already have a generic FS_IOC_* name. These aliases are unnecessary, and they make it unclear which ioctls are f2fs-specific and which are generic. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-07-20zonefs: count pages after truncating the iteratorJohannes Thumshirn
Count pages after possibly truncating the iterator to the maximum zone append size, not before. Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-07-20zonefs: Fix compilation warningDamien Le Moal
Avoid the compilation warning "Variable 'ret' is reassigned a value before the old one has been used." in zonefs_create_zgroup() by setting ret for the error path only if an error happens. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-07-20Merge 5.8-rc6 into driver-core-nextGreg Kroah-Hartman
We need the driver core fixes in here too. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-19proc: allow access in init userns for map_files with CAP_CHECKPOINT_RESTOREAdrian Reber
Opening files in /proc/pid/map_files when the current user is CAP_CHECKPOINT_RESTORE capable in the root namespace is useful for checkpointing and restoring to recover files that are unreachable via the file system such as deleted files, or memfd files. Signed-off-by: Adrian Reber <areber@redhat.com> Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Link: https://lore.kernel.org/r/20200719100418.2112740-5-areber@redhat.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-07-199p: remove unused code in 9pJianyong Wu
These codes have been commented out since 2007 and lay in kernel since then. So, it's better to remove them. Link: http://lkml.kernel.org/r/20200628074337.45895-1-jianyong.wu@arm.com Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2020-07-199p: Fix memory leak in v9fs_mountZheng Bin
v9fs_mount v9fs_session_init v9fs_cache_session_get_cookie v9fs_random_cachetag -->alloc cachetag v9ses->fscache = fscache_acquire_cookie -->maybe NULL sb = sget -->fail, goto clunk clunk_fid: v9fs_session_close if (v9ses->fscache) -->NULL kfree(v9ses->cachetag) Thus memleak happens. Link: http://lkml.kernel.org/r/20200615012153.89538-1-zhengbin13@huawei.com Fixes: 60e78d2c993e ("9p: Add fscache support to 9p") Cc: <stable@vger.kernel.org> # v2.6.32+ Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2020-07-199p: retrieve fid from file when file instance exist.Jianyong Wu
In the current setattr implementation in 9p, fid is always retrieved from dentry no matter file instance exists or not. If so, there may be some info related to opened file instance dropped. So it's better to retrieve fid from file instance when it is passed to setattr. for example: fd=open("tmp", O_RDWR); ftruncate(fd, 10); The file context related with the fd will be lost as fid is always retrieved from dentry, then the backend can't get the info of file context. It is against the original intention of user and may lead to bug. Link: http://lkml.kernel.org/r/20200710101548.10108-1-jianyong.wu@arm.com Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2020-07-18io_uring: always allow drain/link/hardlink/async sqe flagsDaniele Albano
We currently filter these for timeout_remove/async_cancel/files_update, but we only should be filtering for fixed file and buffer select. This also causes a second read of sqe->flags, which isn't needed. Just check req->flags for the relevant bits. This then allows these commands to be used in links, for example, like everything else. Signed-off-by: Daniele Albano <d.albano@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-17io_uring: ensure double poll additions work with both request typesJens Axboe
The double poll additions were centered around doing POLL_ADD on file descriptors that use more than one waitqueue (typically one for read, one for write) when being polled. However, it can also end up being triggered for when we use poll triggered retry. For that case, we cannot safely use req->io, as that could be used by the request type itself. Add a second io_poll_iocb pointer in the structure we allocate for poll based retry, and ensure we use the right one from the two paths. Fixes: 18bceab101ad ("io_uring: allow POLL_ADD with double poll_wait() users") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-17Merge tag 'nfs-for-5.8-3' of git://git.linux-nfs.org/projects/anna/linux-nfs ↵Linus Torvalds
into master Pull NFS client fixes from Anna Schumaker: "A few more NFS client bugfixes for Linux 5.8: NFS: - Fix interrupted slots by using the SEQUENCE operation SUNRPC: - revert d03727b248d0 to fix unkillable IOs xprtrdma: - Fix double-free in rpcrdma_ep_create() - Fix recursion into rpcrdma_xprt_disconnect() - Fix return code from rpcrdma_xprt_connect() - Fix handling of connect errors - Fix incorrect header size calculations" * tag 'nfs-for-5.8-3' of git://git.linux-nfs.org/projects/anna/linux-nfs: SUNRPC reverting d03727b248d0 ("NFSv4 fix CLOSE not waiting for direct IO compeletion") xprtrdma: fix incorrect header size calculations NFS: Fix interrupted slots by sending a solo SEQUENCE operation xprtrdma: Fix handling of connect errors xprtrdma: Fix return code from rpcrdma_xprt_connect() xprtrdma: Fix recursion into rpcrdma_xprt_disconnect() xprtrdma: Fix double-free in rpcrdma_ep_create()
2020-07-17xfs: preserve inode versioning across remountsEric Sandeen
The MS_I_VERSION mount flag is exposed via the VFS, as documented in the mount manpages etc; see the iversion and noiversion mount options in mount(8). As a result, mount -o remount looks for this option in /proc/mounts and will only send the I_VERSION flag back in during remount it it is present. Since it's not there, a remount will /remove/ the I_VERSION flag at the vfs level, and iversion functionality is lost. xfs v5 superblocks intend to always have i_version enabled; it is set as a default at mount time, but is lost during remount for the reasons above. The generic fix would be to expose this documented option in /proc/mounts, but since that was rejected, fix it up again in the xfs remount path instead, so that at least xfs won't suffer from this misbehavior. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-07-17SUNRPC reverting d03727b248d0 ("NFSv4 fix CLOSE not waiting for direct IO ↵Olga Kornievskaia
compeletion") Reverting commit d03727b248d0 "NFSv4 fix CLOSE not waiting for direct IO compeletion". This patch made it so that fput() by calling inode_dio_done() in nfs_file_release() would wait uninterruptably for any outstanding directIO to the file (but that wait on IO should be killable). The problem the patch was also trying to address was REMOVE returning ERR_ACCESS because the file is still opened, is supposed to be resolved by server returning ERR_FILE_OPEN and not ERR_ACCESS. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-07-17Merge tag 'io_uring-5.8-2020-07-17' of git://git.kernel.dk/linux-block into ↵Linus Torvalds
master Pull io_uring fix from Jens Axboe: "Fix for a case where, with automatic buffer selection, we can leak the buffer descriptor for recvmsg" * tag 'io_uring-5.8-2020-07-17' of git://git.kernel.dk/linux-block: io_uring: fix recvmsg memory leak with buffer selection
2020-07-17Merge tag 'fuse-fixes-5.8-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse into master Pull fuse fixes from Miklos Szeredi: - two regressions in this cycle caused by the conversion of writepage list to an rb_tree - two regressions in v5.4 cause by the conversion to the new mount API - saner behavior of fsconfig(2) for the reconfigure case - an ancient issue with FS_IOC_{GET,SET}FLAGS ioctls * tag 'fuse-fixes-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: Fix parameter for FS_IOC_{GET,SET}FLAGS fuse: don't ignore errors from fuse_writepages_fill() fuse: clean up condition for writepage sending fuse: reject options on reconfigure via fsconfig(2) fuse: ignore 'data' argument of mount(..., MS_REMOUNT) fuse: use ->reconfigure() instead of ->remount_fs() fuse: fix warning in tree_insert() and clean up writepage insertion fuse: move rb_erase() before tree_insert()
2020-07-17Merge tag 'ovl-fixes-5.8-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs into master Pull overlayfs fixes from Miklos Szeredi: - fix a regression introduced in v4.20 in handling a regenerated squashfs lower layer - two regression fixes for this cycle, one of which is Oops inducing - miscellaneous issues * tag 'ovl-fixes-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: fix lookup of indexed hardlinks with metacopy ovl: fix unneeded call to ovl_change_flags() ovl: fix mount option checks for nfs_export with no upperdir ovl: force read-only sb on failure to create index dir ovl: fix regression with re-formatted lower squashfs ovl: fix oops in ovl_indexdir_cleanup() with nfs_export=on ovl: relax WARN_ON() when decoding lower directory file handle ovl: remove not used argument in ovl_check_origin ovl: change ovl_copy_up_flags static ovl: inode reference leak in ovl_is_inuse true case.
2020-07-17NFSv4.0 allow nconnect for v4.0Olga Kornievskaia
It looks like this "else" is just a typo. It turns off nconnect for NFSv4.0 even though it works for every other version. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-07-17freezer: Add unsafe versions of freezable_schedule_timeout_interruptible for NFSHe Zhe
commit 0688e64bc600 ("NFS: Allow signal interruption of NFS4ERR_DELAYed operations") introduces nfs4_delay_interruptible which also needs an _unsafe version to avoid the following call trace for the same reason explained in commit 416ad3c9c006 ("freezer: add unsafe versions of freezable helpers for NFS") CPU: 4 PID: 3968 Comm: rm Tainted: G W 5.8.0-rc4 #1 Hardware name: Marvell OcteonTX CN96XX board (DT) Call trace: dump_backtrace+0x0/0x1dc show_stack+0x20/0x30 dump_stack+0xdc/0x150 debug_check_no_locks_held+0x98/0xa0 nfs4_delay_interruptible+0xd8/0x120 nfs4_handle_exception+0x130/0x170 nfs4_proc_rmdir+0x8c/0x220 nfs_rmdir+0xa4/0x360 vfs_rmdir.part.0+0x6c/0x1b0 do_rmdir+0x18c/0x210 __arm64_sys_unlinkat+0x64/0x7c el0_svc_common.constprop.0+0x7c/0x110 do_el0_svc+0x24/0xa0 el0_sync_handler+0x13c/0x1b8 el0_sync+0x158/0x180 Signed-off-by: He Zhe <zhe.he@windriver.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-07-17Merge branch 'xattr-devel'Trond Myklebust
2020-07-16treewide: Remove uninitialized_var() usageKees Cook
Using uninitialized_var() is dangerous as it papers over real bugs[1] (or can in the future), and suppresses unrelated compiler warnings (e.g. "unused variable"). If the compiler thinks it is uninitialized, either simply initialize the variable or make compiler changes. In preparation for removing[2] the[3] macro[4], remove all remaining needless uses with the following script: git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \ xargs perl -pi -e \ 's/\buninitialized_var\(([^\)]+)\)/\1/g; s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;' drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid pathological white-space. No outstanding warnings were found building allmodconfig with GCC 9.3.0 for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64, alpha, and m68k. [1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/ [2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/ [3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/ [4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/ Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5 Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-16f2fs: Eliminate usage of uninitialized_var() macroJason Yan
This is an effort to eliminate the uninitialized_var() macro[1]. The use of this macro is the wrong solution because it forces off ANY analysis by the compiler for a given variable. It even masks "unused variable" warnings. Quoted from Linus[2]: "It's a horrible thing to use, in that it adds extra cruft to the source code, and then shuts up a compiler warning (even the _reliable_ warnings from gcc)." Fix it by remove this variable since it is not needed at all. [1] https://github.com/KSPP/linux/issues/81 [2] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/ Suggested-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Link: https://lore.kernel.org/r/20200615085132.166470-1-yanaijie@huawei.com Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-16block: integrate bd_start_claiming into __blkdev_getChristoph Hellwig
bd_start_claiming duplicates a lot of the work done in __blkdev_get. Integrate the two functions to avoid the duplicate work, and to do the right thing for the md -ERESTARTSYS corner case. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-16block: use bd_prepare_to_claim directly in the loop driverChristoph Hellwig
The arcane magic in bd_start_claiming is only needed to be able to claim a block_device that hasn't been fully set up. Switch the loop driver that claims from the ioctl path with a fully set up struct block_device to just use the much simpler bd_prepare_to_claim directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-16block: refactor bd_start_claimingChristoph Hellwig
Move the locking and assignment of bd_claiming from bd_start_claiming to bd_prepare_to_claim. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-16block: simplify the restart case in __blkdev_getChristoph Hellwig
Insted of duplicating all the cleanup logic jump to the code that cleans up anyway, and restart after that. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-16fs: add a vfs_fchmod helperChristoph Hellwig
Add a helper for struct file based chmode operations. To be used by the initramfs code soon. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-16fs: add a vfs_fchown helperChristoph Hellwig
Add a helper for struct file based chown operations. To be used by the initramfs code soon. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-16ovl: fix lookup of indexed hardlinks with metacopyAmir Goldstein
We recently moved setting inode flag OVL_UPPERDATA to ovl_lookup(). When looking up an overlay dentry, upperdentry may be found by index and not by name. In that case, we fail to read the metacopy xattr and falsly set the OVL_UPPERDATA on the overlay inode. This caused a regression in xfstest overlay/033 when run with OVERLAY_MOUNT_OPTIONS="-o metacopy=on". Fixes: 28166ab3c875 ("ovl: initialize OVL_UPPERDATA in ovl_lookup()") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-07-16ovl: fix unneeded call to ovl_change_flags()Amir Goldstein
The check if user has changed the overlay file was wrong, causing unneeded call to ovl_change_flags() including taking f_lock on every file access. Fixes: d989903058a8 ("ovl: do not generate duplicate fsnotify events for "fake" path") Cc: <stable@vger.kernel.org> # v4.19+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-07-15afs: Fix interruption of operationsDavid Howells
The afs filesystem driver allows unstarted operations to be cancelled by signal, but most of these can easily be restarted (mkdir for example). The primary culprits for reproducing this are those applications that use SIGALRM to display a progress counter. File lock-extension operation is marked uninterruptible as we have a limited time in which to do it, and the release op is marked uninterruptible also as if we fail to unlock a file, we'll have to wait 20 mins before anyone can lock it again. The store operation logs a warning if it gets interruption, e.g.: kAFS: Unexpected error from FS.StoreData -4 because it's run from the background - but it can also be run from fdatasync()-type things. However, store options aren't marked interruptible at the moment. Fix this in the following ways: (1) Mark store operations as uninterruptible. It might make sense to relax this for certain situations, but I'm not sure how to make sure that background store ops aren't affected by signals to foreground processes that happen to trigger them. (2) In afs_get_io_locks(), where we're getting the serialisation lock for talking to the fileserver, return ERESTARTSYS rather than EINTR because a lot of the operations (e.g. mkdir) are restartable if we haven't yet started sending the op to the server. Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-16ovl: fix mount option checks for nfs_export with no upperdirAmir Goldstein
Without upperdir mount option, there is no index dir and the dependency checks nfs_export => index for mount options parsing are incorrect. Allow the combination nfs_export=on,index=off with no upperdir and move the check for dependency redirect_dir=nofollow for non-upper mount case to mount options parsing. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-07-16ovl: force read-only sb on failure to create index dirAmir Goldstein
With index feature enabled, on failure to create index dir, overlay is being mounted read-only. However, we do not forbid user to remount overlay read-write. Fix that by setting ofs->workdir to NULL, which prevents remount read-write. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-07-16ovl: fix regression with re-formatted lower squashfsAmir Goldstein
Commit 9df085f3c9a2 ("ovl: relax requirement for non null uuid of lower fs") relaxed the requirement for non null uuid with single lower layer to allow enabling index and nfs_export features with single lower squashfs. Fabian reported a regression in a setup when overlay re-uses an existing upper layer and re-formats the lower squashfs image. Because squashfs has no uuid, the origin xattr in upper layer are decoded from the new lower layer where they may resolve to a wrong origin file and user may get an ESTALE or EIO error on lookup. To avoid the reported regression while still allowing the new features with single lower squashfs, do not allow decoding origin with lower null uuid unless user opted-in to one of the new features that require following the lower inode of non-dir upper (index, xino, metacopy). Reported-by: Fabian <godi.beat@gmx.net> Link: https://lore.kernel.org/linux-unionfs/32532923.JtPX5UtSzP@fgdesktop/ Fixes: 9df085f3c9a2 ("ovl: relax requirement for non null uuid of lower fs") Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-07-16ovl: fix oops in ovl_indexdir_cleanup() with nfs_export=onAmir Goldstein
Mounting with nfs_export=on, xfstests overlay/031 triggers a kernel panic since v5.8-rc1 overlayfs updates. overlayfs: orphan index entry (index/00fb1..., ftype=4000, nlink=2) BUG: kernel NULL pointer dereference, address: 0000000000000030 RIP: 0010:ovl_cleanup_and_whiteout+0x28/0x220 [overlay] Bisect point at commit c21c839b8448 ("ovl: whiteout inode sharing") Minimal reproducer: -------------------------------------------------- rm -rf l u w m mkdir -p l u w m mkdir -p l/testdir touch l/testdir/testfile mount -t overlay -o lowerdir=l,upperdir=u,workdir=w,nfs_export=on overlay m echo 1 > m/testdir/testfile umount m rm -rf u/testdir mount -t overlay -o lowerdir=l,upperdir=u,workdir=w,nfs_export=on overlay m umount m -------------------------------------------------- When mount with nfs_export=on, and fail to verify an orphan index, we're cleaning this index from indexdir by calling ovl_cleanup_and_whiteout(). This dereferences ofs->workdir, that was earlier set to NULL. The design was that ovl->workdir will point at ovl->indexdir, but we are assigning ofs->indexdir to ofs->workdir only after ovl_indexdir_cleanup(). There is no reason not to do it sooner, because once we get success from ofs->indexdir = ovl_workdir_create(... there is no turning back. Reported-and-tested-by: Murphy Zhou <jencce.kernel@gmail.com> Fixes: c21c839b8448 ("ovl: whiteout inode sharing") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-07-16ovl: relax WARN_ON() when decoding lower directory file handleAmir Goldstein
Decoding a lower directory file handle to overlay path with cold inode/dentry cache may go as follows: 1. Decode real lower file handle to lower dir path 2. Check if lower dir is indexed (was copied up) 3. If indexed, get the upper dir path from index 4. Lookup upper dir path in overlay 5. If overlay path found, verify that overlay lower is the lower dir from step 1 On failure to verify step 5 above, user will get an ESTALE error and a WARN_ON will be printed. A mismatch in step 5 could be a result of lower directory that was renamed while overlay was offline, after that lower directory has been copied up and indexed. This is a scripted reproducer based on xfstest overlay/052: # Create lower subdir create_dirs create_test_files $lower/lowertestdir/subdir mount_dirs # Copy up lower dir and encode lower subdir file handle touch $SCRATCH_MNT/lowertestdir test_file_handles $SCRATCH_MNT/lowertestdir/subdir -p -o $tmp.fhandle # Rename lower dir offline unmount_dirs mv $lower/lowertestdir $lower/lowertestdir.new/ mount_dirs # Attempt to decode lower subdir file handle test_file_handles $SCRATCH_MNT -p -i $tmp.fhandle Since this WARN_ON() can be triggered by user we need to relax it. Fixes: 4b91c30a5a19 ("ovl: lookup connected ancestor of dir in inode cache") Cc: <stable@vger.kernel.org> # v4.16+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-07-16ovl: remove not used argument in ovl_check_originyoungjun
ovl_check_origin outparam 'ctrp' argument not used by caller. So remove this argument. Signed-off-by: youngjun <her0gyugyu@gmail.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-07-16ovl: change ovl_copy_up_flags staticyoungjun
"ovl_copy_up_flags" is used in copy_up.c. so, change it static. Signed-off-by: youngjun <her0gyugyu@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-07-16ovl: inode reference leak in ovl_is_inuse true case.youngjun
When "ovl_is_inuse" true case, trap inode reference not put. plus adding the comment explaining sequence of ovl_is_inuse after ovl_setup_trap. Fixes: 0be0bfd2de9d ("ovl: fix regression caused by overlapping layers detection") Cc: <stable@vger.kernel.org> # v4.19+ Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: youngjun <her0gyugyu@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-07-15io_uring: fix recvmsg memory leak with buffer selectionPavel Begunkov
io_recvmsg() doesn't free memory allocated for struct io_buffer. This can causes a leak when used with automatic buffer selection. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-15fanotify: break up fanotify_alloc_event()Amir Goldstein
Break up fanotify_alloc_event() into helpers by event struct type. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-15fanotify: create overflow event typeAmir Goldstein
The special overflow event is allocated as struct fanotify_path_event, but with a null path. Use a special event type to identify the overflow event, so the helper fanotify_has_event_path() will always indicate a non null path. Allocating the overflow event doesn't need any of the fancy stuff in fanotify_alloc_event(), so create a simplified helper for allocating the overflow event. There is also no need to store and report the pid with an overflow event. Link: https://lore.kernel.org/r/20200708111156.24659-7-amir73il@gmail.com Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-15inotify: do not use objectid when comparing eventsAmir Goldstein
inotify's event->wd is the object identifier. Compare that instead of the common fsnotidy event objectid, so we can get rid of the objectid field later. Link: https://lore.kernel.org/r/20200708111156.24659-6-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>