summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2019-06-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The new route handling in ip_mc_finish_output() from 'net' overlapped with the new support for returning congestion notifications from BPF programs. In order to handle this I had to take the dev_loopback_xmit() calls out of the switch statement. The aquantia driver conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28Merge tag 'for-linus-20190627' of ↵Linus Torvalds
gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux Pull pidfd fixes from Christian Brauner: "Userspace tools and libraries such as strace or glibc need a cheap and reliable way to tell whether CLONE_PIDFD is supported. The easiest way is to pass an invalid fd value in the return argument, perform the syscall and verify the value in the return argument has been changed to a valid fd. However, if CLONE_PIDFD is specified we currently check if pidfd == 0 and return EINVAL if not. The check for pidfd == 0 was originally added to enable us to abuse the return argument for passing additional flags along with CLONE_PIDFD in the future. However, extending legacy clone this way would be a terrible idea and with clone3 on the horizon and the ability to reuse CLONE_DETACHED with CLONE_PIDFD there's no real need for this clutch. So remove the pidfd == 0 check and help userspace out. Also, accordig to Al, anon_inode_getfd() should only be used past the point of no failure and ksys_close() should not be used at all since it is far too easy to get wrong. Al's motto being "basically, once it's in descriptor table, it's out of your control". So Al's patch switches back to what we already had in v1 of the original patchset and uses a anon_inode_getfile() + put_user() + fd_install() sequence in the success path and a fput() + put_unused_fd() in the failure path. The other two changes should be trivial" * tag 'for-linus-20190627' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux: proc: remove useless d_is_dir() check copy_process(): don't use ksys_close() on cleanups samples: make pidfd-metadata fail gracefully on older kernels fork: don't check parent_tidptr with CLONE_PIDFD
2019-06-28Merge tag 'afs-fixes-20190620' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes from David Howells: "The in-kernel AFS client has been undergoing testing on opendev.org on one of their mirror machines. They are using AFS to hold data that is then served via apache, and Ian Wienand had reported seeing oopses, spontaneous machine reboots and updates to volumes going missing. This patch series appears to have fixed the problem, very probably due to patch (2), but it's not 100% certain. (1) Fix the printing of the "vnode modified" warning to exclude checks on files for which we don't have a callback promise from the server (and so don't expect the server to tell us when it changes). Without this, for every file or directory for which we still have an in-core inode that gets changed on the server, we may get a message logged when we next look at it. This can happen in bulk if, for instance, someone does "vos release" to update a R/O volume from a R/W volume and a whole set of files are all changed together. We only really want to log a message if the file changed and the server didn't tell us about it or we failed to track the state internally. (2) Fix accidental corruption of either afs_vlserver struct objects or the the following memory locations (which could hold anything). The issue is caused by a union that points to two different structs in struct afs_call (to save space in the struct). The call cleanup code assumes that it can simply call the cleanup for one of those structs if not NULL - when it might be actually pointing to the other struct. This means that every Volume Location RPC op is going to corrupt something. (3) Fix an uninitialised spinlock. This isn't too bad, it just causes a one-off warning if lockdep is enabled when "vos release" is called, but the spinlock still behaves correctly. (4) Fix the setting of i_block in the inode. This causes du, for example, to produce incorrect results, but otherwise should not be dangerous to the kernel" * tag 'afs-fixes-20190620' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix setting of i_blocks afs: Fix uninitialised spinlock afs_volume::cb_break_lock afs: Fix vlserver record corruption afs: Fix over zealous "vnode modified" warnings
2019-06-27proc: remove useless d_is_dir() checkChristian Brauner
Remove the d_is_dir() check from tgid_pidfd_to_pid(). It is pointless since you should never get &proc_tgid_base_operations for f_op on a non-directory. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Christian Brauner <christian@brauner.io>
2019-06-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Minor SPDX change conflict. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-21Merge tag 'nfs-for-5.2-3' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull more NFS client fixes from Anna Schumaker: "These are mostly refcounting issues that people have found recently. The revert fixes a suspend recovery performance issue. - SUNRPC: Fix a credential refcount leak - Revert "SUNRPC: Declare RPC timers as TIMER_DEFERRABLE" - SUNRPC: Fix xps refcount imbalance on the error path - NFS4: Only set creation opendata if O_CREAT" * tag 'nfs-for-5.2-3' of git://git.linux-nfs.org/projects/anna/linux-nfs: SUNRPC: Fix a credential refcount leak Revert "SUNRPC: Declare RPC timers as TIMER_DEFERRABLE" net :sunrpc :clnt :Fix xps refcount imbalance on the error path NFS4: Only set creation opendata if O_CREAT
2019-06-21NFS4: Only set creation opendata if O_CREATBenjamin Coddington
We can end up in nfs4_opendata_alloc during task exit, in which case current->fs has already been cleaned up. This leads to a crash in current_umask(). Fix this by only setting creation opendata if we are actually doing an open with O_CREAT. We can drop the check for NULL nfs4_open_createattrs, since O_CREAT will never be set for the recovery path. Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-06-21Merge tag 'spdx-5.2-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull still more SPDX updates from Greg KH: "Another round of SPDX updates for 5.2-rc6 Here is what I am guessing is going to be the last "big" SPDX update for 5.2. It contains all of the remaining GPLv2 and GPLv2+ updates that were "easy" to determine by pattern matching. The ones after this are going to be a bit more difficult and the people on the spdx list will be discussing them on a case-by-case basis now. Another 5000+ files are fixed up, so our overall totals are: Files checked: 64545 Files with SPDX: 45529 Compared to the 5.1 kernel which was: Files checked: 63848 Files with SPDX: 22576 This is a huge improvement. Also, we deleted another 20000 lines of boilerplate license crud, always nice to see in a diffstat" * tag 'spdx-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (65 commits) treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 507 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 506 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 505 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 504 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 503 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 502 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 501 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 498 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 497 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 496 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 495 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 491 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 490 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 489 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 488 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 487 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 486 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 485 ...
2019-06-21Merge tag '5.2-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs fixes from Steve French: "Four small SMB3 fixes, all for stable" * tag '5.2-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix GlobalMid_Lock bug in cifs_reconnect SMB3: retry on STATUS_INSUFFICIENT_RESOURCES instead of failing write cifs: add spinlock for the openFileList to cifsInodeInfo cifs: fix panic in smb2_reconnect
2019-06-20Merge tag 'ovl-fixes-5.2-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs fixes from Miklos Szeredi: "Fix two regressions in this cycle, and a couple of older bugs" * tag 'ovl-fixes-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: make i_ino consistent with st_ino in more cases ovl: fix typo in MODULE_PARM_DESC ovl: fix bogus -Wmaybe-unitialized warning ovl: don't fail with disconnected lower NFS ovl: fix wrong flags check in FS_IOC_FS[SG]ETXATTR ioctls
2019-06-20Merge tag 'fuse-fixes-5.2-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fix from Miklos Szeredi: "Just a single revert, fixing a regression in -rc1" * tag 'fuse-fixes-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: Revert "fuse: require /dev/fuse reads to have enough buffer capacity"
2019-06-20Merge tag 'for_v5.2-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull two misc vfs fixes from Jan Kara: "One small quota fix fixing spurious EDQUOT errors and one fanotify fix fixing a bug in the new fanotify FID reporting code" * tag 'for_v5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fanotify: update connector fsid cache on add mark quota: fix a problem about transfer quota
2019-06-20afs: Fix setting of i_blocksDavid Howells
The setting of i_blocks, which is calculated from i_size, has got accidentally misordered relative to the setting of i_size when initially setting up an inode. Further, i_blocks isn't updated by afs_apply_status() when the size is updated. To fix this, break the i_size/i_blocks setting out into a helper function and call it from both places. Fixes: a58823ac4589 ("afs: Fix application of status and callback to be under same lock") Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-20Merge tag 'for-linus-20190620' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Three fixes that should go into this series. One is a set of two patches from Christoph, fixing a page leak on same page merges. Boiled down version of a bigger fix, but this one is more appropriate for this late in the cycle (and easier to backport to stable). The last patch is for a divide error in MD, from Mariusz (via Song)" * tag 'for-linus-20190620' of git://git.kernel.dk/linux-block: md: fix for divide error in status_resync block: fix page leak when merging to same page block: return from __bio_try_merge_page if merging occured in the same page
2019-06-20afs: Fix uninitialised spinlock afs_volume::cb_break_lockDavid Howells
Fix the cb_break_lock spinlock in afs_volume struct by initialising it when the volume record is allocated. Also rename the lock to cb_v_break_lock to distinguish it from the lock of the same name in the afs_server struct. Without this, the following trace may be observed when a volume-break callback is received: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 2 PID: 50 Comm: kworker/2:1 Not tainted 5.2.0-rc1-fscache+ #3045 Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014 Workqueue: afs SRXAFSCB_CallBack Call Trace: dump_stack+0x67/0x8e register_lock_class+0x23b/0x421 ? check_usage_forwards+0x13c/0x13c __lock_acquire+0x89/0xf73 lock_acquire+0x13b/0x166 ? afs_break_callbacks+0x1b2/0x3dd _raw_write_lock+0x2c/0x36 ? afs_break_callbacks+0x1b2/0x3dd afs_break_callbacks+0x1b2/0x3dd ? trace_event_raw_event_afs_server+0x61/0xac SRXAFSCB_CallBack+0x11f/0x16c process_one_work+0x2c5/0x4ee ? worker_thread+0x234/0x2ac worker_thread+0x1d8/0x2ac ? cancel_delayed_work_sync+0xf/0xf kthread+0x11f/0x127 ? kthread_park+0x76/0x76 ret_from_fork+0x24/0x30 Fixes: 68251f0a6818 ("afs: Fix whole-volume callback handling") Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-20afs: Fix vlserver record corruptionDavid Howells
Because I made the afs_call struct share pointers to an afs_server object and an afs_vlserver object to save space, afs_put_call() calls afs_put_server() on afs_vlserver object (which is only meant for the afs_server object) because it sees that call->server isn't NULL. This means that the afs_vlserver object gets unpredictably and randomly modified, depending on what config options are set (such as lockdep). Fix this by getting rid of the union and having two non-overlapping pointers in the afs_call struct. Fixes: ffba718e9354 ("afs: Get rid of afs_call::reply[]") Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-20afs: Fix over zealous "vnode modified" warningsDavid Howells
Occasionally, warnings like this: vnode modified 2af7 on {10000b:1} [exp 2af2] YFS.FetchStatus(vnode) are emitted into the kernel log. This indicates that when we were applying the updated vnode (file) status retrieved from the server to an inode we saw that the data version number wasn't what we were expecting (in this case it's 0x2af7 rather than 0x2af2). We've usually received a callback from the server prior to this point - or the callback promise has lapsed - so the warning is merely informative and the state is to be expected. Fix this by only emitting the warning if the we still think that we have a valid callback promise and haven't received a callback. Also change the format slightly so so that the new data version doesn't look like part of the text, the like is prefixed with "kAFS: " and the message is ranked as a warning. Fixes: 31143d5d515e ("AFS: implement basic file write support") Reported-by: Ian Wienand <iwienand@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500Thomas Gleixner
Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499Thomas Gleixner
Based on 1 normalized pattern(s): this work is licensed under the terms of the gnu gpl version 2 see the copying file in the top level directory extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 35 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.797835076@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 231Thomas Gleixner
Based on 1 normalized pattern(s): this library is free software you can redistribute it and or modify it under the terms of the gnu general public license v2 as published by the free software foundation this library is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu lesser general public license for more details you should have received a copy of the gnu lesser general public license along with this library if not write to the free software foundation inc 59 temple place suite 330 boston ma 02111 1307 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 2 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190602204653.539286961@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19fanotify: update connector fsid cache on add markAmir Goldstein
When implementing connector fsid cache, we only initialized the cache when the first mark added to object was added by FAN_REPORT_FID group. We forgot to update conn->fsid when the second mark is added by FAN_REPORT_FID group to an already attached connector without fsid cache. Reported-and-tested-by: syzbot+c277e8e2f46414645508@syzkaller.appspotmail.com Fixes: 77115225acc6 ("fanotify: cache fsid in fsnotify_mark_connector") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-19quota: fix a problem about transfer quotayangerkun
Run below script as root, dquot_add_space will return -EDQUOT since __dquot_transfer call dquot_add_space with flags=0, and dquot_add_space think it's a preallocation. Fix it by set flags as DQUOT_SPACE_WARN. mkfs.ext4 -O quota,project /dev/vdb mount -o prjquota /dev/vdb /mnt setquota -P 23 1 1 0 0 /dev/vdb dd if=/dev/zero of=/mnt/test-file bs=4K count=1 chattr -p 23 test-file Fixes: 7b9ca4c61bc2 ("quota: Reduce contention on dq_data_lock") Signed-off-by: yangerkun <yangerkun@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-19ovl: make i_ino consistent with st_ino in more casesAmir Goldstein
Relax the condition that overlayfs supports nfs export, to require that i_ino is consistent with st_ino/d_ino. It is enough to require that st_ino and d_ino are consistent. This fixes the failure of xfstest generic/504, due to mismatch of st_ino to inode number in the output of /proc/locks. Fixes: 12574a9f4c9c ("ovl: consistent i_ino for non-samefs with xino") Cc: <stable@vger.kernel.org> # v4.19 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-06-18Merge tag 'for-5.2-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - regression where properties stored as xattrs are not properly persisted - a small readahead fix (the fstests testcase for that fix hangs on unpatched kernel, so we'd like get it merged to ease future testing) - fix a race during block group creation and deletion * tag 'for-5.2-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Btrfs: fix failure to persist compression property xattr deletion on fsync btrfs: start readahead also in seed devices Btrfs: fix race between block group removal and block group allocation
2019-06-18ovl: fix typo in MODULE_PARM_DESCNicolas Schier
Change first argument to MODULE_PARM_DESC() calls, that each of them matched the actual module parameter name. The matching results in changing (the 'parm' section from) the output of `modinfo overlay` from: parm: ovl_check_copy_up:Obsolete; does nothing parm: redirect_max:ushort parm: ovl_redirect_max:Maximum length of absolute redirect xattr value parm: redirect_dir:bool parm: ovl_redirect_dir_def:Default to on or off for the redirect_dir feature parm: redirect_always_follow:bool parm: ovl_redirect_always_follow:Follow redirects even if redirect_dir feature is turned off parm: index:bool parm: ovl_index_def:Default to on or off for the inodes index feature parm: nfs_export:bool parm: ovl_nfs_export_def:Default to on or off for the NFS export feature parm: xino_auto:bool parm: ovl_xino_auto_def:Auto enable xino feature parm: metacopy:bool parm: ovl_metacopy_def:Default to on or off for the metadata only copy up feature into: parm: check_copy_up:Obsolete; does nothing parm: redirect_max:Maximum length of absolute redirect xattr value (ushort) parm: redirect_dir:Default to on or off for the redirect_dir feature (bool) parm: redirect_always_follow:Follow redirects even if redirect_dir feature is turned off (bool) parm: index:Default to on or off for the inodes index feature (bool) parm: nfs_export:Default to on or off for the NFS export feature (bool) parm: xino_auto:Auto enable xino feature (bool) parm: metacopy:Default to on or off for the metadata only copy up feature (bool) Signed-off-by: Nicolas Schier <n.schier@avm.de> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-06-18ovl: fix bogus -Wmaybe-unitialized warningArnd Bergmann
gcc gets a bit confused by the logic in ovl_setup_trap() and can't figure out whether the local 'trap' variable in the caller was initialized or not: fs/overlayfs/super.c: In function 'ovl_fill_super': fs/overlayfs/super.c:1333:4: error: 'trap' may be used uninitialized in this function [-Werror=maybe-uninitialized] iput(trap); ^~~~~~~~~~ fs/overlayfs/super.c:1312:17: note: 'trap' was declared here Reword slightly to make it easier for the compiler to understand. Fixes: 146d62e5a586 ("ovl: detect overlapping layers") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-06-18ovl: don't fail with disconnected lower NFSMiklos Szeredi
NFS mounts can be disconnected from fs root. Don't fail the overlapping layer check because of this. The check is not authoritative anyway, since topology can change during or after the check. Reported-by: Antti Antinoja <antti@fennosys.fi> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 146d62e5a586 ("ovl: detect overlapping layers")
2019-06-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Honestly all the conflicts were simple overlapping changes, nothing really interesting to report. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-17fs/namespace: fix unprivileged mount propagationChristian Brauner
When propagating mounts across mount namespaces owned by different user namespaces it is not possible anymore to move or umount the mount in the less privileged mount namespace. Here is a reproducer: sudo mount -t tmpfs tmpfs /mnt sudo --make-rshared /mnt # create unprivileged user + mount namespace and preserve propagation unshare -U -m --map-root --propagation=unchanged # now change back to the original mount namespace in another terminal: sudo mkdir /mnt/aaa sudo mount -t tmpfs tmpfs /mnt/aaa # now in the unprivileged user + mount namespace mount --move /mnt/aaa /opt Unfortunately, this is a pretty big deal for userspace since this is e.g. used to inject mounts into running unprivileged containers. So this regression really needs to go away rather quickly. The problem is that a recent change falsely locked the root of the newly added mounts by setting MNT_LOCKED. Fix this by only locking the mounts on copy_mnt_ns() and not when adding a new mount. Fixes: 3bd045cc9c4b ("separate copying and locking mount tree on cross-userns copies") Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Tested-by: Christian Brauner <christian@brauner.io> Acked-by: Christian Brauner <christian@brauner.io> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Christian Brauner <christian@brauner.io> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-17vfs: fsmount: add missing mntget()Eric Biggers
sys_fsmount() needs to take a reference to the new mount when adding it to the anonymous mount namespace. Otherwise the filesystem can be unmounted while it's still in use, as found by syzkaller. Reported-by: Mark Rutland <mark.rutland@arm.com> Reported-by: syzbot+99de05d099a170867f22@syzkaller.appspotmail.com Reported-by: syzbot+7008b8b8ba7df475fdc8@syzkaller.appspotmail.com Fixes: 93766fbd2696 ("vfs: syscall: Add fsmount() to create a mount for a superblock") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-17cifs: fix GlobalMid_Lock bug in cifs_reconnectRonnie Sahlberg
We can not hold the GlobalMid_Lock spinlock during the dfs processing in cifs_reconnect since it invokes things that may sleep and thus trigger : BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:23 Thus we need to drop the spinlock during this code block. RHBZ: 1716743 Cc: stable@vger.kernel.org Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-06-17SMB3: retry on STATUS_INSUFFICIENT_RESOURCES instead of failing writeSteve French
Some servers such as Windows 10 will return STATUS_INSUFFICIENT_RESOURCES as the number of simultaneous SMB3 requests grows (even though the client has sufficient credits). Return EAGAIN on STATUS_INSUFFICIENT_RESOURCES so that we can retry writes which fail with this status code. This (for example) fixes large file copies to Windows 10 on fast networks. Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-06-17block: return from __bio_try_merge_page if merging occured in the same pageChristoph Hellwig
We currently have an input same_page parameter to __bio_try_merge_page to prohibit merging in the same page. The rationale for that is that some callers need to account for every page added to a bio. Instead of letting these callers call twice into the merge code to account for the new vs existing page cases, just turn the paramter into an output one that returns if a merge in the same page occured and let them act accordingly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-17Btrfs: fix failure to persist compression property xattr deletion on fsyncFilipe Manana
After the recent series of cleanups in the properties and xattrs modules that landed in the 5.2 merge window, we ended up with a regression where after deleting the compression xattr property through the setflags ioctl, we don't set the BTRFS_INODE_COPY_EVERYTHING flag in the inode anymore. As a consequence, if the inode was fsync'ed when it had the compression property set, after deleting the compression property through the setflags ioctl and fsync'ing again the inode, the log will still contain the compression xattr, because the inode did not had that bit set, which made the fsync not delete all xattrs from the log and copy all xattrs from the subvolume tree to the log tree. This regression happens due to the fact that that series of cleanups made btrfs_set_prop() call the old function do_setxattr() (which is now named btrfs_setxattr()), and not the old version of btrfs_setxattr(), which is now called btrfs_setxattr_trans(). Fix this by setting the BTRFS_INODE_COPY_EVERYTHING bit in the current btrfs_setxattr() function and remove it from everywhere else, including its setup at btrfs_ioctl_setflags(). This is cleaner, avoids similar regressions in the future, and centralizes the setup of the bit. After all, the need to setup this bit should only be in the xattrs module, since it is an implementation of xattrs. Fixes: 04e6863b19c722 ("btrfs: split btrfs_setxattr calls regarding transaction") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-14Merge tag 'gfs2-v5.2.fixes2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fix from Andreas Gruenbacher: "Fix rounding error in gfs2_iomap_page_prepare" * tag 'gfs2-v5.2.fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix rounding error in gfs2_iomap_page_prepare
2019-06-14Merge tag 'for-linus-20190614' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - Remove references to old schedulers for the scheduler switching and blkio controller documentation (Andreas) - Kill duplicate check for report zone for null_blk (Chaitanya) - Two bcache fixes (Coly) - Ensure that mq-deadline is selected if zoned block device is enabled, as we need that to support them (Damien) - Fix io_uring memory leak (Eric) - ps3vram fallout from LBDAF removal (Geert) - Redundant blk-mq debugfs debugfs_create return check cleanup (Greg) - Extend NOPLM quirk for ST1000LM024 drives (Hans) - Remove error path warning that can now trigger after the queue removal/addition fixes (Ming) * tag 'for-linus-20190614' of git://git.kernel.dk/linux-block: block/ps3vram: Use %llu to format sector_t after LBDAF removal libata: Extend quirks for the ST1000LM024 drives with NOLPM quirk bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached bcache: fix stack corruption by PRECEDING_KEY() blk-mq: remove WARN_ON(!q->elevator) from blk_mq_sched_free_requests blkio-controller.txt: Remove references to CFQ block/switching-sched.txt: Update to blk-mq schedulers null_blk: remove duplicate check for report zone blk-mq: no need to check return value of debugfs_create functions io_uring: fix memory leak of UNIX domain socket inode block: force select mq-deadline for zoned block devices
2019-06-14gfs2: Fix rounding error in gfs2_iomap_page_prepareAndreas Gruenbacher
The pos and len arguments to the iomap page_prepare callback are not block aligned, so we need to take that into account when computing the number of blocks. Fixes: d0a22a4b03b8 ("gfs2: Fix iomap write page reclaim deadlock") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-14btrfs: start readahead also in seed devicesNaohiro Aota
Currently, btrfs does not consult seed devices to start readahead. As a result, if readahead zone is added to the seed devices, btrfs_reada_wait() indefinitely wait for the reada_ctl to finish. You can reproduce the hung by modifying btrfs/163 to have larger initial file size (e.g. xfs_io pwrite 4M instead of current 256K). Fixes: 7414a03fbf9e ("btrfs: initial readahead code and prototypes") Cc: stable@vger.kernel.org # 3.2+: ce7791ffee1e: Btrfs: fix race between readahead and device replace/removal Cc: stable@vger.kernel.org # 3.2+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-13fs/ocfs2: fix race in ocfs2_dentry_attach_lock()Wengang Wang
ocfs2_dentry_attach_lock() can be executed in parallel threads against the same dentry. Make that race safe. The race is like this: thread A thread B (A1) enter ocfs2_dentry_attach_lock, seeing dentry->d_fsdata is NULL, and no alias found by ocfs2_find_local_alias, so kmalloc a new ocfs2_dentry_lock structure to local variable "dl", dl1 ..... (B1) enter ocfs2_dentry_attach_lock, seeing dentry->d_fsdata is NULL, and no alias found by ocfs2_find_local_alias so kmalloc a new ocfs2_dentry_lock structure to local variable "dl", dl2. ...... (A2) set dentry->d_fsdata with dl1, call ocfs2_dentry_lock() and increase dl1->dl_lockres.l_ro_holders to 1 on success. ...... (B2) set dentry->d_fsdata with dl2 call ocfs2_dentry_lock() and increase dl2->dl_lockres.l_ro_holders to 1 on success. ...... (A3) call ocfs2_dentry_unlock() and decrease dl2->dl_lockres.l_ro_holders to 0 on success. .... (B3) call ocfs2_dentry_unlock(), decreasing dl2->dl_lockres.l_ro_holders, but see it's zero now, panic Link: http://lkml.kernel.org/r/20190529174636.22364-1-wen.gang.wang@oracle.com Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Reported-by: Daniel Sobe <daniel.sobe@nxp.com> Tested-by: Daniel Sobe <daniel.sobe@nxp.com> Reviewed-by: Changwei Ge <gechangwei@live.cn> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13cifs: add spinlock for the openFileList to cifsInodeInfoRonnie Sahlberg
We can not depend on the tcon->open_file_lock here since in multiuser mode we may have the same file/inode open via multiple different tcons. The current code is race prone and will crash if one user deletes a file at the same time a different user opens/create the file. To avoid this we need to have a spinlock attached to the inode and not the tcon. RHBZ: 1580165 CC: Stable <stable@vger.kernel.org> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-06-13cifs: fix panic in smb2_reconnectRonnie Sahlberg
RH Bugzilla: 1702264 We need to protect so that the call to smb2_reconnect() in smb2_reconnect_server() does not end up freeing the session because it can lead to a use after free and crash. Reviewed-by: Aurelien Aptel <aaptel@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-06-13io_uring: fix memory leak of UNIX domain socket inodeEric Biggers
Opening and closing an io_uring instance leaks a UNIX domain socket inode. This is because the ->file of the io_uring instance's internal UNIX domain socket is set to point to the io_uring file, but then sock_release() sees the non-NULL ->file and assumes the inode reference is held by the file so doesn't call iput(). That's not the case here, since the reference is still meant to be held by the socket; the actual inode of the io_uring file is different. Fix this leak by NULL-ing out ->file before releasing the socket. Reported-by: syzbot+111cb28d9f583693aefa@syzkaller.appspotmail.com Fixes: 2b188cc1bb85 ("Add io_uring IO interface") Cc: <stable@vger.kernel.org> # v5.1+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-12Btrfs: fix race between block group removal and block group allocationFilipe Manana
If a task is removing the block group that currently has the highest start offset amongst all existing block groups, there is a short time window where it races with a concurrent block group allocation, resulting in a transaction abort with an error code of EEXIST. The following diagram explains the race in detail: Task A Task B btrfs_remove_block_group(bg offset X) remove_extent_mapping(em offset X) -> removes extent map X from the tree of extent maps (fs_info->mapping_tree), so the next call to find_next_chunk() will return offset X btrfs_alloc_chunk() find_next_chunk() --> returns offset X __btrfs_alloc_chunk(offset X) btrfs_make_block_group() btrfs_create_block_group_cache() --> creates btrfs_block_group_cache object with a key corresponding to the block group item in the extent, the key is: (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G) --> adds the btrfs_block_group_cache object to the list new_bgs of the transaction handle btrfs_end_transaction(trans handle) __btrfs_end_transaction() btrfs_create_pending_block_groups() --> sees the new btrfs_block_group_cache in the new_bgs list of the transaction handle --> its call to btrfs_insert_item() fails with -EEXIST when attempting to insert the block group item key (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G) because task A has not removed that key yet --> aborts the running transaction with error -EEXIST btrfs_del_item() -> removes the block group's key from the extent tree, key is (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G) A sample transaction abort trace: [78912.403537] ------------[ cut here ]------------ [78912.403811] BTRFS: Transaction aborted (error -17) [78912.404082] WARNING: CPU: 2 PID: 20465 at fs/btrfs/extent-tree.c:10551 btrfs_create_pending_block_groups+0x196/0x250 [btrfs] (...) [78912.405642] CPU: 2 PID: 20465 Comm: btrfs Tainted: G W 5.0.0-btrfs-next-46 #1 [78912.405941] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014 [78912.406586] RIP: 0010:btrfs_create_pending_block_groups+0x196/0x250 [btrfs] (...) [78912.407636] RSP: 0018:ffff9d3d4b7e3b08 EFLAGS: 00010282 [78912.407997] RAX: 0000000000000000 RBX: ffff90959a3796f0 RCX: 0000000000000006 [78912.408369] RDX: 0000000000000007 RSI: 0000000000000001 RDI: ffff909636b16860 [78912.408746] RBP: ffff909626758a58 R08: 0000000000000000 R09: 0000000000000000 [78912.409144] R10: ffff9095ff462400 R11: 0000000000000000 R12: ffff90959a379588 [78912.409521] R13: ffff909626758ab0 R14: ffff9095036c0000 R15: ffff9095299e1158 [78912.409899] FS: 00007f387f16f700(0000) GS:ffff909636b00000(0000) knlGS:0000000000000000 [78912.410285] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [78912.410673] CR2: 00007f429fc87cbc CR3: 000000014440a004 CR4: 00000000003606e0 [78912.411095] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [78912.411496] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [78912.411898] Call Trace: [78912.412318] __btrfs_end_transaction+0x5b/0x1c0 [btrfs] [78912.412746] btrfs_inc_block_group_ro+0xcf/0x160 [btrfs] [78912.413179] scrub_enumerate_chunks+0x188/0x5b0 [btrfs] [78912.413622] ? __mutex_unlock_slowpath+0x100/0x2a0 [78912.414078] btrfs_scrub_dev+0x2ef/0x720 [btrfs] [78912.414535] ? __sb_start_write+0xd4/0x1c0 [78912.414963] ? mnt_want_write_file+0x24/0x50 [78912.415403] btrfs_ioctl+0x17fb/0x3120 [btrfs] [78912.415832] ? lock_acquire+0xa6/0x190 [78912.416256] ? do_vfs_ioctl+0xa2/0x6f0 [78912.416685] ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs] [78912.417116] do_vfs_ioctl+0xa2/0x6f0 [78912.417534] ? __fget+0x113/0x200 [78912.417954] ksys_ioctl+0x70/0x80 [78912.418369] __x64_sys_ioctl+0x16/0x20 [78912.418812] do_syscall_64+0x60/0x1b0 [78912.419231] entry_SYSCALL_64_after_hwframe+0x49/0xbe [78912.419644] RIP: 0033:0x7f3880252dd7 (...) [78912.420957] RSP: 002b:00007f387f16ed68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [78912.421426] RAX: ffffffffffffffda RBX: 000055f5becc1df0 RCX: 00007f3880252dd7 [78912.421889] RDX: 000055f5becc1df0 RSI: 00000000c400941b RDI: 0000000000000003 [78912.422354] RBP: 0000000000000000 R08: 00007f387f16f700 R09: 0000000000000000 [78912.422790] R10: 00007f387f16f700 R11: 0000000000000246 R12: 0000000000000000 [78912.423202] R13: 00007ffda49c266f R14: 0000000000000000 R15: 00007f388145e040 [78912.425505] ---[ end trace eb9bfe7c426fc4d3 ]--- Fix this by calling remove_extent_mapping(), at btrfs_remove_block_group(), only at the very end, after removing the block group item key from the extent tree (and removing the free space tree entry if we are using the free space tree feature). Fixes: 04216820fe83d5 ("Btrfs: fix race between fs trimming and block group remove/allocation") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-11Merge tag 'for-5.2-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "One regression fix to TRIM ioctl. The range cannot be used as its meaning can be confusing regarding physical and logical addresses. This confusion in code led to potential corruptions when the range overlapped data. The original patch made it to several stable kernels and was promptly reverted, the version for master branch is different due to additional changes but the change is effectively the same" * tag 'for-5.2-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: Always trim all unallocated space in btrfs_trim_free_extents
2019-06-11ovl: fix wrong flags check in FS_IOC_FS[SG]ETXATTR ioctlsAmir Goldstein
The ioctl argument was parsed as the wrong type. Fixes: b21d9c435f93 ("ovl: support the FS_IOC_FS[SG]ETXATTR ioctls") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-06-11Revert "fuse: require /dev/fuse reads to have enough buffer capacity"Miklos Szeredi
This reverts commit d4b13963f217dd947da5c0cabd1569e914d21699. The commit introduced a regression in glusterfs-fuse. Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-06-08Merge tag 'ceph-for-5.2-rc4' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fixes from Ilya Dryomov: "A change to call iput() asynchronously to avoid a possible deadlock when iput_final() needs to wait for in-flight I/O (e.g. readahead) and a fixup for a cleanup that went into -rc1" * tag 'ceph-for-5.2-rc4' of git://github.com/ceph/ceph-client: ceph: fix error handling in ceph_get_caps() ceph: avoid iput_final() while holding mutex or in dispatch thread ceph: single workqueue for inode related works
2019-06-08Merge tag 'spdx-5.2-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull yet more SPDX updates from Greg KH: "Another round of SPDX header file fixes for 5.2-rc4 These are all more "GPL-2.0-or-later" or "GPL-2.0-only" tags being added, based on the text in the files. We are slowly chipping away at the 700+ different ways people tried to write the license text. All of these were reviewed on the spdx mailing list by a number of different people. We now have over 60% of the kernel files covered with SPDX tags: $ ./scripts/spdxcheck.py -v 2>&1 | grep Files Files checked: 64533 Files with SPDX: 40392 Files with errors: 0 I think the majority of the "easy" fixups are now done, it's now the start of the longer-tail of crazy variants to wade through" * tag 'spdx-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (159 commits) treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 450 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 449 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 448 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 446 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 445 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 444 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 443 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 442 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 440 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 438 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 437 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 436 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 435 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 434 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 433 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 432 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 431 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 430 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 429 ...
2019-06-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Some ISDN files that got removed in net-next had some changes done in mainline, take the removals. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-07btrfs: Always trim all unallocated space in btrfs_trim_free_extentsNikolay Borisov
This patch removes support for range parameters of FITRIM ioctl when trimming unallocated space on devices. This is necessary since ranges passed from user space are generally interpreted as logical addresses, whereas btrfs_trim_free_extents used to interpret them as device physical extents. This could result in counter-intuitive behavior for users so it's best to remove that support altogether. Additionally, the existing range support had a bug where if an offset was passed to FITRIM which overflows u64 e.g. -1 (parsed as u64 18446744073709551615) then wrong data was fed into btrfs_issue_discard, which in turn leads to wrap-around when aligning the passed range and results in wrong regions being discarded which leads to data corruption. Fixes: c2d1b3aae336 ("btrfs: Honour FITRIM range constraints during free space trim") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>