path: root/fs
Age | Commit message | Author
2016-06-20 | dcache_{readdir,dir_lseek}(): don't bother with nested ->d_lock | Al Viro
Make sure that directory is locked shared in dcache_dir_lseek(); for dcache_readdir() it's already true, and that's enough to make simple_positive() stable. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-06-20 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs | Linus Torvalds
Pull vfs fixes from Al Viro: "A couple more of d_walk()/d_subdirs reordering fixes (stable fodder; ought to solve that crap for good) and a fix for a brown paperbag bug in d_alloc_parallel() (this cycle)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix idiotic braino in d_alloc_parallel()
  autofs races
  much milder d_walk() race
2016-06-20 | ecryptfs: fix spelling mistakes | Chris J Arges
Noticed some minor spelling errors when looking through the code. Signed-off-by: Chris J Arges <chris.j.arges@canonical.com> Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2016-06-20 | eCryptfs: fix typos in comment | Wei Yuan
Signed-off-by: Weiyuan <weiyuan.wei@huawei.com> Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2016-06-20 | ecryptfs: drop null test before destroy functions | Julia Lawall
Remove unneeded NULL test. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@ expression x; @@
-if (x != NULL)
  \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
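The reason the NULL test is redundant can be sketched with a userspace analogy. This is not kernel code: `toy_pool`, `toy_pool_create()` and `toy_pool_destroy()` are hypothetical stand-ins for helpers such as kmem_cache_destroy(), which accept NULL themselves.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a kernel object with a destroy helper. */
struct toy_pool {
    int *slots;
};

/* Counts destroys of real (non-NULL) pools, purely for illustration. */
static int toy_destroy_calls;

struct toy_pool *toy_pool_create(size_t n)
{
    struct toy_pool *p = malloc(sizeof(*p));

    if (!p)
        return NULL;
    p->slots = calloc(n, sizeof(*p->slots));
    if (!p->slots) {
        free(p);
        return NULL;
    }
    return p;
}

/* The helper tolerates NULL, so callers need no "if (x != NULL)" guard. */
void toy_pool_destroy(struct toy_pool *p)
{
    if (!p)
        return;
    free(p->slots);
    free(p);
    toy_destroy_calls++;
}
```

With a helper shaped like this, `toy_pool_destroy(x);` behaves the same as `if (x != NULL) toy_pool_destroy(x);`, which is the simplification the semantic patch applies.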
2016-06-20 | fix idiotic braino in d_alloc_parallel() | Al Viro
Check for d_unhashed() while searching in in-lookup hash was absolutely wrong. Worse, it masked a deadlock on dget() done under bitlock that nests inside ->d_lock. Thanks to J. R. Okajima for spotting it. Spotted-by: "J. R. Okajima" <hooanon05g@gmail.com> Wearing-brown-paperbag: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-06-19 | Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs | Linus Torvalds
Pull UDF fixes and a reiserfs fix from Jan Kara: "A couple of udf fixes (most notably a bug in parsing UDF partitions which led to inability to mount recent Windows installation media) and a reiserfs fix for handling kstrdup failure"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  reiserfs: check kstrdup failure
  udf: Use correct partition reference number for metadata
  udf: Use IS_ERR when loading metadata mirror file entry
  udf: Don't BUG on missing metadata partition descriptor
2016-06-19 | quota: use time64_t internally | Arnd Bergmann
The quota subsystem has two formats: the old v1 format, which uses architecture-specific time_t values in the on-disk format, and the v2 format (introduced in Linux 2.5.16 and 2.4.22), which uses fixed 64-bit little-endian. While there is no future for the v1 format beyond y2038, the v2 format is almost there on 32-bit architectures, as both the user interface and the on-disk format use 64-bit timestamps, just not the time_t in between. This changes the internal representation to use time64_t, which will end up doing the right thing everywhere for v2 format. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jan Kara <jack@suse.cz>
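The y2038 limit mentioned above can be illustrated with a small userspace sketch. The helper name `toy_fits_in_time32()` is hypothetical and not part of the quota code; it only shows which 64-bit second counts a signed 32-bit time_t could still represent.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: reports whether a 64-bit count of seconds since
 * the epoch still fits in a signed 32-bit time_t. INT32_MAX seconds
 * corresponds to 2038-01-19 03:14:07 UTC.
 */
bool toy_fits_in_time32(int64_t seconds)
{
    return seconds >= INT32_MIN && seconds <= INT32_MAX;
}
```

Any timestamp for which this returns false would be lost by a 32-bit intermediate even when the on-disk and user-facing fields are 64 bits wide, which is the gap a 64-bit internal type closes.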
2016-06-18 | Merge tag 'driver-core-4.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core | Linus Torvalds
Pull driver core fixes from Greg KH: "Here are a small number of debugfs, ISA, and one driver core fix for 4.7-rc4. All of these resolve reported issues. The ISA ones have spent the least amount of time in linux-next, sorry about that, I didn't realize they were regressions that needed to get in now (thanks to Thorsten for the prodding!) but they do all pass the 0-day bot tests. The others have been in linux-next for a while now. Full details about them are in the shortlog below"
* tag 'driver-core-4.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  isa: Dummy isa_register_driver should return error code
  isa: Call isa_bus_init before dependent ISA bus drivers register
  watchdog: ebc-c384_wdt: Allow build for X86_64
  iio: stx104: Allow build for X86_64
  gpio: Allow PC/104 devices on X86_64
  isa: Allow ISA-style drivers on modern systems
  base: make module_create_drivers_dir race-free
  debugfs: open_proxy_open(): avoid double fops release
  debugfs: full_proxy_open(): free proxy on ->open() failure
  kernel/kcov: unproxify debugfs file's fops
2016-06-18 | Merge branch 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs | Linus Torvalds
Pull btrfs fixes from Chris Mason: "The most user visible change here is a fix for our recent superblock validation checks that were causing problems on non-4k pagesized systems"
* 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: btrfs_check_super_valid: Allow 4096 as stripesize
  btrfs: remove build fixup for qgroup_account_snapshot
  btrfs: use new error message helper in qgroup_account_snapshot
  btrfs: avoid blocking open_ctree from cleaner_kthread
  Btrfs: don't BUG_ON() in btrfs_orphan_add
  btrfs: account for non-CoW'd blocks in btrfs_abort_transaction
  Btrfs: check if extent buffer is aligned to sectorsize
  btrfs: Use correct format specifier
2016-06-17 | Btrfs: btrfs_check_super_valid: Allow 4096 as stripesize | Chandan Rajendra
Older btrfs-progs/mkfs.btrfs sets 4096 as the stripesize. Hence restricting stripesize to be equal to sectorsize would cause super block validation to return an error on architectures where PAGE_SIZE is not equal to 4096. Hence as a workaround, this commit allows stripesize to be set to 4096 bytes. Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
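The relaxed rule described in this message can be sketched as a standalone predicate. This is an assumption-laden illustration, not the kernel's check: `toy_stripesize_ok()` is a hypothetical name, and the real validation in btrfs_check_super_valid() may be structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch of the workaround: a stripesize is accepted when it matches
 * the sectorsize, or when it is the legacy value 4096 written by older
 * mkfs.btrfs regardless of page size.
 */
bool toy_stripesize_ok(uint32_t stripesize, uint32_t sectorsize)
{
    return stripesize == sectorsize || stripesize == 4096;
}
```

For example, on a system with a 64K sectorsize, a superblock carrying stripesize 4096 passes under this rule, where a strict equality test would have rejected it.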
2016-06-17 | btrfs: remove build fixup for qgroup_account_snapshot | David Sterba
Introduced in 2c1984f244838477aab ("btrfs: build fixup for qgroup_account_snapshot") as temporary bisectability build fixup. Signed-off-by: David Sterba <dsterba@suse.com>
2016-06-17 | btrfs: use new error message helper in qgroup_account_snapshot | David Sterba
We've renamed btrfs_std_error, this one is left from last merge. Signed-off-by: David Sterba <dsterba@suse.com>
2016-06-17 | btrfs: avoid blocking open_ctree from cleaner_kthread | Zygo Blaxell
This fixes a problem introduced in commit 2f3165ecf103599f82bf0ea254039db335fb5005 "btrfs: don't force mounts to wait for cleaner_kthread to delete one or more subvolumes". open_ctree eventually calls btrfs_replay_log which in turn calls btrfs_commit_super which tries to lock the cleaner_mutex, causing a recursive mutex deadlock during mount. Instead of playing whack-a-mole trying to keep up with all the functions that may want to lock cleaner_mutex, put all the cleaner_mutex lockers back where they were, and attack the problem more directly: keep cleaner_kthread asleep until the filesystem is mounted. When filesystems are mounted read-only and later remounted read-write, open_ctree did not set fs_info->open and neither does anything else. Set this flag in btrfs_remount so that neither btrfs_delete_unused_bgs nor cleaner_kthread get confused by the common case of "/" filesystem read-only mount followed by read-write remount. Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-06-17 | Btrfs: don't BUG_ON() in btrfs_orphan_add | Josef Bacik
This is just a screwup for developers, so change it to an ASSERT() so developers notice when things go wrong and deal with the error appropriately if ASSERT() isn't enabled. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Signed-off-by: David Sterba <dsterba@suse.com>
2016-06-17 | btrfs: account for non-CoW'd blocks in btrfs_abort_transaction | Jeff Mahoney
The test for !trans->blocks_used in btrfs_abort_transaction is insufficient to determine whether it's safe to drop the transaction handle on the floor. btrfs_cow_block, informed by should_cow_block, can return blocks that have already been CoW'd in the current transaction. trans->blocks_used is only incremented for new block allocations. If an operation overlaps the blocks in the current transaction entirely and must abort the transaction, we'll happily let it clean up the trans handle even though it may have modified the blocks and will commit an incomplete operation. In the long-term, I'd like to do closer tracking of when the fs is actually modified so we can still recover as gracefully as possible, but that approach will need some discussion. In the short term, since this is the only code using trans->blocks_used, let's just switch it to a bool indicating whether any blocks were used and set it when should_cow_block returns false. Cc: stable@vger.kernel.org # 3.4+ Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-06-17 | Btrfs: check if extent buffer is aligned to sectorsize | Liu Bo
Thanks to fuzz testing, we can pass an invalid bytenr to extent buffer via alloc_extent_buffer(). An unaligned eb can have more pages than it should have, which ends up leaking the extent buffer or corrupting some of its content. This adds a warning to let us quickly know what was happening. Now that alloc_extent_buffer() no longer returns NULL, this changes its caller and callers of its caller to match the new error handling. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-06-17 | btrfs: Use correct format specifier | Heinrich Schuchardt
Component mirror_num of struct btrfsic_block is defined as unsigned int. Use %u as format specifier. Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
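A minimal userspace sketch of the same idea follows. `toy_format_mirror_num()` is a hypothetical helper, not the btrfs integrity-checker code; it simply pairs an unsigned int argument with the matching `%u` conversion.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Formats an unsigned int with %u. Using %d here would mismatch the
 * argument type and could print large values as negative numbers.
 */
int toy_format_mirror_num(char *buf, size_t len, unsigned int mirror_num)
{
    return snprintf(buf, len, "mirror_num=%u", mirror_num);
}
```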
2016-06-17 | gfs2: Initialize iopen glock holder for new inodes | Andreas Gruenbacher
In gfs2_init_inode_once, initialize inode->i_iopen_gh.gh_gl to NULL: otherwise, when gfs2_inode_lookup fails, the iopen glock holder can remain unset and iget_failed can end up accessing random memory. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2016-06-16 | Merge tag 'nfsd-4.7-1' of git://linux-nfs.org/~bfields/linux | Linus Torvalds
Pull nfsd bugfixes from Bruce Fields: "Oleg Drokin found and fixed races in the nfsd4 state code that go back to the big nfs4_lock_state removal around 3.17 (but that were also probably hard to reproduce before client changes in 3.20 allowed the client to perform parallel opens). Also fix a 4.1 backchannel crash due to rpc multipath changes in 4.6. Trond acked the client-side rpc fixes going through my tree"
* tag 'nfsd-4.7-1' of git://linux-nfs.org/~bfields/linux:
  nfsd: Make init_open_stateid() a bit more whole
  nfsd: Extend the mutex holding region around in nfsd4_process_open2()
  nfsd: Always lock state exclusively.
  rpc: share one xps between all backchannels
  nfsd4/rpc: move backchannel create logic into rpc code
  SUNRPC: fix xprt leak on xps allocation failure
  nfsd: Fix NFSD_MDS_PR_KEY on 32-bit by adding ULL postfix
2016-06-16 | Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs | Linus Torvalds
Pull overlayfs fixes from Miklos Szeredi: "This contains two regression fixes: one for the xattr API update and one for using the mounter's creds in file creation in overlayfs. There's also a fix for a bug in handling hard linked AF_UNIX sockets that's been there from day one. This fix is overlayfs only despite the fact that it touches code outside the overlay filesystem: d_real() is an identity function for all except overlay dentries"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: fix uid/gid when creating over whiteout
  ovl: xattr filter fix
  af_unix: fix hard linked sockets on overlay
  vfs: add d_real_inode() helper
2016-06-15 | nfsd: Make init_open_stateid() a bit more whole | Oleg Drokin
Move the state selection logic inside from the caller, always making it return correct stp to use. Signed-off-by: J . Bruce Fields <bfields@fieldses.org> Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-06-15 | nfsd: Extend the mutex holding region around in nfsd4_process_open2() | Oleg Drokin
To avoid racing entry into nfs4_get_vfs_file(). Make init_open_stateid() return with locked stateid to be unlocked by the caller. Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-06-15 | nfsd: Always lock state exclusively. | Oleg Drokin
It used to be the case that state had an rwlock that was locked for write by downgrades, but for read for upgrades (opens). Well, the problem is if there are two competing opens for the same state, they step on each other's toes, potentially leading to leaking file descriptors from the state structure, since access mode is a bitmap only set once. Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-06-15 | f2fs: find parent dentry correctly | Sheng Yong
If dotdot directory is corrupted, its slot may be occupied by another file. In this case, dentry[1] is not the parent directory. Rename and cross-rename will update the inode in dentry[1] incorrectly. This patch finds dotdot dentry by name. Signed-off-by: Sheng Yong <shengyong1@huawei.com> [Jaegeuk Kim: remove wrong bug_on] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-15 | NFS: Cache access checks more aggressively | Trond Myklebust
If an attribute revalidation fails, then we already know that we'll zap the access cache. If, OTOH, the inode isn't changing, there should be no need to eject access calls just because they are old. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-06-15 | nfsd4/rpc: move backchannel create logic into rpc code | J. Bruce Fields
Also simplify the logic a bit. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Acked-by: Trond Myklebust <trondmy@primarydata.com>
2016-06-15 | ovl: fix uid/gid when creating over whiteout | Miklos Szeredi
Fix a regression when creating a file over a whiteout. The new file/directory needs to use the current fsuid/fsgid, not the ones from the mounter's credentials. The refcounting is a bit tricky: prepare_creds() sets an original refcount, override_creds() gets one more, which revert_creds() drops. So 1) we need to explicitly put the mounter's credentials when overriding with the updated one 2) we need to put the original ref to the updated creds (and this can safely be done before revert_creds(), since we'll still have the ref from override_creds()). Reported-by: Stephen Smalley <sds@tycho.nsa.gov> Fixes: 3fe6e52f0626 ("ovl: override creds with the ones from the superblock mounter") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
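The refcount bookkeeping described above can be traced with a toy model. Everything here is hypothetical (`toy_cred`, `toy_override()`, and so on); it mimics only the counting rules stated in the message, not the kernel's cred API, and never frees anything so the counters stay inspectable.

```c
#include <stddef.h>

struct toy_cred {
    int usage; /* reference count */
};

/* Stand-in for the task's currently active credentials. */
static struct toy_cred *toy_current;

void toy_prepare(struct toy_cred *c) { c->usage = 1; }
void toy_put(struct toy_cred *c)     { c->usage--; }

/* Takes one more reference on 'new', installs it, returns the old creds. */
struct toy_cred *toy_override(struct toy_cred *new)
{
    struct toy_cred *old = toy_current;

    new->usage++;
    toy_current = new;
    return old;
}

/* Reinstalls 'old' and drops the reference taken by toy_override(). */
void toy_revert(struct toy_cred *old)
{
    struct toy_cred *cur = toy_current;

    toy_current = old;
    toy_put(cur);
}

void toy_create_over_whiteout(struct toy_cred *task, struct toy_cred *mounter,
                              struct toy_cred *updated)
{
    struct toy_cred *saved;

    toy_current = task;
    saved = toy_override(mounter);  /* mounter: +1, now active */
    toy_prepare(updated);           /* updated: 1 (original ref) */
    toy_put(toy_override(updated)); /* updated: 2; point 1: put mounter's ref */
    toy_put(updated);               /* point 2: drop original ref, updated: 1 */
    /* ... the file would be created here with the updated creds ... */
    toy_revert(saved);              /* updated: 0, task creds active again */
}
```

In this model the mounter's count returns to its starting value and the updated creds end at zero, which is the balance the two numbered points in the message are meant to achieve.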
2016-06-15 | debugfs: open_proxy_open(): avoid double fops release | Nicolai Stange
Debugfs' open_proxy_open(), the ->open() installed at all inodes created through debugfs_create_file_unsafe(), - grabs a reference to the original file_operations instance passed to debugfs_create_file_unsafe() via fops_get(), - installs it at the file's ->f_op by means of replace_fops() - and calls fops_put() on it. Since the semantics of replace_fops() are such that the reference's ownership is transferred, the subsequent fops_put() will result in a double release when the file is eventually closed. Currently, this is not an issue since fops_put() basically does a module_put() on the file_operations' ->owner only and there don't exist any modules calling debugfs_create_file_unsafe() yet. This is expected to change in the future though, c.f. commit c64688081490 ("debugfs: add support for self-protecting attribute file fops"). Remove the call to fops_put() from open_proxy_open(). Fixes: 9fd4dcece43a ("debugfs: prevent access to possibly dead file_operations at file open") Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
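The double release can be followed with a toy reference counter. The names below are hypothetical and the model is deliberately minimal: it encodes only the ownership rules stated in the message (fops_get() takes a reference, replace_fops() transfers it to the file, closing the file drops it).

```c
#include <stdbool.h>

/*
 * Returns the final reference count after one open/close cycle.
 * 'put_after_replace' selects the old behaviour (true) or the fixed
 * behaviour (false).
 */
int toy_open_close_cycle(bool put_after_replace)
{
    int refs = 0;

    refs++;                 /* fops_get(): open takes a reference */
    /* replace_fops(): ownership of that reference moves to the file */
    if (put_after_replace)
        refs--;             /* old code: extra fops_put() */
    refs--;                 /* file close releases the file's reference */
    return refs;
}
```

A result of 0 means the count is balanced; a negative result corresponds to the double release that the removed fops_put() call caused.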
2016-06-15 | debugfs: full_proxy_open(): free proxy on ->open() failure | Nicolai Stange
Debugfs' full_proxy_open(), the ->open() installed at all inodes created through debugfs_create_file(), - grabs a reference to the original struct file_operations instance passed to debugfs_create_file(), - dynamically allocates a proxy struct file_operations instance wrapping the original - and installs this at the file's ->f_op. Afterwards, it calls the original ->open() and passes its return value back to the VFS layer. Now, if that return value indicates failure, the VFS layer won't ever call ->release() and thus, neither the reference to the original file_operations nor the memory for the proxy file_operations will get released, i.e. both are leaked. Upon failure of the original fops' ->open(), undo the proxy installation. That is: - Set the struct file ->f_op to what it had been when full_proxy_open() was entered. - Drop the reference to the original file_operations. - Free the memory holding the proxy file_operations. Fixes: 49d200deaa68 ("debugfs: prevent access to removed files' private data") Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-15 | mnt: Account for MS_RDONLY in fs_fully_visible | Eric W. Biederman
In rare cases it is possible for s_flags & MS_RDONLY to be set but MNT_READONLY to be clear. This starting combination can cause fs_fully_visible to fail to ensure that the new mount is readonly. Therefore force MNT_LOCK_READONLY in the new mount if MS_RDONLY is set on the source filesystem of the mount. In general both MS_RDONLY and MNT_READONLY are set at the same time for mounts so I don't expect any programs to care. Nor do I expect MS_RDONLY to be set on proc or sysfs in the initial user namespace, which further decreases the likelihood of problems. Which means this change should only affect system configurations by paranoid sysadmins who should welcome the additional protection as it keeps people from wriggling out of their policies. Cc: stable@vger.kernel.org Fixes: 8c6cf9cc829f ("mnt: Modify fs_fully_visible to deal with locked ro nodev and atime") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
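The forcing rule can be sketched as a pure function over flag words. The `TOY_` constants and `toy_effective_mnt_flags()` are hypothetical, with illustrative bit values; they are not taken from the kernel headers.

```c
/* Illustrative bit values only; the kernel's own definitions may differ. */
#define TOY_MS_RDONLY          0x1u
#define TOY_MNT_READONLY       0x40u
#define TOY_MNT_LOCK_READONLY  0x400000u

/*
 * Sketch: when the superblock is read-only, treat the mount as having
 * its read-only state locked, even if the per-mount read-only flag was
 * clear to begin with.
 */
unsigned int toy_effective_mnt_flags(unsigned int s_flags,
                                     unsigned int mnt_flags)
{
    if (s_flags & TOY_MS_RDONLY)
        mnt_flags |= TOY_MNT_LOCK_READONLY;
    return mnt_flags;
}
```

The rare starting combination from the message is `s_flags` with the read-only bit set and `mnt_flags` equal to 0; under this rule the result still carries the lock bit.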
2016-06-14 | pstore/ram: add Device Tree bindings | Greg Hackmann
ramoops is one of the remaining places where ARM vendors still rely on board-specific shims. Device Tree lets us replace those shims with generic code. These bindings mirror the ramoops module parameters, with two small differences: (1) dump_oops becomes an optional "no-dump-oops" property, since ramoops sets dump_oops=1 by default. (2) mem_type=1 becomes the more self-explanatory "unbuffered" property. Signed-off-by: Greg Hackmann <ghackmann@google.com> [fixed platform_get_drvdata() crash, thanks to Brian Norris] [switched from u64 to u32 to simplify code, various whitespace fixes] [use dev_of_node() to gain code-elimination for CONFIG_OF=n] Signed-off-by: Kees Cook <keescook@chromium.org>
2016-06-14 | nfsd: Fix NFSD_MDS_PR_KEY on 32-bit by adding ULL postfix | Geert Uytterhoeven
On 32-bit: fs/nfsd/blocklayout.c: In function ‘nfsd4_block_get_device_info_scsi’: fs/nfsd/blocklayout.c:337: warning: integer constant is too large for ‘long’ type fs/nfsd/blocklayout.c:344: warning: integer constant is too large for ‘long’ type fs/nfsd/blocklayout.c: In function ‘nfsd4_scsi_fence_client’: fs/nfsd/blocklayout.c:385: warning: integer constant is too large for ‘long’ type Add the missing "ULL" postfix to 64-bit constant NFSD_MDS_PR_KEY to fix this. Fixes: f99d4fbdae6765d0 ("nfsd: add SCSI layout support") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
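The effect of the suffix can be shown in a few lines of portable C. `TOY_PR_KEY` and its value are hypothetical and chosen only because the constant needs more than 32 bits; this is not the nfsd definition.

```c
#include <stdint.h>

/*
 * The ULL suffix makes the constant's type unsigned long long
 * explicitly, so the compiler has no reason to warn that it is too
 * large for 'long' on targets where long is 32 bits wide.
 */
#define TOY_PR_KEY 0x0100000000000000ULL

uint64_t toy_pr_key(void)
{
    return TOY_PR_KEY;
}
```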
2016-06-13 | f2fs: fix deadlock in add_link failure | Jaegeuk Kim
mkdir sync_dirty_inode - init_inode_metadata - lock_page(node) - make_empty_dir - filemap_fdatawrite() - do_writepages - lock_page(data) - write_page(data) - lock_page(node) - f2fs_init_acl - error - truncate_inode_pages - lock_page(data) So, we don't need to truncate data pages in this error case, which will be done by f2fs_evict_inode. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-13 | f2fs: introduce mode=lfs mount option | Jaegeuk Kim
This mount option forcefully enables the original log-structured filesystem behavior, so there should be no random writes to the main area. In particular, this supports host-managed SMR devices. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-13 | NFS: Don't flush caches for a getattr that races with writeback | Trond Myklebust
If there were outstanding writes then chalk up the unexpected change attribute on the server to them. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-06-13 | freevxfs: update Kconfig information | Krzysztof Błaszkowski
Signed-off-by: Krzysztof Błaszkowski <kb@sysmikro.com.pl> Signed-off-by: Christoph Hellwig <hch@lst.de>
2016-06-12 | freevxfs: refactor readdir and lookup code | Krzysztof Błaszkowski
This change also fixes a buffer overflow caused by accessing address space beyond the mapped page. Signed-off-by: Krzysztof Błaszkowski <kb@sysmikro.com.pl> Signed-off-by: Christoph Hellwig <hch@lst.de>
2016-06-12 | freevxfs: fix lack of inode initialization | Krzysztof Błaszkowski
There is nothing worse than a freshly allocated inode that has not been initialized by _once(). Signed-off-by: Krzysztof Błaszkowski <kb@sysmikro.com.pl> Signed-off-by: Christoph Hellwig <hch@lst.de>
2016-06-12 | freevxfs: fix memory leak in vxfs_read_fshead() | Krzysztof Błaszkowski
On every successful mount, two struct vxfs_fsh instances were not released. Signed-off-by: Krzysztof Błaszkowski <kb@sysmikro.com.pl> Signed-off-by: Christoph Hellwig <hch@lst.de>
2016-06-12 | autofs races | Al Viro
* make autofs4_expire_indirect() skip the dentries being in process of expiry
* do *not* mess with list_move(); making sure that dentry with AUTOFS_INF_EXPIRING are not picked for expiry is enough.
* do not remove NO_RCU when we set EXPIRING, don't bother with smp_mb() there. Clear it at the same time we clear EXPIRING. Makes a bunch of tests simpler.
* rename NO_RCU to WANT_EXPIRE, which is what it really is.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-06-11 | fs/dcache.c: Save one 32-bit multiply in dcache lookup | George Spelvin
Now that we're mixing in the parent pointer earlier, we don't need to use hash_32() to mix its bits. Instead, we can just take the msbits of the hash value directly. For those applications which use the partial_name_hash(), move the multiply to end_name_hash. Signed-off-by: George Spelvin <linux@sciencehorizons.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-10 | vfs: make the string hashes salt the hash | Linus Torvalds
We always mixed in the parent pointer into the dentry name hash, but we did it late at lookup time. It turns out that we can simplify that lookup-time action by salting the hash with the parent pointer early instead of late. A few other users of our string hashes also wanted to mix in their own pointers into the hash, and those are updated to use the same mechanism. Hash users that don't have any particular initial salt can just use the NULL pointer as a no-salt. Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
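The idea of salting early can be sketched with a toy string hash that is seeded from a pointer before any characters are mixed in. `toy_salted_hash()` and its multiply-by-31 mixing step are assumptions for illustration; the kernel's hash functions use different mixing.

```c
#include <stdint.h>

/*
 * Toy salted string hash: the salt pointer seeds the state up front,
 * so the same name hashes differently under different parents. A NULL
 * salt gives the plain unsalted hash.
 */
uint32_t toy_salted_hash(const void *salt, const char *name)
{
    uint32_t hash = (uint32_t)(uintptr_t)salt;

    while (*name)
        hash = hash * 31u + (unsigned char)*name++;
    return hash;
}
```

Because the salt is folded in at the start, no separate mixing step is needed at lookup time, which is the simplification the message describes.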
2016-06-10 | rxrpc: Limit the listening backlog | David Howells
Limit the socket incoming call backlog queue size so that a remote client can't pump in sufficient new calls that the server runs out of memory. Note that this is partially theoretical at the moment since whilst the number of calls is limited, the number of packets trying to set up new calls is not. This will be addressed in a later patch. If the caller of listen() specifies a backlog INT_MAX, then they get the current maximum; anything else greater than max_backlog or anything negative incurs EINVAL. The limit on the maximum queue size can be set by: echo N >/proc/sys/net/rxrpc/max_backlog where 4<=N<=32. Further, set the default backlog to 0, requiring listen() to be called before we start actually queueing new calls. Whilst this kind of is a change in the UAPI, the caller can't actually *accept* new calls anyway unless they've first called listen() to put the socket into the LISTENING state - thus the aforementioned new calls would otherwise just sit there, eating up kernel memory. (Note that sockets that don't have a non-zero service ID bound don't get incoming calls anyway.) Given that the default backlog is now 0, make the AFS filesystem call kernel_listen() to set the maximum backlog for itself. Possible improvements include: (1) Trimming a too-large backlog to max_backlog when listen is called. (2) Trimming the backlog value whenever the value is used so that changes to max_backlog are applied to an open socket automatically. Note that the AFS filesystem opens one socket and keeps it open for extended periods, so would miss out on changes to max_backlog. (3) Having a separate setting for the AFS filesystem. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
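The backlog rule stated above (INT_MAX selects the current maximum; anything else above the maximum, or negative, is rejected with EINVAL) can be written as a small function. `toy_effective_backlog()` is a hypothetical name and this is a sketch of the stated rule, not the rxrpc listen implementation.

```c
#include <errno.h>
#include <limits.h>

/*
 * Returns the backlog to use, or -EINVAL when the request is out of
 * range. INT_MAX is treated as "give me the current maximum".
 */
int toy_effective_backlog(int backlog, int max_backlog)
{
    if (backlog == INT_MAX)
        return max_backlog;
    if (backlog < 0 || backlog > max_backlog)
        return -EINVAL;
    return backlog;
}
```

With `max_backlog` set to 32, a request of 8 yields 8, INT_MAX yields 32, and both 33 and -1 yield -EINVAL.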
2016-06-10 | Merge branch 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs | Linus Torvalds
Pull btrfs fixes from Chris Mason: "Has some fixes and some new self tests for btrfs. The self tests are usually disabled in the .config file (unless you're doing btrfs dev work), and this bunch is meant to find problems with the 64K page size patches. Jeff has a patch to help people see if they are using the hardware assist crc32c module, which really helps us nail down problems when people ask why crcs are using so much CPU. Otherwise, it's small fixes"
* 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: self-tests: Fix extent buffer bitmap test fail on BE system
  Btrfs: self-tests: Fix test_bitmaps fail on 64k sectorsize
  Btrfs: self-tests: Use macros instead of constants and add missing newline
  Btrfs: self-tests: Support testing all possible sectorsizes and nodesizes
  Btrfs: self-tests: Execute page straddling test only when nodesize < PAGE_SIZE
  btrfs: advertise which crc32c implementation is being used at module load
  Btrfs: add validadtion checks for chunk loading
  Btrfs: add more validation checks for superblock
  Btrfs: clear uptodate flags of pages in sys_array eb
  Btrfs: self-tests: Support non-4k page size
  Btrfs: Fix integer overflow when calculating bytes_per_bitmap
  Btrfs: test_check_exists: Fix infinite loop when searching for free space entries
  Btrfs: end transaction if we abort when creating uuid root
  btrfs: Use __u64 in exported linux/btrfs.h.
2016-06-10 | Merge branch 'stacking-fixes' (vfs stacking fixes from Jann) | Linus Torvalds
Merge filesystem stacking fixes from Jann Horn.
* emailed patches from Jann Horn <jannh@google.com>:
  sched: panic on corrupted stack end
  ecryptfs: forbid opening files without mmap handler
  proc: prevent stacking filesystems on top
2016-06-10 | ecryptfs: forbid opening files without mmap handler | Jann Horn
This prevents users from triggering a stack overflow through a recursive invocation of pagefault handling that involves mapping procfs files into virtual memory. Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Tyler Hicks <tyhicks@canonical.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-10 | proc: prevent stacking filesystems on top | Jann Horn
This prevents stacking filesystems (ecryptfs and overlayfs) from using procfs as lower filesystem. There is too much magic going on inside procfs, and there is no good reason to stack stuff on top of procfs. (For example, procfs does access checks in VFS open handlers, and ecryptfs by design calls open handlers from a kernel thread that doesn't drop privileges or so.) Signed-off-by: Jann Horn <jannh@google.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-10 | much milder d_walk() race | Al Viro
d_walk() relies upon the tree not getting rearranged under it without rename_lock being touched. And we do grab rename_lock around the places that change the tree topology. Unfortunately, branch reordering is just as bad from d_walk() POV and we have two places that do it without touching rename_lock - one in handling of cursors (for ramfs-style directories) and another in autofs. autofs one is a separate story; this commit deals with the cursors.
* mark cursor dentries explicitly at allocation time
* make __dentry_kill() leave ->d_child.next pointing to the next non-cursor sibling, making sure that it won't be moved around unnoticed before the parent is relocked on ascend-to-parent path in d_walk().
* make d_walk() skip cursors explicitly; strictly speaking it's not necessary (all callbacks we pass to d_walk() are no-ops on cursors), but it makes analysis easier.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-06-10 | GFS2: don't set rgrp gl_object until it's inserted into rgrp tree | Bob Peterson
Before this patch, function read_rindex_entry would set a rgrp glock's gl_object pointer to itself before inserting the rgrp into the rgrp rbtree. The problem is: if another process was also reading the rgrp in, and had already inserted its newly created rgrp, then the second call to read_rindex_entry would overwrite that value, then return a bad return code to the caller. Later, other functions would reference the now-freed rgrp memory by way of gl_object. In some cases, that could result in gfs2_rgrp_brelse being called twice for the same rgrp: once for the failed attempt and once for the "real" rgrp release. Eventually the kernel would panic. There are also a number of other things that could go wrong when a kernel module is accessing freed storage. For example, this could result in rgrp corruption because the fake rgrp would point to a fake bitmap in memory too, causing gfs2_inplace_reserve to search some random memory for free blocks, and find some, since we were never setting rgd->rd_bits to NULL before freeing it. This patch fixes the problem by not setting gl_object until we have successfully inserted the rgrp into the rbtree. Also, it sets rd_bits to NULL as it frees them, which will ensure any accidental access to the wrong rgrp will result in a kernel panic rather than file system corruption, which is preferred. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
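The ordering fix can be illustrated with a toy model in which a one-entry slot stands in for the rgrp rbtree. All names (`toy_rgrp`, `toy_glock`, `toy_read_rindex_entry()`) are hypothetical; the sketch shows only the "insert first, publish the pointer on success" ordering, not the GFS2 code.

```c
#include <errno.h>
#include <stddef.h>

struct toy_rgrp  { int addr; };
struct toy_glock { struct toy_rgrp *gl_object; };

/* One-entry stand-in for the rbtree of resource groups. */
static struct toy_rgrp *toy_tree_slot;

static int toy_tree_insert(struct toy_rgrp *rgd)
{
    if (toy_tree_slot && toy_tree_slot->addr == rgd->addr)
        return -EEXIST; /* another reader already inserted this rgrp */
    toy_tree_slot = rgd;
    return 0;
}

/*
 * Publish the rgrp through gl_object only after the insert succeeded,
 * so a losing racer never overwrites the winner's pointer.
 */
int toy_read_rindex_entry(struct toy_glock *gl, struct toy_rgrp *rgd)
{
    int err = toy_tree_insert(rgd);

    if (err)
        return err;
    gl->gl_object = rgd;
    return 0;
}
```

If two callers race with rgrps for the same address, the second call fails with -EEXIST and leaves `gl_object` pointing at the first caller's rgrp, so the loser can free its copy without leaving a dangling pointer behind.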