summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2019-09-05vfs: Convert romfs to use the new mount APIDavid Howells
Convert the romfs filesystem to the new internal mount API as the old one will be obsoleted and removed. This allows greater flexibility in communication of mount parameters between userspace, the VFS and the filesystem. See Documentation/filesystems/mount_api.txt for more information. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-mtd@lists.infradead.org cc: linux-block@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-05vfs: Add a single-or-reconfig keying to vfs_get_super()David Howells
Add an additional keying mode to vfs_get_super() to indicate that only a single superblock should exist in the system, and that, if it does, further mounts should invoke reconfiguration upon it. This allows mount_single() to be replaced. [Fix by Eric Biggers folded in] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-05vfs: Create fs_context-aware mount_bdev() replacementDavid Howells
Create a function, get_tree_bdev(), that is fs_context-aware and a ->get_tree() counterpart of mount_bdev(). It caches the block device pointer in the fs_context struct so that this information can be passed into sget_fc()'s test and set functions. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jens Axboe <axboe@kernel.dk> cc: linux-block@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-05new helper: get_tree_keyed()Al Viro
For vfs_get_keyed_super users. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-05vfs: set fs_context::user_ns for reconfigureEric Biggers
fs_context::user_ns is used by fuse_parse_param(), even during remount, so it needs to be set to the existing value for reconfigure. Reproducer: #include <fcntl.h> #include <sys/mount.h> int main() { char opts[128]; int fd = open("/dev/fuse", O_RDWR); sprintf(opts, "fd=%d,rootmode=040000,user_id=0,group_id=0", fd); mkdir("mnt", 0777); mount("foo", "mnt", "fuse.foo", 0, opts); mount("foo", "mnt", "fuse.foo", MS_REMOUNT, opts); } Crash: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP CPU: 0 PID: 129 Comm: syz_make_kuid Not tainted 5.3.0-rc5-next-20190821 #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014 RIP: 0010:map_id_range_down+0xb/0xc0 kernel/user_namespace.c:291 [...] Call Trace: map_id_down kernel/user_namespace.c:312 [inline] make_kuid+0xe/0x10 kernel/user_namespace.c:389 fuse_parse_param+0x116/0x210 fs/fuse/inode.c:523 vfs_parse_fs_param+0xdb/0x1b0 fs/fs_context.c:145 vfs_parse_fs_string+0x6a/0xa0 fs/fs_context.c:188 generic_parse_monolithic+0x85/0xc0 fs/fs_context.c:228 parse_monolithic_mount_data+0x1b/0x20 fs/fs_context.c:708 do_remount fs/namespace.c:2525 [inline] do_mount+0x39a/0xa60 fs/namespace.c:3107 ksys_mount+0x7d/0xd0 fs/namespace.c:3325 __do_sys_mount fs/namespace.c:3339 [inline] __se_sys_mount fs/namespace.c:3336 [inline] __x64_sys_mount+0x20/0x30 fs/namespace.c:3336 do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Reported-by: syzbot+7d6a57304857423318a5@syzkaller.appspotmail.com Fixes: 408cbe695350 ("vfs: Convert fuse to use the new mount API") Cc: David Howells <dhowells@redhat.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-05erofs: use read_cache_page_gfp for erofs_get_meta_pageGao Xiang
As Christoph said [1], "I'd much prefer to just use read_cache_page_gfp, and live with the fact that this allocates bufferheads behind you for now. I'll try to speed up my attempts to get rid of the buffer heads on the block device mapping instead. " This simplifies the code a lot and a minor thing is "no REQ_META (e.g. for blktrace) on metadata at all..." [1] https://lore.kernel.org/r/20190903153704.GA2201@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-26-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: always use iget5_lockedGao Xiang
As Christoph said [1] [2], "Just use the slightly more complicated 32-bit version everywhere so that you have a single actually tested code path. And then remove this helper. " [1] https://lore.kernel.org/r/20190829102426.GE20598@infradead.org/ [2] https://lore.kernel.org/r/20190902125320.GA16726@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-25-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: use read_mapping_page instead of sb_breadGao Xiang
As Christoph said [1], "This seems to be your only direct use of buffer heads, which while not deprecated are a bit of an ugly step child. So if you can easily avoid creating a buffer_head dependency in a new filesystem I think you should avoid it. " [1] https://lore.kernel.org/r/20190902125109.GA9826@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-24-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: rename errln/infoln/debugln to erofs_{err, info, dbg}Gao Xiang
Add prefix "erofs_" to these functions and print sb->s_id as a prefix to erofs_{err, info} so that the user knows which file system is affected. Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-23-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: save one level of indentationGao Xiang
As Christoph said [1], ".. and save one level of indentation." [1] https://lore.kernel.org/r/20190829102426.GE20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-22-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: kill use_vmap module parameterGao Xiang
As Christoph said [1], "vm_map_ram is supposed to generally behave better. So if it doesn't please report that that to the arch maintainer and linux-mm so that they can look into the issue. Having user make choices of deep down kernel internals is just a horrible interface. Please talk to maintainers of other bits of the kernel if you see issues and / or need enhancements. " Let's redo the previous conclusion and kill the vmap approach. [1] https://lore.kernel.org/r/20190830165533.GA10909@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-21-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: kill all erofs specific fault injectionGao Xiang
As Christoph suggested [1], "Please just use plain kmalloc everywhere and let the normal kernel error injection code take care of injeting any errors." [1] https://lore.kernel.org/r/20190829102426.GE20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-20-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: add "erofs_" prefix for common and short functionsGao Xiang
Add erofs_ prefix to free_inode, alloc_inode, ... Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-19-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: kill __submit_bio()Gao Xiang
As Christoph pointed out [1], " Why is there __submit_bio which really just obsfucates what is going on? Also why is __submit_bio using bio_set_op_attrs instead of opencode it as the comment right next to it asks you to? " Let's use submit_bio directly instead. [1] https://lore.kernel.org/r/20190830162812.GA10694@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-18-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: kill prio and nofail of erofs_get_meta_page()Gao Xiang
As Christoph pointed out [1], "Why is there __erofs_get_meta_page with the two weird booleans instead of a single erofs_get_meta_page that gets and gfp_t for additional flags and an unsigned int for additional bio op flags." And since all callers can handle errors, let's kill prio and nofail and erofs_get_inline_page() now. [1] https://lore.kernel.org/r/20190830162812.GA10694@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-17-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: localize erofs_grab_bio()Gao Xiang
As Christoph pointed out [1], "erofs_grab_bio tries to handle a bio_alloc failure, except that the function will not actually fail due the mempool backing it." Sorry about useless code, fix it now and localize erofs_grab_bio [2]. [1] https://lore.kernel.org/r/20190830162812.GA10694@infradead.org/ [2] https://lore.kernel.org/r/20190902122016.GL15931@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-16-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: kill verbose debug info in erofs_fill_superGao Xiang
As Christoph said [1], "That is some very verbose debug info. We usually don't add that and let people trace the function instead. " [1] https://lore.kernel.org/r/20190829101545.GC20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-15-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: use dsb instead of layout for ondisk super_blockGao Xiang
As Christoph pointed out [1], "Why is the variable name for the on-disk subperblock layout? We usually still calls this something with sb in the name, e.g. dsb. for disksuper block. " Let's fix it. [1] https://lore.kernel.org/r/20190829101545.GC20598@infradead.org/ Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-14-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: better erofs symlink stuffsGao Xiang
Fix as Christoph suggested [1] [2], "remove is_inode_fast_symlink and just opencode it in the few places using it" and "Please just set the ops directly instead of obsfucating that in a single caller, single line inline function. And please set it instead of the normal symlink iops in the same place where you also set those." [1] https://lore.kernel.org/r/20190830163910.GB29603@infradead.org/ [2] https://lore.kernel.org/r/20190829102426.GE20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-13-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: update comments in inode.cGao Xiang
As Christoph suggested [1], update them all. [1] https://lore.kernel.org/r/20190829102426.GE20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-12-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: update erofs_fs.h commentsGao Xiang
As Christoph said [1] [2], update it now. [1] https://lore.kernel.org/r/20190902124521.GA22153@infradead.org/ [2] https://lore.kernel.org/r/20190902120548.GB15931@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-11-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: use erofs_inode namingGao Xiang
As Christoph suggested [1], "Why is this called vnode instead of inode? That seems like a rather odd naming for a Linux file system." [1] https://lore.kernel.org/r/20190829101545.GC20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-10-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: kill erofs_{init,exit}_inode_cacheGao Xiang
As Christoph said [1] "having this function seems entirely pointless", let's kill those. filesystem function name ext2,f2fs,ext4,isofs,squashfs,cifs,... init_inodecache In addition, add a necessary "rcu_barrier()" on exit_fs(); [1] https://lore.kernel.org/r/20190829101545.GC20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-9-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: better naming for erofs inode related stuffsGao Xiang
updates inode naming - kill is_inode_layout_compression [1] - kill magic underscores [2] [3] - better naming for datamode & data_mapping_mode [3] - better naming erofs_inode_{compact, extended} [4] [1] https://lore.kernel.org/r/20190829102426.GE20598@infradead.org/ [2] https://lore.kernel.org/r/20190829102426.GE20598@infradead.org/ [3] https://lore.kernel.org/r/20190902122627.GN15931@infradead.org/ [4] https://lore.kernel.org/r/20190902125438.GA17750@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-8-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: use feature_incompat rather than requirementsGao Xiang
As Christoph said [1], "This is only cosmetic, why not stick to feature_compat and feature_incompat?" In my thought, requirements means "incompatible" instead of "feature" though. [1] https://lore.kernel.org/r/20190902125109.GA9826@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-7-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: update erofs_inode_is_data_compressed helperGao Xiang
As Christoph said, "This looks like a really obsfucated way to write: return datamode == EROFS_INODE_FLAT_COMPRESSION || datamode == EROFS_INODE_FLAT_COMPRESSION_LEGACY; " Although I had my own consideration, it's the right way for now. [1] https://lore.kernel.org/r/20190829095954.GB20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-6-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: kill __packed for on-disk structuresGao Xiang
As Christoph suggested "Please don't add __packed" [1], remove all __packed except struct erofs_dirent here. Note that all on-disk fields except struct erofs_dirent (12 bytes with a 8-byte nid) in EROFS are naturally aligned. [1] https://lore.kernel.org/r/20190829095954.GB20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-5-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: some macros are much more readable as a functionGao Xiang
As Christoph suggested [1], these macros are much more readable as a function. [1] https://lore.kernel.org/r/20190829095954.GB20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-4-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: on-disk format should have explicitly assigned numbersGao Xiang
As Christoph suggested [1], on-disk format should have explicitly assigned numbers. [1] https://lore.kernel.org/r/20190829095954.GB20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-3-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: remove all the byte offset commentsGao Xiang
As Christoph suggested [1], "Please remove all the byte offset comments. that is something that can easily be checked with gdb or pahole." [1] https://lore.kernel.org/r/20190829095954.GB20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-2-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-04ext4: Reduce ext4 timestamp warningsDeepa Dinamani
When ext4 file systems were created intentionally with 128 byte inodes, the rate-limited warning of eventual possible timestamp overflow are still emitted rather frequently. Remove the warning for now. Discussion for whether any warning is needed, and where it should be emitted, can be found at https://lore.kernel.org/lkml/1567523922.5576.57.camel@lca.pw/. I can post a separate follow-up patch after the conclusion. Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-09-04configfs: provide exclusion between IO and removalsAl Viro
Make sure that attribute methods are not called after the item has been removed from the tree. To do so, we * at the point of no return in removals, grab ->frag_sem exclusive and mark the fragment dead. * call the methods of attributes with ->frag_sem taken shared and only after having verified that the fragment is still alive. The main benefit is for method instances - they are guaranteed that the objects they are accessing *and* all ancestors are still there. Another win is that we don't need to bother with extra refcount on config_item when opening a file - the item will be alive for as long as it stays in the tree, and we won't touch it/attributes/any associated data after it's been removed from the tree. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-09-04gfs2: Use async glocks for renameBob Peterson
Because s_vfs_rename_mutex is not cluster-wide, multiple nodes can reverse the roles of which directories are "old" and which are "new" for the purposes of rename. This can cause deadlocks where two nodes end up waiting for each other. There can be several layers of directory dependencies across many nodes. This patch fixes the problem by acquiring all gfs2_rename's inode glocks asychronously and waiting for all glocks to be acquired. That way all inodes are locked regardless of the order. The timeout value for multiple asynchronous glocks is calculated to be the total of the individual wait times for each glock times two. Since gfs2_exchange is very similar to gfs2_rename, both functions are patched in the same way. A new async glock wait queue, sd_async_glock_wait, keeps a list of waiters for these events. If gfs2's holder_wake function detects an async holder, it wakes up any waiters for the event. The waiter only tests whether any of its requests are still pending. Since the glocks are sent to dlm asychronously, the wait function needs to check to see which glocks, if any, were granted. If a glock is granted by dlm (and therefore held), its minimum hold time is checked and adjusted as necessary, as other glock grants do. If the event times out, all glocks held thus far must be dequeued to resolve any existing deadlocks. Then, if there are any outstanding locking requests, we need to loop around and wait for dlm to respond to those requests too. After we release all requests, we return -ESTALE to the caller (vfs rename) which loops around and retries the request. Node1 Node2 --------- --------- 1. Enqueue A Enqueue B 2. Enqueue B Enqueue A 3. A granted 6. B granted 7. Wait for B 8. Wait for A 9. A times out (since Node 1 holds A) 10. Dequeue B (since it was granted) 11. Wait for all requests from DLM 12. B Granted (since Node2 released it in step 10) 13. Rename 14. Dequeue A 15. DLM Grants A 16. Dequeue A (due to the timeout and since we no longer have B held for our task). 17. Dequeue B 18. Return -ESTALE to vfs 19. VFS retries the operation, goto step 1. This release-all-locks / acquire-all-locks may slow rename / exchange down as both nodes struggle in the same way and do the same thing. However, this will only happen when there is contention for the same inodes, which ought to be rare. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-04gfs2: create function gfs2_glock_update_hold_timeAndreas Gruenbacher
This patch moves the code that updates glock minimum hold time to a separate function. This will be called by a future patch. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2019-09-04gfs2: separate holder for rgrps in gfs2_renameBob Peterson
Before this patch, gfs2_rename added a holder for the rgrp glock to its array of holders, ghs. There's nothing wrong with that, but this patch separates it into a separate holder. This is done to ensure it's always locked last as per the proper glock lock ordering, and also to pave the way for a future patch in which we will lock the non-rgrp glocks asynchronously. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-04gfs2: Delete an unnecessary check before brelse()Markus Elfring
The brelse() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. [The same applies to brelse() in gfs2_dir_no_add (which Coccinelle apparently missed), so fix that as well.] Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-04gfs2: Minor PAGE_SIZE arithmetic cleanupsAndreas Gruenbacher
Replace divisions by PAGE_SIZE with shifts by PAGE_SHIFT and similar. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-04fs-udf: Delete an unnecessary check before brelse()Markus Elfring
The brelse() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Link: https://lore.kernel.org/r/a254c1d1-0109-ab51-c67a-edc5c1c4b4cd@web.de Signed-off-by: Jan Kara <jack@suse.cz>
2019-09-04ext2: Delete an unnecessary check before brelse()Markus Elfring
The brelse() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Link: https://lore.kernel.org/r/51dea296-2207-ebc0-bac3-13f3e5c3b235@web.de Signed-off-by: Jan Kara <jack@suse.cz>
2019-09-04udf: Drop forward function declarationsJan Kara
Move some functions to make forward declarations unnecessary. Signed-off-by: Jan Kara <jack@suse.cz>
2019-09-04udf: Verify domain identifier fieldsJan Kara
OSTA UDF standard defines that domain identifier in logical volume descriptor and file set descriptor should contain a particular string and the identifier suffix contains flags possibly making media write-protected. Verify these constraints and allow only read-only mount if they are not met. Tested-by: Steven J. Magnani <steve@digidescorp.com> Reviewed-by: Steven J. Magnani <steve@digidescorp.com> Signed-off-by: Jan Kara <jack@suse.cz>
2019-09-04erofs: using switch-case while checking the inode type.Pratik Shinde
while filling the linux inode, using switch-case statement to check the type of inode. switch-case statement looks more clean here. Signed-off-by: Pratik Shinde <pratikshinde320@gmail.com> Reviewed-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190830095615.10995-1-pratikshinde320@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-03xfs: Fix deadlock between AGI and AGF with RENAME_WHITEOUTkaixuxia
When performing rename operation with RENAME_WHITEOUT flag, we will hold AGF lock to allocate or free extents in manipulating the dirents firstly, and then doing the xfs_iunlink_remove() call last to hold AGI lock to modify the tmpfile info, so we the lock order AGI->AGF. The big problem here is that we have an ordering constraint on AGF and AGI locking - inode allocation locks the AGI, then can allocate a new extent for new inodes, locking the AGF after the AGI. Hence the ordering that is imposed by other parts of the code is AGI before AGF. So we get an ABBA deadlock between the AGI and AGF here. Process A: Call trace: ? __schedule+0x2bd/0x620 schedule+0x33/0x90 schedule_timeout+0x17d/0x290 __down_common+0xef/0x125 ? xfs_buf_find+0x215/0x6c0 [xfs] down+0x3b/0x50 xfs_buf_lock+0x34/0xf0 [xfs] xfs_buf_find+0x215/0x6c0 [xfs] xfs_buf_get_map+0x37/0x230 [xfs] xfs_buf_read_map+0x29/0x190 [xfs] xfs_trans_read_buf_map+0x13d/0x520 [xfs] xfs_read_agf+0xa6/0x180 [xfs] ? schedule_timeout+0x17d/0x290 xfs_alloc_read_agf+0x52/0x1f0 [xfs] xfs_alloc_fix_freelist+0x432/0x590 [xfs] ? down+0x3b/0x50 ? xfs_buf_lock+0x34/0xf0 [xfs] ? xfs_buf_find+0x215/0x6c0 [xfs] xfs_alloc_vextent+0x301/0x6c0 [xfs] xfs_ialloc_ag_alloc+0x182/0x700 [xfs] ? _xfs_trans_bjoin+0x72/0xf0 [xfs] xfs_dialloc+0x116/0x290 [xfs] xfs_ialloc+0x6d/0x5e0 [xfs] ? xfs_log_reserve+0x165/0x280 [xfs] xfs_dir_ialloc+0x8c/0x240 [xfs] xfs_create+0x35a/0x610 [xfs] xfs_generic_create+0x1f1/0x2f0 [xfs] ... Process B: Call trace: ? __schedule+0x2bd/0x620 ? xfs_bmapi_allocate+0x245/0x380 [xfs] schedule+0x33/0x90 schedule_timeout+0x17d/0x290 ? xfs_buf_find+0x1fd/0x6c0 [xfs] __down_common+0xef/0x125 ? xfs_buf_get_map+0x37/0x230 [xfs] ? xfs_buf_find+0x215/0x6c0 [xfs] down+0x3b/0x50 xfs_buf_lock+0x34/0xf0 [xfs] xfs_buf_find+0x215/0x6c0 [xfs] xfs_buf_get_map+0x37/0x230 [xfs] xfs_buf_read_map+0x29/0x190 [xfs] xfs_trans_read_buf_map+0x13d/0x520 [xfs] xfs_read_agi+0xa8/0x160 [xfs] xfs_iunlink_remove+0x6f/0x2a0 [xfs] ? current_time+0x46/0x80 ? xfs_trans_ichgtime+0x39/0xb0 [xfs] xfs_rename+0x57a/0xae0 [xfs] xfs_vn_rename+0xe4/0x150 [xfs] ... In this patch we move the xfs_iunlink_remove() call to before acquiring the AGF lock to preserve correct AGI/AGF locking order. Signed-off-by: kaixuxia <kaixuxia@tencent.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-09-03xfs: define a flags field for the AG geometry ioctl structureDarrick J. Wong
Define a flags field for the AG geometry ioctl structure. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-09-03xfs: add a xfs_valid_startblock helperChristoph Hellwig
Add a helper that validates the startblock is valid. This checks for a non-zero block on the main device, but skips that check for blocks on the realtime device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-09-03devpts_pty_kill(): don't bother with d_delete()Al Viro
we are not retaining dentries there anyway (simple_dentry_operations), so d_delete()+dput() == d_drop()+dput() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-03fs/namei.c: keep track of nd->root refcount statusAl Viro
The rules for nd->root are messy: * if we have LOOKUP_ROOT, it doesn't contribute to refcounts * if we have LOOKUP_RCU, it doesn't contribute to refcounts * if nd->root.mnt is NULL, it doesn't contribute to refcounts * otherwise it does contribute terminate_walk() needs to drop the references if they are contributing. So everything else should be careful not to confuse it, leading to rather convoluted code. It's easier to keep track of whether we'd grabbed the reference(s) explicitly. Use a new flag for that. Don't bother with zeroing nd->root.mnt on unlazy failures and in terminate_walk - it's not needed anymore (terminate_walk() won't care and the next path_init() will zero nd->root in !LOOKUP_ROOT case anyway). Resulting rules for nd->root refcounts are much simpler: they are contributing iff LOOKUP_ROOT_GRABBED is set in nd->flags. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-039p/vfs_super.c: Remove unused parameter data in v9fs_fill_superBharath Vedartham
v9fs_fill_super has a param 'void *data' which is unused in the function. This patch removes the 'void *data' param in v9fs_fill_super and changes the parameters in all function calls of v9fs_fill_super. Link: http://lkml.kernel.org/r/20190523165619.GA4209@bharath12345-Inspiron-5559 Signed-off-by: Bharath Vedartham <linux.bhar@gmail.com> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
2019-09-039p/cache.c: Fix memory leak in v9fs_cache_session_get_cookieBharath Vedartham
v9fs_cache_session_get_cookie assigns a random cachetag to v9ses->cachetag, if the cachetag is not assigned previously. v9fs_random_cachetag allocates memory to v9ses->cachetag with kmalloc and uses scnprintf to fill it up with a cachetag. But if scnprintf fails, v9ses->cachetag is not freed in the current code causing a memory leak. Fix this by freeing v9ses->cachetag it v9fs_random_cachetag fails. This was reported by syzbot, the link to the report is below: https://syzkaller.appspot.com/bug?id=f012bdf297a7a4c860c38a88b44fbee43fd9bbf3 Link: http://lkml.kernel.org/r/20190522194519.GA5313@bharath12345-Inspiron-5559 Reported-by: syzbot+3a030a73b6c1e9833815@syzkaller.appspotmail.com Signed-off-by: Bharath Vedartham <linux.bhar@gmail.com> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
2019-09-039p: avoid attaching writeback_fid on mmap with type PRIVATEChengguang Xu
Currently on mmap cache policy, we always attach writeback_fid whether mmap type is SHARED or PRIVATE. However, in the use case of kata-container which combines 9p(Guest OS) with overlayfs(Host OS), this behavior will trigger overlayfs' copy-up when excute command inside container. Link: http://lkml.kernel.org/r/20190820100325.10313-1-cgxu519@zoho.com.cn Signed-off-by: Chengguang Xu <cgxu519@zoho.com.cn> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>