summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2019-06-28xfs: factor out log buffer writing from xlog_syncChristoph Hellwig
Replace the not very useful xlog_bdstrat wrapper with a new version that that takes care of all the common logic for writing log buffers. Use the opportunity to avoid overloading the buffer address with the log relative address, and to shed the unused return value. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28xfs: don't use REQ_PREFLUSH for split log writesChristoph Hellwig
If we have to split a log write because it wraps the end of the log we can't just use REQ_PREFLUSH to flush before the first log write, as the writes might get reordered somewhere in the I/O stack. Issue a manual flush in that case so that the ordering of the two log I/Os doesn't matter. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28xfs: remove XLOG_STATE_IOABORTChristoph Hellwig
This value is the only flag in ic_state, which we otherwise use as a state. Switch it to a new debug-only field and also report and actual error in the buffer in the I/O completion path. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28xfs: reformat xlog_get_lowest_lsnChristoph Hellwig
Reformat xlog_get_lowest_lsn to our usual style. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28xfs: cleanup xlog_get_iclog_buffer_sizeChristoph Hellwig
We don't really need all the messy branches in the function, as it really does three things, out of which 2 are common for all branches: 1) set up mount point log buffer size and count values if not already done from mount options 2) calculate the number of log headers 3) set up all the values in struct xlog based on the above Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28xfs: remove the l_iclog_size_log field from struct xlogChristoph Hellwig
This field is never used, so we can simply kill it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28xfs: make mem_to_page available outside of xfs_buf.cChristoph Hellwig
Rename the function to kmem_to_page and move it to kmem.h together with our kmem_large allocator that may either return kmalloced or vmalloc pages. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28xfs: renumber XBF_WRITE_FAILChristoph Hellwig
Assining a numerical value that is not close to the flags defined near by is just asking for conflicts later on. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28xfs: remove the never used _XBF_COMPOUND flagChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28xfs: remove the no-op spinlock_destroy stubChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28xfs: move xfs_ino_geometry to xfs_shared.hDarrick J. Wong
The inode geometry structure isn't related to ondisk format; it's support for the mount structure. Move it to xfs_shared.h. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-06-28afs: Add support for the UAE error tableDavid Howells
Add support for mapping AFS UAE abort codes to Linux errno values. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-28NFS/flexfiles: Use the correct TCP timeout for flexfiles I/OTrond Myklebust
Fix a typo where we're confusing the default TCP retrans value (NFS_DEF_TCP_RETRANS) for the default TCP timeout value. Fixes: 15d03055cf39f ("pNFS/flexfiles: Set reasonable default ...") Cc: stable@vger.kernel.org # 4.8+ Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-06-28cifs: fix crash querying symlinks stored as reparse-pointsRonnie Sahlberg
We never parsed/returned any data from .get_link() when the object is a windows reparse-point containing a symlink. This results in the VFS layer oopsing accessing an uninitialized buffer: ... [ 171.407172] Call Trace: [ 171.408039] readlink_copy+0x29/0x70 [ 171.408872] vfs_readlink+0xc1/0x1f0 [ 171.409709] ? readlink_copy+0x70/0x70 [ 171.410565] ? simple_attr_release+0x30/0x30 [ 171.411446] ? getname_flags+0x105/0x2a0 [ 171.412231] do_readlinkat+0x1b7/0x1e0 [ 171.412938] ? __ia32_compat_sys_newfstat+0x30/0x30 ... Fix this by adding code to handle these buffers and make sure we do return a valid buffer to .get_link() CC: Stable <stable@vger.kernel.org> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-06-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The new route handling in ip_mc_finish_output() from 'net' overlapped with the new support for returning congestion notifications from BPF programs. In order to handle this I had to take the dev_loopback_xmit() calls out of the switch statement. The aquantia driver conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28Merge tag 'for-linus-20190627' of ↵Linus Torvalds
gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux Pull pidfd fixes from Christian Brauner: "Userspace tools and libraries such as strace or glibc need a cheap and reliable way to tell whether CLONE_PIDFD is supported. The easiest way is to pass an invalid fd value in the return argument, perform the syscall and verify the value in the return argument has been changed to a valid fd. However, if CLONE_PIDFD is specified we currently check if pidfd == 0 and return EINVAL if not. The check for pidfd == 0 was originally added to enable us to abuse the return argument for passing additional flags along with CLONE_PIDFD in the future. However, extending legacy clone this way would be a terrible idea and with clone3 on the horizon and the ability to reuse CLONE_DETACHED with CLONE_PIDFD there's no real need for this clutch. So remove the pidfd == 0 check and help userspace out. Also, accordig to Al, anon_inode_getfd() should only be used past the point of no failure and ksys_close() should not be used at all since it is far too easy to get wrong. Al's motto being "basically, once it's in descriptor table, it's out of your control". So Al's patch switches back to what we already had in v1 of the original patchset and uses a anon_inode_getfile() + put_user() + fd_install() sequence in the success path and a fput() + put_unused_fd() in the failure path. The other two changes should be trivial" * tag 'for-linus-20190627' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux: proc: remove useless d_is_dir() check copy_process(): don't use ksys_close() on cleanups samples: make pidfd-metadata fail gracefully on older kernels fork: don't check parent_tidptr with CLONE_PIDFD
2019-06-28Merge tag 'afs-fixes-20190620' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes from David Howells: "The in-kernel AFS client has been undergoing testing on opendev.org on one of their mirror machines. They are using AFS to hold data that is then served via apache, and Ian Wienand had reported seeing oopses, spontaneous machine reboots and updates to volumes going missing. This patch series appears to have fixed the problem, very probably due to patch (2), but it's not 100% certain. (1) Fix the printing of the "vnode modified" warning to exclude checks on files for which we don't have a callback promise from the server (and so don't expect the server to tell us when it changes). Without this, for every file or directory for which we still have an in-core inode that gets changed on the server, we may get a message logged when we next look at it. This can happen in bulk if, for instance, someone does "vos release" to update a R/O volume from a R/W volume and a whole set of files are all changed together. We only really want to log a message if the file changed and the server didn't tell us about it or we failed to track the state internally. (2) Fix accidental corruption of either afs_vlserver struct objects or the the following memory locations (which could hold anything). The issue is caused by a union that points to two different structs in struct afs_call (to save space in the struct). The call cleanup code assumes that it can simply call the cleanup for one of those structs if not NULL - when it might be actually pointing to the other struct. This means that every Volume Location RPC op is going to corrupt something. (3) Fix an uninitialised spinlock. This isn't too bad, it just causes a one-off warning if lockdep is enabled when "vos release" is called, but the spinlock still behaves correctly. (4) Fix the setting of i_block in the inode. This causes du, for example, to produce incorrect results, but otherwise should not be dangerous to the kernel" * tag 'afs-fixes-20190620' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix setting of i_blocks afs: Fix uninitialised spinlock afs_volume::cb_break_lock afs: Fix vlserver record corruption afs: Fix over zealous "vnode modified" warnings
2019-06-27iomap: fix page_done callback for short writesAndreas Gruenbacher
When we truncate a short write to have it retried, pass the truncated length to the page_done callback instead of the full length. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-27fs: fold __generic_write_end back into generic_write_endChristoph Hellwig
This effectively reverts a6d639da63ae ("fs: factor out a __generic_write_end helper") as we now open code what is left of that helper in iomap. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-27iomap: don't mark the inode dirty in iomap_write_endAndreas Gruenbacher
Marking the inode dirty for each page copied into the page cache can be very inefficient for file systems that use the VFS dirty inode tracking, and is completely pointless for those that don't use the VFS dirty inode tracking. So instead, only set an iomap flag when changing the in-core inode size, and open code the rest of __generic_write_end. Partially based on code from Christoph Hellwig. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-27keys: Replace uid/gid/perm permissions checking with an ACLDavid Howells
Replace the uid/gid/perm permissions checking on a key with an ACL to allow the SETATTR and SEARCH permissions to be split. This will also allow a greater range of subjects to represented. ============ WHY DO THIS? ============ The problem is that SETATTR and SEARCH cover a slew of actions, not all of which should be grouped together. For SETATTR, this includes actions that are about controlling access to a key: (1) Changing a key's ownership. (2) Changing a key's security information. (3) Setting a keyring's restriction. And actions that are about managing a key's lifetime: (4) Setting an expiry time. (5) Revoking a key. and (proposed) managing a key as part of a cache: (6) Invalidating a key. Managing a key's lifetime doesn't really have anything to do with controlling access to that key. Expiry time is awkward since it's more about the lifetime of the content and so, in some ways goes better with WRITE permission. It can, however, be set unconditionally by a process with an appropriate authorisation token for instantiating a key, and can also be set by the key type driver when a key is instantiated, so lumping it with the access-controlling actions is probably okay. As for SEARCH permission, that currently covers: (1) Finding keys in a keyring tree during a search. (2) Permitting keyrings to be joined. (3) Invalidation. But these don't really belong together either, since these actions really need to be controlled separately. Finally, there are number of special cases to do with granting the administrator special rights to invalidate or clear keys that I would like to handle with the ACL rather than key flags and special checks. =============== WHAT IS CHANGED =============== The SETATTR permission is split to create two new permissions: (1) SET_SECURITY - which allows the key's owner, group and ACL to be changed and a restriction to be placed on a keyring. (2) REVOKE - which allows a key to be revoked. The SEARCH permission is split to create: (1) SEARCH - which allows a keyring to be search and a key to be found. (2) JOIN - which allows a keyring to be joined as a session keyring. (3) INVAL - which allows a key to be invalidated. The WRITE permission is also split to create: (1) WRITE - which allows a key's content to be altered and links to be added, removed and replaced in a keyring. (2) CLEAR - which allows a keyring to be cleared completely. This is split out to make it possible to give just this to an administrator. (3) REVOKE - see above. Keys acquire ACLs which consist of a series of ACEs, and all that apply are unioned together. An ACE specifies a subject, such as: (*) Possessor - permitted to anyone who 'possesses' a key (*) Owner - permitted to the key owner (*) Group - permitted to the key group (*) Everyone - permitted to everyone Note that 'Other' has been replaced with 'Everyone' on the assumption that you wouldn't grant a permit to 'Other' that you wouldn't also grant to everyone else. Further subjects may be made available by later patches. The ACE also specifies a permissions mask. The set of permissions is now: VIEW Can view the key metadata READ Can read the key content WRITE Can update/modify the key content SEARCH Can find the key by searching/requesting LINK Can make a link to the key SET_SECURITY Can change owner, ACL, expiry INVAL Can invalidate REVOKE Can revoke JOIN Can join this keyring CLEAR Can clear this keyring The KEYCTL_SETPERM function is then deprecated. The KEYCTL_SET_TIMEOUT function then is permitted if SET_SECURITY is set, or if the caller has a valid instantiation auth token. The KEYCTL_INVALIDATE function then requires INVAL. The KEYCTL_REVOKE function then requires REVOKE. The KEYCTL_JOIN_SESSION_KEYRING function then requires JOIN to join an existing keyring. The JOIN permission is enabled by default for session keyrings and manually created keyrings only. ====================== BACKWARD COMPATIBILITY ====================== To maintain backward compatibility, KEYCTL_SETPERM will translate the permissions mask it is given into a new ACL for a key - unless KEYCTL_SET_ACL has been called on that key, in which case an error will be returned. It will convert possessor, owner, group and other permissions into separate ACEs, if each portion of the mask is non-zero. SETATTR permission turns on all of INVAL, REVOKE and SET_SECURITY. WRITE permission turns on WRITE, REVOKE and, if a keyring, CLEAR. JOIN is turned on if a keyring is being altered. The KEYCTL_DESCRIBE function translates the ACL back into a permissions mask to return depending on possessor, owner, group and everyone ACEs. It will make the following mappings: (1) INVAL, JOIN -> SEARCH (2) SET_SECURITY -> SETATTR (3) REVOKE -> WRITE if SETATTR isn't already set (4) CLEAR -> WRITE Note that the value subsequently returned by KEYCTL_DESCRIBE may not match the value set with KEYCTL_SETATTR. ======= TESTING ======= This passes the keyutils testsuite for all but a couple of tests: (1) tests/keyctl/dh_compute/badargs: The first wrong-key-type test now returns EOPNOTSUPP rather than ENOKEY as READ permission isn't removed if the type doesn't have ->read(). You still can't actually read the key. (2) tests/keyctl/permitting/valid: The view-other-permissions test doesn't work as Other has been replaced with Everyone in the ACL. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-27keys: Pass the network namespace into request_key mechanismDavid Howells
Create a request_key_net() function and use it to pass the network namespace domain tag into DNS revolver keys and rxrpc/AFS keys so that keys for different domains can coexist in the same keyring. Signed-off-by: David Howells <dhowells@redhat.com> cc: netdev@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-afs@lists.infradead.org
2019-06-27gfs2: replace more printk with calls to fs_info and friendsBob Peterson
This patch replaces a few leftover printk errors with calls to fs_info and similar, so that the file system having the error is properly logged. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27gfs2: dump fsid when dumping glock problemsBob Peterson
Before this patch, if a glock error was encountered, the glock with the problem was dumped. But sometimes you may have lots of file systems mounted, and that doesn't tell you which file system it was for. This patch adds a new boolean parameter fsid to the dump_glock family of functions. For non-error cases, such as dumping the glocks debugfs file, the fsid is not dumped in order to keep lock dumps and glocktop as clean as possible. For all error cases, such as GLOCK_BUG_ON, the file system id is now printed. This will make it easier to debug. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27gfs2: simplify gfs2_freeze by removing caseBob Peterson
Function gfs2_freeze had a case statement that simply checked the error code, but the break statements just made the logic hard to read. This patch simplifies the logic in favor of a simple if. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27gfs2: Rename SDF_SHUTDOWN to SDF_WITHDRAWNBob Peterson
Before this patch, the superblock flag indicating when a file system is withdrawn was called SDF_SHUTDOWN. This patch simply renames it to the more obvious SDF_WITHDRAWN. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27gfs2: Warn when a journal replay overwrites a rgrp with buffersBob Peterson
This patch adds some instrumentation in gfs2's journal replay that indicates when we're about to overwrite a rgrp for which we already have a valid buffer_head. When this problem occurs, it's a situation in which this node has been granted a rgrp glock and subsequently read in buffer_heads for it, and possibly even made changes to the rgrp bits and/or allocation values. But now another node has failed and forced us to replay its journal, but its journal contains a copy of the same rgrp, without a revoke, which means we're about to overwrite a rgrp that we now rightfully own, with an obsolete copy. That is always a problem. It means the other node (which failed and left its journal to be replayed) failed to flush out its rgrp buffers, write out the revoke, and invalidate its copy before it released the glock to our possession. No node should ever release a glock until its metadata has been written to the journal and revoked and invalidated.. We also kludge around the problem and refuse to replace our good copy with the journals bad copy by not marking the buffer dirty, but never do it silently. That's wallpapering over a larger problem that still exists. IOW, if this situation can happen to this node, it can also happen to a different node and we wouldn't even know it or be able to circumvent it: Suppose we have a 3-node cluster: Node 1 fails, leaving an obsolete rgrp block in its journal without a revoke. Node 2 grabs the rgrp as soon as the rgrp glock is released and starts making changes, allocating and freeing blocks from the rgrp, etc. Node 3 replays the journal from node 1, oblivious and unaware that it's about to overwrite node 2's changes. So we still need to be vocal and log the error to make it apparent that a corruption path still exists in gfs2. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27gfs2: log which portion of the journal is replayedBob Peterson
When a journal is replayed, gfs2 logs a message similar to: jid=X: Replaying journal... This patch adds the tail and block number so that the range of the replayed block is also printed. These values will match the values shown if the journal is dumped with gfs2_edit -p journalX. The resulting output looks something like this: jid=1: Replaying journal...0x28b7 to 0x2beb This will allow us to better debug file system corruption problems. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27gfs2: eliminate tr_num_revoke_rmBob Peterson
For its journal processing, gfs2 kept track of the number of buffers added and removed on a per-transaction basis. These values are used to calculate space needed in the journal. But while these calculations make sense for the number of buffers, they make no sense for revokes. Revokes are managed in their own list, linked from the superblock. So it's entirely unnecessary to keep separate per-transaction counts for revokes added and removed. A single count will do the same job. Therefore, this patch combines the transaction revokes into a single count. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27gfs2: kthread and remount improvementsBob Peterson
Before this patch, gfs2 saved the pointers to the two daemon threads (logd and quotad) in the superblock, but they were never cleared, even if the threads were stopped (e.g. on remount -o ro). That meant that certain error conditions (like a withdrawn file system) could race. For example, xfstests generic/361 caused an IO error during remount -o ro, which caused the kthreads to be stopped, then the error flagged. Later, when the test unmounted the file system, it would try to stop the threads a second time with kthread_stop. This patch does two things: First, every time it stops the threads it zeroes out the thread pointer, and also checks whether it's NULL before trying to stop it. Second, in function gfs2_remount_fs, it was returning if an error was logged by either of the two functions for gfs2_make_fs_ro and _rw, which caused it to bypass the online uevent at the bottom of the function. This removes that bypass in favor of just running the whole function, then returning the error. That way, unmounts and remounts won't hang forever. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27gfs2: Use IS_ERR_OR_NULLKefeng Wang
Use IS_ERR_OR_NULL where appropriate. (Several more places converted by Andreas.) Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27gfs2: Clean up freeing struct gfs2_sbdAndreas Gruenbacher
Add a free_sbd function for freeing a struct gfs2_sbd. Use that for freeing a super-block descriptor, either directly or via kobject_put. Free sd_lkstats inside the kobject release function: that way, gfs2_put_super will no longer leak sd_lkstats. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27fscrypt: remove selection of CONFIG_CRYPTO_SHA256Eric Biggers
fscrypt only uses SHA-256 for AES-128-CBC-ESSIV, which isn't the default and is only recommended on platforms that have hardware accelerated AES-CBC but not AES-XTS. There's no link-time dependency, since SHA-256 is requested via the crypto API on first use. To reduce bloat, we should limit FS_ENCRYPTION to selecting the default algorithms only. SHA-256 by itself isn't that much bloat, but it's being discussed to move ESSIV into a crypto API template, which would incidentally bring in other things like "authenc" support, which would all end up being built-in since FS_ENCRYPTION is now a bool. For Adiantum encryption we already just document that users who want to use it have to enable CONFIG_CRYPTO_ADIANTUM themselves. So, let's do the same for AES-128-CBC-ESSIV and CONFIG_CRYPTO_SHA256. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-06-27ceph: fix ceph_mdsc_build_path to not stop on first componentJeff Layton
When ceph_mdsc_build_path is handed a positive dentry, it will return a zero-length path string with the base set to that dentry. This is not what we want. Always include at least one path component in the string. ceph_mdsc_build_path has behaved this way for a long time but it didn't matter until recent d_name handling rework. Fixes: 964fff7491e4 ("ceph: use ceph_mdsc_build_path instead of clone_dentry_name") Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-06-27proc: remove useless d_is_dir() checkChristian Brauner
Remove the d_is_dir() check from tgid_pidfd_to_pid(). It is pointless since you should never get &proc_tgid_base_operations for f_op on a non-directory. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Christian Brauner <christian@brauner.io>
2019-06-26fs/adfs: add time stamp and file type helpersRussell King
Add some helpers to check whether the inode has a time stamp and file type, and to parse the file type from the load address. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-26fs/adfs: super: limit idlen according to directory typeRussell King
Limit idlen according to the directory type, as idlen (the size of a fragment ID) can not be more than 16 with the "new directory" layout. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-26fs/adfs: super: fix use-after-free bugRussell King
Fix a use-after-free bug during filesystem initialisation, where we access the disc record (which is stored in a buffer) after we have released the buffer. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-26fs/adfs: super: safely update options on remountRussell King
Only update the options on remount if we successfully parse all options, rather than updating those we've managed to parse. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-26fs/adfs: super: correct superblock flagsRussell King
We don't support atime updates of any kind, and we ought to set the read-only bit if we are compiled without write support. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-26fs/adfs: clean up indirect disc addresses and fragment IDsRussell King
We use a variety of different names for the indirect disc address of the current object, use a variety of different types, and print it in a variety of different ways. Bring some consistency to this by naming it "indaddr", use u32 or __u32 as the type since it fits in 32-bits, and always print it with %06x (with no leading hex prefix.) When printing it was a directory identifer, use "dir %06x" otherwise use "object %06x". Do the same for fragment IDs and the parent indirect disc addresses. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-26fs/adfs: clean up error message printingRussell King
Overhaul our message printing: - provide a consistent way to print messages: - filesystem corruption should be reported via adfs_error() - everything else should use adfs_msg() - clean up the error message printing when mounting a filesystem - fix the messages printed by the big directory format code to only use adfs_error() when there is filesystem corruption, otherwise use adfs_msg(). Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-26fs/adfs: use %pV for error messagesRussell King
Rather than using vsnprintf() with a temporary buffer on the stack, use %pV to print error messages. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-26fs/adfs: use format_version from disc_recordRussell King
We only use the format version in one place during filesystem mount, so it is pointless storing it in the superblock structure. Also, we should be using the version from the disc record in the map rather than the boot block. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-26fs/adfs: add helper to get filesystem sizeRussell King
Add a helper to get the filesystem size from the disc record and eliminate the "s_size" member of the adfs superblock structure. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-26fs/adfs: add helper to get discrecord from mapRussell King
Add a helper to get the disc record from the map, rather than open coding this in adfs_fill_super(). Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-25quota: honor quota type in Q_XGETQSTAT[V] callsEric Sandeen
The code in quota_getstate and quota_getstatev is strange; it says the returned fs_quota_stat[v] structure has room for only one type of time limits, so fills it in with the first enabled quota, even though every quotactl command must have a type sent in by the user. Instead of just picking the first enabled quota, fill in the reply with the timers for the quota type that was actually requested. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-24Merge tag 'v5.2-rc6' into perf/core, to refresh branchIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24Merge tag 'v5.2-rc6' into sched/core, to refresh the branchIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24io_uring: add support for sqe linksJens Axboe
With SQE links, we can create chains of dependent SQEs. One example would be queueing an SQE that's a read from one file descriptor, with the linked SQE being a write to another with the same set of buffers. An SQE link will not stall the pipeline, it'll just ensure that dependent SQEs aren't issued before the previous link has completed. Any error at submission or completion time will break the chain of SQEs. For completions, this also includes short reads or writes, as the next SQE could depend on the previous one being fully completed. Any SQE in a chain that gets canceled due to any of the above errors, will get an CQE fill with -ECANCELED as the error value. Signed-off-by: Jens Axboe <axboe@kernel.dk>