summaryrefslogtreecommitdiff
path: root/fs/nfs/inode.c
AgeCommit message (Collapse)Author
2021-04-27Merge branch 'work.inode-type-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs inode type handling updates from Al Viro: "We should never change the type bits of ->i_mode or the method tables (->i_op and ->i_fop) of a live inode. Unfortunately, not all filesystems took care to prevent that" * 'work.inode-type-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: spufs: fix bogosity in S_ISGID handling 9p: missing chunk of "fs/9p: Don't update file type when updating file attributes" openpromfs: don't do unlock_new_inode() until the new inode is set up hostfs_mknod(): don't bother with init_special_inode() cifs: have cifs_fattr_to_inode() refuse to change type on live inode cifs: have ->mkdir() handle race with another client sanely do_cifs_create(): don't set ->i_mode of something we had not created gfs2: be careful with inode refresh ocfs2_inode_lock_update(): make sure we don't change the type bits of i_mode orangefs_inode_is_stale(): i_mode type bits do *not* form a bitmap... vboxsf: don't allow to change the inode type afs: Fix updating of i_mode due to 3rd party change ceph: don't allow type or device number to change on non-I_NEW inodes ceph: fix up error handling with snapdirs new helper: inode_wrong_type()
2021-04-14NFS: Split attribute support out from the server capabilitiesTrond Myklebust
There are lots of attributes, and they are crowding out the bit space. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-14NFS: Don't store NFS_INO_REVAL_FORCEDTrond Myklebust
NFS_INO_REVAL_FORCED is intended to tell us that the cache needs revalidation despite the fact that we hold a delegation. We shouldn't need to store it anymore, though. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFS: Another inode revalidation improvementTrond Myklebust
If we're trying to update the inode because a previous update left the cache in a partially unrevalidated state, then allow the update if the change attrs match. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFS: Use information about the change attribute to optimise updatesTrond Myklebust
If the NFSv4.2 server supports the 'change_attr_type' attribute, then allow the client to optimise its attribute cache update strategy. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFS: Simplify cache consistency in nfs_check_inode_attributes()Trond Myklebust
We should not be invalidating the access or acl caches in nfs_check_inode_attributes(), since the point is we're unsure about whether the contents of the struct nfs_fattr are fully up to date. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFS: Remove a line of code that has no effect in nfs_update_inode()Trond Myklebust
Commit 0b467264d0db ("NFS: Fix attribute revalidation") changed the way we populate the 'invalid' attribute, and made the line that strips away the NFS_INO_INVALID_ATTR bits redundant. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFS: Fix up handling of outstanding layoutcommit in nfs_update_inode()Trond Myklebust
If there is an outstanding layoutcommit, then the list of attributes whose values are expected to change is not the full set. So let's be explicit about the full list. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFS: Separate tracking of file mode cache validity from the uid/gidTrond Myklebust
chown()/chgrp() and chmod() are separate operations, and in addition, there are mode operations that are performed automatically by the server. So let's track mode validity separately from the file ownership validity. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFS: Separate tracking of file nlinks cache validity from the mode/uid/gidTrond Myklebust
Rename can cause us to revalidate the access cache, so lets track the nlinks separately from the mode/uid/gid. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-12NFS: Don't set NFS_INO_REVAL_PAGECACHE in the inode cache validityTrond Myklebust
It is no longer necessary to preserve the NFS_INO_REVAL_PAGECACHE flag. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-12NFS: Replace use of NFS_INO_REVAL_PAGECACHE when checking cache validityTrond Myklebust
When checking cache validity, be more specific than just 'we want to check the page cache validity'. In almost all cases, we want to check that change attribute, and possibly also the size. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-12NFS: Add a cache validity flag argument to nfs_revalidate_inode()Trond Myklebust
Add an argument to nfs_revalidate_inode() to allow callers to specify which attributes they need to check for validity. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-12NFS: nfs_setattr_update_inode() should clear the suid/sgid bitsTrond Myklebust
When we do a 'chown' or 'chgrp', the server will clear the suid/sgid bits. Ensure that we mirror that in nfs_setattr_update_inode(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-12NFS: Fix up statx() resultsTrond Myklebust
If statx has valid attributes available that weren't asked for, then return them and set the result mask appropriately. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-12NFS: Don't revalidate attributes that are not being asked forTrond Myklebust
If the user doesn't set STATX_UID/GID/MODE, then don't care if they are known to be stale. Ditto if we're not being asked for the file size. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-12NFS: Fix up revalidation of space usedTrond Myklebust
Ensure that when the change attribute or the size change, we also remember to revalidate the space used. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-12NFS: NFS_INO_REVAL_PAGECACHE should mark the change attribute invalidTrond Myklebust
When we're looking to revalidate the page cache, we should just ensure that we mark the change attribute invalid. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-12NFS: Mask out unsupported attributes in nfs_getattr()Trond Myklebust
We don't currently support STATX_BTIME, so don't advertise it in the return values for nfs_getattr(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-12NFS: Deal correctly with attribute generation counter overflowTrond Myklebust
We need to use unsigned long subtraction and then convert to signed in order to deal correcly with C overflow rules. Fixes: f5062003465c ("NFS: Set an attribute barrier on all updates") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-05NFS: Fix fscache invalidation in nfs_set_cache_invalid()Trond Myklebust
Ensure that we invalidate the fscache before we strip the NFS_INO_INVALID_DATA flag. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-03-08NFS: Fix open coded versions of nfs_set_cache_invalid() in NFSv4Trond Myklebust
nfs_set_cache_invalid() has code to handle delegations, and other optimisations, so let's use it when appropriate. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-03-08NFS: Fix open coded versions of nfs_set_cache_invalid()Trond Myklebust
nfs_set_cache_invalid() has code to handle delegations, and other optimisations, so let's use it when appropriate. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-03-08NFS: Clean up function nfs_mark_dir_for_revalidate()Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-03-08new helper: inode_wrong_type()Al Viro
inode_wrong_type(inode, mode) returns true if setting inode->i_mode to given value would've changed the inode type. We have enough of those checks open-coded to make a helper worthwhile. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2021-02-26Merge tag 'nfs-for-5.12-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS Client Updates from Anna Schumaker: "New Features: - Support for eager writes, and the write=eager and write=wait mount options - Other Bugfixes and Cleanups: - Fix typos in some comments - Fix up fall-through warnings for Clang - Cleanups to the NFS readpage codepath - Remove FMR support in rpcrdma_convert_iovs() - Various other cleanups to xprtrdma - Fix xprtrdma pad optimization for servers that don't support RFC 8797 - Improvements to rpcrdma tracepoints - Fix up nfs4_bitmask_adjust() - Optimize sparse writes past the end of files" * tag 'nfs-for-5.12-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits) NFS: Support the '-owrite=' option in /proc/self/mounts and mountinfo NFS: Set the stable writes flag when initialising the super block NFS: Add mount options supporting eager writes NFS: Add support for eager writes NFS: 'flags' field should be unsigned in struct nfs_server NFS: Don't set NFS_INO_INVALID_XATTR if there is no xattr cache NFS: Always clear an invalid mapping when attempting a buffered write NFS: Optimise sparse writes past the end of file NFS: Fix documenting comment for nfs_revalidate_file_size() NFSv4: Fixes for nfs4_bitmask_adjust() xprtrdma: Clean up rpcrdma_prepare_readch() rpcrdma: Capture bytes received in Receive completion tracepoints xprtrdma: Pad optimization, revisited rpcrdma: Fix comments about reverse-direction operation xprtrdma: Refactor invocations of offset_in_page() xprtrdma: Simplify rpcrdma_convert_kvec() and frwr_map() xprtrdma: Remove FMR support in rpcrdma_convert_iovs() NFS: Add nfs_pageio_complete_read() and remove nfs_readpage_async() NFS: Call readpage_async_filler() from nfs_readpage_async() NFS: Refactor nfs_readpage() and nfs_readpage_async() to use nfs_readdesc ...
2021-02-09NFS: Don't set NFS_INO_INVALID_XATTR if there is no xattr cacheTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-02-08NFS: Always clear an invalid mapping when attempting a buffered writeTrond Myklebust
If the page cache is invalid, then we can't do read-modify-write, so ensure that we do clear it when we know it is invalid. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-01-24fs: make helpers idmap mount awareChristian Brauner
Extend some inode methods with an additional user namespace argument. A filesystem that is aware of idmapped mounts will receive the user namespace the mount has been marked with. This can be used for additional permission checking and also to enable filesystems to translate between uids and gids if they need to. We have implemented all relevant helpers in earlier patches. As requested we simply extend the exisiting inode method instead of introducing new ones. This is a little more code churn but it's mostly mechanical and doesnt't leave us with additional inode methods. Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24stat: handle idmapped mountsChristian Brauner
The generic_fillattr() helper fills in the basic attributes associated with an inode. Enable it to handle idmapped mounts. If the inode is accessed through an idmapped mount map it into the mount's user namespace before we store the uid and gid. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-12-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-12-02NFS: switch nfsiod to be an UNBOUND workqueue.NeilBrown
nfsiod is currently a concurrency-managed workqueue (CMWQ). This means that workitems scheduled to nfsiod on a given CPU are queued behind all other work items queued on any CMWQ on the same CPU. This can introduce unexpected latency. Occaionally nfsiod can even cause excessive latency. If the work item to complete a CLOSE request calls the final iput() on an inode, the address_space of that inode will be dismantled. This takes time proportional to the number of in-memory pages, which on a large host working on large files (e.g.. 5TB), can be a large number of pages resulting in a noticable number of seconds. We can avoid these latency problems by switching nfsiod to WQ_UNBOUND. This causes each concurrent work item to gets a dedicated thread which can be scheduled to an idle CPU. There is precedent for this as several other filesystems use WQ_UNBOUND workqueue for handling various async events. Signed-off-by: NeilBrown <neilb@suse.de> Fixes: ada609ee2ac2 ("workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUER") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-02NFS: Improve handling of directory verifiersTrond Myklebust
If the server insists on using the readdir verifiers in order to allow cookies to expire, then we should ensure that we cache the verifier with the cookie, so that we can return an error if the application tries to use the expired cookie. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
2020-07-17Merge branch 'xattr-devel'Trond Myklebust
2020-07-13NFSv4.2: add client side xattr caching.Frank van der Linden
Implement client side caching for NFSv4.2 extended attributes. The cache is a per-inode hashtable, with name/value entries. There is one special entry for the listxattr cache. NFS inodes have a pointer to a cache structure. The cache structure is allocated on demand, freed when the cache is invalidated. Memory shrinkers keep the size in check. Large entries (> PAGE_SIZE) are collected by a separate shrinker, and freed more aggressively than others. Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-07-13nfs: define and use the NFS_INO_INVALID_XATTR flagFrank van der Linden
Define the NFS_INO_INVALID_XATTR flag, to be used for the NFSv4.2 xattr cache, and use it where appropriate. No functional change as yet. Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-07-12NFS: Allow applications to speed up readdir+statx() using AT_STATX_DONT_SYNCTrond Myklebust
If the application uses the AT_STATX_DONT_SYNC flag after doing readdir(), then we should still mark the parent inode as seeing a readdirplus hit. That ensures that we continue to use readdirplus in the 'ls -l' type of workflow to do fast lookups of the dentries. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-06-11nfs: set invalid blocks after NFSv4 writesZheng Bin
Use the following command to test nfsv4(size of file1M is 1MB): mount -t nfs -o vers=4.0,actimeo=60 127.0.0.1/dir1 /mnt cp file1M /mnt du -h /mnt/file1M -->0 within 60s, then 1M When write is done(cp file1M /mnt), will call this: nfs_writeback_done nfs4_write_done nfs4_write_done_cb nfs_writeback_update_inode nfs_post_op_update_inode_force_wcc_locked(change, ctime, mtime nfs_post_op_update_inode_force_wcc_locked nfs_set_cache_invalid nfs_refresh_inode_locked nfs_update_inode nfsd write response contains change, ctime, mtime, the flag will be clear after nfs_update_inode. Howerver, write response does not contain space_used, previous open response contains space_used whose value is 0, so inode->i_blocks is still 0. nfs_getattr -->called by "du -h" do_update |= force_sync || nfs_attribute_cache_expired -->false in 60s cache_validity = READ_ONCE(NFS_I(inode)->cache_validity) do_update |= cache_validity & (NFS_INO_INVALID_ATTR -->false if (do_update) { __nfs_revalidate_inode } Within 60s, does not send getattr request to nfsd, thus "du -h /mnt/file1M" is 0. Add a NFS_INO_INVALID_BLOCKS flag, set it when nfsv4 write is done. Fixes: 16e143751727 ("NFS: More fine grained attribute tracking") Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-04-06NFS: Clean up process of marking inode stale.Trond Myklebust
Instead of the various open coded calls to set the NFS_INO_STALE bit and call nfs_zap_caches(), consolidate them into a single function nfs_set_inode_stale(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-03-16NFS: alloc_nfs_open_context() must use the file cred when availableTrond Myklebust
If we're creating a nfs_open_context() for a specific file pointer, we must use the cred assigned to that file. Fixes: a52458b48af1 ("NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'.") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-02-12NFSv4: Fix revalidation of dentries with delegationsTrond Myklebust
If a dentry was not initially looked up while we were holding a delegation, then we do still need to revalidate that it still holds the same name. If there are multiple hard links to the same file, then all the hard links need validation. Reported-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> [Anna: Put nfs_unset_verifier_delegated() under CONFIG_NFS_V4] Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-02-03NFS: nfs_find_open_context() should use cred_fscmp()Trond Myklebust
We want to find open contexts that match our filesystem access properties. They don't have to exactly match the cred. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15NFS: Add mount option 'softreval'Trond Myklebust
Add a mount option 'softreval' that allows attribute revalidation 'getattr' calls to time out, and causes them to fall back to using the cached attributes. The use case for this option is for ensuring that we can still (slowly) traverse paths and use cached information even when the server is down. Once the server comes back up again, the getattr calls start succeeding, and the caches will revalidate as usual. The 'softreval' mount option is automatically enabled if you have specified 'softerr'. It can be turned off using the options 'nosoftreval', or 'hard'. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-11-03NFS: Convert struct nfs_fattr to use struct timespec64Trond Myklebust
NFSv4 supports 64-bit times, so we should switch to using struct timespec64 when decoding attributes. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-09-02NFS: Fix inode fileid checks in attribute revalidation codeTrond Myklebust
We want to throw out the attrbute if it refers to the mounted on fileid, and not the real fileid. However we do not want to block cache consistency updates from NFSv4 writes. Reported-by: Murphy Zhou <jencce.kernel@gmail.com> Fixes: 7e10cc25bfa0 ("NFS: Don't refresh attributes with mounted-on-file...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-19NFS: Don't refresh attributes with mounted-on-file informationTrond Myklebust
If we've been given the attributes of the mounted-on-file, then do not use those to check or update the attributes on the application-visible inode. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-06Merge branch 'containers'Trond Myklebust
2019-07-06NFS: Add deferred cache invalidation for close-to-open consistency violationsTrond Myklebust
If the client detects that close-to-open cache consistency has been violated, and that the file or directory has been changed on the server, then do a cache invalidation when we're done working with the file. The reason we don't do an immediate cache invalidation is that we want to avoid performance problems due to false positives. Also, note that we cannot guarantee cache consistency in this situation even if we do invalidate the cache. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-06NFS: Cleanup - add nfs_clients_exit to mirror nfs_clients_initTrond Myklebust
Add a helper to clean up the struct nfs_net when it is being destroyed. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-06NFS: Create a root NFS directory in /sys/fs/nfsTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-06NFSv4: Handle the special Linux file open access modeTrond Myklebust
According to the open() manpage, Linux reserves the access mode 3 to mean "check for read and write permission on the file and return a file descriptor that can't be used for reading or writing." Currently, the NFSv4 code will ask the server to open the file, and will use an incorrect share access mode of 0. Since it has an incorrect share access mode, the client later forgets to send a corresponding close, meaning it can leak stateids on the server. Fixes: ce4ef7c0a8a05 ("NFS: Split out NFS v4 file operations") Cc: stable@vger.kernel.org # 3.6+ Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>