summaryrefslogtreecommitdiff
path: root/fs/overlayfs/copy_up.c
AgeCommit message (Collapse)Author
2025-02-19ovl: fix UAF in ovl_dentry_update_reval by moving dput() in ovl_link_upVasiliy Kovalev
The issue was caused by dput(upper) being called before ovl_dentry_update_reval(), while upper->d_flags was still accessed in ovl_dentry_remote(). Move dput(upper) after its last use to prevent use-after-free. BUG: KASAN: slab-use-after-free in ovl_dentry_remote fs/overlayfs/util.c:162 [inline] BUG: KASAN: slab-use-after-free in ovl_dentry_update_reval+0xd2/0xf0 fs/overlayfs/util.c:167 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:114 print_address_description mm/kasan/report.c:377 [inline] print_report+0xc3/0x620 mm/kasan/report.c:488 kasan_report+0xd9/0x110 mm/kasan/report.c:601 ovl_dentry_remote fs/overlayfs/util.c:162 [inline] ovl_dentry_update_reval+0xd2/0xf0 fs/overlayfs/util.c:167 ovl_link_up fs/overlayfs/copy_up.c:610 [inline] ovl_copy_up_one+0x2105/0x3490 fs/overlayfs/copy_up.c:1170 ovl_copy_up_flags+0x18d/0x200 fs/overlayfs/copy_up.c:1223 ovl_rename+0x39e/0x18c0 fs/overlayfs/dir.c:1136 vfs_rename+0xf84/0x20a0 fs/namei.c:4893 ... </TASK> Fixes: b07d5cc93e1b ("ovl: update of dentry revalidate flags after copy up") Reported-by: syzbot+316db8a1191938280eb6@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=316db8a1191938280eb6 Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org> Link: https://lore.kernel.org/r/20250214215148.761147-1-kovalev@altlinux.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-06ovl: pass realinode to ovl_encode_real_fh() instead of realdentryAmir Goldstein
We want to be able to encode an fid from an inode with no alias. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20250105162404.357058-2-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-12-19fs: relax assertions on failure to encode file handlesAmir Goldstein
Encoding file handles is usually performed by a filesystem >encode_fh() method that may fail for various reasons. The legacy users of exportfs_encode_fh(), namely, nfsd and name_to_handle_at(2) syscall are ready to cope with the possibility of failure to encode a file handle. There are a few other users of exportfs_encode_{fh,fid}() that currently have a WARN_ON() assertion when ->encode_fh() fails. Relax those assertions because they are wrong. The second linked bug report states commit 16aac5ad1fa9 ("ovl: support encoding non-decodable file handles") in v6.6 as the regressing commit, but this is not accurate. The aforementioned commit only increases the chances of the assertion and allows triggering the assertion with the reproducer using overlayfs, inotify and drop_caches. Triggering this assertion was always possible with other filesystems and other reasons of ->encode_fh() failures and more particularly, it was also possible with the exact same reproducer using overlayfs that is mounted with options index=on,nfs_export=on also on kernels < v6.6. Therefore, I am not listing the aforementioned commit as a Fixes commit. Backport hint: this patch will have a trivial conflict applying to v6.6.y, and other trivial conflicts applying to stable kernels < v6.6. Reported-by: syzbot+ec07f6f5ce62b858579f@syzkaller.appspotmail.com Tested-by: syzbot+ec07f6f5ce62b858579f@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-unionfs/671fd40c.050a0220.4735a.024f.GAE@google.com/ Reported-by: Dmitry Safonov <dima@arista.com> Closes: https://lore.kernel.org/linux-fsdevel/CAGrbwDTLt6drB9eaUagnQVgdPBmhLfqqxAf3F+Juqy_o6oP8uw@mail.gmail.com/ Cc: stable@vger.kernel.org Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20241219115301.465396-1-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-22Merge tag 'ovl-update-6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs Pull overlayfs updates from Amir Goldstein: - Fix a syzbot reported NULL pointer deref with bfs lower layers - Fix a copy up failure of large file from lower fuse fs - Followup cleanup of backing_file API from Miklos - Introduction and use of revert/override_creds_light() helpers, that were suggested by Christian as a mitigation to cache line bouncing and false sharing of fields in overlayfs creator_cred long lived struct cred copy. - Store up to two backing file references (upper and lower) in an ovl_file container instead of storing a single backing file in file->private_data. This is used to avoid the practice of opening a short lived backing file for the duration of some file operations and to avoid the specialized use of FDPUT_FPUT in such occasions, that was getting in the way of Al's fd_file() conversions. * tag 'ovl-update-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs: ovl: Filter invalid inodes with missing lookup function ovl: convert ovl_real_fdget() callers to ovl_real_file() ovl: convert ovl_real_fdget_path() callers to ovl_real_file_path() ovl: store upper real file in ovl_file struct ovl: allocate a container struct ovl_file for ovl private context ovl: do not open non-data lower file for fsync ovl: Optimize override/revert creds ovl: pass an explicit reference of creators creds to callers ovl: use wrapper ovl_revert_creds() fs/backing-file: Convert to revert/override_creds_light() cred: Add a light version of override/revert_creds() backing-file: clean up the API ovl: properly handle large files in ovl_security_fileattr
2024-11-11ovl: use wrapper ovl_revert_creds()Vinicius Costa Gomes
Introduce ovl_revert_creds() wrapper of revert_creds() to match callers of ovl_override_creds(). Suggested-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2024-10-07remove pointless includes of <linux/fdtable.h>Al Viro
some of those used to be needed, some had been cargo-culted for no reason... Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-09-19Merge tag 'ovl-update-6.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs Pull overlayfs updates from Amir Goldstein: - Increase robustness of overlayfs to crashes in the case of underlying filesystems that to not guarantee metadata ordering to persistent storage (problem was reported with ubifs). - Deny mount inside container with features that require root privileges to work properly, instead of failing operations later. - Some clarifications to overlayfs documentation. * tag 'ovl-update-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs: ovl: fail if trusted xattrs are needed but caller lacks permission overlayfs.rst: update metacopy section in overlayfs documentation ovl: fsync after metadata copy-up ovl: don't set the superblock's errseq_t manually
2024-08-30ovl: fsync after metadata copy-upAmir Goldstein
For upper filesystems which do not use strict ordering of persisting metadata changes (e.g. ubifs), when overlayfs file is modified for the first time, copy up will create a copy of the lower file and its parent directories in the upper layer. Permission lost of the new upper parent directory was observed during power-cut stress test. Fix by moving the fsync call to after metadata copy to make sure that the metadata copied up directory and files persists to disk before renaming from tmp to final destination. With metacopy enabled, this change will hurt performance of workloads such as chown -R, so we keep the legacy behavior of fsync only on copyup of data. Link: https://lore.kernel.org/linux-unionfs/CAOQ4uxj-pOvmw1-uXR3qVdqtLjSkwcR9nVKcNU_vC10Zyf2miQ@mail.gmail.com/ Reported-and-tested-by: Fei Lv <feilv@asrmicro.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2024-07-31lsm: Refactor return value of LSM hook inode_copy_up_xattrXu Kuohai
To be consistent with most LSM hooks, convert the return value of hook inode_copy_up_xattr to 0 or a negative error code. Before: - Hook inode_copy_up_xattr returns 0 when accepting xattr, 1 when discarding xattr, -EOPNOTSUPP if it does not know xattr, or any other negative error code otherwise. After: - Hook inode_copy_up_xattr returns 0 when accepting xattr, *-ECANCELED* when discarding xattr, -EOPNOTSUPP if it does not know xattr, or any other negative error code otherwise. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-04-09security: allow finer granularity in permitting copy-up of security xattrsStefan Berger
Copying up xattrs is solely based on the security xattr name. For finer granularity add a dentry parameter to the security_inode_copy_up_xattr hook definition, allowing decisions to be based on the xattr content as well. Co-developed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Amir Goldstein <amir73il@gmail.com> Acked-by: Paul Moore <paul@paul-moore.com> (LSM,SELinux) Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2024-03-17ovl: relax WARN_ON in ovl_verify_area()Amir Goldstein
syzbot hit an assertion in copy up data loop which looks like it is the result of a lower file whose size is being changed underneath overlayfs. This type of use case is documented to cause undefined behavior, so returning EIO error for the copy up makes sense, but it should not be causing a WARN_ON assertion. Reported-and-tested-by: syzbot+3abd99031b42acf367ef@syzkaller.appspotmail.com Fixes: ca7ab482401c ("ovl: add permission hooks outside of do_splice_direct()") Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2024-02-06remap_range: merge do_clone_file_range() into vfs_clone_file_range()Amir Goldstein
commit dfad37051ade ("remap_range: move permission hooks out of do_clone_file_range()") moved the permission hooks from do_clone_file_range() out to its caller vfs_clone_file_range(), but left all the fast sanity checks in do_clone_file_range(). This makes the expensive security hooks be called in situations that they would not have been called before (e.g. fs does not support clone). The only reason for the do_clone_file_range() helper was that overlayfs did not use to be able to call vfs_clone_file_range() from copy up context with sb_writers lock held. However, since commit c63e56a4a652 ("ovl: do not open/llseek lower file with upper sb_writers held"), overlayfs just uses an open coded version of vfs_clone_file_range(). Merge_clone_file_range() into vfs_clone_file_range(), restoring the original order of checks as it was before the regressing commit and adapt the overlayfs code to call vfs_clone_file_range() before the permission hooks that were added by commit ca7ab482401c ("ovl: add permission hooks outside of do_splice_direct()"). Note that in the merge of do_clone_file_range(), the file_start_write() context was reduced to cover ->remap_file_range() without holding it over the permission hooks, which was the reason for doing the regressing commit in the first place. Reported-and-tested-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202401312229.eddeb9a6-oliver.sang@intel.com Fixes: dfad37051ade ("remap_range: move permission hooks out of do_clone_file_range()") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20240202102258.1582671-1-amir73il@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-01-11Merge tag 'pull-rename' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull rename updates from Al Viro: "Fix directory locking scheme on rename This was broken in 6.5; we really can't lock two unrelated directories without holding ->s_vfs_rename_mutex first and in case of same-parent rename of a subdirectory 6.5 ends up doing just that" * tag 'pull-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: rename(): avoid a deadlock in the case of parents having no common ancestor kill lock_two_inodes() rename(): fix the locking of subdirectories f2fs: Avoid reading renamed directory if parent does not change ext4: don't access the source subdirectory content on same-directory rename ext2: Avoid reading renamed directory if parent does not change udf_rename(): only access the child content on cross-directory rename ocfs2: Avoid touching renamed directory if parent does not change reiserfs: Avoid touching renamed directory if parent does not change
2024-01-10Merge tag 'ovl-update-6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs Pull overlayfs updates from Amir Goldstein: "This is a very small update with no bug fixes and no new features. The larger update of overlayfs for this cycle, the re-factoring of overlayfs code into generic backing_file helpers, was already merged via Christian. Summary: - Simplify/clarify some code No bug fixes here, just some changes following questions from Al about overlayfs code that could be a little more simple to follow. - Overlayfs documentation style fixes Mainly fixes for ReST formatting suggested by documentation developers" * tag 'ovl-update-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs: overlayfs.rst: fix ReST formatting overlayfs.rst: use consistent feature names ovl: initialize ovl_copy_up_ctx.destname inside ovl_do_copy_up() ovl: remove redundant ofs->indexdir member
2024-01-08Merge tag 'vfs-6.8.rw' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfsLinus Torvalds
Pull vfs rw updates from Christian Brauner: "This contains updates from Amir for read-write backing file helpers for stacking filesystems such as overlayfs: - Fanotify is currently in the process of introducing pre content events. Roughly, a new permission event will be added indicating that it is safe to write to the file being accessed. These events are used by hierarchical storage managers to e.g., fill the content of files on first access. During that work we noticed that our current permission checking is inconsistent in rw_verify_area() and remap_verify_area(). Especially in the splice code permission checking is done multiple times. For example, one time for the whole range and then again for partial ranges inside the iterator. In addition, we mostly do permission checking before we call file_start_write() except for a few places where we call it after. For pre-content events we need such permission checking to be done before file_start_write(). So this is a nice reason to clean this all up. After this series, all permission checking is done before file_start_write(). As part of this cleanup we also massaged the splice code a bit. We got rid of a few helpers because we are alredy drowning in special read-write helpers. We also cleaned up the return types for splice helpers. - Introduce generic read-write helpers for backing files. This lifts some overlayfs code to common code so it can be used by the FUSE passthrough work coming in over the next cycles. Make Amir and Miklos the maintainers for this new subsystem of the vfs" * tag 'vfs-6.8.rw' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (30 commits) fs: fix __sb_write_started() kerneldoc formatting fs: factor out backing_file_mmap() helper fs: factor out backing_file_splice_{read,write}() helpers fs: factor out backing_file_{read,write}_iter() helpers fs: prepare for stackable filesystems backing file helpers fsnotify: optionally pass access range in file permission hooks fsnotify: assert that file_start_write() is not held in permission hooks fsnotify: split fsnotify_perm() into two hooks fs: use splice_copy_file_range() inline helper splice: return type ssize_t from all helpers fs: use do_splice_direct() for nfsd/ksmbd server-side-copy fs: move file_start_write() into direct_splice_actor() fs: fork splice_file_range() from do_splice_direct() fs: create {sb,file}_write_not_started() helpers fs: create file_write_started() helper fs: create __sb_write_started() helper fs: move kiocb_start_write() into vfs_iocb_iter_write() fs: move permission hook out of do_iter_read() fs: move permission hook out of do_iter_write() fs: move file_start_write() into vfs_iter_write() ...
2023-12-17ovl: fix dentry reference leak after changes to underlying layersAmir Goldstein
syzbot excercised the forbidden practice of moving the workdir under lowerdir while overlayfs is mounted and tripped a dentry reference leak. Fixes: c63e56a4a652 ("ovl: do not open/llseek lower file with upper sb_writers held") Reported-and-tested-by: syzbot+8608bb4553edb8c78f41@syzkaller.appspotmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-12-12splice: return type ssize_t from all helpersAmir Goldstein
Not sure why some splice helpers return long, maybe historic reasons. Change them all to return ssize_t to conform to the splice methods and to the rest of the helpers. Suggested-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20231208-horchen-helium-d3ec1535ede5@brauner/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231212094440.250945-2-amir73il@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-12-01fs: move file_start_write() into direct_splice_actor()Amir Goldstein
The callers of do_splice_direct() hold file_start_write() on the output file. This may cause file permission hooks to be called indirectly on an overlayfs lower layer, which is on the same filesystem of the output file and could lead to deadlock with fanotify permission events. To fix this potential deadlock, move file_start_write() from the callers into the direct_splice_actor(), so file_start_write() will not be held while splicing from the input file. Suggested-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20231128214258.GA2398475@perftesting/ Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231130141624.3338942-3-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-25rename(): avoid a deadlock in the case of parents having no common ancestorAl Viro
... and fix the directory locking documentation and proof of correctness. Holding ->s_vfs_rename_mutex *almost* prevents ->d_parent changes; the case where we really don't want it is splicing the root of disconnected tree to somewhere. In other words, ->s_vfs_rename_mutex is sufficient to stabilize "X is an ancestor of Y" only if X and Y are already in the same tree. Otherwise it can go from false to true, and one can construct a deadlock on that. Make lock_two_directories() report an error in such case and update the callers of lock_rename()/lock_rename_child() to handle such errors. And yes, such conditions are not impossible to create ;-/ Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-11-24ovl: add permission hooks outside of do_splice_direct()Amir Goldstein
The main callers of do_splice_direct() also call rw_verify_area() for the entire range that is being copied, e.g. by vfs_copy_file_range() or do_sendfile() before calling do_splice_direct(). The only caller that does not have those checks for entire range is ovl_copy_up_file(). In preparation for removing the checks inside do_splice_direct(), add rw_verify_area() call in ovl_copy_up_file(). For extra safety, perform minimal sanity checks from rw_verify_area() for non negative offsets also in the copy up do_splice_direct() loop without calling the file permission hooks. This is needed for fanotify "pre content" events. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231122122715.2561213-2-amir73il@gmail.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-20ovl: initialize ovl_copy_up_ctx.destname inside ovl_do_copy_up()Amir Goldstein
The ->destname member of struct ovl_copy_up_ctx is initialized inside ovl_copy_up_one() to ->d_name of the overlayfs dentry being copied up and then it may be overridden by index name inside ovl_do_copy_up(). ovl_inode_lock() in ovl_copy_up_start() and ovl_copy_up() in ovl_rename() effectively stabilze ->d_name of the overlayfs dentry being copied up, but ovl_inode_lock() is not held when ->d_name is being read. It is not a correctness bug, because if ovl_do_copy_up() races with ovl_rename() and ctx.destname is freed, we will not end up calling ovl_do_copy_up() with the dead name reference. The code becomes much easier to understand and to document if the initialization of c->destname is always done inside ovl_do_copy_up(), either to the index entry name, or to the overlay dentry ->d_name. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-10-31ovl: do not encode lower fh with upper sb_writers heldAmir Goldstein
When lower fs is a nested overlayfs, calling encode_fh() on a lower directory dentry may trigger copy up and take sb_writers on the upper fs of the lower nested overlayfs. The lower nested overlayfs may have the same upper fs as this overlayfs, so nested sb_writers lock is illegal. Move all the callers that encode lower fh to before ovl_want_write(). Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-10-31ovl: do not open/llseek lower file with upper sb_writers heldAmir Goldstein
overlayfs file open (ovl_maybe_lookup_lowerdata) and overlay file llseek take the ovl_inode_lock, without holding upper sb_writers. In case of nested lower overlay that uses same upper fs as this overlay, lockdep will warn about (possibly false positive) circular lock dependency when doing open/llseek of lower ovl file during copy up with our upper sb_writers held, because the locking ordering seems reverse to the locking order in ovl_copy_up_start(): - lower ovl_inode_lock - upper sb_writers Let the copy up "transaction" keeps an elevated mnt write count on upper mnt, but leaves taking upper sb_writers to lower level helpers only when they actually need it. This allows to avoid holding upper sb_writers during lower file open/llseek and prevents the lockdep warning. Minimizing the scope of upper sb_writers during copy up is also needed for fixing another possible deadlocks by a following patch. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-10-31ovl: reorder ovl_want_write() after ovl_inode_lock()Amir Goldstein
Make the locking order of ovl_inode_lock() strictly between the two vfs stacked layers, i.e.: - ovl vfs locks: sb_writers, inode_lock, ... - ovl_inode_lock - upper vfs locks: sb_writers, inode_lock, ... To that effect, move ovl_want_write() into the helpers ovl_nlink_start() and ovl_copy_up_start which currently take the ovl_inode_lock() after ovl_want_write(). Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-09-26Merge tag 'v6.6-rc4.vfs.fixes' of ↵Linus Torvalds
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "This contains the usual miscellaneous fixes and cleanups for vfs and individual fses: Fixes: - Revert ki_pos on error from buffered writes for direct io fallback - Add missing documentation for block device and superblock handling for changes merged this cycle - Fix reiserfs flexible array usage - Ensure that overlayfs sets ctime when setting mtime and atime - Disable deferred caller completions with overlayfs writes until proper support exists Cleanups: - Remove duplicate initialization in pipe code - Annotate aio kioctx_table with __counted_by" * tag 'v6.6-rc4.vfs.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: overlayfs: set ctime when setting mtime and atime ntfs3: put resources during ntfs_fill_super() ovl: disable IOCB_DIO_CALLER_COMP porting: document superblock as block device holder porting: document new block device opening order fs/pipe: remove duplicate "offset" initializer fs-writeback: do not requeue a clean inode having skipped pages aio: Annotate struct kioctx_table with __counted_by direct_write_fallback(): on error revert the ->ki_pos update from buffered write reiserfs: Replace 1-element array with C99 style flex-array
2023-09-25overlayfs: set ctime when setting mtime and atimeJeff Layton
Nathan reported that he was seeing the new warning in setattr_copy_mgtime pop when starting podman containers. Overlayfs is trying to set the atime and mtime via notify_change without also setting the ctime. POSIX states that when the atime and mtime are updated via utimes() that we must also update the ctime to the current time. The situation with overlayfs copy-up is analogies, so add ATTR_CTIME to the bitmask. notify_change will fill in the value. Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Amir Goldstein <amir73il@gmail.com> Message-Id: <20230913-ctime-v1-1-c6bc509cbc27@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-09-04ovl: fix failed copyup of fileattr on a symlinkAmir Goldstein
Some local filesystems support setting persistent fileattr flags (e.g. FS_NOATIME_FL) on directories and regular files via ioctl. Some of those persistent fileattr flags are reflected to vfs as in-memory inode flags (e.g. S_NOATIME). Overlayfs uses the in-memory inode flags (e.g. S_NOATIME) on a lower file as an indication that a the lower file may have persistent inode fileattr flags (e.g. FS_NOATIME_FL) that need to be copied to upper file. However, in some cases, the S_NOATIME in-memory flag could be a false indication for persistent FS_NOATIME_FL fileattr. For example, with NFS and FUSE lower fs, as was the case in the two bug reports, the S_NOATIME flag is set unconditionally for all inodes. Users cannot set persistent fileattr flags on symlinks and special files, but in some local fs, such as ext4/btrfs/tmpfs, the FS_NOATIME_FL fileattr flag are inheritted to symlinks and special files from parent directory. In both cases described above, when lower symlink has the S_NOATIME flag, overlayfs will try to copy the symlink's fileattrs and fail with error ENOXIO, because it could not open the symlink for the ioctl security hook. To solve this failure, do not attempt to copyup fileattrs for anything other than directories and regular files. Reported-by: Ruiwen Zhao <ruiwen@google.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217850 Fixes: 72db82115d2b ("ovl: copy up sync/noatime fileattr flags") Cc: <stable@vger.kernel.org> # v5.15 Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-08-12ovl: make consistent use of OVL_FS()Andrea Righi
Always use OVL_FS() to retrieve the corresponding struct ovl_fs from a struct super_block. Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-08-12ovl: add support for unique fsid per instanceAmir Goldstein
The legacy behavior of ovl_statfs() reports the f_fsid filled by underlying upper fs. This fsid is not unique among overlayfs instances on the same upper fs. With mount option uuid=on, generate a non-persistent uuid per overlayfs instance and use it as the seed for f_fsid, similar to tmpfs. This is useful for reporting fanotify events with fid info from different instances of overlayfs over the same upper fs. The old behavior of null uuid and upper fs fsid is retained with the mount option uuid=null, which is the default. The mount option uuid=off that disables uuid checks in underlying layers also retains the legacy behavior. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-08-12ovl: Handle verity during copy-upAlexander Larsson
During regular metacopy, if lowerdata file has fs-verity enabled, and the verity option is enabled, we add the digest to the metacopy xattr. If verity is required, and lowerdata does not have fs-verity enabled, fall back to full copy-up (or the generated metacopy would not validate). Signed-off-by: Alexander Larsson <alexl@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-08-12ovl: Validate verity xattr when resolving lowerdataAlexander Larsson
The new digest field in the metacopy xattr is used during lookup to record whether the header contained a digest in the OVL_HAS_DIGEST flags. When accessing file data the first time, if OVL_HAS_DIGEST is set, we reload the metadata and check that the source lowerdata inode matches the specified digest in it (according to the enabled verity options). If the verity check passes we store this info in the inode flags as OVL_VERIFIED_DIGEST, so that we can avoid doing it again if the inode remains in memory. The verification is done in ovl_maybe_validate_verity() which needs to be called in the same places as ovl_maybe_lookup_lowerdata(), so there is a new ovl_verify_lowerdata() helper that calls these in the right order, and all current callers of ovl_maybe_lookup_lowerdata() are changed to call it instead. Signed-off-by: Alexander Larsson <alexl@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-06-19ovl: implement lazy lookup of lowerdata in data-only layersAmir Goldstein
Defer lookup of lowerdata in the data-only layers to first data access or before copy up. We perform lowerdata lookup before copy up even if copy up is metadata only copy up. We can further optimize this lookup later if needed. We do best effort lazy lookup of lowerdata for d_real_inode(), because this interface does not expect errors. The only current in-tree caller of d_real_inode() is trace_uprobe and this caller is likely going to be followed reading from the file, before placing uprobes on offset within the file, so lowerdata should be available when setting the uprobe. Tested-by: kernel test robot <oliver.sang@intel.com> Reviewed-by: Alexander Larsson <alexl@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19ovl: update of dentry revalidate flags after copy upAmir Goldstein
After copy up, we may need to update d_flags if upper dentry is on a remote fs and lower dentries are not. Add helpers to allow incremental update of the revalidate flags. Fixes: bccece1ead36 ("ovl: allow remote upper") Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-03-06ovl: check for ->listxattr() supportChristian Brauner
We have decoupled vfs_listxattr() from IOP_XATTR. Instead we just need to check whether inode->i_op->listxattr is implemented. Cc: linux-unionfs@vger.kernel.org Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2023-01-27ovl: fail on invalid uid/gid mapping at copy upMiklos Szeredi
If st_uid/st_gid doesn't have a mapping in the mounter's user_ns, then copy-up should fail, just like it would fail if the mounter task was doing the copy using "cp -a". There's a corner case where the "cp -a" would succeed but copy up fail: if there's a mapping of the invalid uid/gid (65534 by default) in the user namespace. This is because stat(2) will return this value if the mapping doesn't exist in the current user_ns and "cp -a" will in turn be able to create a file with this uid/gid. This behavior would be inconsistent with POSIX ACL's, which return -1 for invalid uid/gid which result in a failed copy. For consistency and simplicity fail the copy of the st_uid/st_gid are invalid. Fixes: 459c7c565ac3 ("ovl: unprivieged mounts") Cc: <stable@vger.kernel.org> # v5.11 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Seth Forshee <sforshee@kernel.org>
2023-01-27ovl: fix tmpfile leakMiklos Szeredi
Missed an error cleanup. Reported-by: syzbot+fd749a7ea127a84e0ffd@syzkaller.appspotmail.com Fixes: 2b1a77461f16 ("ovl: use vfs_tmpfile_open() helper") Cc: <stable@vger.kernel.org> # v6.1 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-10-20ovl: use posix acl apiChristian Brauner
Now that posix acls have a proper api us it to copy them. All filesystems that can serve as lower or upper layers for overlayfs have gained support for the new posix acl api in previous patches. So switch all internal overlayfs codepaths for copying posix acls to the new posix acl api. Acked-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-10-10Merge tag 'pull-tmpfile' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs tmpfile updates from Al Viro: "Miklos' ->tmpfile() signature change; pass an unopened struct file to it, let it open the damn thing. Allows to add tmpfile support to FUSE" * tag 'pull-tmpfile' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fuse: implement ->tmpfile() vfs: open inside ->tmpfile() vfs: move open right after ->tmpfile() vfs: make vfs_tmpfile() static ovl: use vfs_tmpfile_open() helper cachefiles: use vfs_tmpfile_open() helper cachefiles: only pass inode to *mark_inode_inuse() helpers cachefiles: tmpfile error handling cleanup hugetlbfs: cleanup mknod and tmpfile vfs: add vfs_tmpfile_open() helper
2022-09-24ovl: use vfs_tmpfile_open() helperMiklos Szeredi
If tmpfile is used for copy up, then use this helper to create the tmpfile and open it at the same time. This will later allow filesystems such as fuse to do this operation atomically. Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-09-01overlayfs: constify pathAl Viro
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-03Merge tag 'pull-work.lseek' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs lseek updates from Al Viro: "Jason's lseek series. Saner handling of 'lseek should fail with ESPIPE' - this gets rid of the magical no_llseek thing and makes checks consistent. In particular, the ad-hoc "can we do splice via internal pipe" checks got saner (and somewhat more permissive, which is what Jason had been after, AFAICT)" * tag 'pull-work.lseek' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: remove no_llseek fs: check FMODE_LSEEK to control internal pipe splicing vfio: do not set FMODE_LSEEK flag dma-buf: remove useless FMODE_LSEEK flag fs: do not compare against ->llseek fs: clear or set FMODE_LSEEK based on llseek function
2022-07-16fs: do not compare against ->llseekJason A. Donenfeld
Now vfs_llseek() can simply check for FMODE_LSEEK; if it's set, we know that ->llseek() won't be NULL and if it's not we should just fail with -ESPIPE. A couple of other places where we used to check for special values of ->llseek() (somewhat inconsistently) switched to checking FMODE_LSEEK. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-06-26attr: port attribute changes to new typesChristian Brauner
Now that we introduced new infrastructure to increase the type safety for filesystems supporting idmapped mounts port the first part of the vfs over to them. This ports the attribute changes codepaths to rely on the new better helpers using a dedicated type. Before this change we used to take a shortcut and place the actual values that would be written to inode->i_{g,u}id into struct iattr. This had the advantage that we moved idmappings mostly out of the picture early on but it made reasoning about changes more difficult than it should be. The filesystem was never explicitly told that it dealt with an idmapped mount. The transition to the value that needed to be stored in inode->i_{g,u}id appeared way too early and increased the probability of bugs in various codepaths. We know place the same value in struct iattr no matter if this is an idmapped mount or not. The vfs will only deal with type safe vfs{g,u}id_t. This makes it massively safer to perform permission checks as the type will tell us what checks we need to perform and what helpers we need to use. Fileystems raising FS_ALLOW_IDMAP can't simply write ia_vfs{g,u}id to inode->i_{g,u}id since they are different types. Instead they need to use the dedicated vfs{g,u}id_to_k{g,u}id() helpers that map the vfs{g,u}id into the filesystem. The other nice effect is that filesystems like overlayfs don't need to care about idmappings explicitly anymore and can simply set up struct iattr accordingly directly. Link: https://lore.kernel.org/lkml/CAHk-=win6+ahs1EwLkcq8apqLi_1wXFWbrPf340zYEhObpz4jA@mail.gmail.com [1] Link: https://lore.kernel.org/r/20220621141454.2914719-9-brauner@kernel.org Cc: Seth Forshee <sforshee@digitalocean.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> CC: linux-fsdevel@vger.kernel.org Reviewed-by: Seth Forshee <sforshee@digitalocean.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-04-28ovl: use ovl_path_getxattr() wrapperChristian Brauner
Add a helper that allows to retrieve ovl xattrs from either lower or upper layers. To stop passing mnt and dentry separately everywhere use struct path which more accurately reflects the tight coupling between mount and dentry in this helper. Swich over all places to pass a path argument that can operate on either upper or lower layers. This is needed to support idmapped base layers with overlayfs. Some helpers are always called with an upper dentry, which is now utilized by these helpers to create the path. Make this usage explicit by renaming the argument to "upperdentry" and by renaming the function as well in some cases. Also add a check in ovl_do_getxattr() to catch misuse of these functions. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: use ovl_lookup_upper() wrapperChristian Brauner
Introduce ovl_lookup_upper() as a simple wrapper around lookup_one(). Make it clear in the helper's name that this only operates on the upper layer. The wrapper will take upper layer's idmapping into account when checking permission in lookup_one(). Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: use ovl_do_notify_change() wrapperChristian Brauner
Introduce ovl_do_notify_change() as a simple wrapper around notify_change() to support idmapped layers. The helper mirrors other ovl_do_*() helpers that operate on the upper layers. When changing ownership of an upper object the intended ownership needs to be mapped according to the upper layer's idmapping. This mapping is the inverse to the mapping applied when copying inode information from an upper layer to the corresponding overlay inode. So e.g., when an upper mount maps files that are stored on-disk as owned by id 1001 to 1000 this means that calling stat on this object from an idmapped mount will report the file as being owned by id 1000. Consequently in order to change ownership of an object in this filesystem so it appears as being owned by id 1000 in the upper idmapped layer it needs to store id 1001 on disk. The mnt mapping helpers take care of this. All idmapping helpers are nops when no idmapped base layers are used. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: pass ofs to setattr operationsChristian Brauner
Pass down struct ovl_fs to setattr operations so we can ultimately retrieve the relevant upper mount and take the mount's idmapping into account when creating new filesystem objects. This is needed to support idmapped base layers with overlay. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: pass ofs to creation operationsChristian Brauner
Pass down struct ovl_fs to all creation helpers so we can ultimately retrieve the relevant upper mount and take the mount's idmapping into account when creating new filesystem objects. This is needed to support idmapped base layers with overlay. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: use wrappers to all vfs_*xattr() callsAmir Goldstein
Use helpers ovl_*xattr() to access user/trusted.overlay.* xattrs and use helpers ovl_do_*xattr() to access generic xattrs. This is a preparatory patch for using idmapped base layers with overlay. Note that a few of those places called vfs_*xattr() calls directly to reduce the amount of debug output. But as Miklos pointed out since overlayfs has been stable for quite some time the debug output isn't all that relevant anymore and the additional debug in all locations was actually quite helpful when developing this patch series. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-01-14ovl: don't fail copy up if no fileattr support on upperMiklos Szeredi
Christoph Fritz is reporting that failure to copy up fileattr when upper doesn't support fileattr or xattr results in a regression. Return success in these failure cases; this reverts overlayfs to the old behavior. Add a pr_warn_once() in these cases to still let the user know about the copy up failures. Reported-by: Christoph Fritz <chf.fritz@googlemail.com> Fixes: 72db82115d2b ("ovl: copy up sync/noatime fileattr flags") Cc: <stable@vger.kernel.org> # v5.15 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>