summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2020-07-28xfs: rename the ondisk dquot d_flags to d_typeDarrick J. Wong
The ondisk dquot stores the quota record type in the flags field. Rename this field to d_type to make the _type relationship between the ondisk and incore dquot more obvious. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: improve ondisk dquot flags checkingDarrick J. Wong
Create an XFS_DQTYPE_ANY mask for ondisk dquots flags, and use that to ensure that we never accept any garbage flags when we're loading dquots. While we're at it, restructure the quota type flag checking to use the proper masking. Note that I plan to add y2038 support soon, which will require a new xfs_dqtype_t flag for extended timestamp support, hence all the work to make the type masking work correctly. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: create xfs_dqtype_t to represent quota typesDarrick J. Wong
Create a new type (xfs_dqtype_t) to represent the type of an incore dquot (user, group, project, or none). Rename the incore dquot's dq_flags field to q_type. This allows us to replace all the "uint type" arguments to the quota functions with "xfs_dqtype_t type", to make it obvious when we're passing a quota type argument into a function. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: replace a few open-coded XFS_DQTYPE_REC_MASK usesDarrick J. Wong
Fix a few places where we open-coded this mask constant. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: remove unnecessary quota type maskingDarrick J. Wong
When XFS' quota functions take a parameter for the quota type, they only care about the three quota record types (user, group, project). Internal state flags and whatnot should never be passed by callers and are an error. Now that we've moved responsibility for filtering out internal state to the callers, we can drop the masking everywhere else. In other words, if you call a quota function, you must only pass in one of XFS_DQTYPE_{USER,GROUP,PROJ}. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: always use xfs_dquot_type when extracting type from a dquotDarrick J. Wong
Always use the xfs_dquot_type helper to extract the quota type from an incore dquot. This moves responsibility for filtering internal state information and whatnot to anybody passing around a struct xfs_dquot. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: refactor quota type testingDarrick J. Wong
Certain functions can only act upon one quota type, so refactor those functions to use switch statements, in keeping with all the other high level xfs quota api calls. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: remove the XFS_QM_IS[UGP]DQ macrosDarrick J. Wong
Remove these macros and use xfs_dquot_type() for everything. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: refactor testing if a particular dquot is being enforcedDarrick J. Wong
Create a small helper to test if enforcement is enabled for a given incore dquot and replace the open-code logic testing. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: rename XFS_DQ_{USER,GROUP,PROJ} to XFS_DQTYPE_*Darrick J. Wong
We're going to split up the incore dquot state flags from the ondisk dquot flags (eventually renaming this "type") so start by renaming the three flags and the bitmask that are going to participate in this. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: drop the type parameter from xfs_dquot_verifyDarrick J. Wong
xfs_qm_reset_dqcounts (aka quotacheck) is the only xfs_dqblk_verify caller that actually knows the specific quota type that it's looking for. Since everything else just pass in type==0 (including the buffer verifier), drop the parameter and open-code the check like xfs_dquot_from_disk already does. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: add more dquot tracepointsDarrick J. Wong
Add all the xfs_dquot fields to the tracepoint for that type; add a new tracepoint type for the qtrx structure (dquot transaction deltas); and use our new tracepoints. This makes it easier for the author to trace changes to dquot counters for debugging. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: actually bump warning counts when we send warningsDarrick J. Wong
Currently, xfs quotas have the ability to send netlink warnings when a user exceeds the limits. They also have all the support code necessary to convert softlimit warnings into failures if the number of warnings exceeds a limit set by the administrator. Unfortunately, we never actually increase the warning counter, so this never actually happens. Make it so we actually do something useful with the warning counts. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
2020-07-28xfs: assume the default quota limits are always set in xfs_qm_adjust_dqlimitsDarrick J. Wong
We always initialize the default quota limits to something nowadays, so we don't need to check that the defaults are set to something before using them. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
2020-07-28xfs: refactor xfs_trans_apply_dquot_deltasDarrick J. Wong
Hoist the code that adjusts the incore quota reservation count adjustments into a separate function, both to reduce the level of indentation and also to reduce the amount of open-coded logic. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: refactor xfs_trans_dqresvDarrick J. Wong
Now that we've refactored the resource usage and limits into per-resource structures, we can refactor some of the open-coded reservation limit checking in xfs_trans_dqresv. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: refactor xfs_qm_scall_setqlimDarrick J. Wong
Now that we can pass around quota resource and limit structures, clean up the open-coded field setting in xfs_qm_scall_setqlim. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
2020-07-28xfs: refactor quota exceeded testDarrick J. Wong
Refactor the open-coded test for whether or not we're over quota. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: remove unnecessary arguments from quota adjust functionsDarrick J. Wong
struct xfs_dquot already has a pointer to the xfs mount, so remove the redundant parameter from xfs_qm_adjust_dq*. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: refactor default quota limits by resourceDarrick J. Wong
Now that we've split up the dquot resource fields into separate structs, do the same for the default limits to enable further refactoring. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: remove qcore from incore dquotsDarrick J. Wong
Now that we've stopped using qcore entirely, drop it from the incore dquot. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: stop using q_core timers in the quota codeDarrick J. Wong
Add timers fields to the incore dquot, and use that instead of the ones in qcore. This eliminates a bunch of endian conversions and will eventually allow us to remove qcore entirely. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com>
2020-07-28xfs: stop using q_core warning counters in the quota codeDarrick J. Wong
Add warning counter fields to the incore dquot, and use that instead of the ones in qcore. This eliminates a bunch of endian conversions and will eventually allow us to remove qcore entirely. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com>
2020-07-28xfs: stop using q_core counters in the quota codeDarrick J. Wong
Add counter fields to the incore dquot, and use that instead of the ones in qcore. This eliminates a bunch of endian conversions and will eventually allow us to remove qcore entirely. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com>
2020-07-28xfs: stop using q_core limits in the quota codeDarrick J. Wong
Add limits fields in the incore dquot, and use that instead of the ones in qcore. This eliminates a bunch of endian conversions and will eventually allow us to remove qcore entirely. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com>
2020-07-28xfs: use a per-resource struct for incore dquot dataDarrick J. Wong
Introduce a new struct xfs_dquot_res that we'll use to track all the incore data for a particular resource type (block, inode, rt block). This will help us (once we've eliminated q_core) to declutter quota functions that currently open-code field access or pass around fields around explicitly. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com>
2020-07-28xfs: stop using q_core.d_id in the quota codeDarrick J. Wong
Add a dquot id field to the incore dquot, and use that instead of the one in qcore. This eliminates a bunch of endian conversions and will eventually allow us to remove qcore entirely. We also rearrange the start of xfs_dquot to remove padding holes, saving 8 bytes. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com>
2020-07-28xfs: stop using q_core.d_flags in the quota codeDarrick J. Wong
Use the incore dq_flags to figure out the dquot type. This is the first step towards removing xfs_disk_dquot from the incore dquot. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
2020-07-28xfs: make XFS_DQUOT_CLUSTER_SIZE_FSB part of the ondisk formatDarrick J. Wong
Move the dquot cluster size #define to xfs_format.h. It is an important part of the ondisk format because the ondisk dquot record size is not an even power of two, which means that the buffer size we use is significant here because the kernel leaves slack space at the end of the buffer to avoid having to deal with a dquot record crossing a block boundary. This is also an excuse to fix one of the longstanding discrepancies between kernel and userspace libxfs headers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
2020-07-28xfs: rename dquot incore state flagsDarrick J. Wong
Rename the existing incore dquot "dq_flags" field to "q_flags" to match everything else in the structure, then move the two actual dquot state flags to the XFS_DQFLAG_ namespace from XFS_DQ_. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
2020-07-28xfs: refactor quotacheck flags usageDarrick J. Wong
We only use the XFS_QMOPT flags in quotacheck to signal the quota type, so rip out all the flags handling and just pass the type all the way through. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
2020-07-28xfs: move the flags argument of xfs_qm_scall_trunc_qfiles to XFS_QMOPT_*Darrick J. Wong
Since xfs_qm_scall_trunc_qfiles can take a bitset of quota types that we want to truncate, change the flags argument to take XFS_QMOPT_[UGP}QUOTA so that the next patch can start to deprecate XFS_DQ_*. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
2020-07-28xfs: validate ondisk/incore dquot flagsDarrick J. Wong
While loading dquot records off disk, make sure that the quota type flags are the same between the incore dquot and the ondisk dquot. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
2020-07-28xfs: fix inode quota reservation checksDarrick J. Wong
xfs_trans_dqresv is the function that we use to make reservations against resource quotas. Each resource contains two counters: the q_core counter, which tracks resources allocated on disk; and the dquot reservation counter, which tracks how much of that resource has either been allocated or reserved by threads that are working on metadata updates. For disk blocks, we compare the proposed reservation counter against the hard and soft limits to decide if we're going to fail the operation. However, for inodes we inexplicably compare against the q_core counter, not the incore reservation count. Since the q_core counter is always lower than the reservation count and we unlock the dquot between reservation and transaction commit, this means that multiple threads can reserve the last inode count before we hit the hard limit, and when they commit, we'll be well over the hard limit. Fix this by checking against the incore inode reservation counter, since we would appear to maintain that correctly (and that's what we report in GETQUOTA). Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: clear XFS_DQ_FREEING if we can't lock the dquot buffer to flushDarrick J. Wong
In commit 8d3d7e2b35ea, we changed xfs_qm_dqpurge to bail out if we can't lock the dquot buf to flush the dquot. This prevents the AIL from blocking on the dquot, but it also forgets to clear the FREEING flag on its way out. A subsequent purge attempt will see the FREEING flag is set and bail out, which leads to dqpurge_all failing to purge all the dquots. (copy-pasting from Dave Chinner's identical patch) This was found by inspection after having xfs/305 hang 1 in ~50 iterations in a quotaoff operation: [ 8872.301115] xfs_quota D13888 92262 91813 0x00004002 [ 8872.302538] Call Trace: [ 8872.303193] __schedule+0x2d2/0x780 [ 8872.304108] ? do_raw_spin_unlock+0x57/0xd0 [ 8872.305198] schedule+0x6e/0xe0 [ 8872.306021] schedule_timeout+0x14d/0x300 [ 8872.307060] ? __next_timer_interrupt+0xe0/0xe0 [ 8872.308231] ? xfs_qm_dqusage_adjust+0x200/0x200 [ 8872.309422] schedule_timeout_uninterruptible+0x2a/0x30 [ 8872.310759] xfs_qm_dquot_walk.isra.0+0x15a/0x1b0 [ 8872.311971] xfs_qm_dqpurge_all+0x7f/0x90 [ 8872.313022] xfs_qm_scall_quotaoff+0x18d/0x2b0 [ 8872.314163] xfs_quota_disable+0x3a/0x60 [ 8872.315179] kernel_quotactl+0x7e2/0x8d0 [ 8872.316196] ? __do_sys_newstat+0x51/0x80 [ 8872.317238] __x64_sys_quotactl+0x1e/0x30 [ 8872.318266] do_syscall_64+0x46/0x90 [ 8872.319193] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 8872.320490] RIP: 0033:0x7f46b5490f2a [ 8872.321414] Code: Bad RIP value. Returning -EAGAIN from xfs_qm_dqpurge() without clearing the XFS_DQ_FREEING flag means the xfs_qm_dqpurge_all() code can never free the dquot, and we loop forever waiting for the XFS_DQ_FREEING flag to go away on the dquot that leaked it via -EAGAIN. Fixes: 8d3d7e2b35ea ("xfs: trylock underlying buffer on dquot flush") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-07-28xfs: fix inode allocation block res calculation precedenceBrian Foster
The block reservation calculation for inode allocation is supposed to consist of the blocks required for the inode chunk plus (maxlevels-1) of the inode btree multiplied by the number of inode btrees in the fs (2 when finobt is enabled, 1 otherwise). Instead, the macro returns (ialloc_blocks + 2) due to a precedence error in the calculation logic. This leads to block reservation overruns via generic/531 on small block filesystems with finobt enabled. Add braces to fix the calculation and reserve the appropriate number of blocks. Fixes: 9d43b180af67 ("xfs: update inode allocation/free transaction reservations for finobt") Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-07-28xfs: drain the buf delwri queue before xfsaild idlesBrian Foster
xfsaild is racy with respect to transaction abort and shutdown in that the task can idle or exit with an empty AIL but buffers still on the delwri queue. This was partly addressed by cancelling the delwri queue before the task exits to prevent memory leaks, but it's also possible for xfsaild to empty and idle with buffers on the delwri queue. For example, a transaction that pins a buffer that also happens to sit on the AIL delwri queue will explicitly remove the associated log item from the AIL if the transaction aborts. The side effect of this is an unmount hang in xfs_wait_buftarg() as the associated buffers remain held by the delwri queue indefinitely. This is reproduced on repeated runs of generic/531 with an fs format (-mrmapbt=1 -bsize=1k) that happens to also reproduce transaction aborts. Update xfsaild to not idle until both the AIL and associated delwri queue are empty and update the push code to continue delwri queue submission attempts even when the AIL is empty. This allows the AIL to eventually release aborted buffers stranded on the delwri queue when they are unlocked by the associated transaction. This should have no significant effect on normal runtime behavior because the xfsaild currently idles only when the AIL is empty and in practice the AIL is rarely empty with a populated delwri queue. The items must be AIL resident to land in the queue in the first place and generally aren't removed until writeback completes. Note that the pre-existing delwri queue cancel logic in the exit path is retained because task stop is external, could technically come at any point, and xfsaild is still responsible to release its buffer references before it exits. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-07-28fs/dax: Remove unused size parameterIra Weiny
Passing size to copy_user_dax implies it can copy variable sizes of data when in fact it calls copy_user_page() which is exactly a page. We are safe because the only caller uses PAGE_SIZE anyway so just remove the variable for clarity. While we are at it change copy_user_dax() to copy_cow_page_dax() to make it clear it is a singleton helper for this one case not implementing what dax_iomap_actor() does. Link: https://lore.kernel.org/r/20200717072056.73134-11-ira.weiny@intel.com Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-07-28orangefs: posix acl fix...Mike Marshall
Al Viro pointed out that I broke some acl functionality... * ACLs could not be fully removed * posix_acl_chmod would be called while the old ACL was still cached * new mode propagated to orangefs server before ACL. ... when I tried to make sure that modes that got changed as a result of ACL-sets would be sent back to the orangefs server. Not wanting to try and change the code without having some cases to test it with, I began to hunt for setfacl examples that were expressible in pure mode. Along the way I found examples like the following which confused me: user A had a file (/home/A/asdf) with mode 740 user B was in user A's group user C was not in user A's group setfacl -m u:C:rwx /home/A/asdf The above setfacl caused ls -l /home/A/asdf to show a mode of 770, making it appear that all users in user A's group now had full access to /home/A/asdf, however, user B still only had read acces. Madness. Anywho, I finally found that the above (whacky as it is) appears to be "posixly on purpose" and explained in acl(5): If the ACL has an ACL_MASK entry, the group permissions correspond to the permissions of the ACL_MASK entry. Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2020-07-28NFS: remove redundant initialization of variable resultColin Ian King
The variable result is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-07-28fanotify: compare fsid when merging name eventJan Kara
When merging name events, fsids of the two involved events have to match. Otherwise we could merge events from two different filesystems and thus effectively loose the second event. Backporting note: Although the commit cacfb956d46e introducing this bug was merged for 5.7, the relevant code didn't get used in the end until 7e8283af6ede ("fanotify: report parent fid + name + child fid") which will be merged with this patch. So there's no need for backporting this. Fixes: cacfb956d46e ("fanotify: record name info for FAN_DIR_MODIFY event") Reported-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-27fsnotify: create method handle_inode_event() in fsnotify_operationsAmir Goldstein
The method handle_event() grew a lot of complexity due to the design of fanotify and merging of ignore masks. Most backends do not care about this complex functionality, so we can hide this complexity from them. Introduce a method handle_inode_event() that serves those backends and passes a single inode mark and less arguments. This change converts all backends except fanotify and inotify to use the simplified handle_inode_event() method. In pricipal, inotify could have also used the new method, but that would require passing more arguments on the simple helper (data, data_type, cookie), so we leave it with the handle_event() method. Link: https://lore.kernel.org/r/20200722125849.17418-9-amir73il@gmail.com Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-27fanotify: report parent fid + child fidAmir Goldstein
Add support for FAN_REPORT_FID | FAN_REPORT_DIR_FID. Internally, it is implemented as a private case of reporting both parent and child fids and name, the parent and child fids are recorded in a variable length fanotify_name_event, but there is no name. It should be noted that directory modification events are recorded in fixed size fanotify_fid_event when not reporting name, just like with group flags FAN_REPORT_FID. Link: https://lore.kernel.org/r/20200716084230.30611-23-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-27fanotify: report parent fid + name + child fidAmir Goldstein
For a group with fanotify_init() flag FAN_REPORT_DFID_NAME, the parent fid and name are reported for events on non-directory objects with an info record of type FAN_EVENT_INFO_TYPE_DFID_NAME. If the group also has the init flag FAN_REPORT_FID, the child fid is also reported with another info record that follows the first info record. The second info record is the same info record that would have been reported to a group with only FAN_REPORT_FID flag. When the child fid needs to be recorded, the variable size struct fanotify_name_event is preallocated with enough space to store the child fh between the dir fh and the name. Link: https://lore.kernel.org/r/20200716084230.30611-22-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-27fanotify: add support for FAN_REPORT_NAMEAmir Goldstein
Introduce a new fanotify_init() flag FAN_REPORT_NAME. It requires the flag FAN_REPORT_DIR_FID and there is a constant for setting both flags named FAN_REPORT_DFID_NAME. For a group with flag FAN_REPORT_NAME, the parent fid and name are reported for directory entry modification events (create/detete/move) and for events on non-directory objects. Events on directories themselves are reported with their own fid and "." as the name. The parent fid and name are reported with an info record of type FAN_EVENT_INFO_TYPE_DFID_NAME, similar to the way that parent fid is reported with into type FAN_EVENT_INFO_TYPE_DFID, but with an appended null terminated name string. Link: https://lore.kernel.org/r/20200716084230.30611-21-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-27fanotify: report events with parent dir fid to sb/mount/non-dir marksAmir Goldstein
In a group with flag FAN_REPORT_DIR_FID, when adding an inode mark with FAN_EVENT_ON_CHILD, events on non-directory children are reported with the fid of the parent. When adding a filesystem or mount mark or mark on a non-dir inode, we want to report events that are "possible on child" (e.g. open/close) also with fid of the parent, as if the victim inode's parent is interested in events "on child". Some events, currently only FAN_MOVE_SELF, should be reported to a sb/mount/non-dir mark with parent fid even though they are not reported to a watching parent. To get the desired behavior we set the flag FAN_EVENT_ON_CHILD on all the sb/mount/non-dir mark masks in a group with FAN_REPORT_DIR_FID. Link: https://lore.kernel.org/r/20200716084230.30611-20-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-27fanotify: add basic support for FAN_REPORT_DIR_FIDAmir Goldstein
For now, the flag is mutually exclusive with FAN_REPORT_FID. Events include a single info record of type FAN_EVENT_INFO_TYPE_DFID with a directory file handle. For now, events are only reported for: - Directory modification events - Events on children of a watching directory - Events on directory objects Soon, we will add support for reporting the parent directory fid for events on non-directories with filesystem/mount mark and support for reporting both parent directory fid and child fid. Link: https://lore.kernel.org/r/20200716084230.30611-19-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-27fsnotify: send event with parent/name info to sb/mount/non-dir marksAmir Goldstein
Similar to events "on child" to watching directory, send event with parent/name info if sb/mount/non-dir marks are interested in parent/name info. The FS_EVENT_ON_CHILD flag can be set on sb/mount/non-dir marks to specify interest in parent/name info for events on non-directory inodes. Events on "orphan" children (disconnected dentries) are sent without parent/name info. Events on directories are sent with parent/name info only if the parent directory is watching. After this change, even groups that do not subscribe to events on children could get an event with mark iterator type TYPE_CHILD and without mark iterator type TYPE_INODE if fanotify has marks on the same objects. dnotify and inotify event handlers can already cope with that situation. audit does not subscribe to events that are possible on child, so won't get to this situation. nfsd does not access the marks iterator from its event handler at the moment, so it is not affected. This is a bit too fragile, so we should prepare all groups to cope with mark type TYPE_CHILD preferably using a generic helper. Link: https://lore.kernel.org/r/20200716084230.30611-16-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-27inotify: do not set FS_EVENT_ON_CHILD in non-dir mark maskAmir Goldstein
FS_EVENT_ON_CHILD has currently no meaning for non-dir inode marks. In the following patches we want to use that bit to mean that mark's notification group cares about parent and name information. So stop setting FS_EVENT_ON_CHILD for non-dir marks. Link: https://lore.kernel.org/r/20200722125849.17418-3-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-27fsnotify: pass dir and inode arguments to fsnotify()Amir Goldstein
The arguments of fsnotify() are overloaded and mean different things for different event types. Replace the to_tell argument with separate arguments @dir and @inode, because we may be sending to both dir and child. Using the @data argument to pass the child is not enough, because dirent events pass this argument (for audit), but we do not report to child. Document the new fsnotify() function argumenets. Link: https://lore.kernel.org/r/20200722125849.17418-7-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>