Age | Commit message (Collapse) | Author |
|
If the server returns the same delegation in an open that we just used
in a delegreturn, we need to ensure we don't apply that stateid if
the delegreturn has freed it on the server.
To do so, we ensure that we do not free the storage for the delegation
until either it is replaced by a new one, or we throw the inode out of
cache.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
In nfs_inode_find_state_and_recover() we want to mark for recovery
only those stateids that match or are older than the supplied
stateid parameter.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Fix the checks in nfs4_inode_make_writeable() to ignore the case where
we hold no delegations. Currently, in such a case, we automatically
flush writes.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Ensure that we check that the delegation is valid in
nfs4_return_incompatible_delegation() before we try to return it.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If the delegation has already been revoked, we want to avoid reclaiming
it on reboot.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If the delegation was revoked, or is already being returned, just
clear the NFS_DELEGATION_RETURN and NFS_DELEGATION_RETURN_IF_CLOSED
flags and keep going.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If the delegation was successfully returned, then mark it as revoked.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If we revoke a delegation, but the stateid's seqid is newer, then
ensure we update the seqid when marking the delegation as revoked.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If the server sent us a new delegation stateid that is more recent than
the one that got revoked, then clear the NFS_DELEGATION_REVOKED flag.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Add a check to ensure that we haven't already removed the delegation
from the inode after we take all the relevant locks.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Rename nfs_inode_return_delegation_noreclaim() to
nfs_inode_evict_delegation(), which better describes what it
does.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If the delegation was revoked, we don't want to retry the delegreturn.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If we're processsing a delegation recall, ignore the delegations that
have already been revoked or returned.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If the delegation has been revoked, ignore it.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If the delegation is marked as being revoked, then don't use it in
the open state structure.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
NFSv2, v3 and NFSv4 servers often have duplicate replay caches that look
at the source port when deciding whether or not an RPC call is a replay
of a previous call. This requires clients to perform strange TCP gymnastics
in order to ensure that when they reconnect to the server, they bind
to the same source port.
NFSv4.1 and NFSv4.2 have sessions that provide proper replay semantics,
that do not look at the source port of the connection. This patch therefore
ensures they can ignore the rebind requirement.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If a NFSv3 server is being used as both a DS and as a regular NFSv3 server,
we may want to keep the IO traffic on a separate TCP connection, since
it will typically have very different timeout characteristics.
This patch therefore sets up a flag to separate the two modes of operation
for the nfs_client.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Connecting to the DS is a non-interactive, asynchronous task, so there is
no reason to fire up an extra RPC null ping in order to ensure that the
server is up.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Add a flag to tell the nfs_client it should set RPC_CLNT_CREATE_NOPING when
creating the rpc client.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
We don't need atomic bit ops when initialising a local structure on the
stack.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Simplify the struct iattr timestamp encoding by skipping the step of
an intermediate struct timespec.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Simplify the struct iattr timestamp encoding by skipping the step of
an intermediate struct timespec.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Encode the mtime correctly.
Fixes: 95582b0083883 ("vfs: change inode times to use struct timespec64")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Convert the NFSv4 callbacks to use struct timestamp64, rather than
truncating times to 32-bit values.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
NFSv4 supports 64-bit timestamps, so there is no point in converting
the struct iattr timestamps to 32-bits before encoding.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
NFSv4 supports 64-bit times, so we should switch to using struct
timespec64 when decoding attributes.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If we set nfs_mountpoint_expiry_timeout to a negative value, then
allow that to imply that we do not expire NFSv4 submounts.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Always set XFS_ALLOC_USERDATA for data fork allocations, and check it
in xfs_alloc_is_userdata instead of the current obsfucated check.
Also remove the xfs_alloc_is_userdata and xfs_alloc_allow_busy_reuse
helpers to make the code a little easier to understand.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Move the extent zeroing case there for the XFS_BMAPI_ZERO flag outside
the low-level allocator and into xfs_bmapi_allocate, where is still
is in transaction context, but outside the very lowlevel code where
it doesn't belong.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Avoid duplicate userdata and data fork checks by restructuring the code
so we only have a helper for userdata allocations that combines these
checks in a straight foward way. That also helps to obsoletes the
comments explaining what the code does as it is now clearly obvious.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Move the EOF alignment and checking for the next allocated extent into
the callers to avoid the need to pass the byte based offset and count
as well as looking at the incoming imap. The added benefit is that
the caller can unlock the incoming ilock and the function doesn't have
funny unbalanced locking contexts.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Even if we are asked for a write layout there is no point in logging
the inode unless we actually modified it in some way.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
We should never see delalloc blocks for a pNFS layout, write or not.
Adjust the assert to check for that.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
And move the code dependent on it to the one caller that cares
instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
By open coding xfs_bmap_last_extent instead of calling it through a
double indirection we don't need to handle an error return that
can't happen given that we are guaranteed to have the extent list in
memory already. Also simplify the calling conventions a little and
move the extent list assert from the only caller into the function.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Currently, block device size in not updated on second and further open
for block devices where partition scan is disabled. This is particularly
annoying for example for DVD drives as that means block device size does
not get updated once the media is inserted into a drive if the device is
already open when inserting the media. This is actually always the case
for example when pktcdvd is in use.
Fix the problem by revalidating block device size on every open even for
devices with partition scan disabled.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Factor out code handling revalidation of bdev on disk change into a
common helper.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
No one checks the return value of debugfs_create_atomic_t(), as it's not
needed, so make the return value void, so that no one tries to do so in
the future.
Link: https://lore.kernel.org/r/20191016130332.GA28240@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Pull cifs fix from Steve French:
"A small smb3 memleak fix"
* tag '5.4-rc6-smb3-fix' of git://git.samba.org/sfrench/cifs-2.6:
fix memory leak in large read decrypt offload
|
|
No one checks the return value of debugfs_create_x8(), as it's not
needed, so make the return value void, so that no one tries to do so in
the future.
Link: https://lore.kernel.org/r/20191011132931.1186197-5-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The callback function of call_rcu() just calls kfree(), so we can use
kfree_rcu() instead of call_rcu() + callback function.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull NFS client bugfixes from Anna Schumaker:
"This contains two delegation fixes (with the RCU lock leak fix marked
for stable), and three patches to fix destroying the the sunrpc back
channel.
Stable bugfixes:
- Fix an RCU lock leak in nfs4_refresh_delegation_stateid()
Other fixes:
- The TCP back channel mustn't disappear while requests are
outstanding
- The RDMA back channel mustn't disappear while requests are
outstanding
- Destroy the back channel when we destroy the host transport
- Don't allow a cached open with a revoked delegation"
* tag 'nfs-for-5.4-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFS: Fix an RCU lock leak in nfs4_refresh_delegation_stateid()
NFSv4: Don't allow a cached open with a revoked delegation
SUNRPC: Destroy the back channel when we destroy the host transport
SUNRPC: The RDMA back channel mustn't disappear while requests are outstanding
SUNRPC: The TCP back channel mustn't disappear while requests are outstanding
|
|
Pull block fixes from Jens Axboe:
- Two small nvme fixes, one is a fabrics connection fix, the other one
a cleanup made possible by that fix (Anton, via Keith)
- Fix requeue handling in umb ubd (Anton)
- Fix spin_lock_irq() nesting in blk-iocost (Dan)
- Three small io_uring fixes:
- Install io_uring fd after done with ctx (me)
- Clear ->result before every poll issue (me)
- Fix leak of shadow request on error (Pavel)
* tag 'for-linus-20191101' of git://git.kernel.dk/linux-block:
iocost: don't nest spin_lock_irq in ioc_weight_write()
io_uring: ensure we clear io_kiocb->result before each issue
um-ubd: Entrust re-queue to the upper layers
nvme-multipath: remove unused groups_only mode in ana log
nvme-multipath: fix possible io hang after ctrl reconnect
io_uring: don't touch ctx in setup after ring fd install
io_uring: Fix leaked shadow_req
|
|
|
|
A typo in nfs4_refresh_delegation_stateid() means we're leaking an
RCU lock, and always returning a value of 'false'. As the function
description states, we were always supposed to return 'true' if a
matching delegation was found.
Fixes: 12f275cdd163 ("NFSv4: Retry CLOSE and DELEGRETURN on NFS4ERR_OLD_STATEID.")
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
If the delegation is marked as being revoked, we must not use it
for cached opens.
Fixes: 869f9dfa4d6d ("NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
We didn't use -ERESTARTSYS to tell the application layer to restart the
system call, but instead return -EINTR. we can set -EINTR directly when
wakeup by the signal, which can help us save an assignment operation and
comparison operation.
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This adds support for IORING_OP_ASYNC_CANCEL, which will attempt to
cancel requests that have been punted to async context and are now
in-flight. This works for regular read/write requests to files, as
long as they haven't been started yet. For socket based IO (or things
like accept4(2)), we can cancel work that is already running as well.
To cancel a request, the sqe must have ->addr set to the user_data of
the request it wishes to cancel. If the request is cancelled
successfully, the original request is completed with -ECANCELED
and the cancel request is completed with a result of 0. If the
request was already running, the original may or may not complete
in error. The cancel request will complete with -EALREADY for that
case. And finally, if the request to cancel wasn't found, the cancel
request is completed with -ENOENT.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|